Hey there!
Hey there, thanks for dropping by! This is my minimal personal website.
Formally, I am a machine learning engineer with six years of experience and demonstrated success in building and productionizing machine learning and recommendation systems using TensorFlow, PyTorch, AWS, GCP, Python, and LLMs.
Informally, I make neural nets go brrr.
I am from Budapest, Hungary, and I live in Stockholm, Sweden. I have also studied and worked in Manchester, United Kingdom.
Work Experience
- Machine Learning Engineer at Sellpy - Stockholm, Sweden, Dec. 2022 - Present
- Integrating Generative AI models into the company’s core system that extracts, stores, and serves data about product features, with the aim of achieving more accurate and scalable product processing with less human supervision. For this, I use LLMs, multimodal models, vector databases, and vector and hybrid search algorithms.
- I work a lot with machine learning, deep learning, and neural networks in the context of recommendation systems, as well as classification and regression problems such as product feature classification and product pricing.
- I also work quite a bit with data engineering (ETL pipelines) and cloud infrastructure (Infrastructure As Code).
- Day to day, I mostly use Python, TensorFlow, PyTorch, Google Cloud (BigQuery, Vertex AI), AWS (Kinesis, S3, Batch, Lambda, ECS, ECR, SQS, API Gateway, Bedrock, CloudFormation), LangChain, Langfuse, Qdrant, OpenAI API, Docker, Sentry, PostgreSQL, Apache Airflow, Git, Scikit-learn, Apache Spark, FastAPI.
- Machine Learning Engineer at Ecobloom - Stockholm, Sweden, Aug. 2020 - Dec. 2022
- Developed large-scale, distributed data processing infrastructure and deep learning pipelines for plant leaf detection and stress recognition on Google Cloud Platform. Built with: Cloud Pub/Sub, Cloud Functions, Cloud SQL, BigQuery, Dataflow / Apache Beam, Cloud Data Fusion, Vertex AI, and PyTorch.
- Research and Development Engineer at Nexperia - Manchester, United Kingdom, Sep. 2018 - Aug. 2019
- I developed semiconductor design process simulation software in Python, and extracted actionable insights from raw manufacturing data using statistical techniques.
Education
- KTH Royal Institute of Technology, Master of Science (MSc) in Machine Learning - Stockholm, Sweden, Aug. 2020 - Jul. 2022
- In my research thesis, I worked on semi-supervised deep learning to increase object detection efficiency when annotations are scarce. My aim was to demonstrate the feasibility of improving object detection performance with the Unbiased Teacher for Semi-Supervised Object Detection algorithm for plant leaf detection and possible stress recognition when few annotations are available. The improved performance reduced the amount of annotated data required for the task, which lowered annotation costs and made the approach more practical for this real-world application. This research was supervised by Prof. Josephine Sullivan and performed for Ecobloom.
- I was a Teaching Assistant for DD2424 Deep Learning in Data Science.
- My favourite modules were: Advanced Deep Learning, Advanced Machine Learning, Artificial Neural Networks and Deep Architectures, Deep Learning in Data Science, Machine Learning, Artificial Intelligence, Data Mining, Data-Intensive Computing, Probabilistic Graphical Models, and Speech and Speaker Recognition.
- University of Manchester, Bachelor of Engineering (BEng) in Electronic Engineering, First-Class Honours - Manchester, United Kingdom, Sep. 2016 - Jun. 2020
Technologies and Programming Languages
- Python
- Docker
- TensorFlow/PyTorch/Scikit-learn
- Generative AI APIs such as Vertex AI, OpenAI API, and AWS Bedrock
- LangChain
- Amazon Web Services / AWS (Kinesis, S3, Batch, Lambda, ECS, ECR, SQS, API Gateway, CloudFormation)
- Google Cloud Platform / GCP (Cloud Pub/Sub, Cloud Functions, Cloud SQL, BigQuery, Dataflow / Apache Beam, Cloud Data Fusion, Vertex AI)
- PostgreSQL / SQL
- Qdrant / Pinecone (vector database)
- Langfuse (LLM logging)
- Apache Spark / Dask / Apache Beam for distributed data processing
- FastAPI / Flask for API development
- Apache Airflow for ETL pipelines and ML Operations workflows
- Git / GitHub
- Sentry for error tracking
- MLflow for model tracking
- Jupyter Notebooks
- NumPy / Pandas / Matplotlib / Seaborn etc.
- Unix / Linux
I also have experience with:
- Retrieval-augmented generation (RAG), RAG with Knowledge Graph (Neo4j), RAG Evaluations (Ragas), OSS models (using Ollama) via pro-bono consulting
- Reinforcement Learning (RL) and bandit algorithms with Stable-Baselines3 for algorithmic trading
Pet projects
- re-sln
- nn-blocks
- Following Andrej Karpathy’s example, I implemented a bunch of neural network blocks from scratch in Python and NumPy (strictly no TensorFlow or PyTorch allowed), to understand how neural networks work under the hood.
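- To give a flavour of what “from scratch” means here, below is a minimal sketch of one such block: a fully-connected (linear) layer with a hand-written forward and backward pass in plain NumPy. The class and variable names are illustrative, not taken from the actual nn-blocks code.

```python
import numpy as np

class Linear:
    """A fully-connected layer with a hand-written forward and backward pass."""

    def __init__(self, in_features, out_features, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights and zero biases.
        self.w = rng.normal(0.0, 0.01, size=(in_features, out_features))
        self.b = np.zeros(out_features)

    def forward(self, x):
        # Cache the input; it is needed to compute gradients in backward().
        self.x = x
        return x @ self.w + self.b

    def backward(self, grad_out):
        # Gradients of the loss w.r.t. the parameters and the input, via the chain rule.
        self.grad_w = self.x.T @ grad_out
        self.grad_b = grad_out.sum(axis=0)
        return grad_out @ self.w.T


# Tiny smoke test: one forward and backward pass on a random batch.
layer = Linear(4, 3)
out = layer.forward(np.random.randn(8, 4))
grad_in = layer.backward(np.ones_like(out))
print(out.shape, grad_in.shape)  # (8, 3) (8, 4)
```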
- tsboi
- Time-series forecasting of cryptocurrency exchange rates with XGBoost and Transformers.
- Deploying models with MLflow.
- Fun project to learn about time series forecasting, financial data, and MLflow.
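- Roughly, the MLflow side of tsboi boils down to something like the sketch below: fit a model on lagged features, log its parameters and a metric, and store the fitted model as an artifact. The lag setup, parameter values, and metric here are made up for illustration and are not the project’s actual configuration.

```python
import mlflow
import mlflow.xgboost
import numpy as np
from xgboost import XGBRegressor

# Toy lag-feature setup: predict the next value from the previous three.
rng = np.random.default_rng(0)
series = rng.normal(size=500).cumsum()
X = np.stack([series[i : i + 3] for i in range(len(series) - 3)])
y = series[3:]

params = {"n_estimators": 200, "max_depth": 4, "learning_rate": 0.05}

with mlflow.start_run(run_name="xgb-lag-forecast"):
    model = XGBRegressor(**params)
    model.fit(X, y)

    # Track what was trained and how well it fits (in-sample here, for brevity).
    mlflow.log_params(params)
    mlflow.log_metric("train_rmse", float(np.sqrt(np.mean((model.predict(X) - y) ** 2))))

    # Store the fitted model so it can be reloaded or served later.
    mlflow.xgboost.log_model(model, artifact_path="model")
```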
- Algorithmic trading with Reinforcement Learning using Stable-Baselines3
- Worked on an algorithmic trading application as a startup project.
- Trained, evaluated, and deployed XGBoost models and Transformer-based time series neural networks to predict cryptocurrency exchange rates using Kline (candlestick) data, order book data, etc. from Binance for many cryptocurrency pairs.
- Used Reinforcement Learning (RL) to optimise trading strategies given various primary and secondary objectives, such as maximising profit, minimising risk, maximising Sharpe ratio, etc. (Stable-Baselines3)
- This project was a lot of fun, but we never published it because we never found a strategy that consistently beat the market. I learned a lot, though, about evaluating RL models, which is a very different problem from evaluating supervised models.
- If one were to start a similar project, I would recommend reading what are (in my opinion) the best papers on RL evaluation: Deep Reinforcement Learning that Matters and Empirical Design in Reinforcement Learning, the best blog post: Rliable: Better Evaluation for Reinforcement Learning - A Visual Explanation, and playing around with the best RL library: Stable-Baselines3.
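- As a rough illustration only (the real trading environment, features, and reward shaping were much more involved), the Stable-Baselines3 part of such a setup looks something like the sketch below: define a Gymnasium environment whose reward reflects the trading objective, then train a policy on it. Everything here is a toy stand-in, not the actual environment.

```python
import gymnasium as gym
import numpy as np
from stable_baselines3 import PPO

class ToyTradingEnv(gym.Env):
    """A toy long/flat trading environment on a random-walk return series."""

    def __init__(self, n_steps=200):
        super().__init__()
        self.n_steps = n_steps
        # Observation: the latest return; action: 0 = stay flat, 1 = go long.
        self.observation_space = gym.spaces.Box(-np.inf, np.inf, shape=(1,), dtype=np.float32)
        self.action_space = gym.spaces.Discrete(2)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.returns = self.np_random.normal(0.0, 0.01, size=self.n_steps).astype(np.float32)
        self.t = 0
        return self.returns[[self.t]], {}

    def step(self, action):
        # Reward is the next return if long, zero if flat.
        reward = float(self.returns[self.t]) if action == 1 else 0.0
        self.t += 1
        terminated = self.t >= self.n_steps - 1
        return self.returns[[self.t]], reward, terminated, False, {}


# Train a PPO policy on the toy environment; careful evaluation over many
# seeds (e.g. with rliable) is where the real work starts.
model = PPO("MlpPolicy", ToyTradingEnv(), verbose=0, seed=0)
model.learn(total_timesteps=10_000)
```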
- NeRFs for 3D reconstruction
- I am playing around with MLX and spending time implementing some NeRF-based algorithms.
- The reason I am using MLX (Apple’s array framework for Apple silicon) is that I think the unified memory model (i.e. no need to move tensors between the CPU and GPU) makes it a great tool for ML.
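- As a small illustration of that point (not my actual NeRF code), the sketch below shows the unified-memory style of MLX: arrays are created once, computation is lazy until mx.eval(), and you can target the CPU or GPU per operation without copying data between devices.

```python
import mlx.core as mx

# Arrays live in unified memory, so the same array can be used by CPU and GPU
# kernels without an explicit transfer (no .to(device) / .cuda() dance).
a = mx.random.normal((1024, 1024))
b = mx.random.normal((1024, 1024))

# Operations are lazy; mx.eval() forces the computation to actually run.
c = a @ b
mx.eval(c)

# You can pick the device per operation via a stream, still without copying data.
d = mx.sum(c, stream=mx.cpu)
mx.eval(d)
print(d.item())
```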
- Generative AI apps with NextJS, Supabase, and Generative AI APIs
- For example, I once built (but never published) an app where users could create puzzle games similar to NYT’s Connections with the aid of Generative AI (mostly Gemini). In my opinion this is a great use case for Generative AI, as coming up with new puzzles on your own is hard, but fun with a little help. Users could share and play these games with friends, see leaderboards, popular games, etc.
- Building open-source datasets
- I once built an object detection dataset called “Object Detection: Batteries, Dice, and Toy Cars” - it’s on Kaggle, and you can find it here
- This is an older project and I lacked some experience in ML back then, but I think it still holds up
I have a bunch of smaller projects on my GitHub.
You can reach me at my email (mark.antal.csizmadia@gmail.com) or LinkedIn (linkedin.com/in/mark-antal-csizmadia).