Hi, I'm
Dustin
Data Scientist
MSc Computer Science @ ETH Zürich

About

Welcome to my portfolio!

I recently completed my Master's in Computer Science at ETH Zürich, specializing in Machine Intelligence and Data Management. I'm passionate about software engineering, statistics, and machine learning, and excited by the potential of current advances in AI.

I've contributed to forecasting, data engineering, and machine learning projects across industries such as logistics and energy, including open-source contributions to the Darts time series forecasting library. I enjoy turning complex, real-world data into business value through robust, practical solutions.

Driven by curiosity and a desire to better understand the world, I am eager to embrace new challenges. I'm open to roles in data science, machine learning, and quantitative research across various industries. Feel free to get in touch.

Skills

Programming Languages

  • Python
  • R
  • C++
  • Java
  • TypeScript
  • Haskell

Frameworks & Libraries

  • Pandas
  • NumPy
  • scikit-learn
  • PyTorch
  • XGBoost
  • LightGBM
  • CatBoost
  • Matplotlib
  • Plotly
  • SciPy
  • Flask
  • Optuna
  • PySpark
  • Hugging Face
  • SHAP
  • statsmodels
  • wandb

Data Management

  • SQL
  • Spark
  • Hadoop
  • HBase
  • MongoDB
  • Neo4j

DevOps & Tooling

  • Docker
  • Git
  • CI/CD
  • GitHub Actions
  • Jenkins
  • Bash
  • pre-commit
  • mypy
  • Confluence
  • Notion
  • Jira
  • Slurm

Focus Areas

  • Forecasting
  • Visualization
  • Explainability
  • Causal Inference
  • Natural Language Processing (NLP)
  • Generative AI (GenAI)

Languages

  • German (Native)
  • English (Proficient)
  • French (Elementary)

Experience

Research Assistant | Oracle Labs

Feb 2025 - Present | Zürich, Switzerland
  • Building an LLM-based agent for automated data retrieval, analysis, and report generation, with the goal of reducing decision-making time by 80%.

Data Scientist | Unit8

Sep 2021 - Aug 2022 | Zürich, Switzerland
  • Developed and implemented a route planning tool using PySpark and Palantir Foundry, optimizing logistics for a newspaper company.
  • Core contributor to the Darts library: optimized model efficiency, reviewed pull requests, suggested enhancements, and improved documentation.
  • Collaborated with clients to deliver high-impact data solutions in the energy and manufacturing sectors.

Projects

Recommender Systems for Swiss Politics

Master's Thesis (Feb 2024 - Aug 2024)

Identified 11 vulnerabilities in the Swiss voting advice application Smartvote, some of which allow visibility gains of more than 3.5x for individual parties. Proposed 10 mitigations to significantly reduce or eliminate these vulnerabilities. Findings are being adopted in Smartvote's redesign for the next elections.

Python, Pandas, D-Tale, SciPy, Optuna, Plotly, LaTeX, Notion
RSFP Manipulation

DataComp Challenge

Semester Project (Sep 2023 - Dec 2023)

Ranked 4th out of 12 teams in the small track of the DataComp Challenge, an ML benchmark in which participants filter a CommonCrawl image-text dataset to train a CLIP model that is then evaluated on 38 zero-shot downstream tasks. The submission combined cross-modality filtering with content alignment.

Python, PyTorch Lightning, SLURM (Cluster), CLIP (Contrastive Language-Image Pretraining)
DataComp Workflow

BasketXplainer

Interactive ML Project (Feb 2023 - Jul 2023)

Developed an interactive dashboard to predict basketball game outcomes based on in-game stats and explain predictions using SHAP values. Users could modify team statistics to explore what-if scenarios.

Python, scikit-learn, SHAP, Flask, JavaScript, React, GitLab Pipelines
BasketXplainer Prediction

Darts Forecasting Library

Open-Source Contributor (Sep 2021 - Aug 2022)

Core contributor to the open-source time series forecasting library Darts by Unit8. Optimized the most popular regression forecasting models by vectorizing computations, achieving a speedup of up to 400x.

Python, PyTorch, scikit-learn, Matplotlib, Git, GitHub Actions
pip install darts
Darts Logo
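
To give a flavor of what "vectorizing computations" means here, below is a minimal sketch and not the actual Darts code. It builds the lagged feature matrix for a regression forecaster twice: once with a Python loop and once with NumPy window views. The function names and the toy series are hypothetical, chosen only for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def lagged_features_loop(series: np.ndarray, n_lags: int):
    """Build one training row per time step with a Python loop (slow baseline)."""
    rows, targets = [], []
    for t in range(n_lags, len(series)):
        rows.append(series[t - n_lags:t])  # the n_lags values preceding step t
        targets.append(series[t])          # the value to predict at step t
    return np.array(rows), np.array(targets)


def lagged_features_vectorized(series: np.ndarray, n_lags: int):
    """Build the same matrix in one shot using strided window views."""
    # Each window holds n_lags inputs followed by the target value.
    windows = sliding_window_view(series, n_lags + 1)
    return windows[:, :-1], windows[:, -1]


# Hypothetical toy series, used only to check that both versions agree.
series = np.arange(10, dtype=float)
X_loop, y_loop = lagged_features_loop(series, n_lags=3)
X_vec, y_vec = lagged_features_vectorized(series, n_lags=3)
```

Both functions return identical arrays; the vectorized version avoids per-row Python overhead, which is where speedups of this kind typically come from.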

Distance Preserving Graph Embedding

Bachelor's Thesis (Feb 2021 - Aug 2021)

Developed a model that enables constant-time approximate shortest path distance queries on road networks, achieving an average mean relative error of less than 10%.

Python, NetworkX, PyTorch Geometric, (Hyperbolic) Graph Convolutional Networks ((H)GCNs)
Winterthur, Switzerland
Surat, India
Dongguan, China
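
The core idea of a distance-preserving embedding can be sketched in a few lines. This is an illustration under simplifying assumptions, not the thesis code: it assumes a Euclidean embedding space (the thesis also explored hyperbolic variants), and the embeddings and true distances below are made-up toy values. A query only touches the two node embeddings, so its cost depends on the embedding dimension and not on the size of the road network.

```python
import math

# Hypothetical toy embeddings for three nodes (illustrative values only).
embeddings = {"a": (0.0, 0.0), "b": (1.0, 0.0), "c": (3.0, 0.5)}

# Made-up true shortest-path distances for the same toy graph.
true_distances = {("a", "b"): 1.0, ("b", "c"): 2.0, ("a", "c"): 3.0}


def approx_distance(u: str, v: str) -> float:
    """Answer a distance query from the two embeddings alone, with no graph search."""
    return math.dist(embeddings[u], embeddings[v])


def mean_relative_error(pairs: dict) -> float:
    """Average of |approx - true| / true over all given node pairs."""
    errors = [abs(approx_distance(u, v) - d) / d for (u, v), d in pairs.items()]
    return sum(errors) / len(errors)


mre = mean_relative_error(true_distances)
```

In the real setting the embeddings are learned by a graph neural network, and the mean relative error is measured against exact shortest-path distances on the road network.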

Education

ETH Zürich

Master of Science in Computer Science

2022 - 2024

Major: Machine Intelligence | Minor: Data Management

GPA: 5.46 / 6.0

ETH Zürich

Bachelor of Science in Computer Science

2018 - 2021

Hobbies

Sports

  • Squash
  • Badminton
  • Gym
  • Padel Tennis
  • Table Tennis
  • Volleyball
  • Tennis

Interests

  • Photography
  • Music
  • Chess
  • Geoguessr
  • Geography
  • Coding
  • History
  • Philosophy

Social Activities

  • Board Game Night with Friends
  • Good Conversations over Dinner
  • Exploring Nature and Hiking