- Open Source Contributions
- Side Projects
- Papers (Unpublished Manuscripts)
- Professional Projects
- Academic Projects
- Apple MLX Deep Learning Framework:
-
Description: Collaborated with Apple researchers to add utility functions (e.g., atleast_nd) to Apple’s MLX framework, enabling NumPy-style shape handling for flexible model and data pipeline construction.
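To illustrate the kind of NumPy-style shape handling described above, here is a minimal sketch in plain NumPy. It is not the MLX code: the function name `atleast_nd` and its signature are hypothetical, and prepending size-1 axes is an assumed convention (NumPy's own `atleast_3d` places the new axes differently).

```python
import numpy as np


def atleast_nd(x, n):
    """Return x as an array with at least n dimensions.

    Hypothetical helper for illustration: missing dimensions are added
    as leading size-1 axes. Inputs that already have n or more
    dimensions are returned unchanged.
    """
    arr = np.asarray(x)
    if arr.ndim >= n:
        return arr
    # Prepend (n - ndim) axes of length 1 in front of the existing shape.
    return arr.reshape((1,) * (n - arr.ndim) + arr.shape)


if __name__ == "__main__":
    print(atleast_nd(5.0, 2).shape)
    print(atleast_nd(np.ones((2, 3)), 4).shape)
```

A helper like this lets downstream code assume a fixed rank (for example, a batch axis) regardless of whether the caller passed a scalar, a vector, or a batch.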
- PyTorch (ExecuTorch):
-
Description: Collaborated with Meta engineers to update the ExecuTorch build process to enable quantized and optimized TorchAO kernels on macOS and iOS devices.
-
GradLite
- Description: NumPy-based autograd engine built from scratch. Implements standard ops, backpropagation, optimizers, and nn modules.
- Code
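As a rough illustration of how a NumPy-based autograd engine works, the sketch below records a computation graph for addition and multiplication and runs reverse-mode backpropagation over it. It is not GradLite's actual API: the `Tensor` class, its attributes, and the restriction to same-shape operands (no broadcasting) are assumptions made for brevity.

```python
import numpy as np


class Tensor:
    """Hypothetical minimal tensor that tracks gradients."""

    def __init__(self, data, parents=()):
        self.data = np.asarray(data, dtype=float)
        self.grad = np.zeros_like(self.data)
        self.parents = parents      # tensors this one was computed from
        self.backward_fn = None     # pushes self.grad into the parents

    def __add__(self, other):
        out = Tensor(self.data + other.data, (self, other))

        def backward_fn(g):
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += g
            other.grad += g

        out.backward_fn = backward_fn
        return out

    def __mul__(self, other):
        out = Tensor(self.data * other.data, (self, other))

        def backward_fn(g):
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += g * other.data
            other.grad += g * self.data

        out.backward_fn = backward_fn
        return out

    def backward(self):
        # Build a topological order so each node's gradient is complete
        # before it is propagated to its parents.
        order, seen = [], set()

        def visit(t):
            if id(t) not in seen:
                seen.add(id(t))
                for p in t.parents:
                    visit(p)
                order.append(t)

        visit(self)
        self.grad = np.ones_like(self.data)  # seed: d(out)/d(out) = 1
        for t in reversed(order):
            if t.backward_fn is not None:
                t.backward_fn(t.grad)


if __name__ == "__main__":
    x = Tensor([2.0, 3.0])
    y = Tensor([4.0, 5.0])
    z = x * y + x
    z.backward()
    print(x.grad, y.grad)
```

Optimizers and module abstractions can then be layered on top of the accumulated `.grad` values.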
-
Custom PyTorch Kernels
- Description: A simple implementation of custom MLP kernels for PyTorch, with both a CUDA kernel and a Triton kernel for the linear + ReLU operation on a GPU.
- Code
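For reference, the operation such kernels compute can be written on the CPU in a few lines of NumPy. This is only a reference sketch for checking a kernel's output, not the repository's CUDA or Triton code; the function name `linear_relu_reference` is hypothetical, and the `(out_features, in_features)` weight layout is an assumption that follows PyTorch's `nn.Linear` convention.

```python
import numpy as np


def linear_relu_reference(x, weight, bias):
    """Compute relu(x @ weight.T + bias) on the CPU.

    x:      (batch, in_features)
    weight: (out_features, in_features), assumed nn.Linear layout
    bias:   (out_features,)
    """
    return np.maximum(x @ weight.T + bias, 0.0)


if __name__ == "__main__":
    x = np.array([[1.0, -2.0], [-1.0, 2.0]])
    weight = np.array([[1.0, 0.0], [0.0, -1.0]])
    bias = np.array([0.5, 0.0])
    print(linear_relu_reference(x, weight, bias))
```

A GPU kernel's output can be compared against this function with `np.allclose`, using a tolerance suited to the kernel's precision.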
-
MuZero Chess
- Description: Custom chess environment integrated into the MuZero algorithm repo.
- Code
-
Graph Convolutional Networks
- Description: Implemented GCNs and gated attention mechanisms for learning graph structure.
- Code
-
Transformer Network
- Description: Built a basic transformer using PyTorch Lightning to learn the architecture.
- Code
-
Snake Bot
- Description:
Trained a snake-playing bot using Deep Q-Learning, PPO, and REINFORCE.
- Code
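Of the three algorithms listed, REINFORCE is the simplest to sketch: it weights each action's log-probability by the discounted return that followed it. The helper below computes those returns for one episode. It is an illustration only, not the project's code; the function name `discounted_returns` and the default discount factor `gamma` are assumptions.

```python
def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    # Walk backwards so each step reuses the return of the step after it.
    for t in range(len(rewards) - 1, -1, -1):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


if __name__ == "__main__":
    # Three steps with reward 1 each and a discount factor of 0.5.
    print(discounted_returns([1.0, 1.0, 1.0], gamma=0.5))
```

In a policy-gradient update these returns would multiply the negative log-probabilities of the chosen actions, often after subtracting a baseline to reduce variance.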
-
Towards Neural Ranking for Mixed-Initiative Conversational Search
Abstract: Recent research has shown that integrating clarifying questions and answers into ranking models offers the potential to better understand users’ information needs and improve document ranking. However, previous approaches have used only naive ranking models (i.e., QL and BM25), and neural rankers remain unexplored. At the same time, neural ranking models dominate leaderboards for single-shot query tasks and bring interesting features that should also be advantageous in a conversational setup. In this work, we explore how neural rankers can be extended to effectively represent clarifying questions and answers in addition to the initial user query. To this end, we first try to extend the conventional neural ranking models ConvKNRM and PACRR by naively aggregating FastText word embeddings. We then investigate whether contextualized word embeddings given by BERT are able to incorporate clarifying questions and answers more effectively and outperform these baselines. Lastly, we analyze how our models perform on different answer polarities (affirmation, negation, “I don’t know”, and other).
-
Transparency in Deep Learning Using Hierarchical Prototypes
Abstract: Neural Networks are highly effective and widely used algorithms for classification. However, it is often difficult to interpret why these models make certain predictions. Earlier work uses the notion of prototype layers to allow for easier, visual interpretation of the network’s predictions. We extend this prototype model with a hierarchical prototype model, introducing sub- and superprototype layers. These layers enable the model to visualize a number of superprototypes equal to the number of superclasses while simultaneously allowing the model to infer and visualize the latent subclasses present in the data. This extension does not sacrifice accuracy, achieving 99% accuracy on an MNIST classification task. We have found that this model is indeed able to find and visualize general superprototypes and more specific subprototypes. Ultimately, we argue that this model can also be used in the pipeline of debiasing data and subsequent predictions.
-
Booking forecasting and dynamic pricing for hotels and tourist attractions
-
Description: Forecasted hotel and tourist attraction bookings in a highly dynamic setting. Built automated ML pipelines for preparation, training, scoring, and reporting. Implemented dynamic pricing using custom models, AutoML, and distributed infrastructure.
-
Key topics: Time series forecasting, ML infra, RNNs, Darts, PyTorch, gradient boosting, Spark, SQL, Bayesian Optimization
-
Code not publicly available
-
Hand animation projection
-
Fillable field detector for forms
-
Icelandic smart assistant voice commands
-
Description: Integrated voice commands into the Icelandic smart assistant Embla. Built on top of the Greynir NLP engine.
-
Thesis: Semantic Segmentation Under Realistic Constraints
-
Description: Investigated mixing-based semi-supervised learning for segmentation with inconsistent, unbalanced, sparse annotations. Optimized for edge devices with FC-HarDNet, new loss weighting strategies, and realistic image mixing.
-
Collaboration with Marel
-
Thesis/code not publicly available
-
2D to 3D facial reconstruction
- Description: Reconstructed 3D faces from 2D images using PCA and energy minimization.
- Code
-
Structure-from-motion
- Description: Built an SfM system for recovering 3D structures from 2D image sequences.
- Code
-
N-step bootstrapping in actor-critic methods
-
Subreddit sentiment analysis visualization
- Description: Built a visual analytics tool to analyze mobilizations between communities with network and sunburst views.
- Code
- Final Report
- Video