Hi there! I'm a data scientist and researcher.
You can learn more about my research interests on my homepage.
I am a significant contributor to FlexEval, a Python package for evaluating the performance of large language models.
Here are a few of my research prototypes:
- llm-math-education: a Python package implementing retrieval-augmented generation for middle-school math tutoring.
- HealthBlogRec: a prototype deep learning recommender system for health blogs.
- ALSim: a simulation library that implements several state-of-the-art active learning algorithms.
- wiki-ores-feedback: a web app and auditing analysis for vandalism detection classifiers on Wikipedia.
And here are some standalone analyses:
- covid-data-analysis: Exploratory modeling with COVID-19 patient data. Explainable logistic regression models with Pandas, statsmodels, scikit-learn, and pytest.
- biomarker-case-study: High-performance binary classification with patient demographic and biomarker data. Gradient Boosting Machines with Pandas and scikit-learn.