Hello, I'm Joel M

AI enthusiast & Data Engineer | M.S. in Artificial Intelligence @ Northeastern University

I like AI/ML research, enterprise-scale data engineering, building systems in deep learning, NLP, and computer vision. Please visit: https://mjsushanth.github.io/

Portfolio:

Do visit my portfolio here

Study Notes:

My study notes are categorized and placed in here!
Credits to Obsidian, such a perfect app for taking notes. Here, you can expect to find deep-math diving, clear mental models, intuition, project-research on whatever my work has produced.

Research & Academic Projects

FinRAG/FinSights: Production-Grade Financial Intelligence System — Hybrid dual-path architecture combining structured queries (DuckDB/SQL dimension tables) with semantic retrieval. Processes 72M→1M sentences via stratified sampling with temporal weighting across regulatory eras. Check this out!
- Data Engineering Pipeline: DuckDB stratified sampling (30+ SQL scripts) with weighted multi-objective scoring, fuzzy-matched integrations, conditional temporal stratification.
- Advanced RAG Engineering: Sentence-level embeddings, multi-query expansion with window-hopping retrieval, citation provenance via document headers for exact traceability. Polars/Parquet logging, serverless-ready architecture. Achieves $0.017-0.025/query cost, no managed DB overhead.
Text-to-Pose Diffusion: Built a CLIP-conditioned diffusion model with cross-attention + anatomical loss for 3D pose generation.
- Has deeply researched concepts on Motion/3D data: (pose representation, N-joint hierarchical mapping, kinematic chains, pelvis-spine-extremity validation) and the architecture of Hybrid CNN-Transformer Diffusion, CLIP Semantic Encoding & Projection, Dual-Pass CFG and Anatomical Constraint Enforcement. See Report here., See Design here.
Multi-View 3D Scene Analysis: Created a 10k+ LOC pipeline with MV scene analysis, pose-guided filtering, occlusion handling, and RANSAC validation on ETH3D. See Design Flow.
Protein Structure Prediction: Implemented HMM, CRF, BiLSTM; CRF reached 67% accuracy on CB513 using evolutionary + context features. See Report here.
SocrAItic Circle: Multi-Agent Debate LLMs workflow, designed with multi-phase debate cycles, iterative refinement, YAML-driven orchestration, and judge modules.
Artist Classification: Compared SVM-SIFT-BoVW, CAEs, VAEs, and CNNs; SVM achieved 89% accuracy on 50-class dataset.

Other works:

Multi-vector ViT+CLIP with LoRA and ColBERT-style MaxSim retrieval Demo Notebook.
An example workflow of ML-Serving using Gitub CI/CD and AWS Lambda, SAM Infrastructure. Src Code. , Notes here. Study notes.
Usage of Optuna and MLFlow using a synthetic time-series generator Src Code.

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Hello, I'm Joel M

Portfolio:

Study Notes:

Research & Academic Projects

Other works:

📫 Connect with Me

About

Uh oh!

Releases

Packages

mjsushanth/mjsushanth

Folders and files

Latest commit

History

Repository files navigation

Hello, I'm Joel M

Portfolio:

Study Notes:

Research & Academic Projects

Other works:

📫 Connect with Me

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Packages