I am an engineer who researches and builds LLM- and RAG-based services. I focus on understanding model architectures deeply through paper reviews and PyTorch implementations, and on creating practical Agentic AI services that run in real-world environments.
- Developed and optimized a Whisper-based STT model for pronunciation and intonation assessment, and designed a RAG pipeline using LangChain/LangGraph; the project won the Excellence Award (2nd place) at the SSAFY Specialized Project competition. (A minimal transcription sketch follows this list.)
- Built a vector database by crawling AniList tags and implemented a RAG-based recommendation and Q&A system driven by personal-preference analysis, covering the full cycle of data cleansing, embedding design, and personalized-service planning. (See the vector-DB sketch after this list.)
- Built a Supervised Fine-Tuning (SFT) dataset from MCP documents and designed an Agentic AI flow that detects the user's environment from input, automates per-OS setup, and experiments with tool-calling optimization. (A dispatch sketch follows this list.)
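For context on the STT project above, here is a minimal sketch of the transcription step using the Hugging Face `transformers` ASR pipeline; the checkpoint name and audio path are placeholders, and the pronunciation/intonation scoring itself is out of scope.

```python
# Minimal Whisper transcription sketch (checkpoint and audio path are
# placeholders; requires the `transformers` and `torch` packages).
from transformers import pipeline

# Load a Whisper checkpoint as an automatic-speech-recognition pipeline.
asr = pipeline(
    "automatic-speech-recognition",
    model="openai/whisper-small",  # placeholder; any Whisper size works
)

# Transcribe a local audio file; downstream pronunciation/intonation
# scoring would consume the decoded text.
result = asr("sample_utterance.wav")
print(result["text"])
```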
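Similarly, a minimal sketch of the vector-DB step behind the AniList recommender, using `chromadb`'s in-memory client; the collection name, documents, and metadata are illustrative, not the project's real data.

```python
# Minimal vector-DB build/query sketch with ChromaDB (all data illustrative).
import chromadb

client = chromadb.Client()  # in-memory; a real service would persist to disk
collection = client.create_collection("anilist_tags")

# Each crawled title is stored as a tag document; Chroma embeds it with its
# default embedding function.
collection.add(
    ids=["a1", "a2"],
    documents=["action, time travel, thriller", "slice of life, music, school"],
    metadatas=[{"title": "Show A"}, {"title": "Show B"}],
)

# A free-text preference query returns the nearest titles, which a RAG
# recommender would pass to the LLM as retrieved context.
hits = collection.query(query_texts=["dark time-travel thriller"], n_results=1)
print(hits["metadatas"])
```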
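And a minimal sketch of the environment-detection idea behind the Agentic AI flow; the `SETUP_TOOLS` registry and its commands are hypothetical stand-ins for the project's actual tool-calling setup.

```python
# Environment-detection + tool-dispatch sketch (registry contents are
# hypothetical; only the routing pattern reflects the design).
import platform

def detect_environment() -> str:
    """Return a coarse OS label the agent can route on."""
    return platform.system()  # "Linux", "Darwin", or "Windows"

# Hypothetical registry mapping detected environments to setup tools; in the
# real flow the LLM's tool calls would resolve against something like this.
SETUP_TOOLS = {
    "Linux": lambda: print("run apt-based setup steps ..."),
    "Darwin": lambda: print("run Homebrew-based setup steps ..."),
    "Windows": lambda: print("run winget-based setup steps ..."),
}

def run_setup() -> None:
    """Dispatch the setup tool that matches the detected environment."""
    SETUP_TOOLS.get(detect_environment(), lambda: print("unsupported OS"))()

run_setup()
```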
| Category | Skills |
|---|---|
| AI & ML | PyTorch, TensorFlow, Hugging Face, RAG, Agentic AI, Tool Calling |
| Model Optimization | LoRA/QLoRA, Quantization (Unsloth), Distillation, KV Caching |
| Backend & MLOps | FastAPI, Django, Spring Boot, Docker, Nginx, CI/CD |
| Languages | Python, Java, JavaScript, C, C++ |
| Data & Infra | Vector DB (ChromaDB, FAISS), Git, Linux |
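To illustrate the LoRA/QLoRA entry above, a minimal adapter setup with Hugging Face `peft`; the base model and hyperparameters are placeholders rather than values from any listed project.

```python
# Minimal LoRA fine-tuning setup with Hugging Face `peft` (base model and
# hyperparameters are illustrative placeholders).
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # placeholder base model

config = LoraConfig(
    r=8,                        # low-rank dimension
    lora_alpha=16,              # adapter scaling factor
    lora_dropout=0.05,
    target_modules=["c_attn"],  # GPT-2's fused attention projection
    task_type="CAUSAL_LM",
)

# Wrap the base model; only the injected LoRA matrices are trainable.
model = get_peft_model(base, config)
model.print_trainable_parameters()
```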
- Excellence Award (2nd Place) | SSAFY Specialized Project, Daejeon
- Hands-on experience developing and deploying real-world AI services end to end.
- Mamba Architecture: Analyzing Selective State Space Models.
- MoLE Project: Implementing key concepts from the Mixture-of-Logit-Experts paper.
- ONNX Optimization: Experimenting with model optimization and lightweight inference (a minimal export sketch follows).
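A minimal version of the ONNX experiment: exporting a toy PyTorch module and running it with ONNX Runtime. The module and shapes are placeholders; real runs would export an actual model and compare latency.

```python
# Toy ONNX export + inference sketch (model and shapes are placeholders).
import numpy as np
import torch
import onnxruntime as ort

model = torch.nn.Linear(16, 4).eval()  # stand-in for a real network
dummy = torch.randn(1, 16)

# Export with a dynamic batch axis so the graph accepts any batch size.
torch.onnx.export(
    model, dummy, "model.onnx",
    input_names=["input"], output_names=["output"],
    dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
)

# Run the exported graph with ONNX Runtime for lightweight CPU inference.
session = ort.InferenceSession("model.onnx")
out = session.run(None, {"input": np.random.randn(2, 16).astype(np.float32)})
print(out[0].shape)  # (2, 4)
```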
(This is a comprehensive list of papers I have reviewed and studied.)
Mixture-of-Experts & Advanced Architectures
- Mixture-of-Experts with Expert Choice Routing (Zhou et al., 2022)
- Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer (Shazeer et al., 2017)
- Mixture-of-Logit-Experts for Open-Domain Dialogue Generation (Shen et al., 2021)
Foundation Models & Architecture
- Attention Is All You Need (Vaswani et al., 2017)
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding (Devlin et al., 2019)
- GPT-2: Language Models are Unsupervised Multitask Learners (Radford et al., 2019)
- RoBERTa: A Robustly Optimized BERT Pretraining Approach (Liu et al., 2019)
Retrieval-Augmented Generation (RAG) & Knowledge Systems
- Retrieval-Augmented Generation for Knowledge-Intensive NLP (Lewis et al., 2020)
- Dense Passage Retrieval for Open-Domain Question Answering (Karpukhin et al., 2020)
- Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (Asai et al., 2023)
- RAFT: Adapting Language Model to Domain Specific RAG (Zhang et al., 2024)
- Lost in the Middle: How Language Models Use Long Contexts (Liu et al., 2023)
Efficient AI & Model Optimization
- LoRA: Low-Rank Adaptation of Large Language Models (Hu et al., 2022)
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter (Sanh et al., 2019)
- Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (Jacob et al., 2018)
Speech & Multimodal AI
- OpenAI Whisper: Robust Speech Recognition via Large-Scale Weak Supervision (Radford et al., 2022)
- Wav2Vec 2.0: A Framework for Self-Supervised Learning of Speech Representations (Baevski et al., 2020)
- CLIP: Learning Transferable Visual Models from Natural Language Supervision (Radford et al., 2021)