Interactive NLP Learning Platform for Beginners
Live Demo
ThinkNLP is an educational web application designed to help beginners in Natural Language Processing (NLP) understand the full pipeline of sentiment and topic analysis using real-world review data. It provides a step-by-step, no-code interface to interactively explore how NLP models work.
- Full NLP pipeline walkthrough
- Upload and process real review data
- Compare sentiment models (VADER, TextBlob, BERT)
- Topic modeling with LDA and interactive visualization (pyLDAvis)
- Manual and auto topic labeling
- Sentiment distribution per topic
- Beginner-friendly UI with visual explanations
Upload Review File
- Supports CSV input, stored in AWS S3 (gzip compressed)
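As a rough illustration of the compression step, the sketch below serializes review rows to CSV and gzip-compresses them using only the standard library. The function names `compress_csv` and `decompress_csv` and the `review` column are hypothetical, and the actual upload to S3 (for example via boto3's `put_object`) is left out; this is not necessarily how ThinkNLP implements it.

```python
import csv
import gzip
import io


def compress_csv(rows: list[dict[str, str]]) -> bytes:
    """Serialize rows to CSV text and gzip-compress the result."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0].keys()))
    writer.writeheader()
    writer.writerows(rows)
    return gzip.compress(buffer.getvalue().encode("utf-8"))


def decompress_csv(payload: bytes) -> list[dict[str, str]]:
    """Reverse of compress_csv: gunzip and parse the CSV back into rows."""
    text = gzip.decompress(payload).decode("utf-8")
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical sample data; the real column names depend on the uploaded file.
reviews = [
    {"review": "Great room and friendly staff"},
    {"review": "The wifi was slow"},
]
payload = compress_csv(reviews)      # bytes that could be stored as an S3 object
restored = decompress_csv(payload)   # round-trips back to the original rows
```

Compressing before upload mainly trades a little CPU time for smaller objects and faster transfers, which tends to pay off for text-heavy review files.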
Data Cleaning
- Normalization: Lowercasing, typo correction
- Special character removal: Stripping special characters, numbers, and emoji
- Tokenization: Breaking sentences into words
- Stopword removal: Removing common words that do not provide useful context
- Lemmatization: Converting words to their base forms
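The cleaning steps above can be sketched with the standard library alone. This is a simplified illustration, not ThinkNLP's actual code: the `clean_review` function and the tiny `STOPWORDS` set are hypothetical, and typo correction and lemmatization are omitted because they normally rely on an external library (for instance NLTK's `WordNetLemmatizer`).

```python
import re

# Hypothetical, deliberately tiny stopword list; a real pipeline would use a fuller one.
STOPWORDS = {"the", "a", "an", "is", "was", "and", "to", "of", "it", "in"}


def clean_review(text: str) -> list[str]:
    """Lowercase, strip non-letters, tokenize on whitespace, drop stopwords."""
    lowered = text.lower()
    # Replace everything except a-z and whitespace: removes digits, punctuation, emoji.
    letters_only = re.sub(r"[^a-z\s]", " ", lowered)
    tokens = letters_only.split()
    return [token for token in tokens if token not in STOPWORDS]


tokens = clean_review("The room was GREAT!!! 10/10 😀 and the staff is friendly.")
print(tokens)  # ['room', 'great', 'staff', 'friendly']
```

Note that the regular expression also discards accented and non-Latin letters, which is acceptable for English-only reviews but would need revisiting for other languages.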
EDA (Exploratory Data Analysis)
- Word clouds, frequency charts, sentence length plots
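The numbers behind a frequency chart or a sentence length plot are simple counts. The sketch below computes them from already-cleaned token lists; the helper names and sample data are hypothetical, and plotting is left to whichever charting library the frontend uses.

```python
from collections import Counter


def word_frequencies(token_lists: list[list[str]], top_n: int = 3) -> list[tuple[str, int]]:
    """Count every token across all reviews and return the top_n most common."""
    counts = Counter(token for tokens in token_lists for token in tokens)
    return counts.most_common(top_n)


def sentence_lengths(token_lists: list[list[str]]) -> list[int]:
    """Number of tokens per review, the raw data for a length histogram."""
    return [len(tokens) for tokens in token_lists]


# Hypothetical cleaned reviews (output of a cleaning step).
cleaned = [
    ["room", "clean", "staff"],
    ["staff", "friendly"],
    ["room", "small", "staff"],
]
top_words = word_frequencies(cleaned, top_n=2)
lengths = sentence_lengths(cleaned)
print(top_words)  # [('staff', 3), ('room', 2)]
print(lengths)    # [3, 2, 3]
```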
Topic Modeling
- LDA with auto or manual topic count
- pyLDAvis visualization
Topic Labeling
- Manual, keyword-based, or auto-inferred labels
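One plausible reading of keyword-based labeling is to compare a topic's top words against a small vocabulary of candidate labels and pick the label with the largest overlap. The sketch below assumes exactly that; `LABEL_KEYWORDS` and `label_topic` are hypothetical names and the vocabulary is invented for illustration, so the real matching logic may differ.

```python
# Hypothetical label vocabulary: each label maps to keywords that suggest it.
LABEL_KEYWORDS = {
    "Cleanliness": {"clean", "dirty", "dust", "smell"},
    "Staff & Service": {"staff", "friendly", "service", "reception"},
    "Location": {"location", "walk", "station", "nearby"},
}


def label_topic(top_words: list[str], fallback: str = "Unlabeled") -> str:
    """Return the label whose keywords overlap most with the topic's top words."""
    best_label, best_overlap = fallback, 0
    for label, keywords in LABEL_KEYWORDS.items():
        overlap = len(keywords & set(top_words))
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label


label = label_topic(["staff", "friendly", "helpful", "clean"])
print(label)                               # Staff & Service
print(label_topic(["pool", "breakfast"]))  # Unlabeled
```

Topics with no keyword overlap fall back to a placeholder, which is where a manual label would be the natural next step.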
Sentiment Analysis
- VADER: Rule-based model
- TextBlob: Lexicon-based
- BERT: Transformer-based classifier (optional)
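Rule-based and lexicon-based models return a numeric score that still has to be turned into a label. The sketch below uses the cutoffs conventionally recommended for VADER's compound score (at least 0.05 is positive, at most -0.05 is negative); whether ThinkNLP uses these exact thresholds is an assumption, and `label_from_compound` is a hypothetical helper.

```python
def label_from_compound(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to Positive, Neutral, or Negative."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"


# Hypothetical scores, as a sentiment model might produce them.
labels = [label_from_compound(score) for score in (0.6, -0.3, 0.0)]
print(labels)  # ['Positive', 'Negative', 'Neutral']
```

Using the same mapping for every model's output makes their label distributions directly comparable, even though the underlying scores come from very different methods.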
Sentiment-Topic Mapping
- Each sentence is assigned a dominant topic
- Sentiment computed per topic
- Output: Sentiment distribution per topic (Positive, Neutral, Negative)
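The mapping described above can be sketched as follows: take the most probable topic for each sentence, then tally sentiment labels within each topic and normalize. The input shape (a list of topic probabilities plus a sentiment label per sentence) and the function names are assumptions made for illustration.

```python
from collections import Counter, defaultdict

SENTIMENTS = ("Positive", "Neutral", "Negative")


def dominant_topic(topic_probs: list[float]) -> int:
    """Index of the topic with the highest probability for one sentence."""
    return max(range(len(topic_probs)), key=lambda i: topic_probs[i])


def sentiment_by_topic(records: list[tuple[list[float], str]]) -> dict[int, dict[str, float]]:
    """Share of each sentiment label among sentences grouped by dominant topic."""
    counts: dict[int, Counter] = defaultdict(Counter)
    for topic_probs, sentiment in records:
        counts[dominant_topic(topic_probs)][sentiment] += 1
    result = {}
    for topic, counter in counts.items():
        total = sum(counter.values())
        result[topic] = {label: counter[label] / total for label in SENTIMENTS}
    return result


# Hypothetical per-sentence outputs: (topic distribution, sentiment label).
records = [
    ([0.7, 0.3], "Positive"),
    ([0.6, 0.4], "Negative"),
    ([0.2, 0.8], "Positive"),
    ([0.1, 0.9], "Positive"),
]
distribution = sentiment_by_topic(records)
print(distribution[0])  # {'Positive': 0.5, 'Neutral': 0.0, 'Negative': 0.5}
print(distribution[1])  # {'Positive': 1.0, 'Neutral': 0.0, 'Negative': 0.0}
```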
- Frontend: React + TanStack Query (Vercel)
- Backend: FastAPI + PostgreSQL (Docker, DigitalOcean)
- Storage: AWS S3 for file uploads
- Monitoring: BetterStack for logs and metrics
Diagram illustrating request flow, infrastructure, CI/CD, and observability for ThinkNLP.
- Gzip file compression
- Rate limiting & security headers
- Future: Background processing with Celery
- User authentication & file history
- Background task support (Celery + status UI)
- Expanded model selection and interpretability features
- Beginner tutorials and automatic result summaries
- Support for languages other than English
- NLP beginners and students
- Educators and instructors
- Developers interested in NLP and no-code tools
- Docker
- Python 3.11
Create a Python virtual environment if you haven't already:
python -m venv .venv
source .venv/bin/activate
Then run:
cd root_project
cp .env.example .env
make migrate # Set up the initial database schema
make up-local # Run the backend server
Other helpful Makefile commands:
make reset-all # Reset database and volumes
make logs-local # View backend logs
make lint # Run linter (ruff)
make test # Run backend tests with coverage
... # See the Makefile for more commands
git clone https://github.com/sokritha-dev/think-nlp-frontend.git
cd root_project
yarn install
yarn dev
think-nlp/
├── .github/workflows/ # GitHub Actions workflows
├── .vscode/ # VSCode editor settings
├── app/ # FastAPI application code
├── k8s/ # Kubernetes manifests
├── metric/ # Monitoring & metrics utilities
├── migrations/ # Alembic migration files
├── reports/ # Load test and analysis reports
├── scripts/ # Helper and automation scripts
├── .autoenv.zsh # Autoenv activation for Zsh
├── .dockerignore # Docker ignore rules
├── .env.sample # Example environment variables
├── .gitignore # Git ignore rules
├── .python-version # Python version pinning
├── Dockerfile # Production Dockerfile
├── Dockerfile.dev # Development Dockerfile
├── LICENSE # MIT License
├── Makefile # CLI automation for dev/test/deploy
├── alembic.ini # Alembic configuration
├── docker-compose.*.yml # Docker Compose files for different envs
├── locustfile.py # Locust load testing script
├── pytest.ini # Pytest config
├── requirements.txt # Production dependencies
└── requirements-dev.txt # Development dependencies
This project is licensed under the MIT License.
- Built using FastAPI, React, and pyLDAvis
- NLP components inspired by open-source models and tutorials