The project aims to develop an AI-based sentiment analysis system using customer reviews scraped from Booking.com.
This repository was developed as part of the course M. Grum: Advanced AI-Based Application Systems at the University of Potsdam. It showcases an end-to-end AI project covering data scraping, preparation, training, validation, and deployment using Docker and TensorFlow. The goal is an AI-driven system that is fully reproducible and can be used by researchers and evaluators.
- Forked the MarcusGrum/AI-CPS repository and modified it to fit our project needs.
- Maintained structured commits with meaningful messages (at least three per team member).
- Documented project ownership in this `README.md` file and clarified that this project is part of the Advanced AI-Based Application Systems course.
- Scraped relevant data from the web and stored it in `joint_data_collection.csv`.
- Preprocessed data, including outlier removal and normalization.
- Split data into:
  - `training_data.csv` (80% of the dataset)
  - `test_data.csv` (20% of the dataset)
  - `activation_data.csv` (single entry from the test dataset)
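
The preparation steps above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the project's actual script: the z-score threshold, the min-max scaling, the fixed seed, and all function names are assumptions made for the example, and the column to clean depends on the scraped schema, which is not documented here.

```python
import csv
import random
import statistics


def remove_outliers(rows, column, z_max=3.0):
    """Drop rows whose `column` value is more than z_max standard deviations from the mean."""
    values = [float(r[column]) for r in rows]
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return list(rows)  # all values identical, nothing to remove
    return [r for r in rows if abs(float(r[column]) - mean) / std <= z_max]


def min_max_normalize(rows, column):
    """Rescale `column` to the range [0, 1] and return new row dicts."""
    values = [float(r[column]) for r in rows]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for a constant column
    return [{**r, column: (float(r[column]) - lo) / span} for r in rows]


def split_rows(rows, train_fraction=0.8, seed=42):
    """Shuffle reproducibly, then split into training, test, and a single activation row."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    train, test = shuffled[:cut], shuffled[cut:]
    activation = test[:1]  # one entry taken from the test set
    return train, test, activation


def write_csv(path, rows):
    """Write a list of row dicts to `path` with a header line."""
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)
```

In use, the rows would be read from `joint_data_collection.csv` with `csv.DictReader`, passed through the two cleaning helpers for each numeric column, split, and written out with `write_csv` as `training_data.csv`, `test_data.csv`, and `activation_data.csv`.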
- Created two Docker images:
  - `learningBase_SentimentAnalysis`: Contains training data (`training_data.csv`) at `/tmp/learningBase/train/` and test data (`test_data.csv`) at `/tmp/learningBase/validation/`.
  - `activationBase_SentimentAnalysis`: Contains activation data (`activation_data.csv`) at `/tmp/activationBase/`.
- Based images on BusyBox for lightweight deployment.
- Documented dataset origin and licensing in `README.md` inside Docker images.
- Verified functionality using `docker-compose.yml` and mounted external volume `ai_system`.
- Developed an AI model using TensorFlow.
- Trained the model on `training_data.csv` and validated it using `test_data.csv`.
- Stored the trained model as `currentAiSolution.h5`.
- Saved training metrics, loss, and accuracy plots.
- Created visualization reports including:
  - Training and testing loss curves.
  - Diagnostic plots.
  - Scatter plots.
- Developed an Ordinary Least Squares (OLS) model using `statsmodels`.
- Trained and tested it using the same dataset for performance comparison.
- Stored the OLS model as `currentOlsSolution.pkl`.
- Documented performance using:
  - Diagnostic plots.
  - Scatter plots for analysis.
- Created two additional Docker images:
  - `knowledgeBase_SentimentAnalysis`: Contains the AI/OLS models at `/tmp/knowledgeBase/`.
  - `codeBase_SentimentAnalysis`: Provides activation data for AI inference.
- Documented ownership, course affiliation, model type, and AGPL-3.0 license in `README.md` inside Docker images.
- Published images on Docker Hub for accessibility.
- Developed `docker-compose.yml` files for:
  - Running the AI model using `knowledgeBase_SentimentAnalysis` and `activationBase_SentimentAnalysis`.
  - Running the OLS model using the same setup.
- Used external volume `ai_system` for managing temporary files.
- Ensured seamless model execution by mounting required paths.
git clone https://github.com/NoveraNasa/AI-Based-Sentiment-Analysis.git
cd AI-Based-Sentiment-Analysis

Course Information

This repository is created and maintained by Qazi Novera Tansue Nasa and Mustafa Wasif as part of the course 'M. Grum: Advanced AI-based Application Systems' by the Junior Chair for Business Information Science, esp. AI-based Application Systems at University of Potsdam.
📄 License

This project is licensed under the AGPL-3.0 license.