Stock Market Prediction
What is your project about?
My project is a stock market trend prediction system. It predicts whether a stock will go up
or down the next day, based on historical data and technical indicators.
We built a Flask web app where the user enters a stock symbol and date range; the app then
trains models and shows the prediction and ROC-AUC score.
Which technologies and tools did you use?
We used:
• Python as the programming language
• Flask for the web interface
• yfinance to fetch historical stock data
• scikit-learn and XGBoost for machine learning
• pandas and numpy for data preprocessing
Explain the overall architecture.
The system has three parts:
• Frontend: HTML page served by Flask, where the user enters stock symbol and date
range.
• Backend: Flask handles the request, fetches stock data, preprocesses it, trains
models, and predicts.
• Machine Learning: Multiple base models predict probabilities; then a meta-model
(XGBoost) combines these to give the final prediction.
What machine learning models did you use?
We used:
• Logistic Regression
• Support Vector Machine (SVM)
• K-Nearest Neighbors (KNN)
• Decision Tree
• Random Forest
All of these are base models. Their predictions are combined and fed into a meta-model: an XGBoost classifier.
What is a stacking ensemble?
Stacking is an ensemble technique where we train multiple base models, collect their
predictions, and train another model (meta-model) on those predictions. This usually
increases accuracy because the meta-model learns how to combine the strengths of base
models.
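The stacking setup described above can be sketched with scikit-learn's StackingClassifier on synthetic data. This is a minimal illustration, not the project's actual code: GradientBoostingClassifier stands in for XGBoost so the sketch needs only scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the engineered indicator features.
X, y = make_classification(n_samples=400, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

base_models = [
    ("lr", LogisticRegression(max_iter=1000)),
    ("svm", SVC(probability=True)),
    ("knn", KNeighborsClassifier()),
    ("dt", DecisionTreeClassifier(random_state=42)),
    ("rf", RandomForestClassifier(n_estimators=100, random_state=42)),
]

# The meta-model is trained on the base models' out-of-fold
# predicted probabilities (stack_method="predict_proba").
stack = StackingClassifier(
    estimators=base_models,
    final_estimator=GradientBoostingClassifier(random_state=42),
    stack_method="predict_proba",
    cv=5,
)
stack.fit(X_train, y_train)
proba = stack.predict_proba(X_test)[:, 1]  # P(up) for each test day
```

The `cv=5` inside StackingClassifier is what prevents the meta-model from seeing predictions made on data the base models were trained on.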
What features did you use for prediction?
We engineered technical indicators:
• Moving Averages (MA5, MA20)
• Relative Strength Index (RSI)
• Bollinger Bands (Upper, Lower)
• MACD and MACD Signal
• Volume change
These help capture trends and volatility in the stock price.
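As a rough sketch, the indicators above can be computed with pandas on a synthetic price series. The windows and formula variants here are common textbook choices and may not match the project's exact settings.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Synthetic close prices (random walk) and volumes.
df = pd.DataFrame({
    "Close": pd.Series(100 + rng.normal(0, 1, 200).cumsum()),
    "Volume": pd.Series(rng.integers(1_000, 5_000, 200).astype(float)),
})

# Moving averages
df["MA5"] = df["Close"].rolling(5).mean()
df["MA20"] = df["Close"].rolling(20).mean()

# RSI (14-day, simple-moving-average variant)
delta = df["Close"].diff()
gain = delta.clip(lower=0).rolling(14).mean()
loss = (-delta.clip(upper=0)).rolling(14).mean()
df["RSI"] = 100 - 100 / (1 + gain / loss)

# Bollinger Bands: 20-day mean +/- 2 standard deviations
std20 = df["Close"].rolling(20).std()
df["BB_Upper"] = df["MA20"] + 2 * std20
df["BB_Lower"] = df["MA20"] - 2 * std20

# MACD: EMA12 - EMA26, with a 9-day EMA signal line
ema12 = df["Close"].ewm(span=12, adjust=False).mean()
ema26 = df["Close"].ewm(span=26, adjust=False).mean()
df["MACD"] = ema12 - ema26
df["MACD_Signal"] = df["MACD"].ewm(span=9, adjust=False).mean()

# Day-over-day volume change
df["Volume_Change"] = df["Volume"].pct_change()
```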
How did you handle class imbalance?
We used SMOTE (Synthetic Minority Over-sampling Technique) on the training data to
balance the classes before training models.
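The project used the SMOTE implementation from imbalanced-learn; the core idea, generating synthetic minority samples by interpolating between a minority point and one of its nearest minority neighbours, can be sketched in plain NumPy. This is an illustration of the idea only, not the library's implementation.

```python
import numpy as np

def smote_like_oversample(X_min, n_new, k=5, seed=None):
    """Create n_new synthetic samples by interpolating between a random
    minority point and one of its k nearest minority neighbours
    (a simplified sketch of what SMOTE does)."""
    rng = np.random.default_rng(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.integers(len(X_min))
        # Distances from point i to every other minority point.
        d = np.linalg.norm(X_min - X_min[i], axis=1)
        neighbours = np.argsort(d)[1:k + 1]  # skip the point itself
        j = rng.choice(neighbours)
        lam = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(X_min[i] + lam * (X_min[j] - X_min[i]))
    return np.array(synthetic)

rng = np.random.default_rng(1)
X_min = rng.normal(0, 1, size=(20, 4))  # hypothetical minority class
X_new = smote_like_oversample(X_min, n_new=30, seed=1)
```

Because each synthetic point lies on a segment between two real minority points, the new samples stay inside the minority class's existing range.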
How do you evaluate the model?
We calculate ROC-AUC score on the test set, which shows how well the model distinguishes
between up and down days.
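A minimal example of the metric with scikit-learn, using toy labels and scores rather than project data:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

y_true = np.array([0, 0, 1, 1, 0, 1])                 # actual up/down days
y_score = np.array([0.1, 0.4, 0.35, 0.8, 0.2, 0.9])   # predicted P(up)

# Fraction of (up, down) pairs where the up day got the higher score:
# 8 of 9 pairs here, so AUC = 8/9 ≈ 0.889.
auc = roc_auc_score(y_true, y_score)
```

An AUC of 0.5 would mean the model ranks days no better than chance; 1.0 would mean every up day is scored above every down day.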
Which API/library did you use to get stock data?
We used yfinance to download historical stock data.
What is the prediction output?
The model predicts whether the stock will go up or down the next day, along with a probability score.
For example: “The stock will go up tomorrow! (Probability: 0.7251)”.
Why did you use Flask?
Flask is lightweight, easy to use, and perfect for small web apps where we need to combine
backend logic and a frontend page.
What are the limitations of your project?
• Limited to daily data (can’t do intraday prediction)
• Trained on a short date range each time (might overfit)
• No integration with live trading
How can you improve this project?
• Use deep learning models like LSTM for time series
• Add sentiment analysis from news
• Deploy on cloud with Docker
• Automate daily predictions
Why did you choose this topic?
Because stock market prediction is practical, challenging, and combines finance knowledge
with data science, which matches my interest.
What challenges did you face while developing this project, and how did you
overcome them?
Data Quality & Missing Data
• Challenge: Stock data often had missing values or sudden spikes.
• Solution: We handled missing values by using rolling averages and carefully choosing
technical indicators that are robust to small gaps.
Small Dataset / Overfitting Risk
• Challenge: Training on a short date range could cause overfitting.
• Solution: Used cross-validation (StratifiedKFold) and stacked ensemble which
generalizes better.
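A minimal sketch of stratified cross-validation with scikit-learn, on synthetic imbalanced toy data (not the project's actual pipeline):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Imbalanced toy data: roughly 70% of one class, 30% of the other.
X, y = make_classification(n_samples=300, n_features=6,
                           weights=[0.7, 0.3], random_state=0)

# StratifiedKFold keeps the up/down ratio the same in every fold,
# which matters when classes are imbalanced.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(LogisticRegression(max_iter=1000),
                         X, y, cv=cv, scoring="roc_auc")
```

Comparing the five fold scores gives a feel for how much the result depends on the particular train/test split, which is exactly the overfitting concern on a short date range.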
How did you fetch stock data using the yfinance API?
import yfinance as yf
# Example: Fetch data for Reliance Industries between two dates
stock_symbol = 'RELIANCE.NS'
start_date = '2024-01-01'
end_date = '2024-06-30'
# Download historical stock data
stock_data = yf.download(stock_symbol, start=start_date, end=end_date)
print(stock_data.head()) # Show first few rows
Explanation (in interview):
• We used the download() function from the yfinance library.
• Passed stock symbol, start date, and end date as parameters.
• It returns a Pandas DataFrame with columns like Open, High, Low, Close, Adj Close,
and Volume.
How do these models help in stock market prediction?
Logistic Regression
• Finds a linear relationship between features (like RSI, MACD, etc.) and the target (up or down).
• Works well as a baseline because it’s fast and interpretable.
Support Vector Machine (SVM)
• Captures non-linear patterns by using a kernel (RBF in our case).
• Helps when data is complex and not easily separable by a straight line.
K-Nearest Neighbors (KNN)
• Predicts based on the trend of nearby, similar historical days.
• Useful for capturing local patterns and relationships.
Decision Tree
• Builds simple rules like “if RSI > 70 and MACD > 0, then predict up”.
• Helps in handling non-linear and complex feature interactions.
Random Forest
• Combines many decision trees to reduce overfitting and increase accuracy.
• Captures complex patterns and is robust to noise.
In our project:
• Each model predicts its probability of the stock going up.
• Then a meta-model (XGBoost) learns to combine these predictions for better overall accuracy.
How does the meta-model (XGBoost) combine the predictions?
After training the base models (Logistic Regression, SVM, KNN, Decision Tree, Random Forest):
• Each base model predicts the probability that the stock will go up.
• For each day in the training set, we collect these probabilities.
• So, with 5 base models, each day now has 5 numbers (probabilities). These 5 probabilities become the new features.
• For example, instead of using RSI, MA5, etc., the meta-model uses: [0.67, 0.80, 0.55, 0.75, 0.60].
• These represent what each base model “thinks” about that day.
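The construction above can be sketched with scikit-learn's cross_val_predict, which yields out-of-fold probabilities for every training day. GradientBoostingClassifier stands in for XGBoost, and the data is synthetic, so this is a sketch of the technique rather than the project's code.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=300, n_features=8, random_state=7)

base_models = [
    LogisticRegression(max_iter=1000),
    SVC(probability=True),
    KNeighborsClassifier(),
    DecisionTreeClassifier(random_state=7),
    RandomForestClassifier(n_estimators=50, random_state=7),
]

# One column of out-of-fold P(up) per base model; each row is then
# the "five opinions" for one day, e.g. [0.67, 0.80, 0.55, 0.75, 0.60].
meta_X = np.column_stack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# The meta-model learns how to weigh those five opinions.
meta_model = GradientBoostingClassifier(random_state=7).fit(meta_X, y)
```

Using out-of-fold predictions is the important detail: if the base models predicted on the same data they were fitted on, the meta-features would be over-optimistic and the meta-model would overfit.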
Movie Recommendation System
1. Can you describe the algorithms you used in your movie recommendation
system?
Answer: I used Content-Based Filtering and cosine similarity. Content-Based Filtering
recommends movies based on similarities between a user’s preferences and movie
attributes (e.g., genre, cast). Cosine similarity measures the cosine of the angle
between two non-zero vectors, which helps to determine how similar movies are
based on attributes.
2. Why did you choose Content-Based Filtering over Collaborative Filtering?
Answer: I chose Content-Based Filtering because it works well even when there's
limited user data, avoiding the cold start problem that Collaborative Filtering faces
with new users or items. Content-Based Filtering focuses on analyzing the attributes
of movies a user has interacted with rather than relying on data from other users.
3. How did you handle missing data or sparse data in your movie dataset?
Answer: I addressed missing data by using median or mean imputation for numerical
features and filling categorical features with the most frequent values or marking
them as “unknown.” To avoid sparsity in the model, I set a threshold for minimum
user ratings, ensuring that recommendations were based on enough data points.
4. Can you explain how the cosine similarity algorithm works in your
recommendation system?
Answer: Cosine similarity measures the cosine of the angle between two vectors
representing movie attributes in a multidimensional space. If the angle is close to
zero, the movies are similar. This method is efficient for working with sparse datasets,
which are common in recommendation systems when users interact with only a
subset of available items.
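A tiny worked example with scikit-learn, using hypothetical one-hot genre vectors:

```python
import numpy as np
from sklearn.metrics.pairwise import cosine_similarity

# Toy attribute vectors: [action, comedy, drama, sci-fi]
movies = np.array([
    [1, 0, 0, 1],  # Movie A: action + sci-fi
    [1, 0, 0, 1],  # Movie B: action + sci-fi (identical profile)
    [0, 1, 1, 0],  # Movie C: comedy + drama
])

# Pairwise cosine similarity: identical vectors score 1.0,
# vectors with no shared attributes score 0.0.
sim = cosine_similarity(movies)
```

Because cosine similarity looks at the angle rather than the magnitude, a movie with many attributes and one with few can still score as highly similar if their attribute profiles point the same way.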
5. How did you preprocess the dataset before applying the recommendation
algorithms?
Answer:
Preprocessing involved:
• Data Cleaning: Removed duplicates and handled missing values.
• Normalization: Adjusted ratings to reduce bias from users who tend to rate
higher or lower than others.
• Feature Extraction: Extracted features like genre, cast, and director for
building the content-based filtering model.
• Text Preprocessing: Applied tokenization, stop-word removal, and stemming
for text fields like movie descriptions.
6. What is Content-Based Filtering, and how did you apply it in your movie
recommendation system?
Answer: Content-Based Filtering recommends items based on the characteristics of
previously liked items. In my system, I analyzed attributes like genres, actors, and
directors of the movies a user has rated highly, then recommended movies with
similar attributes, measured by cosine similarity.
7. How did you handle overfitting in your model?
Answer: I addressed overfitting by:
• Using cross-validation to ensure generalization to unseen data.
• Applying L2 regularization to penalize large coefficients and prevent the model
from over-relying on specific attributes.
• Considering dimensionality reduction techniques like Principal Component
Analysis (PCA) to focus on the most significant features.
8. How did you collect and process the movie data for your system?
Answer: I used a public dataset containing movie titles, genres, ratings, and user
interactions. The data was cleaned by removing duplicates and filling in missing
values. I extracted relevant features like genres and actors to create a feature matrix
for the recommendation engine.
9. What tools or libraries did you use to build your recommendation system?
Answer:
I built the system using Python. Key libraries included:
• Pandas for data processing.
• Scikit-learn for implementing cosine similarity and other algorithms.
• NumPy for mathematical operations.
• Matplotlib and Seaborn for data visualization and evaluation.
10. What are the main advantages of using Content-Based Filtering in a
recommendation system?
Answer: The main advantage is that Content-Based Filtering doesn’t rely on other
users' data, making it effective even with a small user base. It provides personalized
recommendations based on individual preferences and avoids issues like the cold start
problem.
11. What is the difference between Content-Based Filtering and Collaborative
Filtering?
Answer: Content-Based Filtering recommends items based on item attributes and the
user’s preferences, while Collaborative Filtering recommends items based on
interactions from similar users. Collaborative Filtering requires more user data and
often suffers from cold start problems, whereas Content-Based Filtering doesn’t.
12. From where did you get the API key?
Answer: I used the TMDb (The Movie Database) API to fetch real-time movie data,
such as movie details, ratings, and other attributes. TMDb provides developers with an
API key, which I integrated into my project to enrich the recommendation system with
up-to-date information.
13. Where did you source the dataset from?
Answer: I used a movie dataset from Kaggle as the primary data source. The dataset
contained essential information like movie titles, genres, and user ratings. Kaggle
offers various datasets for data science projects, and I selected one that was well
suited for building a recommendation system.
14. How did you integrate the TMDb API in your project?
Answer: "I used the TMDb API to fetch movie details like genres, cast, and ratings.
By making API requests, I was able to dynamically retrieve the data that was needed."
15. What challenges did you face in the Movie Recommendation System project?
• Challenge: "I encountered issues with missing data and difficulty fine-tuning content-based filtering algorithms."
• Solution: "I handled missing values with imputation and improved accuracy by refining
cosine similarity techniques for more personalized recommendations."
Portfolio Website
1. What was your goal in creating a portfolio website?
Answer: "The goal was to showcase my projects, skills, and achievements in a
user-friendly platform. It acts as an online resume that highlights my work, helping
potential employers or collaborators understand my technical abilities."
2. How did you ensure your portfolio website is user-friendly?
Answer: "I focused on simplicity and clarity by organizing the content logically. I
used responsive design principles, ensuring that the site works well across devices."
3. How do you update your portfolio website when new projects or achievements
are added?
Answer: "I structured the site in a way that allows easy updates. When I complete a
new project, I simply add it to the relevant section of the website by updating the
HTML and CSS. For deployment, I use Vercel to push updates smoothly."
4. What technologies did you use to build the portfolio website?
Answer: I used HTML for structuring the content, CSS for styling, and JavaScript
to add interactive elements. I also utilized responsive design techniques to ensure the
website works seamlessly on various devices and screen sizes. Additionally, I
integrated GitHub to link my repositories, so visitors can see the code behind the
projects directly from the portfolio.
5. Can you explain the layout of your portfolio website and how you made it
responsive?
Answer: The layout of my portfolio consists of several key sections: a header with
navigation links, sections for skills, projects, and contact details, followed by a
footer. I utilized CSS Flexbox for the flexible structure, ensuring that elements align
properly across various screen sizes. To make the site responsive, I implemented
media queries, which allow me to apply different styles depending on the screen
width. This way, the layout adjusts smoothly whether it's viewed on a mobile device,
tablet, or desktop.
6. How did you implement navigation on your website?
Answer:
I created a simple navigation bar using HTML and CSS. It allows users to easily
scroll to different sections of the page such as "About Me," "Projects," and "Contact."
I made the navigation bar sticky, so it remains visible even when the user scrolls
down the page. To improve user experience, I added smooth scrolling using
JavaScript, so when a link is clicked, the page scrolls fluidly to the relevant section
instead of jumping abruptly.
7. How did you manage the styling of your website? Did you use any
frameworks?
Answer: "For styling, I primarily used custom CSS to give the website a unique,
personalized look. I chose not to use frameworks like Bootstrap to keep the code
lightweight and fully customized. However, I utilized CSS Flexbox and Grid to
efficiently manage the layout and ensure the website is responsive across different
devices."
8. What features of JavaScript did you use in your portfolio website?
Answer: "I used JavaScript to enhance the user experience by implementing:
o Smooth scrolling for better navigation between sections.
o Interactive animations, like hover effects on buttons and images, to
make the site more engaging.
o Form validation to ensure users submit correct data when filling out the
contact form."
9. How did you structure the content of your portfolio website?
Answer: "The content is organized into several sections for easy navigation:
o Header: Includes my name and a short introduction.
o About Me: Provides detailed information about my background and
skills.
o Projects: Showcases key projects with links to live demos and GitHub
repositories.
o Skills: Lists my technical skills, including programming languages and
tools.
o Contact: Features a form for users to reach out to me and links to my
social profiles like LinkedIn and GitHub."
10. What challenges did you face in the Portfolio Website project?
• Challenge: "Making the website fully responsive across different devices
while maintaining a user-friendly design was tricky."
• Solution: "I used responsive design techniques with CSS media queries and flexible Flexbox/Grid layouts to keep the design consistent across screen sizes."
Weather App Project
1. What is your project about?
→ It’s a weather forecasting web app that shows current weather based on user
location or searched city using the OpenWeatherMap API.
2. Which technologies did you use?
→ HTML, CSS, JavaScript, and OpenWeatherMap API.
3. Explain the project flow briefly.
→ On load, it tries to get user location; if allowed, it shows weather. Users can also
search for weather by city name. Data is fetched from API and displayed dynamically.
⚙ Technical / Code Specific
4. How do you get the user’s current location?
→ Using navigator.geolocation.getCurrentPosition().
5. How do you handle tab switching between 'Your city Weather' and 'Search
Weather'?
→ By toggling classes (currentTab and active) to show/hide related containers.
6. What API did you use, and why?
→ OpenWeatherMap API because it’s free and provides detailed weather data
including temperature, humidity, windspeed, etc.
7. How do you handle API errors or city not found?
→ Using try-catch in async functions and showing the error container with a
message.
8. What is sessionStorage used for in your project?
→ To save user coordinates temporarily so we don’t ask for location permission every
time.
9. What is the purpose of renderWeatherInfo function?
→ To update UI dynamically by setting text and image sources based on fetched
weather data.
10. Explain why you used defer in the <script> tag.
→ To make sure the HTML loads before JS executes, preventing DOM errors.
Features & UI
11. What happens if the user denies location permission?
→ The app shows a 'Grant Location Access' message and button.
12. What information do you display for weather?
→ City name, country flag, description, icon, temperature, wind speed, humidity,
clouds.
13. How do you fetch weather by city name?
→ On form submit, call fetchSearchWeatherInfo(city) which hits the API using the
city name.
14. Explain the loading state in your app.
→ Shows a loader GIF while waiting for API response.
15. How do you handle UI updates based on different states (loading, error, success)?
→ Add or remove active class on containers.
🛠 Advanced / Improvement Questions
16. How can this app be improved?
→ Add 5-day forecast, save search history, better UI, offline support.
17. How do you ensure responsiveness?
→ Using CSS media queries and flexible layout.
18. Why store API key in code? Is it safe?
→ Not safe. Better to keep it in backend or environment variable.
“What issues did you face while developing this Weather App project, and how did you
overcome them?”
1. Handling location permission
Issue: When users denied location access, the app couldn’t fetch weather automatically.
Solution: Added a separate UI and button to prompt users to grant location access manually.
2. API errors and invalid city names
Issue: Sometimes users typed wrong city names or API failed.
Solution: Used try-catch in API calls to handle errors gracefully and showed an error
message to the user.