Lec1 - Introduction

The DAI-101 Data Science course at IIT Roorkee, taught by Dr. Devesh Bhimsaria and Dr. Deepak Sharma, covers foundational topics in data science including data analysis, machine learning, and deep learning, with a total of 42 contact hours and 4 credits. Evaluation consists of mid-term and end-term exams, assignments, and attendance, with specific rules for class conduct and assignment submission. The course emphasizes the importance of data science in various industries and the technical skills required for data analysis.

Data Science

DAI-101 Spring 2024-25

Dr. Devesh Bhimsaria


Office: F9, Old Building
Department of Biosciences and Bioengineering
Indian Institute of Technology–Roorkee
[email protected]
About the Course
⚫ Contact Hours (Hrs. per week): L:3 T:1 P:0
⚫ Total contact hours: 42
⚫ Credits: 4
⚫ Prerequisite: None.
⚫ Taught by: Dr. Devesh Bhimsaria & Dr. Deepak Sharma
⚫ Offered in 5 batches

Course Outline
⚫ Introduction to Data Science: Latest and greatest in data science
⚫ Data Analysis Foundation: Types of data (data matrix, numeric, categorical
datasets), data preparation: data cleaning, data reduction and transformation
⚫ Exploratory Data Analysis and Visualization: Univariate and bivariate analysis,
data visualization
⚫ Statistical Analysis: Confidence Intervals, Hypothesis Testing, p-values, Bias
and Variance trade-off
⚫ Machine Learning: introduction to supervised and unsupervised methods, model
training, overfitting and underfitting, bias and variance; supervised methods:
regression and classification (linear regression, logistic regression,
decision trees, SVM); unsupervised methods: clustering (K-means), PCA
⚫ Deep learning and Big Data: Gradient Descent, Neural nets, Convolutional
Neural Networks, Big Data technologies (MapReduce, HDFS)

Evaluation
⚫ Mid-term exam (30)
⚫ End-term exam (50)
⚫ 2 assignments (10)
⚫ Attendance (10)
  ⚫ 80% and above: 10
  ⚫ 50% and below: 0
⚫ If there is any modification, you’ll be informed in
advance

Rules and other points
⚫ Maintain class decorum.
⚫ Submit assignments on time.
⚫ For help with the course or otherwise, you can (a) email me or
(b) ask me during a lecture or tutorial.
⚫ If urgent, the class representative (CR) may call or message.

What is Data Science?
⚫ Definition: An interdisciplinary field that uses
scientific methods, processes, algorithms, and systems
to extract knowledge and insights from structured and
unstructured data.
⚫ Key Components:
  ⚫ Data Collection
  ⚫ Data Cleaning and Preparation
  ⚫ Data Analysis and Visualization
  ⚫ Machine Learning and AI
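
As a rough illustration of how the key components fit together, here is a minimal Python sketch that uses only the standard library. The dataset (hours studied vs. exam score), the function names, and the choice of a least-squares line as the "machine learning" step are all assumptions made for this example; they are not taken from the course material.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data: hours studied vs. exam score, with one missing value.
RAW_CSV = """hours,score
1,52
2,58
3,
4,71
5,75
"""

def collect(text):
    """Data collection: parse CSV text into a list of row dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def clean(rows):
    """Data cleaning: drop rows with missing fields and convert to floats."""
    return [
        (float(r["hours"]), float(r["score"]))
        for r in rows
        if r["hours"] and r["score"]
    ]

def analyze(pairs):
    """Data analysis: summary statistics for each column."""
    hours, scores = zip(*pairs)
    return {"mean_hours": mean(hours), "mean_score": mean(scores)}

def fit_line(pairs):
    """Machine learning (simplest case): least-squares line score = a*hours + b."""
    hours, scores = zip(*pairs)
    mx, my = mean(hours), mean(scores)
    a = sum((x - mx) * (y - my) for x, y in pairs) / sum((x - mx) ** 2 for x in hours)
    return a, my - a * mx

rows = collect(RAW_CSV)
pairs = clean(rows)
summary = analyze(pairs)
slope, intercept = fit_line(pairs)
print(summary)
print(f"score ~= {slope:.2f} * hours + {intercept:.2f}")
```

In practice each stage is usually handled by a dedicated library; the point here is only the order of the stages: collect, clean, analyze, then model.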

Importance
⚫ Real-world applications:
  ⚫ Healthcare: Predicting diseases
  ⚫ Business: Customer segmentation
  ⚫ Finance: Fraud detection
  ⚫ Social Media: Recommendation systems
⚫ Industry growth and demand for data professionals

Skills Required for Data Science
⚫ Technical Skills:
  ⚫ Programming: Python, R, SQL
  ⚫ Data Manipulation: Pandas, NumPy
  ⚫ Visualization: Matplotlib, Seaborn
  ⚫ Machine Learning: Scikit-learn, TensorFlow

Challenges in Data Science
⚫ Data Quality Issues: Missing, noisy, or inconsistent
data
⚫ Data Privacy & Ethics: Ensuring compliance with
regulations
⚫ Model Interpretability: Explaining complex models
⚫ Scalability: Handling large datasets
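
To make the data quality point concrete, the sketch below shows one possible way to handle missing, noisy, and inconsistent values in plain Python. The records, the 0 to 120 plausible-age range, and the use of median imputation are assumptions chosen for illustration; other strategies (dropping rows, flagging values for review) may suit a real dataset better.

```python
from statistics import median

# Hypothetical survey records showing the three issues named above.
records = [
    {"city": "Roorkee", "age": 21},
    {"city": "roorkee ", "age": None},   # inconsistent label, missing age
    {"city": "Delhi", "age": 22},
    {"city": "DELHI", "age": 230},       # noisy value (likely a typo)
    {"city": "Delhi", "age": 20},
]

def normalize_city(name):
    """Inconsistent data: trim whitespace and unify capitalization."""
    return name.strip().title()

def is_plausible_age(age, low=0, high=120):
    """Noisy data: treat values outside an assumed valid range as invalid."""
    return age is not None and low <= age <= high

def clean(rows):
    """Return cleaned copies; invalid or missing ages are imputed with the median."""
    valid_ages = [r["age"] for r in rows if is_plausible_age(r["age"])]
    fill = median(valid_ages)
    return [
        {
            "city": normalize_city(r["city"]),
            "age": r["age"] if is_plausible_age(r["age"]) else fill,
        }
        for r in rows
    ]

cleaned = clean(records)
for row in cleaned:
    print(row)
```

The original records are left unmodified, so the raw and cleaned versions can be compared when checking what the cleaning step changed.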

Thank You
• All my slides/notes, excluding third-party material,
are licensed by various authors including myself under
https://creativecommons.org/licenses/by-nc/4.0/
