Data Analytics TOC

The Foundation Module for Big Data Analytics is a 240-hour program designed to up-skill individuals with a basic understanding of programming and data sequences. It covers essential topics such as data analytics, the data ecosystem, Hadoop, MapReduce, and various tools like SQL and Apache Spark, aimed at university students and professionals interested in Big Data. Key learning outcomes include evaluating Big Data trends, understanding data management systems, and executing data processing operations.

Uploaded by

soniyk40

Foundation Module – BDA (Indicative duration: 240 hrs.)

Foundation Module – Pre-requisites

Basics of Information Technology

Hardware and software components
Operating system
Computational thinking and problem-solving skills
Basics of programming (Python)
Basics of object-oriented programming concepts
Database concepts

Foundational Curriculum – Big Data Analytics

Foundational Curriculum for Big Data Analytics is aimed at up-skilling those who have a basic
understanding of programming and data sequences, to help them expand their knowledge and learn the
fundamentals of Big Data Analytics technologies at a beginner level. This Curriculum has been divided
into three modules, of which the first is an introductory module.

Curriculum Details

Scope and Objective
Enable students to explore the fundamentals of Big Data Analytics and give them a base from
which they can up-skill themselves for specific Big Data Analytics job roles.

Intended Audience
o University students enrolled in streams such as Engineering, Computer Science, Statistics,
Sciences or Mathematics
o Employed professionals who wish to explore their career options and interests with regard
to Big Data Analytics
o Enthusiasts curious about understanding the hype behind Big Data Analytics

Pre-requisites
Knowledge of the fundamentals of programming, including data sequences such as stacks,
queues, strings, arrays, linked lists, trees and maps, and the concepts of Object-Oriented
Programming

Key Learning Outcomes
1. Evaluate trends in Big Data and discuss how Big Data is transforming businesses
2. Evaluate the different platforms used for processing Big Data
3. Evaluate the features of databases
4. Write Map and Reduce code for distributed processing of data
5. Understand key concepts behind Big Data modelling and management and gain the practical
skills needed for modelling Big Data projects
6. Select appropriate data models that suit the requirements of the data
7. Differentiate between a traditional Database Management System and a Big Data Management
System
8. Retrieve data from Big Data management systems
9. Execute simple Big Data integration and processing operations

List of Tools Suggested (Indicative)
SQL, MongoDB, Hadoop, MapReduce, HDFS, Apache Spark, PySpark, SparkR, Java, Apache Pig,
DynamoDB, Spark MLlib, GraphX, Postgres, Pandas

Indicative TOC
Data Analytics
Module 1: Data Analytics – An Overview
• What & Why - Data Analytics?
• Different components of a modern data ecosystem, and the role Data Analysts play in this
ecosystem.
• Different types of data analysis and the key steps in a data analysis process.
• Roles, responsibilities, and skillsets required to be a Data Analyst
• Data Analytics Tools

Module 2: The Data Ecosystem

• Different types of data structures, file formats, and sources of data
• Understanding of various types of data repositories such as Databases, Data Warehouses, Data
Marts, Data Lakes, and Data Pipelines.
• The Extract, Transform, and Load (ETL) process, which moves data from its sources into
data repositories.
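
As a rough illustration of the ETL process listed above, the sketch below extracts rows from an in-memory CSV, cleans them, and loads them into a SQLite table. The sample data, the column names (`name`, `amount`), and the table name `sales` are made up for illustration and are not part of the curriculum.

```python
import csv
import io
import sqlite3

# Hypothetical raw data standing in for a source file.
RAW_CSV = """name,amount
 alice ,10.5
BOB,3
carol,
"""

def extract(text):
    """Extract: read CSV rows into dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: tidy names, parse amounts, drop incomplete rows."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows with a missing amount
        cleaned.append((row["name"].strip().title(), float(amount)))
    return cleaned

def load(rows, conn):
    """Load: write the cleaned rows into a repository (here, a SQLite table)."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()

def run_etl(text=RAW_CSV):
    """Run the three stages in order and return the loaded rows."""
    conn = sqlite3.connect(":memory:")
    load(transform(extract(text)), conn)
    return conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
```

Calling `run_etl()` runs the three stages in order. In a real pipeline the source would typically be files or an operational database, and the target a data warehouse or data lake.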

Chapter 1: Introduction to Big Data-Hadoop framework

o Big Data Overview, What is Big Data Analytics
o Overview of Hadoop Ecosystem
o What is Big Data & Role of Hadoop in Big Data – Overview of other Big Data Systems
o Hadoop integrations into Existing Software Products
o Current Scenario in Hadoop Ecosystem
o Installation & Configuration
o Use Cases of Hadoop (HealthCare, Retail, Telecom)

Chapter 2: HDFS
o HDFS Concepts & Design
o Architecture, HDFS Daemons
o Overview of Hadoop Distributed File System
• Name nodes
• Data nodes
• The Command-Line Interface
o Data Flow (File Read, File Write)
o Fault Tolerance, Shell Commands
o Data Flow Archives, Coherency - Data Integrity
o Role of Secondary NameNode

Chapter 3: Hadoop Components - MapReduce

o Anatomy of Map Reduce & Theory
o Data Flow (Map – Shuffle – Reduce)
o MapRed vs MapReduce APIs
o Programming [Mapper, Reducer, Combiner, Partitioner]
o Writables
o Input and Output format
o Streaming API using Python
o Magic of Shuffle Phase
o File Formats, Sequence Files
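
To make the Map – Shuffle – Reduce data flow above concrete, here is a minimal in-process word count in plain Python. It only imitates the three phases on one machine and does not use the Hadoop API. The function names (`mapper`, `shuffle`, `reducer`, `word_count`) are illustrative choices, not names taken from the course material.

```python
from collections import defaultdict

def mapper(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1

def shuffle(pairs):
    """Shuffle: group all values by key, sorted by key, between the two phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return dict(sorted(groups.items()))

def reducer(key, values):
    """Reduce: combine the values for one key into a single result."""
    return key, sum(values)

def word_count(lines):
    """Chain the three phases over an iterable of text lines."""
    mapped = (pair for line in lines for pair in mapper(line))
    return dict(reducer(key, values) for key, values in shuffle(mapped).items())
```

For example, `word_count(["big data", "Big Data analytics"])` counts `big` and `data` twice and `analytics` once. With Hadoop Streaming, the mapper and reducer would instead be separate scripts that read lines from standard input and write tab-separated key/value pairs to standard output, with the framework performing the shuffle.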

Chapter 4: Extended subjects on HBase

o Introduction to NoSQL
o CAP Theorem
o HBase and RDBMS
o HBase and HDFS
o Architecture (Read Path, Write Path, Compactions, Splits)
o Installation & Configuration
o Role of ZooKeeper
o HBase Shell, Introduction to Filters
o Row Key Design, What’s New in HBase, Hands-On

Chapter 5: Extended subjects on Hive

o Architecture
o Installation & Configuration
o Hive vs RDBMS
o Working on Hive Beeline
o Hive: HQL, Tables
o DDL, DML
o UDF
o Partitioning, Bucketing
o Hive functions, Date functions, String functions
o Joins, Sub Queries and other Aggregations

Chapter 6: Apache Spark (5 hrs)

o Introduction to Spark - Getting started
o Resilient Distributed Dataset and DataFrames
o Spark application programming
o Introduction to Spark libraries
o Spark configuration, monitoring and tuning

Module 3: Gathering, Wrangling & Visualizing Data with Advanced Python Libraries [Pandas, NumPy & Matplotlib]

o Introduction to Pandas
o Data Structures in Pandas (Series, DataFrame)
o DataFrame creation from Series, lists, dictionaries, and a NumPy 2D array
o Identify and Handle Missing Values
o Data Formatting
o Data Normalization
o Binning
o Indicator variables
o CSV file handling
o Exporting data from a DataFrame to a CSV file
o EDA & Data Visualization using the Matplotlib library
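
Several of the Pandas topics above (missing values, normalization, binning, indicator variables, CSV export) can be sketched in a few lines. The sample data and column names (`age`, `fuel`) below are hypothetical. Filling with the column mean and min-max scaling are just one common choice for each step, not the methods the course prescribes.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; column names are illustrative only.
df = pd.DataFrame({
    "age": [22, 35, np.nan, 58],
    "fuel": ["gas", "diesel", "gas", "gas"],
})

# Handle missing values: replace NaN with the column mean.
df["age"] = df["age"].fillna(df["age"].mean())

# Normalization: min-max scaling to the range [0, 1].
age_range = df["age"].max() - df["age"].min()
df["age_scaled"] = (df["age"] - df["age"].min()) / age_range

# Binning: group ages into three equal-width labelled bins.
df["age_group"] = pd.cut(df["age"], bins=3, labels=["low", "mid", "high"])

# Indicator (dummy) variables: one 0/1 column per fuel category.
dummies = pd.get_dummies(df["fuel"], prefix="fuel").astype(int)
df = pd.concat([df, dummies], axis=1)

# Export to CSV text (pass a file path instead to write to disk).
csv_text = df.to_csv(index=False)
```

Printing `df` after each step is a simple way to see what that transformation changed.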

Tableau
o What is Tableau?
o Tableau Architecture
o Workspace & Navigation
o Tableau Data Connections
o Filter data in Tableau
o Sort data in Tableau
o Data Visualization with Tableau
o Dynamic Data Manipulation and Presentation in Tableau

Module 4: Mining & Visualizing Data and Communicating Results

Chapter 1 - Introduction to Statistical Modelling

o What is a Statistical Model?
o Why do we need Statistical Modelling?
o Estimation
o Confidence Interval
o Hypothesis Testing
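
The estimation topics above can be illustrated with Python's standard library alone. The sketch below computes a confidence interval for a mean and a two-sided p-value using a normal approximation. For small samples a t-distribution is usually more appropriate, so treat this as a simplified example. The sample values are invented.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

def mean_confidence_interval(sample, confidence=0.95):
    """Normal-approximation confidence interval for the population mean."""
    m = mean(sample)
    se = stdev(sample) / sqrt(len(sample))  # standard error of the mean
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return m - z * se, m + z * se

def z_test_p_value(sample, mu0):
    """Two-sided p-value for H0: the population mean equals mu0 (z-test)."""
    se = stdev(sample) / sqrt(len(sample))
    z = (mean(sample) - mu0) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical measurements, for illustration only.
sample = [4.8, 5.1, 5.0, 4.9, 5.2, 5.0, 5.1, 4.9]
low, high = mean_confidence_interval(sample)
p_value = z_test_p_value(sample, mu0=5.0)
```

A small p-value is evidence against the null hypothesis. Here the sample mean is essentially 5.0, so testing against `mu0=5.0` gives a p-value close to 1, while testing against a distant value such as 6.0 gives a p-value close to 0.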

Chapter 2 - Introduction to Statistical Modelling

o Linear Regression
• Simple Linear Regression
• Multiple Linear Regression
o Classification
• Logistic Regression
• Discriminant Analysis
o Resampling Methods
• Bootstrapping
• Cross-Validation
o Tree-based Methods
• Bagging
• Boosting
o Unsupervised Learning
• Principal Component Analysis
• K-Means Clustering
• Hierarchical Clustering
o Types of Variables
• Dependent Variable, also known as Response Variable
• Explanatory Variable, also known as Independent Variable
o Model Parameters and Model Residuals
Chapter 3 - Difference between Statistical Modelling and Machine Learning

Chapter 4 - Difference: Statistical Modelling Perspective

Chapter 5 - Difference: Machine Learning Perspective

R Programming
o Understanding R as a programming environment
o R basics
• Math, Variables, and Strings
• Vectors and Factors
• Vector operations
o Data structures in R
o Arrays & Matrices
o Lists
o Data frames
o R programming fundamentals
• Conditions and loops
• Functions in R
• Objects and Classes
• Debugging
o Working with data in R
• Reading CSV and Excel files
• Reading text files
• Writing and saving data objects to file in R
o Strings and Dates in R
• String operations in R
• Regular Expressions
• Dates in R
o Descriptive Statistics using R
o Data Visualization using R
o Exploratory Data Analysis (EDA) using R
o A comprehensive analysis of a sample data set using a machine learning technique.

Module 5: Career Opportunities and Data Analysis in Action

o Different career opportunities in the field of Data Analysis and the different paths that
you can take to become skilled as a Data Analyst.
o Hands-on project with scenario-based use cases in gathering, wrangling, mining,
analyzing, and visualizing data.
