Bda Unit V

The document provides an overview of data analytics using R, highlighting its open-source nature, statistical computing capabilities, and rich ecosystem of libraries for machine learning and visualization. It outlines key steps in data analytics, including data collection, preprocessing, exploratory data analysis, model building, evaluation, and deployment. Additionally, it covers collaborative filtering techniques, social media analytics, and mobile analytics, detailing their importance and key metrics for performance assessment.

Uploaded by

Makkapati Deepthi

UNIT – V

Introduction to Data Analytics with R

Why R for Data Analytics?

R is a powerful open-source programming language that is widely used
in data analytics, statistical computing, and machine learning. It
provides a comprehensive environment for handling, visualizing, and
analyzing large datasets efficiently. Below are some of the key reasons
why R is a popular choice for data analytics:

1. Open-source & Free
   o R is freely available, making it accessible to researchers, data scientists, and businesses.
   o A large and active community provides numerous free libraries and resources.

2. Statistical Computing Capabilities
   o R is designed for advanced statistical analysis and data modeling.
   o It provides built-in functions for regression, hypothesis testing, time series analysis, and more.

3. Rich Ecosystem of Machine Learning Libraries
   o R supports a variety of machine learning techniques through powerful libraries such as:
      • caret – Unified framework for ML models
      • randomForest – Random Forest for classification and regression
      • xgboost – Gradient boosting algorithm for predictive modeling

4. Visualization Capabilities
   o R excels in data visualization and storytelling, making it easy to explore and communicate insights.
   o Popular visualization libraries include:
      • ggplot2 – Advanced data visualization
      • lattice – Multi-panel statistical graphics
      • plotly – Interactive graphs and dashboards

5. Integration with Big Data Technologies
   o R can integrate with Big Data frameworks to work with datasets that are too large for a single machine, such as:
      • Hadoop – Parallel computing with R using the RHadoop collection of packages
      • Spark – Distributed ML and big data processing via SparkR
      • BigR – Enables R to work with Big Data stored in HDFS

Key Steps in Data Analytics with R

To perform data analytics in R, a structured workflow is typically
followed. Below are the key steps:

Step 1: Data Collection

The first step in data analytics is importing data from different sources
into R. Common data sources include:

• CSV files → read.csv("data.csv")
• Excel files → readxl::read_excel("data.xlsx")
• Databases → DBI with a backend driver such as RMySQL (MySQL) or RPostgres (PostgreSQL); MongoDB is not a DBI backend and is usually accessed through mongolite
• Web data (APIs, JSON, XML, HTML pages) → httr for API requests, rvest for web scraping
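
As a minimal sketch of the CSV import step, the example below writes a tiny file to a temporary location and reads it back with read.csv(). The column names and values are made up for illustration; a real project would point read.csv() at its own file.

```r
# Write a small CSV to a temporary file so the example is self-contained.
# The columns (id, age, city) are hypothetical.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("id,age,city",
             "1,34,Pune",
             "2,NA,Delhi",
             "3,29,Chennai"), csv_path)

# read.csv() returns a data frame; the string "NA" is read as a missing value.
dataset <- read.csv(csv_path, stringsAsFactors = FALSE)

str(dataset)   # inspect the column types
head(dataset)  # preview the first rows
```

The same pattern applies to the other sources listed above: each reader returns a data frame that the later steps can work on.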

Step 2: Data Preprocessing

Before analysis, raw data needs to be cleaned and transformed:

• Handling Missing Values
   o Remove rows with missing data → na.omit(dataset)
   o Impute missing values with the column mean → dataset$column[is.na(dataset$column)] <- mean(dataset$column, na.rm = TRUE)
• Data Transformation
   o Convert categorical variables → as.factor(dataset$column)
   o Standardize numerical data (zero mean, unit variance) → scale(dataset$column)
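
The preprocessing steps above can be sketched on a small, made-up data frame. The column names (age, city) are assumptions for illustration only, and mean imputation is just one simple strategy.

```r
# Hypothetical raw data with one missing age.
dataset <- data.frame(
  age  = c(34, NA, 29, 41),
  city = c("Pune", "Delhi", "Pune", "Chennai"),
  stringsAsFactors = FALSE
)

# Option 1: drop every row that contains a missing value.
complete_rows <- na.omit(dataset)

# Option 2: replace missing ages with the mean of the observed ages.
imputed <- dataset
imputed$age[is.na(imputed$age)] <- mean(imputed$age, na.rm = TRUE)

# Convert the categorical column to a factor.
imputed$city <- as.factor(imputed$city)

# Standardize age; scale() returns a one-column matrix, so flatten it.
imputed$age_scaled <- as.numeric(scale(imputed$age))
```

Note that calling mean(..., na.rm = TRUE) on its own only computes the value; the assignment into the NA positions is what performs the imputation.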

Step 3: Exploratory Data Analysis (EDA)

EDA helps in understanding the distribution, patterns, and
relationships in data.

• Descriptive Statistics
   o Summary of data → summary(dataset)
   o Mean, median, standard deviation, quantiles → mean(), median(), sd(), quantile()
• Data Visualization
   o Univariate Analysis → Histograms, box plots (ggplot2)
   o Bivariate Analysis → Scatter plots, correlation heatmaps
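
A short sketch of these EDA steps, using the mtcars dataset that ships with R. To stay free of extra packages it uses base R only and computes the histogram bins without drawing them; in practice the plots would usually be drawn with ggplot2 as noted above.

```r
data(mtcars)

# Descriptive statistics for fuel economy (miles per gallon).
mpg_summary   <- summary(mtcars$mpg)
mpg_mean      <- mean(mtcars$mpg)
mpg_median    <- median(mtcars$mpg)
mpg_sd        <- sd(mtcars$mpg)
mpg_quartiles <- quantile(mtcars$mpg, probs = c(0.25, 0.5, 0.75))

# Univariate: histogram bin counts (plot = FALSE computes without drawing).
mpg_bins <- hist(mtcars$mpg, plot = FALSE)

# Bivariate: correlation between fuel economy and vehicle weight.
mpg_wt_cor <- cor(mtcars$mpg, mtcars$wt)

print(mpg_summary)
print(mpg_wt_cor)
```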

Step 4: Model Building (Supervised & Unsupervised Learning)

Depending on the problem type, different machine learning techniques
are applied:

• Supervised Learning (Labeled Data)
   o Regression: Linear Regression, Random Forest Regression
   o Classification: Logistic Regression, Decision Trees, SVM
• Unsupervised Learning (Unlabeled Data)
   o Clustering: k-Means, Hierarchical Clustering, DBSCAN
   o Dimensionality Reduction: PCA
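
As an illustrative sketch, one technique from each category can be fitted with base R functions on the built-in mtcars and iris datasets. The choice of datasets, predictors, and k = 3 is an assumption made for the example, not a recommendation.

```r
# Supervised, regression: predict fuel economy from weight and horsepower.
reg_model <- lm(mpg ~ wt + hp, data = mtcars)

# Supervised, classification: logistic regression for transmission type
# (am is coded 0 = automatic, 1 = manual).
clf_model <- glm(am ~ wt, data = mtcars, family = binomial)

# Unsupervised, clustering: k-means with k = 3 on the four iris measurements.
set.seed(42)  # k-means starts from random centers, so fix the seed
clusters <- kmeans(iris[, 1:4], centers = 3, nstart = 10)

# Unsupervised, dimensionality reduction: PCA on standardized features.
pca <- prcomp(iris[, 1:4], center = TRUE, scale. = TRUE)

summary(reg_model)
table(clusters$cluster)
summary(pca)
```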

Step 5: Model Evaluation

After training, models are evaluated using various performance
metrics:

• Regression Metrics
   o RMSE (Root Mean Squared Error) → Typical size of the prediction error, in the units of the target variable (lower is better)
   o R² (R-Squared) → Proportion of the variance in the target that the model explains
• Classification Metrics
   o Accuracy → Correct Predictions / Total Predictions
   o Precision → True Positives / (True Positives + False Positives)
   o Recall → True Positives / (True Positives + False Negatives)
   o ROC Curve & AUC Score → pROC package for model evaluation
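
The regression and classification metrics can be computed by hand in a few lines. The sketch below uses small made-up vectors so the arithmetic is easy to check; ROC and AUC are left out because they need the pROC package.

```r
# Regression: made-up actual and predicted values.
actual    <- c(3, 5, 7, 9)
predicted <- c(2, 5, 8, 10)

rmse      <- sqrt(mean((actual - predicted)^2))
r_squared <- 1 - sum((actual - predicted)^2) / sum((actual - mean(actual))^2)

# Classification: made-up labels, 1 = positive class, 0 = negative class.
truth <- c(1, 1, 1, 0, 0, 0, 1, 0)
pred  <- c(1, 0, 1, 0, 1, 0, 1, 0)

tp <- sum(pred == 1 & truth == 1)  # true positives
fp <- sum(pred == 1 & truth == 0)  # false positives
fn <- sum(pred == 0 & truth == 1)  # false negatives

accuracy  <- mean(pred == truth)
precision <- tp / (tp + fp)
recall    <- tp / (tp + fn)
```

With these inputs the squared errors sum to 3 against a total sum of squares of 20, so R² is 0.85 and RMSE is the square root of 0.75; accuracy, precision, and recall all come out to 0.75.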

Step 6: Deployment & Interpretation of Results

Once the model is validated, it is deployed for real-world use.

• Deploying as an API using Plumber
• Deploying on web applications with Shiny
• Interpreting results and generating reports using R Markdown

Introduction to Collaborative Filtering

Collaborative Filtering recommends items by analyzing past interactions
between users and items.

How does it work?

• User-based filtering: "People similar to you liked these items."
• Item-based filtering: "If you liked this item, you may like similar items."
• Hybrid Filtering: Combines both user-based and item-based filtering.

Example Use Cases

• E-commerce: Suggesting products based on past purchases.
• Streaming Platforms: Recommending movies based on viewing history.
• Online Learning: Suggesting courses based on user activity.

2. Types of Collaborative Filtering

2.1. User-Based Collaborative Filtering

Finds similar users and recommends items liked by similar users.

• Example: If User A and User B have similar movie preferences, then User A will get recommendations based on User B's likes.

Mathematical Approach:

• Measures similarity using Cosine Similarity or Pearson Correlation.
• Cosine similarity between rating vectors A and B:

$$\text{similarity}(A, B) = \frac{A \cdot B}{\|A\| \, \|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2} \; \sqrt{\sum_{i=1}^{n} B_i^2}}$$
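
Cosine similarity between users can be sketched in R as follows. The ratings matrix is made up, and treating an unrated item as 0 is a simplifying assumption; real systems usually handle missing ratings more carefully.

```r
# Hypothetical ratings: rows are users, columns are four movies (0 = not rated).
ratings <- rbind(
  A = c(5, 4, 0, 1),
  B = c(4, 5, 1, 0),
  C = c(1, 0, 5, 4)
)

# Cosine similarity: dot product divided by the product of vector lengths.
cosine_similarity <- function(a, b) {
  sum(a * b) / (sqrt(sum(a^2)) * sqrt(sum(b^2)))
}

sim_AB <- cosine_similarity(ratings["A", ], ratings["B", ])
sim_AC <- cosine_similarity(ratings["A", ], ratings["C", ])

# The user most similar to A is the one whose likes drive A's recommendations.
nearest_to_A <- if (sim_AB > sim_AC) "B" else "C"
```

Here A and B rate the same movies highly, so their similarity (40/42) is much larger than that of A and C (9/42), and B's likes would be used to recommend items to A.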

2.2. Item-Based Collaborative Filtering

Finds items similar to those a user has already liked or purchased and
recommends them to that user.

• Example: If many users who purchased "iPhone 13" also bought "AirPods Pro", then a user who buys "iPhone 13" will get a recommendation for "AirPods Pro", since these items are frequently bought together.
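
A minimal sketch of the item-based idea, assuming a small made-up purchase matrix. The third item ("Laptop Stand") and all purchase records are hypothetical; item similarity is taken as the cosine between item columns, which is one common choice.

```r
# Hypothetical purchases: rows are users, columns are items (1 = bought).
purchases <- matrix(
  c(1, 1, 0,
    1, 1, 0,
    1, 0, 1,
    0, 0, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("user", 1:4),
                  c("iPhone 13", "AirPods Pro", "Laptop Stand"))
)

# Item-by-item co-occurrence counts: how often two items share a buyer.
co_occurrence <- crossprod(purchases)

# Cosine similarity between item columns.
item_norms      <- sqrt(diag(co_occurrence))
item_similarity <- co_occurrence / outer(item_norms, item_norms)

# Recommend the item most similar to "iPhone 13", excluding the item itself.
candidates  <- item_similarity["iPhone 13", colnames(item_similarity) != "iPhone 13"]
recommended <- names(which.max(candidates))
```

Because two of the three iPhone buyers also bought AirPods Pro, that item has the highest similarity and becomes the recommendation.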

2.3. Hybrid Filtering

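Combines user-based and item-based filtering for better
recommendations.

• Used by large platforms such as Netflix, YouTube, and Amazon.
• Helps in cases where a single method struggles, such as new users with no history (the cold-start problem).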

Social Media Analytics

Social media analytics refers to the process of collecting, analyzing, and
interpreting data from social media platforms to assess performance,
understand audience behavior, and optimize strategies. It helps
businesses, marketers, and content creators make informed decisions
on how to improve engagement, reach, and overall effectiveness on
social platforms.

Key Metrics to Track:

• Engagement: Likes, comments, shares, retweets, reactions, etc.
• Reach: The number of unique users who have seen your posts.
• Impressions: The number of times your posts have been viewed, regardless of whether they were clicked or interacted with.
• Follower Growth: The increase or decrease in followers over time.
• Click-Through Rate (CTR): The percentage of views of a post that result in a click on its link (clicks divided by impressions).
• Conversion Rate: The percentage of users who take a desired action (e.g., sign up, make a purchase, etc.) after clicking a link.
• Sentiment Analysis: Understanding whether the public perception of your brand is positive, neutral, or negative.
• Hashtag Performance: How well certain hashtags perform in terms of engagement and reach.
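
The rate-based metrics reduce to simple ratios. The sketch below uses made-up counts for a single post and takes CTR as clicks divided by impressions, which is one common convention; platforms differ in the exact denominators they use.

```r
# Hypothetical counts for one post and one reporting period.
impressions     <- 2000  # times the post was shown
clicks          <- 50    # clicks on the link in the post
conversions     <- 5     # sign-ups or purchases after clicking
followers_start <- 1200  # followers at the start of the period
followers_end   <- 1260  # followers at the end of the period

ctr_percent        <- 100 * clicks / impressions
conversion_percent <- 100 * conversions / clicks
growth_percent     <- 100 * (followers_end - followers_start) / followers_start
```

With these numbers the CTR is 2.5%, the conversion rate is 10%, and follower growth is 5%.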

Tools for Social Media Analytics:

• Google Analytics: Can track traffic from social media platforms to websites.
• Hootsuite: Offers analytics for engagement, post performance, and more.
• Sprout Social: Helps measure social media campaigns, audience growth, and sentiment.
• Buffer: Provides insights on audience interactions, engagement, and post timing.
• Facebook Insights: For analyzing Facebook-specific metrics (posts, stories, and ads).
• Twitter Analytics: For tracking tweet performance, engagement, and follower demographics.

Mobile Analytics
Mobile analytics refers to the process of tracking, measuring, and
analyzing the behavior of users on mobile apps or mobile websites. This
helps businesses and developers understand how users interact with
their mobile apps, identify areas for improvement, and optimize app
performance to boost engagement, retention, and revenue.

Key Metrics to Track in Mobile Analytics:

• App Downloads: The number of times your app has been downloaded from app stores (Google Play, App Store).
• Active Users (DAU/WAU/MAU):
   o DAU (Daily Active Users): Number of unique users engaging with your app on a daily basis.
   o WAU (Weekly Active Users): Number of unique users engaging with your app on a weekly basis.
   o MAU (Monthly Active Users): Number of unique users engaging with your app on a monthly basis.
• Retention Rate: The percentage of users who return to the app after a specified period (e.g., 1 day, 7 days, or 30 days). This helps measure how sticky your app is.
• Churn Rate: The percentage of users who stop using the app after a certain period. A high churn rate is often a sign that there is a problem with user experience or engagement.
• Session Length: The average duration of a user's session in the app.
• Session Frequency: How often users return to the app within a given period (daily, weekly, etc.).
• In-App Events: Specific user actions like completing a level, making a purchase, or sharing content.
• Conversion Rate: Percentage of users who complete a desired action (e.g., sign up, make a purchase).
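
As a rough sketch, daily active users and retention can be derived from an event log with one row per user per active day. The log below is made up, and churn is taken as the complement of retention, which is a common but not universal definition.

```r
# Hypothetical event log: which user was active on which day.
events <- data.frame(
  user = c("u1", "u2", "u3", "u1", "u2", "u1"),
  day  = c(1, 1, 1, 2, 2, 8)
)

# DAU: number of unique users per day.
dau <- tapply(events$user, events$day, function(u) length(unique(u)))

# Cohort: everyone who was active on day 1.
cohort <- unique(events$user[events$day == 1])

# Retention after `offset` days: share of the cohort active on day 1 + offset.
retention_rate <- function(offset) {
  returned <- unique(events$user[events$day == 1 + offset])
  100 * length(intersect(cohort, returned)) / length(cohort)
}

day1_retention <- retention_rate(1)
day7_retention <- retention_rate(7)
day1_churn     <- 100 - day1_retention  # assumption: churn = 100% - retention
```

In this log two of the three day-1 users return the next day and one returns a week later, giving day-1 retention of about 66.7% and day-7 retention of about 33.3%.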
