BDS306C - Imp Questions & Answers - Module 2-2

This document covers key concepts in R programming, including factors and strings, matrix operations, date differences, lists, data frames, and arrays. It provides examples of how to create and manipulate these data structures, as well as practical applications in data analysis. Additionally, it includes a program for reading and summarizing CSV files.

Uploaded by

truanimea351

MODULE - 2

1. Illustrate the concept of factors and strings in R.

1. Factors in R

A factor is used to represent categorical data. Internally, a factor stores each
value as an integer code that points to a character label (its level). R treats
factors specially in statistical modeling of categorical variables (e.g., gender,
country, etc.). Factors are useful when the data has a fixed number of unique
values (categories or levels).
Key Features of Factors:

● Factors can be ordered or unordered.

● They are used in statistical modeling because categorical variables are often
involved in models like regression or classification.
● Factors store levels (unique categories) and assign a level to each
observation.
Example of Factor Creation:
# Creating a factor for Gender
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))

# Output the factor

print(gender)
# Levels: Female Male

Modifying Levels of a Factor:

# Changing levels of the factor
levels(gender) <- c("F", "M")
print(gender)

Ordered Factor:

You can create ordered factors for categories that have a natural order (e.g., Low,
Medium, High).
# Creating an ordered factor for levels of education
education <- factor(c("High School", "College", "Masters", "PhD", "High School"),
                    levels = c("High School", "College", "Masters", "PhD"),
                    ordered = TRUE)

print(education)
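
What the level order buys you can be sketched as follows. This is a minimal, self-contained example that rebuilds the same education factor; the variable names above_college and highest are illustrative only.

```r
# Ordered factor with an explicit level order (same levels as in the example above)
education <- factor(
  c("High School", "College", "Masters", "PhD", "High School"),
  levels = c("High School", "College", "Masters", "PhD"),
  ordered = TRUE
)

# Comparison operators respect the level order, not alphabetical order
above_college <- education > "College"
print(above_college)            # FALSE FALSE TRUE TRUE FALSE

# max() and min() also work on ordered factors
highest <- max(education)
print(as.character(highest))    # "PhD"
```

With an unordered factor, the same comparison would return NA with a warning, which is why ordered = TRUE matters for ranked categories.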

Factor Benefits:

● Efficient storage of categorical data.

● Factors allow for easy aggregation and summarization in data analysis.
● Useful for regression models, where categorical variables need to be treated
differently from continuous ones.
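
The storage and summarization points can be illustrated with a short sketch using only base R. It reuses the gender values from the earlier example; the names gender_levels, gender_codes, and gender_counts are illustrative.

```r
# Same gender factor as in the earlier example
gender <- factor(c("Male", "Female", "Female", "Male", "Female"))

# Levels are the unique categories, sorted alphabetically by default
gender_levels <- levels(gender)
print(gender_levels)    # "Female" "Male"

# Each observation is stored as an integer code pointing at a level
gender_codes <- as.integer(gender)
print(gender_codes)     # 2 1 1 2 1

# table() counts observations per level
gender_counts <- table(gender)
print(gender_counts)    # Female: 3, Male: 2
```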

2. Strings in R

Strings are sequences of characters and are used to store text data in R. Unlike
factors, strings do not have levels or categories. They are treated as simple text and
are often used for names, labels, or general text processing tasks.
Key Features of Strings:

● Strings can be concatenated, manipulated, and processed using various string

functions.
● Strings are used when you need text data without any categorization.
Example of Creating Strings:
# Creating a string
name <- "John Doe"
print(name)

Common String Functions in R:

Concatenating Strings: The paste() and paste0() functions are used to
concatenate (combine) strings.
# Concatenating two strings with a space
full_name <- paste("John", "Doe")
print(full_name) # Output: "John Doe"

# Concatenating without space (stored separately so full_name keeps its space)

joined_name <- paste0("John", "Doe")
print(joined_name) # Output: "JohnDoe"

Changing Case: Convert strings to uppercase or lowercase using the toupper()
and tolower() functions.
# Convert to upper case
upper_name <- toupper(full_name)
print(upper_name) # Output: "JOHN DOE"

# Convert to lower case

lower_name <- tolower(full_name)
print(lower_name) # Output: "john doe"

Extracting Substrings: You can extract parts of a string using substring() or
substr().

# Extracting a substring from a string

substr(full_name, 1, 4) # Output: "John"

String Manipulation for Data Analysis:

● String functions can be very useful in cleaning or formatting data, such as
manipulating column names or parsing text data.
● Useful in tasks such as text mining, where string processing is essential.
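
As a rough illustration of column-name cleaning, the sketch below uses only base R functions (trimws(), tolower(), gsub()). The raw names are made-up sample values, not taken from any real dataset.

```r
# Hypothetical messy column names (sample values for illustration only)
raw_names <- c(" First Name", "Last.Name ", "AGE (years)")

# Remove leading and trailing whitespace
clean_names <- trimws(raw_names)

# Convert to lower case
clean_names <- tolower(clean_names)

# Replace every run of non-alphanumeric characters with one underscore
clean_names <- gsub("[^a-z0-9]+", "_", clean_names)

# Drop underscores left at the start or end
clean_names <- gsub("^_+|_+$", "", clean_names)

print(clean_names)   # "first_name" "last_name" "age_years"
```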
Practical Applications in Data Analysis

● Factors: When working with survey data, for example, gender, education
level, or country of residence are often represented as factors since they are
categorical.
● Strings: Strings are used in data cleaning tasks, such as cleaning column
names, parsing text files, or creating descriptive labels for data.
Example: Factor and String Usage in Data Analysis
# Data frame with factors and strings
data <- data.frame(
Name = c("Alice", "Bob", "Charlie"), # String
Gender = factor(c("Female", "Male", "Male")), # Factor
Age = c(25, 30, 22)
)

# Output the data frame

print(data)

# Summarizing the Gender factor

summary(data$Gender)
2. Create two 3x3 matrices A and B in R and perform various operations.
# Creating two 3x3 matrices A and B
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)
B <- matrix(c(9, 8, 7, 6, 5, 4, 3, 2, 1), nrow = 3, byrow = TRUE)

# Display matrices
cat("Matrix A:\n")
print(A)

cat("\nMatrix B:\n")
print(B)

# i) Transpose of Matrices
transpose_A <- t(A)
transpose_B <- t(B)

cat("\nTranspose of Matrix A:\n")

print(transpose_A)
cat("\nTranspose of Matrix B:\n")
print(transpose_B)

# ii) Matrix Addition

matrix_addition <- A + B
cat("\nMatrix Addition (A + B):\n")
print(matrix_addition)

# iii) Matrix Subtraction

matrix_subtraction <- A - B
cat("\nMatrix Subtraction (A - B):\n")
print(matrix_subtraction)

# iv) Matrix Multiplication

matrix_multiplication <- A %*% B
cat("\nMatrix Multiplication (A %*% B):\n")
print(matrix_multiplication)
Explanation of the Code:

1. Matrix Creation:
○ A and B are two 3 × 3 matrices created using the matrix() function.
○ The nrow = 3 argument specifies that the matrix should have 3
rows, and byrow = TRUE ensures that elements are filled row-wise.
2. Matrix Transposition:
○ The t() function is used to compute the transpose of matrices A and
B.
3. Matrix Addition:
○ Matrix addition is done by using the + operator between two matrices,
which adds corresponding elements.
4. Matrix Subtraction:
○ Matrix subtraction is performed using the - operator between two
matrices, which subtracts corresponding elements.
5. Matrix Multiplication:
○ The matrix multiplication is done using the %*% operator, which
performs matrix multiplication according to the rules of linear algebra.
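
One point that is easy to miss: in R the * operator multiplies matrices element by element, while %*% is the linear-algebra product. The sketch below contrasts the two on the same A and B; the names elementwise and matprod are illustrative.

```r
# Same matrices as in the program above
A <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8, 9), nrow = 3, byrow = TRUE)
B <- matrix(c(9, 8, 7, 6, 5, 4, 3, 2, 1), nrow = 3, byrow = TRUE)

# Element-wise product: each entry is A[i, j] * B[i, j]
elementwise <- A * B
print(elementwise)
# First row: 9 16 21

# Matrix product: each entry is a row of A times a column of B
matprod <- A %*% B
print(matprod)
# First row: 30 24 18
```

The two results have the same dimensions here, so mixing up the operators gives no error, only different numbers.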

3. Develop an R program to calculate the difference between two dates and
determine the number of days, weeks, and months between them. Use the
appropriate functions to handle date conversions and arithmetic.

# Function to calculate the difference between two dates

calculate_date_difference <- function(date1, date2) {

# Convert the input strings to Date objects

date1 <- as.Date(date1, format = "%Y-%m-%d")
date2 <- as.Date(date2, format = "%Y-%m-%d")

# Calculate the difference in days

difference_in_days <- as.numeric(difftime(date2, date1, units = "days"))

# Calculate the difference in weeks (7 days per week)

difference_in_weeks <- difference_in_days / 7

# Calculate the difference in months (using interval() and months() from lubridate;
# the package must be installed first, e.g. with install.packages("lubridate"))

library(lubridate)
difference_in_months <- interval(date1, date2) / months(1)

# Print the results

cat("Difference in Days:", difference_in_days, "\n")
cat("Difference in Weeks:", difference_in_weeks, "\n")
cat("Difference in Months:", round(difference_in_months, 2), "\n")
}

# Example usage of the function

date1 <- "2023-01-01"
date2 <- "2024-01-01"

# Call the function to calculate the differences

calculate_date_difference(date1, date2)

Explanation:

1. Date Conversion:
○ The dates are input as strings and then converted to Date objects
using the as.Date() function.
○ The format "%Y-%m-%d" is specified to match the input format (e.g.,
"2023-01-01").
2. Difference Calculation:
○ difftime() is used to calculate the difference between the two
dates in terms of days.
○ The difference in weeks is calculated by dividing the number of days
by 7.
○ The lubridate package's interval() function is used to
calculate the difference in months, and the result is rounded to two
decimal places for clarity.
3. Output:
○ The function prints the difference in days, weeks, and months.
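
If lubridate is not available, a similar calculation can be sketched in base R alone. This is an alternative to the program above, not a copy of it: it counts whole calendar months rather than fractional months, and the variable names are illustrative.

```r
# Same example dates as above
start_date <- as.Date("2023-01-01")
end_date <- as.Date("2024-01-01")

# Subtracting Date objects gives a difference in days
days_between <- as.numeric(end_date - start_date)
print(days_between)    # 365

# difftime() can report the difference directly in weeks
weeks_between <- as.numeric(difftime(end_date, start_date, units = "weeks"))
print(weeks_between)

# Whole calendar months: compare year, month, and day-of-month components
start_lt <- as.POSIXlt(start_date)
end_lt <- as.POSIXlt(end_date)
whole_months <- (end_lt$year - start_lt$year) * 12 + (end_lt$mon - start_lt$mon)
if (end_lt$mday < start_lt$mday) {
  whole_months <- whole_months - 1   # last month not yet completed
}
print(whole_months)    # 12
```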

4. Explain lists and data frames in R with examples.

Lists in R

A list in R is a 1-dimensional data structure that can hold multiple types of data
(e.g., numbers, characters, other lists, and even functions). Lists are highly flexible
because they can store elements of different types, unlike vectors, which are
restricted to storing elements of a single type.
Key Features of Lists:

1. Heterogeneous Data: Lists can hold various data types such as numbers, strings,
vectors, matrices, other lists, or even functions. This flexibility makes lists
suitable for tasks that require combining different data types.
2. Indexing by Name or Position: List elements can be accessed using their
position or by names (if assigned).
3. Nested Structures: Lists can contain other lists as elements, making them
useful for hierarchical or nested data structures.

Creation and Accessing Elements:

You can create a list in R using the list() function. Here's how you can create
and access a list:

# Create a list with different data types

my_list <- list(
Name = "Alice",
Age = 25,
Scores = c(90, 85, 88),
Contact = list(phone = "123-456", email = "[email protected]")
)
# Accessing elements by name
print(my_list$Name) # Outputs: "Alice"
print(my_list$Scores) # Outputs: 90, 85, 88

# Accessing elements by index

print(my_list[[2]]) # Outputs: 25

# Accessing elements from a nested list

print(my_list$Contact$phone) # Outputs: "123-456"
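
A common source of confusion is the difference between single and double brackets. The sketch below uses a shortened version of the list above; the Passed element is a hypothetical field added only for illustration.

```r
# Shortened version of the list from the example above
my_list <- list(Name = "Alice", Age = 25, Scores = c(90, 85, 88))

# Single brackets return a smaller list
sub_list <- my_list["Age"]
print(class(sub_list))     # "list"

# Double brackets return the element itself
age_value <- my_list[["Age"]]
print(class(age_value))    # "numeric"

# Indexing inside an element: second score
second_score <- my_list[["Scores"]][2]
print(second_score)        # 85

# Adding a new element by name (hypothetical field)
my_list$Passed <- TRUE
print(names(my_list))      # "Name" "Age" "Scores" "Passed"
```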

Use Cases:

● Storing Heterogeneous Data: Lists are used when you need to store diverse
elements like model results, vectors, and even functions in one container.
● Storing Function Results: In machine learning, lists are commonly used to
store model outcomes (e.g., coefficients, residuals, and diagnostics).
● Nested Data Structures: When dealing with hierarchical data such as
JSON, you can use lists to represent different levels of nesting.

# Create a simple list for storing student details

student <- list(
name = "Alice",
age = 18,
subjects = c("Math", "English", "Science")
)

# Access student's name

print(student$name) # Outputs: "Alice"

# Access student's age

print(student$age) # Outputs: 18

# Access student's subjects

print(student$subjects) # Outputs: "Math", "English", "Science"
Data Frames in R

A data frame is a 2-dimensional table-like structure where each column can store
data of a different type, but all values in a column must be of the same type. A data
frame is similar to a table in a relational database or a spreadsheet. Data frames
are widely used for storing datasets in R because of their organized row-column
structure.
Key Features of Data Frames:

1. Tabular Structure: Data frames have rows and columns. Each column can have
a different type (e.g., numeric, character, or logical), but every value in a
particular column must be of the same type.
2. Column Names: Columns typically have names, which makes data frames
easy to reference.
3. Access by Row/Column: You can access data in a data frame by specifying
row and column indices, or by column name.
Creation and Accessing Elements:

You can create a data frame using the data.frame() function in R.

# Create a simple data frame

students <- data.frame(
Name = c("John", "Alice", "Bob"),
Age = c(22, 24, 23),
Marks = c(88, 95, 78)
)

# Access the entire data frame

print(students)

# Access a specific column

print(students$Name) # Outputs: "John", "Alice", "Bob"

# Access a specific row

print(students[2, ]) # Outputs the second row
# Access a specific value
print(students[2, 3]) # Outputs: 95 (Marks of Alice)

Use Cases:

● Storing and Analyzing Datasets: Data frames are essential for storing structured
datasets where different columns represent different variables, and rows
represent different observations.
● Data Manipulation: Data frames are the primary data structure used in R
for data cleaning, transformation, and analysis.
● Working with CSV Files: Data frames are used to import and work with
data from CSV files, Excel spreadsheets, and other external data sources.
Example:

Consider a scenario where you're analyzing the scores of students in a class. You
can store the data in a data frame and then calculate summaries or perform
analyses.

# Example: Data frame for students' scores

students <- data.frame(
Name = c("John", "Alice", "Bob"),
Age = c(22, 24, 23),
Marks = c(88, 95, 78)
)

# Calculate the average marks

average_marks <- mean(students$Marks)
print(average_marks) # Outputs: 87

# Subsetting data frame (only students older than 22)

older_students <- students[students$Age > 22, ]
print(older_students)
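
Two further manipulations that often follow, adding a derived column and sorting rows, can be sketched with the same data. The pass mark of 80 is an assumed threshold chosen only for illustration.

```r
# Same student data as above
students <- data.frame(
  Name = c("John", "Alice", "Bob"),
  Age = c(22, 24, 23),
  Marks = c(88, 95, 78),
  stringsAsFactors = FALSE
)

# Add a logical column (80 is an assumed pass mark, not from the notes)
students$Passed <- students$Marks >= 80

# Sort rows by Marks, highest first
ranked <- students[order(-students$Marks), ]
print(ranked)

# Name of the top scorer
top_name <- ranked$Name[1]
print(top_name)   # "Alice"
```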
5. Develop an R program that reads a CSV file and summarizes the data.
# Step 1: Read the CSV file into a data frame
data <- read.csv("your_file.csv")

# Step 2: Display the structure of the data (type of columns, data types)
print("Structure of the data:")
str(data)

# Step 3: Display a summary of the data (mean, median, min, max, etc. for numeric columns)
print("Summary statistics:")
summary(data)

# Step 4: Display the first few rows of the data to inspect

print("First few rows of the data:")
head(data)

# Step 5: Count the number of rows and columns

print(paste("Number of rows:", nrow(data)))
print(paste("Number of columns:", ncol(data)))

Explanation of the Program:

1. read.csv() function reads the CSV file into a data frame.

2. str() function displays the structure of the data: the number of rows and
columns, and each column's name, data type, and first few values.
3. summary() function provides summary statistics (min, quartiles, median,
mean, max) for numeric columns, level counts for factor columns, and length
and class for character columns.
4. head() shows the first few rows of the data.
5. nrow() and ncol() are used to count the number of rows and columns,
respectively.

Usage:

● Make sure to replace "your_file.csv" with the actual path to your
CSV file. You can load any dataset in CSV format and use this program to
analyze it quickly.
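
To try the program without an existing file, one option is to write a small data frame to a temporary CSV first. The sketch below does that; the sample data and the names csv_path and loaded are illustrative.

```r
# Small sample data set (illustrative values)
sample_data <- data.frame(
  Name = c("John", "Alice", "Bob"),
  Marks = c(88, 95, 78)
)

# Write it to a temporary CSV file
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_data, csv_path, row.names = FALSE)

# Read it back and summarize, as in the program above
loaded <- read.csv(csv_path, stringsAsFactors = FALSE)
str(loaded)
print(summary(loaded$Marks))

n_rows <- nrow(loaded)
n_cols <- ncol(loaded)
print(paste("Rows:", n_rows, "Columns:", n_cols))

# Remove the temporary file
unlink(csv_path)
```

Passing row.names = FALSE to write.csv() avoids an extra unnamed column of row numbers when the file is read back.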

6. Demonstrate array with an example.

An array in R is a data structure that holds elements of a single type arranged
in any number of dimensions.
A vector is one-dimensional and a matrix is a two-dimensional array; the general
array() function is typically used when the data needs three or more dimensions.
They are useful when working with large datasets that can be categorized in
different ways, like students' scores in various subjects over multiple semesters.

Syntax for Creating an Array in R:

array(data, dim = c(dim1, dim2, dim3, ...), dimnames = NULL)

data: A vector of elements that will be arranged in the array.

dim: A vector specifying the dimensions (e.g., rows, columns, and layers).
dimnames: (Optional) A list of names for the dimensions.
Example of Creating a Simple Array in R:
# Step 1: Define the data (marks)
marks <- c(85, 90, 78, 88, 92, 81, 87, 89)

# Step 2: Define dimensions (2 subjects, 2 semesters, 2 students)

array_dims <- c(2, 2, 2) # 2 subjects, 2 semesters, 2 students

# Step 3: Create the array

student_marks <- array(marks, dim = array_dims)

# Step 4: Print the array

print(student_marks)

, , 1

     [,1] [,2]
[1,]   85   78
[2,]   90   88

, , 2

     [,1] [,2]
[1,]   92   87
[2,]   81   89

Explanation:

● Data Input: The marks vector holds the students' marks.

● Array Creation: The array() function arranges the data into the
specified dimensions.
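● Output: The array is printed as one 2 x 2 slice per student (the third
dimension), with subjects as rows and semesters as columns. Values are filled
column by column, so the first two marks form the first column of slice 1.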
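
Naming the dimensions makes the array easier to read and index. The sketch below reuses the same marks; the subject, semester, and student labels are hypothetical and chosen only for illustration.

```r
# Same marks as above
marks <- c(85, 90, 78, 88, 92, 81, 87, 89)

# Hypothetical labels for each dimension
student_marks <- array(
  marks,
  dim = c(2, 2, 2),
  dimnames = list(
    Subject = c("Math", "Science"),
    Semester = c("Sem1", "Sem2"),
    Student = c("S1", "S2")
  )
)

# Access a single value by name: Math, semester 2, student 1
one_mark <- student_marks["Math", "Sem2", "S1"]
print(one_mark)    # 78

# Extract the full 2 x 2 slice for student 2
print(student_marks[, , "S2"])

# Average mark per student (margin 3 is the student dimension)
student_means <- apply(student_marks, 3, mean)
print(student_means)   # S1: 85.25, S2: 87.25
```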
