Section A
1.
Here are two widely used R packages and their applications:
1. ggplot2
• Purpose: Data visualization
• Applications:
o Creating complex and aesthetically pleasing plots (scatter plots, bar charts, line
graphs, etc.)
o Customizing visualizations with themes, colors, and annotations.
o Layering data representations (e.g., overlays of lines on scatter plots).
o Supporting exploratory data analysis and presenting results effectively.
2. dplyr
• Purpose: Data manipulation and transformation
• Applications:
o Filtering rows and selecting specific columns from datasets.
o Summarizing data with aggregate functions like mean, sum, or count.
o Grouping data for group-wise operations.
o Joining datasets using various types of joins (e.g., left_join, inner_join).
o Arranging data by sorting rows based on column values.
These packages are part of the tidyverse, a collection of R packages designed for data science.
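As a minimal sketch of the applications listed above, the following uses the built-in mtcars dataset. The 100 hp cutoff and the names mpg_by_cyl and p are chosen only for illustration, and both packages are assumed to be installed.

```r
library(dplyr)
library(ggplot2)

# dplyr: keep cars with more than 100 hp, then summarise mpg per cylinder count
mpg_by_cyl <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n()) %>%
  arrange(desc(mean_mpg))
print(mpg_by_cyl)

# ggplot2: scatter plot with a fitted straight line layered on top
p <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "mpg vs disp", x = "disp", y = "mpg") +
  theme_minimal()
print(p)
```

The dplyr pipeline covers filtering, grouping, summarising, and arranging; the ggplot2 call shows layering (points plus a line), labels, and a theme.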
2.
> df=data.frame(name=c("elena","damon","devika","stefan","klaus"),age=c(24,23,22,22,28))
> print(df)
name age
1 elena 24
2 damon 23
3 devika 22
4 stefan 22
5 klaus 28
> customers=data.frame(customerid=c(101,102,103,104,105),
+ name=c("elena","damon","devika","stefan","klaus"),
+ age=c(24,23,22,22,28),
+ email=c("[email protected]","[email protected]","devika.15[email protected]","[email protected]","[email protected]"))
> print(customers)
customerid name age email
3.
> x=rnorm(100)
> y=rnorm(100)
> plot(x,y,main = "scatter plot",xlab = "x-axis",ylab = "y-axis",pch=19)
> data("mtcars")
> plot(mtcars$disp,mtcars$mpg,main = "relationship",xlab = "disp",ylab = "mpg",pch=19)
4.
> X=c("male","female","male","male","female","male")
> factor(X)
[1] male female male male female male
Levels: female male
> dim(mtcars)
[1] 32 11
> d=matrix(1:6,nrow=2,ncol=3)
> print(d)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> matrix_data=matrix(1:6,nrow=2,ncol=3)
> print(matrix_data)
[,1] [,2] [,3]
[1,] 1 3 5
[2,] 2 4 6
> List=list(A=1:3,c("air","water","fire"))
> print(List)
$A
[1] 1 2 3
[[2]]
[1] "air" "water" "fire"
> List[-1]
[[1]]
[1] "air" "water" "fire"
5.
> Matrix1=matrix(c(4, 19, 26, 9), nrow = 2, byrow = TRUE)
> Matrix2=matrix(c(16,11,1,5), nrow = 2, byrow = TRUE)
> matrixproduct=Matrix1 %*% Matrix2
> print(matrixproduct)
[,1] [,2]
[1,] 83 139
[2,] 425 331
> coin=c("T","H")
> sample(coin,1)
[1] "T"
> cor(iris$Species,iris$Sepal.Length)
Error in cor(iris$Species, iris$Sepal.Length) : 'x' must be numeric
Explanation:
• The iris$Species column is a factor (categorical data), while iris$Sepal.Length is numeric.
• The cor() function calculates the correlation coefficient between two numeric vectors, but
since iris$Species is not numeric, this operation is invalid.
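To illustrate the point, here is a short sketch of two operations that are valid on iris: a correlation between two numeric columns, and group means as one way to relate the Species factor to a numeric column. The variable names r and means are chosen only for illustration.

```r
data("iris")

# cor() works when both arguments are numeric vectors
r <- cor(iris$Sepal.Length, iris$Petal.Length)
print(r)

# For a factor, compare the numeric column across the factor's levels instead
means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(means)
```

Comparing group means is not a correlation, but it answers the related question of how Sepal.Length differs between species.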
> x=c(10,21,33,12)
> y=c(11,23,24,27)
> z=c(x,y)
> print(z)
[1] 10 21 33 12 11 23 24 27