

R Programming End Term

The document outlines two popular R packages, ggplot2 for data visualization and dplyr for data manipulation, detailing their purposes and applications. It also provides examples of creating data frames, plotting data, and performing matrix operations in R. Additionally, it discusses the importance of data types in correlation calculations, highlighting an error when attempting to correlate categorical and numeric data.

Uploaded by

devikathaduru
Copyright © All Rights Reserved

Section A

1.

Here are two widely used R packages and their applications:

1. ggplot2

• Purpose: Data visualization

• Applications:

o Creating complex and aesthetically pleasing plots (scatter plots, bar charts, line graphs, etc.)

o Customizing visualizations with themes, colors, and annotations.

o Layering data representations (e.g., overlays of lines on scatter plots).

o Supporting exploratory data analysis and presenting results effectively.

2. dplyr

• Purpose: Data manipulation and transformation

• Applications:

o Filtering rows and selecting specific columns from datasets.

o Summarizing data with aggregate functions like mean, sum, or count.

o Grouping data for group-wise operations.

o Joining datasets using various types of joins (e.g., left_join, inner_join).

o Arranging data by sorting rows based on column values.

These packages are part of the tidyverse, a collection of R packages designed for data science.
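
As a rough illustration of the applications listed above, the sketch below runs a small dplyr pipeline and builds a layered ggplot2 plot on the built-in mtcars data. The column choices, the hp > 100 cutoff, and the object names are arbitrary assumptions made for the example, and both packages are assumed to be installed already.

```r
library(dplyr)
library(ggplot2)

# dplyr: filter rows, group, summarise, and sort
mpg_by_cyl <- mtcars %>%
  filter(hp > 100) %>%                              # arbitrary horsepower cutoff
  group_by(cyl) %>%                                 # one group per cylinder count
  summarise(mean_mpg = mean(mpg), cars = n()) %>%   # aggregate within each group
  arrange(desc(mean_mpg))                           # highest average mpg first

print(mpg_by_cyl)

# ggplot2: a scatter plot with a fitted straight line layered on top
p <- ggplot(mtcars, aes(x = disp, y = mpg)) +
  geom_point() +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  labs(title = "mpg vs. disp", x = "disp", y = "mpg") +
  theme_minimal()

print(p)
```

Each `+` in the ggplot2 call adds one layer or setting, which is what "layering data representations" refers to.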

2.

> name=c("elena","damon","devika","stefan","klaus")

> age=c(24,23,22,22,28)

> data.frame(name,age)

    name age
1  elena  24
2  damon  23
3 devika  22
4 stefan  22
5  klaus  28

> customers=data.frame(customerid=c(101,102,103,104,105),
+   name=c("elena","damon","devika","stefan","klaus"),
+   age=c(24,23,22,22,28),
+   email=c("[email protected]","[email protected]","devika.15[email protected]","[email protected]","[email protected]"))

> print(customers)

  customerid   name age             email
1        101  elena  24 [email protected]
2        102  damon  23 [email protected]
3        103 devika  22 [email protected]
4        104 stefan  22 [email protected]
5        105  klaus  28 [email protected]
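
Once a data frame exists, its columns and rows can be inspected and subset with base R. The sketch below rebuilds the customer data without the email column (omitted here only to keep the example short); the object names mean_age, older, and by_age are illustrative.

```r
# Same customer data as above, minus the email column
customers <- data.frame(
  customerid = c(101, 102, 103, 104, 105),
  name = c("elena", "damon", "devika", "stefan", "klaus"),
  age = c(24, 23, 22, 22, 28)
)

str(customers)                               # column types and a preview of the values

mean_age <- mean(customers$age)              # access one column with $
older <- customers[customers$age > 22, ]     # keep rows where age exceeds 22
by_age <- customers[order(customers$age), ]  # sort rows by age, ascending

print(mean_age)
print(older)
print(by_age)
```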

3.

> x=rnorm(100)

> y=rnorm(100)

> plot(x,y,main = "scatter plot",xlab = "x-axis",ylab = "y-axis",pch=19)

> data("mtcars")

> plot(mtcars$disp,mtcars$mpg,main = "relationship",xlab = "disp",ylab = "mpg",pch=19)

4.

> X=c("male","female","male","male","female","male")

> factor(X)

[1] male female male male female male

Levels: female male
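
The levels appear as "female male" because factor() sorts the distinct values alphabetically by default. A short sketch of how to inspect a factor and choose a different level order (the names f, counts, and f2 are illustrative):

```r
X <- c("male", "female", "male", "male", "female", "male")
f <- factor(X)

levels(f)            # "female" "male": sorted alphabetically by default
counts <- table(f)   # number of observations per level
print(counts)

as.integer(f)        # the underlying integer codes: 2 1 2 2 1 2

# Set the level order explicitly when alphabetical order is not wanted
f2 <- factor(X, levels = c("male", "female"))
levels(f2)           # "male" "female"
```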

> dim(mtcars)

[1] 32 11

> d=matrix(1:6,nrow=2,ncol=3)

> print(d)

     [,1] [,2] [,3]
[1,]    1    3    5
[2,]    2    4    6

> matrix_data=matrix(c(1:6,nrow=2,ncol=3))

> print(matrix_data)

     [,1]
[1,]    1
[2,]    2
[3,]    3
[4,]    4
[5,]    5
[6,]    6
[7,]    2
[8,]    3
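
The 8 x 1 result follows from where the parentheses close: nrow=2 and ncol=3 sit inside c(), so they become two extra elements of the vector and never reach matrix() as arguments. A short sketch of the difference (the names v, as_column, as_grid, and by_row are illustrative):

```r
# nrow and ncol inside c() are just two more (named) elements
v <- c(1:6, nrow = 2, ncol = 3)
length(v)                          # 8

as_column <- matrix(v)             # no dimensions given, so a single column
dim(as_column)                     # 8 1

# With the arguments passed to matrix() itself, the shape is 2 x 3
as_grid <- matrix(1:6, nrow = 2, ncol = 3)
dim(as_grid)                       # 2 3

# matrix() fills column by column unless byrow = TRUE
by_row <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
print(by_row)
```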

> List=list(A=1:3,c("air","water","fire"))

> print(List)

$A

[1] 1 2 3

[[2]]

[1] "air" "water" "fire"

> List[-1]

[[1]]

[1] "air" "water" "fire"
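
List[-1] drops the first element and still returns a list. The sketch below contrasts the common ways of indexing a list; the names sub, elem, same, and second are illustrative.

```r
List <- list(A = 1:3, c("air", "water", "fire"))

sub <- List[1]            # single brackets: a list of length 1
elem <- List[[1]]         # double brackets: the element itself (an integer vector)
same <- List$A            # $ extracts a named element, like [["A"]]
second <- List[[2]][2]    # second value of the second element: "water"

names(List)               # "A" "" (the second element was given no name)
```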

5.

> Matrix1=matrix(c(4, 19, 26, 9), nrow = 2, byrow = TRUE)

> Matrix2=matrix(c(16,11,1,5), nrow = 2, byrow = TRUE)

> matrixproduct=Matrix1 %*% Matrix2

> print(matrixproduct)

     [,1] [,2]
[1,]   83  139
[2,]  425  331
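
To see where one entry of the product comes from, the sketch below recomputes the top-left entry by hand and contrasts %*% with element-wise multiplication. The names top_left, product, and elementwise are illustrative.

```r
Matrix1 <- matrix(c(4, 19, 26, 9), nrow = 2, byrow = TRUE)
Matrix2 <- matrix(c(16, 11, 1, 5), nrow = 2, byrow = TRUE)

# Entry [1,1] is row 1 of Matrix1 times column 1 of Matrix2
top_left <- sum(Matrix1[1, ] * Matrix2[, 1])   # 4*16 + 19*1 = 83

# %*% is the matrix product; * multiplies element by element
product <- Matrix1 %*% Matrix2
elementwise <- Matrix1 * Matrix2

print(product)
print(elementwise)
```

Using * where %*% is intended is a common slip, since both run without error on matrices of the same shape.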

> coin=c("T","H")

> sample(coin,1)

[1] "T"
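
Because sample() is random, the result may be "T" or "H" on any given run. A minimal sketch of making the draw reproducible and simulating several tosses, assuming an arbitrary seed of 42 and 10 tosses:

```r
set.seed(42)                  # arbitrary seed so the draws can be repeated
coin <- c("T", "H")

# replace = TRUE is required to draw more values than the vector holds
tosses <- sample(coin, 10, replace = TRUE)
print(tosses)
print(table(tosses))          # how often each side came up

# Resetting the same seed reproduces the same sequence
set.seed(42)
repeat_tosses <- sample(coin, 10, replace = TRUE)
```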

> cor(iris$Species,iris$Sepal.Length)

Error in cor(iris$Species, iris$Sepal.Length) : 'x' must be numeric

Explanation:

• The iris$Species column is a factor (categorical data), while iris$Sepal.Length is numeric.

• The cor() function computes the correlation coefficient between numeric vectors. Because iris$Species is a factor, not a numeric vector, the call stops with the error shown above.
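
The sketch below shows calls that do work on iris: a correlation between two numeric columns, a per-species summary as one way to relate the categorical column to a numeric one, and a correlation matrix restricted to the numeric columns. The choice of columns and the object names are illustrative assumptions.

```r
# Correlation between two numeric columns works as expected
r <- cor(iris$Sepal.Length, iris$Petal.Length)
print(r)                                   # about 0.87

# To relate Species to Sepal.Length, compare the group means
means <- tapply(iris$Sepal.Length, iris$Species, mean)
print(means)

# Correlation matrix over the numeric columns only
num_cols <- sapply(iris, is.numeric)       # TRUE for the four measurement columns
cor_matrix <- cor(iris[, num_cols])
print(round(cor_matrix, 2))
```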

> x=c(10,21,33,12)

> y=c(11,23,24,27)

> z=c(x,y)

> print(z)

[1] 10 21 33 12 11 23 24 27
