Data Science with R

Lesson 05—Apply Functions

© Copyright 2015, Simplilearn. All rights reserved.
Objectives

After completing this lesson, you will be able to:

• Explain the various types of apply functions
• Define the dplyr package and discuss how to install it
• Describe the various dplyr functions


Topic 1: Types of Apply Functions



Types of Apply Functions

The apply functions call a given function on each row or column of a matrix, or on each element of a list or vector, without an explicit loop. This lesson covers three of them:

• apply
• lapply
• sapply


apply() Function

It applies a function to each row or column of a matrix (or array) and returns a vector, array, or list.


apply() Function (contd.)

It takes three arguments: a matrix or array, a margin, and a function. The syntax to use this function is:

Syntax
apply(x, margin, function)

Where,
• x is the matrix or array to operate on.
• margin indicates whether the function is applied to rows or columns.
o margin = 1 applies the function to each row.
o margin = 2 applies the function to each column.
• function can be any function that accepts a vector, such as mean, sum, or median.


apply() Function (contd.)

A few examples of this function are:

Examples
• m <- matrix(c(1,2,3,4), 2, 2)    # 2 x 2 matrix, filled column by column
• apply(m, 1, sum)                 # row sums: 4 6
• apply(m, 2, sum)                 # column sums: 3 7
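The margin argument is easier to see on a non-square matrix. The following is a minimal sketch in base R; the names m, row_means, col_max, and row_spread are chosen here for illustration only.

```r
# A 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2, ncol = 3)

# margin = 1: the function is called once per row.
row_means <- apply(m, 1, mean)   # 3 4

# margin = 2: the function is called once per column.
col_max <- apply(m, 2, max)      # 2 4 6

# Any function that accepts a vector can be passed, including an
# anonymous one; here, the spread (max - min) of each row.
row_spread <- apply(m, 1, function(v) max(v) - min(v))   # 4 4

print(row_means)
print(col_max)
print(row_spread)
```

With margin = 1 the result has one value per row (two values here), and with margin = 2 it has one value per column (three values here).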



lapply() Function

It takes a list as an argument and works by looping through each element in the list. The output of this function is always a list. The syntax to use this function, along with some examples, is:

Syntax
lapply(list, function)

Examples
• lst <- list(a = c(1,1), b = c(2,2), c = c(3,3))
• lapply(lst, sum)
• lapply(lst, mean)
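As a rough illustration of what comes back from lapply(), the sketch below inspects the result. The variable names scores, totals, and doubled are assumptions made for this example.

```r
# A named list of numeric vectors.
scores <- list(a = c(1, 1), b = c(2, 2), c = c(3, 3))

# lapply() calls sum() on each element and returns a list
# with the same length and the same names as the input.
totals <- lapply(scores, sum)
str(totals)

# Individual results are accessed like any other list element.
totals$b   # 4

# An anonymous function can be used in place of a named one.
doubled <- lapply(scores, function(v) v * 2)
doubled$c  # 6 6
```

Because the result is a list even when every element is a single number, unlist(totals) is a common follow-up step when a plain vector is wanted.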



sapply() Function

It is similar to lapply(), except that it simplifies the result so that:

• If the result is a list and every element in the list is of size 1, then a vector is returned.
• If the result is a list and every element in the list is of the same size (>1), then a matrix is returned.
• Otherwise, the result is returned as a list.

The syntax to use this function, along with some examples, is:

Syntax
sapply(list, func)

Examples
• lst <- list(a = c(1,1), b = c(2,2), c = c(3,3))
  sapply(lst, sum)      # every result has size 1, so a vector is returned
• lst <- list(a = c(1,2), b = c(1,2,3), c = c(1,2,3,4))
  sapply(lst, range)    # every result has size 2, so a matrix is returned
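The three simplification cases can be checked directly. This is a small sketch in base R; the names lst, as_vector, as_matrix, and as_list are illustrative only.

```r
lst <- list(a = c(1, 1), b = c(2, 2), c = c(3, 3))

# Case 1: every result has length 1, so sapply() returns a named vector.
as_vector <- sapply(lst, sum)      # a = 2, b = 4, c = 6

# Case 2: every result has the same length (2), so sapply() returns
# a matrix with one column per list element.
as_matrix <- sapply(lst, range)    # 2 rows (min, max) x 3 columns

# Case 3: results have different lengths, so the list is returned as is.
as_list <- sapply(list(a = 1:2, b = 1:3), function(v) v)

print(as_vector)
print(dim(as_matrix))
print(class(as_list))
```

When the shape of the result matters, vapply() is the stricter relative of sapply(): it requires the expected type and length of each result to be declared up front.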



Topic 2: Defining and Installing the dplyr Package


dplyr Package: An Overview

dplyr is a powerful R package:
• It transforms and summarizes tabular data with rows and columns.
• It provides simple verbs, functions that correspond to the most common data manipulation tasks, to help you translate your thoughts into code.
• It uses efficient data storage backends, which results in quicker processing.


dplyr Package: The Five Verbs

dplyr is referred to as the grammar of data manipulation. It provides the following five verbs (or functions) that are applied to a data set:
• select: selects columns of a table or data frame
• filter: keeps the rows (records) of a table or data frame that meet a condition
• arrange: reorders the rows of a table or data frame
• mutate: adds new columns computed from existing ones
• summarize: collapses many values into a summary


Installing the dplyr Package

Remember: dplyr is not part of the base R installation.

To install it, use the following command:

install.packages("dplyr")

To load it into memory, use the following command:

library(dplyr)


Topic 3: Functions of the dplyr Package


Functions of the dplyr Package

The dplyr package has the following functions:

• select()
• filter()
• arrange()
• mutate()
• summarize()

To understand the use of these functions, let’s consider the built-in dataset "mtcars".


Functions of the dplyr Package: select()

This function allows you to select specific columns from large data sets.

To select columns by name

select(mtcars, mpg, disp)

To select a range of columns by name

select(mtcars, mpg:hp)

To select columns whose names match a string or pattern

select(iris, starts_with("Petal"))
select(iris, ends_with("Width"))
select(iris, contains("etal"))
select(iris, matches(".t."))
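The sketch below shows which columns each style of selection keeps. It assumes dplyr is installed; the result names (by_name, by_range, petal_cols, no_species) are made up for this example.

```r
library(dplyr)

# Keep two columns by name.
by_name <- select(mtcars, mpg, disp)

# Keep a contiguous range of columns, from mpg through hp.
by_range <- select(mtcars, mpg:hp)

# Keep columns whose names start with "Petal".
petal_cols <- select(iris, starts_with("Petal"))

# A leading minus drops a column instead of keeping it.
no_species <- select(iris, -Species)

names(by_range)     # "mpg" "cyl" "disp" "hp"
names(petal_cols)   # "Petal.Length" "Petal.Width"
```

In every case select() changes only which columns are present; the number of rows is unchanged.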



Functions of the dplyr Package: filter()

This function keeps only the rows that satisfy a condition, which makes it easy to narrow down to the relevant data. The two types of filters are explained below:

Simple filter

filter(mtcars, cyl == 8)
filter(mtcars, cyl < 6)

Multiple criteria filter

filter(mtcars, cyl < 6 & vs == 1)


filter(mtcars, cyl < 6 | vs == 1)

Note: Comma-separated arguments are equivalent to the "And" condition. Example: filter(mtcars, cyl < 6, vs == 1)
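The equivalence between the comma form and the & form can be verified directly. This is a minimal sketch that assumes dplyr is installed; and_comma, and_amp, and or_rows are illustrative names.

```r
library(dplyr)

# Two ways of writing the same "And" filter.
and_comma <- filter(mtcars, cyl < 6, vs == 1)
and_amp   <- filter(mtcars, cyl < 6 & vs == 1)

# Both calls keep exactly the same rows.
isTRUE(all.equal(and_comma, and_amp))

# The "Or" filter keeps a row when either condition holds, so it
# can never return fewer rows than the "And" filter.
or_rows <- filter(mtcars, cyl < 6 | vs == 1)
nrow(and_comma) <= nrow(or_rows)
```

There is no comma shorthand for "Or"; the | operator has to be written out inside a single argument.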



Functions of the dplyr Package: arrange()

This function sorts the rows of a data set by one or more columns, in ascending order by default. The syntax to use this function, along with some examples, is:

Syntax
arrange(data, ordering_column)

Examples
• arrange(mtcars, desc(disp))    # sort by disp, largest first
• arrange(mtcars, cyl, disp)     # sort by cyl, then by disp within each cyl
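To see the effect of the ordering, the sketch below inspects the sorted columns. It assumes dplyr is installed; by_disp_desc and by_cyl_then_disp are names chosen for this example.

```r
library(dplyr)

# Largest displacement first.
by_disp_desc <- arrange(mtcars, desc(disp))
head(by_disp_desc$disp, 3)

# Sort by cyl first; ties within the same cyl are ordered by disp.
by_cyl_then_disp <- arrange(mtcars, cyl, disp)
head(by_cyl_then_disp[, c("cyl", "disp")])

# arrange() only reorders rows; no rows are added or dropped.
nrow(by_disp_desc) == nrow(mtcars)
```

The columns stay in their original positions; only the row order changes.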



Functions of the dplyr Package: mutate()

This function adds new variables (columns) to an existing data set. The syntax to use this function, along with an example, is:

Syntax
mutate(data, new_column = expression)

Example
mutate(mtcars, my_custom_disp = disp / 1.0237)
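As a further sketch, mutate() can create several columns in one call, and a later column may refer to one created earlier in the same call. The column names disp_l and disp_l_per_cyl are hypothetical, and the conversion relies on mtcars recording disp in cubic inches (1 litre is roughly 61.0237 cubic inches). dplyr is assumed to be installed.

```r
library(dplyr)

cars_l <- mutate(
  mtcars,
  disp_l = disp / 61.0237,        # displacement converted to litres
  disp_l_per_cyl = disp_l / cyl   # uses the column created just above
)

# The original columns are kept; the new ones are appended at the end.
tail(names(cars_l), 2)   # "disp_l" "disp_l_per_cyl"
```

mutate() returns a new data frame and leaves mtcars itself unchanged, so the result has to be assigned to be kept.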



Functions of the dplyr Package: summarize()

This function collapses multiple values into a single value. dplyr accepts both spellings, summarise() and summarize(). Here are examples of using this function without and with the group_by() function:

Summarize without group_by()
summarise(mtcars, mean(disp))

Summarize with group_by()
summarise(group_by(mtcars, cyl), mean(disp))
summarise(group_by(mtcars, cyl), m = mean(disp), sd = sd(disp))
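The difference between the ungrouped and grouped forms shows up in the number of rows returned. The sketch below assumes dplyr is installed; the output column names mean_disp and sd_disp are chosen for illustration, and the pipe form at the end is simply an alternative spelling of the nested call.

```r
library(dplyr)

# Without grouping: one row for the whole data set.
overall <- summarise(mtcars, mean_disp = mean(disp))

# With grouping: one row per distinct value of cyl.
by_cyl <- summarise(
  group_by(mtcars, cyl),
  mean_disp = mean(disp),
  sd_disp = sd(disp)
)

# The same grouped summary written with the pipe operator.
by_cyl_piped <- mtcars %>%
  group_by(cyl) %>%
  summarise(mean_disp = mean(disp), sd_disp = sd(disp))

nrow(overall)   # 1
nrow(by_cyl)    # 3, one row each for 4, 6, and 8 cylinders
```

Naming the summary columns, as done here, makes the result easier to use in later steps than the default names derived from the expressions.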



Functions of the dplyr Package: summarize() (contd.)

Here’s a list of summary functions that can be used within this function:
• first(x): Returns the first element of a vector
• last(x): Returns the last element of a vector
• nth(x, n): Returns the nth element of a vector
• n(): Returns the number of rows in the data frame (or in the current group)
• n_distinct(x): Returns the number of unique values in vector x

In addition, the following base R functions are also used:

mean, median, max, min, sum, var, length, IQR

Note: base R's mode() returns the storage type of an object, not its most frequent value, so it is not a statistical summary function.
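A short sketch of the dplyr-specific helpers inside a grouped summary follows. It assumes dplyr is installed; the output column names (n_cars, n_gears, first_mpg, third_mpg, last_mpg) are hypothetical.

```r
library(dplyr)

stats <- summarise(
  group_by(mtcars, cyl),
  n_cars    = n(),                # rows in each cyl group
  n_gears   = n_distinct(gear),   # distinct gear values in the group
  first_mpg = first(mpg),         # mpg of the first car in the group
  third_mpg = nth(mpg, 3),        # mpg of the third car in the group
  last_mpg  = last(mpg)           # mpg of the last car in the group
)

stats$n_cars   # 11 7 14
```

The n() helper takes no arguments and only works inside dplyr verbs such as summarise(), mutate(), and filter().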



Quiz

QUIZ 1
Which of the following statements is true about apply(x, margin, function)?

a. When margin = 2, the function needs to be applied to a row.
b. When margin = 1, the function needs to be applied to a row.
c. x must be of type list.
d. Only arithmetic functions can be passed to the apply() function.

The correct answer is b.

Explanation: In apply(x, margin, function), margin = 1 means that the function is applied to each row; margin = 2 means that it is applied to each column.


QUIZ 2
Identify an accurate statement about the lapply() function.

a. It takes a list as an argument and works by looping through each element in the list.
b. It takes a list, an array, or a matrix and loops through each element in the list.
c. It is not a standalone function and needs to be applied with the apply() function.
d. It is used when the latitude and longitude of an object come into the picture.

The correct answer is a.

Explanation: The lapply() function takes a list as an argument and works by looping through each element in the list.

QUIZ 3
State whether the following statement is true or false.
dplyr is a powerful R package for transforming and summarizing tabular data with rows and columns. It is also referred to as the grammar of data manipulation.

a. True
b. False

The correct answer is a.

Explanation: dplyr is a powerful R package for transforming and summarizing tabular data with rows and columns. It is also referred to as the grammar of data manipulation.


QUIZ 4
Which dplyr command is used to reorder the rows of a data set?

a. order_data(data, ordering_column)
b. sort_data(data, ordering_column)
c. dplyr(data, ordering_column)
d. arrange(data, ordering_column)

The correct answer is d.

Explanation: The dplyr command arrange(data, ordering_column) reorders the rows of a data set by the values of ordering_column.


Summary

Let us summarize the topics covered in this lesson:

• The apply family of functions includes apply, lapply, sapply, tapply, vapply, and mapply; this lesson covered apply, lapply, and sapply.
• dplyr is a powerful R package that transforms and summarizes tabular data with rows and columns.
• dplyr is not part of the base R installation and needs to be installed separately.
• The five main dplyr functions are select, filter, arrange, mutate, and summarize.


This concludes “Apply Functions.”

The next lesson is “Data Visualization.”
