Introduction to R and R-Studio
What is R?
• R is a popular programming language used for statistical computing
and graphical presentation.
• Its most common use is to analyze and visualize data.
Why Use R?
• It is a great resource for data analysis, data visualization, data science
and machine learning.
• It provides many statistical techniques (such as statistical tests,
classification, clustering and data reduction).
• It is easy to draw graphs in R, such as pie charts, histograms, box
plots and scatter plots.
• It works on different platforms (Windows, Mac, Linux).
• It is open-source and free.
• It has many packages (libraries of functions) that can be used to solve
different problems.
The R User Interface
• R-Studio gives you a way to talk to your computer. R gives you a
language to speak in.
• To get started, open R-Studio just as you would open any other
application on your computer. When you do, a window should appear
on your screen.
The R-Studio interface is simple. You type R code into the bottom line
of the R-Studio console pane and then press Enter to run it (Ctrl + Enter
runs the current line when you work in the script editor instead).
The code you type is called a command, because it will command your
computer to do something for you. The line you type it into is called
the command line.
• When you type a command at the prompt and hit Enter, your
computer executes the command and shows you the results.
• Then R-Studio displays a fresh prompt for your next command. For
example, if you type 1 + 1 and hit Enter, R-Studio will display:
> 1 + 1
[1] 2
>
• You’ll notice that a [1] appears next to your result.
• R is just letting you know that this line begins with the first value in
your result.
• Some commands return more than one value, and their results may
fill up multiple lines. For example, the command 100:130 returns 31
values, and each printed line starts with the bracketed index of its
first value.
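As a small sketch of the multi-line case, the snippet below stores 100:130 in a variable and prints it. The name values is only illustrative. How many values fit on each line depends on the width of your console, so the exact layout may differ from one machine to another.

```r
# Build the sequence 100, 101, ..., 130 with the : operator
values <- 100:130

# The sequence holds 31 values
length(values)

# Printing wraps across several lines; each line starts with
# the bracketed index of the first value shown on that line
print(values)
```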
• If you type an incomplete command and press Enter, R will display a
+ prompt, which means it is waiting for you to type the rest of your
command.
• Either finish the command or hit Escape to start over.
> 1 +
+
+ 1
[1] 2
• If you type a command that R doesn’t recognize, R will return an error
message.
• R is just telling you that your computer couldn’t understand or do
what you asked it to do.
• You can then try a different command at the next prompt:
> 3%
Error: unexpected input in "3%"
Exercise
That’s the basic interface for executing R code in R-Studio. Think you
have it? If so, try doing these simple tasks.
1. Choose any number and add 2 to it.
2. Multiply the result by 3.
3. Subtract 6 from the answer.
4. Divide what you get by 3.
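One possible run of the exercise is sketched below, assuming you picked 10 as your starting number (any other number works the same way). Each line can be typed at the prompt on its own.

```r
10 + 2   # 1. choose 10 and add 2        -> 12
12 * 3   # 2. multiply the result by 3   -> 36
36 - 6   # 3. subtract 6 from the answer -> 30
30 / 3   # 4. divide what you get by 3   -> 10
```

Whatever number you start with, the four steps bring you back to it, because ((x + 2) * 3 - 6) / 3 simplifies to x.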
Syntax
• To output text in R, use single or double quotes:
Example: "Hello World"
• To output numbers, just type the number (without quotes):
Example: 5, 10
Note:
Unlike many other programming languages, R lets you output a value
without using a print function: just type it at the prompt.
• However, R does have a print() function available if you want to use
it.
• You might have used it with other programming languages, such as
Python, which often uses its print() function to output values.
• There are times you must use print() to output values, for example
inside a for loop, where results are not printed automatically:
for (x in 1:10) {
  print(x)
}
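To make the difference concrete, here is a minimal sketch to type at the console. It assumes nothing beyond base R, and the range 1:3 is chosen only to keep the output short.

```r
# At the top level, R prints the value of an expression automatically
"Hello World"
5

# Inside a for loop, a bare expression is evaluated but not printed
for (x in 1:3) {
  x
}

# Wrapping the value in print() makes the loop show each number
for (x in 1:3) {
  print(x)
}
```

Only the second loop produces output, which is why print() is worth knowing even though the console usually prints results for you.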
• The : operator returns its results as a vector, a one-dimensional set of
numbers:
> 1:6
[1] 1 2 3 4 5 6
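The short sketch below checks that the result of : really is a vector. The variable name one_to_six is hypothetical, and is.vector() and length() are base R functions.

```r
# The : operator builds a sequence of whole numbers
one_to_six <- 1:6

is.vector(one_to_six)   # TRUE: the result is a vector
length(one_to_six)      # 6: the vector holds six values

# The sequence can also count down
6:1
```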
Comments
• Comments can be used to explain R code and to make it more
readable. They can also be used to prevent execution when testing
alternative code.
• Comments start with #. When executing code, R ignores everything
from the # to the end of the line.
> # This is a comment
> "Hello World!"
[1] "Hello World!"
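The two uses of comments mentioned above can be sketched as follows. The strings are placeholders chosen for illustration.

```r
# A comment on its own line explains the code that follows
"Hello World!"  # a comment can also follow code on the same line

# Commenting out a line prevents it from running while you
# test an alternative:
# "Good morning!"
"Good evening!"
```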
Multiline Comments
• Unlike other programming languages, such as Java, there is no syntax
in R for multiline comments.
• However, we can just insert a # for each line to create multiline
comments.
> # This is a comment
> # written in
> # more than just one line
> "This is a comment"
Variables
Creating Variables in R:
• Variables are containers for storing data values.
• R does not have a command for declaring a variable.
• A variable is created the moment you first assign a value to it.
• To assign a value to a variable, use the <- operator.
• To output (or print) the variable value, just type the variable name.
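A minimal sketch of these points is shown below. The variable names name and age and their values are made up for illustration.

```r
# Assign values with <- ; each variable is created on first assignment
name <- "John"
age <- 40

# Typing a variable name prints its value
name   # [1] "John"
age    # [1] 40

# Assigning again replaces the stored value
age <- 41
age    # [1] 41
```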
Variable Names
Rules for R variable names:
• A variable name must start with a letter or a period (.) and can be a
combination of letters, digits, periods (.) and underscores (_).
• If it starts with a period (.), it cannot be followed by a digit.
• A variable name cannot start with a number or an underscore (_).
• Variable names are case-sensitive (age, Age and AGE are three
different variables)
• Reserved words cannot be used as variables (TRUE, FALSE, NULL,
if...)
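The rules above can be illustrated with a short sketch. All names here are hypothetical, and the invalid assignments are left commented out because each one would raise an error if run.

```r
# Valid names: letters, digits, periods and underscores
total_sales <- 1
total.sales <- 2
totalSales2 <- 3
.hidden <- 4       # may start with a period if no digit follows

# Names are case-sensitive: these are three separate variables
age <- 10
Age <- 20
AGE <- 30
c(age, Age, AGE)   # [1] 10 20 30

# Each of these lines would raise an error if uncommented:
# 1st_value <- 5   # starts with a digit
# _total <- 5      # starts with an underscore
# FALSE <- 5       # reserved word
```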
Multiple Variables
• R allows you to assign the same value to multiple variables in one
line:
> var1 <- var2 <- var3 <- "Orange"
> var1
> var2
> var3
Note:
A variable name cannot contain special symbols such as ^, !, $, @, +, -, /, or *.
Exercise
Which of the following can be variable names?
myvar 2myvar my_var myVar my-var
MYVAR my var Myvar2 my_v@r
.myvar .2myvar TRUE
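Once you have written down your own answers, one way to check them is sketched below. It relies on the base R function make.names(), which rewrites a string into a syntactically valid name, so a string that comes back unchanged was already valid. The variable names candidates and is_valid are illustrative.

```r
# Candidate names from the exercise, stored as strings
candidates <- c("myvar", "2myvar", "my_var", "myVar", "my-var",
                "MYVAR", "my var", "Myvar2", "my_v@r",
                ".myvar", ".2myvar", "TRUE")

# A candidate is valid if make.names() leaves it unchanged
is_valid <- make.names(candidates) == candidates

# Names that can be used as variables
candidates[is_valid]

# Names that R would reject
candidates[!is_valid]
```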