Coding Notes:
Tour of RStudio
How to make a project file:
New Project > New Directory > New Project > Name > Create Project
- Opening the project file (.Rproj) launches RStudio with this project already loaded (not a new, empty session)
Folders within Project
Files > New Folder > Name > Save
Creating a Script
- File > New File > R Script > Save As
Run: runs the selected code (a single line or a chunk); shortcut is Command + Enter (Ctrl + Enter on Windows/Linux)
Source: run the whole script from top to bottom
Outline button: if you put sections in your script it shows an outline; use it to find sections of the script
- If you will run a piece of code more than once, put it in a script and save it
Console Section
- Output of script ends up here (telling us what is being run and what the output is)
- able to use this to do simple maths
- able to create a variable, e.g. X = 7; the value appears in the environment and the variable can then be used in a calculation, e.g. X = 7; Y = 3; type X+Y in the console and it will give the answer
- good for when you only want to do a calculation or install a package once
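The console example above, written out as a minimal sketch. It uses <- (the assignment symbol these notes use later for reading data); = also works for this at the console. The name total is just an illustrative choice.

```r
# Create two variables; each one appears in the Environment section
X <- 7
Y <- 3

# Use the variables in a calculation and store the answer
total <- X + Y
total  # prints [1] 10
```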
Broom icon: cleaning tool; sweeps out the console so you can start again
Environment section
- where data appears when you feed it in
- new data frames you create will appear here
- create variables or paths to data and they will appear here
- history section shows what we have been doing so far
Broom icon: gets rid of everything in the environment section
Files section
- shows the folder area
Plots
- where plots that have been drawn appear
Packages
- list of all the packages that have been installed
Installing and Loading Packages
Packages: bundles of code designed to solve a particular problem, often in data science
- only need to be installed once, so do this in the console
Install a package:
- type install.packages("nameofpackage") in the console and press Enter
- or: Packages > Install > search for nameofpackage > Install
- able to search the list on CRAN to see if there are packages for the kind of analysis needed, e.g. Bayesian plots, and then install them
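One possible way to avoid reinstalling a package that is already there is to check first. This is only a sketch: needs_install is a made-up helper name, and "dplyr" is used as an example package.

```r
# Hypothetical helper: TRUE when a package is not installed yet
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# stats ships with R, so it never needs installing
needs_install("stats")

# Typical use, run once in the console (needs an internet connection):
# if (needs_install("dplyr")) install.packages("dplyr")
```

The install line is left as a comment so that running the sketch does not download anything by accident.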
Load a package:
- open the script
- File > script > open script
- *should leave notes in the script using #
- ------: ending a comment line with four or more dashes makes it a section, which activates the outline tab in the script section
- To load packages: type library(nameofpackage) in the script section > highlight the library lines > Run
The message when loading dplyr says that filter and lag also exist in another package (stats) and can get confused, so make sure to use the double colon (e.g. dplyr::filter) when using dplyr
The message when loading here tells us where the project folder is, i.e. where the data will live
***load packages in the script section because they need to be loaded every time you start an R session
Types of packages available on RStudio: https://cran.rstudio.com/
Vignette: long form guide for using the package
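A small sketch of why the double colon matters. R's built-in stats package has its own filter() that does something completely different from dplyr's, so writing packagename::function makes it explicit which one is meant. The toy data frame and its column names are made up, and the dplyr part is guarded so the sketch still runs if dplyr is not installed.

```r
# stats::filter() applies a moving average to a series of numbers;
# it does not pick rows out of a data frame
smoothed <- stats::filter(1:5, rep(1/3, 3))
as.numeric(smoothed)  # the first and last values are NA

# dplyr::filter() keeps the rows of a data frame that match a condition
if (requireNamespace("dplyr", quietly = TRUE)) {
  toy <- data.frame(id = 1:5, score = c(2, 9, 4, 7, 1))  # made-up data
  high <- dplyr::filter(toy, score > 5)
  print(high)
}
```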
Getting data into RStudio
- Open script > run the library() lines to load the packages
- Create a new object in the environment using the assignment symbol <-
o Assigning what we read into the new object
- read_csv (from the readr package) makes fewer strange assumptions about the data than base R's read.csv
o read_csv(here("specify where the data lives in data folder"))
run
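The reading step above might look roughly like this. To keep the sketch self-contained it writes a tiny made-up CSV to a temporary file instead of using a real data folder; toy_data, the column names and the commented here() path are all hypothetical. It falls back to base read.csv if readr is not installed.

```r
# Made-up data written to a temporary file so the sketch runs anywhere
csv_path <- tempfile(fileext = ".csv")
writeLines(c("site,temperature", "A,21.5", "B,19.0", "C,23.2"), csv_path)

# In a real project the path would usually come from here(), e.g.
# here("data", "nameoffile.csv")   # hypothetical folder and file name

if (requireNamespace("readr", quietly = TRUE)) {
  toy_data <- readr::read_csv(csv_path)  # assign what we read to a new object
} else {
  toy_data <- read.csv(csv_path)         # base R fallback
}

toy_data
```

After running this, toy_data shows up in the Environment section.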
View function : View(variable name) – opens a spreadsheet of the data; run it or click on the object (e.g. beaches) in the environment section
Dimension function : dim(variable name) – how big the data file is
Structure: str(variable) – kind of data in each column
Glimpse: glimpse(variable) – same as str but nicer
head(variable) – looking at the top of the data
tail(variable) – bottom of the data
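summary(variable) – gives summary statistics for each variable (min, quartiles, mean, max)
skim(variable) – from the skimr package; gives summary statistics plus a small histogram for each numeric variable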
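The inspection functions above can be tried on any data frame. A minimal sketch using a made-up data frame (toy_data and its columns are not from the course data):

```r
# Made-up data frame standing in for imported data
toy_data <- data.frame(
  site = c("A", "B", "C", "D"),
  temperature = c(21.5, 19.0, 23.2, 20.1)
)

dim(toy_data)       # number of rows, then number of columns
str(toy_data)       # kind of data in each column
head(toy_data, 2)   # first 2 rows
tail(toy_data, 2)   # last 2 rows
summary(toy_data)   # min, quartiles, mean, max for numeric columns

# glimpse() comes from dplyr and skim() from skimr, so they need
# those packages installed and loaded first:
# dplyr::glimpse(toy_data)
# skimr::skim(toy_data)
```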
https://rladiessydney.org/courses/ryouwithme/01-basicbasics-3/
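filter = keeps particular rows (reducing the number of observations)
select = keeps particular columns/variables
%>% = pipe; passes the result on the left into the function on the right
na.rm = remove missing values (NA) before calculating, e.g. mean(x, na.rm = TRUE)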
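A short sketch putting these pieces together on made-up data. It assumes the ".rm" in the notes refers to the na.rm argument that functions like mean() accept. The dplyr part is guarded so the sketch still runs if dplyr is not installed; all object and column names are hypothetical.

```r
# Made-up data with one missing temperature
toy_data <- data.frame(
  site = c("A", "B", "C", "D"),
  temperature = c(21.5, NA, 23.2, 20.1),
  notes = c("ok", "sensor fault", "ok", "ok")
)

# na.rm = TRUE drops the NA before the mean is calculated
mean_temp <- mean(toy_data$temperature, na.rm = TRUE)
mean_temp

if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  warm <- toy_data %>%
    filter(temperature > 21) %>%   # keep rows warmer than 21
    select(site, temperature)      # keep only these two columns
  print(warm)
}
```

Reading the pipe as "and then" helps: take toy_data, and then filter, and then select.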
How to make a graph (this one is for a column graph)
http://jenrichmond.rbind.io/post/apa-figures/
library(tidytuesdayR)
install.pack
use_tidytemplate()
- getting the template
tt <- tt_load(2021, week=39)
nom <- tt[[1]]
glimpse()
- to look at the variables in the dataset
How to get rid of certain variables
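- select(-variable1, -variable2): a minus sign in front of a variable name drops that variable
- starts_with(): picks variables whose names start with the given text (like ctrl F)
clean_names from janitor turns variable names to all lower case with underscores for the gaps
unique(dataset$variable): lists the distinct values in a variable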
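A rough sketch of dropping variables and listing distinct values, on made-up data (all names are hypothetical). The dplyr part is guarded so the sketch still runs if dplyr is not installed, and clean_names() is left as a comment because it needs the janitor package.

```r
# Made-up data frame
toy_data <- data.frame(
  site = c("A", "B", "A", "C"),
  temp_min = c(15, 12, 16, 14),
  temp_max = c(25, 22, 27, 24),
  notes = c("ok", "ok", "check", "ok")
)

# Distinct values in one variable
sites <- unique(toy_data$site)
sites

if (requireNamespace("dplyr", quietly = TRUE)) {
  library(dplyr)
  no_notes <- toy_data %>% select(-notes)                    # drop one variable
  site_only <- toy_data %>% select(-starts_with("temp"))     # drop all temp_ variables
  print(names(no_notes))
  print(names(site_only))
}

# With janitor installed, this would tidy up the variable names:
# janitor::clean_names(toy_data)
```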