How run_analysis.R works

To run successfully, run_analysis.R must be in the same directory as the unzipped folder "UCI HAR Dataset".

The dplyr package is loaded first, because Step 3 and Step 5 use its functions.
Next, the subject IDs, activity labels, and measurements of the training set and the test set, as well as the activity names and feature names, are read from their separate files with read.table().
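
As a minimal sketch of the reading step, the snippet below reads a toy stand-in for features.txt with read.table(). The temporary file, the two sample rows, and the column names "index" and "name" are assumptions made for illustration; the real script reads the files inside "UCI HAR Dataset" and may name things differently.

```r
# Toy stand-in for features.txt: one "index name" pair per line.
# (The real file lives inside the "UCI HAR Dataset" folder.)
featuresFile <- tempfile(fileext = ".txt")
writeLines(c("1 tBodyAcc-mean()-X", "2 tBodyAcc-std()-X"), featuresFile)

# read.table() splits on whitespace by default; the column names are hypothetical.
features <- read.table(featuresFile,
                       col.names = c("index", "name"),
                       stringsAsFactors = FALSE)
print(features)
```

The subject, label, and measurement files can be read the same way, one read.table() call per file.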
In Step 1, the training set and the test set are concatenated using rbind(), and then the subject IDs, activity labels, and measurements are put in the same data frame "mergedData" using cbind().
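
The merge in Step 1 can be sketched with toy data as follows. All variable names except "mergedData" are hypothetical, and the values are made up; only the rbind()/cbind() pattern is taken from the description above.

```r
# Toy stand-ins for the pieces read from the training and test files.
trainSubject <- data.frame(ID = c(1, 3))
testSubject  <- data.frame(ID = 2)
trainLabel   <- data.frame(activity = c(5, 4))
testLabel    <- data.frame(activity = 1)
trainSet     <- data.frame(V1 = c(0.28, 0.27), V2 = c(-0.02, -0.01))
testSet      <- data.frame(V1 = 0.25, V2 = -0.03)

# Stack the training rows on top of the test rows for each piece,
# then bind the three pieces side by side into one data frame.
mergedData <- cbind(rbind(trainSubject, testSubject),
                    rbind(trainLabel, testLabel),
                    rbind(trainSet, testSet))
print(mergedData)
```

The training and test pieces must be stacked in the same order for all three parts, otherwise the rows no longer line up after cbind().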
In Step 2, grep() uses exact matching to find the column indices of the measurements whose names contain mean() or std(), as listed in features.txt; these columns are then selected from "mergedData" to get "extractedData".
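
One way to get exact matching from grep() is the argument fixed = TRUE, which treats the pattern as a literal string. Whether the script does it this way is an assumption; the sketch below only illustrates the idea on four sample feature names, and "meanCols", "stdCols", and "keep" are hypothetical names.

```r
# A few sample feature names in the style of features.txt.
featureNames <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X",
                  "tBodyAcc-max()-X", "fBodyAcc-meanFreq()-X")

# fixed = TRUE matches the literal text "mean()" / "std()",
# so "meanFreq()" is not picked up.
meanCols <- grep("mean()", featureNames, fixed = TRUE)
stdCols  <- grep("std()", featureNames, fixed = TRUE)

# Indices of the features to keep, in their original order.
keep <- sort(c(meanCols, stdCols))
print(featureNames[keep])
```

Without fixed = TRUE the pattern is treated as a regular expression, in which the parentheses are grouping syntax rather than literal characters, so names such as meanFreq() would likely match as well.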
In Step 3, "extractedData" is first divided by activity label into a list "activitiesData" using split(), and each element of the list is then passed by lapply() to a custom function, describe(). The function converts the numeric activity labels in the activity column into character activity names, based on the value correspondence in activity_labels.txt. Finally, the list is passed to unsplit() to get "describedData".
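
The split()/lapply()/unsplit() pattern of Step 3 can be sketched as follows. The body of describe() is an assumption (the actual function is not shown here), and "activityNames" with its columns "label" and "name" is a hypothetical stand-in for the contents of activity_labels.txt.

```r
# Hypothetical lookup table standing in for activity_labels.txt.
activityNames <- data.frame(label = 1:2,
                            name = c("WALKING", "WALKING_UPSTAIRS"),
                            stringsAsFactors = FALSE)

# Toy data: numeric activity labels, one made-up measurement column.
extractedData <- data.frame(ID = c(1, 1, 2),
                            activity = c(2, 1, 2),
                            value = c(0.1, 0.2, 0.3))

# Every row of a piece shares one label, so look it up once
# and overwrite the whole column with the matching name.
describe <- function(df) {
  df$activity <- activityNames$name[match(df$activity[1], activityNames$label)]
  df
}

# Split by label, rename each piece, then restore the original row order.
activitiesData <- split(extractedData, extractedData$activity)
describedData <- unsplit(lapply(activitiesData, describe),
                         extractedData$activity)
print(describedData)
```

unsplit() needs the same grouping vector that was given to split(), which is why extractedData$activity is passed a second time.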
In Step 4, the columns are named as defined in features.txt.
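
A minimal sketch of the renaming, assuming the feature names sit in a data frame "features" and the selected column indices in "keep" (both hypothetical names, with made-up values):

```r
# Hypothetical feature table and the indices of the columns kept in Step 2.
features <- data.frame(index = 1:3,
                       name = c("tBodyAcc-mean()-X", "tBodyAcc-mean()-Y",
                                "tBodyAcc-std()-X"),
                       stringsAsFactors = FALSE)
keep <- c(1, 3)

# Toy data frame still carrying the default V* names from read.table().
describedData <- data.frame(ID = 1, activity = "WALKING",
                            V1 = 0.28, V3 = -0.99)

# Assigning through names() keeps the feature names verbatim,
# including characters such as "(" and "-".
names(describedData) <- c("ID", "activity", features$name[keep])
print(names(describedData))
```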
In Step 5, the ID and activity columns of "describedData" are passed to group_by(), so that summarise_each() can calculate the mean of each variable for each activity and each subject; the result is "tidyData".
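
The grouping and averaging can be sketched on toy data as follows. Note that summarise_each() is deprecated in current dplyr releases, so this sketch swaps in summarise() with across(), which assumes dplyr 1.0.0 or later; it is not the call the script itself uses. The data values are made up.

```r
library(dplyr)

# Toy data: two subjects, two activities, one measurement column.
describedData <- data.frame(
  ID = c(1, 1, 2, 2),
  activity = c("WALKING", "WALKING", "WALKING", "SITTING"),
  `tBodyAcc-mean()-X` = c(0.2, 0.4, 0.3, 0.1),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# One row per (subject, activity) pair, holding the mean of every
# remaining column; everything() skips the grouping columns.
tidyData <- describedData %>%
  group_by(ID, activity) %>%
  summarise(across(everything(), mean), .groups = "drop")
print(tidyData)
```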
Finally, the tidy data is written to a text file.
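
A minimal sketch of the export, assuming write.table() is used. The output path is a temporary file chosen for illustration and row.names = FALSE is an assumption; the actual file name and arguments used by the script are not stated here.

```r
# Toy tidy data with made-up values.
tidyData <- data.frame(ID = c(1, 2),
                       activity = c("WALKING", "SITTING"),
                       `tBodyAcc-mean()-X` = c(0.3, 0.1),
                       check.names = FALSE,
                       stringsAsFactors = FALSE)

# Hypothetical output path; row.names = FALSE leaves out the row index column.
outFile <- tempfile(fileext = ".txt")
write.table(tidyData, outFile, row.names = FALSE)

# Reading the file back; check.names = FALSE keeps names such as
# "tBodyAcc-mean()-X" from being rewritten into syntactic R names.
roundTrip <- read.table(outFile, header = TRUE, check.names = FALSE,
                        stringsAsFactors = FALSE)
print(roundTrip)
```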
