The script 'run_analysis.R' performs the following operations on the 'UCI HAR Dataset' data set from the 'Getting and Cleaning Data' course project:
- Creates a directory 'merged' under the main data/working directory 'UCI HAR Dataset' if not already present
- Merges each file from the 'training' set with its counterpart from the 'test' set to create the corresponding 'merged' file
- Saves each 'merged' file under the 'merged' directory, mirroring the directory structure and file naming convention of the original files
- Loads the 'merged' files into a single data.table
- Extracts from the 'merged' data.table only the measurements on the mean and standard deviation for each measurement.
- Replaces the original activity codes in the 'merged' data.table with the corresponding descriptive activity names
- Labels the 'merged' data.table with descriptive variable names.
- Creates a second, tidy data.table with the average of each variable for each activity and each subject.
- Writes the result to a tab-delimited text file 'final_tidy.txt' in the working directory ('UCI HAR Dataset')
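The steps above can be sketched on a toy example. This is a minimal illustration only, not the code in run_analysis.R: the two small tables, the three feature names, the two activity labels and the 'mean()'/'std()' matching rule are assumptions made for the sketch, and the output goes to a temporary directory instead of the working directory.

```r
library(data.table)

# Toy stand-ins for the training and test sets (made-up values).
train <- data.table(subject = c(1L, 1L), activity = c(1L, 2L),
                    V1 = c(0.2, 0.4), V2 = c(0.1, 0.3), V3 = c(9, 9))
test  <- data.table(subject = c(2L, 2L), activity = c(1L, 1L),
                    V1 = c(0.6, 0.8), V2 = c(0.5, 0.7), V3 = c(9, 9))

# Merge: stack the rows of both sets into one data.table.
merged <- rbindlist(list(train, test))

# Label the measurement columns with descriptive names (illustrative names).
feature_names <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-energy()-X")
setnames(merged, c("V1", "V2", "V3"), feature_names)

# Extract only mean and standard deviation measurements.
# Assumption: a column qualifies if its name contains "mean()" or "std()".
keep <- grep("mean\\(\\)|std\\(\\)", feature_names, value = TRUE)
merged <- merged[, c("subject", "activity", keep), with = FALSE]

# Replace the activity codes with descriptive activity names.
activity_labels <- c("WALKING", "WALKING_UPSTAIRS")
merged[, activity := factor(activity, levels = 1:2, labels = activity_labels)]

# Tidy table: average of each variable for each subject and activity.
tidy <- merged[, lapply(.SD, mean), by = .(subject, activity)]

# Write a tab-delimited text file (to a temporary directory in this sketch).
out_file <- file.path(tempdir(), "final_tidy.txt")
fwrite(tidy, out_file, sep = "\t")
```

In this toy example 'tidy' has one row per subject/activity pair that occurs in the data (three rows), and the 'energy' column is dropped because its name matches neither 'mean()' nor 'std()'.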
USAGE:
source("run_analysis.R")
NOTE: the script assumes the following:
- the user has downloaded and extracted the zipped archive for the 'Getting and Cleaning Data' course project
- the working directory is the 'UCI HAR Dataset' directory
- the user has installed the packages 'data.table' and 'dplyr'
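These assumptions can be checked before sourcing the script. The helper below is a hypothetical sketch and is not part of run_analysis.R; the function name 'check_prerequisites' and its arguments are made up for illustration.

```r
# Hypothetical helper: reports whether the assumptions listed above hold.
check_prerequisites <- function(dir = getwd(),
                                pkgs = c("data.table", "dplyr")) {
  # TRUE for each package that is installed and loadable.
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  list(
    # The directory name should be 'UCI HAR Dataset'.
    dir_ok = basename(dir) == "UCI HAR Dataset",
    # Names of required packages that are not installed.
    missing_pkgs = pkgs[!installed]
  )
}

status <- check_prerequisites()
if (!status$dir_ok) {
  message("Working directory is not 'UCI HAR Dataset': ", getwd())
}
if (length(status$missing_pkgs) > 0) {
  message("Missing packages: ", paste(status$missing_pkgs, collapse = ", "))
}
```

Missing packages can then be installed with install.packages(), and the working directory changed with setwd().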