Usage: source("run_analysis.R")
There is only one script file: "run_analysis.R". Requirements: the dplyr package.
Output: a tidy data set produced according to the following instructions:
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive variable names.
- From the data set in step 4, creates a second, independent tidy data set with the average of each variable for each activity and each subject.
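
The last instruction can be illustrated with a minimal sketch on toy data. The data frame, its column names, and its values below are invented for illustration and are not taken from run_analysis.R; the sketch also assumes dplyr 1.0 or later for across().

```r
library(dplyr)

# Toy stand-in for the merged data set; names and values are illustrative only.
toy <- data.frame(
  subject       = c(1, 1, 1, 2),
  activity      = c("WALKING", "WALKING", "SITTING", "WALKING"),
  tBodyAccMeanX = c(0.2, 0.4, 0.1, 0.3),
  tBodyAccStdX  = c(-0.9, -0.7, -0.5, -0.8)
)

# One row per (subject, activity) pair; every measurement column is
# replaced by its average within that pair.
tidy <- toy %>%
  group_by(subject, activity) %>%
  summarise(across(everything(), mean), .groups = "drop")
```

Here tidy has one row for each subject and activity combination present in the toy data, which is the shape the second tidy data set is expected to have.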
The script performs the following actions:
- Checks whether the data files exist; if not, downloads them.
- Checks whether the unzipped folder exists; if not, unzips the file.
- Sets the decimal precision so that no information is lost.
- Sets a colClasses variable, which is used when reading the data files. This variable defines the data type of every column in the data files.
- Appends the correct subject and activity values to each measurement, according to the data files and the original Codebook.
- Merges the test data with the training data.
- Applies descriptive names to all columns in the data frame.
- Keeps only the columns that hold a mean or a standard deviation. meanFreq columns are not included because, according to the original Codebook, meanFreq is the "Weighted average of the frequency components to obtain a mean frequency", which is not what the project's instructions ask for.
- Applies descriptive names to all activities.
- Summarises every column by subject and by activity.
- Returns the summarised data.
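
The steps above can be sketched roughly as follows. This is not the code of run_analysis.R: ensure_data, read_set, the two feature names, the toy values, and the activity code-to-label map are all assumptions made for illustration, and the inline text stands in for the real data files.

```r
# Download and unzip only when needed. The caller supplies the URL, the
# archive name, and the folder the archive is expected to unpack into.
ensure_data <- function(url, zip_file, data_dir) {
  if (!file.exists(zip_file)) download.file(url, destfile = zip_file, mode = "wb")
  if (!dir.exists(data_dir)) unzip(zip_file)
  invisible(data_dir)
}

# Toy stand-ins for the raw files: two feature columns instead of the full set.
feature_names <- c("tBodyAcc-mean()-X", "fBodyAcc-meanFreq()-X")
read_set <- function(text) {
  # colClasses fixes the type of every column up front.
  read.table(text = text, colClasses = rep("numeric", length(feature_names)))
}
test_x  <- read_set("0.25 0.10\n0.30 0.20")
train_x <- read_set("0.35 0.30")

# Attach subject and activity codes, then stack test and training rows.
test   <- cbind(subject = c(1L, 1L), activity = c(1L, 2L), test_x)
train  <- cbind(subject = 2L, activity = 1L, train_x)
merged <- rbind(test, train)
names(merged)[-(1:2)] <- feature_names

# Keep "mean()" and "std()" columns. Requiring the literal "(" right after
# the keyword is what excludes "meanFreq()".
keep <- grepl("(mean|std)\\(\\)", names(merged))
keep[1:2] <- TRUE  # always keep subject and activity
selected <- merged[, keep]

# Replace activity codes with descriptive labels (illustrative map).
selected$activity <- factor(selected$activity, levels = 1:2,
                            labels = c("WALKING", "WALKING_UPSTAIRS"))
```

In this sketch ensure_data is only defined, not called, so nothing is downloaded. The resulting selected data frame would then be grouped by subject and activity and averaged to produce the summarised output.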