The script run_analysis.R:
- downloads the data from UCI Machine Learning Repository
- merges the training and test sets to create one data set
- extracts only the measurements (features) on the mean and standard deviation for each measurement
- replaces `activity` values in the dataset with descriptive activity names
- appropriately labels the columns with descriptive names
- creates a second, independent tidy dataset with the average of each variable for each activity and each subject.
The script is structured in functions such that each function performs one of the steps described above.
The script assumes that the `plyr` library is already installed.
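Since the script relies on `plyr` being present, a quick check before running it can save a confusing failure later. The following is a minimal sketch; the helper name `has_package` is hypothetical and not part of run_analysis.R.

```r
# Hypothetical helper: TRUE if the package can be loaded, FALSE otherwise.
has_package <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

if (!has_package("plyr")) {
  message("plyr is not installed; run install.packages(\"plyr\") first")
}
```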
##Feature selection
The selected features are derived from the "Human Activity Recognition Using Smartphones Dataset, Version 1.0". This dataset consists of data from experiments carried out with a group of 30 volunteers within an age bracket of 19-48 years. Each person performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) wearing a Samsung Galaxy S II on the waist. Using its embedded accelerometer and gyroscope, 3-axial linear acceleration and 3-axial angular velocity were captured at a constant rate of 50 Hz, yielding the raw 3-axial accelerometer and gyroscope signals tAcc-XYZ and tGyro-XYZ.
These signals were used to estimate the variables of the feature vector for each pattern ('-XYZ' denotes 3-axial signals in the X, Y and Z directions):
- tBodyAcc-XYZ
- tGravityAcc-XYZ
- tBodyAccJerk-XYZ
- tBodyGyro-XYZ
- tBodyGyroJerk-XYZ
- tBodyAccMag
- tGravityAccMag
- tBodyAccJerkMag
- tBodyGyroMag
- tBodyGyroJerkMag
- fBodyAcc-XYZ
- fBodyAccJerk-XYZ
- fBodyGyro-XYZ
- fBodyAccMag
- fBodyAccJerkMag
- fBodyGyroMag
- fBodyGyroJerkMag
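For each feature, the mean and standard deviation were computed. These measures are denoted by adding -mean() or -std() to the variable name (before the axis letter for 3-axial signals, e.g. tBodyAcc-mean()-X). Note that in the data itself the last three magnitude features appear with a doubled prefix: fBodyBodyAccJerkMag, fBodyBodyGyroMag and fBodyBodyGyroJerkMag.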
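The count of 66 mean and standard deviation features follows directly from the signal list: 8 three-axial signals give 24 variables, plus 9 magnitude signals, for 33 per statistic. The sketch below rebuilds the names from the list as written above (so without the doubled `fBodyBody` prefix); `feature_names` is an illustrative helper, not a function from the script.

```r
# Signals measured on three axes (each yields -X, -Y and -Z variables).
xyz_signals <- c("tBodyAcc", "tGravityAcc", "tBodyAccJerk", "tBodyGyro",
                 "tBodyGyroJerk", "fBodyAcc", "fBodyAccJerk", "fBodyGyro")

# Magnitude signals (one variable each).
mag_signals <- c("tBodyAccMag", "tGravityAccMag", "tBodyAccJerkMag",
                 "tBodyGyroMag", "tBodyGyroJerkMag", "fBodyAccMag",
                 "fBodyAccJerkMag", "fBodyGyroMag", "fBodyGyroJerkMag")

# Build the variable names for one statistic ("mean" or "std").
feature_names <- function(stat) {
  axes <- c("X", "Y", "Z")
  xyz <- paste0(rep(xyz_signals, each = 3), "-", stat, "()-", axes)
  mag <- paste0(mag_signals, "-", stat, "()")
  c(xyz, mag)
}

all_features <- c(feature_names("mean"), feature_names("std"))
length(all_features)  # 8 * 3 + 9 = 33 per statistic, 66 in total
```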
In a preliminary phase, the script checks whether the data is already available in the `data` directory; if it is not, the script downloads it from the UCI repository.
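The check-then-download logic can be sketched as follows. This is an illustration under assumptions, not the script's actual code: the function name `ensure_data`, the archive file name `dataset.zip`, and the idea that the download is a zip archive are all hypothetical, and the URL is left as a parameter rather than filled in.

```r
# Hypothetical helper: download and unpack the data only if data_dir is missing.
# `fetch` defaults to utils::download.file and can be swapped out for testing.
ensure_data <- function(data_dir, zip_url, fetch = utils::download.file) {
  if (dir.exists(data_dir)) {
    return(invisible(FALSE))  # data already present, nothing downloaded
  }
  dir.create(data_dir, recursive = TRUE)
  zip_path <- file.path(data_dir, "dataset.zip")  # assumed archive name
  fetch(zip_url, destfile = zip_path, mode = "wb")
  utils::unzip(zip_path, exdir = data_dir)
  invisible(TRUE)  # data was downloaded
}
```

Returning a logical makes it easy for the caller to log whether a download actually happened.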
As a first step, the script merges the training and test sets. The resulting data set has 10,299 instances, each containing 563 features (subject_id, activity_id, 561 measurements).
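The merge amounts to binding the subject, activity and measurement columns side by side for each set and then stacking the two sets. A minimal sketch on toy data (two measurement columns instead of 561; `make_set` and the column names `V1`, `V2` are illustrative only):

```r
# Combine subject ids, activity ids and measurements into one table.
make_set <- function(subject_id, activity_id, measurements) {
  data.frame(subject_id = subject_id, activity_id = activity_id, measurements)
}

train <- make_set(c(1, 1, 3), c(5, 4, 1),
                  data.frame(V1 = c(0.1, 0.2, 0.3), V2 = c(0.4, 0.5, 0.6)))
test <- make_set(2, 6, data.frame(V1 = 0.7, V2 = 0.8))

# Stack the training and test rows; both tables share the same columns.
merged <- rbind(train, test)
dim(merged)  # rows = train rows + test rows, columns = 2 ids + measurements
```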
As a second step, the mean and standard deviation measurements are extracted. The resulting subset has 68 features (subject_id, activity_id, 33 mean and 33 standard deviation features).
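One way to select these columns is a regular expression on the feature names. The sketch below assumes the selection matches the literal `-mean()` and `-std()` markers, which is consistent with the 33 + 33 count because names such as `-meanFreq()` do not match; `select_mean_std` is a hypothetical helper, and the script may implement the step differently.

```r
# Keep the id columns plus every column whose name contains -mean() or -std().
select_mean_std <- function(df, id_cols = c("subject_id", "activity_id")) {
  is_mean_std <- grepl("-(mean|std)\\(\\)", names(df))
  df[, names(df) %in% id_cols | is_mean_std, drop = FALSE]
}

# Toy table with a few representative column names.
cols <- c("subject_id", "activity_id", "tBodyAcc-mean()-X", "tBodyAcc-std()-X",
          "tBodyAcc-max()-X", "fBodyAcc-meanFreq()-X")
toy <- as.data.frame(matrix(0, nrow = 2, ncol = length(cols)))
names(toy) <- cols

subset_df <- select_mean_std(toy)
names(subset_df)  # the max() and meanFreq() columns are dropped
```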
Next, the activity labels are replaced with descriptive activity names, defined in `activity_labels.txt` in the original data folder.
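The replacement is a lookup from activity id to activity name. In the sketch below the lookup table is built inline for illustration; pairing ids 1 to 6 with the activities in the order they are listed in this document is an assumption, and the real mapping is whatever `activity_labels.txt` contains. `name_activities` is a hypothetical helper.

```r
# Illustrative lookup table (assumed id order; the real one comes from
# activity_labels.txt).
activity_labels <- data.frame(
  activity_id = 1:6,
  activity_name = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS",
                    "SITTING", "STANDING", "LAYING"),
  stringsAsFactors = FALSE
)

# Map each id to its name and return a factor (levels sorted alphabetically).
name_activities <- function(ids, labels) {
  factor(labels$activity_name[match(ids, labels$activity_id)])
}

named <- name_activities(c(5, 4, 6, 1), activity_labels)
as.character(named)
```

Because `factor()` sorts its levels, the result has alphabetically ordered levels, which matches the `"LAYING","SITTING",..` ordering visible in the `str()` output of the tidy data frame.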
The final step creates a tidy data set with the average of each variable for each activity and each subject. The 10,299 instances are split into 180 groups (30 subjects × 6 activities), and the 66 mean and standard deviation features are averaged within each group. The resulting data table has 180 rows and 68 columns (activity_name, subject_id, 66 features).
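The grouping and averaging can be illustrated on a toy table. This sketch uses base R's `aggregate()` so that it runs without extra packages; the script itself depends on `plyr`, so its actual implementation probably differs. `average_by_group` is a hypothetical helper.

```r
# Average every feature column within each (activity_name, subject_id) group.
average_by_group <- function(df, group_cols = c("activity_name", "subject_id")) {
  feature_cols <- setdiff(names(df), group_cols)
  aggregate(df[feature_cols], by = df[group_cols], FUN = mean)
}

toy <- data.frame(
  activity_name = c("WALKING", "WALKING", "SITTING", "WALKING"),
  subject_id = c(1, 1, 1, 2),
  stringsAsFactors = FALSE
)
toy[["tBodyAcc-mean()-X"]] <- c(0.2, 0.4, 0.3, 0.5)

tidy <- average_by_group(toy)
dim(tidy)  # one row per distinct (activity, subject) pair
```

On the full data the same operation yields 30 × 6 = 180 rows, one per subject and activity.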
This new dataset is stored in the file `tidyDataset.txt` in the `data` folder.
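Writing the table out and reading it back might look like the sketch below. The exact `write.table()` options used by the script are not stated here, so `row.names = FALSE` is an assumption, and `save_tidy` is a hypothetical helper. Reading with `check.names = FALSE` keeps names such as `tBodyAcc-mean()-X` intact.

```r
# Write a data frame as a text table without row names.
save_tidy <- function(tidy, path) {
  write.table(tidy, file = path, row.names = FALSE)
  invisible(path)
}

tidy_toy <- data.frame(
  activity_name = c("LAYING", "SITTING"),
  subject_id = c(1, 2),
  stringsAsFactors = FALSE
)
tidy_toy[["tBodyAcc-mean()-X"]] <- c(0.25, 0.5)

out_path <- tempfile(fileext = ".txt")
save_tidy(tidy_toy, out_path)

# Read the file back, preserving the original column names.
read_back <- read.table(out_path, header = TRUE, check.names = FALSE)
names(read_back)
```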
###List of variables in the data frame
```
str(tidyDF)
## 'data.frame': 180 obs. of 68 variables:
## $ activity_name : Factor w/ 6 levels "LAYING","SITTING",..: 1 1 1 1 1 1 1 1 1 1 ...
## $ subject_id : num 1 2 3 4 5 6 7 8 9 10 ...
## $ tBodyAcc-mean()-X : num 0.222 0.281 0.276 0.264 0.278 ...
## $ tBodyAcc-mean()-Y : num -0.0405 -0.0182 -0.019 -0.015 -0.0183 ...
## $ tBodyAcc-mean()-Z : num -0.113 -0.107 -0.101 -0.111 -0.108 ...
## $ tGravityAcc-mean()-X : num -0.249 -0.51 -0.242 -0.421 -0.483 ...
## $ tGravityAcc-mean()-Y : num 0.706 0.753 0.837 0.915 0.955 ...
## $ tGravityAcc-mean()-Z : num 0.446 0.647 0.489 0.342 0.264 ...
## $ tBodyAccJerk-mean()-X : num 0.0811 0.0826 0.077 0.0934 0.0848 ...
## $ tBodyAccJerk-mean()-Y : num 0.00384 0.01225 0.0138 0.00693 0.00747 ...
## $ tBodyAccJerk-mean()-Z : num 0.01083 -0.0018 -0.00436 -0.00641 -0.00304 ...
## $ tBodyGyro-mean()-X : num -0.01655 -0.01848 -0.02082 -0.00923 -0.02189 ...
## $ tBodyGyro-mean()-Y : num -0.0645 -0.1118 -0.0719 -0.093 -0.0799 ...
## $ tBodyGyro-mean()-Z : num 0.149 0.145 0.138 0.17 0.16 ...
## $ tBodyGyroJerk-mean()-X : num -0.107 -0.102 -0.1 -0.105 -0.102 ...
## $ tBodyGyroJerk-mean()-Y : num -0.0415 -0.0359 -0.039 -0.0381 -0.0404 ...
## $ tBodyGyroJerk-mean()-Z : num -0.0741 -0.0702 -0.0687 -0.0712 -0.0708 ...
## $ tBodyAccMag-mean() : num -0.842 -0.977 -0.973 -0.955 -0.967 ...
## $ tGravityAccMag-mean() : num -0.842 -0.977 -0.973 -0.955 -0.967 ...
## $ tBodyAccJerkMag-mean() : num -0.954 -0.988 -0.979 -0.97 -0.98 ...
## $ tBodyGyroMag-mean() : num -0.875 -0.95 -0.952 -0.93 -0.947 ...
## $ tBodyGyroJerkMag-mean() : num -0.963 -0.992 -0.987 -0.985 -0.986 ...
## $ fBodyAcc-mean()-X : num -0.939 -0.977 -0.981 -0.959 -0.969 ...
## $ fBodyAcc-mean()-Y : num -0.867 -0.98 -0.961 -0.939 -0.965 ...
## $ fBodyAcc-mean()-Z : num -0.883 -0.984 -0.968 -0.968 -0.977 ...
## $ fBodyAccJerk-mean()-X : num -0.957 -0.986 -0.981 -0.979 -0.983 ...
## $ fBodyAccJerk-mean()-Y : num -0.922 -0.983 -0.969 -0.944 -0.965 ...
## $ fBodyAccJerk-mean()-Z : num -0.948 -0.986 -0.979 -0.975 -0.983 ...
## $ fBodyGyro-mean()-X : num -0.85 -0.986 -0.97 -0.967 -0.976 ...
## $ fBodyGyro-mean()-Y : num -0.952 -0.983 -0.978 -0.972 -0.978 ...
## $ fBodyGyro-mean()-Z : num -0.909 -0.963 -0.962 -0.961 -0.963 ...
## $ fBodyAccMag-mean() : num -0.862 -0.975 -0.966 -0.939 -0.962 ...
## $ fBodyBodyAccJerkMag-mean() : num -0.933 -0.985 -0.976 -0.962 -0.977 ...
## $ fBodyBodyGyroMag-mean() : num -0.862 -0.972 -0.965 -0.962 -0.968 ...
## $ fBodyBodyGyroJerkMag-mean(): num -0.942 -0.99 -0.984 -0.984 -0.985 ...
## $ tBodyAcc-std()-X : num -0.928 -0.974 -0.983 -0.954 -0.966 ...
## $ tBodyAcc-std()-Y : num -0.837 -0.98 -0.962 -0.942 -0.969 ...
## $ tBodyAcc-std()-Z : num -0.826 -0.984 -0.964 -0.963 -0.969 ...
## $ tGravityAcc-std()-X : num -0.897 -0.959 -0.983 -0.921 -0.946 ...
## $ tGravityAcc-std()-Y : num -0.908 -0.988 -0.981 -0.97 -0.986 ...
## $ tGravityAcc-std()-Z : num -0.852 -0.984 -0.965 -0.976 -0.977 ...
## $ tBodyAccJerk-std()-X : num -0.958 -0.986 -0.981 -0.978 -0.983 ...
## $ tBodyAccJerk-std()-Y : num -0.924 -0.983 -0.969 -0.942 -0.965 ...
## $ tBodyAccJerk-std()-Z : num -0.955 -0.988 -0.982 -0.979 -0.985 ...
## $ tBodyGyro-std()-X : num -0.874 -0.988 -0.975 -0.973 -0.979 ...
## $ tBodyGyro-std()-Y : num -0.951 -0.982 -0.977 -0.961 -0.977 ...
## $ tBodyGyro-std()-Z : num -0.908 -0.96 -0.964 -0.962 -0.961 ...
## $ tBodyGyroJerk-std()-X : num -0.919 -0.993 -0.98 -0.975 -0.983 ...
## $ tBodyGyroJerk-std()-Y : num -0.968 -0.99 -0.987 -0.987 -0.984 ...
## $ tBodyGyroJerk-std()-Z : num -0.958 -0.988 -0.983 -0.984 -0.99 ...
## $ tBodyAccMag-std() : num -0.795 -0.973 -0.964 -0.931 -0.959 ...
## $ tGravityAccMag-std() : num -0.795 -0.973 -0.964 -0.931 -0.959 ...
## $ tBodyAccJerkMag-std() : num -0.928 -0.986 -0.976 -0.961 -0.977 ...
## $ tBodyGyroMag-std() : num -0.819 -0.961 -0.954 -0.947 -0.958 ...
## $ tBodyGyroJerkMag-std() : num -0.936 -0.99 -0.983 -0.983 -0.984 ...
## $ fBodyAcc-std()-X : num -0.924 -0.973 -0.984 -0.952 -0.965 ...
## $ fBodyAcc-std()-Y : num -0.834 -0.981 -0.964 -0.946 -0.973 ...
## $ fBodyAcc-std()-Z : num -0.813 -0.985 -0.963 -0.962 -0.966 ...
## $ fBodyAccJerk-std()-X : num -0.964 -0.987 -0.983 -0.98 -0.986 ...
## $ fBodyAccJerk-std()-Y : num -0.932 -0.985 -0.971 -0.944 -0.966 ...
## $ fBodyAccJerk-std()-Z : num -0.961 -0.989 -0.984 -0.98 -0.986 ...
## $ fBodyGyro-std()-X : num -0.882 -0.989 -0.976 -0.975 -0.981 ...
## $ fBodyGyro-std()-Y : num -0.951 -0.982 -0.977 -0.956 -0.977 ...
## $ fBodyGyro-std()-Z : num -0.917 -0.963 -0.967 -0.966 -0.963 ...
## $ fBodyAccMag-std() : num -0.798 -0.975 -0.968 -0.937 -0.963 ...
## $ fBodyBodyAccJerkMag-std() : num -0.922 -0.985 -0.975 -0.958 -0.976 ...
## $ fBodyBodyGyroMag-std() : num -0.824 -0.961 -0.955 -0.947 -0.959 ...
## $ fBodyBodyGyroJerkMag-std() : num -0.933 -0.989 -0.983 -0.983 -0.983 ...
```
###Summary of variables
```
summary(tidyDF)
## activity_name subject_id tBodyAcc-mean()-X tBodyAcc-mean()-Y tBodyAcc-mean()-Z tGravityAcc-mean()-X tGravityAcc-mean()-Y
## LAYING :30 Min. : 1.0 Min. :0.2216 Min. :-0.040514 Min. :-0.15251 Min. :-0.6800 Min. :-0.47989
## SITTING :30 1st Qu.: 8.0 1st Qu.:0.2712 1st Qu.:-0.020022 1st Qu.:-0.11207 1st Qu.: 0.8376 1st Qu.:-0.23319
## STANDING :30 Median :15.5 Median :0.2770 Median :-0.017262 Median :-0.10819 Median : 0.9208 Median :-0.12782
## WALKING :30 Mean :15.5 Mean :0.2743 Mean :-0.017876 Mean :-0.10916 Mean : 0.6975 Mean :-0.01621
## WALKING_DOWNSTAIRS:30 3rd Qu.:23.0 3rd Qu.:0.2800 3rd Qu.:-0.014936 3rd Qu.:-0.10443 3rd Qu.: 0.9425 3rd Qu.: 0.08773
## WALKING_UPSTAIRS :30 Max. :30.0 Max. :0.3015 Max. :-0.001308 Max. :-0.07538 Max. : 0.9745 Max. : 0.95659
## tGravityAcc-mean()-Z tBodyAccJerk-mean()-X tBodyAccJerk-mean()-Y tBodyAccJerk-mean()-Z tBodyGyro-mean()-X tBodyGyro-mean()-Y tBodyGyro-mean()-Z
## Min. :-0.49509 Min. :0.04269 Min. :-0.0386872 Min. :-0.067458 Min. :-0.20578 Min. :-0.20421 Min. :-0.07245
## 1st Qu.:-0.11726 1st Qu.:0.07396 1st Qu.: 0.0004664 1st Qu.:-0.010601 1st Qu.:-0.04712 1st Qu.:-0.08955 1st Qu.: 0.07475
## Median : 0.02384 Median :0.07640 Median : 0.0094698 Median :-0.003861 Median :-0.02871 Median :-0.07318 Median : 0.08512
## Mean : 0.07413 Mean :0.07947 Mean : 0.0075652 Mean :-0.004953 Mean :-0.03244 Mean :-0.07426 Mean : 0.08744
## 3rd Qu.: 0.14946 3rd Qu.:0.08330 3rd Qu.: 0.0134008 3rd Qu.: 0.001958 3rd Qu.:-0.01676 3rd Qu.:-0.06113 3rd Qu.: 0.10177
## Max. : 0.95787 Max. :0.13019 Max. : 0.0568186 Max. : 0.038053 Max. : 0.19270 Max. : 0.02747 Max. : 0.17910
## tBodyGyroJerk-mean()-X tBodyGyroJerk-mean()-Y tBodyGyroJerk-mean()-Z tBodyAccMag-mean() tGravityAccMag-mean() tBodyAccJerkMag-mean()
## Min. :-0.15721 Min. :-0.07681 Min. :-0.092500 Min. :-0.9865 Min. :-0.9865 Min. :-0.9928
## 1st Qu.:-0.10322 1st Qu.:-0.04552 1st Qu.:-0.061725 1st Qu.:-0.9573 1st Qu.:-0.9573 1st Qu.:-0.9807
## Median :-0.09868 Median :-0.04112 Median :-0.053430 Median :-0.4829 Median :-0.4829 Median :-0.8168
## Mean :-0.09606 Mean :-0.04269 Mean :-0.054802 Mean :-0.4973 Mean :-0.4973 Mean :-0.6079
## 3rd Qu.:-0.09110 3rd Qu.:-0.03842 3rd Qu.:-0.048985 3rd Qu.:-0.0919 3rd Qu.:-0.0919 3rd Qu.:-0.2456
## Max. :-0.02209 Max. :-0.01320 Max. :-0.006941 Max. : 0.6446 Max. : 0.6446 Max. : 0.4345
## tBodyGyroMag-mean() tBodyGyroJerkMag-mean() fBodyAcc-mean()-X fBodyAcc-mean()-Y fBodyAcc-mean()-Z fBodyAccJerk-mean()-X fBodyAccJerk-mean()-Y
## Min. :-0.9807 Min. :-0.99732 Min. :-0.9952 Min. :-0.98903 Min. :-0.9895 Min. :-0.9946 Min. :-0.9894
## 1st Qu.:-0.9461 1st Qu.:-0.98515 1st Qu.:-0.9787 1st Qu.:-0.95361 1st Qu.:-0.9619 1st Qu.:-0.9828 1st Qu.:-0.9725
## Median :-0.6551 Median :-0.86479 Median :-0.7691 Median :-0.59498 Median :-0.7236 Median :-0.8126 Median :-0.7817
## Mean :-0.5652 Mean :-0.73637 Mean :-0.5758 Mean :-0.48873 Mean :-0.6297 Mean :-0.6139 Mean :-0.5882
## 3rd Qu.:-0.2159 3rd Qu.:-0.51186 3rd Qu.:-0.2174 3rd Qu.:-0.06341 3rd Qu.:-0.3183 3rd Qu.:-0.2820 3rd Qu.:-0.1963
## Max. : 0.4180 Max. : 0.08758 Max. : 0.5370 Max. : 0.52419 Max. : 0.2807 Max. : 0.4743 Max. : 0.2767
## fBodyAccJerk-mean()-Z fBodyGyro-mean()-X fBodyGyro-mean()-Y fBodyGyro-mean()-Z fBodyAccMag-mean() fBodyBodyAccJerkMag-mean() fBodyBodyGyroMag-mean()
## Min. :-0.9920 Min. :-0.9931 Min. :-0.9940 Min. :-0.9860 Min. :-0.9868 Min. :-0.9940 Min. :-0.9865
## 1st Qu.:-0.9796 1st Qu.:-0.9697 1st Qu.:-0.9700 1st Qu.:-0.9624 1st Qu.:-0.9560 1st Qu.:-0.9770 1st Qu.:-0.9616
## Median :-0.8707 Median :-0.7300 Median :-0.8141 Median :-0.7909 Median :-0.6703 Median :-0.7940 Median :-0.7657
## Mean :-0.7144 Mean :-0.6367 Mean :-0.6767 Mean :-0.6044 Mean :-0.5365 Mean :-0.5756 Mean :-0.6671
## 3rd Qu.:-0.4697 3rd Qu.:-0.3387 3rd Qu.:-0.4458 3rd Qu.:-0.2635 3rd Qu.:-0.1622 3rd Qu.:-0.1872 3rd Qu.:-0.4087
## Max. : 0.1578 Max. : 0.4750 Max. : 0.3288 Max. : 0.4924 Max. : 0.5866 Max. : 0.5384 Max. : 0.2040
## fBodyBodyGyroJerkMag-mean() tBodyAcc-std()-X tBodyAcc-std()-Y tBodyAcc-std()-Z tGravityAcc-std()-X tGravityAcc-std()-Y tGravityAcc-std()-Z
## Min. :-0.9976 Min. :-0.9961 Min. :-0.99024 Min. :-0.9877 Min. :-0.9968 Min. :-0.9942 Min. :-0.9910
## 1st Qu.:-0.9813 1st Qu.:-0.9799 1st Qu.:-0.94205 1st Qu.:-0.9498 1st Qu.:-0.9825 1st Qu.:-0.9711 1st Qu.:-0.9605
## Median :-0.8779 Median :-0.7526 Median :-0.50897 Median :-0.6518 Median :-0.9695 Median :-0.9590 Median :-0.9450
## Mean :-0.7564 Mean :-0.5577 Mean :-0.46046 Mean :-0.5756 Mean :-0.9638 Mean :-0.9524 Mean :-0.9364
## 3rd Qu.:-0.5831 3rd Qu.:-0.1984 3rd Qu.:-0.03077 3rd Qu.:-0.2306 3rd Qu.:-0.9509 3rd Qu.:-0.9370 3rd Qu.:-0.9180
## Max. : 0.1466 Max. : 0.6269 Max. : 0.61694 Max. : 0.6090 Max. :-0.8296 Max. :-0.6436 Max. :-0.6102
## tBodyAccJerk-std()-X tBodyAccJerk-std()-Y tBodyAccJerk-std()-Z tBodyGyro-std()-X tBodyGyro-std()-Y tBodyGyro-std()-Z tBodyGyroJerk-std()-X
## Min. :-0.9946 Min. :-0.9895 Min. :-0.99329 Min. :-0.9943 Min. :-0.9942 Min. :-0.9855 Min. :-0.9965
## 1st Qu.:-0.9832 1st Qu.:-0.9724 1st Qu.:-0.98266 1st Qu.:-0.9735 1st Qu.:-0.9629 1st Qu.:-0.9609 1st Qu.:-0.9800
## Median :-0.8104 Median :-0.7756 Median :-0.88366 Median :-0.7890 Median :-0.8017 Median :-0.8010 Median :-0.8396
## Mean :-0.5949 Mean :-0.5654 Mean :-0.73596 Mean :-0.6916 Mean :-0.6533 Mean :-0.6164 Mean :-0.7036
## 3rd Qu.:-0.2233 3rd Qu.:-0.1483 3rd Qu.:-0.51212 3rd Qu.:-0.4414 3rd Qu.:-0.4196 3rd Qu.:-0.3106 3rd Qu.:-0.4629
## Max. : 0.5443 Max. : 0.3553 Max. : 0.03102 Max. : 0.2677 Max. : 0.4765 Max. : 0.5649 Max. : 0.1791
## tBodyGyroJerk-std()-Y tBodyGyroJerk-std()-Z tBodyAccMag-std() tGravityAccMag-std() tBodyAccJerkMag-std() tBodyGyroMag-std() tBodyGyroJerkMag-std()
## Min. :-0.9971 Min. :-0.9954 Min. :-0.9865 Min. :-0.9865 Min. :-0.9946 Min. :-0.9814 Min. :-0.9977
## 1st Qu.:-0.9832 1st Qu.:-0.9848 1st Qu.:-0.9430 1st Qu.:-0.9430 1st Qu.:-0.9765 1st Qu.:-0.9476 1st Qu.:-0.9805
## Median :-0.8942 Median :-0.8610 Median :-0.6074 Median :-0.6074 Median :-0.8014 Median :-0.7420 Median :-0.8809
## Mean :-0.7636 Mean :-0.7096 Mean :-0.5439 Mean :-0.5439 Mean :-0.5842 Mean :-0.6304 Mean :-0.7550
## 3rd Qu.:-0.5861 3rd Qu.:-0.4741 3rd Qu.:-0.2090 3rd Qu.:-0.2090 3rd Qu.:-0.2173 3rd Qu.:-0.3602 3rd Qu.:-0.5767
## Max. : 0.2959 Max. : 0.1932 Max. : 0.4284 Max. : 0.4284 Max. : 0.4506 Max. : 0.3000 Max. : 0.2502
## fBodyAcc-std()-X fBodyAcc-std()-Y fBodyAcc-std()-Z fBodyAccJerk-std()-X fBodyAccJerk-std()-Y fBodyAccJerk-std()-Z fBodyGyro-std()-X
## Min. :-0.9966 Min. :-0.99068 Min. :-0.9872 Min. :-0.9951 Min. :-0.9905 Min. :-0.993108 Min. :-0.9947
## 1st Qu.:-0.9820 1st Qu.:-0.94042 1st Qu.:-0.9459 1st Qu.:-0.9847 1st Qu.:-0.9737 1st Qu.:-0.983747 1st Qu.:-0.9750
## Median :-0.7470 Median :-0.51338 Median :-0.6441 Median :-0.8254 Median :-0.7852 Median :-0.895121 Median :-0.8086
## Mean :-0.5522 Mean :-0.48148 Mean :-0.5824 Mean :-0.6121 Mean :-0.5707 Mean :-0.756489 Mean :-0.7110
## 3rd Qu.:-0.1966 3rd Qu.:-0.07913 3rd Qu.:-0.2655 3rd Qu.:-0.2475 3rd Qu.:-0.1685 3rd Qu.:-0.543787 3rd Qu.:-0.4813
## Max. : 0.6585 Max. : 0.56019 Max. : 0.6871 Max. : 0.4768 Max. : 0.3498 Max. :-0.006236 Max. : 0.1966
## fBodyGyro-std()-Y fBodyGyro-std()-Z fBodyAccMag-std() fBodyBodyAccJerkMag-std() fBodyBodyGyroMag-std() fBodyBodyGyroJerkMag-std()
## Min. :-0.9944 Min. :-0.9867 Min. :-0.9876 Min. :-0.9944 Min. :-0.9815 Min. :-0.9976
## 1st Qu.:-0.9602 1st Qu.:-0.9643 1st Qu.:-0.9452 1st Qu.:-0.9752 1st Qu.:-0.9488 1st Qu.:-0.9802
## Median :-0.7964 Median :-0.8224 Median :-0.6513 Median :-0.8126 Median :-0.7727 Median :-0.8941
## Mean :-0.6454 Mean :-0.6577 Mean :-0.6210 Mean :-0.5992 Mean :-0.6723 Mean :-0.7715
## 3rd Qu.:-0.4154 3rd Qu.:-0.3916 3rd Qu.:-0.3654 3rd Qu.:-0.2668 3rd Qu.:-0.4277 3rd Qu.:-0.6081
## Max. : 0.6462 Max. : 0.5225 Max. : 0.1787 Max. : 0.3163 Max. : 0.2367 Max. : 0.2878
```