Getting-and-cleaning-data-assignment


This is a short description of the data processing implemented in the R script 'run_analysis.R'.

The training-phase data files 'X_train.txt', 'y_train.txt' and 'subject_train.txt' are combined into a single data frame called "train_data". A new variable called "Phase" is added and set to "training", so that the origin of each row is preserved when the training and test data sets are later merged.
The test-phase data files 'X_test.txt', 'y_test.txt' and 'subject_test.txt' are combined into a single data frame called "test_data". As above, the "Phase" variable is added, here set to "test".
The two data frames "train_data" and "test_data" are then merged into a new data frame called "all_data".
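
The combination and merge steps above can be sketched as follows. This is only an illustration under stated assumptions: small in-memory data frames stand in for the real files (which the script presumably reads with something like `read.table()`), and the column order and the use of `cbind()`/`rbind()` are guesses, not necessarily what 'run_analysis.R' does.

```r
# Toy stand-ins for 'X_train.txt', 'y_train.txt' and 'subject_train.txt'
# (assumption: the real script reads these from disk).
x_train       <- data.frame(V1 = c(0.1, 0.2), V2 = c(0.3, 0.4))
y_train       <- data.frame(Activity = c(1L, 2L))
subject_train <- data.frame(Subject = c(1L, 3L))

# Toy stand-ins for the test-phase files.
x_test       <- data.frame(V1 = 0.5, V2 = 0.6)
y_test       <- data.frame(Activity = 3L)
subject_test <- data.frame(Subject = 2L)

# Combine each phase column-wise and tag it with its "Phase" value.
# The measures are kept as the rightmost columns.
train_data <- cbind(Phase = "training", subject_train, y_train, x_train)
test_data  <- cbind(Phase = "test",     subject_test,  y_test,  x_test)

# Stack the two phases row-wise into one data frame.
all_data <- rbind(train_data, test_data)
```

Because "Phase" is stored on every row, the training/test origin survives the row-wise merge.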
The column names for the measures from the original files 'X_train.txt' and 'X_test.txt' are taken from the file 'features.txt', assuming that the variable order in that file matches the column order of 'X_train.txt'. These measures are now the 561 rightmost columns of the "all_data" data frame.
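
A minimal sketch of this renaming, assuming 'features.txt' has been loaded as a two-column table (index and name). The `features` object, its column names, and the three-measure toy `all_data` are hypothetical; the real data set has 561 measures.

```r
# Hypothetical in-memory version of 'features.txt' (index, name).
features <- data.frame(
  index = 1:3,
  name  = c("tBodyAcc-mean()-X", "tBodyAcc-mean()-Y", "fBodyAcc-mean()-X"),
  stringsAsFactors = FALSE
)

# Toy "all_data": three identifier columns followed by the measures.
all_data <- data.frame(Phase = "training", Subject = 1L, Activity = 1L,
                       V1 = 0.1, V2 = 0.2, V3 = 0.3)

# The measures are the rightmost columns, one per row of 'features'.
n_meas    <- nrow(features)
meas_cols <- (ncol(all_data) - n_meas + 1):ncol(all_data)

# Assign the feature names positionally; this relies on the assumption
# that 'features.txt' is ordered like the columns of 'X_train.txt'.
names(all_data)[meas_cols] <- features$name
```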
To obtain a tidy data set, only one type of measure is kept: the time-domain acceleration measures. The selection is computed in the vector "col_sel", and the selected column names are stored in the character vector "colnames". From these, the vector "col_ind" is computed as the column index used to select the desired columns from "all_data".
The columns selected through "col_sel" are stored in the new data frame "tidy_data", and its variable names are assigned from the vector "colnames".
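
One way such a selection could look is sketched below. The variable names "col_sel", "colnames" and "col_ind" follow the description above, but the regular expression, the identifier-column handling, and the toy data are assumptions; the actual pattern used in 'run_analysis.R' may differ.

```r
# Toy "all_data" with already-named measure columns.
all_data <- data.frame(
  Phase    = c("training", "test"),
  Subject  = c(1L, 2L),
  Activity = c(1L, 2L),
  "tBodyAcc-mean()-X"  = c(0.1, 0.2),
  "tBodyAcc-std()-X"   = c(0.3, 0.4),
  "tBodyGyro-mean()-X" = c(0.5, 0.6),
  "fBodyAcc-mean()-X"  = c(0.7, 0.8),
  check.names = FALSE
)

id_cols       <- 1:3                        # Phase, Subject, Activity
feature_names <- names(all_data)[-id_cols]

# Logical selector: time-domain ("t" prefix) body-acceleration measures.
# The pattern is a hypothetical choice for illustration.
col_sel  <- grepl("^tBodyAcc-", feature_names)
colnames <- feature_names[col_sel]          # selected measure names

# Column positions in "all_data": identifiers plus the selected measures.
col_ind <- c(id_cols, length(id_cols) + which(col_sel))

tidy_data <- all_data[, col_ind]
names(tidy_data) <- c(names(all_data)[id_cols], colnames)
```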
Finally, the values in the Activity column of "tidy_data" are replaced with the corresponding descriptive labels from the file 'activity_labels.txt'.
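
The label replacement can be sketched with a lookup via `match()`. The `activity_labels` data frame and its column names `code` and `label` are hypothetical stand-ins for the contents of 'activity_labels.txt'; only three of the labels are shown.

```r
# Hypothetical in-memory version of 'activity_labels.txt' (code, label).
activity_labels <- data.frame(
  code  = 1:3,
  label = c("WALKING", "WALKING_UPSTAIRS", "WALKING_DOWNSTAIRS"),
  stringsAsFactors = FALSE
)

# Toy "tidy_data" with numeric activity codes.
tidy_data <- data.frame(Subject = c(1L, 1L, 2L), Activity = c(2L, 1L, 3L))

# Replace each code with its label by looking it up in the label table.
tidy_data$Activity <- activity_labels$label[
  match(tidy_data$Activity, activity_labels$code)
]
```

Using `match()` keeps the row order of "tidy_data" unchanged, which a `merge()` would not guarantee.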
Starting from "tidy_data", the measure columns are summarized in a new data frame "sum_data" that holds their average values for each combination of Activity and Subject.
The resulting "sum_data" data frame is saved in the attached file 'avg_dataset.txt'.
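
The summary and export steps might look like the sketch below, which uses base R `aggregate()`. This is an assumption: the script may use a different grouping function, and dropping the "Phase" column before averaging, the toy data, and the temporary output path are illustrative choices.

```r
# Toy "tidy_data" with descriptive activity labels and one measure.
tidy_data <- data.frame(
  Phase    = c("training", "training", "test", "test"),
  Subject  = c(1L, 1L, 2L, 2L),
  Activity = c("WALKING", "WALKING", "SITTING", "SITTING"),
  "tBodyAcc-mean()-X" = c(0.1, 0.3, 0.5, 0.7),
  check.names = FALSE
)

# Keep only the numeric measure columns for averaging.
id_names <- c("Phase", "Subject", "Activity")
meas     <- tidy_data[, !(names(tidy_data) %in% id_names), drop = FALSE]

# Average every measure within each Activity/Subject group.
sum_data <- aggregate(
  meas,
  by  = list(Activity = tidy_data$Activity, Subject = tidy_data$Subject),
  FUN = mean
)

# Write the result; a temporary directory is used here so the sketch
# does not overwrite the real 'avg_dataset.txt'.
out_file <- file.path(tempdir(), "avg_dataset.txt")
write.table(sum_data, out_file, row.names = FALSE)
```

The saved file can be read back with `read.table(out_file, header = TRUE, check.names = FALSE)` to keep the original measure names.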

Sandrobike/Getting-and-cleaning-data-assignment

Folders and files

Latest commit

History

Repository files navigation

Getting-and-cleaning-data-assignment

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages