Tidy Data
Column headers containing values, not variable names
In the pew dataset, the column headers are values (income ranges), not variable names.
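The fix for this layout is to "melt" the value-bearing columns into rows. The following is a minimal, dependency-free sketch of that idea; the sample rows, the counts, and the names `melt`, `income` and `freq` are assumptions made for illustration, not the actual pew data. With pandas, the same reshaping is usually done with `pandas.melt`.

```python
# Hypothetical wide-format rows shaped like the pew dataset:
# each income range is its own column and the cells hold counts.
# The counts are made up for illustration.
wide_rows = [
    {"religion": "Agnostic", "<$10k": 10, "$10-20k": 20, "$20-30k": 30},
    {"religion": "Atheist", "<$10k": 5, "$10-20k": 15, "$20-30k": 25},
]


def melt(rows, id_vars, var_name, value_name):
    """Turn every non-id column into its own (var_name, value_name) row."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column in id_vars:
                continue
            tidy_row = {key: row[key] for key in id_vars}
            tidy_row[var_name] = column    # the former column header
            tidy_row[value_name] = value   # the former cell value
            long_rows.append(tidy_row)
    return long_rows


tidy_pew = melt(wide_rows, id_vars=["religion"], var_name="income", value_name="freq")
for row in tidy_pew:
    print(row)
```

Each output row now describes one observation: a religion, an income range and a count.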
Columns containing multiple variables
In the tuberculosis (TB) dataset, each column header encodes multiple variables: sex and age.
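Tidying this layout takes two steps: melt the combined columns into rows, then split the former header into its parts. The sketch below assumes headers of the form `m014` (one letter for sex followed by an age-range code); the sample row, the `AGE_LABELS` mapping and the function name `tidy_tb` are hypothetical.

```python
# Hypothetical wide-format row: each header such as "m014" packs
# sex ("m") and an age-range code ("014") into one column name.
tb_rows = [
    {"country": "AD", "year": 2000, "m014": 0, "m1524": 2, "f014": 1},
]

ID_VARS = ("country", "year")

# Assumed mapping from age codes to readable ranges.
AGE_LABELS = {"014": "0-14", "1524": "15-24"}


def tidy_tb(rows):
    """Melt the sex/age columns, then split each header into sex and age."""
    tidy = []
    for row in rows:
        for column, cases in row.items():
            if column in ID_VARS:
                continue
            sex, age_code = column[0], column[1:]
            tidy.append({
                "country": row["country"],
                "year": row["year"],
                "sex": sex,
                "age": AGE_LABELS.get(age_code, age_code),
                "cases": cases,
            })
    return tidy


tidy_tb_rows = tidy_tb(tb_rows)
for row in tidy_tb_rows:
    print(row)
```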
Variables in both rows and columns
In the weather dataset, variables are stored in three ways: in individual columns (id, year, month), spread across columns (the day, in d1-d31) and spread across rows (tmin, tmax).
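One way to tidy this layout is to melt the day columns into rows, drop the missing readings, and then pivot the tmin/tmax rows into two columns. The sketch below assumes the measurement type lives in a column named `element`; that name, the station id `S1` and the readings are hypothetical.

```python
# Hypothetical rows: one row per station, month and measurement type,
# with one column per day (only d1-d3 shown; None marks a missing reading).
weather_rows = [
    {"id": "S1", "year": 2010, "month": 1, "element": "tmax",
     "d1": None, "d2": 27.5, "d3": 26.0},
    {"id": "S1", "year": 2010, "month": 1, "element": "tmin",
     "d1": None, "d2": 14.5, "d3": None},
]

KEY_COLUMNS = ("id", "year", "month", "element")


def tidy_weather(rows):
    """Melt day columns into rows, then pivot tmin/tmax into columns."""
    observations = {}
    for row in rows:
        for column, value in row.items():
            if column in KEY_COLUMNS or value is None:
                continue
            day = int(column[1:])  # "d2" -> 2
            key = (row["id"], row["year"], row["month"], day)
            # Collect tmin and tmax for the same day under one key.
            observations.setdefault(key, {})[row["element"]] = value
    return [
        {"id": station, "year": year, "month": month, "day": day,
         "tmin": measures.get("tmin"), "tmax": measures.get("tmax")}
        for (station, year, month, day), measures in sorted(observations.items())
    ]


tidy_weather_rows = tidy_weather(weather_rows)
for row in tidy_weather_rows:
    print(row)
```

Days with no reading at all produce no row, while a day with only one of the two measurements keeps `None` for the other.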
Multiple observational units in a table (normalization)
Each type of observational unit should be stored in its own table. The billboard dataset needs to be broken down into two datasets: a song dataset which stores artist, song name and time, and a ranking dataset which gives the rank of the song in each week.
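The split can be sketched as follows, assuming the weekly ranks have already been melted into one row per song and week. The field names, the generated `song_id` key and the sample rows are assumptions for illustration, not the actual billboard data.

```python
# Hypothetical long-format rows: song facts repeat on every weekly row.
billboard_rows = [
    {"artist": "Artist A", "song": "Song X", "time": "3:30", "week": 1, "rank": 87},
    {"artist": "Artist A", "song": "Song X", "time": "3:30", "week": 2, "rank": 82},
    {"artist": "Artist B", "song": "Song Y", "time": "4:05", "week": 1, "rank": 91},
]


def normalize(rows):
    """Split rows into a song table and a ranking table linked by song_id."""
    song_ids = {}
    songs, rankings = [], []
    for row in rows:
        key = (row["artist"], row["song"], row["time"])
        if key not in song_ids:
            # First time this song appears: assign an id and store it once.
            song_ids[key] = len(song_ids) + 1
            songs.append({"song_id": song_ids[key], "artist": row["artist"],
                          "song": row["song"], "time": row["time"]})
        rankings.append({"song_id": song_ids[key],
                         "week": row["week"], "rank": row["rank"]})
    return songs, rankings


songs, rankings = normalize(billboard_rows)
print(songs)
print(rankings)
```

Storing each song once means a correction to an artist name or track length is made in a single row, and the ranking table stays narrow.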