untidydata
An R package of untidy datasets for teaching the tidyverse.
Last update: 2021-01-27
Overview
This package keeps the untidy datasets I have created for teaching under version control. The datasets vary in difficulty and illustrate different problems that commonly arise when tidying data.
Installation
You can install the development version from GitHub with:
install.packages("devtools")
devtools::install_github("jvcasillas/untidydata")
Datasets
language_diversity
- Difficulty: easy
- A long-format dataset that is most useful in wide format.
- Data taken from Appendix 1 in:
Nettle, D. (1998). Explaining Global Patterns of Language Diversity. Journal of Anthropological Archaeology, 17, 354–374.
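The long-to-wide reshaping described above can be sketched with tidyr's pivot_wider(). This is only an illustration on a toy data frame: the column names (country, measurement, value) and every value in it are invented here and are not taken from language_diversity, so inspect the real dataset (for example with head() or str()) before adapting the call.

```r
# Toy stand-in for a long-format table.
# Column names and values are hypothetical, NOT from language_diversity.
library(tidyr)

long <- data.frame(
  country     = c("A", "A", "B", "B"),
  measurement = c("langs", "area", "langs", "area"),
  value       = c(10, 200, 25, 450)
)

# Spread the measurement labels into their own columns:
# one row per country, one column per measurement.
wide <- pivot_wider(long, names_from = measurement, values_from = value)
wide
```

With real data, names_from would point at whichever column holds the variable labels and values_from at the column holding the measurements.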
pre_post
- Difficulty: easy
- A typical pre-test/post-test dataset in wide format.
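A wide pre-test/post-test table is usually tidied by stacking the two score columns into one. The sketch below does this with tidyr's pivot_longer() on a toy data frame. The columns (id, group, pre, post) and the scores are made up for illustration and may not match the actual layout of pre_post.

```r
# Toy wide-format pre/post table.
# Column names and scores are hypothetical, NOT from pre_post.
library(tidyr)

wide <- data.frame(
  id    = c("p01", "p02", "p03"),
  group = c("treatment", "treatment", "control"),
  pre   = c(12, 15, 11),
  post  = c(18, 17, 12)
)

# Stack the two score columns: one row per participant per test.
long <- pivot_longer(
  wide,
  cols      = c(pre, post),
  names_to  = "test",
  values_to = "score"
)
long
```

The result has one observation per row, which is the shape most tidyverse plotting and modeling functions expect.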
spanish_vowels
- Difficulty: easy
- Simulated Spanish vowel formant measurements from male and female speakers.
spirantization
- Difficulty: easy
- Simulated intensity measurements of CV sequences in word-initial and word-medial position from L2 learners and native speakers.
vot
- Difficulty: medium
- A voice-onset time dataset. Includes coronal stop data from English and Spanish monolinguals, as well as English/Spanish bilinguals.