Aggregate analysis data — aggregatedata • fasttrackr

Aggregate the .csv files contained in the csvs folder in a Fast Track directory. These files contain formant measurements every 2 ms, with one file per vowel token. This function summarizes that data into a single dataframe with one row per token and measurements at a specified number of bins.
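
As a rough illustration of what summarizing into bins means, the sketch below splits one track of measurements into consecutive, roughly equal-sized bins and takes the median of each. `bin_track()` is a hypothetical helper written for this example, not a fasttrackr function, and the package's own binning may place bin boundaries differently.

```r
# Hypothetical helper (not part of fasttrackr): summarize one formant track
# into `bins` consecutive time bins using the median of each bin.
bin_track <- function(values, bins = 5) {
  # Assign each sample index to one of `bins` equal-width intervals
  bin_id <- cut(seq_along(values), breaks = bins, labels = FALSE)
  # Median of the samples in each bin, ignoring missing measurements
  as.numeric(tapply(values, bin_id, median, na.rm = TRUE))
}

# Made-up F1 track: 50 samples, i.e. 100 ms at one sample every 2 ms
f1 <- seq(500, 598, by = 2)
binned <- bin_track(f1, bins = 5)
binned
```

Each of the five values summarizes ten consecutive samples, so a token of any duration ends up with the same number of measurements.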

aggregatedata(
  path = NA,
  csvs = NA,
  bins = 5,
  f0_bins = 1,
  n_formants = NA,
  method = "median",
  encoding = "UTF-8",
  write = FALSE
)

Arguments

path	a string. The path to the working directory for the Fast Track project. If no path is provided, the current working directory for the current R session is used.
csvs	an object returned by `readcsvs()`.
bins	an integer (default = 5). How many timepoints do you want formant data from? By default, you'll get formant samples at five points along the duration of each vowel.
f0_bins	an integer or string (default = 1). By default, the F0 values across the entire vowel token are summarized into a single value. However, if you are interested in F0 contours, you can specify how many measurements to take. This can be independent of the number of formant measurements. The value `"same"` sets this value equal to the `bins` argument. A value of 0 results in no calculation of F0. See examples below.
n_formants	an integer. By default, `aggregatedata` uses the number of formants contained in `csvs` or in the .csv files. However, if you want to aggregate data from only F1, F2, and F3 even though you have data from F4, for example, you can do so by setting `n_formants` to `3`.
method	a string (default = `"median"`). Determines which summary function is used when aggregating data. Other functions to come later.
encoding	--.
write	--.
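
The three kinds of value that `f0_bins` accepts can be summarized in a short sketch. `resolve_f0_bins()` is a hypothetical helper invented for this illustration. It only mirrors the behavior described above and is not taken from the package source.

```r
# Hypothetical helper (not part of fasttrackr): turn the `f0_bins` argument
# into the number of F0 measurements per token, following the description above.
resolve_f0_bins <- function(f0_bins = 1, bins = 5) {
  # "same" means: use as many F0 bins as formant bins
  if (identical(f0_bins, "same")) {
    return(as.integer(bins))
  }
  n <- as.integer(f0_bins)
  # Anything else must be a non-negative whole number; 0 means "skip F0"
  stopifnot(!is.na(n), n >= 0)
  n
}

resolve_f0_bins()                     # default: one F0 value per token
resolve_f0_bins(2)                    # two F0 measurements per token
resolve_f0_bins("same", bins = 11)    # as many F0 bins as formant bins
resolve_f0_bins(0)                    # no F0 calculation
```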

Value

A dataframe containing formant measurements and various other information for each file (= vowel token). Formant columns are named by formant number followed by bin number, so the column called f12 is the F1 measurement in the second bin. If only one F0 measurement is returned, the column is named f0. Otherwise, the F0 columns follow the same convention (i.e. the F0 measurement for the third bin is called f03).
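
To make the naming convention concrete, the sketch below builds the measurement column names you would expect for a given number of bins. `expected_columns()` is a hypothetical helper for this example only. The order of the names is an assumption, and the real output also contains other columns not listed here.

```r
# Hypothetical helper (not part of fasttrackr): list the measurement column
# names implied by the naming convention described above.
expected_columns <- function(bins = 5, n_formants = 3, f0_bins = 1) {
  # Formant columns: "f" + formant number + bin number, e.g. f12 = F1, bin 2
  formant_cols <- paste0(
    "f",
    rep(seq_len(n_formants), each = bins),
    rep(seq_len(bins), times = n_formants)
  )
  # F0 columns: none if skipped, plain "f0" for a single value,
  # otherwise "f0" + bin number, e.g. f03 = F0, bin 3
  f0_cols <- if (f0_bins == 0) {
    character(0)
  } else if (f0_bins == 1) {
    "f0"
  } else {
    paste0("f0", seq_len(f0_bins))
  }
  c(f0_cols, formant_cols)
}

expected_columns(bins = 3, n_formants = 2)
expected_columns(bins = 3, n_formants = 2, f0_bins = 3)
```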

Details

Note that if a file called aggregated_data.csv exists in the processed_data directory, this function will read that in instead, but only if the csvs object is not specified. If the file exists, but you want to process the data yourself (say, into 11 bins instead of 5), then include the csvs object in the function call. See the examples below.
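
The rule above can be pictured as a small decision function. This is only a sketch of the described behavior: `aggregation_source()` is a hypothetical name, the check for a missing `csvs` argument is a guess at how the `NA` default is detected, and the final fallback (reading the csvs folder when neither a `csvs` object nor an existing file is available) is an assumption based on the function description.

```r
# Hypothetical helper (not part of fasttrackr): report where the aggregated
# data would come from, following the rule described above.
aggregation_source <- function(path, csvs = NA) {
  cached <- file.path(path, "processed_data", "aggregated_data.csv")
  # Treat a single atomic NA (the default) as "csvs not specified"
  csvs_given <- !(is.atomic(csvs) && length(csvs) == 1 && is.na(csvs))
  if (csvs_given) {
    "aggregate supplied csvs"
  } else if (file.exists(cached)) {
    "read existing aggregated_data.csv"
  } else {
    "read csvs folder and aggregate"   # assumed fallback
  }
}

# Try it on a throwaway directory
demo_dir <- tempfile("fasttrack_demo")
dir.create(file.path(demo_dir, "processed_data"), recursive = TRUE)
aggregation_source(demo_dir)                  # no existing file yet
file.create(file.path(demo_dir, "processed_data", "aggregated_data.csv"))
aggregation_source(demo_dir)                  # existing file is read
aggregation_source(demo_dir, csvs = list())   # csvs supplied, so reprocess
```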

Examples

if (FALSE) {
path <- "path/to/fasttrack/data"

# Read in the aggregated data
aggregatedata(path)

# Aggregate the csv files. 
# This generates a spreadsheet identical to the one produced by Praat.
csvs <- readcsvs(path)
aggregatedata(path, csvs)

# Reprocess existing csv data. Let's say that when I first analyzed the audio 
# in Praat using Fast Track, I only had the data binned into 5 timepoints. 
# Now, I want 11 timepoints, so I'll generate a new version of the aggregated data here.
aggregatedata(path, csvs, bins = 11)

# Only process the first three formants even though four are in the original csvs.
aggregatedata(path, csvs, n_formants = 3)

# Get two F0 measurements per vowel token
aggregatedata(path, csvs, f0_bins = 2)

# Get 11 measurements for all formants and F0.
aggregatedata(path, csvs, bins = 11, f0_bins = "same")
}