Extraction Details
To capture relevant information from each study, we extended its BibTeX entry with additional fields. For reproducibility, these instructions describe the process followed for each field.
Fields
IDENTIFIER
Unique identifier of article. Contains last name of lead author, year of publication and first two letters of article title. Hyphenated last names collapsed.
JOURNAL
Title of journal containing article.
NOTE
Includes number of citing articles and open access details. E.g., Cited by: 4; All Open Access, Gold Open Access, Green Open Access
TITLE
Title of article.
VOLUME
Volume number of publication.
YEAR
Publication year.
DOI
Digital object identifier of article.
ABSTRACT
Complete text of article abstract.
SOURCE
Database article was sourced from. Scopus, Web of Science (WoS) or OpenAlex.
STIMULUS_TYPE
Metadata pertaining to stimuli employed in paradigm. Can be listed as genres of music stimuli employed, or, if stimuli come from a standard database, name of that database.
STIMULUS_DURATION
Duration of stimuli, if applicable. Unit of measurement (seconds, measures) specified in STIMULUS_DURATION_UNIT.
STIMULUS_DURATION_UNIT
Unit of measurement pertaining to STIMULUS_DURATION. E.g., seconds, measures, etc.
STIMULUS_N
Number of stimuli employed in experiment. If multiple experimental conditions reported, report \(n\) separately for each condition where possible.
FEATURE_N
Number of features included in data modeling (if available).
PARTICIPANT_N
Total number of participants in experiment.
PARTICIPANT_EXPERTISE
Expertise of annotators. E.g., experts, non-experts, not specified.
PARTICIPANT_ORIGIN
Origin country of participants, or online platform participants were recruited from (e.g., MTurk).
PARTICIPANT_SAMPLING
How participants were recruited (e.g., convenience, random sampling, crowdsourcing).
PARTICIPANT_TASK
Nature of rating/classification task undertaken by participants. E.g., rate, annotate.
FEATURE_CATEGORIES
Names of categories to which analyzed features pertain, based on category names in Panda (2021). Includes names of all pertinent categories: Melody, Rhythm, Timbre, Pitch, Tonality, Expressivity, Texture, Form, Vocal, High-Level.
FEATURE_SOURCE
Name(s) of feature analysis toolbox(es).
FEATURE_REDUCTION_METHOD
Name(s) of feature reduction or feature selection methods employed.
MODEL_CATEGORY
Name of model type (regression, classification, or both).
MODEL_DETAIL
Additional information pertaining to predictive model, such as the name of algorithm used and other pertinent parameters. E.g., Random Forest, Commonality Analysis, Multiple Regression, Neural Networks, LDSM.
MODEL_MEASURE
Metric used in model evaluation. E.g., \(R^2\), \(MSE\), \(CCC\), Classification accuracy, etc.
MODEL_COMPLEXITY_PARAMETERS
Parameters governing complexity of predictive model. E.g., training epochs: 100; n layers: 1, 2; LSTM units: 124,248.
MODEL_RATE_EMOTION_NAMES
Names of predicted emotions. E.g., valence, arousal, happy, sad, angry, fearful, etc.
MODEL_RATE_EMOTION_VALUES
Pertinent summaries of model predictions. Report as R named arrays, including summary statistics in variables. When reporting results of multiple models, concatenate multiple entries with bind_field. When reporting results for different toolboxes or feature subsets, assign each to a new BibTeX field with relevant identifier following final underscore. See additional details below.
MODEL_VALIDATION
Validation method used (if applicable). E.g., 10-fold cross validation, leave one out cross validation.
Encoding study results
Regression
Report relevant summary statistics in the MODEL_RATE_EMOTION_VALUES field in R code using the following format. Parameters delimited by spaces are later assigned to separate columns in the resulting dataframe.
bind_field(
  library.model.features.data.experiment = c(dimension_measure.summaryStat = 0, ...),
  ...
)
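To illustrate how an entry in this naming scheme might later be unpacked into columns, the following base-R sketch splits a hypothetical entry name and its statistic names into their parts. The helper name parse_entry, the use of "." and "_" as separators (inferred from the format shown above), the column names, and all numeric values are assumptions made for illustration; the sketch does not reproduce bind_field.

```r
# Hypothetical entry following the naming scheme above. The numbers are
# placeholders, not results from any study.
entry_name <- "library.model.features.data.experiment"
stats <- c(valence_r2.mean = 0.5, arousal_r2.mean = 0.6)

# Illustrative helper (not part of the original tooling): split the entry
# name on "." and each statistic name on "_" or ".", one row per statistic.
parse_entry <- function(entry_name, stats) {
  meta <- strsplit(entry_name, ".", fixed = TRUE)[[1]]
  stopifnot(length(meta) == 5)
  parts <- do.call(rbind, strsplit(names(stats), "[_.]"))
  data.frame(
    library = meta[1], model = meta[2], features = meta[3],
    data = meta[4], experiment = meta[5],
    dimension = parts[, 1], measure = parts[, 2], statistic = parts[, 3],
    value = unname(stats),
    stringsAsFactors = FALSE
  )
}

parsed <- parse_entry(entry_name, stats)
print(parsed)
```

Each statistic becomes one row, so entries from several studies could be stacked with rbind under these assumptions.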
Classification
Confusion matrices can be encoded using the unflatten function, which assigns relevant meta-parameters to the model_parameters attribute of the output matrix. The function automatically builds an \(n\) by \(n\) matrix and takes individual values as arguments. Class names can be specified for the first row only, and are recycled for subsequent rows.
unflatten(A = .5, B = .2, C = .3,
              .3,     .5,     .2,
              .1,     .1,     .8)
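The reshaping step can be approximated in base R. The sketch below is a hypothetical stand-in named unflatten_sketch, not the actual unflatten implementation: it assumes the values are supplied row by row, that the names on the first n values are the class labels, and it omits the model_parameters attribute.

```r
# Hypothetical stand-in for unflatten (an assumption, not the original code):
# arrange a flat sequence of n * n values into an n by n matrix, row by row,
# reusing the names given to the first n values as row and column labels.
unflatten_sketch <- function(...) {
  values <- c(...)
  n <- sqrt(length(values))
  stopifnot(n == floor(n))            # the number of values must be a perfect square
  labels <- names(values)[seq_len(n)]
  matrix(values, nrow = n, byrow = TRUE, dimnames = list(labels, labels))
}

m <- unflatten_sketch(A = .5, B = .2, C = .3,
                          .3,     .5,     .2,
                          .1,     .1,     .8)
print(m)
```

Whether rows stand for predicted or reference classes is not fixed by this sketch; it only preserves the order in which the values were entered.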
Summary parameters can be extracted from the resulting matrix by calling the summarize_matrix function. This function calls the caret package to extract several summary statistics at once. Wrapping the call in bind_field ensures consistent nomenclature with other study encodings.
bind_field(
  'library.model.features.data.exp' = summarize_matrix(
    unflatten(Q1 = 185.85, Q2 = 14.4, Q3 = 8.6, Q4 = 18.15,
              23.95, 190.55, 7, 3.5,
              14.2, 8.4, 157.25, 45.15,
              24.35, 1.65, 45.85, 153.15)
  )
)
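For readers without the original tooling, the kind of statistics involved can be computed directly in base R. The sketch below is a hypothetical stand-in named summarize_sketch, not the summarize_matrix function itself, and it does not call caret. It reuses the values from the example above and assumes that rows hold predicted classes and columns hold reference classes (the convention of caret's table input); a matrix reported the other way round would need to be transposed first.

```r
# Confusion matrix built from the example values above, filled row by row.
m <- matrix(
  c(185.85,  14.40,   8.60,  18.15,
     23.95, 190.55,   7.00,   3.50,
     14.20,   8.40, 157.25,  45.15,
     24.35,   1.65,  45.85, 153.15),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("Q", 1:4), paste0("Q", 1:4))
)

# Hypothetical stand-in for summarize_matrix (an assumption, not the original):
# overall accuracy plus per-class precision and recall.
# Assumes rows = predicted classes, columns = reference classes.
summarize_sketch <- function(m) {
  stopifnot(nrow(m) == ncol(m))
  correct <- diag(m)
  list(
    accuracy  = sum(correct) / sum(m),
    precision = correct / rowSums(m),
    recall    = correct / colSums(m)
  )
}

stats <- summarize_sketch(m)
print(stats)
```

Swapping rowSums and colSums exchanges precision and recall, so the orientation assumption should be checked against each study before encoding.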
Multiple matrices in a study can be summarized by putting them in a list and using the lapply function to extract relevant statistics:
bind_field(
  lapply(
    list(
      'library.model.features.data.exp1' = unflatten(
        Angry = .984, Happy = .007, Relax = .007, Sad = 0,
        0, .987, 0, .007,
        .008, .007, .987, 0,
        .008, 0, .007, .993
      ),
      'library.model.features.data.exp2' = unflatten(
        Angry = .976, Happy = .013, Relax = .014, Sad = 0,
        .008, .967, 0, .02,
        .008, .013, .98, .007,
        .008, .007, .007, .974
      )
    ),
    summarize_matrix
  )
)
When confusion matrices are not available, encode available parameters (accuracy, precision, recall, \(F\) scores, etc.) using the standard nomenclature to distinguish relevant outcomes for each class:
bind_field(
  library.model.features.data.experiment = c(class_measure.summaryStat = 0, ...),
  ...
)