HomeBank ACLEW Annotations
Melanie Soderstrom
Department of Psychology
University of Manitoba
m_soderstrom@umanitoba.ca
Elika Bergelson
Psychology and Neuroscience
Duke University
elika.bergelson@duke.edu
Marissa Casillas
Department of Comparative Human Development
University of Chicago
mcasillas@uchicago.edu
Resource Description
This dataset is a collection of ELAN (.eaf) annotation files. Annotations
were created using the ACLEW Annotation System as part of the Analyzing
Child Language Experiences Around the World (ACLEW) project, whose
homepage is here.
That project included selected recordings from several HomeBank
corpora, the annotations for which are included in this ACLEW HomeBank
Annotations dataset: Casillas, McDivitt, Warlaumont, and Winnipeg.
Due to data-sharing or other restrictions on some of the ACLEW
corpora, this dataset is a subset of the larger ACLEW dataset that
includes transcriptions.
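Because .eaf files are XML, they can be inspected with standard tools before or instead of converting them. The following is a minimal sketch, not the project's own tooling: it reads time-aligned annotations from an EAF document using Python's standard library. The element and attribute names (TIME_SLOT, TIER, ALIGNABLE_ANNOTATION, ANNOTATION_VALUE) follow the EAF format. The sample document, the tier name "CHI", and the annotation text are hypothetical and heavily abbreviated, so the sample is not a complete EAF file.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily abbreviated EAF content for illustration only.
SAMPLE_EAF = """<ANNOTATION_DOCUMENT>
  <TIME_ORDER>
    <TIME_SLOT TIME_SLOT_ID="ts1" TIME_VALUE="1000"/>
    <TIME_SLOT TIME_SLOT_ID="ts2" TIME_VALUE="2500"/>
  </TIME_ORDER>
  <TIER TIER_ID="CHI">
    <ANNOTATION>
      <ALIGNABLE_ANNOTATION ANNOTATION_ID="a1"
                            TIME_SLOT_REF1="ts1" TIME_SLOT_REF2="ts2">
        <ANNOTATION_VALUE>hello</ANNOTATION_VALUE>
      </ALIGNABLE_ANNOTATION>
    </ANNOTATION>
  </TIER>
</ANNOTATION_DOCUMENT>
"""


def read_aligned_annotations(eaf_xml):
    """Return (tier_id, onset_ms, offset_ms, text) for each time-aligned annotation."""
    root = ET.fromstring(eaf_xml)
    # Map each time slot ID to its time value in milliseconds.
    times = {
        slot.get("TIME_SLOT_ID"): int(slot.get("TIME_VALUE"))
        for slot in root.iter("TIME_SLOT")
        if slot.get("TIME_VALUE") is not None
    }
    rows = []
    for tier in root.iter("TIER"):
        for ann in tier.iter("ALIGNABLE_ANNOTATION"):
            onset = times.get(ann.get("TIME_SLOT_REF1"))
            offset = times.get(ann.get("TIME_SLOT_REF2"))
            text = ann.findtext("ANNOTATION_VALUE", default="")
            rows.append((tier.get("TIER_ID"), onset, offset, text))
    return rows


rows = read_aligned_annotations(SAMPLE_EAF)
for row in rows:
    print(row)
```

For a file on disk, `ET.parse(path).getroot()` could replace `ET.fromstring`. Annotations on dependent tiers that are stored as REF_ANNOTATION elements are not covered by this sketch.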
CHAT version
ACLEW transcription in ELAN uses a method that allows for easy conversion
of the .eaf files to .cha format using the ELAN2CHAT program in CLAN. Once
in CLAN, the ACLEW addressee codes are on a %xds line. The single-letter
codes have these meanings:
- C = one or more child addressees (whether it includes the target child or not)
- A = one or more adult addressees
- B = both; at least one adult and one child addressee (whether it includes the target child or not)
- P = animal/pet addressees
- O = other addressee (doesn’t fit into one of the preceding categories)
- U = unsure; used when no other classification can be made
- T = target child only (when the target child is the exclusive addressee of the utterance)
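The code list above can be turned into a small lookup table for tallying addressees in a converted transcript. This is a minimal sketch, assuming each dependent tier has the form `%xds:` followed by whitespace and a single code. The exact layout produced by ELAN2CHAT should be checked against a real file. The speaker labels and utterances in the sample are hypothetical.

```python
import re
from collections import Counter

# Meanings of the single-letter ACLEW addressee codes, as listed above.
XDS_CODES = {
    "C": "one or more child addressees",
    "A": "one or more adult addressees",
    "B": "both adult and child addressees",
    "P": "animal/pet addressees",
    "O": "other addressee",
    "U": "unsure",
    "T": "target child only",
}

# Assumption: a dependent tier looks like "%xds:<whitespace>CODE".
XDS_LINE = re.compile(r"^%xds:\s*(\S+)")


def count_addressee_codes(cha_text):
    """Count the addressee codes found on %xds lines of a CHAT transcript."""
    counts = Counter()
    for line in cha_text.splitlines():
        match = XDS_LINE.match(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Hypothetical transcript fragment, for illustration only.
sample = "\n".join([
    "*FA1:\tlook at the ball .",
    "%xds:\tT",
    "*FA1:\tdid you call him ?",
    "%xds:\tA",
    "*MA1:\tcome here .",
    "%xds:\tT",
])

counts = count_addressee_codes(sample)
for code, n in sorted(counts.items()):
    print(code, XDS_CODES.get(code, "unrecognized code"), n)
```

Codes that are not in the table are still counted and reported as unrecognized, which makes unexpected values in a transcript easy to spot.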
Citations
In addition to the VanDam et al. (2017) HomeBank citation, products that
have used these data should cite the Soderstrom et al. (2021) Collabra
paper (see below) as well as at least one citation from each corpus
used. Where space is unavoidably and extremely limited (e.g. brief
conference proceedings) just the Collabra paper may be cited, although
this is strongly discouraged as it does not include (and therefore
denies credit to) important contributors to the individual corpora that
make up the ACLEW MetaCorpus.
Primary Project Citation
Soderstrom, M., Casillas, M., Bergelson, E., Rosemberg, C. R., Alam,
F., Warlaumont, A. S., & Bunce, J. P. (2021). Developing A
Cross-Cultural Annotation System and MetaCorpus for Studying Infants’
Real World Language Experience. Collabra: Psychology, 7(1), 23445.
Casillas Corpus publications
Casillas, M., Brown, P., & Levinson, S. C. (2017). Casillas HomeBank Corpus. doi:10.21415/T51X12
McDivitt Corpus publications
McDivitt, K., & Soderstrom, M. (2016). McDivitt HomeBank Corpus. doi:10.21415/T5KK6G
Winnipeg Corpus publications
Soderstrom, M., Grauer, E., Dufault, B., & McDivitt, K. (2018).
Influences of number of adults and adult:child ratios on the quantity
of adult language input across childcare settings. First Language,
38(6), 563-581. https://doi.org/10.1177/0142723718785013
Soderstrom, M., & Wittebolle, K. (2013). When Do Caregivers Talk? The
Influences of Activity and Time of Day on Caregiver Speech and Child
Vocalizations in Two Childcare Environments. PLoS ONE, 8(11), e80646.
Soderstrom, M. (2016). Winnipeg HomeBank Corpus. doi:10.21415/T58P6Q
Warlaumont Corpus publications
Ritwika, V. P. S., Pretzer, G. M., Mendoza, S., Shedd, C., Kello, C. T., Gopinathan, A., & Warlaumont, A. S. (2020). Exploratory dynamics of vocal foraging during infant-caregiver communication. Scientific Reports, 10, 10469. doi:10.1038/s41598-020-66778-0
Warlaumont, A. S., Pretzer, G. M., Mendoza, S. & Walle, E. A. (2016). Warlaumont HomeBank Corpus. doi:10.21415/T54S3C
Usage Restrictions
Data restrictions for the individual corpora and for HomeBank in general
apply to these files in the special collection. In brief:
1) the authors request to be informed about data usage prior to
submission for publication (email M_Soderstrom@umanitoba.ca);
2) the additional data access restrictions for the Casillas and
McDivitt/Soderstrom files remain in place;
3) publications must take particular care not to reveal identifying
information about individual participants;
4) raw audio and .eaf files must not be publicly redistributed or
presented unless this is explicitly allowed for the participant in
question (check individual corpus restrictions or contact the corpus
holder).
Access
The ACLEW transcriptions are located in an ACLEW folder within the transcript folder for each corpus. Here are links that will take you directly to each:
Metadata and other materials that span all corpora are located here.