HomeBank data are transcribed in CHAT format. Transcribed data have been cleared for public access. The extensive collection of audio data has been diarized but not transcribed; those data are password protected and open only to HomeBank members. For information on how to apply for HomeBank membership, see this link. Please remember to follow the guidelines for data-sharing.
HomeBank data can be downloaded by finding a relevant corpus on the Corpus List and then clicking on the link that says "Download CHAT transcripts, ITS files, and metadata".
HomeBank CHAT transcripts can be viewed in your browser by connecting to the Browsable Database.
If you find it tedious to download media files one by one, you can use wget.
For example, to retrieve all the *.mp3 and *.wav audio in the Warlaumont folder, you can run this one-line wget command:
$ wget -c --user=gordon --ask-password -e robots=off -r -l inf --no-remove-listing -nH --no-parent -R 'index.html*' https://media.talkbank.org/HomeBank/Password/Warlaumont/
wget then prompts for your password (replace gordon with your own username), and the files download into a folder called "HomeBank/Password/Warlaumont" in the directory where you ran the command. The files within that folder keep the original hierarchical structure. wget does not report overall progress for the whole download, but you can check progress by watching files arrive in the folder on your computer.
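If watching the folder by eye is inconvenient, a small shell function can summarize what has arrived so far. This is only a sketch: the function name download_progress is hypothetical, and the default path assumes the folder created by the example command above.

```shell
# Sketch: report how many files have been downloaded so far and their total size.
# The default directory is the folder wget creates in the example above;
# pass a different path as the first argument if yours differs.
download_progress() {
  dir="${1:-HomeBank/Password/Warlaumont}"
  # Count regular files at any depth below the directory.
  count=$(find "$dir" -type f | wc -l | tr -d ' ')
  # Human-readable total size; du separates size and path with a tab.
  size=$(du -sh "$dir" | cut -f1)
  echo "$count files, $size in $dir"
}
```

Run download_progress in a second terminal, from the directory where you started wget, as often as you like while the download continues.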
If you want to download only the *.wav files for a single child in the Warlaumont folder, the command would be:
$ wget -c --user=gordon --ask-password -e robots=off -r -l inf --no-remove-listing -nH --no-parent -R 'index.html*' -A '*.wav' https://media.talkbank.org/HomeBank/Password/Warlaumont/0107/
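The accept list given to -A can hold several comma-separated patterns, so one run can fetch more than one audio type for a single child. The following is a minimal sketch, assuming the same example username and the 0107 folder used in the command above; the function name fetch_audio and the variable names are hypothetical.

```shell
# Sketch: download both *.wav and *.mp3 files for one child in a single run.
# HB_USER and HB_URL mirror the example above; replace them with your own values.
HB_USER="gordon"
HB_URL="https://media.talkbank.org/HomeBank/Password/Warlaumont/0107/"
ACCEPT='*.wav,*.mp3'   # comma-separated accept list passed to -A

fetch_audio() {
  # Same options as the command above; only the accept list differs.
  wget -c --user="$HB_USER" --ask-password -e robots=off -r -l inf \
       --no-remove-listing -nH --no-parent -R 'index.html*' \
       -A "$ACCEPT" "$HB_URL"
}
```

Defining the function does not start a download; call fetch_audio when you are ready, and wget will prompt for your password as before.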
If you want to download all media from an area that has no password protection, such as the VanDam Public corpus, you could use this form:
$ wget -c -e robots=off -r -l inf --no-remove-listing -nH --no-parent -R 'index.html*' https://media.talkbank.org/HomeBank/VanDam/