HomeBank data is transcribed in CHAT format. Transcribed data have been cleared for public access.
The extensive collection of audio data has been diarized but not transcribed. Those data are password-protected and open only to HomeBank members. For information
on how to apply for HomeBank membership, see this link.
Please remember to follow the guidelines for data-sharing.
From the corpus list, download the transcripts and unzip them.
From the same place, download the media files and place them into the transcript folders.
To open a transcript, double-click on it and it will open in CLAN. If there is associated media,
you can play it using escape-8 for continuous playback or command-click to play a single utterance.
If you find it tedious to download media files one by one, you can use wget.
For example, to retrieve all the *.mp3 audio in the McDivitt folder, you can run this one-line wget command:
Then you enter your password, and the files download into a folder called "media" in the calling directory.
The files within that folder keep the original hierarchical structure.
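As an illustration of the kind of recursive wget download described above, here is a minimal sketch. It is not the command published by HomeBank: the URL, the username, the helper name build_wget_cmd, and the choice of flags are all assumptions made for the example. The script only prints the command so that you can inspect it before running it yourself.

```shell
#!/bin/sh
# Hypothetical values: replace with your own HomeBank username and the
# real media folder URL for the corpus you want (this URL is a placeholder).
HB_USER="your_username"
MEDIA_URL="https://example.org/media/McDivitt/"

# Build (but do not run) a recursive wget command.
#   -r              recurse into subfolders
#   -np             never ascend to the parent folder
#   -nH             do not create a folder named after the host
#   -A "*.mp3"      keep only files matching *.mp3
#   -P media        save everything under ./media (assumed layout)
#   --user / --ask-password   prompt for the password interactively
build_wget_cmd() {
    printf 'wget -r -np -nH -A "*.mp3" -P media --user=%s --ask-password %s\n' "$1" "$2"
}

build_wget_cmd "$HB_USER" "$MEDIA_URL"
```

Copy the printed line into a terminal to start the download; wget will then ask for your password.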
Installation of wget depends on your system:
For macOS, you install wget through Homebrew. You first install Homebrew by going to brew.sh and
copying and pasting the command at that site into Terminal. Then run "brew upgrade" and "brew install wget".
For Linux, wget is often preinstalled or available through your distribution's package manager. Otherwise, you download wget from GitHub, compile it, and then run the same command given above.
For Windows, you install wget through Cygwin. During the install of Cygwin,
at the "Cygwin Setup - Select Packages" screen, search for "wget" and select all matching packages.
Run Cygwin and use the same wget command above.
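Whichever system you use, it can help to confirm that wget is actually on your path before attempting a download. The following is a small sketch of such a check; the helper name have_cmd is made up for this example.

```shell
#!/bin/sh
# Return success if the named command is available on the PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd wget; then
    # Show the first line of wget's version banner.
    wget --version | head -n 1
else
    echo "wget not found; install it using the instructions above"
fi
```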