An ASR model for transcribing laughter and speech-laugh in conversational speech

hhoangphuoc/SpeechLaughRecogniser

SpeechLaughRecogniser

An ASR model for transcribing laughter in Speech Laugh audio

Dataset

Switchboard data

path=/deepstore/datasets/hmi/speechlaugh-corpus # global data path
  • Use gdown to download the .zip data file, then unzip it.
gdown 1VlQlyY3v3wtT2S047lwlTirWisz5mQ18 -O /path/to/data/switchboard.zip

#path=/deepstore/datasets/hmi/speechlaugh-corpus/switchboard_data # global datasets path

cd /path/to/data # e.g. /deepstore/datasets/hmi/speechlaugh-corpus/switchboard_data

unzip switchboard.zip

# after unzip, the data will contain the following folders:
# - audio_wav
# - transcripts
  • Generate the audio_segments folder. It can be stored at the following path:
path=/deepstore/datasets/hmi/speechlaugh-corpus/switchboard_data/audio_segments
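
Before generating segments, it can help to confirm that the archive unpacked as described above. The following is a minimal sketch, not part of the repository: the helper name `missing_subfolders` is hypothetical, and the only facts taken from this README are the two expected folder names, audio_wav and transcripts.

```python
from pathlib import Path

# Folder names listed in this README as the contents of the unzipped archive.
EXPECTED_SUBFOLDERS = ("audio_wav", "transcripts")


def missing_subfolders(data_root, expected=EXPECTED_SUBFOLDERS):
    """Return the expected subfolder names that are absent under data_root.

    Hypothetical helper for illustration; it only checks that the
    directories exist, not that their contents are complete.
    """
    root = Path(data_root)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    import sys

    # Pass the folder where switchboard.zip was extracted, e.g. /path/to/data
    root = sys.argv[1] if len(sys.argv) > 1 else "."
    missing = missing_subfolders(root)
    if missing:
        print("Missing folders:", ", ".join(missing))
    else:
        print("All expected folders found.")
```

An empty result means both folders are present under the given root.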

VocalSound

  • Download the dataset from VocalSound and save it to path/to/data/vocalsound_data folder
wget -O vocalsound_16k.zip "https://www.dropbox.com/s/c5ace70qh1vbyzb/vs_release_16k.zip?dl=1"

#path=/deepstore/datasets/hmi/speechlaugh-corpus/vocalsound_data

unzip vocalsound_16k.zip
  • After unzipping, the data will be located at:
path=/deepstore/datasets/hmi/speechlaugh-corpus/vocalsound_data/audio_16k

Other datasets (AMI, AudioSet, FSD50K-noisy, etc.)

  • Download these datasets from HuggingFace datasets and save them to the data/huggingface_data folder
  1. First, set the HuggingFace cache path to this folder
$ export HF_DATASETS_CACHE="../data/huggingface_data"

# or change to the global datasets

$ export HF_DATASETS_CACHE="/deepstore/datasets/hmi/speechlaugh-corpus/huggingface_data"
  2. Then download the datasets using their HuggingFace dataset names, as follows:
  • ami: "edinburghcstr/ami" "ihm" split="train"
  • fsd50k_noisy: "sps44/fsdnoisy18k"
  • audioset: "benjamin-paine/audio-set-16khz"
