Releases: espnet/espnet

ESPnet version 202412

04 Dec 04:40
cccc290

New Features

  • [New Features][ESPnet2][Codec] Add HiFiCodec model #5898 by @RayYuki

Enhancement

Recipe

  • [Recipe][ESPnet2][ASR] My Science Tutor (MyST) Children's Conversational Speech Corpus #5964 by @eric102004
  • [Recipe][ESPnet2] Feature/improve is24 asr2 #5938 by @juice500ml
  • [Recipe][ESPnet2][ASR] Add asr1 recipe for libriheavy_small #5932 by @Miamoto
  • [Recipe][ESPnet2][SID] Add RATS dataset for SV task #5840 by @shimhz

Bugfix

Documentation

Others

Acknowledgements

Special thanks to @Masao-Someki, @Miamoto, @RayYuki, @Trikaldarshi, @anyuyay, @emmanuel-ferdman, @eric102004, @ftshijt, @juice500ml, @kalvinchang, @pyf98, @shimhz, @siddhu001, @wanchichen, @yoshipon.

ESPnet version 202409

01 Oct 06:28
6bae9d2

New Features

  • [New Features][ESPnet2][TTS][Codec] Support Codec feature for TTS2 task #5857 by @wyh2000
  • [New Features][ESPnet2][Codec] Codec downstream task support: TTS #5763 by @jctian98
  • [New Features][ESPnet2][Codec] Add Encodec features for Codec toolkit #5758 by @jctian98
  • [New Features][ESPnet2][Installation][TTS] Add evaluation scripts with DiscreteSpeechMetrics. #5661 by @Takaaki-Saeki
  • [New Features][ESPnet2][ASR] Integrate adapter for s3prl frontend #5609 by @Stanwang1210
  • [New Features][ESPnet2][CI][OWSM] Support external dataset library for ESPnetEasy #5584 by @Masao-Someki
  • [New Features][ESPnet2][CI][LM] Pr voxtlm #5472 by @soumimaiti

Enhancement

  • [Enhancement][ESPnet2][SLM] MT Task in SpeechLM #5899 by @ftshijt
  • [Enhancement][ESPnet2][Codec] Categorical Balanced Chunk iterator #5894 by @ftshijt
  • [Enhancement][ESPnet2][ESPnet1] TransformerDecoder forward_one_step with memory_mask #5679 by @albertz
  • [Enhancement][ESPnet2] Update espnet_model.py #5646 by @shen9712

Recipe

  • [Recipe][ESPnet2][Music] Fixed KiSing Data Preparation #5895 by @HANJionghao
  • [Recipe][ESPnet2][ASR] CORAAL asr1 recipe #5882 by @kalvinchang
  • [Recipe][ESPnet2][ASR] ml_superb asr2 recipe #5866 by @Stanwang1210
  • [Recipe][ESPnet2] Add more download links for ML-SUPERB #5863 by @ftshijt
  • [Recipe][ESPnet2][ASR] Fix bug in asr2.sh #5859 by @juice500ml
  • [Recipe][ESPnet2][Music] fix bugs in SVS1 #5851 by @South-Twilight
  • [Recipe][ESPnet2][TTS] New Recipe of tts2+aishell3 #5849 by @Tsukasane
  • [Recipe][ESPnet2][ASR] Espnet Multi-convformer implementation #5832 by @Darshan7575
  • [Recipe][ESPnet2][SE] Update of SE functions #5825 by @Emrys365
  • [Recipe][ESPnet2] SPRING-INX Recipe (Speech Lab, IIT, Madras) #5811 by @arjun-gangwar
  • [Recipe][ESPnet2][TTS] Adding Hifitts recipe for espnet #5784 by @coding-phoenix-12
  • [Recipe][ESPnet2][ASR] Updated results for CHiME-8 DASR baseline with new notsofar1 dev set #5771 by @popcornell
  • [Recipe][ESPnet2][SE] Final model scores for TF-GridNetV2 on the Kinect-WSJ dataset #5754 by @atharva253
  • [Recipe][ESPnet2] Apply normalization on validation set for CHiME-8 recipe #5749 by @popcornell
  • [Recipe][ESPnet2][Need review][Codec] ESPnet-Codec decoding and Scoring #5747 by @ftshijt
  • [Recipe][ESPnet2][CI][ST] Add recipe for IWSLT 2024 shared task Indic track #5744 by @cromz22
  • [Recipe][ESPnet2][Music] [SVS] VISinger Plus #5741 by @jerryuhoo
  • [Recipe][ESPnet2][Need review][Codec] ESPnet-codec Training and Setup #5732 by @ftshijt
  • [Recipe][ESPnet2][ASR] ESPnet Recipe for ASR on the Makerere Radio Speech Corpus #5730 by @satvik-dixit
  • [Recipe][ESPnet2][SE] ESPnet recipe for the Kinect-WSJ dataset #5711 by @atharva253
  • [Recipe][ESPnet2][TTS][ASR][Music] Update bitrate calculation scripts for the IS24 discrete speech challenge #5677 by @ftshijt
  • [Recipe][ESPnet2][ASR] Add some documents for JTubeSpeech #5663 by @sw005320
  • [Recipe][ESPnet2][SID] ESPnet-SPK: add SdSV 2021 recipe #5659 by @Alexgichamba
  • [Recipe][ESPnet2][ASR] Add E-Branchformer model for FLEURS #5657 by @wanchichen
  • [Recipe][ESPnet2][Installation][CI][ASR] CHiME-8 DASR recipe based on CHiME-7 DASR baseline #5641 by @popcornell
  • [Recipe][ESPnet2][ASR] add interspeech2024_dsu_challenge/asr2 #5627 by @simpleoier
  • [Recipe][ESPnet2][Installation][TTS] Discrete token-based TTS implementation #5626 by @ftshijt

Bugfix

  • [Bugfix] fix: replace ellipses (...) in ESPnet-EZ Trainer documentation #5911 by @kalvinchang
  • [Bugfix] Bugfix/homepage #5885 by @Masao-Someki
  • [Bugfix][ESPnet2] Fix absolute paths in aishell3_tts2 #5884 by @Tsukasane
  • [Bugfix] Bug fix for source link #5883 by @Masao-Someki
  • [Bugfix][Installation] [CI] Add required file for g2p_en #5869 by @Fhrozen
  • [Bugfix][ESPnet2] A fix to newer torch version (compatible to old version with typecheck) #5830 by @ftshijt
  • [Bugfix][ESPnet2] Revert change to abs_task to keep behavior consistent #5789 by @ftshijt
  • [Bugfix][ESPnet2] Fix Whisper frontend #5760 by @siddhu001
  • [Bugfix][ESPnet2][SE] Update TSE recipe egs2/librimix/tse1 #5731 by @Emrys365
  • [Bugfix][ESPnet2] Fix LoRA issues when saving all parameters. #5722 by @simpleoier
  • [Bugfix][ESPnet2] Fix tts packing with new spk embedding #5715 by @ftshijt
  • [Bugfix][ESPnet2][TTS] Fix stage references in generated run.sh in TTS recipes #5714 by @G-Thor
  • [Bugfix][ESPnet2][OWSM] fix a small issue in OWSM decode_long #5703 by @jctian98
  • [Bugfix][ESPnet2][Installation] Upgrade typeguard #5702 by @sw005320
  • [Bugfix][ESPnet2] Quick fix to calculation of bitrate #5692 by @ftshijt
  • [Bugfix][ESPnet2][SSUM] Fix typo in summarization scoring #5688 by @YoshikiMas
  • [Bugfix][ESPnet2] Update egs2/TEMPLATE/asr2/asr2.sh #5682 by @simpleoier
  • [Bugfix][ESPnet2][ASR] Fix over-lengthy audio in ml_superb data prep #5678 by @ftshijt
  • [Bugfix][ESPnet2] fix typo #5673 by @hiranoyu0830
  • [Bugfix][Installation][ST] Fix CI Multilingual ST test #5672 by @Fhrozen
  • [Bugfix][ESPnet2][SLU] Fix speed perturbation when not using transcript in slu.sh #5671 by @siddhu001
  • [Bugfix][ESPnet2][SLU] Fix loading pre-trained model from transformers #5668 by @siddhu001
  • [Bugfix][ESPnet2] Correct the argument errors in the whisper tokenizer language. #5666 by @pengchengguo

Documentation

  • [Documentation][ESPnet2][Music] Fixed SingingGenerate docstring examples #5889 by @HANJionghao
  • [Documentation][ESPnet2][CI] Separate packing and uploading stages #5752 by @cromz22
  • [Documentation] Add script to make release note from milestone #5653 by @kan-bayashi

Refactoring

Others

  • [Others][CI] Bugfix for the paper publish workflow #5909 by @juice500ml
  • [Others][ESPnet2] Revision on Speechlm vocabulary extension script #5906 by @jctian98
  • [Others][ESPnet2][TTS] Fix tts.sh path in aishell3 tts2 #5879 by @sw005320
  • [Others][ESPnet2][Installation] Add DeepSpeed trainer for large-scale training #5856 by @jctian98
  • [Others] Update README info #5852 by @ftshijt
  • [Others][ESPnet2][ESPnet1][Installation] Add flash-attn #5839 by @wanchichen
  • [Others][ESPnet2][Music] [SVS] fix VISinger2 typecheck error #5838 by @jerryuhoo
  • [Others][ESPnet2] Fixed kising/acesinger google drive download #5834 by @HANJionghao
  • [Others][ESPnet2][SID] update MFA-Conformer performance after fixing the bug in #5797 #5826 by @Jungjee
  • [Others][ESPnet2][CI][SE] SE function updates: new models and support for handling various sampling frequencies #5800 by @Emrys365
  • [Others][ESPnet2][SID] fix spk mfa-conformer forwarding #5797 by @series2
  • [Others][ESPnet2][CI][Music] [SVS] Add CI tests for VISinger Plus #5786 by @jerryuhoo
  • [Others][ESPnet2][LM] Bug fix for VoxtLM v1 recipe #5782 by @cromz22
  • [Others][ESPnet2][ESPnet1] Added partially auto-regressive decoding #5769 by @Masao-Someki
  • [Others][Installation][CI] Fix minor issue in anaconda downloading #5753 by @ftshijt
  • [Others] [pre-commit.ci] pre-commit autoupdate #5738 by @pre-commit-ci[bot]
  • [Others][ESPnet2][Installation][CI] Upgrade typeguard [Subst.] #5724 by @Fhrozen
  • [Others][ESPnet2][SE] TF-GridNet training recipe for DNS Interspeech 2020 dataset #5710 by @nateanl
  • [Others][ESPnet2][LM] Adding transformer_opt #5709 by @soumimaiti
  • [Others][ESPnet2] Add Readme for Voxtlm #5693 by @wyh2000
  • [Others][ESPnet2][SID] ESPnet-SPK: add ASVspoof19 SASV recipe #5687 by @Alexgichamba

Acknowledgements

Special thanks to @Alexgichamba, @Darshan7575, @Emrys365, @Fhrozen, @G-Thor, @HANJionghao, @Jungjee, @Masao-Someki, @South-Twilight, @Stanwang1210, @Takaaki-Saeki, @Tsukasane, @YoshikiMas, @albertz, @arjun-gangwar, @atharva253, @coding-phoenix-12, @cromz22, @ftshijt, @hiranoyu0830, @jctian98, @jerryuhoo, @juice500ml, @kalvinchang, @kan-bayashi, @nateanl, @pengchengguo, @popcornell, @pre-commit-ci[bot], @satvik-dixit, @series2, @shen9712, @siddhu001, @simpleoier, @soumimaiti, @sw005320, @wanchichen, @wyh2000.

ESPnet version 202402

06 Feb 03:28
6ddbdf3

News

We're thrilled to announce that our latest update brings two groundbreaking features to our project: espnetez and ESPnet-SPK!

New Features

  • [New Features][ESPnet2][ESPnet1][Installation][SE] Add diffusion-based SE model to ESPnet-SE #5572 by @LiChenda
  • [New Features][ESPnet2][ESPnet1][CI][ASR] Add Bayes Risk CTC (reworked) #5519 by @jctian98
  • [New Features][ESPnet2][TTS] TTS evaluation script and monitoring functionality using MOS prediction model #5485 by @Takaaki-Saeki
  • [New Features][ESPnet2][SE] Add USES model for speech enhancement in diverse conditions #5482 by @Emrys365
  • [New Features][ESPnet2][CI][SID] ESPnet-SPK: major update #5408 by @Jungjee
  • [New Features][ESPnet2][TTS][ASR] Add espnetez #5372 by @Masao-Someki

Enhancement

  • [Enhancement][ESPnet2][OWSM] Improving OWSM inference interface #5618 by @pyf98
  • [Enhancement][ESPnet2][OWSM] Add OWSM v3.1 #5611 by @pyf98
  • [Enhancement][ESPnet2][CI] ESPnet-SPK: Additional models, supplement readme #5559 by @Jungjee
  • [Enhancement][ESPnet2][CI][SE] Add PyTorch & GPU support for DNSMOS calculation #5548 by @Emrys365
  • [Enhancement][ESPnet2][TTS][SID] Speaker embedding extractor (with ESPnet pre-trained speaker model) #5579 by @ftshijt

Recipe

  • [Recipe][ESPnet2][Music] Fix relative setting of train-dev-test #5623 by @ftshijt
  • [Recipe][ESPnet2][SID] ESPnet-SPK: add Voxblink recipe #5583 by @Jungjee
  • [Recipe][ESPnet2][SID] ESPnet-SPK: Model upload and result generation #5558 by @Jungjee
  • [Recipe][ESPnet2][Music] ACE singer recipe fixing #5551 by @ftshijt
  • [Recipe][ESPnet2][TTS] TTS2 Template #5541 by @ftshijt
  • [Recipe][ESPnet2][ASR] fix kaldi dependency in asr2 #5540 by @ftshijt
  • [Recipe][ESPnet2][CI][S2ST] CI test for s2st #5526 by @ftshijt
  • [Recipe][ESPnet2][ASR] Added data.sh to SPRING-INX IITM Recipe #5522 by @arjun-gangwar
  • [Recipe][ESPnet2][ASR] Add Libriheavy small and medium ASR2 recipes #5512 by @akreal
  • [Recipe][ESPnet2][ASR] SPRING-INX IITM RECIPE #5505 by @arjun-gangwar
  • [Recipe][ESPnet2][ASR][RNNT] Add transducer conformer configuration to commonvoice recipe #5503 by @zuazo
  • [Recipe][ESPnet2][ESPnet1] add centralized data preparation for OWSM #5478 by @jctian98
  • [Recipe][ESPnet1] Added clean speech results #5649 by @linan2
  • [Recipe][ESPnet2][Installation][AV] AVSR recipe for Easycom Dataset #5630 by @ms-dot-k
  • [Recipe][ESPnet2] Update CHiME-7 ASR1 recipe #5555 by @popcornell
  • [Recipe][ESPnet2] Add E-Branchformer model checkpoint in OWSM v2 #5517 by @pyf98
  • [Recipe][ESPnet2][SLU] Slue PR configs #5087 by @siddhu001

Bugfix

Documentation

  • [Documentation][ESPnet2] Add instructions for finetuning owsm #5539 by @pyf98
  • [Documentation] Updated the reference of the accepted JOSS paper #5515 by @neillu23

Others

  • [Others] Update Discord Invitation Link #5578 by @Fhrozen
  • [Others][ESPnet2][CI] Improve error robustness of unit tests #5523 by @Emrys365

Acknowledgements

Special thanks to @Emrys365, @Fhrozen, @Jungjee, @LiChenda, @Masao-Someki, @Takaaki-Saeki, @VicentCano, @akreal, @albertz, @arjun-gangwar, @brianyan918, @ftshijt, @jasonmusespresso, @jctian98, @juice500ml, @linan2, @ms-dot-k, @neillu23, @popcornell, @pyf98, @siddhu001, @sw005320, @takenori-y, @tjysdsg, @zuazo.

ESPnet version 202310

25 Oct 11:52
76b318e

What's Changed

New Contributors

Full Changelog: v.202308...v.202310

ESPnet version 202308

03 Aug 13:36
01d7df7

What's Changed

ESPnet version 202304

01 May 12:53
2219358

What's Changed

ESPnet version 202301

01 Feb 10:50
1ce7ad4

What's Changed

New Contributors

Full Changelog: v.202211...v.202301

ESPnet version 202211

11 Dec 23:52
059c910

What's Changed

New Contributors

Full Changelog: v.202209...v.202211

ESPnet version 202209

04 Oct 11:22
d0c12f9

What's Changed

New Contributors

Full Changelog: v.202207...v.202209

ESPnet version 202207

02 Aug 01:00
96bd746

New Features

  • [New Features][ESPnet1][ASR] Add DDP support for v1 ASR training. #4430 by @lazykyama
  • [New Features][ESPnet2] Support tensorboard graph #4418 by @kamo-naoyuki
  • [New Features][ESPnet2][ASR] Branchformer Encoder in ESPnet2 #4400 by @pyf98
  • [New Features][ESPnet2][Diarization][SE] enh_diar joint model #4339 by @YushiUeda
  • [New Features][ESPnet2][ESPnet1] Calculate RTF and latency in espnet2 #4382 by @espnetUser
  • [New Features][ESPnet2][ESPnet1][SE] Add EnhPreprocessor for Speech Enhancement #4321 by @Emrys365
  • [New Features][ESPnet2][SE] Add DPTNet and WarmupStepLR scheduler #4449 by @Emrys365
  • [New Features][ESPnet2][SE] Add support for calculating losses on noise and dereverberated signals #4476 by @Emrys365

Recipe

  • [Recipe][ESPnet2] Aishell-2 GPU info #4501 by @jctian98
  • [Recipe][ESPnet2] Fix librispeech default path to signify auto download #4517 by @karthik19967829
  • [Recipe][ESPnet2] Recipe fix for PueblaNahuatl Recipe #4522 by @ftshijt
  • [Recipe][ESPnet2][ASR][README] Add Aishell-2 ASR Recipe for Espnet2 #4451 by @jctian98
  • [Recipe][ESPnet2][ASR][README] Add AmericasNLP 2022 baselines #4428 by @akreal
  • [Recipe][ESPnet2][ESPnet1][ASR][Installation] FLEURS ASR Recipe for ESPnet2 #4455 by @wanchichen
  • [Recipe][ESPnet2][ESPnet1][ASR][README] tedx_spanish_corpus egs2 recipe #4523 by @jessicah25
  • [Recipe][ESPnet2][ESPnet1][ASR][SE] Adding L3DAS22 Task1 model to ESPNet-SE #3994 by @popcornell
  • [Recipe][ESPnet2][ESPnet1][ST] Must_C v1 and v2 in egs2 #4306 by @brianyan918
  • [Recipe][ESPnet2][README] Dcase task1 Baseline #4317 by @siddhu001
  • [Recipe][ESPnet2][README] Report Aishell-2 Transducer results #4489 by @jctian98
  • [Recipe][ESPnet2][README] Update language codes in AmericasNLP 2022 baseline #4441 by @akreal
  • [Recipe][ESPnet2][README] Vox populi baseline #4478 by @siddhu001
  • [Recipe][ESPnet2][SE] L3DAS22 enhancement recipe #4269 by @neillu23
  • [Recipe][ESPnet2][SE] Update notes in the recipes for DNS challenges #4433 by @YoshikiMas
  • [Recipe][ESPnet2][SE][SLU][ST] LT-Spatialized and SLURP-Spatialized combined enhancement recipe #4268 by @neillu23
  • [Recipe][ESPnet2][ST] Add moses check for ST recipes #4417 by @ftshijt
  • [Recipe][ESPnet2][TTS] Add talromur recipe #4379 by @G-Thor
  • [Recipe][ESPnet2][TTS] Fix for issue #4401 #4402 by @G-Thor
  • [Recipe][ESPnet2][TTS] add pre-trained model jets in the recipe of ljspeech, kss #4406 by @imdanboy

Bugfix

  • [Bugfix][ESPnet1] fix the corrupted pretrained model #4490 by @wentaoxandry
  • [Bugfix][ESPnet1][ESPnet2] Fix an4 URL #4427 by @pyf98
  • [Bugfix][ESPnet1][ESPnet2][RNNT] Fix mAES with big vocab size #4312 by @b-flo
  • [Bugfix][ESPnet2] Adding __init__.py to espnet2/diar/layers and espnet2/diar/separator #4470 by @cycentum
  • [Bugfix][ESPnet2] Fix tensorboard-graph creation for multi gpu mode #4431 by @kamo-naoyuki
  • [Bugfix][ESPnet2] Update char_tokenizer.py #4499 by @xiabingquan
  • [Bugfix][ESPnet2][ESPnet1][ASR][LM][MT][TTS] Fix Transducer LM fusion and add Logging for Transducer inference #4327 by @chintu619
  • [Bugfix][ESPnet2][SE] Fix a bug in enh unit test #4435 by @Emrys365

Enhancement

  • [Enhancement][ESPnet2] Optionize graph creation #4551 by @kan-bayashi
  • [Enhancement][ESPnet2][Installation][TTS] Add icelandic g2p #4384 by @G-Thor
  • [Enhancement][ESPnet2][SE] Add support of test-only criterions after each epoch #4381 by @Emrys365
  • [Enhancement][ESPnet2][SSL] raise more useful error in espnet2/asr/frontend/s3prl.py if s3prl is not installed #4480 by @popcornell
  • [Enhancement][ESPnet2][TTS] Add JETS AlignmentModule in calculate_all_attentions.py #4446 by @seastar105

Refactoring

Others

  • [CI][ESPnet1][ESPnet2][Installation] Remove the version restriction for numpy #4419 by @kamo-naoyuki
  • [CI][ESPnet2] Changed to install espnet from wheel in the test_import CI test #4471 by @kamo-naoyuki
  • [CI][Installation] Temporarily fixed numpy version #4464 by @kamo-naoyuki
  • [Documentation] Add notes on batch size and num of GPUs in ESPnet2 documentation #4436 by @pyf98
  • [Documentation][ESPnet1] Update decoder.py #4322 by @sw005320
  • [Documentation][ESPnet2] Add a note to follow the installation instructions #4477 by @akreal

Acknowledgements

Special thanks to @Emrys365, @G-Thor, @YoshikiMas, @YushiUeda, @akreal, @b-flo, @brianyan918, @chintu619, @cycentum, @espnetUser, @ftshijt, @imdanboy, @jctian98, @jessicah25, @jhlee9010, @kamo-naoyuki, @kan-bayashi, @karthik19967829, @lazykyama, @neillu23, @popcornell, @pyf98, @seastar105, @siddhu001, @sw005320, @wanchichen, @wentaoxandry, @xiabingquan.