Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
A PyTorch-based Speech Toolkit
Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding
Speech to Phoneme, Bandwidth Extension and Speaker Verification using the Vibravox dataset.
A toolbox of audio models and algorithms based on MindSpore
Framework for training and evaluating self-supervised learning methods for speaker verification.
A Repository for Single- and Multi-modal Speaker Verification, Speaker Recognition and Speaker Diarization
Deep learning for audio processing
This repository contains audio samples and supplementary materials accompanying publications by the "Speaker, Voice and Language" team at Google.
Code for the paper: Improving Speaker Representations Using Contrastive Losses on Multi-scale Features
speechlib is a library for speaker diarization, transcription, and speaker recognition on audio files, producing transcripts with actual speaker names
kaldi-asr/kaldi is the official location of the Kaldi project.
The SpeechBrain project aims to build a novel speech toolkit fully based on PyTorch. With SpeechBrain, users can easily create speech processing systems, including speech recognition (both HMM/DNN and end-to-end), speaker recognition, speech enhancement, speech separation, multi-microphone speech processing, and many others.
An attention-based backend allowing efficient fine-tuning of transformer models for speaker verification
SOTA method for self-supervised speaker verification leveraging a large-scale pretrained ASR model.
AudioSpeakerVerification: FastAPI-based API for Speaker Matching and Verification using SpeechBrain. Compare and verify speaker identities from audio files.
Speaker Verification utilizing the Self-Supervised Audio Spectrogram Transformer
Development of Convolutional Neural Networks for environmental sound recognition
The project aims to recognize speakers using a combination of Mel-Frequency Cepstral Coefficients (MFCC) and Gaussian Mixture Models (GMM).
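The MFCC + GMM pipeline described above can be sketched briefly: fit one Gaussian Mixture Model per enrolled speaker on that speaker's feature frames, then identify a test utterance by the highest average log-likelihood. This is a minimal illustration, not the listed project's actual code; the synthetic random frames stand in for real MFCCs, which in practice would come from a feature extractor such as `librosa.feature.mfcc`.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Stand-in "MFCC" frames: each speaker's frames are drawn from a distinct
# synthetic distribution purely for illustration. Real systems would
# extract 13-40 MFCCs per frame from enrollment audio instead.
n_mfcc = 13
speakers = {
    "alice": rng.normal(loc=-2.0, scale=1.0, size=(500, n_mfcc)),
    "bob": rng.normal(loc=+2.0, scale=1.0, size=(500, n_mfcc)),
}

# Enrollment: fit one GMM per speaker on that speaker's frames.
models = {
    name: GaussianMixture(n_components=4, covariance_type="diag",
                          random_state=0).fit(frames)
    for name, frames in speakers.items()
}

def identify(frames, models):
    """Score an utterance against every speaker GMM and return the
    speaker whose model gives the highest per-frame log-likelihood."""
    scores = {name: gmm.score(frames) for name, gmm in models.items()}
    return max(scores, key=scores.get)

# A test utterance drawn from "bob"'s distribution is attributed to bob.
test_utt = rng.normal(loc=+2.0, scale=1.0, size=(200, n_mfcc))
print(identify(test_utt, models))
```

Diagonal covariances are a common choice here: they keep per-speaker models cheap to train on the limited enrollment data a speaker typically provides.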