
All Questions

1 vote
1 answer
55 views

How to Process Data on GPU Instead of RAM for This Python Code?

I'm currently using the following code to process audio data, but it runs in RAM. I want to offload the processing to the GPU to improve performance. My code: def prepare_dataset(batch): ...
Elena Aston
0 votes
1 answer
302 views

Importing Whisper in Azure Databricks gives an import error for Numba, despite installing the right version of NumPy

I want to implement Whisper in Azure Databricks for a transcription task. Installing Whisper works fine. Importing the module with import whisper raises the following ImportError: Numba needs ...
Al Bundy
0 votes
0 answers
1k views

How to transcribe multiple audio files at once using a fine-tuned Whisper model?

TL;DR: I'm trying to transcribe multiple files together using a fine-tuned Hugging Face Whisper model and extract the output as a single text file. I have this code which works and transcribes an ...
user22347502
1 vote
2 answers
2k views

Hugging Face model not transcribing the entire length of the audio file

Brief: I'm unable to transcribe more than a few seconds of audio in a 5-minute audio file using a fine-tuned Hugging Face OpenAI Whisper model. I'm facing issues with transcribing an Indian local ...
user22347502
2 votes
2 answers
5k views

How to increase the maximum token size in a Hugging Face model?

I'm unable to get more than a few characters of transcription using a fine-tuned Whisper model. I'm using Google Colab to run this. I tried max_new_token but it is returning an error if I enter a ...
user22347502
1 vote
1 answer
910 views

How to use a model from Hugging Face to make an "app" that can transcribe a local Indian language?

I'm looking to use this fine-tuned Whisper model (https://huggingface.co/thennal/whisper-medium-ml) to make an app for personal use. I want to use Google Colaboratory to run this. I'm not a developer ...
user22347502
2 votes
1 answer
1k views

How to map word level timestamps to text of a given transcript?

I am currently developing a tool to visualize song lyrics. The tool computes the similarity in the phonetics of syllables and assigns a rhyme group to each syllable. Syllables belonging to the same ...
paulpelikan
1 vote
1 answer
598 views

Make Whisper use the LAST 30-second chunk (and not the first)

According to Whisper, the notion is as follows: Internally, the transcribe() method reads the entire file and processes the audio with a sliding 30-second window, performing autoregressive sequence-...
bonchardon
2 votes
1 answer
500 views

spaCy sentence separation with dictionary source from OpenAI Whisper / WhisperX?

WhisperX is a Whisper extension that does a really excellent job of speech-to-text with per-word timestamps. I'd like to use spaCy to split up the text strings into sensible clauses but maintain a ...
Dom I Yes