0 votes
0 answers
17 views

add a button for transcription to a Django form

I am building a website and have views with Django forms. I want to add a transcription button to each field in the Django form class PostForm(StylishForm): class Meta: model = Post ...
Abdalla Diaai's user avatar
-1 votes
0 answers
47 views

Python Speech-to-text using Whisper model to APK using Kivy, Buildozer

I'm working on speech-to-text using the Whisper model. It runs on my computer, but after conversion to an APK file it doesn't. The application builds and installs on Android successfully, but it does not open. ...
friend's user avatar
  • 1
0 votes
2 answers
891 views

KeyError: '__version__' installing openai-whisper on Python 3.13

Getting this error when I try to install Whisper on Python 3.13. I wasted a lot of time figuring out the problem. Can anyone help? I even tried reverting to a lower version, but that didn't help. ...
user9889742's user avatar
0 votes
0 answers
15 views

SSL: SSLV3_ALERT_BAD_RECORD_MAC sslv3 alert bad record mac (_ssl.c:2559) error while transcribing audio files from an API using whisper-turbo model

I am converting some audio calls coming from an API using OpenAI Whisper's latest large-turbo model. During conversion, at some point it gives this error ([SSL: SSLV3_ALERT_BAD_RECORD_MAC] sslv3 alert ...
Bhupendra Singh's user avatar
0 votes
1 answer
94 views

error: subprocess-exited-with-error for pip dependencies installation

I'm building a video generation automation project and I'm trying to use openai-whisper for the subtitle generation. Other dependencies are: #File for dependencies requests == 2.32.3 gTTS == 2.5.3 ...
Kashyap Sukshavasi's user avatar
1 vote
0 answers
123 views

Trying to install openai-whisper via Poetry, but getting "No module named 'whisper'"

Hello all: I am trying to add openai-whisper via Poetry. I have tried many steps including adding the git to pyproject.toml dependencies. [tool.poetry.dependencies] python = "^3.12" numpy = "...
John Honai's user avatar
0 votes
1 answer
142 views

How can I properly use whisper_timestamped in Python3

How can I use whisper-timestamped's transcribe method? I have an audio file "example.mp3" and I want to generate its timestamped transcription, as I have done my research OpenAI's whisper-...
Luka Varsimashvili's user avatar
0 votes
0 answers
28 views

Is there a way to capture the sound of specific browser tab?

I want to use Whisper to generate real-time subtitles while watching live streams, so I need the program to be able to capture the sound from the browser. However, I don't know how to do this. I know ...
キラル's user avatar
0 votes
0 answers
98 views

Whisper fine-tuning on multiple GPUs

I am fine-tuning a Whisper model, but it is running on only one GPU while the other 3 GPUs remain idle. The problem is that it is not utilizing the other 3 GPUs. I want to fine-tune the Whisper model ...
Nikhil's user avatar
  • 1
0 votes
0 answers
58 views

Azure AI Machine Learning Studio Crashes from faster whisper

So I am trying to run faster whisper in Azure AI Machine Learning Studio. The code is the sample code from https://github.com/SYSTRAN/faster-whisper and runs fine on my laptop (just CPU instead of GPU)...
Tensing2009's user avatar
0 votes
0 answers
30 views

TFWhisperForConditionalGeneration model.generate() returns repetitions of first word in sequence after finetuning

I fine-tuned TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny") on the German version of mozilla-foundation/common_voice_11_0. The training process looks fine (...
David's user avatar
  • 21
1 vote
1 answer
55 views

How to Process Data on GPU Instead of RAM for This Python Code?

I'm currently using the following code to process audio data, but it runs on the RAM. I want to offload the processing to the GPU to improve performance. My code: def prepare_dataset(batch): ...
Elena Aston's user avatar
1 vote
0 answers
121 views

Fine-Tuning Whisper for translate task "speech in custom dialect to translated text in another custom language"

I recently fine-tuned the Whisper-Tiny model on my custom speech dataset for transcription tasks, and it worked well. However, when I tried to fine-tune the model for a translation task using the same ...
maria labed's user avatar
0 votes
0 answers
72 views

Calculate FLOPs in Whisper model

I am trying to calculate the FLOP count of a single pass through the Whisper forward() function. I am using the thop library and getting 0 FLOP counts. I believe it's because in its register_hooks, ...
afsara_ben's user avatar
1 vote
0 answers
316 views

What causes the ECONNRESET error with OpenAI libraries?

I have an app using HTML and the OpenAI Whisper model. Everything works as intended; however, I keep getting an error when it tries to fetch the text output from the OpenAI servers. I started the web ...
vivann's user avatar
  • 11
0 votes
0 answers
83 views

How to get word-level AND sentence-level transcripts with OpenAI whisper?

I am using whisper and need to provide accurate results to my end users. The 2 options are: Using segments granularities, I get sentences but no word timestamps: { "id": 0, "seek"...
Yohan Attal's user avatar
1 vote
0 answers
28 views

Finetune TFWhisperForConditionalGeneration

Hi everyone, I'm currently searching for a way to fine-tune the Hugging Face TFWhisperForConditionalGeneration model (https://huggingface.co/docs/transformers/en/model_doc/whisper) to get a model in ....
David's user avatar
  • 21
0 votes
3 answers
206 views

OpenAI API returns 400 Error: cannot transcribe ogg file

I want to create an async function for transcribing Telegram voice messages (.ogg files), but when I try to send a request to the OpenAI API I receive an error: Error code: 400 - {'error': {'message': "Unrecognized file ...
Антон Кокорин's user avatar
0 votes
1 answer
199 views

Azure OpenAI Whisper REST endpoints

I wanted to use Whisper deployed through Azure OpenAI, but I am having trouble finding the right resources for it. I am trying to integrate a translator using Whisper in a Flutter app that will take ...
Dhruv Sinha's user avatar
0 votes
0 answers
121 views

Getting chunk level output with start and end timestamps with Whisper

I am using the Whisper3 model to transcribe several audio files. However, the output I am getting is in the form of a tensor. I would like to obtain text chunks with corresponding start and end ...
Meghana S's user avatar
0 votes
1 answer
318 views

Openai whisper error with cmd and Google Colaboratory ("Traceback"...)

I tried to install whisper, but I am getting the error messages (below) with Command Line and even with Google Colaboratory (Google Drive). I tried mp4, mp3 and wav files, but I always get error ...
Waleson Lopes's user avatar
0 votes
0 answers
25 views

Whisper API transcribing in Firefox but not in Chrome

I want to transcribe my audio into text, so I am using the Whisper v3-large model's API. I first send my audio file to the server, where I use multer to allow for my request object to access the file ...
Abhijit Saha's user avatar
0 votes
1 answer
371 views

I would like to revert to the version of the whisperx model that I learned

When I run whisperx, it tells me that it is different from the version that once trained the model. Also, I cannot install the old version. This is my code def use_whisperx(audio_file_name): ...
Cliper's user avatar
  • 1
0 votes
0 answers
62 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...
dw26's user avatar
  • 1
0 votes
0 answers
320 views

Why does the faster_whisper model kill the kernel unexpectedly after running a few instances?

I am trying to convert some audio data into text using OpenAI Whisper. The larger model's accuracy is very good, but it is very slow to process the audio. But then I found faster-whisper ...
Bhupendra Singh's user avatar
0 votes
0 answers
114 views

Issue with OpenAI's transcription API

I'm having some issues with OpenAI's endpoint: https://api.openai.com/v1/audio/transcriptions I have an audio file that I'm pulling from Telegram's API. It's a .m4a file. I'm not getting errors when I'...
Jamie Laden's user avatar
0 votes
0 answers
34 views

How do I access non-python files using directories in Python for the whisper library?

I'm trying to make an auto video app using the whisper module. I can't figure out how to access non-python files in python. Code is here: import whisper # type: ignore model = whisper.load_model("...
Zane Sites's user avatar
0 votes
0 answers
119 views

Retrieving Token and Cost Information for OpenAI's Whisper and Text-to-Speech (TTS) Models in Langsmith?

Is it possible to retrieve token and cost information for OpenAI's Audio and TTS models in LangSmith runs while tracing? OpenAI Audio Models I am able to retrieve this information for chat completion ...
Talal Khan's user avatar
0 votes
0 answers
63 views

I am getting multiple errors while running a streaming model for audio. I am using insanely-fast-whisper along with the given model

Code is: import torch from transformers import pipeline from transformers.utils import is_flash_attn_2_available pipe = pipeline( "automatic-speech-recognition", model = "...
M SAQLAIN's user avatar
0 votes
1 answer
540 views

How can I use PowerShell to determine the active graphics card with the highest VRAM capacity?

Okay, I'm not one to ask for help, but I've been struggling for days with a feature I want to implement in a private OpenAI Whisper project. To put it in context, I'm doing a project to automate ...
SkayDev's user avatar
  • 11
0 votes
0 answers
111 views

I can't compile Whisper AI into an EXE file with PyInstaller

I'm not sure if it's an ffmpeg issue or a Whisper AI issue, but after compiling my .py file with PyInstaller into one exe file using this command pyinstaller --onefile --add-binary="ffmpeg.exe;."...
das saew's user avatar
0 votes
0 answers
26 views

Is it possible to add an audio stream to an already existing audio and video stream that is being recorded by MediaRecorder?

I am building an interview application where I am interrogated by an AI and I have to answer questions. The interview is recorded using MediaRecorder. The problem is that the audio that is generated ...
Terraflow's user avatar
0 votes
0 answers
142 views

whisper model generate next token based on decoder hidden states

In the huggingface-whisper implementation, in super().generate() function, the initial decoder token ids are passed to predict next token. In shortform, I am unclear how the generate() is happening ...
afsara_ben's user avatar
0 votes
0 answers
95 views

Issue with Encoding Audio Buffer Data to Opus Size Being Incorrect in iOS Swift App

I'm trying to encode frames from an audio buffer to Opus data in a Swift iOS app, and send them to be encoded on a server by Whisper. The error I keep getting back from the server is opuslib....
narner's user avatar
  • 3,221
0 votes
1 answer
271 views

How to use Azure OpenAI SDK with Whisper for speech-to-text translation

Using: <PackageReference Include="Azure.AI.OpenAI" Version="2.0.0-beta.2" /> I have this simple code: var client = new OpenAIClient( new ApiKeyCredential(apiKey), ...
Sean's user avatar
  • 15.1k
0 votes
1 answer
47 views

How to recreate the "view" features of common voice v11 in HuggingFace?

The Common Voice v11 on HuggingFace has some amazing View features! They include a dropdown button to select the language, and columns with the dataset features, such as client_id, audio, sentence, ...
Michel Mesquita's user avatar
1 vote
0 answers
109 views

How to convert a .wav file to a numpy array and then back to a .wav file without losing quality or adding noise?

I seem to be unable to convert a .wav file to a numpy array and then back to .wav. The audio gets too noisy. I tried using scipy.io.wavfile.read() and librosa. The whole reason I am trying to do ...
Ali isayev's user avatar
0 votes
0 answers
238 views

OpenAI Whisper split audio / video File into chunks of 25MB in node js

I currently have an object of type File and can access its buffer which I want to split into chunks of 25MB due to whisper's limitations. However I tried with subarray but only the first chunk could ...
Kagos's user avatar
  • 23
0 votes
0 answers
119 views

How to input MediaRecorder webm opus bytes to Whisper model?

I am recording voice, on the client side, using MediaRecorder, and sending the resulting blob of (webm, opus) bytes to the server using a WebSocket, with this code: <script type="text/...
Bob Bobson's user avatar
0 votes
1 answer
44 views

Any way to get more information about WAV files?

I have a faster-whisper model and two files. When I put the first one into the model, it returns nonsense subtitles, but when I put the second file into the model, it works perfectly. import wave ...
SOTERSOTERSOTERSOTERSOTERSOTER's user avatar
1 vote
3 answers
65 views

TypeError, can't handle two datatypes

I can't convert received HLS bytes into some datatype that openAI whisper can process. Here I've got HLS via streamlink and I need some way to convert audio part of this HLS-bytes into correct numpy....
SOTERSOTERSOTERSOTERSOTERSOTER's user avatar
0 votes
0 answers
89 views

Whisper transcribes Ukrainian speech, but writes it in Latin characters. How to fix it?

I wrote code for transcribing Ukrainian speech. Whisper transcribes it, but the saved file is written in Latin characters. I also tried it without setting the language to Ukrainian. model = whisper....
McGuffin's user avatar
0 votes
0 answers
141 views

CUDA Out of Memory Error when Chunking Audio and Running Whisper Transcription in Python

I'm working on a Python script that transcribes audio files using a library that leverages CUDA for GPU acceleration. The transcription process involves chunking the audio file into smaller segments ...
Ruffy's user avatar
  • 1
2 votes
1 answer
190 views

Is there any way to transliterate Hindi audio to English using OpenAI Whisper?

I have a task where, given an audio file, I have to perform speaker diarization on it and then perform the transcription accordingly. For speaker diarization I am using pyannote, ...
Chaitanya Kale's user avatar
0 votes
0 answers
77 views

Not sure how to resolve this issue with the python code

I'm using Whisper to transcribe audio to text. I decided to use distil-whisper to speed it up. I've been trying to follow the instructions on Hugging Face but keep getting an error. I'm running this ...
heeby89's user avatar
  • 43
1 vote
0 answers
654 views

Where is the Bottleneck for multiple requests using Whisper on Nvidia A100

I want to use Whisper-Large-v3 (Speech-to-Text) for a real-time application. However, I want to process several requests at the same time. My Whisper instance runs on an Nvidia A100 with 80GB VRAM. In ...
leon's user avatar
  • 11
0 votes
1 answer
377 views

OpenAI Whisper API: How do I shorten the response latency? How do I make Whisper respond quicker?

I am making a voice chatbot. The issue is that I needed to shorten the delay between finishing my sentence while speaking to the bot and the start of the bot's response, which currently takes about 6 ...
Samagra Shrivastava's user avatar
0 votes
0 answers
115 views

Whisper for 100 users concurrently using websockets

import asyncio import websockets import numpy as np import whisper import torch import threading import time from urllib.parse import parse_qs # Load Whisper model instances globally model_pool = [...
Saboor Niazi's user avatar
1 vote
1 answer
555 views

How to process audio with Whisper in Rust

I have a Tauri app that records a user's voice and sends an audio/webm base64-encoded string to a backend Rust function to process it with OpenAI's Whisper model. It mostly works, except for the fact ...
itsisaac19's user avatar
0 votes
0 answers
662 views

How to create a live voice activity detection (VAD) loop for whisperX?

I am using whisperX speech-to-text model to convert my voice into text input for a locally hosted LLM. Right now, I have it set up where I can record an audio file, and then load it into whisperX. I ...
Jay Ferments's user avatar
