Newest 'openai-whisper' Questions

0 votes

0 answers

17 views

add a button for transcription to a Django form

I am building a website and have views with Django forms. I want to add a transcription button to each field in the Django form class PostForm(StylishForm): class Meta: model = Post ...

Abdalla Diaai

1

asked Dec 5 at 6:38

-1 votes

0 answers

47 views

Python Speech-to-text using Whisper model to APK using Kivy, Buildozer

I'm working on speech-to-text using the Whisper model. It runs on my computer, but after conversion to an APK file it doesn't. The application builds and installs on Android successfully, but it does not open. ...

friend

1

asked Dec 5 at 5:49

0 votes

2 answers

891 views

KeyError: 'version' installing openai-whisper on Python 3.13

Getting this error when I try to download Whisper on Python version 3.13 - I wasted a lot of time figuring out the problem - can anyone help? I even tried reverting to a lower version but it didn't help. ...

user9889742

11

asked Nov 10 at 22:25

0 votes

0 answers

15 views

SSL: SSLV3_ALERT_BAD_RECORD_MAC sslv3 alert bad record mac (_ssl.c:2559) error while transcribing audio files from an API using whisper-turbo model

I am converting some audio calls coming from an API using OpenAI Whisper's latest large-turbo model. During conversion at some point, it gives this error ([SSL: SSLV3_ALERT_BAD_RECORD_MAC] sslv3 alert ...

Bhupendra Singh

21

asked Nov 8 at 18:06

0 votes

1 answer

94 views

error: subprocess-exited-with-error for pip dependencies installation

I'm building a video generation automation project and I'm trying to use openai-whisper for the subtitle generation. Other dependencies are: #File for dependencies requests == 2.32.3 gTTS == 2.5.3 ...

Kashyap Sukshavasi

3

asked Nov 6 at 18:56

1 vote

0 answers

123 views

Trying to install openai-whisper via Poetry, but getting "No module named 'whisper'"

Hello all: I am trying to add openai-whisper via Poetry. I have tried many steps including adding the git to pyproject.toml dependencies. [tool.poetry.dependencies] python = "^3.12" numpy = ...

John Honai

35

asked Oct 18 at 20:08

0 votes

1 answer

142 views

How can I properly use whisper_timestamped in Python3

How can I use whisper-timestamped's transcribe method? I have an audio file "example.mp3" and want to generate its timestamped transcription. From my research, OpenAI's whisper-...

Luka Varsimashvili

1

asked Oct 18 at 18:07

0 votes

0 answers

28 views

Is there a way to capture the sound of a specific browser tab?

I want to use Whisper to generate real-time subtitles while watching live streams, so I need the program to be able to capture the sound from the browser. However, I don't know how to do this. I know ...

キラル

27

asked Oct 18 at 11:49

0 votes

0 answers

98 views

Whisper fine-tuning on multiple GPUs

I am fine-tuning a Whisper model, but it runs on only one GPU while the other 3 GPUs remain idle. The problem is that it is not utilizing the other 3 GPUs. I want to fine-tune the Whisper model ...

Nikhil

1

asked Oct 4 at 14:35

0 votes

0 answers

58 views

Azure AI Machine Learning Studio Crashes from faster whisper

So I am trying to run faster whisper in Azure AI Machine Learning Studio. The code is the sample code from https://github.com/SYSTRAN/faster-whisper and runs fine on my laptop (just CPU instead of GPU)...

Tensing2009

97

asked Sep 11 at 11:21

0 votes

0 answers

30 views

TFWhisperForConditionalGeneration model.generate() returns repetitions of first word in sequence after finetuning

I fine-tuned TFWhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny") on the German version of mozilla-foundation/common_voice_11_0. The training process looks fine (...

David

21

asked Sep 11 at 11:00

1 vote

1 answer

55 views

How to Process Data on GPU Instead of RAM for This Python Code?

I'm currently using the following code to process audio data, but it runs in RAM. I want to offload the processing to the GPU to improve performance. My code: def prepare_dataset(batch): ...

Elena Aston

15

asked Aug 27 at 8:03

1 vote

0 answers

121 views

Fine-Tuning Whisper for translate task "speech in custom dialect to translated text in another custom language"

I recently fine-tuned the Whisper-Tiny model on my custom speech dataset for transcription tasks, and it worked well. However, when I tried to fine-tune the model for a translation task using the same ...

maria labed

11

asked Aug 24 at 14:47

0 votes

0 answers

72 views

calculate flops in Whisper model

I am trying to calculate the FLOP count of a single pass through the Whisper forward() function. I am using the thop library and getting 0 FLOP counts. I believe it's because in its register_hooks, ...

afsara_ben

662

asked Aug 19 at 6:31

1 vote

0 answers

316 views

What causes the ECONNRESET error with OpenAI libraries?

I have an app using HTML and the OpenAI Whisper model. Everything works as intended; however, I keep getting an error when it tries to fetch the text output from the OpenAI servers. I started the web ...

vivann

11

asked Aug 13 at 4:27

0 votes

0 answers

83 views

How to get word-level AND sentence-level transcripts with OpenAI whisper?

I am using whisper and need to provide accurate results to my end users. The 2 options are: Using segments granularities, I get sentences but no word timestamps: { "id": 0, "seek"...

Yohan Attal

37

asked Aug 8 at 8:40

1 vote

0 answers

28 views

Finetune TFWhisperForConditionalGeneration

Hi everyone, I'm currently searching for a way to fine-tune the Hugging Face TFWhisperForConditionalGeneration model (https://huggingface.co/docs/transformers/en/model_doc/whisper) to get a model in ....

David

21

asked Aug 6 at 12:27

0 votes

3 answers

206 views

OpenAI API return 400 Error: cannot transcribe ogg file

I want to create an async function for transcribing Telegram voice messages (.ogg files), but when I try to send a request to the OpenAI API I receive an error: Error code: 400 - {'error': {'message': "Unrecognized file ...

Антон Кокорин

11

asked Aug 1 at 13:04

0 votes

1 answer

199 views

Azure OpenAI Whisper REST endpoints

I wanted to use Whisper deployed through Azure OpenAI but I am having trouble finding the right resources for it. I am trying to integrate a translator using Whisper in a Flutter app that will take ...

Dhruv Sinha

13

asked Jul 31 at 10:41

0 votes

0 answers

121 views

Getting chunk level output with start and end timestamps with Whisper

I am using the Whisper3 model to transcribe several audio files. However, the output I am getting is in the form of a tensor. I would like to obtain text chunks with corresponding start and end ...

Meghana S

87

asked Jul 31 at 7:34

0 votes

1 answer

318 views

Openai whisper error with cmd and Google Colaboratory ("Traceback"...)

I tried to install whisper, but I am getting the error messages (below) with Command Line and even with Google Colaboratory (Google Drive). I tried mp4, mp3 and wav files, but I always get error ...

Waleson Lopes

1

asked Jul 26 at 14:34

0 votes

0 answers

25 views

Whisper API transcribing in Firefox but not in Chrome

I want to transcribe my audio into text, so I am using the Whisper v3-large model's API. I first send my audio file to the server, where I use multer to allow my request object to access the file ...

Abhijit Saha

11

asked Jul 18 at 6:30

0 votes

1 answer

371 views

I would like to revert to the version of the whisperx model that I learned

When I run whisperx, it tells me that it is different from the version that once trained the model. Also, I cannot install the old version. This is my code def use_whisperx(audio_file_name): ...

Cliper

1

asked Jul 17 at 4:46

0 votes

0 answers

62 views

Issue with Data Preprocessing and Tensor Concatenation for Whisper Model Training

I am trying to train a Whisper model for Jeju dialect speech recognition. However, I am encountering several errors related to tensor concatenation during the data preprocessing phase. Below is the ...

dw26

1

asked Jul 17 at 1:45

0 votes

0 answers

320 views

Why does the faster_whisper model kill the kernel unexpectedly after running a few instances?

I am trying to convert some audio data into text using OpenAI Whisper. The larger model's accuracy is very good, but it is very slow to process the audio. But then I found faster-whisper ...

Bhupendra Singh

21

asked Jul 11 at 11:26

0 votes

0 answers

114 views

Issue with OpenAI's transcription API

I'm having some issues with OpenAI's endpoint: https://api.openai.com/v1/audio/transcriptions I have an audio file that I'm pulling from Telegram's API. It's a .m4a file. I'm not getting errors when I'...

Jamie Laden

53

asked Jul 4 at 6:31

0 votes

0 answers

34 views

How do I access non-python files using directories in Python for the whisper library?

I'm trying to make an auto video app using the whisper module. I can't figure out how to access non-python files in python. Code is here: import whisper # type: ignore model = whisper.load_model("...

Zane Sites

13

asked Jul 1 at 4:55

0 votes

0 answers

119 views

Retrieving Token and Cost Information for OpenAI's Whisper and Text-to-Speech (TTS) Models in Langsmith?

Is it possible to retrieve token and cost information for OpenAI's Audio and TTS models in LangSmith runs while tracing? OpenAI Audio Models I am able to retrieve this information for chat completion ...

Talal Khan

1

asked Jun 28 at 6:39

0 votes

0 answers

63 views

I am getting multiple errors while running a streaming model for audio. I am using insanely-fast-whisper along with the given model

Code is: import torch from transformers import pipeline from transformers.utils import is_flash_attn_2_available pipe = pipeline( "automatic-speech-recognition", model = "...

M SAQLAIN

3

asked Jun 27 at 10:04

0 votes

1 answer

540 views

How can I use PowerShell to determine the active graphics card with the highest VRAM capacity?

Okay, I'm not one to ask for help, but I've been struggling for days with a feature I want to implement in a private OpenAI Whisper project. To put it in context, I'm doing a project to automate ...

SkayDev

11

asked Jun 26 at 12:46

0 votes

0 answers

111 views

I can't compile Whisper AI into an EXE file with PyInstaller

I'm not sure if it's an ffmpeg issue or a Whisper AI issue, but after compiling my .py file with PyInstaller into one exe file using this command pyinstaller --onefile --add-binary="ffmpeg.exe;....

das saew

1

asked Jun 24 at 9:19

0 votes

0 answers

26 views

Is it possible to add an audio stream to an already existing audio and video stream that is being recorded by MediaRecorder?

I am building an interview application where I am interrogated by an AI and I have to answer questions. The interview is recorded using MediaRecorder. The problem is that the audio that is generated ...

Terraflow

1

asked Jun 22 at 13:00

0 votes

0 answers

142 views

whisper model generate next token based on decoder hidden states

In the huggingface-whisper implementation, in super().generate() function, the initial decoder token ids are passed to predict next token. In shortform, I am unclear how the generate() is happening ...

afsara_ben

662

asked Jun 21 at 22:40

0 votes

0 answers

95 views

Issue with Encoding Audio Buffer Data to Opus Size Being Incorrect in iOS Swift App

I'm trying to encode frames from an audio buffer to Opus data in a Swift iOS app, and send them to be encoded on a server by Whisper. The error I keep getting back from the server is opuslib....

narner

3,221

asked Jun 18 at 19:20

0 votes

1 answer

271 views

How to use Azure OpenAI SDK with Whisper for speech-to-text translation

Using: <PackageReference Include="Azure.AI.OpenAI" Version="2.0.0-beta.2" /> I have this simple code: var client = new OpenAIClient( new ApiKeyCredential(apiKey), ...

Sean

15.1k

asked Jun 18 at 13:46

0 votes

1 answer

47 views

How to recreate the "view" features of common voice v11 in HuggingFace?

The Common Voice v11 on HuggingFace has some amazing View features! They include a dropdown button to select the language, and columns with the dataset features, such as client_id, audio, sentence, ...

Michel Mesquita

783

asked Jun 18 at 10:54

1 vote

0 answers

109 views

How to convert a .wav file to a numpy array and then back to a .wav file format without losing quality/having the same audio without any noise?

I seem to be unable to convert a .wav file to a numpy array, and then back to .wav. The audio gets too noisy. I tried using scipy.io.wavfile.read(), and librosa. The whole reason I am trying to do ...

Ali isayev

11

asked Jun 16 at 21:10

0 votes

0 answers

238 views

OpenAI Whisper split audio / video File into chunks of 25MB in Node.js

I currently have an object of type File and can access its buffer which I want to split into chunks of 25MB due to whisper's limitations. However I tried with subarray but only the first chunk could ...

Kagos

23

asked Jun 15 at 21:11

0 votes

0 answers

119 views

How to input MediaRecorder webm opus bytes to Whisper model?

I am recording voice, on the client side, using MediaRecorder, and sending the resulting blob of (webm, opus) bytes to the server using a WebSocket, with this code: <script type="text/...

Bob Bobson

1

asked Jun 9 at 21:22

0 votes

1 answer

44 views

Any way to get a more detailed description of wav files?

I have a faster-whisper model and two files. When I put the first one into the model, it returns nonsense subtitles, but when I put the second file into the model it works perfectly. import wave ...

SOTERSOTERSOTERSOTERSOTERSOTER

11

asked Jun 7 at 21:23

1 vote

3 answers

65 views

TypeError, can't handle two datatypes

I can't convert received HLS bytes into a datatype that OpenAI Whisper can process. Here I've got HLS via streamlink and I need some way to convert the audio part of these HLS bytes into a correct numpy....

SOTERSOTERSOTERSOTERSOTERSOTER

11

asked Jun 7 at 2:36

0 votes

0 answers

89 views

Whisper transcribes Ukrainian speech, but writes it in Latin characters. How to fix it?

I wrote code to transcribe Ukrainian speech. Whisper transcribes it, but the saved file is written in Latin characters. I also tried it without setting the language to Ukrainian. model = whisper....

McGuffin

1

asked Jun 4 at 11:31

0 votes

0 answers

141 views

CUDA Out of Memory Error when Chunking Audio and Running Whisper Transcription in Python

I'm working on a Python script that transcribes audio files using a library that leverages CUDA for GPU acceleration. The transcription process involves chunking the audio file into smaller segments ...

Ruffy

1

asked Jun 4 at 9:54

2 votes

1 answer

190 views

Is there any way to transliterate Hindi audio to English using OpenAI Whisper

I have a task where, given an audio file, I have to perform speaker diarization on the audio file and then perform the transcription accordingly. For speaker diarization I am using pyannote, ...

Chaitanya Kale

21

asked May 31 at 6:12

0 votes

0 answers

77 views

Not sure how to resolve this issue with the python code

I'm using Whisper to transcribe audio to text. I decided to use distil-whisper to speed it up. I've been trying to follow the instructions on Hugging Face but keep getting an error. I'm running this ...

heeby89

43

asked May 30 at 3:29

1 vote

0 answers

654 views

Where is the Bottleneck for multiple requests using Whisper on Nvidia A100

I want to use Whisper-Large-v3 (Speech-to-Text) for a real-time application. However, I want to process several requests at the same time. My Whisper instance runs on an Nvidia A100 with 80GB VRAM. In ...

leon

11

asked May 29 at 7:05

0 votes

1 answer

377 views

OpenAI Whisper API: How do I shorten the response latency? How do I make Whisper respond quicker?

I am making a voice chatbot. The issue is that I needed to shorten the delay between finishing my sentence while speaking to the bot and the start of the bot's response, which currently takes about 6 ...

Samagra Shrivastava

13

asked May 29 at 6:39

0 votes

0 answers

115 views

Whisper for 100 users concurrently using websockets

import asyncio import websockets import numpy as np import whisper import torch import threading import time from urllib.parse import parse_qs # Load Whisper model instances globally model_pool = [...

Saboor Niazi

1

asked May 27 at 15:01

1 vote

1 answer

555 views

How to process audio with Whisper in Rust

I have a Tauri app that records a user's voice and sends an audio/webm base64-encoded string to a backend Rust function to process it with OpenAI's Whisper model. It mostly works, except for the fact ...

itsisaac19

540

asked May 24 at 20:20

0 votes

0 answers

662 views

How to create a live voice activity detection (VAD) loop for whisperX?

I am using the whisperX speech-to-text model to convert my voice into text input for a locally hosted LLM. Right now, I have it set up so that I can record an audio file and then load it into whisperX. I ...

Jay Ferments

1

asked May 17 at 16:35
