vanshnawander/whisper-small-hindi-asr
This is a fine-tuned version of openai/whisper-small for Hindi automatic speech recognition (ASR).
Model Description
- Base Model: openai/whisper-small
- Language: Hindi (hi)
- Task: Automatic Speech Recognition (transcribe)
- Training Data: ai4bharat/Kathbath
- Fine-tuning Framework: Transformers + Custom DALI Pipeline
Evaluation Results
The model was evaluated on the LAHAJA benchmark, a multi-accent Hindi ASR benchmark with 12.5 hours of audio from 132 speakers across 83 districts of India. Lower WER (word error rate) and CER (character error rate) are better.
| Model | WER | CER | Relative WER reduction |
|---|---|---|---|
| Base (whisper-small) | 145.67% | 101.57% | - |
| This Model | 36.17% | 11.36% | 75.2% |
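The figures in the table can be read with the help of a small sketch. It assumes the standard edit-distance definitions of WER and CER (errors divided by reference length), which is why a rate can exceed 100% when the hypothesis contains many inserted tokens. The helper names below (`edit_distance`, `wer`, `cer`, `relative_reduction`) are illustrative and not part of this model's evaluation code. The text normalization used for the reported numbers is not stated here, so this is not a reproduction of the benchmark script.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate: word-level edits divided by reference word count."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate over Unicode code points (an assumption;
    evaluation tools may normalize or segment Devanagari differently)."""
    return edit_distance(list(reference), list(hypothesis)) / len(reference)


def relative_reduction(baseline, improved):
    """Relative reduction in an error rate, in percent."""
    return (baseline - improved) / baseline * 100


# Three inserted words against a one-word reference give a WER of 300%.
print(wer("नमस्ते", "नमस्ते जी आप कैसे"))  # 3.0

# Relative reductions implied by the table above.
print(round(relative_reduction(145.67, 36.17), 1))  # 75.2 (WER)
print(round(relative_reduction(101.57, 11.36), 1))  # 88.8 (CER)
```

The 75.2% in the last column matches the relative WER reduction. The same calculation on CER gives roughly 88.8%.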
Usage
Basic Usage
from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa
# Load model and processor
processor = WhisperProcessor.from_pretrained("vanshnawander/whisper-small-hindi-asr")
model = WhisperForConditionalGeneration.from_pretrained("vanshnawander/whisper-small-hindi-asr")
# Load audio
audio, sr = librosa.load("audio.wav", sr=16000)
# Transcribe
input_features = processor(audio, sampling_rate=16000, return_tensors="pt").input_features
generated_ids = model.generate(input_features, language="hi", task="transcribe")
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(transcription)
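Whisper works on 30-second windows, and with default settings the processor pads or truncates a single input to that length, so the basic usage above is best suited to short clips. For longer recordings, one naive option is to split the waveform into consecutive windows and transcribe each in turn. The helper below is a hypothetical sketch (the name `chunk_bounds` is not part of Transformers) that only computes the window boundaries. It uses no overlap, so words that straddle a boundary may be cut.

```python
def chunk_bounds(num_samples, sample_rate=16000, chunk_s=30.0):
    """Return (start, end) sample indices for consecutive,
    non-overlapping windows covering num_samples samples."""
    chunk_len = int(chunk_s * sample_rate)
    if chunk_len <= 0:
        raise ValueError("chunk_s * sample_rate must be at least one sample")
    return [
        (start, min(start + chunk_len, num_samples))
        for start in range(0, num_samples, chunk_len)
    ]


# A 65-second clip at 16 kHz yields two full windows and one 5-second tail.
print(chunk_bounds(65 * 16000))
# [(0, 480000), (480000, 960000), (960000, 1040000)]
```

Each `audio[start:end]` slice could then be passed through the processor and `model.generate` as above, joining the decoded strings afterwards. The pipeline with `chunk_length_s` handles chunking for you and is likely the simpler choice for long audio.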
Using Pipeline
from transformers import pipeline
pipe = pipeline(
    "automatic-speech-recognition",
    model="vanshnawander/whisper-small-hindi-asr",
    chunk_length_s=30,
)
result = pipe("audio.wav", generate_kwargs={"language": "hi", "task": "transcribe"})
print(result["text"])
Limitations
- Optimized for Hindi speech; may not perform well on other languages
- Best performance on clear audio with minimal background noise
- May struggle with very fast speech or heavy code-mixing