vanshnawander/whisper-small-hindi-asr

This is a fine-tuned version of openai/whisper-small for Hindi automatic speech recognition (ASR).

Model Description

Base Model: openai/whisper-small
Language: Hindi (hi)
Task: Automatic Speech Recognition (transcribe)
Training Data: ai4bharat/Kathbath
Fine-tuning Framework: Transformers + Custom DALI Pipeline

Evaluation Results

Self-reported results on the LAHAJA benchmark, a multi-accent Hindi ASR benchmark with 12.5 hours of audio from 132 speakers across 83 districts of India.

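| Model | WER | CER | Relative WER reduction |
|---|---|---|---|
| Base (whisper-small) | 145.67% | 101.57% | - |
| This Model | 36.17% | 11.36% | 75.2% |

WER and CER can exceed 100% because inserted words or characters count as errors, so a hypothesis much longer than the reference can contain more errors than the reference has tokens.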
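To make the metrics concrete, here is a minimal sketch of how WER, CER, and the relative reduction can be computed. It assumes plain whitespace tokenization for words and Unicode code points for characters; the actual LAHAJA evaluation may apply text normalization that this sketch omits. The function names are illustrative and not part of this model's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def wer(reference, hypothesis):
    """Word error rate in percent (assumes whitespace-separated words)."""
    ref_words = reference.split()
    return 100.0 * edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference, hypothesis):
    """Character error rate in percent (assumes one code point per character)."""
    return 100.0 * edit_distance(reference, hypothesis) / len(reference)


def relative_reduction(baseline, improved):
    """Relative error reduction in percent."""
    return 100.0 * (baseline - improved) / baseline


# Two extra inserted words plus one substitution against a 2-word reference
print(wer("a b", "a x y z"))                        # 150.0
print(round(relative_reduction(145.67, 36.17), 1))  # 75.2
```

The first call shows how a hypothesis with many insertions pushes WER above 100%; the second reproduces the 75.2% figure from the WER column of the table.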

Usage

Basic Usage

from transformers import WhisperProcessor, WhisperForConditionalGeneration
import librosa

# Load model and processor
processor = WhisperProcessor.from_pretrained("vanshnawander/whisper-small-hindi-asr")
model = WhisperForConditionalGeneration.from_pretrained("vanshnawander/whisper-small-hindi-asr")

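# Load audio as 16 kHz mono, the sampling rate Whisper expects
audio, sr = librosa.load("audio.wav", sr=16000)

# Extract input features and transcribe in Hindi.
# Input longer than 30 seconds is truncated here; use the pipeline below for long files.
input_features = processor(audio, sampling_rate=sr, return_tensors="pt").input_features
generated_ids = model.generate(input_features, language="hi", task="transcribe")
transcription = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]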

print(transcription)

Using Pipeline

from transformers import pipeline

pipe = pipeline(
    "automatic-speech-recognition",
    model="vanshnawander/whisper-small-hindi-asr",
    chunk_length_s=30,
)

result = pipe("audio.wav", generate_kwargs={"language": "hi", "task": "transcribe"})
print(result["text"])
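Whisper processes audio in 30-second windows, which is why the pipeline example sets chunk_length_s=30. The sketch below illustrates what that means in samples at 16 kHz by splitting a signal into non-overlapping windows. It is a simplified illustration only: the pipeline's own chunking also overlaps neighbouring windows and merges the results, which this sketch does not do, and split_into_chunks is a hypothetical helper, not a Transformers function.

```python
SAMPLE_RATE = 16_000   # Hz, the rate used in the usage examples
CHUNK_LENGTH_S = 30    # seconds, matching chunk_length_s in the pipeline call


def split_into_chunks(samples, sample_rate=SAMPLE_RATE, chunk_length_s=CHUNK_LENGTH_S):
    """Split a sequence of samples into consecutive windows of chunk_length_s seconds.

    The last window may be shorter than the others.
    """
    chunk_size = int(sample_rate * chunk_length_s)
    return [samples[i:i + chunk_size] for i in range(0, len(samples), chunk_size)]


# 70 seconds of silence stands in for a long recording
samples = [0.0] * (70 * SAMPLE_RATE)
chunks = split_into_chunks(samples)
print([len(c) / SAMPLE_RATE for c in chunks])  # [30.0, 30.0, 10.0]
```

Each full window holds 480,000 samples. In practice the pipeline handles this internally, so manual chunking is only needed when calling model.generate directly on long audio.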

Limitations

Optimized for Hindi speech; may not perform well on other languages
Best performance on clear audio with minimal background noise
May struggle with very fast speech or heavy code-mixing

Model size: 0.2B params
Tensor type: F32 (Safetensors)
