About This Project

Introduction: This tool classifies English accents from the audio track of a video to assist in candidate screening. It returns a predicted accent and a confidence score to help HR teams or interviewers gauge language proficiency.

Challenge: The tool must accept a public video URL, extract the audio, identify the accent, and return a confidence percentage. It is built to be simple, reliable, and quick for internal hiring support.
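The input and output described above can be sketched as a small contract. This is only an illustration: the names AccentResult, is_http_url, and make_result are hypothetical and not taken from the project, and the label used in the usage note is a placeholder.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class AccentResult:
    """Hypothetical response shape: an accent label plus a confidence percentage."""
    accent: str
    confidence_pct: float


def is_http_url(url: str) -> bool:
    """Return True if the string looks like an absolute http(s) URL.

    This only checks the URL's shape; it does not verify that the
    video is actually public or reachable.
    """
    parts = urlparse(url.strip())
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def make_result(label: str, probability: float) -> AccentResult:
    """Convert a probability in [0, 1] into a percentage with one decimal."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    return AccentResult(accent=label, confidence_pct=round(probability * 100, 1))
```

For example, make_result("us", 0.5) yields a result with confidence_pct equal to 50.0, and is_http_url rejects strings without an http or https scheme.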

Tech Stack: Backend in Python (RunPod, TorchAudio, SpeechBrain, Hugging Face). Frontend in static HTML/CSS/JS. Model used: Jzuluaga/accent-id-commonaccent_xlsr-en-english.

How It Works: Audio is extracted with yt-dlp, converted to 16 kHz mono, and analyzed by a Wav2Vec2-based classifier. The output is an accent label and a confidence score.
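The extraction and conversion steps might look like the sketch below. It assumes yt-dlp and ffmpeg are installed and on the PATH. Using ffmpeg for the 16 kHz mono conversion is an assumption; the project may resample with TorchAudio instead. The function names and file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_download_cmd(url: str, out_template: str) -> list[str]:
    """Build a yt-dlp command that keeps only a WAV audio track."""
    return [
        "yt-dlp",
        "--no-playlist",          # download a single video, not a playlist
        "-x",                     # extract audio only
        "--audio-format", "wav",  # convert the audio to WAV (requires ffmpeg)
        "-o", out_template,       # output template, e.g. "raw.%(ext)s"
        url,
    ]


def build_resample_cmd(src: str, dst: str) -> list[str]:
    """Build an ffmpeg command that converts audio to 16 kHz mono."""
    return ["ffmpeg", "-y", "-i", src, "-ac", "1", "-ar", "16000", dst]


def extract_audio(url: str, workdir: str) -> Path:
    """Download the audio and return the path to a 16 kHz mono WAV file."""
    work = Path(workdir)
    raw = work / "raw.wav"             # hypothetical intermediate file name
    out = work / "audio_16k_mono.wav"  # hypothetical output file name
    subprocess.run(build_download_cmd(url, str(work / "raw.%(ext)s")), check=True)
    subprocess.run(build_resample_cmd(str(raw), str(out)), check=True)
    return out
```

Building the commands as lists rather than shell strings avoids quoting problems with user-supplied URLs. The resulting WAV file is what would be passed to the classifier.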

Usage: Paste a YouTube or MP4 link, hit "Classify Accent," and wait a few seconds for the results to appear.

Limitations: Works best with clear speech samples of 10–30 seconds. Accuracy may drop with background noise, similar or overlapping accents, or low-quality input.
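A simple guard for the recommended sample length could be sketched as follows, using only the standard library. The 10 and 30 second bounds come from the text above; the function names are hypothetical, and the sketch assumes the audio has already been converted to an uncompressed WAV file.

```python
import wave

MIN_SECONDS = 10.0  # lower bound of the recommended sample length
MAX_SECONDS = 30.0  # upper bound of the recommended sample length


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wf:
        return wf.getnframes() / wf.getframerate()


def in_recommended_range(seconds: float,
                         lo: float = MIN_SECONDS,
                         hi: float = MAX_SECONDS) -> bool:
    """Return True if the duration falls inside the recommended range."""
    return lo <= seconds <= hi
```

A caller could warn the user, or trim the clip, when in_recommended_range returns False rather than rejecting the request outright.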

Recommendations: Accuracy could be improved with domain-specific audio, chunking, voice activity detection (VAD), noise filtering, or hybrid ASR+accent models.
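As a rough illustration of the chunking idea, the sketch below splits audio into fixed-length chunks, drops near-silent chunks, and averages per-chunk scores. The energy threshold is a crude stand-in for real VAD, not an actual VAD model, and all names and default values here are hypothetical. It assumes samples are floats in the range -1 to 1 and that some classifier has already produced a dictionary of label scores for each chunk.

```python
import math


def split_into_chunks(samples, sample_rate, chunk_seconds=10.0):
    """Split samples into consecutive fixed-length chunks.

    The last chunk may be shorter than the others.
    """
    size = int(sample_rate * chunk_seconds)
    if size <= 0:
        raise ValueError("chunk size must be positive")
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def rms(chunk):
    """Root-mean-square energy of a chunk (0.0 for an empty chunk)."""
    if not chunk:
        return 0.0
    return math.sqrt(sum(x * x for x in chunk) / len(chunk))


def drop_silent_chunks(chunks, threshold=0.01):
    """Keep only chunks whose RMS energy exceeds the threshold."""
    return [c for c in chunks if rms(c) > threshold]


def average_scores(per_chunk_scores):
    """Average per-label scores across chunks.

    Each element is a dict mapping label -> score, with the same labels
    in every dict.
    """
    if not per_chunk_scores:
        raise ValueError("no chunk scores to average")
    n = len(per_chunk_scores)
    labels = per_chunk_scores[0].keys()
    return {label: sum(s[label] for s in per_chunk_scores) / n for label in labels}
```

Averaging scores over several voiced chunks is one way to make a prediction less sensitive to a single noisy segment; whether it helps for this model would need to be measured.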

Accent Classifier