Spoken Language Identification Using Deep Learning

Gundeep Singh, Sahil Sharma, Vijay Kumar, Manjit Kaur, Mohammed Baz, Mehedi Masud · 2021

Journal	Computational Intelligence and Neuroscience
Publisher	Hindawi Publishing Corporation
DOI	10.1155/2021/5123671
OpenAlex	W3199024477
Language	en
ISSN	1687-5265
OA?	yes
Status	pending

Abstract

Spoken language identification (SLID) is the task of detecting the language of an audio clip from an unknown speaker, regardless of the speaker's gender, age, or manner of speaking. The central challenge is to recognize features that distinguish between languages clearly and efficiently. The model converts audio files into spectrogram images and applies a convolutional neural network (CNN) to extract the main features used to produce the output. The main objective is to identify the language from among English, French, Spanish, German, Estonian, Tamil, Mandarin, Turkish, Chinese, Arabic, Hindi, Indonesian, Portuguese, Japanese, Latin, Dutch, Pushto, Romanian, Korean, Russian, Swedish, Thai, and Urdu. An experiment was conducted on audio files from the Kaggle dataset named spoken language identification. The audio files consist of utterances, each with a fixed duration of 10 seconds. The whole dataset is split into training and test sets. Preliminary results give an overall accuracy of 98%. Extensive and accurate testing shows an overall accuracy of 88%.
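
The audio-to-spectrogram step described in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names (`hann`, `dft_magnitudes`, `spectrogram`), the frame length, the hop size, and the synthetic test tone are all assumptions chosen for illustration, and the naive DFT uses only the Python standard library so the example runs as-is. A practical pipeline would normally use an FFT library and feed the resulting image to a CNN.

```python
import cmath
import math


def hann(n):
    """Symmetric Hann window of length n (assumed window choice)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (n - 1)) for i in range(n)]


def dft_magnitudes(frame):
    """Magnitudes of the non-negative-frequency bins, via a naive O(n^2) DFT."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        mags.append(abs(acc))
    return mags


def spectrogram(samples, frame_len=64, hop=32):
    """Return a list of time frames, each a list of log-magnitudes in dB."""
    window = hann(frame_len)
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [samples[start + i] * window[i] for i in range(frame_len)]
        mags = dft_magnitudes(frame)
        # Small offset avoids log10(0) on silent bins.
        frames.append([20 * math.log10(m + 1e-10) for m in mags])
    return frames


# Hypothetical input: a 1 kHz sine tone sampled at 8 kHz for 0.1 s.
sample_rate = 8000
frame_len = 64
tone = [math.sin(2 * math.pi * 1000 * t / sample_rate) for t in range(800)]

spec = spectrogram(tone, frame_len=frame_len, hop=32)
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
peak_hz = peak_bin * sample_rate / frame_len
print(len(spec), len(spec[0]), peak_hz)
```

Each row of `spec` is one time slice and each column one frequency bin, so the nested list can be treated as a single-channel image. For the 10-second clips mentioned in the abstract, the same framing would simply yield more rows.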

Matched Nanban terms

anchor Portuguese-Japanese

Provenance

openalex (W3199024477)
2026-04-30T19:35:47.487708+00:00

Candidate PDF URLs

P	Source	URL	Last attempt	Last error
30	openalex	https://downloads.hindawi.com/journals/cin/2021/5123671.pdf	—

Extras

openalex_concepts	Computer science; Tamil; Speech recognition; Natural language processing; Artificial intelligence; Turkish; Language identification; Identification (biology); German; Bulgarian
openalex_topics	Speech Recognition and Synthesis; Speech and Audio Processing; Music and Audio Processing