Indian Institute of Information Technology

Dr. Achintya Kumar Sarkar

Assistant Professor

Office Address:

304, 2nd Floor, Academic Building
Department of Electronics and Communication Engineering (ECE Group)
Indian Institute of Information Technology, Sri City, Chittoor
Andhra Pradesh - 517 646, India

Academic Qualifications

Education:

Ph.D. [Speech Processing], 2011

Indian Institute of Technology, Madras, India.

Thesis Title: Computationally Efficient Speaker Identification and Use of Multiple Background Models for Speaker Verification

Research Areas of Interest

Speech Signal Processing (speaker recognition, voice activity detection, spoofing detection, speech recognition), Biomedical Signal Processing (seizure detection, gait classification), Deep Learning Techniques

Awards / Honours

  • 4th place in the Fearless Steps (FS-P01) Speech Activity Detection Challenge 2019 (University of Texas at Dallas)
  • Otto Mønsteds Fond (Denmark) travel grant to attend INTERSPEECH 2016, USA
  • Ranked 5th among 200 competitors in the Seizure Detection Challenge organized by UPenn and Mayo Clinic, 2014
  • IEEE-SPS travel grant to attend ICASSP 2011, Czech Republic
  • ISCA travel grant to attend INTERSPEECH 2010, Japan
  • CSIR Travel Grant 2010
  • Gold medal awarded by Punjab Technical University, India, for ranking first in M.Tech., 2006

Publications

2020

  • rVAD: An Unsupervised Segment-Based Robust Voice Activity Detection Method. Z.-H. Tan, A. K. Sarkar, N. Dehak. Computer Speech & Language: Volume 59, January 2020, pp. 1-21.

2019

  • Time-Contrastive Learning Based Deep Bottleneck Features for Text-Dependent Speaker Verification. A. K. Sarkar, Z.-H. Tan, H. Tang, S. Shon, J. Glass. IEEE/ACM Transactions on Audio, Speech, and Language Processing: Volume 27, August 2019, pp. 1267-1279.

2018

  • Incorporating Pass-phrase Dependent Background Models for Text-dependent Speaker Verification. A. K. Sarkar, Z.-H. Tan. Computer Speech & Language: Volume 47, January 2018, pp. 259-271.

2017

  • Towards a Personalized Real-Time Diagnosis in Neonatal Seizure Detection. A. Temko, A. K. Sarkar, G. Boylan, S. Mathieson, W. Marnane, G. Lightbody. IEEE Journal of Translational Engineering in Health and Medicine, Volume 5, September 2017. DOI: 10.1109/JTEHM.2017.2737992.
  • Improving Speaker Verification Performance in Presence of Spoofing Attacks Using Out-of-Domain Spoofed Data. A. K. Sarkar, M. Sahidullah, Z.-H. Tan, T. Kinnunen. In proc. of INTERSPEECH 2017, pp. 2611-2615, Sweden.
  • Time-Contrastive Learning Based DNN Bottleneck Features for Text Dependent Speaker Verification. A. K. Sarkar, Z.-H. Tan. NIPS Time Series Workshop 2017, Long Beach, CA, USA.
  • A New Replay Spoofing Attack Corpus for Text-Dependent Speaker Verification Research. T. Kinnunen, M. Sahidullah, M. Falcone, L. Costantini, R. G. Hautamaki, D. Thomsen, A. Sarkar, Z.-H. Tan, H. Delgado, M. Todisco, N. Evans, V. Hautamaki, K. A. Lee. In Proc. of IEEE Int. Conf. Acoust. Speech Signal Processing (ICASSP), 2017, pp. 5395-5399, USA.
  • The I4U Mega Fusion and Collaboration for NIST Speaker Recognition Evaluation 2016. K. A. Lee et al. In proc. of INTERSPEECH 2017, pp. 1328-1332, Sweden.

2016

  • Text Dependent Speaker Verification Using Un-supervised HMM-UBM and Temporal GMM-UBM. A. K. Sarkar, Z.-H. Tan. In Proc. of INTERSPEECH, 2016, pp. 425-429, USA.
  • Effect of Multi-condition Training and Speech Enhancement Methods on Spoofing Detection. H. Yu, A. Sarkar, D. A. L. Thomsen, Z.-H. Tan, Z. Ma, J. Guo. In Proc. of International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE), 2016, pp. 1-5, Denmark.
  • Further Optimizations of Constant Q Cepstral Processing for Integrated Utterance Verification and Text-dependent Speaker Verification. H. Delgado, M. Todisco, M. Sahidullah, A. K. Sarkar, N. Evans, T. Kinnunen, Z.-H. Tan. In Proc. of IEEE Spoken Language Technology (SLT) Workshop, 2016, pp. 179-185, USA.
  • Utterance Verification for Text-Dependent Speaker Recognition: a Comparative Assessment Using the RedDots Corpus. T. Kinnunen, M. Sahidullah, I. Kukanov, H. Delgado, M. Todisco, A. Sarkar, N.B. Thomsen, N. Evans, Z.-H. Tan. In Proc. of INTERSPEECH, 2016, pp. 430-434, USA.
  • A Study on the Roles of Total Variability Space and Session Variability Modeling in Speaker Recognition. A. K. Sarkar, J. F. Bonastre, D. Matrouf. International Journal of Speech Technology: Volume 19, Issue 1 (2016), pp. 111-120.
  • Sub-vector based biometric speaker verification using MLLR super-vector. A. K. Sarkar, J. F. Bonastre. International Journal of Speech Technology: Volume 19, Issue 1 (2016), pp. 41-54.

2015

  • Detection of Seizures in Intracranial EEG: UPenn and Mayo Clinic's Seizure Detection Challenge. A. Temko, A. K. Sarkar, G. Lightbody. In Proc. of IEEE Engineering in Medicine and Biology Society (EMBC), 2015, pp. 6582-6585, Italy.

2014

  • Combination of Cepstral and Phonetically Discriminative Features for Speaker Verification. A. K. Sarkar, C. T. Do, V. B. Le, C. Barras. IEEE Signal Processing Letters, Volume 21, Issue 9, Sept. 2014, pp. 1040-1044.
  • Person Instance Graphs for Named Speaker Identification in TV Broadcast. H. Bredin, A. Laurent, A. K. Sarkar, V. B. Le, S. Rosset, C. Barras. In Proc. of Speaker and Language Recognition Workshop- Odyssey, 2014, pp. 179-186, Finland.

2013

  • Augmenting Short-term Cepstral Features with Long-term Discriminative Features for Speaker Verification of Telephone Data. C. T. Do, C. Barras, V. B. Le, A. K. Sarkar. In Proc. of INTERSPEECH, 2013, pp. 2484-2488, France.
  • Qcompere @ Repere 2013. H. Bredin et al. In Proc. of First Workshop on Speech, Language and Audio in Multimedia (SLAM), 2013, pp. 49-54, France.
  • Multi-Class UBM-Based MLLR m-Vector System for Speaker Verification. A. K. Sarkar, C. Barras. In Proc. of European Signal Processing Conference (EUSIPCO), 2013, pp. 1-5, Morocco.
  • Lattice MLLR based m-vector System for Speaker Verification. A. K. Sarkar, C. Barras, V. B. Le. In Proc. of IEEE Int. Conf. Acoust. Speech Signal Processing (ICASSP), 2013, pp. 7654-7658, Canada.

2012

  • Study of the Effect of I-vector Modeling on Short and Mismatch Utterance Duration for Speaker Verification. A. K. Sarkar, D. Matrouf, P. M. Bousquet, J. F. Bonastre. In Proc. of INTERSPEECH, 2012, pp. 2662-2665, USA.
  • Multiple Background Models for Speaker Verification Using The Concept of Vocal Tract Length and MLLR Super-vector. A. K. Sarkar, S. Umesh. International Journal of Speech Technology: Volume 15, Issue 3 (2012), pp. 351-364.
  • Speaker Verification using m-vector Extracted from MLLR Super-Vector. A. K. Sarkar, J. F. Bonastre, D. Matrouf. In Proc. of 20th European Signal Processing Conference (EUSIPCO), 2012, pp. 21-25, Romania.
  • Computationally Efficient Speaker Identification Using Fast-MLLR Based Anchor Modeling. A. K. Sarkar, S. Umesh, J. F. Bonastre. In Proc. of IEEE Int. Conf. Acoust. Speech Signal Processing (ICASSP), 2012, pp. 4357-4360, Japan.

2011

  • Use of VTL-wise Models in Feature-Mapping Framework to Achieve Performance of Multiple-Background Models in Speaker Verification. A. K. Sarkar, S. Umesh. In Proc. of IEEE Int. Conf. Acoust. Speech Signal Processing (ICASSP), 2011, pp. 4552-4555, Czech Republic.
  • Eigen-voice Based Anchor Modeling System for Speaker Identification using MLLR Super-vector. A. K. Sarkar, S. Umesh. In Proc. of INTERSPEECH, 2011, pp. 2357-2360, Italy.

2010

  • Fast Computation of Speaker Characterization Vector using MLLR and Sufficient Statistics in Anchor Model Framework. A. K. Sarkar, S. Umesh. In Proc. of INTERSPEECH, 2010, pp. 2738-2741, Japan.
  • Computationally Efficient Speaker Identification for Large Population Tasks using MLLR and Sufficient Statistics. A. K. Sarkar, S. Umesh, S. P. Rath. In Proc. of Speaker and Language Recognition Workshop- Odyssey 2010, pp. 7-11, Czech Republic.
  • Investigation of Speaker-Clustered UBMs based on Vocal Tract Lengths and MLLR matrices for Speaker Verification. A. K. Sarkar, S. Umesh. In Proc. of Speaker and Language Recognition Workshop- Odyssey 2010, pp. 286-293, Czech Republic.
  • Fast Approach to Speaker Identification for Large Population Using MLLR and Sufficient Statistics. A. K. Sarkar, S. P. Rath, S. Umesh. In Proc. of National Conference on Communications (NCC), 2010, India.
  • Vocal Tract Length Normalization Factor Based Speaker-Cluster UBM for Speaker Verification. A. K. Sarkar, S. P. Rath, S. Umesh. In Proc. of National Conference on Communications (NCC), 2010, India.
  • Effect of Jacobian Compensation in Linear Transformation Based VTLN under Matched and Mismatched Speaker Conditions. S. P. Rath, A. K. Sarkar, S. Umesh. In Proc. of National Conference on Communications (NCC), 2010, India.

2009

  • Text-Independent Speaker Identification Using Vocal Tract Length Normalization for Building Universal Background Model. A. K. Sarkar, S. Umesh, S. P. Rath. In Proc. of INTERSPEECH, 2009, pp. 2331-2334, UK.
  • Using VTLN Matrices for Rapid and Computationally Efficient Speaker Adaptation with Robustness to First-Pass Transcription Errors. S. P. Rath, S. Umesh, A. K. Sarkar. In Proc. of INTERSPEECH, 2009, pp. 572-575, UK.

Contact Information

Address for Communication:

Room No. 304, 2nd Floor, Academic Building
Indian Institute of Information Technology, Sri City, Chittoor
630, Gnan Marg, Sri City, Satyavedu Mandal
Chittoor District - 517 646, Andhra Pradesh, India