Puneet Kumar

I am a Postdoctoral Researcher at the Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Finland, advised by Prof. Xiaobai Li. Additionally, I serve as an Affiliated Scientist at the Meditation Research Program at Massachusetts General Hospital and Harvard University. Prior to this, I did my Ph.D. at the Machine Intelligence Lab, Indian Institute of Technology Roorkee, India, advised by Prof. R. Balasubramanian. My research interests include Multimodal Emotion Analysis using Audio, Visual, Textual, Physiological and Brain Imaging data.

I'm passionate about relating humans and computers and using that understanding to optimize people's well-being. I perceive humans as an assembly of fat-muscle-bone-water on the hardware level and a bundle of impressions-sensations-emotions-thoughts-actions on the software level. We are made up of carbon (instead of silicon), have mind-brain-hormones (as opposed to OS-processor-current), and deploy physical-mental-emotional-psychological-spiritual resources (instead of computational resources) in reaction to various situations. This is my web space to reflect upon my research activities, professional endeavors, and life experiences, hoping to connect with like-minded people.

News



  1. A paper titled 'Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio, and Textual Data' got accepted in IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025), Tucson, Arizona, USA.
  2. Aug 2024 - Oct 2024: Visited Georgia Institute of Technology, USA, as a Visiting Faculty at the Tech Research Lab, School of Psychology.
  3. Aug 2024 and Nov 2024: Visited the Meditation Research Program at Harvard University & Massachusetts General Hospital, Boston, USA, as an Affiliated Scientist.
  4. Feb 2024: Received approval for 'CMVS International Research Visit' fund.
  5. Aug 2023: A paper titled 'Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data' got accepted in Springer Multimedia Tools and Applications (MTAP) Journal.
  6. Jun 2023: A paper titled 'Interpretable Multimodal Emotion Recognition using Facial Features and Physiological Signals' got accepted in Third RBCDSAI Conference on Deployable AI (DAI 2023) and received third position in 'Best Paper Award' category.
  7. Mar 2023: A paper titled 'Affective Feedback Synthesis Towards Multimodal Text and Image Data' got accepted in ACM Transactions on Multimedia Computing Communications and Applications.
  8. Jan 2023: Started working as a Postdoctoral Researcher at the Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Finland, advised by Prof. Xiaobai Li.
  9. Dec 2022: Received the Best Ph.D. Thesis Award in the 9th IEEE Uttar Pradesh Section International Conference on Electrical, Electronics and Computer Engineering (UPCON'22).
  10. Oct 2022: Defended Ph.D. Dissertation successfully.
  11. Aug 2022: Submitted Ph.D. Thesis entitled 'Multimodal Emotion Analysis Using Deep Learning Techniques' on 01 Aug 2022.
  12. July 2022: A paper titled 'Zero-shot Learning based Cross-lingual Sentiment Analysis for Sanskrit Text with Insufficient Labeled Data' got accepted in Springer Applied Intelligence (APIN) Journal.
  13. June 2022: A paper titled 'Automatic Evaluation of Machine Generated Feedback For Text and Image Data' got accepted in IEEE 5th International Conference on Multimedia Information Processing and Retrieval Workshop on Multimedia Computing for Automated Urban Intelligent Systems (MIPRw 2022).
  14. Mar 2022: A paper titled 'A BERT Based Dual-Channel Explainable Text Emotion Recognition System' got accepted in Elsevier Neural Networks (NeuNet) Journal.
  15. Sep 2021: A paper titled 'Towards Interpretable Facial Emotion Recognition' got accepted in The 12th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2021), Jodhpur, India.
  16. Aug 2021: A copyright titled 'Deep Neural Network Explainability Technique with Application in Multimodal Emotion Recognition' got registered (Certificate No. L-106713/2021).
  17. June 2021: A paper titled 'Towards the explainability of Multimodal Speech Emotion Recognition' got accepted in The 22nd Annual Conference of the International Speech Communication Association (InterSpeech 2021), Brno, Czech Republic.
  18. June 2021: Filed a copyright titled 'Deep Neural Network Explainability Technique with Application in Multimodal Emotion Recognition'.
  19. May 2021: A paper titled 'Hybrid Fusion Based Approach for Multimodal Emotion Recognition with Insufficient Labelled Data' got accepted in The 28th International Conference on Image Processing (ICIP 2021), Anchorage, Alaska, USA.
  20. Mar 2021: A paper titled 'Deep neural network hyper-parameter tuning through two-fold genetic approach' got accepted in Springer Soft Computing (SoCo) Journal.
  21. Dec 2020: Received the Best Paper Award in the International Conference on Computer Vision & Image Processing (CVIP 2020).
  22. Oct 2020: A paper titled 'End-to-end Triplet Loss based Emotion Embedding for Speech Emotion Recognition' got accepted in The 25th International Conference on Pattern Recognition (ICPR 2020), Milan, Italy.
  23. Jul 2020: A paper titled 'Fast Griffin Lim based Waveform Generation Strategy for TTS Synthesis' got accepted in Springer Multimedia Tools and Applications (MTAP) Journal.
  24. May 2020: Became a reviewer of Springer Nature Computer Science (SNCS) Journal.
  25. Dec 2019: Visited Osaka Prefecture University, Japan for 3 weeks under Japan Science and Technology Sakura Science Plan.
  26. Aug 2019: Two papers titled 'A Segment Level Approach to Speech Emotion Recognition using Transfer Learning' and 'A GAN based Ensemble Technique for Automatic Evaluation of Synthetic Speech' accepted in The 5th Asian Conference on Pattern Recognition (ACPR 2019).
  27. May 2019: Became a reviewer of IEEE Communications Letters Journal.
  28. Nov 2018: Received M.E. (Computer Science) degree along with Institute Gold Medal from Thapar University, India.
  29. Jul 2018: Started pursuing Ph.D. at Machine Intelligence Lab, IIT Roorkee advised by Prof. R. Balasubramanian.
  30. Jan 2018: A paper titled 'MVO-Based 2-D Path Planning Scheme in UAV Environment' got accepted in IEEE Internet of Things (IoT) Journal.

Education

Ph.D. (Computer Science)

Indian Institute of Technology, Roorkee, Uttarakhand, India

• Jul 2018 - Aug 2022 • CGPA: 9.00/10
Thesis Area: Multimodal Emotion Analysis Using Deep Learning Techniques.
Supervisor: Prof. R. Balasubramanian

M.E. (Computer Science)

Thapar Institute of Engineering & Technology, Thapar University, Patiala, Punjab, India.

• Jul 2016 - Jun 2018 • CGPA: 9.38/10
Thesis Topic: Meta-heuristic based Optimization of Deep Neural Networks
Supervisor: Dr. Shalini Batra

B.E. (Computer Science)

Manipal Institute of Technology, Manipal University, Manipal, Karnataka, India.

• June 2010 - May 2014 • CGPA: 7.47/10
Mentor: Prof. Harish SV

Work Experience

CMVS, University of Oulu, Finland

Postdoc Researcher & Assistant Lecturer Jan 2023 - Present

Working on Multimodal Emotion Understanding using Visual, Textual, Audio and Physiological modalities.

Osaka Prefecture University (OPU), Osaka, Japan

Visiting Researcher Dec 2019

Worked at the Department of Computer Science and Intelligent Systems, OPU, Osaka, Japan, under Japan Science and Technology Sakura Science Plan.

Samsung R&D, New Delhi, India

Visiting Researcher July 2018 - June 2019

Worked on the ‘End to End Emotional Speech Synthesis’ project, sponsored by Samsung R&D, with Prof. R. Balasubramanian, IIT Roorkee, as the investigator.

Oracle India Pvt. Ltd., Hyderabad, India

Software Engineer May 2014 - May 2016

Worked with the Software Development team for the sustenance of a project management tool 'Oracle Primavera P6 Professional'.

Oracle India Pvt. Ltd., Hyderabad, India

Project Intern Jan 2014 - May 2014

Worked with the Software Testing team for the quality assurance of 'Oracle Primavera P6 Enterprise Project Portfolio Management Web'.

Research Interests

  • Affective Computing and Cognitive Science
  • Multimodal Emotion Understanding
  • Machine Learning and Deep Learning
  • Interpretable and Explainable AI
  • Mental Health and Mindfulness
  • Computational Cognitive Neuroscience

Publications




1    P. Kumar, S. Misra, Z. Shao, B. Zhu, B. Raman, and X. Li. "Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio and Textual Data." IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025), Tucson, Arizona, USA. [CORE A]


2    A. Vedernikov, P. Kumar, and X. Li. "TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals." CVPRw: 6th Workshop and Competition on Affective Behavior Analysis in-the-wild, in conjunction with IEEE Computer Vision and Pattern Recognition Conference (CVPRw 2024).



3    Puneet Kumar, Sarthak Malik and Balasubramanian Raman. "Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data." Springer Multimedia Tools and Applications (MTAP) Journal. [SCI, Q2, IF = 3.6]


4    Puneet Kumar, Xiaobai Li. "Interpretable Multimodal Emotion Recognition using Facial Features and Physiological Signals." Third RBCDSAI Conference on Deployable AI (DAI 2023) [Best Paper Award: third position]


5    Puneet Kumar, Gaurav Bhatt, Omkar Ingle, Daksh Goyal and Balasubramanian Raman. "Affective Feedback Synthesis Towards Multimodal Text and Image Data." ACM Transactions on Multimedia Computing, Communications, and Applications. [SCI, Q1, IF = 5.1]


6    Puneet Kumar, Kshitij Pathania and Balasubramanian Raman. "Zero-shot Learning based Cross-lingual Sentiment Analysis for Sanskrit Text with Insufficient Labeled Data." Springer Applied Intelligence (APIN) Journal. [SCI, Q2, IF = 5.019]


7    Puneet Kumar and Balasubramanian Raman. "A BERT Based Dual-Channel Explainable Text Emotion Recognition System." Elsevier Neural Networks (NeuNet) Journal. [SCI, Q1, IF = 8.05]


8    Sarthak Malik, Puneet Kumar and Balasubramanian Raman. "Towards Interpretable Facial Emotion Recognition." 12th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2021). [IAPR Endorsed]



9    Puneet Kumar, Vishesh Kaushik and Balasubramanian Raman. "Towards the explainability of Multimodal Speech Emotion Recognition." 22nd Annual Conference of the International Speech Communication Association (Interspeech 2021). [CORE A | Qualis A1]




10    Puneet Kumar, Vedanti Khokher, Yukti Gupta and Balasubramanian Raman. "Hybrid Fusion Based Approach for Multimodal Emotion Recognition with Insufficient Labelled Data." 28th IEEE International Conference on Image Processing (ICIP 2021). [CORE B | Qualis A1]




11    Puneet Kumar, Shalini Batra and Balasubramanian Raman. "Deep neural network hyper-parameter tuning through two-fold genetic approach." Springer Soft Computing (SoCo) Journal. [SCI, Q2, IF = 3.05]


12    Puneet Kumar, Sidharth Jain, Balasubramanian Raman, Partha Pratim Roy and Masakazu Iwamura. "End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition." The 25th International Conference on Pattern Recognition (ICPR 2020). [CORE B | Qualis A1]



13    Puneet Kumar and Balasubramanian Raman. "Domain Adaptation based Technique for Image Emotion Recognition using Image Captions." 5th IAPR International Conference on Computer Vision and Image Processing (CVIP 2020). [IAPR Endorsed | Got the 'Best Paper Award']






14    Ankit Sharma, Puneet Kumar, Vikas M, Nagasai M, Kishore K, Sriram K, Balasubramanian Raman and Partha Pratim Roy. "Fast Griffin Lim based Waveform Generation Strategy for Text-to-Speech Synthesis." Springer Multimedia Tools and Applications (MTAP) Journal. [SCI, Q2, IF = 2.31]


15    Sourav Sahoo, Puneet Kumar, Balasubramanian Raman and Partha Pratim Roy. "A Segment Level Approach to Speech Emotion Recognition using Transfer Learning." The 5th Asian Conference on Pattern Recognition (ACPR 2019). [IAPR Endorsed]


16    J. Jaiswal, A. Chaubey, B. Reddy, S. Kashyap, Puneet Kumar, Balasubramanian Raman and Partha Pratim Roy. "A Generative Adversarial Network based Ensemble Technique for Automatic Evaluation of Synthetic Speech." The 5th Asian Conference on Pattern Recognition (ACPR 2019). [IAPR Endorsed]


17    Puneet Kumar, Sahil Garg, Amritpal Singh, Shalini Batra, Neeraj Kumar and Ilsun You. "MVO-Based 2-D Path Planning Scheme for Providing Quality of Service in UAV Environment." IEEE Internet of Things (IoT) Journal 5, no. 3 (2018): 1698-1707. [SCI, Q1, IF = 9.94]


18    Puneet Kumar and Shalini Batra. "Meta-heuristic based Optimized Deep Neural Network for Streaming Data Prediction." International Conference on Advances in Computing, Communication Control and Networking (ICACCCN 2018). [Listed in Scopus]

Personal Activities

I am a programmable (I learn from experiences) program, programmed to program (I get to program software applications in my profession). The experiences that have programmed me include traveling, clicking photos, singing and playing guitar, working out, meditating, listening to audiobooks, and talking about philosophy, psychology, health and nutrition, neuroscience, personal finance, and cosmology.