I am a Postdoctoral Researcher at the Center for Machine Vision and Signal Analysis (CMVS), University of Oulu, Finland, advised by Prof. Xiaobai Li. I also serve as an Affiliated Scientist at the Meditation Research Program at Massachusetts General Hospital and Harvard University.
Prior to this, I completed my Ph.D. at the Machine Intelligence Lab, Indian Institute of Technology Roorkee, India, advised by Prof. R. Balasubramanian. My research interests include Multimodal Emotion Analysis using Audio, Visual, Textual, Physiological and Brain Imaging data.
I'm passionate about relating humans and computers and using that understanding to optimize people's well-being. I perceive humans as an assembly of fat-muscle-bone-water on the hardware level and a bundle of impressions-sensations-emotions-thoughts-actions on the software level. We are made up of carbon (instead of silicon), have mind-brain-hormones (as opposed to OS-processor-current), and deploy physical-mental-emotional-psychological-spiritual resources (instead of computational resources) in reaction to various situations. This is my web space to reflect upon my research activities, professional endeavors, and life experiences, in the hope of connecting with like-minded people.
• Jul 2018 - Aug 2022
• CGPA: 9.00/10
• Thesis Area: Multimodal Emotion Analysis Using Deep Learning Techniques.
• Supervisor: Prof. R. Balasubramanian
• Jul 2016 - Jun 2018
• CGPA: 9.38/10
• Thesis Topic: Meta-heuristic based Optimization of Deep Neural Networks
• Supervisor: Dr. Shalini Batra
• Jun 2010 - May 2014
• CGPA: 7.47/10
• Mentor: Prof. Harish SV
Postdoc Researcher & Assistant Lecturer • Jan 2023 - Present
Working on Multimodal Emotion Understanding using Visual, Textual, Audio and Physiological modalities.
Visiting Researcher • Dec 2019
Worked at the Department of Computer Science and Intelligent Systems, OPU, Osaka, Japan, under the Japan Science and Technology Agency's Sakura Science Plan.
Visiting Researcher • Jul 2018 - Jun 2019
Worked on the 'End to End Emotional Speech Synthesis' project, sponsored by Samsung R&D, with Prof. R. Balasubramanian, IIT Roorkee, as the investigator.
Software Engineer • May 2014 - May 2016
Worked with the Software Development team on the maintenance of the project management tool 'Oracle Primavera P6 Professional'.
Project Intern • Jan 2014 - May 2014
Worked with the Software Testing team for the quality assurance of 'Oracle Primavera P6 Enterprise Project Portfolio Management Web'.
1. P. Kumar, S. Misra, Z. Shao, B. Zhu, B. Raman, and X. Li. "Multimodal Interpretable Depression Analysis Using Visual, Physiological, Audio and Textual Data." IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2025), Tucson, Arizona, USA. [CORE A]
2. A. Vedernikov, P. Kumar, and X. Li. "TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals." 6th Workshop and Competition on Affective Behavior Analysis in-the-wild, in conjunction with the IEEE Computer Vision and Pattern Recognition Conference (CVPRw 2024).
3. Puneet Kumar, Sarthak Malik and Balasubramanian Raman. "Interpretable Multimodal Emotion Recognition using Hybrid Fusion of Speech and Image Data." Springer Multimedia Tools and Applications (MTAP) Journal. [SCI, Q2, IF = 3.6]
4. Puneet Kumar and Xiaobai Li. "Interpretable Multimodal Emotion Recognition using Facial Features and Physiological Signals." Third RBCDSAI Conference on Deployable AI (DAI 2023). [Best Paper Award: third position]
5. Puneet Kumar, Gaurav Bhatt, Omkar Ingle, Daksh Goyal and Balasubramanian Raman. "Affective Feedback Synthesis Towards Multimodal Text and Image Data." ACM Transactions on Multimedia Computing, Communications, and Applications. [SCI, Q1, IF = 5.1]
6. Puneet Kumar, Kshitij Pathania and Balasubramanian Raman. "Zero-shot Learning based Cross-lingual Sentiment Analysis for Sanskrit Text with Insufficient Labeled Data." Springer Applied Intelligence (APIN) Journal. [SCI, Q2, IF = 5.019]
7. Puneet Kumar and Balasubramanian Raman. "A BERT Based Dual-Channel Explainable Text Emotion Recognition System." Elsevier Neural Networks (NeuNet) Journal. [SCI, Q1, IF = 8.05]
8. Sarthak Malik, Puneet Kumar and Balasubramanian Raman. "Towards Interpretable Facial Emotion Recognition." 12th Indian Conference on Computer Vision, Graphics and Image Processing (ICVGIP 2021). [IAPR Endorsed]
9. Puneet Kumar, Vishesh Kaushik and Balasubramanian Raman. "Towards the explainability of Multimodal Speech Emotion Recognition." 22nd Annual Conference of the International Speech Communication Association (Interspeech 2021). [CORE A | Qualis A1]
10. Puneet Kumar, Vedanti Khokher, Yukti Gupta and Balasubramanian Raman. "Hybrid Fusion Based Approach for Multimodal Emotion Recognition with Insufficient Labelled Data." 28th IEEE International Conference on Image Processing (ICIP 2021). [CORE B | Qualis A1]
11. Puneet Kumar, Shalini Batra and Balasubramanian Raman. "Deep neural network hyper-parameter tuning through two-fold genetic approach." Springer Soft Computing (SoCo) Journal. [SCI, Q2, IF = 3.05]
12. Puneet Kumar, Sidharth Jain, Balasubramanian Raman, Partha Pratim Roy and Masakazu Iwamura. "End-to-end Triplet Loss based Emotion Embedding System for Speech Emotion Recognition." The 25th International Conference on Pattern Recognition (ICPR 2020). [CORE B | Qualis A1]
13. Puneet Kumar and Balasubramanian Raman. "Domain Adaptation based Technique for Image Emotion Recognition using Image Captions." 5th IAPR International Conference on Computer Vision and Image Processing (CVIP 2020). [IAPR Endorsed | Best Paper Award]
14. Ankit Sharma, Puneet Kumar, Vikas M, Nagasai M, Kishore K, Sriram K, Balasubramanian Raman and Partha Pratim Roy. "Fast Griffin Lim based Waveform Generation Strategy for Text-to-Speech Synthesis." Springer Multimedia Tools and Applications (MTAP) Journal. [SCI, Q2, IF = 2.31]
15. Sourav Sahoo, Puneet Kumar, Balasubramanian Raman and Partha Pratim Roy. "A Segment Level Approach to Speech Emotion Recognition using Transfer Learning." The 5th Asian Conference on Pattern Recognition (ACPR 2019). [IAPR Endorsed]
16. J. Jaiswal, A. Chaubey, B. Reddy, S. Kashyap, Puneet Kumar, Balasubramanian Raman and Partha Pratim Roy. "A Generative Adversarial Network based Ensemble Technique for Automatic Evaluation of Synthetic Speech." The 5th Asian Conference on Pattern Recognition (ACPR 2019). [IAPR Endorsed]
17. Puneet Kumar, Sahil Garg, Amritpal Singh, Shalini Batra, Neeraj Kumar and Ilsun You. "MVO-Based 2-D Path Planning Scheme for Providing Quality of Service in UAV Environment." IEEE Internet of Things (IoT) Journal 5, no. 3 (2018): 1698-1707. [SCI, Q1, IF = 9.94]
18. Puneet Kumar and Shalini Batra. "Meta-heuristic based Optimized Deep Neural Network for Streaming Data Prediction." International Conference on Advances in Computing, Communication Control and Networking (ICACCCN 2018). [Listed in Scopus]
I am a programmable (I learn from experiences) program, programmed to program (I get to program software applications in my profession). The experiences that have programmed me include traveling, taking photos, singing and playing guitar, working out, meditating, listening to audiobooks, and talking about philosophy, psychology, health and nutrition, neuroscience, personal finance, and cosmology.