I work mostly on computer vision problems using deep learning, in a broad sense. Here's a short list of things I work on or have worked on. For more detail, read on!
Since the latter part of my second year in college, I have been extensively involved in research, initially working as a research assistant for researchers at my own University and later for groups outside it. In the early months of 2021, I started working under Dr. Debangshu Dey on Glaucoma Detection and Optic Disc and Cup Segmentation from Retinal Fundus Images. This project involved two tasks: first, writing a detailed survey of the contemporary literature on the problem, and then developing an encoder-decoder convolutional neural network architecture for semantic segmentation of the Optic Disc and Cup regions in Retinal Fundus Images obtained from public dataset repositories. The findings were submitted to, and are currently under review at, the journal Biomedical Signal Processing and Control (Elsevier).
Later, I worked under Dr. Brejesh Lall of the Indian Institute of Technology, Delhi, on Rib Suppression from Chest X-Ray Images using CycleGAN-like architectures. We plan to submit the findings to the MICCAI 2023 conference. I also worked briefly under Dr. Prasanta Kumar Ghosh of the Indian Institute of Science, Bangalore, on Facial Symmetry Assessment in images of human faces, from January to April of 2022. The aim of the project was to develop a scoring metric that helps plastic surgeons assess facial symmetry in real-world practice. Both of these vision-based research projects involved extensive use of deep learning tools.
I joined the research group of Dr. Donglai Wei of Boston College in January of 2022. There, I first worked on developing an annotation tool for neuroscience students in the UMass Boston Baby Lab, used to annotate videos of children making hand gestures while trying to identify shapes. Subsequently, I worked on a deep learning based research problem: detecting these hand gestures in timestamped videos. The project was aimed at assisting neuroscience students, and eventually psychologists, at the UMass Boston Baby Lab. Through it I learnt more about handling video data, since most of my previous research had been on single images in a biomedical setting.
During the summer break of 2022, I started working with the research group of Dr. José Dolz of ÉTS, Montreal. I began with a large-scale benchmark and survey of model calibration methods under various kinds of corruption in the input image data. Unlike most of my previous work, this project required a more theoretical understanding of the domain, which made it noticeably more difficult. Later, I also helped a PhD student and another undergraduate student with a project on Weakly Supervised Semantic Segmentation. This project involved developing a new and improved pipeline for weakly supervised segmentation tasks, based on existing language-vision models. Both of these projects were my first exposure to natural scene images, which posed their own diverse set of challenges. We plan to submit the findings of the weakly supervised segmentation project to the International Conference on Computer Vision (ICCV 2023).
I have also been part of independent research groups composed of students from my own University, which led to a publication at the IEEE International Symposium on Biomedical Imaging (ISBI 2022). This work involved developing a simple and effective semi-supervised cardiac MRI segmentation scheme that uses linear interpolation of images to achieve state-of-the-art results.
As an undergraduate student, my research interests have been shaped largely by the projects I have handled to date. This has allowed me to learn the nuances of various domains within computer vision. Broadly, I am acquainted with image recognition and translation tasks in semi-supervised and weakly supervised settings, and I also have a fair understanding of video scene understanding and optical flow. As a graduate student, I would like to explore more specialized domains, which I haven't had the opportunity to do before!