Himanshu Kumar

Bangalore, India.

I am interested in computer vision, machine learning, and their applications. I completed a master's in Artificial Intelligence at the Indian Institute of Science, Bangalore.
I have more than 3.5 years of industry experience in machine learning, developing various ML/DL models for computer vision and NLP. During my M.Tech, I worked on depth estimation using an event camera.
I am familiar with ML frameworks and tools such as Pandas, Keras, TensorFlow, scikit-learn, and PyTorch, as well as PySpark and SQL.

Publications

Simple Transformer with Single Leaky Neuron for Event Vision

WACVW 2025 [PDF]

Implemented a novel architecture that integrates the spatial representational capabilities of pre-trained vision backbones with the temporal processing strengths of Spiking Neural Networks, and achieves strong accuracy on the DVS Gesture dataset.

Segmentation challenge - MaCVi, WACV 2025

MaCVi, WACV 2025 [PDF]

Improved upon the KNet model, which did not achieve significant results in last year's challenge rankings, and reached an F1 score of 80.8. Secured 3rd place in the USV-based Obstacle Segmentation Challenge at the WACV workshop.

Projects

Deep Learning

Natural Language Inference

The SNLI corpus (version 1.0) is a collection of 570k human-written English sentence pairs manually labeled for balanced classification with the labels entailment, contradiction, and neutral, supporting the task of natural language inference (NLI), also known as recognizing textual entailment (RTE). Implemented an LSTM and logistic regression based model on SNLI. Also fine-tuned a small pre-trained BERT model on the SNLI dataset.

Depth Estimation using Neuromorphic Camera

Working on monocular depth estimation for SLAM by processing successive non-overlapping windows of events/frames over an interval. Training will use deep learning methods on data obtained from conventional and event-based vision cameras.

Sentiment Classification using Tree structured LSTM

Using LSTMs in a tree-structured manner, performed binary and 5-class sentiment classification on the Stanford Sentiment Treebank dataset. Used GloVe embeddings for word representation.

Text to Image Synthesis using GAN

Implemented an image synthesis model using an RNN and a Deep Convolutional GAN, which translates sentence text into image pixels. Trained the model with GloVe word embeddings to generate images of birds and flowers.

Data Structures

Persistent Data Structures

Implemented persistent data structures as part of a coursework assignment for the Data Structures and Algorithms course. Both partially and fully persistent data structures were implemented in C. Applied a persistent stack to solve a maze.
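
As a rough illustration of the idea only (this is not the original C coursework code), a persistent stack can be sketched in Python as an immutable linked list in which every push or pop returns a new version and leaves earlier versions usable. The class name `PersistentStack` and its methods are hypothetical names chosen for this sketch.

```python
class PersistentStack:
    """Illustrative persistent stack: operations return new versions.

    Each version stores a reference to its top node; nodes are
    (value, node_below) tuples shared between versions, so a push
    or pop costs O(1) time and O(1) extra space.
    """

    def __init__(self, top=None, size=0):
        self._top = top      # (value, node_below) tuple, or None if empty
        self._size = size

    def push(self, value):
        # Build a new version on top of the current one; self is unchanged.
        return PersistentStack((value, self._top), self._size + 1)

    def pop(self):
        # Return the top value and the version without it; self is unchanged.
        if self._top is None:
            raise IndexError("pop from empty stack")
        value, below = self._top
        return value, PersistentStack(below, self._size - 1)

    def __len__(self):
        return self._size

    def to_list(self):
        # Values from top to bottom, for inspection.
        out = []
        node = self._top
        while node is not None:
            value, node = node
            out.append(value)
        return out


# Two branches grown from the same earlier version.
base = PersistentStack().push("start")
left = base.push("go-left")
right = base.push("go-right")
```

In a maze search, this kind of sharing lets each explored path keep its own stack of visited cells while reusing the common prefix, so backtracking amounts to returning to an earlier version.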

Courses

E0 251

Data Structures and Algorithms

E0 230

Computational Methods of Optimisation

E0 299

Computational Linear Algebra

E1 222

Stochastic Models and Applications

E1 213

Pattern Recognition and Neural Networks

E0 250

Deep Learning

E1 277

Reinforcement Learning

E9 261

Speech Information Processing

E9 253

Neural Networks and Learning Systems

E9 208

Digital Video: Perception and Algorithms

E9 309

Advanced Deep Learning

Achievements & Extra-Curricular

107th AIR - GATE 2019 - Computer Science: secured 99.89 percentile among around 1 lakh students.
Reliance Ode2Code Hackathon: secured 1st place in Genius Unleashed, an NLP coding challenge from Reliance.