Natural Language Processing

SeqGenSQL Natural Language to SQL (http://arxiv.org/abs/2011.03836)

A T5-based sequence generation model for the WikiSQL task (translating natural language into SQL statements). It achieves SOTA 90.5% on the test set using sequence generation without logical forms.

 

Toxic Sentiment Analysis

A BERT-based toxic comment classifier built for a Kaggle competition using PyTorch. A custom loss function was also designed to achieve higher validation accuracy.

 

Stanford University Graduate Certificate (A.I Specialization)

University of California, Berkeley (Master of Information and Data Science)

 

 

 

Computer Vision

Facial Keypoint Detection

A Kaggle competition project in which my model achieved an RMSE of 1.51 (public) and 1.31 (private), ranking 1st on the public leaderboard and 2nd on the private leaderboard.

Melanoma Segmentation

An implementation of U-Net for image segmentation on dermoscopic lesion images from the ISIC 2018 competition.


Plant Disease Detection

Plant disease detection using an unsupervised learning framework (SimCLRv2) with some modifications. The training dataset contains 80,000 images, and the final validation accuracy is 99.6%.

Footprint Identification

An end-to-end machine learning solution that identifies the footprints of wild animals. This is a multi-class classification problem, and the model runs on Android devices. A GAN is used for image generation.

 

 

Reinforcement Learning

The Inverted Pendulum

Using reinforcement learning to learn to balance a simulated pole. Implemented in NumPy.

Lunar Landing

Lunar landing using OpenAI Gym.

Gaming

Pacman

The expectimax search algorithm applied to gaming, used to maximize expected utility. This is useful for modelling environments with adversarial agents.

Self driving car - Noisy sensor

A noisy car tracker that computes the posterior distribution (Gaussian) of the actual distance of surrounding cars.
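As a rough illustration of the noisy-sensor idea, the sketch below updates a Gaussian belief about one car's distance from a single noisy reading. It assumes a one-dimensional state, a Gaussian prior, and Gaussian sensor noise with known variance; the function name `gaussian_posterior` and all the numbers are hypothetical and are not taken from the project itself.

```python
def gaussian_posterior(prior_mean, prior_var, observation, noise_var):
    """Return the posterior mean and variance of the true distance.

    Assumes a Gaussian prior and a sensor reading equal to the true
    distance plus zero-mean Gaussian noise (illustrative assumption).
    """
    # The gain weighs the reading against the prior by their variances.
    gain = prior_var / (prior_var + noise_var)
    post_mean = prior_mean + gain * (observation - prior_mean)
    post_var = (1.0 - gain) * prior_var
    return post_mean, post_var


# Hypothetical prior belief and sensor readings, for illustration only.
mean, var = 10.0, 4.0
for reading in [12.0, 11.5, 12.5]:
    mean, var = gaussian_posterior(mean, var, reading, noise_var=1.0)
```

Each reading pulls the estimate toward the observed value and shrinks the variance, so the belief tightens as more readings arrive.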

 

 

Robotics

Rubik’s Cube Solving Robot

A 3D-printed Rubik’s Cube solving machine coded in Python.

 

 

More Machine Learning Projects

Image Compression (K-Means)

An implementation of K-Means for image compression.

Neural Network in plain NumPy

A deep neural network implemented in plain NumPy without a deep learning framework.

ICA for Cocktail Party Problem

Independent Component Analysis applied to the cocktail party problem.
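For the K-Means image compression project listed above, a minimal NumPy sketch of the general idea is shown here: cluster the pixel colors, then replace each pixel with its cluster centroid. The function name `kmeans_compress`, the default parameters, and the random initialization are illustrative assumptions, not the project's actual code.

```python
import numpy as np


def kmeans_compress(image, k=8, n_iters=10, seed=0):
    """Quantize an (H, W, C) image to at most k colors with plain K-Means."""
    pixels = image.reshape(-1, image.shape[-1]).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Initialize centroids from k randomly chosen pixels (needs H*W >= k).
    centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()

    def assign(points, centers):
        # Squared Euclidean distance from every pixel to every centroid.
        dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        return dists.argmin(axis=1)

    for _ in range(n_iters):
        labels = assign(pixels, centroids)
        # Move each centroid to the mean of the pixels assigned to it.
        for j in range(k):
            members = pixels[labels == j]
            if len(members) > 0:
                centroids[j] = members.mean(axis=0)

    labels = assign(pixels, centroids)
    # Every pixel is replaced by the color of its nearest centroid.
    compressed = centroids[labels].reshape(image.shape)
    return compressed, centroids
```

The compressed image can then be stored as one small palette plus a per-pixel index, which is where the space saving comes from. The distance computation here builds a full pixels-by-centroids array, so a large image would need to be processed in chunks.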


 

 

Time Series

Carbon Dioxide Concentration Analysis

Analyzing monthly carbon dioxide concentration in ppm (parts per million) at the Mauna Loa Observatory from 1959 to 1997.

U.S. Traffic Fatalities Analysis

An analysis of the question “Do changes in traffic laws affect traffic fatalities?” using a driving dataset that covers 25 years of changes in various state drunk driving, seat belt, and speed limit laws.