Preetham Ganesh

About

Hey, I'm Preetham Ganesh, a Search Language Specialist at Qualitest supporting Google in Austin, Texas. I recently graduated from the University of Texas at Arlington with a Master of Science in Computer Science. I have a strong research background in Machine Learning (ML) and Natural Language Processing (NLP).

I worked on a thesis about developing a proof-of-concept application for converting Sentence-based American Sign Language (ASL) videos to English language speech using Attention-based Sequence-to-Sequence architectures. I developed a state-of-the-art model for the ASL Video-to-Gloss module with a Top-5 Accuracy of 98%.

Before my Master's Degree, I graduated with First Class from Amrita Vishwa Vidyapeetham with a Bachelor of Technology in Computer Science and Engineering. I also won the Outstanding Student Award in April 2019 for my academic research work and leadership responsibilities.

I'm currently seeking full-time opportunities in Data Science and Machine Learning. If you have any exciting opportunities, you can contact me directly at preetham.ganesh2021@gmail.com.

Education

Master of Science in Computer Science

University of Texas at Arlington
Arlington, TX
August 2019 - May 2021

Master's Thesis: Continuous American Sign Language Translation with English Speech Synthesis using Encoder-decoder Approach
Coursework: Computer Vision, Special Topics in Intelligent Systems, Machine Learning, Data Analysis & Modeling Techniques, Advanced Algorithms, Neural Networks, Data Mining.

Bachelor of Technology in Computer Science and Engineering

Amrita Vishwa Vidyapeetham
Coimbatore, India
July 2015 - April 2019

Bachelor's Thesis: Forecast of Rainfall Quantity and its Variation using Environmental Features.
Awards: Outstanding Student Award 2019, Department of Computer Science & Engineering
Coursework: Intelligent Systems, Natural Language Processing, Software Engineering, Database Management System

Experience

Search Language Specialist

Qualitest, supporting Google
Austin, TX
October 2021 - Present

Graduate Student Researcher

Vision-Learning-Mining Research Lab
University of Texas at Arlington
Arlington, TX
February 2020 - May 2021

Developed a proof-of-concept application for translating Sentence-based ASL to English language Speech under Prof. Vassilis Athitsos.
Deployed 4 modules: Video-to-Gloss, Gloss-to-Grapheme, Grapheme-to-Phoneme, and Phoneme-to-Spectrogram.
Extracted human and hand pose keypoints from the videos, improved extraction efficiency by converting the models to PyTorch, and pre-processed the keypoints.
Implemented Attention-based Seq2Seq & Transformer architectures for training all models & performed hyper-parameter tuning.
Video-to-Gloss model achieved a state-of-the-art Top-5 accuracy of 98%.
Tech used: TensorFlow, OpenCV, SpaCy, PyTorch

DOI PDF GitHub

Undergraduate Student Researcher

Amrita Vishwa Vidyapeetham
Coimbatore, India
June 2018 - July 2019

Built an application to predict rainfall in Indian Districts using district-wise location-based analysis under Asst. Prof. Dayanand Vinod.
Modeled District & State level rainfall data using regression algorithms such as Decision Tree, Polynomial, Random Forest & XGBoost.
Combined the results using a Stacking Ensemble method, which achieved an Explained Variance Score (EVS) of 91.1.
Tech used: Pandas, NumPy, SciPy, Scikit-Learn, Collections, Matplotlib, Itertools

PDF GitHub

Programming Language Skills

Python: 90%
R: 80%
Java: 75%
C: 70%
C++: 60%
SQL: 60%
HTML: 60%
CSS: 60%
JavaScript: 60%
MATLAB: 60%

Package/Framework Skills

TensorFlow: 80%
Keras: 80%
Scikit-Learn: 90%
NumPy: 80%
SciPy: 70%
OpenCV: 75%
Pandas: 85%
Pickle: 90%
Matplotlib: 85%
Multiprocessing: 70%

Publications

POS Tagging-based Neural Machine Translation System for European Languages using Transformers

Authors: Preetham Ganesh, Bharat S. Rawal, Alexander Peter, Andi Giri
Abstract: The interaction between human beings has always faced different kinds of difficulties. One of those difficulties is the language barrier. It would be a tedious task for someone to learn all the syllables in a new language in a short period and converse with a native speaker without grammatical errors. Moreover, having a language translator at all times would be intrusive and expensive. We propose a novel approach to Neural Machine Translation (NMT) system using inter-language word similarity-based model training and Part-Of-Speech (POS) Tagging based model testing. We compare these approaches using two classical architectures: Luong Attention-based Sequence-to-Sequence architecture and Transformer based model. The sentences for the Luong Attention-based Sequence-to-Sequence were tokenized using SentencePiece tokenizer. The sentences for the Transformer model were tokenized using Subword Text Encoder. Three European languages were selected for modeling, namely, Spanish, French, and German. The datasets were downloaded from multiple sources such as Europarl Corpus, Paracrawl Corpus, and Tatoeba Project Corpus. Sparse Categorical CrossEntropy was the evaluation metric during the training stage, and during the testing stage, the Bilingual Evaluation Understudy (BLEU) Score, Precision Score, and Metric for Evaluation of Translation with Explicit Ordering (METEOR) score were the evaluation metrics.

DOI PDF GitHub Video Demo

Personalized System for Human Gym Activity Recognition using an RGB Camera

Scopus Indexed

Authors: Preetham Ganesh, Reza Etemadi Idgahi, Chinmaya Basavanahally Venkatesh, Ashwin Ramesh Babu, Maria Kyrarini
Abstract: Human Activity Recognition is one of the most researched topics in the field of computer vision. It is a powerful tool mainly used to aid medical systems, smart homes, surveillance, and many more areas. In this paper, an RGB camera was used to record gym activities such as push-up, squat, plank, forward lunge, and sit-up. Features were extracted from the recorded videos and were fed into classification algorithms such as Support Vector Machines, Decision Tree classifier, K-Nearest Neighbor classifier, and Random Forest classifier. The developed models were evaluated using metrics such as accuracy, balanced accuracy, precision score, recall score, and F1 score. The Random Forest Classifier outperformed all the other attempted methods with an accuracy of 98.98%. A repetition counter was developed, which splits workouts based on local minima analysis, and correctness of the workout was calculated for each skeletal point using dynamic time warping. An interactive android application was built for the user to gain insights on the performed workouts.

DOI PDF GitHub PPT YouTube

Estimation of Rainfall Quantity using Hybrid Ensemble Regression

Scopus Indexed

Authors: Preetham Ganesh, Harsha Vardhini Vasu, Dayanand Vinod
Abstract: Accurate prediction of rainfall in a geographical region has always been a challenge to the researchers. In this paper, ensemble methods such as bagging and boosting are used to predict rainfall level in districts belonging to Tamil Nadu, India. The Ensemble Regression models are optimised by tuning the parameters such as the number of estimators, base estimator and maximum depth. For evaluating the developed models, performance measures such as Mean Squared Error and Explained Variance Score were used. Based on the analysis, Bagging Regression produced better results than the other models after optimisation, but the difference between the performance of the models was very less. Hence, the prediction of the ensemble regression models is used instead of the features to predict rainfall, where two or more models are used at a time in different combinations for this purpose. The models are combined in different combinations using ensemble techniques such as Simple Averaging, Blending and Stacking. The developed models are compared using graphical analysis, where the comparison is based on actual rainfall values.

DOI PDF GitHub PPT YouTube

Forecast of Rainfall Quantity and its Variation using Environmental Features

Scopus Indexed

Authors: Preetham Ganesh, Harsha Vardhini Vasu, Dayanand Vinod
Abstract: Rainfall plays a crucial role in the lives of an ordinary man. Developing a prediction model that captures sudden fluctuations in rainfall has always been a challenging task. The paper aims at developing three models which predict monthly rainfall for all districts in Tamil Nadu, India and also drawing a district-wise comparison among them to find the best model for prediction. The models developed are District-Specific Model, Cluster-Based Model and Generic-Regression Model. The District-Specific Model trains on data from a particular district, the Cluster-Based Model groups districts based on the climatic conditions and trains on data from a particular cluster and the Generic-Regression Model trains on combined data from all the districts. The paper also aims at finding the monthly variation of rainfall across geographical regions.

DOI PDF GitHub PPT

Juxtaposition on Classifiers Modeling Hepatitis Diagnosis Data

Authors: Preetham Ganesh, Harsha Vardhini Vasu, Keerthanna Govindarajan Santhakumar, Raakheshsubhash Arumuga Rajan, and Bindu K R
Abstract: Data Mining in medical data is very popular in recent research. Approximately 2% of the world population, i.e., 3.9 million people are infected by Hepatitis C. This paper is an investigative study on the comparison of classification models - SVM (Support Vector Machine), Random Forest, Decision Tree, Logistic Regression and Naive Bayes - modeling Hepatitis C Data based on various performance measures - Accuracy, Balanced Accuracy, Precision, Recall, F1-Measure, MCC (Matthews Correlation Coefficient) etc using R Programming Language. On normalizing the numerical attributes using Z-score Normalization and using holdout method for Train Test data split of 80% - 20%, the result shows that Random Forest outperforms the other classifiers with an accuracy of 90.7 %, followed by SVM, Logistic Regression, Decision Tree and Naive Bayes.

DOI PDF GitHub PPT

Research Projects

SLAM & Cooperative Path Planning in Multi-robotic Dynamic Environment

University of Texas at Arlington
February 2020 - July 2020

Developed an application (using Python) for multiple robots to achieve a goal, and compared their performance in cooperative and non-cooperative environments.
The comparison was drawn across variables such as the size of the environment, the number of obstacles, and obstacle dynamicity.
Implemented modified A*, D++, and Simulated Annealing for modeling the robots.

Efficacy of MBT in revamping Stress and Self-Esteem among final year college students involved in placements

Amrita Vishwa Vidyapeetham
April 2019 - January 2020

Investigated the impact of Mindfulness-Based Training on reducing stress and increasing self-esteem among final-year college students.
Performed analysis (using SPSS 19.0) on the data obtained from students during the intervention.

Projects

Captioning of Images using Luong Attention

University of Texas at Arlington
November 2020

Architected a TensorFlow-based application for predicting captions of an image given by the user.
Used the SentencePiece tokenizer to tokenize the target captions, the InceptionV3 network to extract features from the image, Luong Attention for extracting nuances from the target captions, and an LSTM for predicting the sequence of words.
Reduced the model's loss to 0.628 on the test set.
Tech used: TensorFlow, OpenCV, NumPy, Scikit-Learn, Pandas, Keras, Flask.

GitHub Medium Video Demo

COVID-19 Social Distancing Violation Detection using Neural Networks

University of Texas at Arlington
September 2020

Built an application for detecting social distancing violations in a given area or a user-provided video.
Used the YOLOv3 Object Detection Neural Network to detect people in a frame, used the SciPy Spatial Distance function to calculate the real-world distance between 2 bounding boxes, and generated an alert when the distance is less than 6 ft.
Tech used: TensorFlow, OpenCV, SciPy, NumPy.

COVID-19 Face Mask Detection using Neural Networks

University of Texas at Arlington
August 2020

Architected a face-detection module using the ImageNet weights for detecting faces in a video.
Converted detected faces into 128-dimensional encodings, and used face_encoding to compare faces and extract distinct faces.
Developed a face mask classifier using Convolutional Neural Network. The model produced an accuracy of 0.86.
Tech used: Face_Recognition, OpenCV, NumPy

GitHub

Character-based Sign Language Recognition using MNIST dataset

University of Texas at Arlington
March 2020

Implemented a letter-based Sign Language Recognition model using a CNN with 2 convolution layers (with max pooling), 2 fully connected layers and a SoftMax layer with 25 outputs.
The CNN model produced an accuracy of 88.62% on the given test set.
Tech used: TensorFlow, Keras, OpenCV, SciPy, NumPy.

Smart Plant Monitoring System

Amrita Vishwa Vidyapeetham
October 2018 - December 2018

Monitored plant growth using temperature, humidity, light intensity, and soil moisture sensors that send data to the ThingSpeak cloud, which sends an SMS alert to the user if the soil moisture value drops below 450.
Tech used: Arduino, NodeMCU and Sensors (DHT11, TEMT6000, YL-38 + YL-69)

Organ Donor Management System

Amrita Vishwa Vidyapeetham
July 2018 - December 2018

Developed a web application to manage data for an Organ Donor Management organization, automating the allocation of organs to requests based on priority and emergency.
Developed the frontend and business logic for the alert system and automatic allocation feature using HTML, JSP, JavaScript and CSS; validated entered information using JavaScript.
Tech used: HTML, JSP, CSS, JavaScript, Java, XML, Apache Tomcat and Oracle SQL

Online Payment Wallet

Amrita Vishwa Vidyapeetham
February 2018 - May 2018

Created a web application for credit and debit card payments using a single virtual account connected to the user's bank account.
Developed the Front End and Business Logic for the credit and debit card payment modules using HTML, JSP and CSS.
Tech used: HTML, JSP, CSS, Apache Tomcat and Oracle SQL

Analysis on Performance of Players and Teams in IPL Matches

Amrita Vishwa Vidyapeetham
February 2018 - May 2018

Analyzed performance of players and teams in IPL using Mathematical and Statistical Analysis in Python.
Statistically analyzed the players' performance in batting and bowling.
Tech used: Pandas, Matplotlib, NumPy, Urllib, CSV and BeautifulSoup

Smart HR Manager

Amrita Vishwa Vidyapeetham
August 2017 - December 2017

Created a web application to aid the HR manager in a company with recruiting, termination and increments.
Developed the Front End and Business Logic for the Employee Recruitment and Termination modules using HTML, JSP, JavaScript and CSS.
Tech used: HTML, JSP, CSS, JavaScript, Apache Tomcat and Oracle SQL

Leadership

Chairman

ASCII Technical Club
Amrita Vishwa Vidyapeetham
Coimbatore, India
June 2018 - April 2019

Led a team of 32 members to organize events such as technical quizzes, workshops, and gaming events, and to publish newsletters.
Improved the student turnout for events by 75% (compared to the previous year) and received an average event satisfaction score of 85% across all events.

Event Manager

Anokha National Techfest
Amrita Vishwa Vidyapeetham
Coimbatore, India
December 2017 - February 2018

Organized a Machine Learning and IoT Workshop for 61 participants, where MATLAB was used to teach the participants the concepts.
82% of the participants gave a 100% satisfaction rating for the concepts taught and the hospitality provided at the event.