About

I am a versatile Machine Learning Engineer with extensive research experience and a proven track record in both startups and large organizations. My journey includes research at Northeastern University and the Tata Institute of Fundamental Research, where I tackled complex problems in urban resilience, healthcare analytics, and particle physics. In parallel, I have thrived in fast-paced startup environments, designing scalable AI solutions for real-world impact. My work spans diverse domains, from developing predictive health models for Apriqot to analyzing the resilience of global urban metro networks and building intelligent question-answering systems. I excel at transforming complex data into actionable insights using deep learning, natural language processing, and network science. I aim to continue developing AI solutions that drive meaningful outcomes for organizations and communities.

Phone: +1 (617) 516-9182

Mail: mukherjee.o@northeastern.edu

Technical Skills

Data Science and Machine Learning

OpenCV
GNNs (Graph Neural Networks)
Transformers
NetworkX
PyTorch
TensorFlow
LLMs
MLflow
ArcGIS
Pandas
Neural Networks

Programming Languages

Python
R
Java
C/C++
SQL
Dart
Bash

Infrastructure and Tools

Git
MySQL
MongoDB
AWS
GCP
Kubernetes
Docker
Power BI

Web/Desktop Development

HTML
CSS
JavaScript
Django
Flask
NodeJS
ReactJS

Professional Experience

Machine Learning Engineer

July 2024-Jan 2025

Apriqot, Portland, ME

❁ Engineered disease-based and food-insecurity predictive models to produce geographically and demographically downscaled population health estimates.

❁ Leveraged agent-based modeling and small area estimation to derive precise health indicators that help key healthcare stakeholders, such as Maine CDC and MaineHealth, make major strategic decisions.

Data Science Researcher

Jan 2024-Present

Institute for Experiential AI, Boston, MA

❁ Developing a comprehensive framework for integrating deep learning models with Boston's urban infrastructure (MBTA) and disaster response strategies, funded by the Department of Homeland Security.

❁ Conducted resilience analysis of 45 global urban metro networks using a topology-driven framework, identifying key structural attributes that influence network robustness and recovery with a computationally efficient resilience modeling approach.

❁ Improving the reliability of resilient systems by 25% by building efficient GNNs and Transformers that capture both temporal dynamics and spatial complexities in urban transportation networks and infrastructure.

Machine Learning Researcher

August 2022-July 2023

Tata Institute of Fundamental Research (TIFR), Mumbai, India

❁ Initiated a cross-disciplinary project to integrate advanced graph-based machine learning methods with particle physics research, significantly advancing the computational analysis of subatomic phenomena.

❁ Constructed a regression model based on Graph Neural Networks (GNNs) that captures intricate relationships within electron trajectories to predict electron energies, achieving an RMSE of 11.62.

Software Developer

April 2022–October 2022

ViaTech Media, Mumbai, India

❁ Tested client applications for a restaurant, a salon, and other local businesses. Led a newly formed unit to improve the existing testing framework using Java, Python, and a microservices architecture, resulting in higher customer satisfaction and a larger test suite.

❁ Designed several components for the backend of clients' applications and websites using Flutter's UI structure, fast state management, and latest packages and modules, ensuring the smooth functioning of the platform.

Application Developer

January 2022-March 2022

Matchbuddy, Bangalore, India

❁ Added features for storing and displaying data in Matchbuddy's official application using Firebase and its packages, making the match-making process more efficient.

❁ Made daily changes to the application's UI using tools such as FlutterFlow and Figma.

Software Developer

December 2021-January 2022

TechShots, Delhi, India

❁ Used Firebase, PHP, and Flutter to improve the performance of the TechShots app's Firebase database and to redesign the frontend for a better user experience.

❁ Spearheaded improvements to the TechShots app's performance and visual appeal, contributing to a significant increase in user engagement and retention on the Google Play Store and the App Store.

Software Developer

November 2021–December 2021

AIM2EXCEL, Haryana, India

❁ Improved the News bank and e-commerce applications at AIM2EXCEL by introducing features such as offline bookmarks and seamless data streaming, and optimized the website UI/UX using Flutter web, improving client satisfaction.

❁ Led user-centric enhancements to the News bank and e-commerce applications at AIM2EXCEL using Flutter web.

Education

Northeastern University, Boston, MA

August 2023-Dec 2025 GPA 3.8/4

Master of Science in Computer Science. Relevant Coursework: Large Language Models, Computer Vision, Data Mining, Database Systems

University of Mumbai, Mumbai, India

August 2019-May 2023 GPA 9.3/10

Bachelor of Engineering in Computer Engineering

Personal and Research Projects

NeuroMarketOps

NeuroMarketOps – Marketing & Inventory Automation

Python, CrewAI, OpenAI API, DuckDuckGo Search, Contextual RAG, FAISS, Streamlit

❁ Designed and deployed a multi-agent orchestration system using CrewAI and OpenAI, with agents for marketing and inventory intelligence.

❁ Achieved 92.7% query correctness using Contextual RAG and real-time data retrieval with DuckDuckGo API.

Face Retrieval Engine

AI Vision-Powered Face Retrieval Engine

Python, OpenCV, FAISS, InsightFace, Streamlit, Google Cloud Run

❁ Built a scalable face-matching system with 97% accuracy, supporting 25+ formats and bulk processing via the Google Drive API.

❁ Deployed a containerized app on Google Cloud Run, capable of matching 1000+ images per session.

AeroGenAI

AeroGenAI – Aviation Intelligence with RAG

MostlyAI, Amazon SageMaker, LLaMA 3.1 8B, S3, EC2, FAISS, Python, Streamlit

❁ Generated synthetic aviation datasets via MostlyAI and fine-tuned LLaMA 3.1 8B for domain-specific query resolution on AWS.

❁ Integrated a RAG system using SageMaker, EC2, and S3 for model hosting and inference.

MEDI-CHAT

MEDI-CHAT: Intelligent Healthcare RAG Agent

PubMed, Transformers, FAISS, Llama-3.2-1B, PyTorch, Streamlit, Hugging Face

❁ Built a medical RAG system with 95%+ accuracy for healthcare queries using PubMed studies and vector search via FAISS.

❁ Chunked and indexed medical PDFs using sentence-transformers and deployed the app on Hugging Face Spaces.

45 Urban Metros

45 Urban Metros Resilience Analysis

Python, NetworkX, Graph Theory, Matplotlib

❁ Conducted resilience analysis of 45 global urban metro networks, identifying key structural attributes influencing network robustness and recovery.

❁ Developed an interpretable, computationally efficient resilience modeling approach based on various centrality measures, enabling rapid assessment of network vulnerabilities.

Gun Detection

Weapon Detection System using Computer Vision

Python, YOLOv8, OpenCV, Twilio, NumPy, PIL

❁ Developed a real-time weapon detection framework using YOLOv8 and OpenCV with 91% accuracy.

❁ Integrated WhatsApp alerts via Twilio for faster response times.

DocuFind.ai

DocuFind.ai – QA System

Python, FireCrawl, FAISS, Transformers, Streamlit

❁ Built a RAG-based system for document & web QA using FAISS & MiniLM, achieving 83.3% accuracy.

❁ Deployed real-time Streamlit UI for interactive querying.

MBTA Threat Detection

Boston MBTA Threat Detection

Python, PyTorch, TensorFlow, NetworkX

❁ Designed a GNN-based threat scoring system for urban transit networks.

❁ Built an interactive GUI for failure simulations & mitigation strategies.

Suicide Detection

Suicide Risk Detection via Transformers

Python, BERT, CNN, LSTM, NLP

❁ Detected suicidal ideation in 232k+ Reddit posts with 96.85% F1-Score using BERT & Word2Vec.

❁ Built a chatbot for early intervention suggestions.

CMS Electron Prediction

Electron Energy Prediction @ CERN

Python, PyTorch, TensorFlow

❁ Used GNNs to predict electron energy in CMS HGCAL experiments, achieving RMSE of 11.62.

❁ Research conducted at TIFR in collaboration with CERN.

Summarization

Text Summarization & Title Generation

Python, LSTM, Transformer, Seq2Seq, PEGASUS

❁ Built summarization models with Bi-LSTM (92.5% ROUGE) and PEGASUS.

❁ Implemented LSTM-based title generator with 81.88% validation accuracy.

CTR Prediction

CTR Prediction with ML Models

Python, XGBoost, LightGBM, Random Forest

❁ Achieved 95% F1-score using ensemble tree models for CTR prediction.

❁ Boosted ad efficiency by 10–30% using one-hot encoding optimizations.

Spotify Data Storytelling

Storytelling with Spotify Data

Tableau, SQL, Python

❁ Visualized Spotify users by country and demographics in Tableau.

❁ Proposed regional ad strategies based on analytical insights.

Publications

❁ Authored "Resilience of Urban Rail Networks Depend on Mesoscale and Connectivity Attributes", currently under review for publication at Nature Cities.

❁ Co-authored "Rail System Threat Analysis" for submission to Risk Analysis for peer review.

❁ Co-authored "A Network Perspective Can Deter Threats to Soft Infrastructure Targets" as a Comment for Nature Computational Science.

❁ Published "Electron Energy Prediction in High-Granularity Calorimeter of the CMS Detector Using Graph Neural Network" in ICT4SD 2024, Springer LNNS. DOI: https://doi.org/10.1007/978-981-97-8591-9_48

❁ Published "CTR Prediction of Advertisements using Decision Trees-Based Algorithms" in iSemantic, IEEE, October 2022. DOI: 10.1109/iSemantic55962.2022.9920363
