Hi, my name is

Gaurav Najpande

MS in Data Science @ Arizona State University
Building reliable AI systems for structured data, multimodal reasoning, and large-scale evaluation.

About Me

I'm a Graduate Research Assistant at Arizona State University's CoRAL Lab, where I work on AI systems that reason over complex, real-world data. My interests span natural language processing, multimodal reasoning, structured data understanding, and scalable machine learning infrastructure.

I'm a tinkerer and tech enthusiast who loves learning, experimenting, and solving problems. I enjoy building end-to-end systems that connect research ideas with practical engineering, with the goal of making AI systems more reliable, useful, and ready for real-world use.

Education

Master of Science in Data Science, Analytics and Engineering
Arizona State University
2024 – 2026 Tempe, AZ, USA

Bachelor of Engineering in Electronics (Minor in Computer Science)
Shri Ramdeobaba College of Engineering & Management
2019 – 2023 Nagpur, India

Technical Skills

Programming & Tools
Python, C++, SQL, Git, Linux, Docker, Kubernetes, Bash
AI, LLMs & Multimodal Systems
PyTorch, Transformers, LLMs, VLMs, Multimodal Models, SFT, GRPO, LoRA, Model Evaluation
Data Systems & Pipelines
Apache Spark, Spark SQL, ETL Pipelines, REST APIs, Data Transformation, Distributed Processing
Research Engineering
Large-Scale Evaluation, Benchmarking, Error Analysis, Table Reasoning, Visual Grounding, Agentic Pipelines
Databases
PostgreSQL, MySQL, SQLite
Experimentation & Visualisation
pandas, NumPy, Matplotlib, Seaborn, Tableau, Power BI, LaTeX

Professional Experience

Graduate Research Assistant
CoRAL Lab, Arizona State University
May 2025 – Present Tempe, AZ

Conducting research in NLP and multimodal reasoning, developing reproducible training, evaluation, and benchmarking workflows to support large-scale experimentation and peer-reviewed research.

Engineered agentic data transformation pipelines standardising 14,000+ semi-structured tables into SQL-ready schemas for scalable table-based question answering across benchmarks.

Architected distributed LLM inference and evaluation infrastructure executing 80,000+ API runs and processing 100,000+ outputs across SoTA LLM/VLM architectures on GCP and multi-GPU/HPC clusters.

Learning Technology Communications Aide
TLC, ASU School of Life Sciences
Mar 2025 – Present Tempe, AZ

Designed and managed Salesforce CRM architecture for 1,000+ users, implementing 15+ custom fields, validation rules, and workflow automations to sustain a 95% first-response resolution rate.

Developed 10+ real-time dashboards and integrated Salesforce with web and knowledge-base systems, tracking 500+ monthly cases and reducing manual coordination by 30%.

Project Engineer
Technoventor Innovations Pvt. Ltd.
May 2023 – May 2024 IIT Bombay, Mumbai, India

Led data-driven engineering operations for 2,800+ users across 100+ CNC and prototyping systems, analysing utilisation and maintenance metrics to reduce downtime by 25%.

Built issue-tracking and reporting workflows across engineering operations, improving coordination efficiency and reducing manual process overhead by 30%.

Lab Engineer (Internship)
Technoventor Innovations Pvt. Ltd.
Jan 2023 – May 2023 IIT Bombay, Mumbai, India

Maintained and troubleshot 3D printers, CO₂ lasers, and lathes, ensuring smooth lab operations.

Mentored 1,400+ students on project fabrication and technical challenges using rapid prototyping tools.

Collaborated with faculty and researchers to accelerate project timelines and improve research outcomes.

Intern & Community Volunteer
FabLab Nagpur
Jan 2022 – Jan 2023 Nagpur, India

Developed smart switch systems, image recognition modules, and a CNC 2D draw bot.

Enhanced inventory and data management processes, improving resource utilisation.

Data Science Intern
SmartKnower
Mar 2021 – Jun 2021 Remote

Processed and analysed COVID-19 data using pandas, NumPy, scikit-learn, and Matplotlib, and visualised trends in Tableau.

Trained an image recognition model with OpenCV and scikit-learn, achieving 76% accuracy.

Publications

DRAGON: A Benchmark for Evidence-Grounded Visual Reasoning over Diagrams
2026

"A benchmark for evidence-grounded diagram reasoning with 11,664 annotated QA instances across charts, maps, infographics, circuits, and scientific diagrams."

Released a 2,445-instance human-verified test set and evaluation framework for VLM grounding.

QUIETT: Query-Independent Table Transformation for Robust Table Reasoning
2026

"A query-independent table transformation framework that converts semi-structured tables into lossless, SQL-ready canonical representations."

Showed consistent cross-model improvements across four major QA benchmarks.

Wearable, Non-invasive, Nano-material based Glucose Sensing: A Review
2022

"A comprehensive review of wearable, non-invasive glucose sensing technologies using nano-materials."

Analysed various sensing mechanisms and material properties for next-generation health monitoring.

Featured Projects

Logic Programming
Automated Warehouse Planner

Multi-agent pathfinding and task allocation using Answer Set Programming (ASP).

ASP, Clingo, Python
Data Engineering
Yelp Data Intelligence

Distributed Spark SQL ETL pipelines for regional and category-level KPI generation.

Apache Spark, Spark SQL, PostgreSQL
Deep Learning
Transformer NMT

English-Hindi translation system built with PyTorch and custom tokenisation.

PyTorch, Transformers, NLP
ML Application
Roommate Recommendation

Big Five personality-based recommendation engine for student housing.

Python, scikit-learn, Data Analysis
Time Series
Stock Forecast Engine

Time-series forecasting of market trends using RNN and LSTM models.

Python, RNN, LSTMs
Healthcare AI
CancerPredictML

Diagnostic ML system for early cancer detection using clinical datasets.

Python, Healthcare AI, Classification
Sports Analytics
WASP-IPL Prediction

Cricket match outcome prediction system using historical IPL data.

Python, Data Mining, Sports Analytics
Computer Vision
ImageClassify CNN

Computer vision system for multi-class image recognition.

CNN, TensorFlow, OpenCV
Deep Learning
Churn Analysis ANN

Customer retention prediction using Artificial Neural Networks.

ANN, Deep Learning, Keras
Machine Learning
Predictive Modelling Collection

ML models predicting house pricing, TV pricing, and salary values using regression techniques.

Python, scikit-learn, Matplotlib, Jupyter

News Feed

May 2025
Joined CoRAL Lab as a Graduate Research Assistant at ASU.
Mar 2025
Started as Learning Technology Communications Aide at TLC, ASU School of Life Sciences.
Aug 2024
Started Master's in Data Science at Arizona State University.
May 2024
Completed tenure as Project Engineer at Technoventor Innovations.
Dec 2023
Promoted to Team Lead at Technoventor Innovations.
May 2023
Started working as Project Engineer at Technoventor Innovations.

Get In Touch

I'm currently looking for new opportunities and collaborations. Whether you have a question or just want to say hi, feel free to reach out!