Sudhanshu Agrawal

ML Research Engineer at Qualcomm AI Research

Hi everyone! My name is Sudhanshu and I'm an ML Research Engineer at Qualcomm AI Research. I work on LLM efficiency research under the supervision of Mingu Lee. We invent new speculative decoding algorithms for edge applications, making LLMs fast enough to run locally on your phone or laptop! I graduated from UCLA with a double major in Computer Science and Mathematics. I was fortunate to conduct research with Professor Aditya Grover on generative modeling and with Professor Levon Nurbekyan and Professor Samy Wu Fung on mean-field games. In my free time, I like to surf, play the guitar, and sing. I love watching movies and enjoy practically every genre of music. Feel free to reach out if you'd like to chat!

Experience

ML Research Engineer

Qualcomm AI Research
LLM efficiency, speculative decoding, efficient architectures, diffusion LLMs.
2023 - Present

ML Engineering Intern

Qualcomm AI Research
Profiling tools for deep learning applications.
Summer 2022

ML Engineering Intern

SonicJobs
Synthetic computer vision dataset creation.
Summer 2021

ML and Data Science Intern

Reliance Jio
Hydrocarbon property prediction using classical ML.
Summer 2020

ML Intern

Julia Computing Inc
Contributions to the Flux Model Zoo library.
Summer 2019

Education

Bachelor of Science, Computer Science

University of California, Los Angeles (UCLA)
2019 - 2023
Magna cum laude

Bachelor of Science, Mathematics

University of California, Los Angeles (UCLA)
2019 - 2023
Cum laude

ISC 12th Grade

Mallya Aditi International School, Bengaluru
2017 - 2019
National Rank 4

Publications

arXiv Preprint

Spiffy: Multiplying Diffusion LLM Acceleration via Lossless Speculative Decoding

Novel speculative decoding algorithm to accelerate diffusion LLMs.
Sudhanshu Agrawal, Risheek Garrepalli, Raghavv Goel, Mingu Lee, Christopher Lott, Fatih Porikli
2025

ICML ES-FoMo Workshop, 2025

VOCABTRIM: Vocabulary Pruning for Efficient Speculative Decoding in LLMs

Reducing the vocabulary size of the draft model to reduce memory-bandwidth overhead during speculative decoding.
Raghavv Goel, Sudhanshu Agrawal, et al.
2025

NeurIPS ENLSP-IV Workshop, 2024

AdaEDL: Early Draft Stopping for Speculative Decoding of Large Language Models via an Entropy-based Lower Bound on Token Acceptance Probability

Early draft stopping using entropy for efficient speculative decoding.
Sudhanshu Agrawal, Wonseok Jeon, Mingu Lee
2024

NeurIPS, 2023

ExPT: Synthetic Pretraining for Few-Shot Experimental Design

Foundation model architecture for in-context adaptation to experimental design objectives.
Tung Nguyen, Sudhanshu Agrawal, Aditya Grover
2023

Journal of Computational Physics, 2022

Random Features for High-Dimensional Nonlocal Mean-Field Games

Using random-feature kernels to model mean-field interactions efficiently in high-dimensional settings.
Sudhanshu Agrawal*, Wonjun Lee*, Samy Wu Fung, Levon Nurbekyan
2022

Patents

The following patent applications were filed between 2024 and 2025 and are hence not yet published. They relate to 7 distinct inventions, with multiple US and global patent applications pending.

Invited Talks, Judgeships, and Reviewing

Blog

Medium

Generative AI for Experimental Design

Using generative modeling to solve offline black-box optimization problems.
2024

Medium

100-Dimensional Games

Understanding and solving nonlocal mean-field games.
2023

FluxML.ai

Simulating The Motion of Charged Bodies

Simulating an N-body problem using gradient descent.
2023