Is Finer Better? The Limits of Microscaling Formats in Large Language Models. Andrea Fasoli, Monodeep Kar, et al. ICLR 2026 (conference paper).
Advancing Fluorescence Light Detection and Ranging in Scattering Media with a Physics-Guided Mixture-of-Experts and Evidential Critics. Ismail Erbas, Ferhat Demikiran, et al. NeurIPS 2025 (workshop paper).
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System. Yunhua Fang, Rui Xie, et al. IEEE Computer Architecture Letters, 2025 (paper).
CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization. Yanxia Deng, Aozhong Zhang, et al. TMLR, 2025 (paper).
Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization. Wei Liu, Anweshit Panda, et al. TMLR, 2025 (paper).
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization. Aozhong Zhang, Zi Yang, et al. IEEE Access, 2025 (paper).
Generative AI Through CAS Lens: An Integrated Overview of Algorithmic Optimizations, Architectural Advances, and Automated Designs. Chuan Zhang, You You, et al. IEEE JETCAS, 2025 (paper).
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging. Ismail Erbas, Vikas Pandey, et al. NeurIPS 2024 (workshop paper).
A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-Experts. Mohammed Nowaz Rabbani Chowdhury, Meng Wang, et al. ICML 2024 (conference paper).
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-Widths. Ximeng Sun, Rameswar Panda, et al. WACV 2024 (conference paper).
US12240753 (03 Mar 2025): Micro-electromechanical Device Having A Soft Magnetic Material Electrolessly Deposited On A Palladium Layer Coated Metal Beam.
US12175359 (23 Dec 2024): Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update.
JP7525237 (21 Jul 2024): Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update.
Pin-Yu Chen, Principal Research Scientist and Manager; Chief Scientist, RPI-IBM AI Research Collaboration.