Jörg Henkel, Hai Li, et al. "Approximate computing and the efficient machine learning expedition." ICCAD 2022. Conference paper.
Subhankar Pal, Swagath Venkataramani, et al. "OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators." ACM Transactions on Embedded Computing Systems, 2022. Paper.
Andrea Fasoli, Chia-Yu Chen, et al. "Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization." INTERSPEECH 2022. Conference paper.
Sarada Krithivasan, Sanchari Sen, et al. "Accelerating DNN Training Through Selective Localized Learning." Frontiers in Neuroscience, 2022. Paper.
Sae Kyu Lee, Ankur Agrawal, et al. "A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling." IEEE Journal of Solid-State Circuits, 2021. Paper.
Andrea Fasoli, Chia-Yu Chen, et al. "4-bit quantization of LSTM-based speech recognition models." INTERSPEECH 2021. Conference paper.
Sanchari Sen, Swagath Venkataramani, et al. "Efficacy of Pruning in Ultra-Low Precision DNNs." ISLPED 2021. Conference paper.
Swagath Venkataramani, Vijayalakshmi Srinivasan, et al. "RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference." ISCA 2021. Conference paper.
Subhankar Pal, Swagath Venkataramani, et al. "Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators." ISPASS 2021. Conference paper.
Ankur Agrawal, Sae Kyu Lee, et al. "A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling." ISSCC 2021. Conference paper.