Approximate Computing and the Efficient Machine Learning Expedition. Jörg Henkel, Hai Li, et al. ICCAD 2022. Conference paper.
OnSRAM: Efficient Inter-Node On-Chip Scratchpad Management in Deep Learning Accelerators. Subhankar Pal, Swagath Venkataramani, et al. Transactions on Embedded Computing Systems, 2022. Paper.
Accelerating Inference and Language Model Fusion of Recurrent Neural Network Transducers via End-to-End 4-bit Quantization. Andrea Fasoli, Chia-Yu Chen, et al. INTERSPEECH 2022. Conference paper.
Accelerating DNN Training Through Selective Localized Learning. Sarada Krithivasan, Sanchari Sen, et al. Frontiers in Neuroscience, 2022. Paper.
A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware Throttling. Sae Kyu Lee, Ankur Agrawal, et al. IEEE Journal of Solid-State Circuits, 2021. Paper.
4-bit Quantization of LSTM-Based Speech Recognition Models. Andrea Fasoli, Chia-Yu Chen, et al. INTERSPEECH 2021. Conference paper.
Efficacy of Pruning in Ultra-Low Precision DNNs. Sanchari Sen, Swagath Venkataramani, et al. ISLPED 2021. Conference paper.
RaPiD: AI Accelerator for Ultra-Low Precision Training and Inference. Swagath Venkataramani, Vijayalakshmi Srinivasan, et al. ISCA 2021. Conference paper.
Efficient Management of Scratch-Pad Memories in Deep Learning Accelerators. Subhankar Pal, Swagath Venkataramani, et al. ISPASS 2021. Conference paper.
A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware Throttling. Ankur Agrawal, Sae Kyu Lee, et al. ISSCC 2021. Conference paper.
US12236338: Single Function to Perform Combined Matrix Multiplication and Bias Add Operations. Issued 24 Feb 2025.
US12141513: Method to Map Convolutional Layers of Deep Neural Network on a Plurality of Processing Elements with SIMD Execution Units, Private Memories, and Connected as a 2D Systolic Processor Array. Issued 11 Nov 2024.