A Provably Effective Method for Pruning Experts in Fine-tuned Sparse Mixture-of-ExpertsMohammed Nowaz Rabbani ChowdhuryMeng Wanget al.2024ICML 2024Conference paper
Improved Techniques for Quantizing Deep Networks with Adaptive Bit-WidthsXimeng SunRameswar Pandaet al.2024WACV 2024Conference paper
Deep Compression of Pre-trained Transformer ModelsNaigang WangChi-Chun Liuet al.2022NeurIPS 2022Conference paper
A 7-nm Four-Core Mixed-Precision AI Chip with 26.2-TFLOPS Hybrid-FP8 Training, 104.9-TOPS INT4 Inference, and Workload-Aware ThrottlingSae Kyu LeeAnkur Agrawalet al.2021IEEE JSSCPaper
4-bit quantization of LSTM-based speech recognition modelsAndrea FasoliChia-Yu Chenet al.2021INTERSPEECH 2021Conference paper
Hardware-Aware Neural Architecture Search: Survey and TaxonomyHadjer BenmezianeKaoutar El Maghraouiet al.2021IJCAI 2021Survey paper
RaPiD: AI Accelerator for Ultra-Low Precision Training and InferenceSwagath VenkataramaniVijayalakshmi Srinivasanet al.2021ISCA 2021Conference paper
A 7nm 4-Core AI Chip with 25.6TFLOPS Hybrid FP8 Training, 102.4TOPS INT4 Inference and Workload-Aware ThrottlingAnkur AgrawalSaekyu Leeet al.2021ISSCC 2021Conference paper
Ultra-Low Precision 4-bit Training of Deep Neural NetworksXiao SunNaigang Wanget al.2020NeurIPS 2020Conference paper
ScaleCom: Scalable Sparsified Gradient Compression for Communication-Efficient Distributed Training Chia-Yu ChenJiamin Niet al.2020NeurIPS 2020Conference paper
15 Nov 2021US11174159Micro-electromechanical Device Having A Soft Magnetic Material Electrolessly Deposited On A Metal Layer
08 Nov 2021US11170933Stress Management Scheme For Fabricating Thick Magnetic Films Of An Inductor Yoke Arrangement
30 Aug 2021US11107878High Resistivity Iron-based, Thermally Stable Magnetic Material For On-chip Integrated Inductors