Frayed RoPE and Long Inputs: A Geometric Perspective. Davis Wertheimer, Aozhong Zhang, et al. 2026. ICLR 2026. Conference paper.
Universal Position Interpolation: Unified Context Scaling for Hybrid Mamba-Transformer Models. Haochen Shen, Davis Wertheimer, et al. 2026. ICLR 2026. Conference paper.
Is Finer Better? The Limits of Microscaling Formats in Large Language Models. Andrea Fasoli, Monodeep Kar, et al. 2026. ICLR 2026. Conference paper.
Advancing Fluorescence Light Detection and Ranging in Scattering Media with a Physics-Guided Mixture-of-Experts and Evidential Critics. Ismail Erbas, Ferhat Demikiran, et al. 2025. NeurIPS 2025. Workshop paper.
Accelerating LLM Inference via Dynamic KV Cache Placement in Heterogeneous Memory System. Yunhua Fang, Rui Xie, et al. 2025. IEEE Computer Architecture Letters. Paper.
Generative AI Through CAS Lens: An Integrated Overview of Algorithmic Optimizations, Architectural Advances, and Automated Designs. Chuan Zhang, You You, et al. 2025. IEEE JESTCS. Paper.
Compressed Decentralized Momentum Stochastic Gradient Methods for Nonconvex Optimization. Wei Liu, Anweshit Panda, et al. 2025. TMLR. Paper.
CLoQ: Enhancing Fine-Tuning of Quantized LLMs via Calibrated LoRA Initialization. Yanxia Deng, Aozhong Zhang, et al. 2025. TMLR. Paper.
COMQ: A Backpropagation-Free Algorithm for Post-Training Quantization. Aozhong Zhang, Zi Yang, et al. 2025. IEEE Access. Paper.
Compressing Recurrent Neural Networks for FPGA-accelerated Implementation in Fluorescence Lifetime Imaging. Ismail Erbas, Vikas Pandey, et al. 2024. NeurIPS 2024. Workshop paper.
03 Mar 2025. US12240753. Micro-electromechanical Device Having A Soft Magnetic Material Electrolessly Deposited On A Palladium Layer Coated Metal Beam.
23 Dec 2024. US12175359. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update.
21 Jul 2024. JP7525237. Machine Learning Hardware Having Reduced Precision Parameter Components For Efficient Parameter Update.