A Multiscale Workflow for Thermal Analysis of 3DI Chip Stacks
Max Bloomfield, Amogh Wasti, et al.
ITherm 2025
This work presents a holistic approach to enabling energy-efficient on-chip Transfer Learning (TL) via Analog In-Memory Computing (AIMC) using 14nm CMOS-compatible ReRAM arrays. We develop an optimized ReRAM stack featuring H2 plasma-treated high-k (HfO2 or ZrO2) and in-vacuo processing, achieving reverse area scaling of the forming voltage for co-integration with advanced-node CMOS technologies. To address non-ideal analog weight updates, we implement and evaluate the latest versions of the Tiki-Taka training algorithms—TTv2, c-TTv2, and AGAD—which tolerate device asymmetry and variability. TL is demonstrated on hardware using compressed MNIST with on-chip training and extended via simulations to Vision Transformer (ViT)-based TL from CIFAR-10 to CIFAR-100. While analog-only models are sensitive to weight-transfer noise, hybrid analog-digital implementations maintain performance at up to 20% weight-transfer noise. Using AGAD with optimized ReRAM devices, we achieve <1% accuracy degradation compared to digital baselines, validating AIMC-based TL as a viable path for low-power, on-chip training at the edge.
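The contrast the abstract draws between analog-only and hybrid analog-digital robustness can be illustrated with a toy experiment. The sketch below is not the authors' simulation setup: the noise model (multiplicative Gaussian perturbation relative to each layer's weight range), the two-layer network, and all numeric values are illustrative assumptions. It only shows the qualitative idea that keeping some weights digital (noise-free) reduces the output error induced by weight-transfer noise.

```python
import numpy as np

rng = np.random.default_rng(0)

def transfer_noise(w, sigma, rng):
    """Illustrative weight-transfer noise model (an assumption, not the
    paper's): Gaussian perturbation scaled to a fraction of the layer's
    maximum absolute weight."""
    scale = sigma * np.max(np.abs(w))
    return w + rng.normal(0.0, scale, size=w.shape)

def forward(x, w1, w2):
    """Tiny two-layer network: linear -> ReLU -> linear."""
    h = np.maximum(x @ w1, 0.0)
    return h @ w2

# Toy weights and inputs (arbitrary illustrative values).
w1 = rng.normal(size=(8, 4))
w2 = rng.normal(size=(4, 3))
x = rng.normal(size=(16, 8))

clean = forward(x, w1, w2)
sigma = 0.2  # 20% noise level, echoing the abstract's operating point

# Analog-only: every layer sees transfer noise.
analog = forward(x, transfer_noise(w1, sigma, rng),
                 transfer_noise(w2, sigma, rng))
# Hybrid: first layer analog (noisy), second layer kept digital (exact).
hybrid = forward(x, transfer_noise(w1, sigma, rng), w2)

def rel_err(y):
    """Relative output error versus the noise-free forward pass."""
    return np.linalg.norm(y - clean) / np.linalg.norm(clean)

print(f"analog-only error: {rel_err(analog):.3f}")
print(f"hybrid error:      {rel_err(hybrid):.3f}")
```

With fewer noisy weight matrices in the signal path, the hybrid variant typically shows a smaller deviation from the noise-free output, which is the mechanism behind the abstract's claim that hybrid implementations tolerate higher transfer noise.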
Evaline Ju, Kelly Abuelsaad
KubeCon EU 2026
Manuel Le Gallo
IEDM 2025
Guohan Hu, Philip Trouilloud, et al.
IEDM 2025