Universal Position Interpolation: Unified Context Scaling for Hybrid Mamba-Transformer Models. Haochen Shen, Davis Wertheimer, et al. 2026. ICLR 2026. Conference paper.
Frayed RoPE and Long Inputs: A Geometric Perspective. Davis Wertheimer, Aozhong Zhang, et al. 2026. ICLR 2026. Conference paper.