Universal Position Interpolation: Unified Context Scaling for Hybrid Mamba-Transformer Models
- Haochen Shen
- Davis Wertheimer
- et al.
- 2026
- ICLR 2026
Conference paper
I am a Principal Research Scientist at IBM T. J. Watson Research Center, Yorktown Heights. I co-lead the Foundation Models AI training and validation platform, built on OpenShift. My team primarily contributes to the PyTorch training and inference components, with the mission of democratizing the training and validation of foundation models.
I obtained my M.S. and Ph.D. degrees from the Department of Computer Science, University of Illinois at Urbana-Champaign in August 2010. For both degrees, I worked with Prof. Tarek Abdelzaher on developing an infrastructure toolset for human-centric sensing applications.