Research Engineer, LLM Inference

Sage Ahrac

I am a Research Engineer at IBM Research and an MSc student at Tel Aviv University, fortunate to be advised by Mor Geva. I work on distributed LLM inference serving, especially KV-cache management, scheduling, and cache-aware routing. Separately, I am drawn to mechanistic questions around Mixture-of-Experts routing geometry and latent-space monitoring. I also like building small cloud apps and agentic tools that sometimes work.

Selected work