
What Lemurian does. Lemurian Labs is building a co‑designed AI compute stack that begins with a software‑first compiler/runtime (Diluvian: TiDaL IR + HADAL + dynamic scheduler) that ingests unmodified Python/ML workloads and produces portable, high‑utilization execution on heterogeneous hardware (CPU/GPU/other accelerators).
Why it matters. As models grow more dynamic and memory‑bound, legacy compiler stacks and kernel libraries leave performance on the table. Over roughly 20 years, compute throughput has grown ~60,000x while memory bandwidth has grown only ~100x, a 600x divergence. One caveat on the evidence: Lemurian's benchmarks cherry‑pick compute‑heavy workloads like Mandelbrot sets, while real AI models spend 70%+ of their time waiting on memory. Lemurian's approach targets whole‑graph optimization and heterogeneous scheduling, which, if delivered, should yield lower $/token, higher throughput, and better energy efficiency across today's fleets.
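The compute-versus-bandwidth distinction can be made concrete with a simple roofline-style estimate. The peak numbers below describe a hypothetical accelerator chosen for illustration (they are not Lemurian or vendor figures), but the pattern they expose is general: an elementwise op is limited by memory traffic, a large matmul by compute.

```python
def roofline_time_s(flops, bytes_moved, peak_flops, peak_bw_bytes):
    """Execution-time lower bound: a kernel is gated by whichever
    resource (arithmetic or memory traffic) takes longer to service."""
    return max(flops / peak_flops, bytes_moved / peak_bw_bytes)

# Hypothetical accelerator: 100 TFLOP/s compute, 1 TB/s memory bandwidth.
PEAK_FLOPS = 100e12
PEAK_BW = 1e12

# Elementwise add of two fp32 vectors, n = 1e9 elements:
# 1 FLOP per element, 12 bytes moved (read a, read b, write out).
n = 1e9
add_time = roofline_time_s(n, 12 * n, PEAK_FLOPS, PEAK_BW)

# Square fp32 matmul, m = 8192: 2*m^3 FLOPs, ~3*m^2*4 bytes moved
# (read A, read B, write C once; ignores cache reuse details).
m = 8192
mm_time = roofline_time_s(2 * m**3, 3 * m * m * 4, PEAK_FLOPS, PEAK_BW)

# The add needs ~12 ms of memory traffic but only ~0.01 ms of compute,
# so it is bandwidth-bound: faster ALUs would not speed it up at all.
# The matmul is the opposite: compute dominates its bound.
```

Under these assumptions the elementwise add is ~1,200x more constrained by bandwidth than by compute, which is why a 600x divergence in hardware scaling hits real, memory-bound model graphs much harder than compute-heavy microbenchmarks.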