Austere Instincts

Austerity is soon to be a thing in AI.

Chatted with an old classmate today. One of the things we discussed was how the DeepSeek team started working on architecture and operations optimizations way, way back (a couple of years ago, which is prehistory on AI timescales).

This doesn’t seem like a big deal yet. But with the end of Moore’s law looming on the horizon and monetary conditions gradually tightening, it seems quite obvious that the prevalent attitude of “don’t worry about the money, we can just buy more hardware from Nvidia” can’t be sustained much longer.

When US firms realize that capital alone can’t provide a sufficient edge, they will have to pivot toward efficiency. And of course they will. But cultural inertia is a thing in large and proud organizations, and I’m not very bullish on the idea that a bunch of Python coders can somehow be repurposed to hand-writing low-level PTX or optimizing bits of the pipeline 0.x% at a time.
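To give a flavor of what “0.x% at a time” work looks like, here is a toy sketch (my own illustration, not anything from DeepSeek) of one common trick: rewriting an elementwise computation to reuse a single buffer instead of materializing a temporary array per operation, cutting memory traffic. This uses NumPy's `out=` arguments on the standard tanh-approximation GeLU formula; real versions of this live at the CUDA/PTX kernel level, not in Python.

```python
import numpy as np

def gelu_unfused(x):
    # Naive version: every operator allocates a fresh temporary array,
    # so the data makes several round trips through memory.
    inner = 0.7978845608028654 * (x + 0.044715 * x * x * x)
    return 0.5 * x * (1.0 + np.tanh(inner))

def gelu_fused(x, out=None):
    # Buffer-reusing version: same math, but intermediate results are
    # written in place via np.* out= arguments, so only one scratch
    # array is allocated regardless of how many ops are chained.
    if out is None:
        out = np.empty_like(x)
    np.multiply(x, x, out=out)            # x^2
    out *= 0.044715                       # 0.044715 * x^2
    out += 1.0
    out *= x                              # x + 0.044715 * x^3
    out *= 0.7978845608028654             # sqrt(2/pi) * (...)
    np.tanh(out, out=out)
    out += 1.0
    out *= 0.5
    out *= x                              # 0.5 * x * (1 + tanh(...))
    return out

x = np.linspace(-3.0, 3.0, 8)
assert np.allclose(gelu_unfused(x), gelu_fused(x))
```

Each such rewrite is small and unglamorous, which is exactly the point: the gains only matter in aggregate, and chasing them takes a different engineering culture than gluing together high-level frameworks.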

The effect isn’t linear, either. Frontier model training is more a series of scientific experiments than an engineering process, and lowering the cost of experiments lets people run more of them and learn more. The reduced time and money cost per experiment might be the difference between a subpar model and SoTA.

Unfortunately I don’t have an intuition for how much inefficiency there is in training large models today. What we do know is that US trade embargoes on advanced chips have forced Chinese firms to focus on efficiency, and it’s pretty obvious Chinese firms are leading in that department, especially after the recent DeepSeek revelations. If there is much room left for improvement on efficiency, US AI firms are going to be royally fucked in the coming 1-2 years.