Somewhere around 2005, an entire generation learned to ration their words. Carriers charged per text message over your monthly bundle, and the bundles were small. A hundred messages. Maybe two hundred if your parents were generous. You did mental arithmetic before hitting send. Was this reply worth €0.15? Was this person worth €0.15? Texting had a unit economics problem, and we all adapted to it. We developed shorthand, stripped vowels, compressed meaning into 160 characters. It felt like a permanent feature of the medium. A fundamental constraint we'd just have to live with.
Nobody counts text messages anymore. Nobody even knows when it stopped mattering. Same arc for dial-up minutes, mobile data caps, home broadband. Every communication technology launches with metered scarcity and matures into abundance. The coping behaviours feel permanent until one day they're just gone.
Yes, compute scales differently from bandwidth. You can't just lay more fibre. GPUs need energy, cooling, rare materials. But the constraint doesn't need to hit zero. It just needs to hit "nobody thinks about it." And context windows went from 4K tokens to over a million in under two years. Per-token pricing drops with every model generation. What required an enterprise plan last year is on the free tier today.
And yet. Look at us. Carefully crafting prompts to use fewer tokens. Splitting tasks across sessions. Building entire retrieval architectures because the context window can't hold a full codebase. Writing "be concise" in system prompts. Now that I think of it, how many tokens do I have left?
Current session
█████████████████████░ 96% used
Resets 12pm (Europe/Brussels)
OK. Still got something to work with.
Half of today's AI "best practices" are coping mechanisms for temporary scarcity. RAG exists because context windows can't hold a full codebase, not because retrieval-augmented generation is some timeless architectural insight. Prompt compression is the new stripping vowels from a text message. Careful model selection, picking a smaller model to save costs, is just choosing the 100-message bundle instead of the 500. These aren't engineering principles. They're adaptations to a price curve, and the price curve only goes one direction.
The practices that survive will be the ones that were never about cost in the first place: clear intent, structured context, knowing what you're asking and why. Everything built around counting tokens has the same shelf life as everything built around counting text messages.
Wait, how long have I been going? Did I just spawn four sub-agents to look that up? Let me just...
Current session
██████████████████████ 99% used
Resets 12pm (Europe/Brussels)
Oh.
Short version. SMS didn't become free. It became a rounding error. Token limits will follow the same path. One day you'll realise you haven't thought about your usage bar in months, the same way you sto
⚠ You've hit your usage limit.
Your limit will reset at 12pm (Europe/Brussels).