In the old days, a startup would run their product on a shared PHP server containing both the app and the database. It was the cheapest way to get started in the pre-cloud era. The problem with such a setup was that this quick start would always run into scaling problems when trying to grow a customer base. At some point, engineers would waste weeks making the infrastructure scalable.

But then came the big-name cloud vendors, and the playing field changed. Infrastructure designs and architectures that were only available to multinationals suddenly became obtainable to every entrepreneur with an idea.

  • A MongoDB cluster across multiple availability zones? Click.
  • A microservice architecture on Kubernetes? Check.

Running your system on a shared PHP box was out. With the cloud's generous free tiers, startups switched to advanced infrastructure en masse.

That's not a bad evolution. Tech startups need those options.


Premature scaling

But one thing that stands out when working with startups is the trend toward premature scaling. I often talk to engineering teams who are proud to have "free infinite scaling" from the get-go. That's usually overly optimistic.

The unquestioning belief in autoscalers is one of the best tells of this kind of optimism. Developers will rave about how trivial autoscaling is these days.

"Kubernetes takes care of it!"

"We use serverless lambdas, so we don't even have to think about scaling!"

"It's limitless!"

The problem with autoscaling is that it's anything but trivial. It increases complexity, and when bugs appear, as they invariably do, they're hard to fix. I've seen systems crash under heavy traffic because the autoscaled servers orchestrated an internal Denial-of-Service attack on a single database instance. I've seen a load balancer redirect traffic to machines that still needed 3 minutes to boot.
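
The internal Denial-of-Service scenario comes down to simple arithmetic. Here is a minimal sketch, assuming each app instance keeps its own database connection pool and the database has a fixed connection limit. The function names and all numbers are hypothetical and only serve as an illustration.

```python
def total_db_connections(instances: int, pool_size_per_instance: int) -> int:
    """Connections the database sees if every app instance fills its pool."""
    return instances * pool_size_per_instance


def max_safe_instances(db_max_connections: int, pool_size_per_instance: int,
                       reserved: int = 0) -> int:
    """Largest instance count whose full pools still fit under the database limit."""
    usable = db_max_connections - reserved
    return max(usable // pool_size_per_instance, 0)


# Hypothetical numbers, for illustration only.
DB_MAX_CONNECTIONS = 100
POOL_SIZE = 20

for instances in (2, 5, 10):
    demand = total_db_connections(instances, POOL_SIZE)
    status = "ok" if demand <= DB_MAX_CONNECTIONS else "exceeds the limit"
    print(f"{instances} instances -> {demand} connections ({status})")

# Keep a few connections free for migrations and admin access.
print("autoscaler ceiling:", max_safe_instances(DB_MAX_CONNECTIONS, POOL_SIZE, reserved=10))
```

Under these assumptions, an autoscaler that is allowed to grow past the computed ceiling can starve the database no matter how healthy the app servers look.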

Autoscaling is neither trivial nor free. It's a complex feature that supports some powerful but limited use cases. It's not for everyone. Autoscaling works great when we have no clue how much traffic to expect. That's not a situation most startups face. Yet infrastructure engineers seem to believe they must always counter the Black Friday scenario, where a sudden wave of customers could bring down the site.

In reality, most teams know exactly what kind of traffic they can expect. They start with one server and add a second one as their user base grows. Most web applications are perfectly fine with a simple load balancer instead of an autoscaler.
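
When traffic is predictable, sizing a fixed fleet is a back-of-the-envelope calculation. The sketch below assumes you know your expected peak in requests per second and have measured what a single server can handle. The function name, the 30% headroom default, and the example numbers are my own assumptions, not a standard.

```python
import math


def servers_needed(peak_rps: float, rps_per_server: float, headroom: float = 0.3) -> int:
    """Servers required to serve peak_rps while keeping `headroom` spare capacity on each."""
    if rps_per_server <= 0 or not 0 <= headroom < 1:
        raise ValueError("rps_per_server must be positive and headroom in [0, 1)")
    usable_rps = rps_per_server * (1 - headroom)
    return max(math.ceil(peak_rps / usable_rps), 1)


# Hypothetical measurements: one server handles about 200 requests per second.
for peak in (50, 300, 1000):
    print(f"peak of {peak} rps -> {servers_needed(peak, 200)} server(s)")
```

If the answer stays at one or two servers for the foreseeable future, a load balancer in front of a fixed fleet is probably all you need.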

Here are some guidelines to keep in mind when considering autoscaling.

Did you test it?

Sure, Kubernetes is cool, but have you actually tested your setup? Do you have regular load tests that prove that the autoscaling still works? Do you know its limits? A customer I worked with learned during such a test that AKS can take up to 15 minutes to spin up an additional node! By the time their extra capacity was online, the demand was already gone.
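
To see why provisioning time matters so much, here is a deliberately simplified minute-by-minute simulation. It assumes the autoscaler requests one extra node the moment demand exceeds capacity, and that the node only serves traffic after a fixed delay. The traffic pattern and capacities are hypothetical, and scale-down is not modeled.

```python
def dropped_requests(demand, base_capacity, extra_capacity, delay_minutes):
    """Return how many requests exceed the available capacity over the whole run."""
    triggered_at = None
    dropped = 0
    for minute, load in enumerate(demand):
        if triggered_at is None and load > base_capacity:
            triggered_at = minute  # the autoscaler requests a new node
        online = triggered_at is not None and minute - triggered_at >= delay_minutes
        capacity = base_capacity + (extra_capacity if online else 0)
        dropped += max(load - capacity, 0)
    return dropped


# Hypothetical traffic: 5 quiet minutes, a 10-minute spike, then quiet again.
demand = [500] * 5 + [1500] * 10 + [500] * 10

print(dropped_requests(demand, 1000, 1000, delay_minutes=2))   # 1000
print(dropped_requests(demand, 1000, 1000, delay_minutes=15))  # 5000
```

With a 15-minute delay, the extra node arrives after the spike has ended, so every request above the base capacity is lost, exactly as if there were no autoscaler at all.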

If you support autoscaling, treat it like any other feature and test it regularly. Write automated tests that simulate the Black Friday scenario and make sure these get executed every few weeks. Keep an eye on boot times, CPU utilization, and your credit card bills.
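
A recurring load test doesn't have to be elaborate. The following is a minimal sketch using only the Python standard library. `send_request` stands in for whatever call hits your system, and `fake_request`, the budgets, and the request counts are placeholders I made up for the example. A real test would call your staging environment and run long enough to trigger the autoscaler.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from statistics import quantiles


def run_load_test(send_request, total_requests, concurrency):
    """Call send_request() total_requests times from a thread pool.

    Returns the list of latencies in seconds and the number of failed calls.
    """
    def timed_call(_):
        start = time.perf_counter()
        try:
            send_request()
            ok = True
        except Exception:
            ok = False
        return time.perf_counter() - start, ok

    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        results = list(pool.map(timed_call, range(total_requests)))
    latencies = [latency for latency, _ in results]
    errors = sum(1 for _, ok in results if not ok)
    return latencies, errors


def within_budget(latencies, errors, p95_budget_seconds, max_error_rate):
    """Check the 95th percentile latency and the error rate against fixed budgets."""
    p95 = quantiles(latencies, n=20)[-1]  # last of the 19 cut points is the 95th percentile
    error_rate = errors / len(latencies)
    return p95 <= p95_budget_seconds and error_rate <= max_error_rate


def fake_request():
    """Placeholder for a real HTTP call to the system under test."""
    time.sleep(0.001)


latencies, errors = run_load_test(fake_request, total_requests=50, concurrency=10)
print("within budget:", within_budget(latencies, errors, p95_budget_seconds=0.5, max_error_rate=0.01))
```

Scheduling a script like this every few weeks, and failing the run when the budget is exceeded, turns "autoscaling still works" from a belief into something you actually check.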


YAGNI (You Ain't Gonna Need It)

Extreme Programming taught us the You Ain't Gonna Need It rule. If you don't support a feature yet, you shouldn't add an abstraction just in case. By keeping the software simple, you can easily adapt to actual needs rather than anticipate fictional ones. This golden rule also applies to infrastructure. If you know what kind of traffic you can expect, don't add an autoscaler just yet. There's no point in setting up a Kubernetes cluster "just in case".

Start simple and increase complexity as needed. It's very unlikely your entire application needs infinite scaling. Only when your system is in use can you identify which parts of the app get bombarded with traffic. Fix those problems as you encounter them.

When a certain API receives unexpected spikes in traffic, you've identified a part of the system that might need flexible scaling. At that point, splitting that part of the code into an autoscaled microservice makes sense. This will solve a real scaling issue while adding the exact amount of complexity needed. Nothing more, nothing less.

YAGNI means your teams shouldn't waste their time anticipating imaginary scaling issues.

KISS (Keep infrastructure simple, stupid)

I've never seen a startup with too many developers. Engineering capacity is always in short supply. Effort spent on building a complicated infrastructure is effort not spent on features. I've worked with a team that wrote CloudFormation templates to set up a Fargate cluster with custom-built blue/green deployment for zero downtime. They didn't have a single paying customer, but they were building a million-user infrastructure. At that point in the product lifecycle, that's a colossal waste of resources!

Simple infrastructure will accelerate your team's development cycles. That's why startups need to keep their infrastructure as simple as possible in the early days and scale it as their user base grows.

I'm a big fan of DigitalOcean for simple infrastructure setup. But all cloud providers have no-frills setups. Want to use AWS? Great, but pick the Elastic Beanstalk PaaS over rolling your own VPC. Are you a Microsoft company? Azure App Service is an awesome tool to launch your product on.

When you develop a product on simple infrastructure, you'll get the best of both worlds. Just like in the old days, you'll have a quick start until scaling issues arise. But unlike the pre-cloud era, you'll have world-class tools to evolve your infrastructure.

Startups don't have the luxury of building their product first-time-right. They need to launch small features, validate those with users, and evolve the product. That's not rework. It's postponing the work until we're sure we need it. Infrastructure follows the same logic.

Start simple and grow your infrastructure's complexity when necessary.