5 Dead Simple Principles To End Drama in your Engineering Team (and Kill your AWS Bill)

Lionel Martin
25 min read · Jun 7, 2021

Avoid the breaches, the downtime and the struggle with these 5 simple engineering principles.

California Mountain Snake

Before we start: know your dramas

Most software engineers are oblivious to these.

That’s because most software engineers are Builders (they’re good at creating new stuff) rather than Operators (who are good at running stuff).

And most Builders — which is what all good startups hire first — aren’t aware how horribly bad things can turn for a tech business.

Builders are mostly worried about inadvertently introducing bugs… But really, bugs are the least of an experienced Operator’s concerns…

And as a software professional, it is your responsibility to acquaint yourself with the modern threats your company and its customers are exposed to, at the very least because of the software you write or use.

Here’s what could go wrong:

Drama #1: This dude followed an unclear employee onboarding procedure and permanently deleted the whole company’s production database on his first day.

Drama #2: This team thought it would be a good idea to use a clone of the production database to seed their testing environment. As a result, they sent test emails to the entire customer base.

Drama #3: This fella brought down the Xbox Live Marketplace for 17 hours singlehandedly.

Drama #4: This little IT company called AWS (heard of them?) endured 24 hours of downtime on some of their load balancers on Christmas Eve because a single developer deleted the ELB state data.

Drama #5: This trading company lost $440 million in only 45 minutes because of a failed deployment. The code was deployed manually by a single engineer.

Drama #6: These little tech companies called Apple, Microsoft and Tesla have been pulling random software dependencies from the internet into their codebase.

Disasters (outages, downtime, data loss, data breaches) can sometimes be the result of malicious attacks. However, more often than not, they are self-inflicted, easily avoidable and the painful symptom of a lack of process.

And of a struggling team, collaborating laboriously, enduring too much error-prone and manual work, mostly to deploy code and manage infrastructure.

The solution? Trust these 5 principles and you will reduce the risk of drama in your team, improve the security and availability of your product and undoubtedly enjoy your work a little bit more.

Lionel helps high-growth tech companies automate their infrastructure management using infrastructure-as-code and containers. Reach out to him for consulting work on https://getlionel.com.

Principle #1: There is no bad team, only bad routines

No bad team, only bad routines.

If you wish you had a better team, you’re wrong. You don’t need a better team. You need your existing team to be more functional.

You need the right routines, at the right frequency, for problems to be solved as a team and for knowledge to be shared organically, without more documentation.

Actually, when we get to the implementation details of your deployment strategy in the next principle… you’ll be surprised.

There’s no documentation involved.

It’s so simple it self-documents. You’ll see.

Sure, it won’t fix it all if your workplace is toxic to start with.

But all of the problems above, even personality issues, can be alleviated with better routines.

Good routines led with good intentions are everything. Don’t even have to be great. Just good. And consistent.

Routines are regular, predictable, recurrent ways of doing things in a team.

They bring clarity, shared understanding and commitment.

So what does that look like?

Hey, are we talking about Agile now? Or is it DevOps, right?

Doesn’t matter.

Leave it to purists. And Agile Coaches.

It… Doesn’t… Matter.

Just do this:

The half-agile-half-devops-who-cares? daily routine for your tech team:

  1. Do daily Stand-Ups. Not a written update in Slack. Do it face-to-face. Or on a call. With the video on. Get all people in your team to commit “Today I’ll be working on THIS”. Then ask “Does anyone today have an issue or blocker I can help with?”. Then action it.
  2. Do Sprints. Start with 2-week sprints. Aim to achieve 2–3 team goals by the end of the sprint. Get into the habit of being flexible on the scope of work, but firm on deadlines. Engage your team in finding creative solutions to business problems within a fixed amount of time. Instead of being prescriptive, let your team flex their product neurons.
  3. Do Backlog Refinement. Break down requirements into smaller user stories. Get into the habit of treating every piece of work as a ticket. Share your screen, scan through all tickets and discuss each of them as a team. Decide what should be built and how and then collectively challenge it. Refine until you’ve boiled it down to the most pragmatic, simple, quick and risk-free solution the business can afford. Backlog Refinement should be a creative and exhausting process. Do it in the afternoon right before EOP.
  4. Do Planning. Do it the morning after Backlog Refinement, instead of your daily stand-up. Planning should be a formality because the workload is now refined. Agree on Sprint Goals. Share your screen, scan through each ticket, then estimate and assign them. As a team, agree on the tickets you can commit to deliver for the sprint. Get into the habit of being methodical and realistic so you can commit to deliver, as a team.
  5. Do Retrospectives. Have a 20-minute chat at the end of each sprint. What was yay, meh or nay? Take small actions to have more yays and fewer nays over time.
  6. Do Demos. Let your tech team show their work off to the wider team.
  7. Get with Gitflow (principle #2). Just like your teeth, your codebase should be kept clean. Learn to collaborate effectively on your code base without stepping on each other’s toes.
  8. Manage your configuration as-code (principle #3). Be kind to yourself. Say goodbye to managing server configuration manually.
  9. Write tests. Mostly unit tests. Then automate them (principle #4). Everybody should write tests. Even if you’re a rockstar ninja unicorn developer. Oh god, are you?
  10. Script your deployments. Now automate them (principle #4 too). Replace toil and human errors with automation.
  11. Release early and often. As opposed to infrequent, large and risky releases.
  12. Manage your environments as-code (principle #5). I know. Clicking around the AWS console looks cool. Until it’s not.

That’s it.

That’s all you need to know about Agile and DevOps.

Now you’re asking… dude… why is this so important?

I want to hear about Terraform and Docker scripts and all the hardcore stuff!

Fair enough… Here’s the answer…

Without stand-ups, you’ll waste your time on stuff a colleague could probably unblock you on in the blink of an eye.

Without planned sprints, you’ll never be able to manage the company’s expectations.

Without backlog refinement, you’ll build and deploy stuff that’s way too complicated for the actual business needs.

Without a good Git flow, you’ll eventually deploy security holes in production, create data breaches and you’ll get both traumatised and fired.

Without automated tests, you’ll lose confidence, sleep, charisma and your wife will leave you.

Without automated deployments, configuration-as-code and infrastructure-as-code, you’ll eventually screw up a deployment one day, you’ll break production and you’ll be responsible for losing the company’s hard-earned customers.

Does that sound important now?

Principle #2: Get with Gitflow

Git can get really complicated.

I bet you don’t know 10% of what it can do.

I probably don’t. Never had to. Like most things, there’s only a subset of features you need to master. Everything else… you can learn in time.

Unless you want to become a professional StackOverflow troll.

Do you even git reflog, git cherry-pick, git rebase, git reset --hard and then squash your commits bro?

There’s only a handful of Git things you need everybody in your team to understand.

Without needing to learn all the crazy bat-shit Git stuff.

Without extending your terminal aliases with hundreds of lines of bash.

Let’s keep it simple.

If it stays simple, everybody in your team will get it. Especially new starters. You guys will have a shared understanding of how things should be done in your team. So we can all get on with the work. Happy days.

What we need is a simple system to track all the changes to your application code.

Your code needs to be committed in Git (non-negotiable), in readable and small enough commits, in such a way that…

… no source code or history is ever lost, deleted, overwritten

… all changes can be easily reverted or rolled back in case things go haywire

… it’s quite clear what code is currently deployed in which environment, just by looking at Git

… every engineer can start, develop and submit a new feature at any time without being blocked by another engineer’s work in progress.

And the single system to ensure all of the above is… effective branching!

Branching strategy is everything.

It’s the single thing you can teach your team in a day that brings the most peace of mind to your source code management.

And with the simple branching strategy I’m about to show you, you can do it all without ever having to master any Git wizardry.

Simple AND powerful.

The 6 pragmatic Git committer rules:

  1. Only commit new code to feature branches. Never commit new code directly to master. For each new feature, we create a new branch, commit our code there and we only merge into master once everybody’s happy with it. That way:
    - if the feature starts looking like a Frankenstein monster after a few commits and was clearly a terrible idea, our master code has been kept untouched and safe. It’s easy to drop the feature. Without having to revert the changes.
    - while an engineer is having fun with their feature branch, no one is blocked because there’s never an incomplete work-in-progress feature on master that’s breaking it. Master is always production-ready.
    Go ahead and even enforce this on your Git server (hopefully GitHub, Bitbucket or GitLab. You’re not self-hosting are you? Wait, rockstar-ninja is that you??):
    - create some branch protection rules on master. Branch rules are a safety net designed to protect your code from catastrophic actions rather than from particular people.
    With branch protection, master cannot be deleted, either accidentally or intentionally, and master’s commit history cannot be overwritten, even with a force push.
    You should also disallow direct commits to master. That will apply to command line users as well as to your Git server web UI.
  2. No code change to be merged without a Pull Request. No feature should be merged into master without a Pull Request (PR) i.e. a submission of your changes for approval before they are merged into master. PRs should:
    - be small and readable.
    - contain (passing) tests.
    - contain new or updated documentation such as a README.
  3. No Pull Request without Peer Review. Get other engineers to approve the commits before merging. Get them to comment and critique your code! Even if you’re the boss.
    This is the single most important thing you can do for code quality and security.
    And even if you’re the most senior engineer in your team, get reviews from junior engineers. First, they will learn from you and second I can guarantee you that one day they’ll find an overlooked defect in your code that could have led to a massive bug.
    Go ahead and enforce it by adding a CODEOWNERS file to your repository.
  4. Pull Request openers should be responsible for merging them. You open it, you update it as per your colleagues’ reviews and then you merge it. You should never merge someone else’s PR because they’re the one with the most context about what the feature should do and when it can be merged.
  5. Maintain parity between your Git branches and what’s deployed in your environments. You should be able to tell what’s deployed on staging, QA and production just by looking at Git. More on this later.
  6. Use a standard Git branching strategy. Don’t make up your own. There are two popular standards built around a long-lived primary Git branch, pick one of these:
    - GitHub Flow is a lightweight branching strategy option. It uses a long-lived master branch and short-lived feature branches.
    Details at https://guides.github.com/introduction/flow/
    - (recommended) Git Flow is a more advanced branching strategy (and branch naming strategy) that also comes with an amazing command line helper. Unlike GitHub Flow, it uses two long-lived branches: master which is in sync with production and develop, in sync with staging.
    git flow feature start my-feature creates a feature branch named feature/my-feature off the develop branch
    git flow feature finish my-feature merges the feature branch into develop
    git flow release start my-release creates a release branch named release/my-release off the develop branch, ready for a PR to master
    git flow release finish my-release merges the release branch into master, tags the release and merges it back into develop

More details at https://danielkummer.github.io/git-flow-cheatsheet
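If it helps to see the Git Flow naming convention as data, here is a minimal Python sketch of where each kind of branch ends up when it is finished. The `merge_targets` helper is hypothetical (it is not part of the git-flow tool); it only encodes the default prefixes and assumes your long-lived branches are called master and develop.

```python
from __future__ import annotations

# Hypothetical helper, not part of git-flow itself.
# It encodes the default Git Flow prefixes and the long-lived
# branches each kind of branch is merged into when it is finished.
MERGE_TARGETS = {
    "feature": ["develop"],
    "release": ["master", "develop"],
    "hotfix": ["master", "develop"],
}


def merge_targets(branch: str) -> list[str]:
    """Return the long-lived branches a Git Flow branch is merged into."""
    prefix, separator, name = branch.partition("/")
    if not separator or not name or prefix not in MERGE_TARGETS:
        raise ValueError(f"not a Git Flow branch name: {branch!r}")
    return MERGE_TARGETS[prefix]


if __name__ == "__main__":
    for branch in ("feature/my-feature", "release/my-release"):
        print(branch, "->", ", ".join(merge_targets(branch)))
```

A check like this could run in a pre-push hook or in CI to reject branch names that don’t follow the convention, but that is a design choice for your team, not something Git Flow requires.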

Just do it. Protect master. Add a README. Get everybody in your team to use Git Flow.

That’s it.

Here you go.

Documentation just improved. Knowledge sharing just went up. Code quality just went up.

Actually, by following points 1, 2 and 3 above, the security of your code and your deployments probably just went up by 10,000%.

No joke. The most humongous security holes I’ve seen in my career as an engineer and technology leader would have been picked up in a PR in a heartbeat.

And some of these security holes have been nearly fatal to these companies.

Just use these simple Git principles.

Principle #3: Manage your configuration as-code

Now we have some solid foundations sorted out.

We have a team working together on a neat code base…

… that we’re ready to deploy!

Somewhere on servers.

These servers need to be provisioned. And configured.

What do we mean by configuration?

Configuration is all the housekeeping work to be done so our code can run somewhere other than the developers’ laptops.

But here’s the problem. That’s LOTS of housekeeping.

This is what it looks like for a Laravel web application:

…SSH into your server

…install PHP

…configure PHP

…install Apache or Nginx as a web server

…configure that web server

…upload your code to the server

…install composer

…run composer to install your PHP dependencies

…install all binary dependencies (image magick etc)

…launch the web server

…launch PHP

…launch your Laravel workers

…configure crontab

That’s too much. And some of it you need to repeat on each deployment. It’s not only boring. I guarantee you you’ll make a mistake one day. And break everything. And ruin your day. Maybe your weekend.

Now let’s say you have two Laravel projects to deploy on a single server. And they each make use of a different version of PHP (first one is still in PHP 7.3 and you have migrated the other one to PHP 8).

Simply put, you can’t do it without jumping through hoops.

So we’ll not only script all the configuration above as-code, we’ll also package each deployment as a ready-to-run Docker container that you can launch in one command to a brand new server.

Every time. Just one command. How does that sound?

And you can run different projects with different configurations on the same server.

So what are containers? And what’s Docker?

Docker is a software platform that simplifies the process of building, running, managing and distributing applications. It does this by virtualizing the operating system of the computer on which it is installed and running.

Simply put, Docker containers provide a way to get a grip on software.

Let’s go back a little bit.

What if you’re already not very confident with all the server configuration stuff… Is that really a good time to learn more complex technologies like Docker?

YES!

Look. About a decade ago, I knew nothing about servers. I was a very good developer but I thought that, to configure a Windows or Linux server professionally, one had to be a beardy sysadmin nerd who assembled his first computer at 18 months old.

The “when I was 5, what I loved most was overclocking the BIOS of my Atari 2600” kind of guy.

And then my very good mate Michael showed me a Dockerfile (Michael, if you read this, I miss these days!)

If you’ve never seen a Dockerfile, it’s basically the build script of a Docker image. For a Laravel web application for example, it contains all the server configuration steps we’ve seen above.

As a script.

Or rather, as a RECIPE.

And Dockerfiles can be shared. Like your grandma’s pumpkin pie recipe. Take your mate’s Dockerfile and you can build your application into a container that’s using the exact same server configuration as your mate. In one command, you have a fully working server setup.

And people and companies are publishing their Dockerfiles as open source. You pick the Dockerfile that works for you and you can deploy your application on a brand new server in one single command (I’ll give you my Laravel Dockerfile later on).

It was liberating. I didn’t have to become a beardy sysadmin nerd or go back in time to help my clients with hosting their software.

And as time went on, I got to understand these principles:

The 12-factor principles behind Docker:

  1. Manage your configuration as-code. Treat your configuration just like your application code. Commit it to Git. Version it. Use Git Flow to manage changes to it.
  2. Explicitly declare and isolate dependencies. Just like Laravel helps declare your composer and npm dependencies respectively in composer.json and package.json, you should always declare the rest of your application’s dependencies (OS, PHP and web server configuration) within your source code. And that’s exactly what our Dockerfile does for us.
  3. Strictly separate config from code. Store config in the environment. An app’s config is everything that is likely to vary between deployments (development machine, staging, production etc). That includes:
    - resource handles to the database, Redis and other backing services
    - credentials to external services such as S3 or a 3rd party API etc
    Environment-specific config should never be committed to code (even non-sensitive config).
    It should instead be injected at run-time as environment variables.
    Docker allows you to do just that, by separating build-time (packaging your code) from run-time (injecting environment variables before starting the container).
  4. Treat backing services as attached resources. If you follow rule 3 above, then your database, cache engine, search engine etc will be accessible to your web app Docker containers through runtime configuration only.
    This decouples your web app Docker image from its database, cache engine, search engine etc.
    At any time, you could run the same web app Docker image against a new database etc, all without rebuilding your app.
  5. Build, release, run. Strictly separate the processes of building, deploying and running your application. So they form a one-way deployment pipeline. Never connect to your servers to edit your running application’s code.
  6. Execute the app as one or more stateless processes. Your “Dockerised” web application should be stateless. Any data that needs to persist must be stored in a stateful backing service such as a database.
    Your web application should not rely on data from the server’s disk. If it does, you will not be able to horizontally scale your service. It will also lose data should you need to restart your service on a different server.
  7. Treat logs as event streams. Don’t write logs to your servers’ hard drives. Write them to stdout and configure Docker to stream the logs to a stateful backend logging service such as CloudWatch for storage.
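To make rules 3, 4 and 7 a bit more concrete, here is a minimal Python sketch (the article’s stack is Laravel, so treat this as an illustration of the idea rather than anything Laravel-specific). The variable names `DATABASE_URL`, `REDIS_URL` and `S3_BUCKET` and the `Settings` class are assumptions made up for the example.

```python
import logging
import os
import sys
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    """Everything that varies between deployments (names are hypothetical)."""
    database_url: str
    redis_url: str
    s3_bucket: str


def load_settings(env=os.environ) -> Settings:
    """Read config from environment variables and fail fast if one is missing."""
    names = ("DATABASE_URL", "REDIS_URL", "S3_BUCKET")
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return Settings(env["DATABASE_URL"], env["REDIS_URL"], env["S3_BUCKET"])


def configure_logging() -> logging.Logger:
    """Send logs to stdout so the container runtime can stream them elsewhere."""
    logger = logging.getLogger("app")
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.handlers = [handler]
    logger.setLevel(logging.INFO)
    return logger
```

The same image can then be started against staging or production just by changing the environment it is launched with, for example with the `-e` or `--env-file` options of `docker run`.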

If you want to dig deeper into these principles, check out https://12factor.net.

This is how you manage your configuration as-code. Give it a try, then learn more about Docker, it will be worth it.

To learn more, check my article “Stop deploying Laravel manually, steal this Docker configuration instead” on Medium.

Principle #4: Write tests. Mostly unit tests. Then automate them all.

Script your deployments. Now automate them too.

You were secretly hoping we wouldn’t talk about tests, weren’t you?

No one likes tests.

They slow you down.

They look ugly in your codebase.

They seem like a waste of time which you could use to actually code something fancy that you could show off to your marketing colleagues, right?

So no one likes tests…

That is until…

… you’ve deployed enough quick fixes that have broken, not a button, not a section, not a page but your whole application in production to the point that no customer could see even a single pixel from your app.

… even when you were just adding a new link onto the company’s website footer.

… and you only realise the morning after. From a call from your boss. Who realised from voicemails from three pissed customers. First thing in the morning.

Or until… you’ve realised that the login form you’ve put together a few weeks ago lets people log in with any random password. Oops…

Until… your whole company gets publicly embarrassed for emailing customers with other customers’ personal information.

And then bizarrely, some developers start loving tests. They even want more dedicated time to write tests! I hope you’re one of these.

If not, close your eyes and imagine this. The company wants a quick change in a registration form.

It’s meant to really boost the results of the marketing campaign that already launched.

You add the code. Git commit. Run the tests. 1046 tests. Passed. You look at the test results. You look at your commit. You know for a fact that nothing has been broken. Guaranteed. Git push.

Your boss looks at you scared like your mum before removing your training wheels. “It won’t break anything else RIGHT?”. You: “Affirmative. 1046 tests passed. Build is green. Good to go. Launching boot sequence.”

Feels good huh?

That’s when you write tests.

Write mostly unit tests because they’re fast to write, but write at least a few end-to-end tests just to make sure the basic navigation and the key forms of your app are never broken.
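Remember the login form that accepted any random password? Here is roughly what a unit test guarding against it could look like. It’s a sketch in Python using only the standard library; `hash_password` and `verify_password` are hypothetical stand-ins for whatever your framework provides, and the iteration count is an arbitrary value picked for the example.

```python
import hashlib
import hmac
import os
import unittest


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a hash from the password (PBKDF2, iteration count is illustrative)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)


def verify_password(password: str, salt: bytes, expected_hash: bytes) -> bool:
    """Return True only if the password matches the stored hash."""
    return hmac.compare_digest(hash_password(password, salt), expected_hash)


class VerifyPasswordTest(unittest.TestCase):
    def setUp(self):
        self.salt = os.urandom(16)
        self.stored = hash_password("correct horse", self.salt)

    def test_accepts_the_right_password(self):
        self.assertTrue(verify_password("correct horse", self.salt, self.stored))

    def test_rejects_a_wrong_password(self):
        self.assertFalse(verify_password("random guess", self.salt, self.stored))

    def test_rejects_an_empty_password(self):
        self.assertFalse(verify_password("", self.salt, self.stored))
```

Assuming the file is saved as test_login.py, `python -m unittest test_login.py` runs the three tests. The two “rejects” tests are the ones that would have caught the bug.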

Better even… write your tests and then automate them.

That’s even better than running your tests locally right before pushing your code to your code repository. Get your code repository to do it for you. On each Pull Request.

Whether you’re using GitHub, Bitbucket or GitLab for your code repositories, there’s an option to run your tests automatically on each Pull Request (you don’t need a third party service; just use GitHub Actions, Bitbucket Pipelines or GitLab’s option).

The way GitHub Actions, Bitbucket Pipelines etc work is by provisioning a Virtual Machine on their infrastructure, Git cloning your source code there, installing the dependencies etc, running your tests and then showing you the terminal output in GitHub/Bitbucket/GitLab’s UI. So you and any of your colleagues can check the results of the tests for your PR without running them themselves on their machine. Brilliant!
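Stripped of the hosting part, what these services do boils down to running a list of commands in order and stopping at the first one that fails. Here is a toy Python sketch of that idea. `run_pipeline` is a hypothetical name, and the two steps are placeholders standing in for “install the dependencies” and “run the tests”; this is not how any of these products is actually implemented.

```python
import subprocess
import sys


def run_pipeline(steps):
    """Run (name, command) steps in order and stop at the first failure.

    Returns (ok, log) where log holds one (name, exit_code, output)
    tuple per step that was actually run.
    """
    log = []
    for name, command in steps:
        done = subprocess.run(command, capture_output=True, text=True)
        log.append((name, done.returncode, done.stdout + done.stderr))
        if done.returncode != 0:
            return False, log
    return True, log


if __name__ == "__main__":
    # Placeholder commands: a real pipeline would install dependencies
    # and run the project's test suite here.
    steps = [
        ("install", [sys.executable, "-c", "print('dependencies installed')"]),
        ("test", [sys.executable, "-c", "print('tests passed')"]),
    ]
    ok, log = run_pipeline(steps)
    for name, code, output in log:
        print(f"[{name}] exit={code} {output.strip()}")
    print("build is green" if ok else "build is red")
```

The important property is the early stop: a failing test step means nothing after it runs, so a broken build never reaches the deploy step.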

And it gets even better. If you’ve implemented containers as we’ve seen in principle #3, you could not only automatically test your code but also your whole application’s configuration!

By that I mean, it will rebuild your Docker image from scratch and then run your tests inside it.

So you will have tested not only your code changes but your configuration changes too (e.g. changes in php.ini or nginx.conf or in your composer or npm dependencies). And these configuration changes can be a bit stressful, can’t they?

I’ll show you in a minute how we do that. When you do that, you will have implemented what is called…

Continuous Integration (CI).

Which is the process of automating the build and testing of code every time a developer submits a change.

You need this. This is not about automating a few things here and there to save time. It will give you and your team bullet-proof confidence that your code changes haven’t introduced new bugs.

Later, I’ll give you the exact step-by-step and the scripts to set up your own CI on any source code repository provider. So you have no excuse!

Now… To the next step… What is the other daily step that is somewhat manual, error-prone and stressful?

Deploying your changes to production.

We’ll automate that too.

We can make it so that clicking the PR merge button in GitHub, Bitbucket or GitLab actually deploys your code to production.

Because you shouldn’t SSH into your servers manually.

Neither to update configuration nor to deploy your code.

That’s mad!

One day it’s going to blow up.

Instead we’re going to build our Docker image (remember that’s code + configuration), test it and push it to our servers. All of it automatically.

All we need is some sort of agent on our servers that will detect new Docker images, pull them and swap the old running ones for the new ones and I’ll explain how we do that later on.

And that part is called Continuous Delivery (CD).

It’s the process of deploying code changes into production in a safe, automated and sustainable way.
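The decision that server-side agent has to make is small enough to sketch in a few lines of Python. Everything here is hypothetical: `plan_rollout` is a made-up name, the digests are plain strings, and starting the new container before stopping the old one is just one possible way to do the swap.

```python
def plan_rollout(running_digest, latest_digest):
    """Return the ordered actions the agent should take, or [] if up to date.

    running_digest: digest of the image currently running (None if nothing runs).
    latest_digest: digest of the newest image in the registry (None if unknown).
    """
    if latest_digest is None or latest_digest == running_digest:
        return []
    actions = [("pull", latest_digest), ("start", latest_digest)]
    if running_digest is not None:
        # Stop the old container only once the new one has been started.
        actions.append(("stop", running_digest))
    return actions


if __name__ == "__main__":
    print(plan_rollout("sha256:old", "sha256:new"))
    print(plan_rollout("sha256:new", "sha256:new"))
```

A real agent would run this check in a loop and carry out the actions through the Docker API; the sketch only covers the “is there anything to do?” part.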

By using CI + CD (that’s 2 sequential steps so we call it a deployment pipeline), we harden our process for testing, building and running our code.

We drastically reduce human error.

We save ourselves a bit of time.

But more importantly we reduce stress and increase our confidence in pushing code changes early and often.

We are not afraid to release anymore. It’s becoming safe, fast and exciting to release!

And you don’t have to pause your sprint and block some time to prepare a release either. Our pipeline does it all for us.

On. Each. Submitted. Code change.

Automatically.

Give me a shout in the comments below if you need the step-by-step instructions to set up your CI/CD pipeline on GitHub Actions or Bitbucket Pipelines.

Principle #5: Manage your environments as-code

That one is my favourite one. It’s going to blow your socks off.

15 years ago, I was an intern developer in a tiny tech startup.

Like most software companies, we needed to run our software online and we needed a website.

So one day, my CTO ordered two servers from Dell.com. For a whole week before we placed the order, everybody in the company discussed the number of CPUs and the amount of RAM and disk we would need. It was a serious commitment.

I loved my CTO. Learned so much from him. Bertrand if you read this, I hope you’re keeping well!

Anyway, two weeks later, two huge rack servers show up at the door. If you’re like me and unboxing brand new RAM sticks and hard drives as a kid was your jam, imagine being an intern unboxing 2 Dell rack servers. Party time!

So we booted them up and gave them a spin. Then we loaded them into the back of the CTO’s car and drove to the facilities of our hosting provider, a tiny data centre in the middle of the city. How exciting was that! High security fire-proof doors everywhere, huge air cooling systems, multiple server rooms, electricity cables covering all the walls. I felt like a CIA agent :D

We racked the servers in, plugged them into the power supply, the datacenter dude configured their network interfaces and then gave us…

… the two beautiful public IPs and network names our servers would have for the rest of their lives ❤

It’s like they were just born in front of us.

From there, we knew their IPs by heart. We were referring to our servers by their names. I was logging into each of them every morning, checking Windows Services and network logs to check if everything was running alright. Checking their CPU temperature and disk usage.

Our servers were like our pets!

Fast forward to 2021.

Some of us still treat our servers like pets. Others treat them like cattle. Booting, configuring, updating and shutting them down by the hundreds. Systems engineers at Facebook manage on average 20,000 virtual machines per engineer.

For the rest of us (the mortals…), there are mostly two cases.

If you’re an agency, you’re probably setting up new servers and configuration for each of your clients on a regular basis.

If you’re a product company, you probably manually manage a growing infrastructure, sometimes adding a server or a database here and there and putting it all together by clicking around some sort of cloud UI.

Problem is. That involves so much configuration. For the VM. Keeping the OS up to date. The firewalls. The services. For the database. The automated backup configuration. There are SSH keys and users and permissions and access rights to set up.

And most likely it’s all manual.

And it has to be done for each environment, staging, QA and production. Manually. Oops I broke something!

Well… Remember Docker and configuration-as-code from principle #3? How good was that to do it all as-code, so we can commit it and version it. Now we’re spinning containers up left, right and centre.

Now that it’s automated we don’t care about rebuilding everything from scratch, or duplicating environments.

Now it’s as easy to add to the php.ini config on production as it is to change an HTML div in our web application views. Because either way it’s just a line of code in our repository and our CI pipeline will rebuild all of it automatically.

So we don’t sweat it.

Well, infrastructure-as-code is the equivalent of this, but for our infrastructure. You could call it programmable infrastructure.

In its simplest form, it’s a script that makes requests to your cloud provider API to create and configure the server-side resources you need.

And once you have that script sorted out and working for your exact needs, you worry much less about what could happen to your servers. You know that if anything gets broken, you can set it all up again from scratch without breaking a sweat.

So you can stop worrying about your servers as if they were pets.

You become more confident.

Now to the practical steps. Meet Terraform!

Terraform is an open source infrastructure-as-code tool that codifies APIs into declarative configuration files.

Declarative is the real deal. If you’ve ever written imperative jQuery code and then learnt a modern declarative JavaScript framework such as Vue.js or React, then you know what I’m talking about.

First, you write the cloud resources you need in Terraform configuration files, using Terraform’s language (which is called HCL). Then Terraform will make requests to your cloud provider’s API to create all these resources until what you have in your cloud account matches exactly what you have declared in your Terraform files.

Change a little something like your instance size in your Terraform files and Terraform will update your cloud account accordingly. Add a new database in your Terraform project and boom, it’s created in your cloud account.
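If “declarative” still feels abstract, here is the core idea as a toy Python sketch: you describe the state you want, and a planner works out what to create, update or delete to get there. This is not Terraform’s code or API. `plan` and `apply_plan` are hypothetical names, resources are plain dictionaries, and the toy treats every existing resource as managed, whereas real Terraform only touches what it tracks in its own state.

```python
def plan(desired: dict, actual: dict) -> dict:
    """Compare declared resources with existing ones and list what must change."""
    shared = desired.keys() & actual.keys()
    return {
        "create": sorted(desired.keys() - actual.keys()),
        "update": sorted(name for name in shared if desired[name] != actual[name]),
        "delete": sorted(actual.keys() - desired.keys()),
    }


def apply_plan(desired: dict, actual: dict) -> dict:
    """Return the state after carrying out the plan; it always equals `desired`."""
    changes = plan(desired, actual)
    new_state = dict(actual)
    for name in changes["delete"]:
        del new_state[name]
    for name in changes["create"] + changes["update"]:
        new_state[name] = desired[name]
    return new_state


if __name__ == "__main__":
    # Made-up resource names and attributes, purely for illustration.
    desired = {"web": {"size": "t3.small"}, "db": {"engine": "postgres"}}
    actual = {"web": {"size": "t3.micro"}, "old-cache": {"engine": "redis"}}
    print(plan(desired, actual))
    print(apply_plan(desired, actual) == desired)
```

Notice that you never write “resize the web server” anywhere. You only change the declared size, and the difference between the two dictionaries becomes the to-do list.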

You can also duplicate your projects in a click or so. Which is very handy if you’re deploying the same thing over and over for your agency clients. Or as a product company to create identical staging and production environments.

And if you deploy some infrastructure through Terraform just for testing something briefly, it can keep track of the created resources and delete them all when you’re finished so that your cloud account stays neat and clean.

You might have heard about other infrastructure-as-code tools. Just use Terraform.

Just like your server configuration is now in Dockerfiles, which you can commit, version, peer review etc., your infrastructure is now as-code too.

It’s safer. Less stressful. Makes you more confident about modifying your infrastructure.

The icing on the cake is that Terraform configuration files are the perfect documentation for your infrastructure. They’re easy to read and all the history is in Git. External, manual, documentation can easily get out of sync with what’s live, but that won’t happen with Terraform configuration files.

Now let’s deploy our web applications to AWS using everything we’ve learnt so far.

What’s next? Use Battle-Tested Infrastructure Blueprints

Let’s talk CLOUD.

I won’t get into the boring details of what the cloud is blah blah blah.

But in a nutshell… Remember my two pet servers from my first job? Well, now, with, say, an AWS account, we can provision as many servers or databases as we need with a single API call (or Terraform configuration file), in a few minutes. On demand.

But there’s more to it than Virtual Machines and databases.

There are two fundamental things that the cloud also gives us.

They are managed services, and solutions for security, high availability and high resiliency.

Managed services are cloud services where the complexity of hosting, maintaining and scaling a given service is abstracted away from you by the cloud provider.

Think of managed load balancers, object storage and databases. If you were to provision, host and maintain your own load balancing service using Nginx, Varnish or HAProxy, it would be like opening a can of worms.

Same if you were trying to build your own service to store, encrypt, backup and serve a growing number of files generated by your application.

Same if you were trying to scale and distribute your relational database to multiple nodes. Good luck with mastering all of the configuration for that.

AWS definitely lets you do all these things manually if you want to. But it also provides you with welcome shortcuts. Managed services.

Launch an AWS load balancer, S3 bucket or a managed (even auto-scaling) database and job done.

AWS takes care of the security (OS patches, service configuration, DoS protection etc), the availability (that’s managing a downtime objective) and the resiliency (using distributed technologies so that when one server’s hard disk dies, for example, the service keeps running and no data is lost).

With the cloud, it’s a constant tradeoff between running bare virtual machines and bearing the cost of managing it all yourself, versus paying for managed services instead.
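If "downtime objective" sounds abstract, a quick calculation helps. The sketch below converts an availability target into the downtime it leaves room for over a year. The function name and the three example targets are mine, picked for illustration, and are not figures quoted from any particular AWS service agreement.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600, ignoring leap years


def allowed_downtime_minutes(availability_percent: float) -> float:
    """Minutes of downtime per year that an availability target leaves room for."""
    if not 0 <= availability_percent <= 100:
        raise ValueError("availability must be a percentage between 0 and 100")
    return MINUTES_PER_YEAR * (100 - availability_percent) / 100


for target in (99.0, 99.9, 99.99):
    print(f"{target}% -> {allowed_downtime_minutes(target):.1f} minutes per year")
```

Each extra nine cuts the allowance by a factor of ten: roughly 3.6 days at 99%, under 9 hours at 99.9% and under an hour at 99.99%. Hitting the higher targets on servers you run yourself is exactly the kind of work a managed service takes off your plate.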

So what about our web application?

We’ll make the most of managed services. We’ll deploy it onto Fargate (ECS), Aurora, S3 and ElastiCache.

➡️ Our Docker containers will run on Fargate, with no servers for us to configure.

➡️ Our database will automatically be backed up and can even scale automatically.

➡️ Our file storage solution is automatically encrypted, is designed for 99.999999999% durability and 99.99% availability, and scales virtually without limit at a very reasonable price.

➡️ Our Redis servers are automatically patched for security holes and we can scale them with a click of a button.

This is the exact infrastructure I’ve used to scale a software business to one million paying customers. You can see all the infrastructure details in my ebook Deploy Laravel on AWS with CloudFormation.

Just use this architecture. Ignore the CloudFormation code because you now know that you should use Terraform. Same thing, give me a shout in the comments below and I’ll send you a link to the exact configuration files to build this exact Fargate infrastructure with Terraform.

How do you implement these Principles in your company?

Are you ready to build everything as-code? Are you convinced? I’m certainly never going back…

Do you remember my story about discovering Docker? The best part was when I realised I didn’t have to learn everything from scratch myself. I could copy, or at least start from, other people’s battle-tested configuration recipes. Which would save me weeks and weeks of work.

And what I discovered over the years is that reading and starting from other people’s configuration is by far the fastest way to learn!

Some cloud services in particular can be awful to learn from scratch. Networking on AWS is an example. You need to learn VPCs, public/private subnets, network ACLs, NAT gateways, route tables etc and it’s a massive pain…
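To give a feel for just one small piece of that networking puzzle, here is a sketch of carving a VPC address range into public and private subnets, using Python's standard `ipaddress` module. The `10.0.0.0/16` range, the `/24` subnet size and the subnet names are assumptions chosen for the example, not a recommended layout.

```python
import ipaddress

# Hypothetical VPC address range; any private CIDR block works the same way.
vpc = ipaddress.ip_network("10.0.0.0/16")

# Carve the VPC into /24 blocks and hand out the first four.
blocks = list(vpc.subnets(new_prefix=24))[:4]
layout = {
    "public-a": blocks[0],
    "public-b": blocks[1],
    "private-a": blocks[2],
    "private-b": blocks[3],
}

for name, subnet in layout.items():
    print(f"{name}: {subnet} ({subnet.num_addresses} addresses)")
```

And that only covers the address plan. Whether a subnet is actually public or private depends on its route table, and AWS reserves five addresses in every subnet, so the usable count is a little lower than what is printed here.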

Some other cloud services can be quirky and cost you lots of wasted time and pulled hair before you eventually get them right. ECS Fargate is one of them…

And that’s just the basics. There’s much more you should do for overall security and data protection.

So when you have the opportunity to start from battle-tested off-the-shelf end-to-end blueprints from people who’ve done it many times before, I recommend you do that ;-)

And I probably already have the exact infrastructure blueprint you need. Bullet-proof and battle-tested. The AWS stack I mentioned above? I used this to serve tens of millions of users. Just in my last job. Some other infrastructure blueprints I refined over 2+ years. That’s a hell of a shortcut.

So if you need a specific setup built properly as-code, with the full Terraform + Docker + CI/CD pipeline configuration for your specific needs, and you want someone to come and help your team with that, connect with me on LinkedIn.

Finally, if you have an existing infrastructure and are unhappy with managing it by hand, I can help you keep it under control and reduce human error and downtime, all by migrating it to Terraform.

Lionel helps high-growth tech companies automate their infrastructure management using infrastructure-as-code and containers. Reach out to him for consulting work on https://getlionel.com.
