Kahu Software, Inc. ⋅ AWS' Major Outage Today

What Happened?

Hundreds of major companies went dark when Amazon Web Services (AWS) suffered an extensive outage. The outage disrupted the Northern Virginia (us-east-1) region, which many consider a single point of failure (SPOF) because configuration hosted there affects deployments worldwide.

Mehdi Daoudi, CEO of Catchpoint, told CNN in a statement that the total loss could "easily" reach the hundreds of billions of dollars. The estimate accounts for lost employee productivity on top of business operations that came to a halt in major industries such as airlines and factories.

Is It Resolved?

Kahu Software has a few clients in the Northern Virginia region, so we can see directly who and what was affected, beyond AWS' own statements.

The last failed Amazon Simple Email Service (SES) request in our logs arrived at 2025-10-20 18:29:34 UTC, or 1:29 PM CT.
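The UTC-to-Central conversion above can be checked with a few lines of Python. This is a minimal sketch, assuming Central Daylight Time (UTC-5) was in effect on that date; the helper name utc_to_central is ours for illustration.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Central Daylight Time (UTC-5) was in effect on 2025-10-20.
CDT = timezone(timedelta(hours=-5), name="CDT")


def utc_to_central(stamp: str) -> datetime:
    """Parse a 'YYYY-MM-DD HH:MM:SS' UTC timestamp and convert it to CDT."""
    parsed = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).astimezone(CDT)


last_failure = utc_to_central("2025-10-20 18:29:34")
print(last_failure.strftime("%Y-%m-%d %I:%M:%S %p %Z"))
# prints: 2025-10-20 01:29:34 PM CDT
```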

This means the outage lasted ~12 hours!

Was There Any Damage?

We will have to keep digging into everything that occurred. In Laravel applications, a background job that throws an exception and exhausts its retry attempts is recorded in the failed_jobs table. This lets you put those jobs back on the queue after assessing the damage.
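One way to assess the damage is to export the failed_jobs rows and pick out the ones that failed during the outage with a us-east-1 error. The sketch below is illustrative only: the FailedJob class, the sample rows, and the window start time are assumptions, not data from our systems. The uuid, exception, and failed_at fields mirror columns in Laravel's default failed_jobs table.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class FailedJob:
    uuid: str
    exception: str
    failed_at: datetime


def jobs_to_retry(jobs, start, end, marker="us-east-1"):
    """Return UUIDs of jobs that failed inside [start, end] with marker in the exception."""
    return [
        job.uuid
        for job in jobs
        if start <= job.failed_at <= end and marker in job.exception
    ]


SMTP_ERROR = 'Connection could not be established with host "email-smtp.us-east-1.amazonaws.com:587"'

# Hypothetical rows for illustration.
jobs = [
    FailedJob("job-a", SMTP_ERROR, datetime(2025, 10, 20, 9, 15, 0)),
    FailedJob("job-b", "Division by zero", datetime(2025, 10, 20, 10, 0, 0)),
    FailedJob("job-c", SMTP_ERROR, datetime(2025, 10, 21, 8, 0, 0)),
]

# Assumption: the window start is a placeholder; the end is our last observed failure (UTC).
window_start = datetime(2025, 10, 20, 6, 30, 0)
window_end = datetime(2025, 10, 20, 18, 29, 34)

retry_ids = jobs_to_retry(jobs, window_start, window_end)
print("php artisan queue:retry " + " ".join(retry_ids))
```

The printed command uses Laravel's queue:retry Artisan command, which accepts one or more failed job IDs. Running php artisan queue:retry all retries everything instead, which is only safe if every failed job is one you want to run again.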

Many of our requests to us-east-1 resulted in the following responses:

500 Internal Server Error response: {"message":null}

Connection could not be established with host "email-smtp.us-east-1.amazonaws.com:587"

Media conversion requests to that region failed, and many automated emails were never sent.

What was especially frustrating were AWS' updates on its status page: https://health.aws.amazon.com/health/status

AWS repeatedly stated that it had found an issue and that "We are seeing significant signs of recovery."

Clearly, our logs showed the opposite.

How To Mitigate This?

Hosting a server or relying on services in a single region is risky. For larger applications, a common approach is to configure DNS to load balance servers across at least two regions, with scaling groups that add servers in the event of a regional outage.
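DNS load balancing covers your own servers. For calls your application makes to a regional service, a complementary option is to fall back to a second region in code. The following is a minimal sketch of that idea, not a DNS configuration: send_with_failover, AllRegionsFailed, and fake_send are hypothetical names, and the fake sender stands in for a real client so the example runs without network access.

```python
from typing import Callable, Sequence


class AllRegionsFailed(Exception):
    """Raised when every configured region refused the request."""


def send_with_failover(regions: Sequence[str], send: Callable[[str], str]) -> str:
    """Try each region in order and return the first successful result."""
    errors = []
    for region in regions:
        try:
            return send(region)
        except ConnectionError as exc:
            # Remember the failure and move on to the next region.
            errors.append(f"{region}: {exc}")
    raise AllRegionsFailed("; ".join(errors))


def fake_send(region: str) -> str:
    """Stand-in sender that simulates us-east-1 being unreachable."""
    if region == "us-east-1":
        raise ConnectionError("Connection could not be established")
    return f"sent via {region}"


result = send_with_failover(["us-east-1", "us-west-2"], fake_send)
print(result)  # prints: sent via us-west-2
```

In a real application the send callable would wrap whatever client you use for the service, and the second region needs to be set up ahead of time (verified sender identities, credentials, quotas) for the fallback to work.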

The same goes for DNS providers like Cloudflare, which experienced a serious outage in 2024. To mitigate DNS failures, you can list name servers from more than one provider in your DNS configuration. If Route 53 or Cloudflare goes down due to a networking issue, your customers will be routed to whichever provider is still alive.
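A quick way to audit a domain is to look at its NS records and count how many distinct providers they belong to. The sketch below assumes you already have the name server hostnames as a list. The PROVIDER_HINTS substrings are a rough heuristic we chose for illustration, and the sample hostnames are made up.

```python
# Assumption: these substrings are a simple heuristic, not an official list.
PROVIDER_HINTS = {
    "awsdns-": "Route 53",
    ".ns.cloudflare.com": "Cloudflare",
}


def providers_for(name_servers):
    """Map each name server hostname to a provider label and return the set of labels."""
    found = set()
    for ns in name_servers:
        host = ns.lower().rstrip(".")
        for hint, provider in PROVIDER_HINTS.items():
            if hint in host:
                found.add(provider)
                break
        else:
            found.add("unknown")
    return found


def has_provider_redundancy(name_servers):
    """True when the name servers span at least two recognized providers."""
    known = providers_for(name_servers) - {"unknown"}
    return len(known) >= 2


# Hypothetical hostnames for illustration.
single = ["ns-123.awsdns-45.com", "ns-678.awsdns-09.org"]
mixed = single + ["ada.ns.cloudflare.com"]

print(has_provider_redundancy(single))  # prints: False
print(has_provider_redundancy(mixed))   # prints: True
```

Note that several name servers from the same provider do not count as redundancy here, since a provider-wide outage would take all of them down together.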

AWS' Major Outage Today - North Virginia Oct 20, 2025
