Slow server over the past few days

By James Cridland
Posted 18 October 2015, 4.44pm EDT


The above image shows why. Probably. For some reason, thousands of hits are pouring in asking for a specific logo (Clubland TV, if you were interested, which I'm sure you were). This was a link to the old Media UK infrastructure, which was re-pointed to the servers a few days ago.

A server script runs for these old links and redirects them properly. But that was producing a database hit every time it ran. So that probably wasn't good. It doesn't any more.
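For the curious, here's a rough sketch of one way to stop a redirect script hitting the database on every request: keep the answers in memory. This isn't the actual Media UK code. The `query_database` function, the `_FAKE_DB` table and the paths in it are all made up for illustration, and Python's standard `functools.lru_cache` is just one way to do the caching.

```python
from functools import lru_cache

# Hypothetical stand-in for the real table of old-to-new links.
_FAKE_DB = {"/old/logo/example.png": "/logos/example.png"}

# Counts how often we actually go to the "database".
stats = {"db_hits": 0}


def query_database(old_path):
    """Pretend database lookup (an assumption, not the real query)."""
    stats["db_hits"] += 1
    return _FAKE_DB.get(old_path)


@lru_cache(maxsize=10_000)
def lookup_redirect(old_path):
    """Return the new path for an old link, or None if there isn't one.

    Results are cached in memory, including the None for unknown links,
    so a flood of requests for one path costs a single database hit.
    """
    return query_database(old_path)


# A thousand requests for the same old logo reach the database once.
for _ in range(1000):
    lookup_redirect("/old/logo/example.png")
```

The catch with an in-memory cache like this is that it goes stale if the redirect table changes, so it would need clearing (`lookup_redirect.cache_clear()`) whenever the links are updated.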

And then the redirect - to the front page in this case - was, of course, producing a ton more database calls, since the front page is full of them. That's fine for a 'proper' request, but not so fine for a request for an image. So, all calls to* will now (rather uglily) fail.
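As a sketch of the idea (not the real code), a handler for unknown old links could look at the file extension and fail fast for anything that looks like an image, saving the front-page redirect for requests that look like pages. The function name, the list of extensions and the example paths are all assumptions for illustration.

```python
from pathlib import PurePosixPath

# Extensions treated as "just a file" (an assumed list; adjust to taste).
STATIC_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".ico"}


def respond_to_unknown_path(path):
    """Return (status, location) for an old link with no known redirect.

    Image-like requests get a cheap 404 with no database work at all.
    Anything else is sent to the front page, as before.
    """
    # Drop any query string before looking at the extension.
    path_only = path.split("?", 1)[0]
    suffix = PurePosixPath(path_only).suffix.lower()
    if suffix in STATIC_EXTENSIONS:
        return 404, None
    return 302, "/"


# A hypothetical old logo link fails fast; an old page link still redirects.
logo_response = respond_to_unknown_path("/old/logos/example.png")
page_response = respond_to_unknown_path("/old/some-page")
```

A 404 is ugly for whoever is hotlinking the logo, but it costs the server almost nothing, which is rather the point.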

So the database server ran really slowly, which meant that the webserver, in turn, filled up with requests that weren't being turned around fast enough. And so everything just ground to a halt.

Anyway. Lessons learnt from the above are:

  1. Don't do a database call for every file request
  2. Don't redirect every file request to a database-heavy front page when it's just a logo
  3. Amazon CloudFront doesn't always help you.

I think I've managed to put the changes in place to stop this from happening again; but we'll see, won't we...


4 years, 3 months ago

The database server has also had an upgrade, by the way. Like the website (and every computer I own), it no longer uses spinning discs of magnetic material, and is now entirely powered by SSD.

4 years, 3 months ago

A few more pieces of downtime later, and it's clear that the above was one thing, but it wasn't the whole story. A much more esoteric issue with just the front page is now fixed, and I'm confident - with other changes I've made - that any errors are a thing of the past.

Even though I've also made a much nicer error message if it ever happens again.

