Site problems

As many have noticed, we had a few problems with the site yesterday. Thanks to everyone who emailed to let me know they were experiencing difficulties.

We think the problem has been solved, and that the site should be stable now.

Post-mortem

For the record, events seem to have progressed something like this:

  • An unusual spike of broken search bots and spam attempts created large numbers of session records, making the session table and its indexes unusually large (six times normal size) and creating high database contention.

    (As an aside, the search bot industry is to be commended on its inspiring reliance on recursion to construct request URLs. As soon as the authors learn how to either write a base case or use coinduction properly, we'll be all set, and bot-generated URLs will stop converging towards infinite length.)

  • At some point, the session table index became corrupted (MySQL FTW?)
  • Drupal was confused by the index problem, and overwrote the user and permissions records for the anonymous user (user id 0), something it is known to do in various situations. When in doubt, write to not-so-random places in the database!
  • Anonymous users were prevented from accessing site content. This persisted for a number of hours, unnoticed. As Kanye West might say, in an accusatory tone, "LtU doesn't care about anonymous users!"
  • Ultimately, the session index problem became bad enough to affect logged-in users.
  • A cry of anguish went up across the PL blogosphere, as if the evaluation of millions of lambdas had suddenly hit bottom. A crack team of roughnecks was flown to a secret base... Sorry, different story. Much investigation and a few SQL updates later, the problem was solved.
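
To make the runaway-URL aside concrete, here is a rough sketch of how such requests might be spotted. It is an illustration only, not what LtU actually runs: the function name `looks_recursive` and both thresholds are made up for this example, and the heuristic (flag a URL that is very long or that repeats the same path segment many times) is just one plausible approach.

```python
from urllib.parse import urlsplit


def looks_recursive(url: str, max_repeats: int = 3, max_length: int = 512) -> bool:
    """Heuristically flag URLs that look like runaway bot recursion.

    Both thresholds are arbitrary assumptions chosen for illustration.
    """
    # A URL that keeps growing will eventually trip a plain length limit.
    if len(url) > max_length:
        return True

    # Split the path into non-empty segments, e.g. "/node/1" -> ["node", "1"].
    segments = [s for s in urlsplit(url).path.split("/") if s]

    # Count how often each segment occurs anywhere in the path.
    counts = {}
    for segment in segments:
        counts[segment] = counts.get(segment, 0) + 1

    # Flag the URL if any single segment repeats more than max_repeats times.
    return any(n > max_repeats for n in counts.values())
```

A check like this would flag `http://example.org/node/1/node/1/node/1/node/1/node/1` while leaving `http://example.org/node/1234` alone. It can produce false positives on sites whose legitimate paths repeat segments, so the thresholds would need tuning.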

At the not insignificant risk of trolling...

...has a migration to PostgreSQL ever been contemplated?

SQL is a leaky abstraction

MySQL is only being used for line-of-least-resistance reasons: because it's the default database for Drupal, as well as for various other systems being used on this server.

This means that switching to PG would have its own set of not-so-desirable consequences. For example, Drupal's code base apparently contains various "inherent mysqlisms" which favor MySQL over PG, among other issues.

The most significant concern is that maintenance of Drupal support for PG has been inconsistent, at best, and even actively opposed by some Drupal core developers. Similarly, contributed add-on modules don't always support PG, although that's supposed to have improved in more recent versions than the one LtU is running.

Here's a recent blog post with a comparison. Here's an older post which includes a brief description of the PG maintenance scenario.

Problems like the one that's just happened can probably be prevented with more proactive management, like better blocking of bad bots, and more aggressive deletion of anonymous sessions.
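
As a rough sketch of what "more aggressive deletion of anonymous sessions" could look like, the following deletes sessions belonging to the anonymous user (user id 0) that are older than a cutoff. Everything here is an assumption made for illustration: the table name `sessions`, the columns `sid`, `uid` and `timestamp`, and the function `purge_anonymous_sessions` are hypothetical, and SQLite stands in for MySQL purely so the example runs on its own.

```python
import sqlite3
import time


def purge_anonymous_sessions(conn, max_age_seconds, now=None):
    """Delete anonymous (uid = 0) sessions older than max_age_seconds.

    Returns the number of rows deleted. The schema is an assumption.
    """
    now = int(time.time()) if now is None else now
    cutoff = now - max_age_seconds
    cur = conn.execute(
        "DELETE FROM sessions WHERE uid = 0 AND timestamp < ?",
        (cutoff,),
    )
    conn.commit()
    return cur.rowcount


def demo():
    """Build a tiny in-memory table and purge stale anonymous sessions."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE sessions ("
        "sid TEXT PRIMARY KEY, uid INTEGER NOT NULL, timestamp INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO sessions VALUES (?, ?, ?)",
        [
            ("a", 0, 1000),  # old anonymous session: should be deleted
            ("b", 0, 9000),  # recent anonymous session: kept
            ("c", 7, 1000),  # old session of a logged-in user: kept
        ],
    )
    deleted = purge_anonymous_sessions(conn, max_age_seconds=3600, now=10000)
    remaining = [row[0] for row in conn.execute("SELECT sid FROM sessions ORDER BY sid")]
    conn.close()
    return deleted, remaining
```

Run periodically (from cron, say) with a short maximum age, something along these lines would keep the anonymous portion of the session table from ballooning during a bot spike, while leaving logged-in users' sessions untouched.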

Quite Right

I was really only addressing MySQL's trashing of its index. I do hope that future versions of Drupal are more successful in supporting PostgreSQL. I hope even more fervently that blog posters of benchmarks between MySQL and PostgreSQL bother to learn how to do even minimal configuration of their OSes, PostgreSQL, and Drupal so that their benchmarks aren't trivially broken, but that's another subject altogether.