It is hard to predict, in nearly every pursuit, what will be popular and what will not. Blog posts are no exception. Sometimes you post something that seems sure to generate a lot of interest and it fades without a trace; sometimes you post something that seems like no big deal and, for whatever reason, people care a lot.
The latter case happened here the other day, with a post about libraries. The post was circulated via Reddit, which, if the wind is blowing just right, can be the equivalent of dropping a match in a dry pine forest. Indeed, Reddit spread the post so quickly and well that our website was overwhelmed and, for the better part of 24 hours, was inaccessible, even to us. We couldn’t even get onto the site of our hosting service, 1and1.com, to see what was going on.
But that’s what I learned in retrospect; at the time, we were baffled.
When I was finally able to access the hosting site yesterday, I found a neat example of how illusory data can be. Let me explain:
The first graph I looked at showed the number of daily unique visitors (excluding feed readers) on Monday and Tuesday of this week. It was on Tuesday afternoon that we began to have overload problems. But this graph alone didn’t indicate anything very unusual: there were about 60,000 visitors on Monday and 70,000 on Tuesday.
Next I looked at the hourly traffic for Tuesday. It, too, looked normal, growing in late morning EDT, peaking at about 3 or 4 p.m. EDT, but overall showing a pretty gentle upward slope and then a gentle downward slope. Again, if you looked at this graph alone, you wouldn’t have suspected anything out of the ordinary. Why? I assume it’s because the nature of the overload meant that some people were getting through, but it took a long, long time.
(I would show you these two graphs I refer to above but, sadly, 1and1 is still not providing full service to us, so I can’t; I did, however, download two images that are pasted below.)
But then I looked at the site data for Referrer URLs, i.e., where people are on the Internet before they land on our site. This data usually reflects only about 10% of our incoming traffic (it doesn’t tell you when people come to your site from their own bookmark, for example), but it’s still helpful. Here is the Referrer data for Monday, which was a normal day:
And here is the Referrer URL data for Tuesday:
Hello, Reddit! Once you see this picture, the cause of our overload is pretty obvious; but the first two layers of data I looked at offered no clues as to the nature of the problem, or even the fact that there was a problem. This tiny and insignificant riddle made me think about how hard it is for a doctor to diagnose a problem in a patient, and why I was so impressed with Jerome Groopman’s book How Doctors Think, which is an exploration of that exact problem.
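For the curious, here is a rough sketch in Python of the kind of comparison that made the problem visible: tally hits by referring domain for each day, then see which domains grew the most. The sample data and the function names (`referrer_counts`, `biggest_changes`) are made up for illustration; this is not how 1and1 computes its reports, and the numbers are not our real traffic.

```python
from collections import Counter
from urllib.parse import urlparse


def referrer_counts(referrer_urls):
    """Count hits per referring domain; an empty string means a direct visit."""
    counts = Counter()
    for url in referrer_urls:
        host = urlparse(url).netloc.lower() if url else ""
        if host.startswith("www."):
            host = host[4:]
        counts[host or "(direct)"] += 1
    return counts


def biggest_changes(before, after, top=3):
    """Return the domains whose hit counts grew most between two days."""
    growth = {domain: after[domain] - before.get(domain, 0) for domain in after}
    return sorted(growth.items(), key=lambda kv: kv[1], reverse=True)[:top]


# Hypothetical sample data: the daily totals are similar,
# but the mix of referrers is very different.
monday = (["http://www.google.com/search?q=x"] * 5
          + ["http://example.org/a"] * 3
          + [""] * 2)
tuesday = (["http://www.google.com/search?q=x"] * 5
           + ["http://example.org/a"] * 2
           + ["http://www.reddit.com/"] * 4
           + [""])

mon = referrer_counts(monday)
tue = referrer_counts(tuesday)

print("Total hits:", len(monday), "vs", len(tuesday))
print("Biggest changes:", biggest_changes(mon, tue))
```

In this toy data the totals barely move from one day to the next, yet the per-referrer breakdown puts the new source at the top of the list, which is roughly what happened to us.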
Apologies to all of you who had a hard time getting onto our site, and apologies also for not posting during our outage. The good news is that we will soon be making a pretty significant change around here, to be announced in a few weeks, that should guard against this type of problem in the future.