Our Daily Bleg: What to Do About Too Much Data?

Tucked inside his bleg is the part that tickled me the most: a website Evan created to tell him whether it’s worth it to watch a basketball game he’d recorded. Anyway, I’ll give my answer below, after his bleg.

I was wondering if too much data is ever a bad thing? I ask because I thought one of the rules of life that I’ve learned is that it’s best to have as much data as possible.

Whether it be hard numbers or smart people around you, at least when you are starting out, you want as much information as you can get. The smart guys are the ones who know how to analyze it.

However, in my personal life I was having a problem with too much data. I watch all of the Warriors basketball games on DVR, and nothing is worse than watching for 1.5 hours only to see your team get blown out. At the same time, I never want to see the score before I watch because that ruins the game. To solve the problem I created a little website (www.shouldiwatch.com) that warns me if a game is bad but won’t tell me anything about the outcome (who won or the score) if the game was relatively close. Trust me, as a Warriors fan this is a huge time saver. It’s a stupid little example, but it makes me wonder if there are other cases in research when you need to turn away from some information.

Anyway, it’s a little bit backwards to think of, but I thought it might be interesting to explore.
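For the curious, the rule Evan describes can be sketched in a few lines of Python. This is only a guess at the logic, not the actual code behind his site: the function name, the 15-point cutoff, and the wording of the messages are all assumptions made up for illustration. The one property taken from his description is that the answer never reveals who won or what the score was.

```python
# Hypothetical sketch of a "should I watch?" filter.
# The threshold and messages are assumptions, not taken from the real site.
BLOWOUT_MARGIN = 15  # assumed cutoff, in points, for calling a game a blowout


def should_i_watch(team_score: int, opponent_score: int,
                   blowout_margin: int = BLOWOUT_MARGIN) -> str:
    """Return a verdict that never mentions the winner or the score."""
    margin = abs(team_score - opponent_score)
    if margin >= blowout_margin:
        return "Skip it: this one was lopsided."
    return "Worth watching: this one stayed close."


if __name__ == "__main__":
    # A 32-point gap is flagged; a 3-point gap is not.
    print(should_i_watch(88, 120))
    print(should_i_watch(101, 104))
```

The design choice worth noticing is that the filter throws information away on purpose: it collapses the full box score into a single yes or no, which is the kind of turning away from data Evan is asking about.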

His question sounds as if it is directed more at quants than writers, but as a writer I’ll say that I face this dilemma daily. Right now, in the middle of our writing SuperFreakonomics, I’m facing a number of short sections that require a bunch of historical reading and research. But the key thing is that these sections remain short — they are not the donut in this case, but the donut hole, and if they start getting swollen they will turn the book into a flabby monster.

The problem is that the reading and research are so much fun that it is really hard to limit yourself. Especially in this age of Google (and Google Books) and Amazon and even Wikipedia (yes, I was an early detractor but have come around on certain subjects), I am constantly trying to take a little sip from a firehose, and it’s nearly impossible. Reading too much inevitably turns into wanting to write too much; in this case, shorter will be better, but it takes a lot of effort and a long time to get the right three paragraphs (as opposed to a much easier but, to my mind, less effective 12 paragraphs).

The problem is that the more I’ve read — and the more data I’ve consumed, to get back to Evan’s question — the better those three paragraphs will be in the end. It reminds me of making maple syrup, which we did every winter as kids. You’d run around collecting all this sap, gallons and gallons of it from the trees you’d tapped, and then stay up all night boiling it down on an open fire — all to produce one little jar of syrup.

Was it worth the effort? Some people would say yes, others no. But in any case, it sure tasted good.

Another view on this would be Malcolm Gladwell's excellent book "Blink: The Power of Thinking Without Thinking". I don't think it's been mentioned, but I apologize if it has... he makes a good case for why less data is sometimes better.

Even better is "Winning Decisions" (Russo and Schoemaker), which, among other things, shows how the marginal impact of data tends to zero very quickly as the amount of data grows, while decision-makers' confidence in their decisions continues to grow.
They describe how important it is for physicians, and for people who bet on horses. The latter has implications for data-deluged investors: more data doesn't make you a better investor, but it tends to make you more confident in your decision, which, I guess, means you will pay higher prices for the securities being analyzed, thus assuring yourself of lower returns.

I'm surprised no one has mentioned attention economics explicitly, defined (via Wikipedia) as: "an approach to the management of information that treats human attention as a scarce commodity."

I rely on filters to decide what to read and what to ignore: publishers, mainstream media, and Google's PageRank algorithm all do a passable job.

The best filter of all? History. When I pick up a book that people are still talking about twenty years, two centuries (Tocqueville), or two millennia (Plutarch) later, I'm never disappointed.