How Valid Are T.V. Weather Forecasts?

Eggleston and his daughter two minutes before it began to hail. Says Eggleston, “Hail was not in the forecast.”

A gentleman named J.D. Eggleston recently wrote to us with a rather interesting report, a nice piece of D.I.Y. Freakonomics concerning the accuracy of local T.V. weather forecasts. I thought it was interesting enough to post in its entirety here on the blog, and I hope you agree. Before we get to the report itself, here is a little background information from Eggleston himself:

I live with my wife and two kids, 15 and 12, in rural northwest Missouri. I earned a bachelor’s in electronics engineering technology from DeVry University in 1987. I’ve been an electronics engineer, software engineer, and for the past 13 years I’ve owned and operated a consumer electronics retail business.

I’ve always loved math and statistics and the information that can be learned from studying them. Growing up, I was told that people like me were called “weird.” Since reading Freakonomics, I now know they are called “economists.” It’s good to know I’m not alone.

The forecasting study began in April of 2007 when my fifth-grade daughter was given a school assignment to monitor the temperature and rainfall at our home for a week. Our family members are big T.V. watchers, and our house is loaded with the latest D.V.R. (digital video recorder) technology. So we decided to document not only the weather results at our home, but also to record the 10 p.m. newscasts for channels 4, 5, 9, and 41 and compare our home results to those reported by the Kansas City T.V. stations.

And while we were at it, we decided to also document each station’s weather predictions and compare them to the actual results to see if one station was better than the others. For a non-T.V. weather source, we also recorded the predictions of the federal government’s National Weather Service each evening.

And now for the report. The takeaway message? Do not plan your weekend activities based on the T.V. weather forecasts unless it is already Thursday — but waiting until Friday would be even better.

How Valid Are T.V. Weather Forecasts?
A Guest Post
By J.D. Eggleston

The authors of Freakonomics posed the question, “Do real estate agents have your best interests at heart?” Then they statistically showed they (the real estate agents) do not. So what about meteorologists? How accurate are their forecasts? Do they even care?

A seven-month study of weather forecasting at Kansas City television stations was conducted over 220 days, from April 22 to November 21, 2007. The seven-day forecasts for both high temperature and P.O.P. (probability of precipitation) from each station’s 10 p.m. telecast and from the N.O.A.A. Web site were recorded. For stations that did not offer a P.O.P. in the form of percent likelihood, the best impression of percent likelihood that could be inferred from the meteorologists’ words and graphics was used. The results of Kansas City’s high temperature and rainfall as reported at the K.C.I. airport weather station (the data that become the official record for weather at Kansas City) were also recorded. Those results were then compared to the high temperature and P.O.P. predictions to determine forecasting accuracy for each source for each of the seven days predicted.

The results were quite enlightening, as were some of the comments of the local meteorologists and their station managers. Here are a few of the quotes we received:

“We have no idea what’s going to happen [in the weather] beyond three days out.”

“There’s not an evaluation of accuracy in hiring meteorologists. Presentation takes precedence over accuracy.”

“All that viewers care about is the next day. Accuracy is not a big deal to viewers.”

Temperature

[Chart: each station’s high-temperature prediction accuracy, one to seven days out]

All of the chief meteorologists were asked, “How close does your high-temperature prediction have to be to the actual temperature for you to feel like you did a good job?”

Without exception, all of the meteorologists answered, “within three degrees.”

The chart above shows the results of the stations’ temperature prediction accuracy for their full seven-day forecasts. For next-day predicting (one day out), all stations met their “within three degrees” goal. For two days out, all but one were within three degrees. But for three days out and beyond, none of the forecasters met their three-degree benchmark, and in fact they got linearly worse each day.

The conclusion to be drawn here is not so much that one station is better than another, since all of them seem to be similar in accuracy, and most people won’t alter their plans based on a couple of degrees of temperature. Rather, none of our stations did a good job, by their own definition of plus or minus three degrees, beyond two days out.
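For readers who want to try this at home, here is a rough sketch of how such a tally could be kept. The report does not say exactly how its numbers were computed, so this assumes the measure is the average absolute miss for each number of days out; the function name and the sample figures are invented for illustration and are not data from the study.

```python
from collections import defaultdict

def mean_abs_error_by_lead(records):
    """records: iterable of (days_out, predicted_high, actual_high)."""
    misses = defaultdict(list)
    for days_out, predicted, actual in records:
        misses[days_out].append(abs(predicted - actual))
    # Average miss, in degrees, for each number of days out
    return {d: sum(m) / len(m) for d, m in sorted(misses.items())}

# Hypothetical sample data, NOT taken from the study
sample = [
    (1, 80, 82), (1, 75, 74),
    (3, 78, 83), (3, 70, 66),
]

errors = mean_abs_error_by_lead(sample)
# The meteorologists' own benchmark: within three degrees
met_goal = {d: e <= 3 for d, e in errors.items()}

print(errors)    # {1: 1.5, 3: 4.5}
print(met_goal)  # {1: True, 3: False}
```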


Getting It Right the First Time

[Graph: how much each station changed its temperature and precipitation predictions over the seven-day span]

When we get our first predictions for, say, June 13th, it will be the seventh day of a seven-day forecast made on June 6th. The following day, it will be the sixth day out, then the fifth, then fourth, and so on until it is tomorrow’s forecast.

Have you ever noticed that the prediction for a particular day keeps changing from day to day, sometimes by quite a bit? The graph above shows how much the different stations change their minds about their own forecasts over a seven-day period.

On average, N.O.A.A. is the most consistent, but even they change their mind by more than six degrees and 23 percent likelihood of precipitation over a seven-day span.

The Kansas City television meteorologists will change their mind from 6.8 to nearly nine degrees in temperature and 30 percent to 57 percent in precipitation, showing a distinct lack of confidence in their initial predictions as time goes on.

The prize for the single most inconsistent forecast goes to Channel 5’s Devon Lucie, who on Sunday, September 30th, predicted a high temperature of 53 degrees for October 7th, and seven days later changed it to 84 degrees, a difference of 31 degrees! It turned out to be 81 that day.

A close second was the initial prediction of 83 for October 15th by Channel 4’s Mike Thompson, which he changed to 53 just two days later. It turned out to be 64 on the 15th.
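One simple way to put a number on this kind of mind-changing is to take the highest and lowest values a station ever predicted for the same target date. That is only an assumption about how the measure might be defined; the report does not spell out its formula, and the function name below is made up for the sketch.

```python
def prediction_spread(predictions):
    """Largest minus smallest value forecast for one target date."""
    return max(predictions) - min(predictions)

# The two extreme cases quoted above. Only the first and the changed
# predictions are given in the text, so the days in between are omitted.
lucie_oct_7 = [53, 84]
thompson_oct_15 = [83, 53]

print(prediction_spread(lucie_oct_7))      # 31
print(prediction_spread(thompson_oct_15))  # 30
```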

Even more conclusively than the temperature accuracy graph, this prediction variance graph shows that 21st century meteorology is not developed enough to provide a week of accurate temperature forecasting.

Meteorologists take a blind stab at what the high temperature and rain possibilities might be seven days out, and then adjust their predictions on the fly as the week goes on. As mentioned earlier, one meteorologist told us: “We have no idea what’s going to happen beyond three days out.”

Will It Rain?

Precipitation will affect the average person’s plans more significantly than temperature. We rely on meteorologists to be accurate in their rainfall predictions so we can plan the events of our lives. Parades, gardening, ball games, outdoor work, car washing, construction work and farming are all affected — positively or negatively — by rain.

We could just assume it will not rain, but it would be nice to have a little heads-up. In measuring precipitation accuracy, the study assumed that if a forecaster predicted a 50 percent or higher chance of precipitation, they were saying it was more likely to rain than not. Less than 50 percent meant it was more likely to not rain.

That prediction was then compared to whether or not it actually did rain, where “rain” is defined as one-tenth of an inch or more of rainfall reported at K.C.I. Anything less than that is so irrelevant, it would likely make no difference in people’s lives.
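The scoring rule just described can be sketched in a few lines. The 50 percent cutoff and the one-tenth-inch definition of rain come from the text; the function names and the sample days are invented for illustration.

```python
RAIN_THRESHOLD_INCHES = 0.1  # "rain" as defined in the study
POP_CUTOFF = 50              # 50 percent or higher counts as predicting rain

def forecast_correct(pop_percent, rainfall_inches):
    predicted_rain = pop_percent >= POP_CUTOFF
    it_rained = rainfall_inches >= RAIN_THRESHOLD_INCHES
    return predicted_rain == it_rained

def accuracy(days):
    """days: list of (pop_percent, rainfall_inches). Returns fraction correct."""
    results = [forecast_correct(p, r) for p, r in days]
    return sum(results) / len(results)

# Hypothetical days, NOT data from the study
sample_days = [(20, 0.0), (60, 0.45), (70, 0.02), (10, 0.3)]
print(accuracy(sample_days))  # 0.5
```

Note that under this rule a 70 percent forecast followed by two-hundredths of an inch counts as a miss, since that little rain does not meet the study's definition.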

[Graph: precipitation prediction accuracy by station, one to seven days out]

The graph above shows that stations get their precipitation predictions correct about 85 percent of the time one day out and decline to about 73 percent seven days out.

On the surface, that would not seem too bad. But consider that if a meteorologist always predicted that it would never rain, they would be right 86.3 percent of the time. So if a viewer was looking for more certainty than just assuming it will not rain, a successful meteorologist would have to be better than 86.3 percent. Three of the forecasters were about 87 percent at one day out — a hair over the threshold for success.
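The "always predict no rain" yardstick is easy to compute for any stretch of days: it is simply the share of days that stayed dry. Here is a minimal sketch, with invented function names and a made-up 20-day record rather than the study's own data.

```python
def always_dry_accuracy(rained_flags):
    """Share of days on which a blanket 'no rain' prediction is correct."""
    return rained_flags.count(False) / len(rained_flags)

def beats_baseline(forecaster_accuracy, rained_flags):
    """True only if the forecaster does better than always saying 'dry'."""
    return forecaster_accuracy > always_dry_accuracy(rained_flags)

# Hypothetical record: rain on 3 of 20 days
rained = [True] * 3 + [False] * 17

print(always_dry_accuracy(rained))   # 0.85
print(beats_baseline(0.87, rained))  # True
print(beats_baseline(0.73, rained))  # False
```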

Other than that, no forecaster is ever better than just assuming it won’t rain. If you think that’s bad, sadly it gets worse:

The data for the precipitation accuracy graph was taken from all days of the study. For many of those summer days it was obvious there would be no rain, and thus those days were no challenge for the meteorologists. A better measure of a forecaster’s skill would be to exclude the days when there was clearly no chance of rain. After all, if you wanted to measure a golfer’s putting skill, you would not have him putt his test putts from only six inches away from the cup. You would want to challenge him with putts from five to fifteen feet, putts that could readily be made or missed.

For that type of meteorologist test, we included only the days on which it either rained or the meteorologist predicted it would rain, thus eliminating the days where it clearly was not going to rain. The following graph shows the results.

[Graph: precipitation prediction accuracy by station on days when it rained or rain was predicted]

Because conditions for rain on these days were more likely and more challenging to predict, we lowered our benchmark for success on this test from 86.3 percent to 50 percent. Sadly, four of the five stations topped the 50 percent goal only on their next-day forecast.

For all days beyond the next day out, viewers would be better off flipping a coin to predict rainfall than trusting the stations on days where rain was possible. Oddly, N.O.A.A. — which had been one of the better forecasters in our other evaluations — was the worst in this one, especially when predicting three days out and beyond.
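The "challenging days" filter can be sketched as follows. It keeps only the days when it rained or rain was predicted, using the same 50 percent and one-tenth-inch cutoffs described earlier; the function name and sample days are invented for illustration and are not from the study.

```python
def challenging_day_accuracy(days, pop_cutoff=50, rain_threshold=0.1):
    """days: list of (pop_percent, rainfall_inches).

    Keeps only days when it rained or rain was predicted, then returns
    the fraction of those days on which the prediction was correct.
    """
    hits = total = 0
    for pop, rainfall in days:
        predicted = pop >= pop_cutoff
        rained = rainfall >= rain_threshold
        if not (predicted or rained):
            continue  # clearly dry day: excluded from the test
        total += 1
        hits += predicted == rained
    return hits / total if total else None

# Hypothetical days, NOT data from the study
sample_days = [(10, 0.0), (60, 0.5), (70, 0.0), (20, 0.4), (80, 1.2), (90, 0.3)]
print(challenging_day_accuracy(sample_days))  # 0.6
```

On these days a forecaster scores a hit only when rain was both predicted and delivered, which is why the benchmark drops so far below the all-days figure.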

When N.O.A.A. meteorologist Noelle Runyan was questioned about this, she stated, “Our forecasts are more conservative than the television stations. We raise our P.O.P. predictions to over 50 percent only when we are sure of rain.” This statement and the data above are another illustration of how — with the data and tools given to them — today’s meteorologists cannot confidently predict the weather beyond three days out.

Second Fiddles

Have you ever wondered if the forecast you get from the weekend meteorologist or vacation replacement is as good as the one from the chief meteorologist? Many people do, so starting on July 5th, we compared the accuracy of each station’s chief meteorologist to that of their weekend replacement.

Because this comparison did not begin until July 5th, the numbers shown in the table below may not match the numbers published for station-to-station comparison in other parts of this report. In each pair of lines below, the top name is a station’s chief meteorologist and the second is their backup. Here is how the individual meteorologists fared.

[Table: accuracy of each station’s chief meteorologist and weekend replacement]

At Channel 4, Mike Thompson’s weekend man is Joe Lauria. From the table above, we can see that Lauria is actually much better than Thompson in temperature accuracy: about 0.5 to 2.5 degrees better across the seven-day range. Regarding precipitation, Thompson is slightly better than Lauria one or two days out, but Lauria is more accurate three to seven days out and on the challenging days.

At Channel 5, Katie Horner’s weekend replacement is Devon Lucie. As with Channel 4, it appears Channel 5’s weekend forecasts are more accurate for both temperature and precipitation, but only slightly.

At Channel 9, Pete Grigsby is the weekend man for Bryan Busby. Here, Busby is better at precipitation and at one to three days out on temperature. Grigsby is better four to seven days out on temperature.

Channel 41’s weekend weatherman is Jeremy Nelson, who fills in for chief meteorologist Gary Lezak. When it comes to temperature, Nelson is not as good as Lezak one or two days out, but is better than Lezak at longer range. For precipitation, the two are pretty even.

The New and Improved Weather

Back in the 1990s, in an episode of the television show L.A. Law, a nerdy but effective meteorologist sued his former employer because they fired him and hired a comedian to do the weather. While none of Kansas City’s meteorologists are uneducated stand-up comics, there does seem to be an unfortunate emphasis on style over substance.

When station managers were asked about this, one said, “There’s not an evaluation of accuracy in hiring meteorologists. Presentation takes precedence over accuracy.” And when discussing accuracy (or the lack thereof) of a seven-day forecast, another station manager stated, “All viewers care about is the next day. Accuracy is not a big deal to viewers.”

When weather events occur that really are news — flooding, tornadoes, ice storms — all of the Kansas City meteorologists do an excellent job of informing their viewers, as do most forecasters across the country. Likewise, the stations allow their meteorologists ample time to report these serious weather events, be it in their 5, 6, or 10 p.m. telecasts, or by interrupting regular programming when necessary.

One of the two major weaknesses in television meteorology today is the “non-event” days — the boring, run-of-the-mill days when no significant weather events are upcoming. It is unfortunate that 13 percent of each news telecast (actually about 20 percent if you discount the commercials) is dedicated to a weather forecast that is mostly time-consuming fluff.

The meat of such forecasts could easily be condensed to one minute or less, or maybe even a crawl at the bottom of the screen that runs for the full telecast. Reduction of the weather segment on days when there is no weather news would allow for more thorough reporting of world, national, and local news.

The other major weakness is that ratings drive television. Sadly, the data show that stations are so consumed with ratings that accuracy in weather predictions takes an irrelevant back seat to snappy patter and charm. When directly asked if accuracy mattered in forecasting, every station manager and meteorologist said it did. But when asked what steps they had taken to measure and ensure accuracy, they were without answers.

No meteorologist or television station kept records of what they predicted, nor compared their predictions to actual results over a long term. No meteorologist posts their accuracy statistics on their résumé. No station managers use accuracy statistics in the hiring or evaluation of their meteorologists.

Instead, the focus is on charm, charisma, and presentation. Their words say they care about accuracy, but their actions say they do not. Yet, they wish to continue providing inaccurate seven-day forecasts that are no more than a semi-educated shot in the dark because a) their competitors do and b) they can get away with it since they think the public does not know how inaccurate they are.

Until the public demands change in the form of lost ratings from this hollow practice of “placebo forecasting,” T.V. weather forecasts will continue to blow smoke up our … upper-level-lows.

Until this change comes to pass, we must take what we see on T.V. with a grain (or perhaps block) of salt. And if you really want to know what weather will occur in Kansas City tomorrow, find out what happened in Denver today.


Devon

Fantastic post. From the production / showmanship side of the weather - which like every other tv show / segment is all about ratings - one can't help but notice the proliferation of Alerts and Warnings and Storm Tracking that keep us glued to the sets for up-to-the-minute information.

It would be very interesting to apply the approach from this post to tracking storms and the overall severity - wind speed, snowfall, rainfall, etc.

john

Someone took a similar look at Internet forecasts a few years ago.
http://www.omninerd.com/articles/Internet_Weather_Forecast_Accuracy

SVD

Excellent work! I and a few other people I know have resorted to using satellite images to make our own forecasts about rain, and we follow some rules of thumb about possible highs and lows of temperature. Weather forecasters, bah!

Ed Roberts

An interesting study. I DO want to make one note regarding the forecasts issued by the National Weather Service. As you had noted in your precipitation study, the weather service is deliberately conservative with their forecasts. This is especially true with changing their forecasts. It's actually a mandate of the weather service in a wide variety of forecast products.

RUBBA

I also enjoyed the guest post. Nice job.
Here in LI, NY you have to love the local 24 hour news and weather channel. They predict the temperature in a range. "Today's High will be 61-68 degrees." Every day in the summer we hear "chance of an afternoon thunderstorm". Gee, thanks!

justanobserver

Your report missed an extremely important element that explains why all of the forecasters are uniformly poor predictors:

They all merely regurgitate the same data they get elsewhere. Garbage in, garbage out.

Do you really think that a TV meteorologist has the wherewithal to tell us what the temperature will be 7 days from now? Please! They're spending their entire news budget chasing Britney Spears.

They all get this data from elsewhere: the US government. As we all know, the US government does a lot of things merely acceptably, and never does anything very well.

That's why they're all not only wrong, but uniformly wrong. They purchase their data from the same poor source.

Steerpike

An interesting further analysis would be to look at the performance of 'forecasters' in other spheres of activity (including, as many have noted, economic)

Also, I'm not sure that the "prediction variance" graph tells us anything we shouldn't already know. Surely it is good practice to constantly refresh your forecast based on emerging data? Therefore you would expect a forecast to change over time.

There's no point sticking doggedly with your initial data.

Still, interesting article.

Gil

I agree with the weather/climate analogy to micro/macroeconomic predictions. Economic Doom 'n Gloom always seems to be forecast 'in the not-too-distant-future' yet it never arrives (or at least is a helluva lot less than what was predicted).

Dilip Andrade

We should probably cut the forecasters a bit of slack. The forecast isn't specific to one particular spot.

If it rained in KC, but not at the airport, it wouldn't be counted as rain in the stats, but would be by anyone who got caught in the rain.

Similarly, temperatures are measured at an airport, which is usually a big open field. Temperatures in the city itself are often hotter if you are in a major city that has a concrete effect, or cooler if the airport is somewhat remote from a body of water that the city sits on.

Hermel

You say
> For all days beyond the next day out, viewers
> would be better off flipping a coin to predict
> rainfall than trusting the stations on days
> where rain was possible.
I guess the hard part here is to know in advance on which days rain is 'possible'.

It would be really interesting to compare the performance of the stations to a static forecast that always predicts the same weather as it is today. Most of the time, the weather doesn't change that much from one day to the next, so I guess constantly predicting 'the weather stays as it is' could be quite accurate. That way, you would get a good indicator of how much the forecasts are really worth.
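A static forecast of this kind could be scored with something like the sketch below, which predicts rain or no rain for each day by copying the day before. The function name and the sequence of days are hypothetical, purely to show the idea.

```python
def persistence_accuracy(rained_flags):
    """Predict each day's rain/no-rain to match the previous day."""
    pairs = list(zip(rained_flags, rained_flags[1:]))
    return sum(today == tomorrow for today, tomorrow in pairs) / len(pairs)

# Hypothetical sequence of daily rain observations, NOT from the study
observed = [False, False, True, True, False, False, False, True, False]
print(persistence_accuracy(observed))  # 0.5
```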

mannyv

"It is much easier to predict macro trends than it is to predict specifics in one city on one day."

Macro trends are made up of micro trends. What you're saying is "if my time horizon is long enough, I can't be proven wrong."

In the long run, everybody dies. While a true statement, it's not very useful in predicting the future.

neal

The notion that weather forecasts are limited to a 3-day window is neither new nor particularly astonishing. The ability of even a moderate-scale atmospheric model (representing, say, weather for the coterminous states) is limited by the scope of its input, output and boundary conditions (effects at the edge of your region of interest).

I think a more useful extension of this work would be to consider the forecasting ability in regions of dramatically different climatological patterns. Comparing the forecasting abilities of TV meteorologists in Kansas City to those in say, Bismarck or Pittsburgh is not useful because the general patterns of weather (that is, climate) are consistent. But if you were to compare forecast accuracy from those locations with Anchorage, Miami, Denver, Phoenix, and Honolulu, you might see a substantially different result, primarily due to the difference in the dominant atmospheric patterns that drive weather.


bill

Couldn't we make similar comments about economic forecasting?

andy

emj,

Thanks for looking it over. If Eggleston really is plotting variances in graph 2, then all of the forecasters are actually doing a pretty good job. For example, Channel 4, the worst forecaster of the temperature seven days out, has a variance of 8.95. Call it 9, and the standard deviation, being the square root of the variance, is only 3 degrees. That's not bad.
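The arithmetic is quick to check, assuming the 8.95 figure really is a variance in the statistical sense:

```python
import math

# Channel 4's seven-day figure, read as a statistical variance (an assumption)
variance = 8.95
std_dev = math.sqrt(variance)

print(round(std_dev, 2))  # 2.99
```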

Jim Birch

I never realised how weird things are in the US. Is there some kind of non-compete law operating? Weather prediction is a highly technical activity with significant economic impacts. Here in Australia we have a government organisation - the Bureau of Meteorology - that does numerical weather prediction and produces forecasts that are used by everyone. Local Bureau offices will produce local forecasts incorporating local knowledge and some private outfits will produce custom forecasts for events or activities.

Media organisations don't generally hire meteorologists - they hire presenters. The presenters may attend a short course so they can talk intelligibly about the Bureau's forecasts but they don't presume to know more than the boffins so they don't make stuff up themselves. If they did, and were wrong when the Bureau was right, they could expect trouble.

The quality of forecasts is generally pretty good for a few days but can fall off beyond this. This is in part due to data quality, the limitations of numeric models, and interpretation, but it's ultimately a function of the nonlinear nature of the atmosphere. Small variations produce big effects over time.

It's possible to not only make predictions but to assess the quality of the predictions. This can be done historically, say for a given location and season, or by making an actual assessment on the basis of what's going on: sometimes small unstable systems will be interacting, making even short-term predictions unreliable; other times big slow-moving systems can result in good predictions out to ten days or so.

A media personality at a TV station might beat the Bureau on delivery, but they haven't got a chance against the physics, statistics, and resources of the big boys on content.


liberalarts

Local TV does spend too much time on the weather. Can you imagine how painful a whole channel devoted only to weather would be to watch? Never mind.

J. Eggleston

Thanks for all of the positive comments on the article. A lot of work went into it, and I'm glad so many now get to enjoy it.

To Jeremy #10, Carr #13 & Wes #23: Get four DVRs and an Excel spreadsheet and do it. No one else does, and someone should. There is an outfit called WeatheRate that rates stations across the country, but they publish only the name of the "winner" in each market - no data, no graphs, and no explanation of the formula they use to score the stations.

To Andrew #16: I noticed that using the day before was not a good forecasting method, but using the previous time zone was good. What's in San Fran on Monday is in Denver on Tuesday and in K.C. on Wednesday.

To Steve #28 & Sal #44: Bingo!

To Andy #29 & EMJ #31: Sorry about using the word 'variance'. I'm not a math or statistics major, so I didn't realize 'variance' had a special meaning. I just wanted to show that forecasters should only forecast for the time periods over which they can be accurate. Offering a 7-day forecast is as valid as a 123-day forecast. They can't do it, and they know, but they do it anyway because they don't get caught, and if they do there's no punishment.

To Dustin #30 & Skip #37: One of the five meteorologists we interviewed agreed with Dustin, the other four with Skip.

To Mike #3 & Joe #17: I previously considered your idea, but didn't pursue it...until now. I took all five stations and lumped them together (they were all pretty similar), and here's what I found.

Precip Prediction    Actually Rained
0%                   7.9% of the time
10%                  5.3%
20%                  10.8%
30%                  19.2%
40%                  26.5%
50%                  27.8%
60%                  46.2%
70%                  58.0%
80%                  58.1%
90%                  63.6%
100%                 66.7%
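For anyone who wants to play with these numbers, here is a small sketch that measures the gap between what was predicted and what happened at each level. The variable names are just for illustration; the figures are the ones in the table above.

```python
# (stated probability %, how often it actually rained %) from the table above
calibration = [
    (0, 7.9), (10, 5.3), (20, 10.8), (30, 19.2), (40, 26.5), (50, 27.8),
    (60, 46.2), (70, 58.0), (80, 58.1), (90, 63.6), (100, 66.7),
]

# Positive gap: the stated chance was higher than the rain that showed up
gaps = {stated: round(stated - observed, 1) for stated, observed in calibration}

# Prediction levels at which rain was over-forecast
overforecast = [stated for stated, gap in gaps.items() if gap > 0]

print(gaps[50])      # 22.2
print(gaps[100])     # 33.3
print(overforecast)  # [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
```

In other words, at every level except 0%, it rained less often than the stated percentage.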

To Cris #1: Uh, sorry, no. :-)

Thanks again. Good health and happiness to all.


bob mcarthur

This article is very timely given the passing away last week of Edward Lorenz, the meteorologist who discovered chaos theory while studying the predictability of equations that describe fluid flow. Lorenz estimated - in 1961, based on a very simple approximation of the equations that govern motion in the atmosphere - that three days was the inherent limit to accurate weather forecasts, based on the sensitivity of weather systems to small changes in the atmosphere. (This is illustrated nicely by the ensemble forecasts described in comment #20.)

I have always wondered why NOAA even bothers to publish 7-day forecasts. It's not as if they never heard of Edward Lorenz.

In a prior lifetime as a graduate student at MIT, Dr. Lorenz was my thesis advisor. He never failed to be both amused and saddened by "long-range" weather forecasts.

sikantis

I esteem the meteorologists; their job isn't easy, and they do it well. With esteem everything goes easier. See my post about meteorologists: www.sikantis.net: Meteorologists get less esteem?

Smadaf

"TV" seems more sensible than "T.V.": "television" is not two words.