As many people will have read, there was a glitch in the surface temperature record reporting for October. For many Russian stations (and some others), September temperatures were apparently copied over into October, giving an erroneous positive anomaly. The error appears to have been made somewhere between the reporting by the National Weather Services and NOAA’s collation of the GHCN database. GISS, which produces one of the more visible analyses of this raw data, processed the input data as normal and ended up with an October anomaly that was too high. That analysis has now been pulled (in under 24 hours) while they await a correction of input data from NOAA (Update: now (partially) completed).
There were 90 stations for which October numbers equalled September numbers in the corrupted GHCN file for 2008 (out of 908). This compares with an average of about 16 stations each year over the last decade (some earlier years have bigger counts, but none as big as this month, and they are much smaller as a percentage of stations). These other cases seem to be mostly legitimate tropical stations where there isn’t much of a seasonal cycle. That makes it a little tricky to automatically scan for this problem, but putting in a check on the total number or percentage of repeats is probably sensible going forward.
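To make that concrete, here is a minimal sketch (in Python) of the kind of check meant here, assuming the station data have already been parsed into a simple table of monthly values – the real GHCN v2.mean format needs more handling, and the threshold is purely illustrative:

```python
# Sketch of a sanity check for month-over-month duplication across stations.
# Assumes `monthly` maps station IDs to a list of 12 monthly mean temperatures
# for the current year (None where missing). This illustrates the logic only;
# it is not the GHCN/GISTEMP code.

def repeated_month_fraction(monthly, month_index):
    """Fraction of reporting stations whose value for `month_index`
    (1-11, i.e. Feb-Dec) exactly equals the previous month's value."""
    repeats = total = 0
    for station_id, values in monthly.items():
        prev, cur = values[month_index - 1], values[month_index]
        if prev is None or cur is None:
            continue
        total += 1
        if cur == prev:
            repeats += 1
    return repeats / total if total else 0.0

def flag_possible_carryover(monthly, month_index, threshold=0.05):
    """Warn if an implausibly large share of stations repeat last month.
    Roughly 16 of 908 stations repeat legitimately in a typical year (mostly
    tropical sites); 90 of 908 did in the corrupted October 2008 file."""
    frac = repeated_month_fraction(monthly, month_index)
    if frac > threshold:
        print(f"WARNING: {frac:.1%} of stations repeat the previous month - "
              "check the input file for copied-over data.")
    return frac
```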
It’s clearly true that the more eyes there are looking, the faster errors get noticed and fixed. The cottage industry that has sprung up to examine the daily sea ice numbers or the monthly analyses of surface and satellite temperatures has certainly increased the number of eyes, and that is generally for the good. Whether it’s the discovery of an odd shift in the annual cycle in the UAH MSU-LT data, or this flub in the GHCN data, or the USHCN/GHCN merge issue last year, the extra attention has led to improvements in many products. Nothing of any consequence has changed in terms of our understanding of climate change, but a few more i’s have been dotted and t’s crossed.
But unlike in other fields of citizen science (astronomy or phenology spring to mind), the motivation for the temperature observers is heavily weighted towards wanting to find something wrong. As we discussed last year, there is a strong yearning among some to wake up tomorrow and find that the globe hasn’t been warming, that the sea ice hasn’t melted, that the glaciers have not receded and that indeed, CO2 is not a greenhouse gas. Thus when mistakes occur (and with science being a human endeavour, they always will), the exuberance of the response can be breathtaking – and quite telling.
A few examples from the comments at Watts’ blog will suffice to give you a flavour of the conspiratorial thinking: “I believe they had two sets of data: One would be released if Republicans won, and another if Democrats won.”, “could this be a sneaky way to set up the BO presidency with an urgent need to regulate CO2?”, “There are a great many of us who will under no circumstance allow the oppression of government rule to pervade over our freedom—-PERIOD!!!!!!” (exclamation marks reduced enormously), “these people are blinded by their own bias”, “this sort of scientific fraud”, “Climate science on the warmer side has degenerated to competitive lying”, etc… (To be fair, there were people who made sensible comments as well).
The amount of simply made up stuff is also impressive – the GISS press release declaring October the ‘warmest ever’? Imaginary (GISS only puts out press releases on the temperature analysis at the end of the year). The headlines trumpeting this result? Non-existent. One clearly sees the relief that finally the grand conspiracy has been rumbled, that the mainstream media will get its comeuppance, and that surely now, the powers that be will listen to those voices that had been crying in the wilderness.
Alas! None of this will come to pass. In this case, someone’s programming error will be fixed and nothing will change except for the reporting of a single month’s anomaly. No heads will roll, no congressional investigations will be launched, no politicians (with one possible exception) will take note. This will undoubtedly be disappointing to many, but they should comfort themselves with the thought that the chances of this error happening again have now been diminished. Which is good, right?
In contrast to this molehill, there is an excellent story about how the scientific community really deals with serious mismatches between theory, models and data. That piece concerns the ‘ocean cooling’ story that was all the rage a year or two ago. An initial analysis of a new data source (the Argo float network) had revealed a dramatic short-term cooling of the oceans over only 3 years. The problem was that this didn’t match the sea level data, nor theoretical expectations. Nonetheless, the paper was published (somewhat undermining claims that the peer-review system is irretrievably biased) to great acclaim in sections of the blogosphere, and to more muted puzzlement elsewhere. With the community’s attention focused on this issue, it wasn’t long, however, before problems turned up not only in the Argo floats themselves, but also in some of the other measurement devices – particularly XBTs. It took a couple of years for these things to fully work themselves out, but the most recent analyses show far fewer of the artifacts that had plagued the ocean heat content analyses in the past. A classic example, in fact, of science moving forward on the back of apparent mismatches. Unfortunately, the resolution ended up favoring the models over the initial data reports, and so the whole story is horribly disappointing to some.
Which brings me to my last point, the role of models. It is clear that many of the temperature watchers are doing so in order to show that the IPCC-class models are wrong in their projections. However, the direct approach of downloading those models, running them and looking for flaws is clearly either too onerous or too boring. Even downloading the output (from here or here) is eschewed in favour of firing off Freedom of Information Act requests for data already publicly available – very odd. For another example, despite a few comments about the lack of sufficient comments in the GISS ModelE code (a complaint I also often make), I am unaware of anyone actually independently finding any errors in the publicly available Feb 2004 version (and I know there are a few). Instead, the anti-model crowd focuses on the minor issues that crop up every now and again in real-time data processing, hoping that, by proxy, they’ll find a problem with the models.
I say good luck to them. They’ll need it.
Alastair McDonald says
Re the response to #20 where Gavin wrote
Wake up, Gavin! This is important. Where have I gone wrong?
If you have no answer, perhaps others might like to comment.
Cheers, Alastair.
Martin Audley says
Gavin – Your response to #24: No deal – and not a very constructive answer to start with “Rubbish”.
You are setting up a straw man by implying that the only alternative to accepting errors would be to wait for human checking.
You will probably be aware that figures are already (and logically) rejected for days where the recorded mid temperature falls outside the min and max temperatures (A not-uncommon phenomenon at sites where the figures are manually recorded and accidentally transposed).
It’s entirely possible, and reasonable, to ask for more automated error catching during the raw data processing, without requiring a wait for human scrutiny.
Such error catching *could* involve automatically rejecting the data if a “large” (tba) proportion of a source’s figures were missing or identical to the previous set. Other obvious checks could include rejection if all values for a month were zero, or outside a “reasonable” (tba) range.
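Something along these lines, say – a rough sketch only, with illustrative field names and limits rather than the actual GHCN/GISTEMP checks:

```python
# Rough sketch of per-record and per-month screening of the kind suggested
# above. Field names and limits are illustrative assumptions, not the checks
# actually used by NOAA or GISTEMP.

def screen_record(rec, t_lo=-90.0, t_hi=60.0):
    """Return reasons to reject a single daily record with tmin/tmean/tmax in C."""
    problems = []
    if not (rec["tmin"] <= rec["tmean"] <= rec["tmax"]):
        problems.append("mean outside min/max (possible transposition)")
    if not (t_lo <= rec["tmin"] <= rec["tmax"] <= t_hi):
        problems.append("value outside plausible range")
    return problems

def screen_station_month(records, max_missing_frac=0.3):
    """Reject a station-month if too many days are missing, or if every
    reported value is identical (e.g. all zeros or a copied-over block)."""
    present = [r for r in records if r.get("tmean") is not None]
    if len(present) < len(records) * (1 - max_missing_frac):
        return "too many missing days"
    if len({r["tmean"] for r in present}) == 1:
        return "suspiciously constant values"
    return None
```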
You appear to be resistant to a suggestion for technical improvement.
[Response: Huh? Did you even read the top post? I suggest exactly that. – gavin]
BrianMcL says
The NASA GISTEMP website is offline at the moment.
Perhaps a molehill fell on it.
http://data.giss.nasa.gov/gistemp/
Hopefully it’ll be back online once they’ve cleared the debris away.
Ray Ladbury says
You know, Gavin, the denialosphere is having so much fun with this that maybe you should introduce errors as a regular feature so they can thump their chest and bleat about how “Mavericky” they are. Those of us who understand science could shake our heads and wonder as they reveal their complete lack of perspective about what really matters in science.
[Response: Maybe that was the plan all along? ;) – gavin]
Ellis says
So if the exact same error had occurred between April and May, nobody would have double-checked the results before publishing? Please. You take issue with a commenter at WUWT saying you are blinded by your own bias, as if there is no bias at all in climate science. I personally don’t believe that you, Gavin, are blinded by anything, but I wonder how often you take a personal look at those biases. The funny thing about this is that just this January Had/CRU changed their endpoint algorithm because it gave the appearance of being too cold. And yet the previous January, when the same algorithm gave the appearance of being too warm, nobody noticed. The fact is that climate scientists are very quick to correct any data that does not fit the theory, or at the very least to check whether the data is faulty, but are blissfully unaware of any faulty data that coincides with the theory. Fortunately there are a few jesters out there keeping you on your toes. The molehill is the faulty Russian data that would most likely have been corrected for the year-end summary; the mountain is the inability to scrutinize the data without regard to any theory.
Ike Solem says
If one wanted an accurate data record, and found any errors or omissions, then one would push for more data collection. That includes ocean heat content measurements, tropical and Southern hemisphere surface temperature and vertical profile measurements, etc.
The fact that skeptics attack data but never call for increased data collection means that they are simply trying to inject doubt into the discussion – accurate and comprehensive datasets are the last thing they want.
The linked story (earthobservatory.nasa.gov), “Ocean Cooling”, is excellent, but could use an addendum on this topic:
That was the purpose of Triana, the Deep Space Climate Observatory project, which would have given a continuous, non-stitched record of the total energy budget at the top of the atmosphere from the L1 Lagrange vantage point. This would also have allowed more accurate estimates of global ocean heat uptake.
Under steady-state conditions, the radiation output at the top of the atmosphere perfectly balances the amount of sunlight being absorbed by the planet. If the planet is warming, the radiation emitted from the top of the atmosphere will be less than the sunlight absorbed – which is the case, according to stitched satellite measurements. (Alastair, in #51, always tries to claim that “the physics is wrong” based on the same discredited argument – it’s been dealt with many times).
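For concreteness, a back-of-the-envelope version of that balance (round, illustrative numbers – not measurements):

```python
# Back-of-envelope version of the steady-state balance described above:
# absorbed sunlight = emitted longwave at the top of the atmosphere.
# All numbers are round illustrative values.

SIGMA = 5.670e-8   # Stefan-Boltzmann constant, W m^-2 K^-4
S0 = 1361.0        # solar constant, W m^-2
ALBEDO = 0.3       # planetary albedo

absorbed = S0 * (1 - ALBEDO) / 4.0    # ~238 W m^-2, averaged over the sphere
t_eff = (absorbed / SIGMA) ** 0.25    # effective emission temperature, ~255 K
print(f"absorbed solar ~ {absorbed:.0f} W/m2, effective emission T ~ {t_eff:.0f} K")

# If the planet is taking up heat (warming oceans), outgoing longwave is
# temporarily smaller than the absorbed solar flux by that imbalance:
imbalance = 0.8                       # W m^-2, illustrative
print(f"OLR under a {imbalance} W/m2 imbalance ~ {absorbed - imbalance:.0f} W/m2")
```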
Most of this is just an effort to create doubt – first, there were claims that the models were being tuned to fit the data (which was not true). Now, there are claims that the data is being tuned to fit the models – which is also not true. Nowhere do you see these skeptics calling for more comprehensive data collection, which should indicate what their true agenda is.
Hank Roberts says
Alastair, cite, show the math, and publish. It’s the only way.
—-
For those still aghast at the error that led to this thread, you might consider that you’re illustrating the sense behind Mark Twain’s warning about the risks of reading medical books — that you may die of a typographical error. If you base your life decisions on first-run data, you make the same mistake wattsname is making about weather instrument data sets — they will be wrong sometimes.
Expertise furthers one’s ability to deal with dirty data in one’s own field — and helps one learn the general lesson that all data is dirty.
Sorry.
——
ReCaptcha: Chris- coolly
(the oracle appears to have begun making friends!)
pete best says
Re #50, it’s funny how denialists always use the term “the Church” of AGW, as if it has a religious bent of some kind and environmentalists have hijacked AGW and write about it in terms of biblical effects of rising sea levels and massive droughts and floods (40 days and nights, I suppose), in order to ridicule it in the media and try to paint people as slightly mad and not in the real world.
This (in my mind) was the reason why RealClimate was founded: so that articles can be posted by real peer-reviewed and IPCC-involved scientists who know about the reality of earth science, and hence climate science, in order to get at the truth about the subject.
If we look at the index of this site we can see many scientific articles posted by peer-reviewed scientists in the field of climate change. If skeptics and denialists are trying to suggest that RealClimate is in some way aligned with the alarmists then they are very much mistaken. RealClimate is a scientific site that argues the science, and it pisses me off (sorry, but it’s winding me up) that both the skeptics and the alarmists seem to want to paint RealClimate into one of the camps (which is silly, and shows how little the media understand the scientific rationale and its process) in order to give their arguments more credence – but both of them are wrong. Only the science is right.
For some reason I imagine it is probably the same skeptics who denied that smoking caused lung cancer and other nasty illnesses, or that AIDS was transmitted by the HIV virus, as they were employed and sponsored by vested-interest groups and companies to protect the scope and future of their profits.
Anyway, the tide is turning since Obama recently won the election in the USA, and today the IEA has sided with the IPCC and wants alternative sustainable energy technologies to become viable and be pursued in order to stop the 6C of future climate change (James Hansen’s earth-system sensitivity, I am presuming, and not the Charney 3C of warming).
YIPPEEEEE!!!!!!
Ashby Lynch says
In my opinion, $500,000 would be well worth it, if this database could be made more reliable and understandable. If the nation takes action to impose carbon taxes of some sort, the costs will be in the billions. Nothing is more important in the AGW issue than the magnitude of warming. Almost everyone agrees that the anthropogenic greenhouse gases contribute some degree of warming to the earth; the burning question is “how much?”. The ultimate test of the climate models is against the temperature record. Validation and correction of those models depends on little else. If that record is incorrect, very important and costly decisions may be incorrect.
It cannot be stressed too much that accurate empirical data is vital to assessing and addressing the problem appropriately.
I would be glad to spearhead an effort to turn this database into a freestanding, multimillion-dollar unit, unrelated to and independent of any modeling efforts, or any efforts to influence policy. What steps must be taken to form this unit?
B.D. says
You would have been far better off simply saying that you found an error, are working to correct it, and the new data will be posted at that point. End of story. Instead, while complaining about much ado about nothing, you actually make much ado by: 1) incorrectly placing the blame on NOAA instead of your own processing algorithm. You can accurately model the atmosphere into the future but you can’t detect that a +13C anomaly might be a red flag? Once you take a product and use it to produce a new product, YOU take responsibility for that new product. And 2) ludicrously blaming “temperature observers” for heavily wanting to find something wrong. There are nutcases on both sides of the AGW issue, but legitimate temperature observers want the data to be correct, whether it is pro- or anti-AGW. Maybe the “auditors” made your molehill into Mt. Washington, but your blame-deflection game elevated it to Mt. Everest.
[Response: I’m finding this continued tone of mock outrage a little tiresome. The errors are in the file ftp://ftp.ncdc.noaa.gov/pub/data/ghcn/v2/v2.mean.Z, not in the GISTEMP code (and by the way, the GISTEMP effort has nothing to do with me personally). The processing algorithm worked fine. Multi-degree anomalies are not that unusual (checks are made for unphysical outliers which wasn’t the case here). I daresay they’ll put in more checks now to deal with this specific issue, but you can’t check for all possible kinds of corrupted data in the input files ahead of time. Science works because people check things – against expectations, against alternate records, against other sources of data – this is what occurred here. – gavin]
wayne davidson says
43 ccpo. Well yes, some heat from ice, but there was more open water last year and October just past was warmer than last year. September was as warm as 2007. I suspect more complexities, namely extensive cloud cover and feedbacks generated by heat reflected from above and gained from freezing water below, but the clouds have played a huge role.
iheartheidicullen says
this has always rested with the russians. real or not, their semi-continent seems to be cursed by the warming. “mountains, molehills, gorillas, etc.”, you guys sure use great imagery for scientists.
Martin Audley says
Gavin – Re your response to #52 – You should be aware that in #52 I was responding to your own text in #24 in which you wrote:
“…If you don’t mind waiting years for the data, fine…”
Your comment suggested that the only alternative to my suggestion was to switch to heavily-delayed human-checked results.
I restate my position: There *is* a problem in the GISS data product, and it *can* be improved by better automated error trapping, *without* years of delay.
I’ll go further and say it *should* be improved, (even at great expense) – since decisions influencing the US and world economy are being made based upon it. That’s what stops this issue from being a molehill.
[Response: Don’t be ridiculous. No decisions are made based on one month’s data that was erroneous and available for less than 24 hours. No one died, no one lost out, nothing happened. Therefore there is no cost that could have been avoided. Automated error trapping is fine, but you need to know what to look for. I am unaware of anyone raising the issue of systematic month-to-month overwrites, and so how automatic traps are supposed to catch errors that haven’t happened before is a bit of a mystery. Existing traps look for station-specific outliers, but this was a regional effect – not a single station – so that didn’t catch it. The main point is that no automatic traps can catch every single thing – the existing ones didn’t catch this; if there is a next time, they will. But no traps are going to catch issues that have never been seen before. – gavin]
maikdev says
MSU: http://vortex.nsstc.uah.edu/data/msu/t2lt/uahncdc.lt
GISTEMP met. stations: the warming trend agrees with the MSU one.
http://data.giss.nasa.gov/gistemp/graphs/Fig.A.txt
GISTEMP land, processed: the warming trend is bigger. Why?
http://data.giss.nasa.gov/gistemp/graphs/Fig.A4.txt
Does the GISTEMP algorithm work fine?
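For anyone who wants to check, a rough sketch of how the trends could be compared, assuming each series has first been reduced to plain year/annual-anomaly columns – the actual files linked above have headers and monthly columns that need parsing, so the loader and filenames below are placeholders:

```python
# Sketch of the trend comparison asked about above. The loader assumes each
# file has already been reduced to two whitespace-separated columns
# (year, annual anomaly); filenames are placeholders for the files linked
# in the comment.
import numpy as np

def load_annual_series(path):
    data = np.loadtxt(path)
    return data[:, 0], data[:, 1]

def trend_per_decade(years, anomalies):
    slope = np.polyfit(years, anomalies, 1)[0]   # degrees per year
    return 10.0 * slope

for name, path in [("GISTEMP met. stations", "Fig.A.annual.txt"),
                   ("GISTEMP land processed", "Fig.A4.annual.txt"),
                   ("UAH MSU-LT", "uahncdc_lt.annual.txt")]:
    years, anomalies = load_annual_series(path)
    print(f"{name}: {trend_per_decade(years, anomalies):+.2f} C/decade")
```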
Patrick Henry says
In many scientific fields we have moved on from error prone manual methodologies, to more sophisticated ones – like automated satellite data. This screw-up is just one more indicator that it is time to move on.
An increasingly sparse array of ground-based thermometers, which require post-processing adjustments equal to or greater than the trend, is simply not an acceptable way to measure a small trend. There is too much subjectivity in the process.
Next time you should compare vs. satellite data before publishing.
Rob says
That analysis has now been pulled (in under 24 hours) while they await a correction of input data from NOAA. Yes only after Steve Mac informed you [edit]
They pulled the data AFTER I (Steve Mac) sent them an email notifying them of the error (which had been pointed out to me by a CA reader and which I had confirmed). They did not identify the error on their own. [edit]
[Response: You and McIntyre are mistaken. The first intimation of a problem was posted on Watts’ blog in the comments by ‘Chris’ at around 4pm EST. By 7pm in those comments John Goetz had confirmed that the NOAA file was the problem. Notifications to the GISTEMP team of a problem started arriving very shortly after that, and I personally emailed them that evening. However, no action was taken until the next morning because people had left work already. They had decided to take the analysis down before 8.14am (email time stamp to me) since the overnight update to the NOAA file (uploaded 4.30am) had not fixed the problem. McIntyre’s intervention sometime that morning is neither here nor there. Possibly he should consider that he is not the only person in the world with email, nor is he the only person that can read. The credit for first spotting this goes to the commentators on WUWT, and the first notification to GISTEMP was that evening. – gavin]
David says
You do have something in common with Mr Watts, Gavin: neither of you know how to use apostrophes.
[Response: Common ground at last! – gavin]
Ed says
NASA GISTEMP site still down as of 12:29 CST
http://data.giss.nasa.gov/gistemp/
[Response: It was nothing to do with GISTEMP. The link to GISS goes through GSFC and they had a scheduled power outage from 7am to 3pm. – gavin]
Jared says
The response from several of the RC faithful here is truly disappointing. Rather than acknowledge the fact that quality control is of the utmost importance for agencies that monitor global temperature, and that a rather obvious mistake was made here, all you guys are doing is predictably saying “It doesn’t really matter, the science is what matters”, as well as putting down skeptics.
Well, here is a newsflash: one of the fundamentals of science is accurate data collection. If you cannot acknowledge how important that is, then you truly are blind to what constitutes good science.
[Response: Of course it’s important. No ‘newsflash’ needed. – gavin]
Jerry Alexander says
Temperature problems: Hadley Atmospheric Center in London released a report last year stating that Russia and many other weather stations were making false or weak reports.
NOAA is not reporting on world temperatures. They are strictly U.S.
[Response: The collation of the GHCN file is done (I think) at NOAA, but relies on input from the national weather centers who decide what they want to upload. This is why national weather service web sites often have many more, and more up-to-date stations than are included in GHCN. As to their quality, there are a multitude of stories that have shown problems with individual stations – however, they are the responsibility of the weather services, not GISTEMP. – gavin]
Jonathan says
Ashby Lynch #54 – absolutely.
Can I just check I have got this right? GISTEMP is, if I understand correctly, one of the main temperature datasets against which GCMs are checked. On the basis of this we are proposing to revolutionise the economies of the world. The cost is hard to estimate, but in round terms we can say many billions of dollars. It would, therefore, seem to be a good idea to get this right. But the current approximate budget (from the figures in #37) is about $20,000 per year?
[Response: Pretty much. However, the costs of gathering the data, collating it and disseminating it are much larger (many millions I would imagine) and are found in the budgets of NOAA and the various national weather services. The GISTEMP product is an analysis of that data, not the originator of it. Other analyses exist which show very similar results. Maybe if that were better understood, the demands on what GISTEMP does or does not do would be a little more reasonable! – gavin]
William Astley says
In reply to Paddy’s comment (#3): “It appears that there are systemic errors in Russian data collection that results in significantly overstating the ST anomaly by several degrees C warmer for an extended period of time. The October data are counter intuitive and contradicted by UAH troposphere temperatures and Arctic ice extent increases during October. I find your explanation as implausible as Russian October ST.”
I agree with your comment. I thought the October monthly average anomaly was high based on the current ocean surface temperatures anomaly.
http://www.osdpd.noaa.gov/PSB/EPS/SST/data/anomnight.11.10.2008.gif
GCR has been increasing. Has there been any change observed in planetary cloud cover?
Andrew Thomson says
The argument that GISS doesn’t have the resources to screen this data before it is posted might have merit if this function were outsourced through a volunteer activity of the local Boy Scout troop, but considering it is an internal function of what is supposed to be one of the primary authorities in the field, this just doesn’t hold water. Call me budget challenged, but I fail to see why it would necessarily cost $500,000 to review a measly data set of 908 stations to catch glaring errors in roughly 10% of the information. The casual observation by whoever uploaded the anomaly map that all of Asia looked like a blast furnace might have been a subtle clue. I do not understand why GISS couldn’t complete this task in several minutes, as was accomplished by several third parties once the data was presented on the net. It seems to me you could save yourself $500,000 by just buying Steve McIntyre the occasional working lunch.
Sure, mistakes happen, but my view is that when your organization commits a colossal blunder it’s probably not the best approach to simultaneously point the finger at someone else, rage against those who pointed out the blunder, state that it is business as usual, and recommend that your organization receive substantially more money to prevent such blunders in the future. Instead you should probably evaluate what processing errors and cultural biases led to such an impressive gaffe.
[Response: With hindsight, it’s trivial to spot a problem you know about. It took me about 5 minutes to see the problem in the NOAA file once I knew what to look for. That is not the point – you need staff and eyes to see things that you aren’t expecting. And please dial down the rhetoric. The opening of Terminal 5 at Heathrow was a colossal blunder, the units issue with the Mars Climate Orbiter was a colossal blunder, the choice of Sarah Palin … etc. You can start using that language when you provide any evidence that this glitch affected anyone in any negative way. – gavin]
Pierre Gosselin says
Governments are forming major policy on GISS data. Policy that will profoundly alter people’s lives all over the world. PLEASE, do something to assure that the data and results are correct in the future. A little quality control and responsibility is the least one can expect. The term rigorous review has to mean something again at our prestigious institutions.
Thank you.
Pierre Gosselin says
“GISTEMP provides that analysis as is and can’t possibly certify the work of all the individual agencies whose data gets used.”
Garbage in means garbage out. If one can’t certify the quality of the data, then one ought not expect governments to form public policy based on “uncertified” results. Overall, GISS should at least show gratitude to the outsiders who spot errors, and not get all defensive about it. You’re getting quality control for free!
“If that’s what you want, only look at the data at the end of the year and ignore the monthly releases.” – gavin
If the integrity of the data and GISS results don’t improve quickly, then the yearly results may also end up getting ignored. Please take this as constructive criticism only.
FrancisT says
As I note at my blog ( http://www.di2.nu/200811/12.htm ) a number of the arguments here sound remarkably similar to those put forward by large software companies in the face of open source alternatives. I don’t think this bodes well for climate science
PS Captcha – Further Research – sounds like a message for all of us :)
[Response: Actually, I’d strongly support an open source analysis of the same data. ‘OpenTemp’ started off with that idea I think, but enthusiasm appears to have died down once it was clear that the results weren’t that different. GISS ended up doing this kind of by accident, and given the grief it gives us, I’m sure we’d be happy to pass it on to a trustworthy open source version. All the data sources are publicly downloadable and the issues involved in producing the end product have been discussed ad nauseam. Go for it! – gavin]
jcbmack says
Alastair (#51), you have overcomplicated matters. That simple. The analogy from #57 is fitting here. Also, I read Gavin’s post; he clearly states he knows of several errors, and makes suggestions for improvement; the thing is, the sceptics have not been able to find them or suggest logical ways to make improvements. Science is cold as ice; all this finger pointing and jumping up and down because the data is not perfect is nonsense. If a patient comes in with a head trauma and the CT shows no bleeding, but 3 hours later the patient’s personality grossly changes and a seizure occurs, the CT is performed again and, lo and behold, there is bleeding (acute hemorrhage) that was not there before; the patient is operated on and saved. Do we tell the doctors they are incompetent? Or if a patient with metabolic acidosis comes into the ER, sodium bicarbonate is administered immediately to treat the symptoms and save their life, but does not work; the doctors probe deeper, discover their error and resort to proper treatment… get the picture?
Kevin McKinney says
Re 59:
Ashby, the real problem with this idea is that these anomalies, even if uncorrected, will average out and have little to no influence over the long term–which is what we need to watch. (This assumes randomness, of course, but I’m not a conspiracy nut.) So the $500 G wouldn’t buy us any more real certainty about actual trends.
In short, it needs to be about climate, not weather.
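A quick synthetic illustration of that point – everything below is made up to show the arithmetic, not real data:

```python
# Corrupt one month in a synthetic 30-year monthly series and see how much
# the fitted trend moves. Entirely synthetic: a 0.2 C/decade trend plus noise.
import numpy as np

rng = np.random.default_rng(0)
months = np.arange(360)                   # 30 years of monthly anomalies
series = (0.2 / 120.0) * months + rng.normal(0, 0.15, months.size)

def decadal_trend(y):
    return np.polyfit(months, y, 1)[0] * 120.0   # C per decade

corrupted = series.copy()
corrupted[-2] += 0.6                      # one month too warm by 0.6 C

print(f"trend, clean:     {decadal_trend(series):+.3f} C/decade")
print(f"trend, corrupted: {decadal_trend(corrupted):+.3f} C/decade")
# The two trends differ by a few thousandths of a degree per decade --
# negligible next to the trend itself, which is why a single-month glitch
# matters little for climate even though it should still be caught and fixed.
```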
Dill Weed says
Patrick you sure do get around. : P
Welcome to RC!
Dill Weed
Mark says
Unfortunately, Patrick (#65) the network required for ground based systems is HUGE.
Who is willing to pay 2p in the pound for taxes to pay for that???
It’s cheaper to do this and keep the vigilance up.
PS Gerrym, #50, so what if it wasn’t 12C? What change does it make to the output? If another measurement is 10C and turns out it’s 8C, that’s still a big anomaly.
PPS Alastair, #51, the “BZZT, try again” was because this statement:
“The OLR emitted by carbon dioxide is calculated using the kinetic temperature of the atmosphere, but CO2 emits at its vibrational temperature.”
Is incorrect.
a) “vibrational temperature” is a meaningless and unscientific term, i.e. it doesn’t exist.
b) Excitation relaxation lasts far, FAR too long for it to survive the collision with another (non-CO2-excited) molecule in the air
Therefore finding something wrong based on these erroneous statements would be akin to Neo noticing the black cat walking past twice.
Someone would have had to change the Matrix…
Add that if you HAD found something wrong in all these radiative models, you wouldn’t be here talking about it. You’d be getting this paper published and garnering fame.
Mike Bryant says
“Response: I’m finding this continued tone of mock outrage…”
We are your employers, and this is not mock outrage. This is, rather, righteous indignation. You should accept responsibility, apologize and beg not to be fired.
[Response: No it isn’t. It’s politically-driven piling on. What responsibility do you think I should take for a project that I am not involved in (other than that it happens in my building)? – gavin]
Derek Smith says
I find this site and the discussions extremely useful. Gavin is right in insisting that scientists should be seeking the truth about climate change rather than supporting the pro- or anti-AGW factions. So please could we balance Gavin’s rather pro-AGW Responses with some more sceptical comments on what seem to me to be often quite serious contributions which he chooses to rubbish? If not, this site could start to look like an upper-class Greenpeace marketing emporium.
jcbmack says
Conversely, if one were to accept all data at face value with no analysis, then we would suffer the possibility of grave consequences as Twain pointed out.
Mark says
Pierre, #75. However, ***is*** this garbage? I mean, even if the result came back as 8.0, that’s wrong, because the average of the dataset was both
a) Not ***precisely*** 8.0, but may have been 8.00000032283004376. GIGO?
b) The measurements were a sampling, so the “true” answer may have been 8.4854975303005058689600376663829002481011, GIGO?
This is called “sensitivity”.
And as Kevin says in #78, what would this error do to the model outputs for climate over the next 150 years? Done the maths on that?
If not, you don’t know if this error really IS garbage or just an error.
Then again, assuming that it must be garbage is exactly what this entire topic is about. You haven’t done the maths to see if it IS garbage, but you’ve just assumed that, since a correction had to be made, it MUST be.
“Begging the question”. And in its true sense. “GIGO” begs the question “How do you know it to be garbage in?”
W F Lenihan says
If I understand your explanation for the flawed October global ST data, it is:
You get garbage from NOAA, its compiler; you feed it into the black box that you designed and built; and after processing when garbage comes out, it is not your fault.
Preposterous.
[Response: No it’s not (and note that I did not design or build this). If you can show me a piece of software made for any purpose that gives the right answer regardless of its input data, I’ll be very impressed. – gavin]
Rod B says
Mark (45), Hank (if still interested), et al, re 12 & 33: That raw data should not be released willy-nilly is still a correct idea, IMO, but Gavin (24 & 48) makes a solid case that this would be too pristine, impractical, and probably in the end, not very helpful. Every now and then s__ happens and the lumps should just be taken and problems corrected as soon as discovered rather than delay the data for weeks or months while bureaucrats scrub it — as Hank said.
There will always be someone criticizing any output from any group, so it’s a fool’s errand to try to get around all conceivable attacks. Mark should mitigate his paranoia a bit. And, BTW, if I’m looking at satellite spectrographic data and it shows variances from 0.003 to 0.007 over months and then shows a couple of months at 7.6, no, I don’t need an education in quantum mechanics to make a damn good assertion that something smells.
MrPete says
Gavin,
Do you guys have access to people who process data commercially? I’m thinking some good lessons could be learned from a meeting with folks from a place like D&B. Since they have clients who pay $$$ for their data, they can afford to invest heavily in improved data QA processes. There are quite a few similarities; they too are consumers/reporters in many cases.
What I’m thinking: such entities have refined their methods and processes to be very efficient/effective. It may be possible to arrange a mutually beneficial agreement that would take care of these kinds of issues.
(Yes, I have some experience with this.)
[Response: Sure, but remember the no budget thing. You/They should contact the GISTEMP team to see. Note that there is already an independent rewrite of the base code being volunteered by a software company. – gavin]
Jared says
#78 Kevin
I agree, Kevin, that it ultimately comes down to climate, not weather. However, climate trends are composed of lots of weather over periods of time. If weather data is not interpreted properly, then climate data has the potential of being corrupted over time as well. Obviously, if this mistake had somehow not been caught, the huge GISS October anomaly would have represented a serious divergence with the satellite sources – and it also would have altered the yearly data/trends.
Some mistakes and problems are caught, and I’m sure some aren’t, but the bottom line is that we should be doing our best to make sure our weather information is as accurate as possible – and the climate trends will follow.
steven mosher says
Gavin;
“Actually, I’d strongly support an open source analysis of the same data. ‘OpenTemp’ started off with that idea I think, but enthusiasm appears to have died down once it was clear that the results weren’t that different. GISS ended up doing this kind of by accident, and given the grief it gives us, I’m sure we’d be happy to pass it on to a trustworthy open source version. All the data sources are publicly downloadable and the issues involved in producing the end product have been discussed ad nauseam. Go for it! – gavin”
Actually, the enthusiasm did not die down because the results “were not that different”. The problem we encountered was twofold. First, we only had results for CONUS, and the issue there was we only had a handful of class 1 and class 2 sites. So the “match” achieved with GISS had no statistical relevance. Further, I had results showing a difference in trend for the worst sites (class 5) versus all others,
again with a small sample size and no statistical relevance. The small sample size raised the issue of getting more surveying done by surface stations, a volunteer-run organization with a budget of less than zero. It also raised the issue of how well “nightlights” works as a proxy for rural sites. Both of these issues remain unresolved. Second, the result achieved in CONUS has not been extended to the ROW. As people started to look at the ROW (see the Where’s Waldo posts on CA), more data problems cropped up. You guessed it: in RUSSIA. There is a reason why the CA crowd is particularly sceptical about the Russian data, once you understand the background.
WRT “all the data being publicly available”: most is.
One thing lacking is the data from intermediate steps. GISTEMP has multiple steps. At the end of each step, intermediate results files are output. In doing an open-source rewrite of GISTEMP this data would be hugely helpful; that way the new program could be verified step by step. Ordinarily we would compile and run the original GISS source and pull the intermediate results from that. But alas, no one has been able to get the code to compile fully or run. (Little things like infinite loops and EOF problems bedeviled folks every step of the way.) Even in that case we would still want to check that the intermediate results achieved with the recompiled code matched the intermediate results from the original code, particularly since folks had to make minor code changes and fiddle with compiler flags to get rid of some of the errors. I know that seems anal, but it’s a QC/QA thing.
The other issue is that some of the algorithms are still somewhat obscure. The funny thing is that after I noted one of these obscurities, my next download of the code contained a clarification of the line we had been discussing at CA for two days.
An open-source version of GISTEMP would require a bit more work than just tossing the code and data over the fence, but not that much work. It would be much better to get it done and over with. Otherwise, who knows, another “Y2K” problem could crop up and steal the news for a week, or another bad Russian data day. It’s actually in NASA’s interest to divest itself of this responsibility. Devote the 0.25 FTE to open-sourcing the product and the savings over the next 100 years will add up.
[Response: But you are not thinking like a true open source pioneer. You have the same source data that GISTEMP uses, and you’ve read the papers on the various issues that arise (splicing different versions, correcting for UHI, filling in missing data etc.); you should make a version that addresses these things independently and at the same time builds in a better level of flexibility and portability. Linus Torvalds didn’t build Linux by going back to see what BSD did every time he had a problem. Make something new and truly independent. Don’t wait for other people to spoon feed you stuff. That would be a real contribution. – gavin]
Harold K McCard says
During most of my business career, I had to sign my name on documents certifying that the data contained therein were accurate. It didn’t matter who or what source provided the data. Moreover, there were legal consequences in the event that inaccuracies were subsequently discovered. Later in my career, the disclaimer “to the best of my knowledge” was no longer acceptable. I never quite understood why some of my USG counterparts weren’t held to the same standards.
Figen Mekik says
# 81: Seriously? Please name me one line of work, one type of human endeavor, where mistakes are never made.
The only place where you can be 100% certain of something is if you simply choose to believe in it without asking for data or proof. You know, like having faith in god or something. Nothing wrong with faith if that is what you choose as your guide to understanding the truth. But accept that other than divine perfection, humans err. It’s the way we are. And there is a lot of integrity in being honest about it.
The great thing about science is that we immediately accept the errors we make as soon as we realize we made them and take action to correct them. And as you see in this post, they even openly broadcast that errors were made and are being corrected. You think people should be fired for that?
Ed says
0.25 FTE is someone devoting 2 hours a day to verifying the accuracy of this data. That sounds like enough given the number of sites (2000) and frequency of measurement (one report per month?). $500,000 for 6 people working full time 8 hours a day sounds like overkill. What am I missing? One person on the web was able to catch this glitch and probably did not have to spend 2 hours to do so.
[Response: The work on this is done over a one or two day period once every month. Add in a bit more for code maintenance, writing papers, etc. But remember that this is an analysis, not data generation. NOAA has a whole division devoted to what you are concerned about (and I’m sure there are counterparts in the other national weather centers). If you want GISTEMP to do that level of additional investigation, it takes people to do it. As you say, this was caught very quickly and with no cost. Seems a pretty good deal. – gavin]
pete best says
Hey Gavin, it seems to me that this website is so scientifically important and successful that every contrarian/skeptic is attacking it here now.
Congratulations, at least they are posting here and hence you are now able to demonstrate how silly they are being.
Ray Ladbury says
Derek Smith likes his climate science “fair and balanced.” Only one problem Derek, the basis of this site is science, which means that things need to be consistent with the evidence. Tell ya what. If you can find a reputable climate scientist with an extensive publication record that really illuminates the phenomena of climate and whose opposition is not based on either misunderstanding or outright obfuscation, we’d all love to know about it.
[Crickets chirping…]
jcbmack says
Mike Bryant, it seems you are confusing a few issues; there is a division of labor. The mathematical modeler does his part, inputting and adapting information into the computer; the scientists who make the observations and measurements provide the data to be input; the quality-control measures ensure that errors like this one are caught efficiently, and additional checks help improve quality. Also, speaking of business, error analysis and calculated errors, are you familiar with Six Sigma? Think about it: everything from a pacemaker to a tuna fish sandwich has a margin of error and a probability for errors. The science of climatology has evolved quite rapidly, and it continues to do so; just compare the last IPCC report with one from 2-4 years prior: what a massive improvement in attribution and in the evidence to support such claims!
And Jared (#88), you are oversimplifying matters. Try reading some of the articles available on this site regarding weather and climate. You are neglecting the long-term averaging done. To say that weather patterns average out into climate trends, or that weather is chaotic – well, yes; however, over time climate itself, as an average, is easier to measure and predict than the weather two weeks from now.
jcbmack says
Then there are the actual software engineers and computer programmers, and the repair teams…
jcbmack says
The denialist agenda has no leg to stand on when looking at the facts, the data, the models, the Arctic ice sheets melting, etc…
jcbmack says
Once again: read, learn, solve the equations, see how data is transduced and analyzed… we all begin as laypeople, but our own efforts are what take us into a given area of expertise. Take some courses at night. Read the RealClimate literature, subscribe to the journal Nature. Again, a good engineering community college offers calculus 1-3 (some offer a type of IV), modern physics, engineering, and advanced engineering mathematics; state universities with no prestige offer excellent courses in a wide variety of subjects that will enable one to understand errors, error analysis and the process of quality control. Read the IPCC report, all of it.
A layperson can ask a question, but how can they criticize? For every one Einstein there are a billion wannabes:)
jcbmack says
I would love to get Watts in a debate.
Ben Lankamp says
#76, Gavin: thanks for the encouragement. My Fortran 90 is mediocre at best but I got pretty far with reproducing the GISTEMP steps. So far I have found the homogenisation (step 2) to be the most complex to unravel. Some of the methodology is not entirely clear to me but I still have to check the papers on that. My set-up is slightly different from GISTEMP, mostly at the level of I/O: all input data is stored and continuously updated in a SQL database. All program source code is in comprehensible C++. An attempt is being made to publicise a work-in-progress version before the end of the year. No guarantees though; it is all done in spare time next to my forecaster job.