In reality, for the past 20 years, the climatology profession has been oblivious to the errors in AT99, and untroubled by the complete absence of specification testing in the subsequent fingerprinting literature.
These problems mean there is no basis for treating past attribution results based on the AT99 method as robust or valid.
The conclusions might by chance have been correct, or totally inaccurate;
but without correcting the methodology and applying standard tests for failures of the GM conditions, it is mere conjecture to say more than that.
Some of the errors would be obvious to anyone trained in regression analysis,
and the fact that they went unnoticed for 20 years despite the method being so heavily used
does not reflect well on climatology as an empirical discipline.
My paper is a critique of “Checking for model consistency in optimal fingerprinting” by Myles Allen and Simon Tett, which was published in Climate Dynamics in 1999 and to which I refer as AT99.
Their attribution methodology was instantly embraced and promoted by the IPCC in the 2001 Third Assessment Report (coincident with their embrace and promotion of the Mann hockey stick).
The IPCC promotion continues today: see AR6 Section 3.2.1.
It has been used in dozens and possibly hundreds of studies over the years.
Wherever you begin in the Optimal Fingerprinting literature (example), all paths lead back to AT99, often via Allen and Stott (2003).
So its errors and deficiencies matter acutely.
... The continuing influence of AT99 two decades later means these issues should be corrected.
I identify six conditions that need to be shown for the AT99 method to be valid.
The Allen and Tett paper had merit as an attempt to make operational some ideas emerging from an engineering (signal processing) paradigm for the purpose of analyzing climate data.
The errors they made come from being experts in one thing but not another,
and the review process in both climate journals and IPCC reports is notorious for not involving people with relevant statistical expertise (despite the reliance on statistical methods).
If someone trained in econometrics had refereed their paper 20 years ago the problems would have immediately been spotted,
the methodology would have been heavily modified or abandoned
and a lot of papers since then would probably never have been published (or would have, but with different conclusions—
I suspect most would have failed to report “attribution”).
Optimal Fingerprinting
AT99 made a number of contributions.
They took note of previous proposals for estimating the greenhouse “signal” in observed climate data and showed that they were equivalent to a statistical technique called Generalized Least Squares (GLS).
They then argued that, by construction, their GLS model satisfies the Gauss-Markov (GM) conditions, which according to an important theorem in statistics means it yields unbiased and efficient parameter estimates.
... Unfortunately these claims are untrue.
Their method is not a conventional GLS model.
It does not, and cannot, satisfy the GM conditions and in particular it violates an important condition for unbiasedness.
And rejection or non-rejection of the RC test tells us nothing about whether the results of an optimal fingerprinting regression are valid.
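For readers outside econometrics, the GLS equivalence at issue can be sketched numerically. With a known error covariance C, the GLS estimator is identical to OLS applied to data pre-multiplied by any matrix P satisfying P′P = C⁻¹, which is the role the weighting matrix plays in the fingerprinting setup. A minimal Python illustration with simulated data (nothing below is taken from AT99 itself; all names and numbers are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 50, 2

# Design matrix ("signals") and true coefficients
X = rng.normal(size=(n, k))
beta = np.array([1.0, 0.5])

# Known error covariance C (AR(1)-like, a stand-in for the
# climate-model-derived noise covariance in the fingerprinting setting)
rho = 0.6
C = rho ** np.abs(np.subtract.outer(np.arange(n), np.arange(n)))
L = np.linalg.cholesky(C)
y = X @ beta + L @ rng.normal(size=n)

# GLS directly: beta_hat = (X' C^-1 X)^-1 X' C^-1 y
Cinv = np.linalg.inv(C)
b_gls = np.linalg.solve(X.T @ Cinv @ X, X.T @ Cinv @ y)

# Equivalent: pre-whiten with P such that P'P = C^-1, then run OLS.
# Since C = L L', taking P = L^-1 gives P'P = C^-1.
P = np.linalg.inv(L)
b_ols, *_ = np.linalg.lstsq(P @ X, P @ y, rcond=None)

assert np.allclose(b_gls, b_ols)
```

The GM theorem says that when its conditions hold, this estimator is BLUE; the dispute is over whether the fingerprinting setup actually satisfies those conditions.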
AT99 and the IPCC
AT99 was heavily promoted in the 2001 IPCC Third Assessment Report (TAR Chapter 12, Box 12.1, Section 12.4.3 and Appendix 12.1)
and has been referenced in every IPCC Assessment Report since.
... The Gauss-Markov (GM) Theorem
As with regression methods generally, everything in this discussion centres on the GM Theorem.
... I teach the GM theorem every year in introductory econometrics.
(As an aside, that means I am aware of the ways I have oversimplified the presentation, but you can refer to the paper and its sources for the formal version).
It comes up near the beginning of an introductory course in regression analysis.
It is not an obscure or advanced concept; it is the foundation of regression modeling techniques.
Much of econometrics consists of testing for and remedying violations of the GM conditions.
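To make "testing for violations" concrete, here is a minimal Python sketch of one standard diagnostic, a Breusch-Pagan test for heteroscedasticity, run on simulated data where the homoscedasticity condition deliberately fails (all numbers are illustrative):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n = 300
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])

# Error standard deviation grows with x, violating homoscedasticity
y = 2.0 + 0.5 * x + rng.normal(scale=x)

# OLS fit and squared residuals
b, *_ = np.linalg.lstsq(X, y, rcond=None)
u2 = (y - X @ b) ** 2

# Breusch-Pagan: regress squared residuals on X; LM = n * R^2 ~ chi2(1)
g, *_ = np.linalg.lstsq(X, u2, rcond=None)
fitted = X @ g
r2 = 1 - np.sum((u2 - fitted) ** 2) / np.sum((u2 - u2.mean()) ** 2)
lm = n * r2
pval = stats.chi2.sf(lm, df=1)
print(pval < 0.05)   # the diagnostic detects the violation
```

This is exactly the kind of routine diagnostic work that the fingerprinting literature has not done.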
... The Main Error in AT99
AT99 asserted that the signal detection regression model applying the P matrix weights is homoscedastic by construction, therefore it satisfies the GM conditions, therefore its estimates are unbiased and efficient (BLUE).
Even if their model yields homoscedastic errors (which is not guaranteed), their statement is obviously incorrect: they left out the conditional independence assumption.
Neither AT99 nor, as far as I have seen, anyone in the climate detection field has ever mentioned the conditional independence assumption, discussed how to test it, or considered the consequences should it fail.
And fail it does—routinely in regression modeling; and when it fails the results can be spectacularly wrong, including wrong signs and meaningless magnitudes.
But you won’t know that unless you test for specific violations.
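A toy simulation shows how a failure of conditional independence produces exactly that kind of failure. Omit a relevant variable that is correlated with the included regressor and the estimated coefficient is biased, here badly enough to flip its sign (Python, purely illustrative numbers, not a climate application):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# x2 is an omitted driver correlated with the included regressor x1,
# so E[error | x1] != 0: the conditional independence condition fails.
x1 = rng.normal(size=n)
x2 = 0.8 * x1 + 0.6 * rng.normal(size=n)
y = 1.0 * x1 - 3.0 * x2 + rng.normal(size=n)  # true effect of x1 is +1.0

# Regress y on x1 alone: the omitted x2 is absorbed into the error term
b_hat = (x1 @ y) / (x1 @ x1)
print(round(b_hat, 2))   # comes out near -1.4, not the true +1.0
```

The estimate converges to 1.0 + (-3.0)(0.8) = -1.4: wrong sign, wrong magnitude, and no amount of replication fixes it.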
... Other Problems
In my paper I list five assumptions which are necessary for the AT99 model to yield BLUE coefficients, not all of which AT99 stated.
All five fail by construction.
I also list six conditions that need to be proven for the AT99 method to be valid.
In the absence of such proofs there is no basis for claiming the results of the AT99 method are unbiased or consistent,
and the results of the AT99 method (including use of the RC test) should not be considered reliable as regards the effect of GHGs on the climate.
... The climate model embeds the assumption that greenhouse gases have a significant climate impact.
Or, equivalently, that natural processes alone cannot generate a large class of observed events in the climate, whereas greenhouse gases can.
It is therefore not possible to use the climate model-generated weights to construct a test of the assumption that natural processes alone could generate the class of observed events in the climate.
... The RC Test
AT99 claimed that a test statistic formed using the signal detection regression residuals and the C matrix from an independent climate model follows a centered chi-squared distribution, and that if the test score is small relative to the 95% chi-squared critical value, the model is validated.
More specifically, the null hypothesis is not rejected.
But what is the null hypothesis?
Astonishingly it was never written out mathematically in the paper.
All AT99 provided was a vague group of statements about noise patterns, ending with a far-reaching claim that if the test doesn’t reject,
“then we have no explicit reason to distrust uncertainty estimates based on our analysis.”
As a result, researchers have treated the RC test as encompassing every possible specification error,
including ones that have no rational connection to it,
erroneously treating non-rejection as comprehensive validation of the signal detection regression model specification.
This is incomprehensible to me.
If in 1999 someone had submitted a paper to even a low-rank economics journal proposing a specification test in the way that AT99 did, it would have been annihilated at review.
They didn’t state the null hypothesis mathematically or list the assumptions necessary to prove its distribution (even asymptotically, let alone exactly);
they provided no analysis of its power against alternatives,
nor did they state any alternative hypotheses in any form;
so readers have no idea what rejection or non-rejection implies.
Specifically, they established no link between the RC test and the GM conditions.
I provide in the paper a simple description of a case in which the AT99 model might be biased and inconsistent by construction, yet the RC test would never reject.
And supposing that the RC test does reject, which GM condition therefore fails?
Nothing in their paper explains that.
It’s the only specification test used in the fingerprinting literature and it is utterly meaningless.
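For concreteness, the RC procedure as described can be reduced to a stylized numerical sketch: form the residual quadratic form r′C⁻¹r and compare it to a chi-squared critical value. This Python toy version (simulated data, with an identity matrix standing in for the second climate model's covariance estimate; not AT99's exact formulation) shows how little the check involves:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n, k = 40, 2

# Simulated weighted regression: signals X, observations y, iid noise
X = rng.normal(size=(n, k))
y = X @ np.array([1.0, 0.5]) + rng.normal(size=n)
b, *_ = np.linalg.lstsq(X, y, rcond=None)
r = y - X @ b

# Score the residuals against an independently supplied noise covariance
C2 = np.eye(n)                      # stand-in for the second model's C
stat = r @ np.linalg.solve(C2, r)   # r' C2^-1 r
crit = stats.chi2.ppf(0.95, df=n - k)
print(stat <= crit)                 # non-rejection: the only check offered
```

Nothing in this calculation touches the conditional independence condition, which is the sense in which non-rejection cannot validate the regression specification.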
... Guessing at Potential Objections
1. Yes but look at all the papers over the years that have successfully applied the AT99 method and detected a role for GHGs.
Answer:
the fact that a flawed methodology is used hundreds of times does not make the methodology reliable,
it just means a lot of flawed results have been published.
And the failure to spot the problems means that the people working in the signal detection/Optimal Fingerprinting literature aren’t well-trained in GLS methods.
People have assumed, falsely, that the AT99 method yields “BLUE” – i.e. unbiased and efficient – estimates.
Maybe some of the past results were correct.
The problem is that the basis on which people said so is invalid, so no one knows.
2. Yes but people have used other methods that also detect a causal role for greenhouse gases.
Answer:
I know.
But past IPCC reports have acknowledged that those methods are weaker as regards proving causality,
and they rely even more explicitly on the assumption that climate models are perfect.
And the methods based on time series analysis have not adequately grappled with the problem of mismatched integration orders between forcings and observed temperatures.
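The integration-order problem is the classic spurious regression effect: regressing one random walk on an independent random walk yields "significant" t-statistics far more often than the nominal 5% rate. A Monte Carlo sketch in Python (settings are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, reps = 500, 200

rejections = 0
for _ in range(reps):
    # Two independent random walks: integrated of order one, no true link
    y = np.cumsum(rng.normal(size=n))
    x = np.cumsum(rng.normal(size=n))
    b = (x @ y) / (x @ x)                       # OLS slope, no intercept
    resid = y - b * x
    se = np.sqrt((resid @ resid) / (n - 1) / (x @ x))
    t = b / se
    rejections += abs(t) > 1.96                 # naive test wrongly "detects" a link

print(rejections / reps)   # far above the nominal 5% rate
```

The rejection rate climbs toward one as the sample grows, which is why trending forcing and temperature series cannot simply be regressed on one another without first addressing their integration orders.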
... 3. Yes but this is just theoretical nitpicking, and I haven’t proven the previously-published results are false.
Answer:
What I have proven is that the basis for confidence in them is non-existent.
AT99 correctly highlighted the importance of the GM theorem but messed up its application.
... I have found that common signal detection results, even in recent data sets, don’t survive remedying the failures of the GM conditions.
If anyone thinks my arguments are mere nitpicking and believes the AT99 method is fundamentally sound,
I have listed the six conditions needing to be proven to support such a claim. Good luck.
I am aware that AT99 was followed by Allen and Stott (2003) which proposed TLS for handling errors-in-variables.
This doesn’t alleviate any of the problems I have raised herein. ...