P-Values and Trend

5/8/19

In his article Using P-Values To Diagnose "Trends" Is Invalid, Briggs offers the following as an argument against p-values:

Consider the problem of varying the start date of the analysis. We have observations from 1 to t, and check for trend using (the incorrect) statistical means. The trend, as above in the picture, is declared. It is negative. Therefore, the cause is said to be present. Then redo the analysis, this time starting from 2 to t, then 3 to t, and so on. You will discover that the trend changes, and even changes signs, all changes verified by wee p-values.

Where to even start...?

The X13-ARIMA-SEATS seasonal adjustment program was developed incrementally over the last 50+ years, and is used for seasonal adjustment by many government agencies, banks, and other institutions around the world. It relies on a variety of statistics and p-values (which Briggs doesn't like).

What is Briggs' well-developed and accepted substitute, exactly? I ask because "all we have to do is look", which Briggs proposed as a substitute for using models, doesn't pass the most basic smell test. Different people "looking", whatever that even means, could come to different decisions. Moreover, the same person looking at the same data at a different time could come to a different decision. Science obviously needs more rigor than that, including precise definitions. Also, modern time series analysis isn't confused about how to define a trend, as Briggs implies. The trend and other terms are quite well defined in the documentation and program code.

Yes, output and decisions can change based on your model span (the data used to fit the model). That's neither surprising nor undesirable. Is Briggs arguing that they shouldn't change? I'd still rather have results change with different input data than have them rest on "looking", whatever that means.

Please pass this on to anybody you see, especially scientists, who use statistical methods to claim their trends are "significant".

I passed to them "all we have to do is look". What do you think they said?

To my reply, Briggs wrote

You have read the many arguments against p-values in the linked paper, yes?

Yes, I am very familiar with them, and frankly, all the arguments against p-values are very poor. What I wonder about more, however, is what the arguments are against the proposed alternatives to p-values, and how the pros of those alternatives stack up against the pros of p-values. For example, Briggs' readers may wonder why official agencies rely on X13-ARIMA-SEATS, and not on "looking", to make important decisions.

He continued:

X13-ARIMA-SEATS is used for forecasting, primarily. And forecasting, which is to say predicting, is exactly what I do advocate. That is the alternative to hypothesis testing/Bayes factors. ... If you agree p-values are invalid, I don't see we have any disagreement.

I don't agree that p-values are invalid; they are very valid. I'm saying X13-ARIMA-SEATS is successful and it uses p-values.

Moreover, tests for trends with very minimal assumptions have existed since at least 1945. I present one below from Mann (1945).

The Mann trend test is like Kendall's tau if you are familiar with that. Some details:

time is treated as the X variable
association between X and Y might be indicative of a trend
each observation is compared in magnitude with every preceding observation

One calculates the n choose 2 pairwise differences Y_j - Y_i (for 1<=i<j<=n). The test statistic is T = sum[ c(Y_j-Y_i) ], taken over all of those pairs, where the function c is:

c(a):
= 1, if a>0
= 0, if a=0
= -1, if a<0
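As a minimal sketch of that definition (the function names here are my own, not from Mann or from the book), the statistic can be computed in a few lines of Python by summing the sign of every later-minus-earlier difference:

```python
from itertools import combinations


def sign(a):
    """The function c: 1 if a > 0, 0 if a == 0, -1 if a < 0."""
    return (a > 0) - (a < 0)


def mann_statistic(y):
    """Sum c(y[j] - y[i]) over every pair of positions with i < j.

    `y` is assumed to be a sequence of observations in time order.
    combinations(y, 2) yields each (earlier, later) pair exactly once.
    """
    return sum(sign(later - earlier) for earlier, later in combinations(y, 2))


print(mann_statistic([1, 2, 3, 4, 5]))  # 10: all 10 pairs increase
print(mann_statistic([5, 4, 3, 2, 1]))  # -10: all 10 pairs decrease
print(mann_statistic([2, 1, 4, 3, 5]))  # 6: 8 pairs increase, 2 decrease
```

With n = 5 there are 10 pairs, so T ranges from -10 to 10. Values near either extreme mean that almost every observation is ordered the same way relative to the ones before it.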

If the alternative hypothesis is an upward trend, the rejection region is large positive values of T. For a downward trend, we'd reject for large negative values of T.

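There are messy equations for small n, but for large n (and no ties), Z = T / sqrt( n(n-1)(2n+5)/18 ) is approximately distributed as a standard normal. Equivalently, writing tau = T / (n choose 2) for the rescaled statistic, Z = [3*sqrt(n(n-1))*tau ] / sqrt(2*(2n+5)). I got most of this information from Nonparametric Statistical Inference, by Gibbons, Chakraborti, p. 399 and p. 406.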
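Here is a rough sketch of the large-sample version of the test, assuming no ties and no continuity correction. It standardizes the sum of signs T by its null variance n(n-1)(2n+5)/18 and reads a one-sided p-value off the standard normal. The function name, the `alternative` argument, and the example series are hypothetical, chosen only for illustration.

```python
import math
from itertools import combinations


def normal_upper_tail(z):
    """P(Z >= z) for a standard normal Z, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


def mann_trend_test(y, alternative="upward"):
    """Return (T, Z, one-sided p-value) using the normal approximation.

    Assumes `y` is in time order with no tied values. n should be reasonably
    large for the approximation to be trusted.
    """
    if alternative not in ("upward", "downward"):
        raise ValueError("alternative must be 'upward' or 'downward'")
    n = len(y)
    # T: sum of signs of all later-minus-earlier differences
    t = sum((later > earlier) - (later < earlier)
            for earlier, later in combinations(y, 2))
    variance = n * (n - 1) * (2 * n + 5) / 18  # Var(T) under no trend, no ties
    z = t / math.sqrt(variance)
    # Upward trend: reject for large positive Z. Downward: large negative Z.
    p = normal_upper_tail(z) if alternative == "upward" else normal_upper_tail(-z)
    return t, z, p


series = list(range(1, 11))  # made-up, strictly increasing series with n = 10
t, z, p = mann_trend_test(series, alternative="upward")
print(t, round(z, 2), p < 0.001)  # 45 4.02 True
```

For a real analysis with small n or tied observations, the exact null distribution or a tie-corrected variance would be the safer choice. This sketch only covers the simple case.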

Great book on nonparametric statistics, by the way.

Do you see how this trend is precisely defined? It is not "just look".

After that exchange was basically ignored, I lost ~~hope~~ interest in replying further. The bottom line is that the pros use p-values, and have for a while. The pros do not use p-values just to use them. They use them because "just look" is ill-defined and a terrible way to proceed in science.

Thanks for reading.
