Weighted Wilcoxon Rank Sum Test
8/2/17
The Wilcoxon Rank Sum test (WRST) is an extremely useful test from nonparametric statistics. One issue arises, however, in sample survey work. In a basic sample survey, a unit, say a business establishment, has a weight of w_i, which means that the unit represents w_i units in the population (itself and w_i - 1 others). The issue is that the WRST does not incorporate survey weights. Additionally, the standard SAS procedures and base R functions for doing a WRST do not incorporate survey weights. In this article I will give several ways (ranging from bad, to OK, to probably correct) to carry out a WRST using survey weights.
a) First, there is the naive way: create w_i replications of each observation, where w_i is the observation's weight, and then run the Wilcoxon rank sum test as usual on this expanded dataset. I'd proceed with extreme caution here because it is ad hoc and inflates the 'sample size', which makes p-values look smaller than they should, but it seems a good first stab at it in my opinion. I believe this because experience tells me that the impact of sampling design and weights on statistical tests is sometimes great and sometimes not so great. With this approach, your dataset grows from size M (the number of sampled units) to size N (the sum of the weights). Again, beware.
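To make a) concrete, here is a minimal Python sketch using only the standard library. The function names (`midranks`, `rank_sum_test`, `expand`) and the toy data are my own for illustration, the weights are assumed to be (rounded to) whole numbers, and the p-value uses the large-sample normal approximation with a tie correction rather than an exact calculation.

```python
from math import sqrt
from statistics import NormalDist

def midranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected)."""
    pooled = list(x) + list(y)
    n1, n2, n = len(x), len(y), len(pooled)
    w = sum(midranks(pooled)[:n1])           # rank sum of the first group
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(t**3 - t for t in counts.values())
    var_w = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (w - n1 * (n + 1) / 2) / sqrt(var_w)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return w, z, p

def expand(values, weights):
    """Naive approach a): repeat each value round(w_i) times."""
    out = []
    for v, w in zip(values, weights):
        out.extend([v] * round(w))
    return out

# Hypothetical toy data: (values, survey weights) for two groups.
a_vals, a_wts = [12, 15, 9, 20], [3, 1, 2, 2]
b_vals, b_wts = [22, 18, 25, 30], [2, 2, 1, 3]

unweighted = rank_sum_test(a_vals, b_vals)
expanded = rank_sum_test(expand(a_vals, a_wts), expand(b_vals, b_wts))
print("unweighted (W, z, p):", unweighted)
print("expanded   (W, z, p):", expanded)
```

Comparing the two printed p-values shows the caveat directly: the expanded dataset pretends to have N observations, so its p-value should not be taken at face value.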
b) Another way is to do the duplication described in a) above, but then randomly sample M observations with replacement from the N observations, so you're back at an "original" dataset of size M, and then carry out the Wilcoxon rank sum test as you normally would. The logic is that this newly sampled dataset reflects the weights better than ignoring the weights altogether.
c) A way to improve on the above is to repeat the procedure described in b) many times, say 100, and report averages and other summaries over these 100 simulations. One could present the distributions of the Wilcoxon statistics and p-values, for example, to support a conclusion.
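Approaches b) and c) can be sketched together in Python. This is an illustration under my own assumptions, not a vetted procedure: the names `rank_sum_p` and `resampled_p_values` and the toy records are hypothetical, the p-value again uses the normal approximation, and drawing M records with probability proportional to weight is used as a shortcut that is equivalent to duplicating by integer weights and then sampling with replacement.

```python
import random
from math import sqrt
from statistics import NormalDist, mean, median

def rank_sum_p(x, y):
    """Two-sided rank sum p-value (normal approximation, tie-corrected)."""
    pooled = sorted(x + y)
    n1, n2, n = len(x), len(y), len(pooled)
    first, count = {}, {}
    for pos, v in enumerate(pooled):
        first.setdefault(v, pos)          # 0-based position of first copy of v
        count[v] = count.get(v, 0) + 1
    # midrank of v: average of its 1-based positions in the pooled ordering
    rank = {v: first[v] + (count[v] + 1) / 2 for v in count}
    w = sum(rank[v] for v in x)
    ties = sum(t**3 - t for t in count.values())
    var_w = n1 * n2 / 12 * ((n + 1) - ties / (n * (n - 1)))
    z = (w - n1 * (n + 1) / 2) / sqrt(var_w)
    return 2 * (1 - NormalDist().cdf(abs(z)))

def resampled_p_values(records, reps=100, seed=0):
    """records: list of (group, value, weight) with groups 'A' and 'B'."""
    rng = random.Random(seed)
    m = len(records)
    weights = [w for _, _, w in records]
    p_values = []
    for _ in range(reps):
        # b): draw M records with replacement, proportional to weight
        draw = rng.choices(records, weights=weights, k=m)
        x = [v for g, v, _ in draw if g == "A"]
        y = [v for g, v, _ in draw if g == "B"]
        if not x or not y or len(set(x + y)) < 2:
            continue  # degenerate draw: the test is undefined, skip it
        p_values.append(rank_sum_p(x, y))
    return p_values

# Hypothetical toy data: (group, value, survey weight).
records = [("A", 12, 3), ("A", 15, 1), ("A", 9, 2), ("A", 20, 2),
           ("B", 22, 2), ("B", 18, 2), ("B", 25, 1), ("B", 30, 3)]

# c): summarize the distribution of p-values over the repetitions
p_values = resampled_p_values(records, reps=100)
print("usable draws:", len(p_values))
print("mean p:", mean(p_values), "median p:", median(p_values))
print("share below 0.05:", sum(p < 0.05 for p in p_values) / len(p_values))
```

Fixing the seed makes the summary reproducible; in practice one would look at the whole distribution of p-values rather than a single average.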
d) If you are willing to use a probably less powerful test than the Wilcoxon rank sum test, you can do the Wilcoxon signed rank test on the original data (keeping in mind that the signed rank test is meant for paired or one-sample data). To get this weighted, you can create the signed ranks yourself and then put them through SAS's PROC SURVEYMEANS, which can incorporate the survey weights. Note, however, that this ultimately carries out a t-test to get the p-value.
e) A more sophisticated way, and probably the correct way mathematically speaking, is something I recently found in R: svyranktest in the 'survey' package (see https://www.rdocumentation.org/packages/survey/versions/3.32-1/topics/svyranktest), a nonparametric alternative to the weighted t-test (wtd.t.test in the 'weights' package). Please note that I have never tried it, but I have heard good things from people who have.
The theory for this comes from:
T. Lumley and A.J. Scott (2013). Two-sample rank tests under complex sampling. Biometrika, 100, 831-842. (https://stattech.blogs.auckland.ac.nz/files/2012/06/ranktests-techrep.pdf)
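As I understand the idea behind design-based rank tests, each observation's rank is estimated from the weighted empirical distribution, and the groups are then compared on the weighted means of those estimated ranks. The Python sketch below illustrates only that first step. It is not a reimplementation of svyranktest: the function names and toy data are hypothetical, the tie handling is one simple choice among several, and no p-value is produced, because the standard error has to come from the sampling design (which is what the 'survey' package handles).

```python
def population_midranks(values, weights):
    """Estimated population ranks on a 0-1 scale from the weighted ECDF.

    For each value: (weight strictly below + half the weight tied with it)
    divided by the total weight.
    """
    total = sum(weights)
    ranks = []
    for v in values:
        below = sum(w for u, w in zip(values, weights) if u < v)
        equal = sum(w for u, w in zip(values, weights) if u == v)
        ranks.append((below + equal / 2) / total)
    return ranks

def weighted_mean(xs, ws):
    return sum(x * w for x, w in zip(xs, ws)) / sum(ws)

def rank_mean_difference(values, weights, groups, a="A", b="B"):
    """Difference between groups in the weighted mean estimated rank."""
    r = population_midranks(values, weights)
    in_a = [(ri, w) for ri, w, g in zip(r, weights, groups) if g == a]
    in_b = [(ri, w) for ri, w, g in zip(r, weights, groups) if g == b]
    mean_a = weighted_mean([ri for ri, _ in in_a], [w for _, w in in_a])
    mean_b = weighted_mean([ri for ri, _ in in_b], [w for _, w in in_b])
    return mean_a - mean_b

# Hypothetical toy data.
values = [12, 15, 9, 20, 22, 18, 25, 30]
weights = [3, 1, 2, 2, 2, 2, 1, 3]
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]

diff = rank_mean_difference(values, weights, groups)
print("difference in weighted mean rank (A - B):", diff)
```

A difference near zero suggests the two groups sit at similar positions in the estimated population distribution; judging whether a nonzero difference is significant requires a design-based variance estimate.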
f) Last, I would consider simply bypassing nonparametric approaches entirely and trying something like a weighted two-sample t-test! Nonparametric tests can definitely be a headache when working with survey data, but if sample sizes are "large enough", parametric and nonparametric tests tend to agree: the test statistics will differ numerically, but the conclusions are often the same.
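For f), here is a minimal Python sketch of a weighted two-sample comparison of means. It rests on assumptions that are mine and should be checked against your design: no strata or clusters, units treated as sampled with replacement, the two groups treated as independent, a linearization-style variance for each weighted mean, and a normal reference distribution in place of the t distribution (reasonable only when sample sizes are "large enough"). The function names and data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def weighted_mean_and_var(values, weights):
    """Weighted mean and an approximate variance of that mean.

    Variance: n/(n-1) * sum((w_i * (y_i - mean))^2) / (sum of w_i)^2,
    which reduces to s^2 / n when all weights are equal.
    """
    n = len(values)
    total = sum(weights)
    m = sum(w * v for v, w in zip(values, weights)) / total
    var = n / (n - 1) * sum((w * (v - m)) ** 2
                            for v, w in zip(values, weights)) / total ** 2
    return m, var

def weighted_two_sample_z(xa, wa, xb, wb):
    """Difference in weighted means, z statistic, two-sided normal p-value."""
    ma, va = weighted_mean_and_var(xa, wa)
    mb, vb = weighted_mean_and_var(xb, wb)
    z = (ma - mb) / sqrt(va + vb)
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return ma - mb, z, p

# Hypothetical toy data: values and survey weights for two groups.
a_vals, a_wts = [12, 15, 9, 20], [3, 1, 2, 2]
b_vals, b_wts = [22, 18, 25, 30], [2, 2, 1, 3]

diff, z, p = weighted_two_sample_z(a_vals, a_wts, b_vals, b_wts)
print("difference:", diff, "z:", z, "p:", p)
```

With a real complex design (strata, clusters, finite population corrections), the variance step should be replaced by the appropriate design-based estimator from survey software.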
I hope any of this helps! Thank you for reading.