
Outlier Risk, Part IV

Published 01/19/2022, 12:15 AM
Updated 07/09/2023, 06:31 AM

Identifying outliers in a data set is conceptually cut and dried. Extremes, by definition, should be conspicuous. But in practice, labeling outliers is a mix of art and science, in part because expectations vary from investor to investor, as do sensitivities to “risk.” The analysis can also be muddied by the technique used to crunch the numbers.

In previous articles in this series (see list below) we used several applications that rely on a mix of observation and quantification for the test data (rolling 1-year returns for the S&P 500 Index). Sorting performances by way of interquartile range, for example, clearly indicates outliers, albeit on a relative basis. But this approach to searching for outliers is destined to find extremes, even if they don’t exist.

Imagine that we’re looking at a set of returns that we know do not contain outliers, based on some pre-determined rule that we agree on. Nonetheless, graphing the numbers via a histogram or boxplot will likely turn up outliers because the analysis is relative and so there are always outliers for any given data set, even if the extremes don’t pass muster in an absolute sense.
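To make the point concrete, here is a minimal sketch of the boxplot rule applied to simulated data that contains no "true" outliers by construction. The function name `iqr_flags`, the 1.5 multiplier (the usual boxplot convention) and the simulated normal sample are assumptions for illustration, not the data or code used in this series.

```python
import random
import statistics

def iqr_flags(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR] (the boxplot rule)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [v for v in values if v < low or v > high]

# Hypothetical sample: pure normal noise, so nothing here is an outlier
# in the sense of coming from a different process.
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(5000)]

flagged = iqr_flags(sample)
print(len(flagged), "of", len(sample), "points flagged")
```

For normally distributed data the rule flags roughly 0.7% of observations in expectation, so a large enough sample will nearly always show "outliers" on a boxplot even when every point was drawn from the same distribution.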

That creates a challenge, particularly if you and I disagree on defining outliers. There’s also a process issue. Visually inspecting histograms and boxplots is tedious and time-consuming if we’re analyzing data periodically for multiple markets and time windows. Yes, we can shift to a parametric approach—z-scores, for example. But we can still disagree over which z-scores are relevant or if the underlying analysis (standard deviation in this case) is reliable.
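The disagreement over thresholds is easy to illustrate. The sketch below computes z-scores with the sample standard deviation and applies two common cutoffs; the return figures are made up for the example, and the function names are hypothetical.

```python
import statistics

def z_scores(values):
    """Standardize each value using the sample mean and standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]

def flag_by_z(values, threshold):
    """Return the values whose absolute z-score exceeds the threshold."""
    return [v for v, z in zip(values, z_scores(values)) if abs(z) > threshold]

# Made-up annual returns, for illustration only.
returns = [0.08, 0.12, -0.05, 0.10, 0.15, 0.07,
           -0.02, 0.11, 0.09, 0.45, 0.06, 0.13]

print(flag_by_z(returns, 2.0))  # the 45% return is flagged at a cutoff of 2
print(flag_by_z(returns, 3.0))  # nothing is flagged at a cutoff of 3
```

The same 45% return is an outlier under one rule and unremarkable under the other, and the extreme value itself inflates the standard deviation used to judge it.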


As a solution, we can formalize the analysis with statistical tests. Several have been proposed. Once again, perfection is elusive, but a higher level of objectivity is available (assuming you buy into the underlying assumptions, notably that the data are approximately normally distributed). Let’s look at one of the options: the Grubbs test. The goal here is to decide, by way of a formal statistical test, whether a single extreme value is truly an outlier. Once again we’ll use the rolling 1-year returns for the S&P 500.
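For readers who want to build a comparable test series, a rolling 1-year return can be sketched as follows. The 252-trading-day window, the use of simple rather than log returns, and the function name are assumptions for illustration; the exact data preparation behind the series isn't spelled out here.

```python
def rolling_returns(prices, window=252):
    """Simple return over a trailing window of `window` observations.

    Assumes `prices` is a chronologically ordered list of closing prices
    and that 252 observations approximate one trading year.
    """
    return [prices[i] / prices[i - window] - 1.0
            for i in range(window, len(prices))]

# Tiny hypothetical example with a 2-observation window.
print(rolling_returns([100.0, 110.0, 121.0, 133.1], window=2))
```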

S&P 500: Rolling 1-Year Return

As outlined previously, there was a case for seeing some of the extreme returns as outliers. But those conclusions rely on a degree of subjectivity. The Grubbs test, by comparison, leaves little room for debate once a significance level is chosen. That alone doesn’t mean it’s the last word on defining outliers, but at least it provides a benchmark that’s immune to the behavioral factors that can otherwise complicate the analysis.

As an example, running a Grubbs test on the 1-year returns in the chart above focuses on the highest positive return, a nearly 75% gain. The resulting test statistic (4.15) and p-value (0.25) indicate that this number isn’t an outlier. The p-value is far above 0.05, so we cannot reject the null hypothesis, which in this case is that the data set contains no outliers. In other words, the test finds no statistical support for labeling the 75% gain an outlier.

Running the Grubbs test on the opposite extreme return—a near 49% loss—also finds no support for labeling the number an outlier.
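For readers curious about the mechanics, here is a minimal sketch of a one-sided Grubbs test for the largest value. The statistic is G = (max − mean) / s, and the p-value comes from converting G to a Student's t value with N − 2 degrees of freedom and multiplying the upper-tail probability by N. This is not the code behind the numbers above: the function names are hypothetical, the toy data are made up, and the t-distribution tail is approximated with Simpson's rule only to keep the example free of third-party libraries (a statistics package would normally supply it).

```python
import math
import statistics

def t_sf(t, df, steps=2000):
    """Approximate P(T > t) for Student's t with df degrees of freedom, t >= 0.

    Integrates the density from 0 to t with Simpson's rule (steps must be even).
    """
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))

    def pdf(x):
        return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return max(0.0, 0.5 - total * h / 3)

def grubbs_max(values):
    """One-sided Grubbs test for the largest value: returns (G, p_value).

    Assumes at least three values that are not all identical.
    """
    n = len(values)
    g = (max(values) - statistics.fmean(values)) / statistics.stdev(values)
    denom = (n - 1) ** 2 - n * g * g
    if denom <= 0:  # G is at its theoretical maximum, (n - 1) / sqrt(n)
        return g, 0.0
    t = math.sqrt(n * (n - 2) * g * g / denom)
    return g, min(1.0, n * t_sf(t, n - 2))

# Made-up data, for illustration only.
print(grubbs_max([1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))   # no obvious extreme
print(grubbs_max([1, 2, 3, 4, 5, 6, 7, 8, 9, 100]))  # one glaring extreme
```

To test the lowest value instead, the same function can be applied to the negated series. Note that the p-value scales with the number of observations, so a return that looks extreme in isolation can still fail to register as an outlier in a long history.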

This result conflicts with some of the previous analytics discussed in this series. The logic appears to be that, in the context of the full data set, neither a 75% gain nor a 49% one-year loss for the S&P 500 constitutes an outlier event, at least within the statistical paradigm set up by a Grubbs test.


But that’s not the end of the story, even if we’re limiting our analysis to formal statistical tests. As we’ll see in the next installment, a Grubbs test isn’t the only game in town when it comes to formal outlier tests.

Previous articles in this series:

Outlier Risk, Part II: Z-score

Outlier Risk, Part III: Hampel filter/median absolute deviation
