3. ROBUST Z-SCORE
It is also called the median absolute deviation (MAD) method. It is similar to the Z-score method, but with different parameters: since the mean and standard deviation are heavily influenced by outliers, we replace them with the median and the absolute deviation from the median.
Suppose x follows a standard normal distribution. The MAD then converges to the median of the half-normal distribution, which is the 75th percentile of the normal distribution: Φ⁻¹(0.75) ≃ 0.6745. Multiplying by 0.6745 therefore puts the robust z-score on the same scale as the ordinary z-score when the data are normal.
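As a quick sanity check on the constant, the sketch below compares the theoretical 75th percentile of the standard normal with the MAD of a simulated sample. It uses only the Python standard library; the seed and sample size are arbitrary choices made for illustration and are not part of the original text.

```python
import random
import statistics

# 75th percentile of the standard normal distribution (inverse CDF at 0.75)
k = statistics.NormalDist().inv_cdf(0.75)

# Empirical MAD of a simulated standard normal sample
# (seed and sample size are arbitrary, chosen only for reproducibility)
random.seed(0)
sample = [random.gauss(0.0, 1.0) for _ in range(200_000)]
med = statistics.median(sample)
mad = statistics.median(abs(v - med) for v in sample)

print(f"theoretical constant: {k:.4f}")
print(f"empirical MAD:        {mad:.4f}")
```

The two printed values should be close to each other, which is why dividing the MAD by 0.6745 gives a rough estimate of the standard deviation for normal data.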
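import pandas as pd
import numpy as np
from scipy import stats

train = pd.read_csv('../input/house-prices-advanced-regression-techniques/train.csv')

def ZRscore_outlier(df):
    # Collect values whose robust z-score exceeds 3 in absolute value
    out = []
    med = np.median(df)
    # Raw MAD (scale defaults to 1.0); median_abs_deviation needs SciPy >= 1.5
    # and replaces the older median_absolute_deviation
    ma = stats.median_abs_deviation(df)
    for i in df:
        z = (0.6745 * (i - med)) / ma
        if np.abs(z) > 3:
            out.append(i)
    print("Outliers:", out)
    return out

ZRscore_outlier(train['LotArea'])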
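The same calculation can also be written without an explicit loop. The following is a minimal sketch, not the original author's code: the function name `robust_z_scores`, the toy data, and the choice to raise an error when the MAD is zero are all assumptions made for illustration.

```python
import numpy as np

def robust_z_scores(values):
    """Return 0.6745 * (x - median) / MAD for each element of values."""
    x = np.asarray(values, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    if mad == 0:
        # Assumption: a zero MAD (more than half the values identical)
        # is reported as an error instead of dividing by zero.
        raise ValueError("MAD is zero; robust z-score is undefined")
    return 0.6745 * (x - med) / mad

# Hypothetical toy data: six typical values and one extreme value
data = np.array([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0])
z = robust_z_scores(data)
outliers = data[np.abs(z) > 3]
print(outliers)  # [100.]
```

Here the median is 12 and the MAD is 1, so only the value 100 has a robust z-score above 3 in absolute value. On a pandas column, the function could be called as `robust_z_scores(train['LotArea'])`.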