4. IQR METHOD
In this method by using Inter Quartile Range(IQR), we detect outliers. IQR tells us the variation in the data set. Any value, which is beyond the range of -1.5 x IQR to 1.5 x IQR is treated as outlier
* Q1 represents the 1st quartile/25th percentile of the data.* Q2 represents the 2nd quartile/median/50th percentile of the data.* Q3 represents the 3rd quartile/75th percentile of the data.* (Q1–1.5*IQR) represent the smallest value in the data set and (Q3+1.5*IQR) represnt the largest value in the data set.
import pandas as pd
import numpy as np
train = pd.read_csv('../input/house-prices-advanced-regression-techniques/train.csv')
out=[]
def iqr_outliers(df):
q1 = df.quantile(0.25)
q3 = df.quantile(0.75)
iqr = q3-q1
Lower_tail = q1 - 1.5 * iqr
Upper_tail = q3 + 1.5 * iqr
for i in df:
if i > Upper_tail or i < Lower_tail:
out.append(i)
print("Outliers:",out)
iqr_outliers(train['LotArea'])
Last updated