How to test for normality

Rabi Kumar Singh
3 min read · Sep 25, 2021

Many statistical techniques depend on the assumption that the data are normally distributed. For these techniques, it is good practice to examine the data and confirm that the assumption of normality is tenable.

Here I am going to cover a few simple checks that can be performed to test for normality:

  • Descriptive Statistics
  • Generate Histogram or Density plot
  • Chi-square test

Descriptive Statistics:

Perhaps, the easiest way to test for normality is to examine several common descriptive statistics. Here’s what to look for:

  • Central tendency. The mean and the median are summary measures used to describe central tendency, that is, the most “typical” value in a set of values. With a normal distribution, the mean is equal to the median, so in a sample drawn from one the two should be close.
  • Skewness. Skewness is a measure of the asymmetry of a probability distribution. If observations are distributed symmetrically around the mean, the skewness value is zero; otherwise, the skewness value is positive (longer right tail) or negative (longer left tail). As a rule of thumb, skewness between -2 and +2 is consistent with a normal distribution.
  • Kurtosis. Kurtosis is a measure of whether observations cluster around the mean of the distribution or in the tails of the distribution. The normal distribution has an excess kurtosis (kurtosis minus 3) of zero. As a rule of thumb, excess kurtosis between -2 and +2 is consistent with a normal distribution.
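
The checks above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the function name `describe_normality` and the synthetic sample are my own assumptions, not part of any library. Skewness and kurtosis are computed from population moments (dividing by n), and 3 is subtracted from the kurtosis so that a normal distribution scores zero.

```python
import random
import statistics


def describe_normality(values):
    """Return mean, median, skewness and excess kurtosis of a sample."""
    n = len(values)
    mean = statistics.fmean(values)
    median = statistics.median(values)
    # Central moments in population form (dividing by n).
    m2 = sum((x - mean) ** 2 for x in values) / n
    m3 = sum((x - mean) ** 3 for x in values) / n
    m4 = sum((x - mean) ** 4 for x in values) / n
    skewness = m3 / m2 ** 1.5
    # Subtract 3 so that a normal distribution has excess kurtosis 0.
    excess_kurtosis = m4 / m2 ** 2 - 3.0
    return {
        "mean": mean,
        "median": median,
        "skewness": skewness,
        "excess_kurtosis": excess_kurtosis,
    }


random.seed(0)
# Synthetic data, for illustration only.
sample = [random.gauss(50, 10) for _ in range(5000)]
stats = describe_normality(sample)
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

For a sample that really is normal, the mean and median should be close, and both skewness and excess kurtosis should be near zero.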

Histogram / Kdeplot

Histogram

Another easy way to test for normality is to plot the data in a histogram and see whether it shows the bell-shaped pattern characteristic of a normal distribution. In Python, this is easy to do with the Seaborn library.

Kdeplot

A kernel density estimate (KDE) plot is a method for visualizing the distribution of observations in a dataset, analogous to a histogram. KDE represents the data using a continuous probability density curve in one or more dimensions.
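
To make the idea concrete, here is a small sketch of a one-dimensional Gaussian KDE written with only the standard library. It is not how Seaborn implements its `kdeplot` function; the name `gaussian_kde` and the rule-of-thumb bandwidth (sample standard deviation times n^(-1/5)) are assumptions chosen for illustration.

```python
import math
import random
import statistics


def gaussian_kde(values, bandwidth=None):
    """Return a function that evaluates a Gaussian kernel density estimate."""
    n = len(values)
    if bandwidth is None:
        # Rule-of-thumb bandwidth (an assumption): std * n^(-1/5).
        bandwidth = statistics.stdev(values) * n ** (-1 / 5)
    norm = n * bandwidth * math.sqrt(2 * math.pi)

    def density(x):
        # Sum of one Gaussian bump per observation, scaled to integrate to 1.
        return sum(
            math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values
        ) / norm

    return density


random.seed(0)
# Synthetic data, for illustration only.
sample = [random.gauss(50, 10) for _ in range(500)]
density = gaussian_kde(sample)

step = 0.5
grid = [i * step for i in range(201)]  # 0.0, 0.5, ..., 100.0
curve = [density(x) for x in grid]
area = sum(curve) * step  # Riemann-sum approximation of the integral
peak_x = grid[curve.index(max(curve))]
print(f"area under curve: {area:.3f}, peak near x = {peak_x}")
```

Each observation contributes one small bell curve, and the estimate is their average. For normal data the resulting curve should itself look like a single bell.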

Chi-Square Test

The chi-square goodness of fit test is another good option for determining whether a set of data was sampled from a normal distribution.

Hypothesis Testing

The chi-square test is an actual hypothesis test, where we examine observed data to choose between two statistical hypotheses:

  • Null hypothesis: Data is sampled from a normal distribution.
  • Alternative hypothesis: Data is not sampled from a normal distribution.

Like many other techniques for testing hypotheses, the chi-square test for normality involves computing a test statistic and finding its P-value for the given degrees of freedom. If the P-value is bigger than the chosen significance level, we fail to reject the null hypothesis; if it is smaller, we reject the null hypothesis.
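
The procedure can be sketched as follows, using only the standard library. Several choices here are my own assumptions rather than a fixed recipe: the function names, the use of 7 equal-probability bins, and the textbook degrees of freedom (bins - 1 - 2, because the mean and standard deviation are estimated from the data). The P-value uses a closed form that holds only for an even number of degrees of freedom; in practice a statistics library such as SciPy would normally supply it.

```python
import bisect
import math
import random
from statistics import NormalDist, fmean, stdev


def chi2_sf_even_df(x, df):
    """P(X > x) for a chi-square variable; closed form valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("this closed form needs a positive, even df")
    half = x / 2
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


def chi_square_normality_test(values, bins=7):
    """Chi-square goodness-of-fit test against a fitted normal distribution."""
    n = len(values)
    fitted = NormalDist(fmean(values), stdev(values))
    # Interior edges at the 1/bins, 2/bins, ... quantiles of the fitted normal,
    # so every bin has the same expected count n / bins.
    edges = [fitted.inv_cdf(i / bins) for i in range(1, bins)]
    observed = [0] * bins
    for v in values:
        observed[bisect.bisect_right(edges, v)] += 1
    expected = n / bins
    statistic = sum((o - expected) ** 2 / expected for o in observed)
    df = bins - 1 - 2  # two parameters were estimated from the data
    return statistic, df, chi2_sf_even_df(statistic, df)


random.seed(0)
# Synthetic data, for illustration only.
normal_sample = [random.gauss(50, 10) for _ in range(1000)]
skewed_sample = [random.expovariate(1 / 10) for _ in range(1000)]

alpha = 0.05  # significance level
for name, data in [("normal", normal_sample), ("exponential", skewed_sample)]:
    statistic, df, p_value = chi_square_normality_test(data)
    decision = "reject" if p_value < alpha else "fail to reject"
    print(f"{name}: chi2 = {statistic:.2f}, df = {df}, "
          f"p = {p_value:.4g} -> {decision} the null hypothesis")
```

Equal-probability bins keep every expected count the same, which avoids the sparse outer bins that fixed-width bins tend to produce.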

Conclusion

These checks for normality are quick to carry out. In most cases, before turning to other tests, I would suggest running these easy checks first and drawing a conclusion from what they show.

My Profile

For more updates, you can follow me on Kaggle. Here is the link:

https://www.kaggle.com/jurk06
