Jarque-Bera Test and p-values

Hello,

Re the Instability of Estimates lecture, I'm having a hard time parsing the meaning of the p-value returned by running the Jarque-Bera test. In the lecture notebook, this is stated:

Sure enough the value is < 0.05 and we say that X is not normal

And yet, in the answers to the exercises notebook, this code is run:

if Xp < 0.05:  
    print 'The distribution is likely normal.'  

These aren't consistent, unless I'm missing something. The first makes more sense to me: given the null hypothesis that X came from a normal distribution, a significant p-value (less than 0.05) suggests that X likely did not come from a normal distribution.
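In other words, I would have expected the check to go the other way. Here's a minimal sketch of what I mean (the variable names are my own, and I'm assuming the p-value comes from scipy.stats.jarque_bera, which returns the test statistic and the p-value):

    import numpy as np
    from scipy.stats import jarque_bera

    X = np.random.normal(0, 1, 1000)   # hypothetical sample being tested
    _, Xp = jarque_bera(X)             # Xp is the p-value, as in the exercise

    # a small p-value is evidence AGAINST the null hypothesis of normality
    if Xp < 0.05:
        print('The distribution is likely not normal.')
    else:
        print('We cannot reject that the distribution is normal.')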

Can someone please explain if I'm misunderstanding?

Thanks!

6 responses

Just came across the same issue; also think there might be a mistake. Can someone confirm?

I didn't look at the exercise, but if the p-value is higher than 0.05, then the data is considered normally distributed. Pay attention, though: the exercise might, for example, only consider the data normal if the p-value is higher than 0.01. It also depends on the context; you shouldn't run a test "blindly" and draw "blind" conclusions. If that's not the case, then I think with a p-value of 0 they made a mistake, LOL.

Hope it helped

Hi Binani, thanks for your response; however, isn't the null hypothesis of the test that the distribution is normal? Meaning that if the p-value is less than 0.05, we would reject the null hypothesis.
Correct me if I’m wrong!

That's what I said: if the p-value is less than 0.05 (by default), then we reject H0
(the hypothesis that the test rejects the "normality" of the data). However, if the p-value is > 0.05, then we ACCEPT H0, which means the test didn't succeed in rejecting the normality of the data.

Beware: newbies usually think that accepting the hypothesis means the data is effectively NORMALLY DISTRIBUTED, but that is not what any test says. The test only says that it doesn't REJECT the hypothesis of the data's normality, which is not the same as saying that the data is effectively NORMALLY DISTRIBUTED.

I don't know if I was clear in that last paragraph...

Thanks for reviving this discussion Andreas and Bibani!

These can be quite difficult to follow, but here's what I feel I've learned since then:

  1. The smaller the p-value, the stronger the evidence we have to reject the null hypothesis. See the wiki and this link. A small p-value suggests a small probability of obtaining the observed results assuming the null hypothesis is true.
  2. The null hypothesis of the Jarque-Bera test is that the data is normally distributed.
  3. The conclusion of (1) and (2) must be that a small p-value for the Jarque-Bera test suggests the data is not normally distributed.

I confirmed this by running jarque_bera on two very large samples (10,000 points each), the first of which is very obviously normal and the second of which is very obviously bimodal. The test returns a p-value (the second return value of jarque_bera, per the documentation) of 0.74 for the normally distributed data and a p-value of 0.0 for the bimodally distributed data. See my attached notebook if you'd like to review the results.
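In case it's useful, here's roughly what that check looks like (a minimal sketch rather than the exact notebook code; I'm using scipy.stats.jarque_bera, whose first two return values are the statistic and the p-value, same as the statsmodels version; the sample sizes and seed are my own choices, so exact numbers will differ a bit):

    import numpy as np
    from scipy.stats import jarque_bera

    np.random.seed(1)

    # 10,000 points each: one obviously normal sample, one obviously bimodal one
    normal_data = np.random.normal(loc=0.0, scale=1.0, size=10000)
    bimodal_data = np.concatenate([
        np.random.normal(loc=-5.0, scale=1.0, size=5000),
        np.random.normal(loc=5.0, scale=1.0, size=5000),
    ])

    for name, data in [('normal', normal_data), ('bimodal', bimodal_data)]:
        jb_stat, p_value = jarque_bera(data)
        print('%s: JB statistic = %.1f, p-value = %.4f' % (name, jb_stat, p_value))
        if p_value < 0.05:
            print('  -> reject H0; the data is likely not normal')
        else:
            print('  -> fail to reject H0; no evidence against normality')

Typically the normal sample yields a small JB statistic and a p-value above 0.05, while the bimodal sample yields a very large statistic and a p-value of essentially 0.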

Note that part of the confusion stems from the value of the test statistic itself. The statistic's absolute value is hard to interpret on its own, but it is true that a value far from zero indicates the data is likely not normally distributed. See the wiki for Jarque-Bera. For instance, in my test, the JB value (the first value returned by the jarque_bera function) is 0.6 for the normally distributed data and 3331 for the bimodally distributed data. This is essentially "backwards" compared to its p-value, which could lead to some of the confusion here.

I think we can safely say this means the answers for Lecture 10 are incorrect and need adjusting.

I'm deeply sorry, I made a big mistake: we reject "H0: the data is normally distributed" if p < 0.05; we accept it otherwise.