Test for Normality and Multicollinearity in Probit Models

Joseph Malcontento

Join Date: Apr 2016

Posts: 6
#1


17 Apr 2016, 23:56

Hello! I've been running post-estimation techniques after conducting a probit regression. I have two questions:

1. How can I test for normality? The usual "sktest" only works with an OLS regression. Is there a command that lets me test for normality after running a probit model?

2. I want to test for multicollinearity in my probit model but, as with the previous question, the "vif" command only works after an OLS regression. I searched the internet and some sources say that I should run an OLS regression and then run VIF. I can disregard the OLS regression results but use the VIF to detect multicollinearity in my OLS model and apply the result to my probit model as well. Can I do this, or is there a more proper way of detecting this issue? I did this, and the VIFs after the OLS regression suggest that multicollinearity is unlikely to be present in my model because they are below the benchmark value of 10. Is it acceptable to say that my probit model using the same set of variables is free from this issue too?
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#2

18 Apr 2016, 01:47

It is de facto impossible to test for normality in a probit model. The residual that should be normally distributed is the difference between the unobserved latent variable and the predicted values. Compare that with linear regression (OLS is the algorithm used to compute the estimates, while linear regression is the model), where the residual is the difference between the observed dependent variable and the predicted values. In essence, the normality assumption governs the functional form relating the explanatory variables to the probability. The common alternative (logit) is so similar that the two are indistinguishable in most datasets.

Multicollinearity is a characteristic of the explanatory variables alone, so it does not matter which model was used to compute the VIF.

---------------------------------
Maarten L. Buis
University of Konstanz
Department of history and sociology
box 40
78457 Konstanz
Germany
http://www.maartenbuis.nl
---------------------------------
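A minimal sketch of the point about VIFs, not taken from any poster's data: it assumes the nlsw88 example dataset that ships with Stata, and the variables union, grade, south and ttl_exp are stand-ins chosen only for illustration. The probit is the model of interest; the linear regression is run purely so that estat vif becomes available.

```stata
* Illustrative only: nlsw88 ships with Stata; the variables are stand-ins
sysuse nlsw88, clear

* The model of interest
probit union grade south ttl_exp

* Same specification by linear regression, restricted to the probit sample
regress union grade south ttl_exp if e(sample)

* Variance inflation factors for the explanatory variables
estat vif
```

The coefficients from the regress step can be ignored here; only the VIFs, which depend on the explanatory variables alone, are of interest.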
Joseph Malcontento

Join Date: Apr 2016

Posts: 6
#3

18 Apr 2016, 06:13

Thank you very much, sir! Besides checking for multicollinearity, using robust standard errors to account for heteroscedasticity, and conducting a link test to check for model misspecification, can you suggest other post-estimation techniques I can employ in my probit model?
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#4

18 Apr 2016, 06:41

Heteroscedasticity is a much more complicated problem in logit/probit models than in linear regression, see e.g.: http://maartenbuis.nl/wp/oddsratio.html . Robust standard errors are not a solution to that problem.

Joao Santos Silva

Join Date: Apr 2014

Posts: 3018
#5

18 Apr 2016, 14:34

Dear Joseph and Maarten,

This may be one of those "cultural" differences, but there are tests for normality in the probit. As Maarten mentions, normality governs the functional form, and the test for normality is just a standard RESET test using squares and cubes. The significance of the first term is a check for skewness and the significance of the second term is a check for excess kurtosis. I can try to dig out the reference if you are interested. It turns out that this is also a test for a particular form of heteroskedasticity.

Best regards,

Joao
James Park

Join Date: May 2017

Posts: 97
#6

28 Mar 2019, 19:55

Originally posted by Joao Santos Silva

Dear Joseph and Maarten,

This may be one of those "cultural" differences, but there are tests for normality in the probit. As Maarten mentions, normality governs the functional form, and the test for normality is just a standard RESET test using squares and cubes. The significance of the first term is a check for skewness and the significance of the second term is a check for excess kurtosis. I can try to dig out the reference if you are interested. It turns out that this is also a test for a particular form of heteroskedasticity.

Best regards,

Joao

Dear Joao,

Could you explain how to test for normality (or heteroskedasticity) after the probit or ivprobit commands?
Joao Santos Silva

Join Date: Apr 2014

Posts: 3018
#7

29 Mar 2019, 05:58

Dear James Park,

Just perform a RESET test with squares and cubes.

Best wishes,

Joao
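For readers who want to see the mechanics, the RESET-type test described above might be sketched as follows. This is an illustration under stated assumptions, not code from the posters: it uses the nlsw88 example dataset that ships with Stata, and the variables union, grade, south and ttl_exp and the names xbhat, xbhat2 and xbhat3 are hypothetical stand-ins.

```stata
* Illustrative only: nlsw88 ships with Stata; the variables are stand-ins
sysuse nlsw88, clear

* Step 1: fit the probit of interest
probit union grade south ttl_exp

* Step 2: linear index (x*b) for the estimation sample, then its powers
predict double xbhat if e(sample), xb
generate double xbhat2 = xbhat^2
generate double xbhat3 = xbhat^3

* Step 3: refit with the squared and cubed index added
probit union grade south ttl_exp xbhat2 xbhat3

* Step 4: joint Wald test that both added terms are zero
test xbhat2 xbhat3
```

A small p-value from the joint test is evidence against the assumed functional form. Following the description earlier in the thread, the individual z statistics on xbhat2 and xbhat3 can be read as checks for skewness and excess kurtosis, respectively.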
Kazim Hashim

Join Date: Jan 2020

Posts: 6
#8

29 Apr 2020, 09:02

Hey guys,

I am also running a probit model and wish to compute VIFs to check for multicollinearity. However, I first wanted to ask whether the probit model assumes no multicollinearity in the first place.

I appreciate any clarity you could provide
Maarten Buis

Join Date: Mar 2014

Posts: 3464
#9

29 Apr 2020, 14:22

Multicollinearity is not a problem (I ignore perfect multicollinearity, as there the problem is usually a logic error by the researcher and not a data problem). Neither linear regression (some people mistakenly call it OLS) nor probit assumes anything about multicollinearity.

With a regression model (linear, probit, logit, or otherwise) you are trying to separate the effects of different variables, and that is harder when the variables move together. However, the standard errors accurately represent this additional uncertainty. So multicollinearity does not invalidate our model. You need to understand your data, and know how much (or little) information is present in your data, but finding multicollinearity does not require you to do anything.

Multicollinearity is just a property of the explanatory variables alone. So you can just do a linear regression of any arbitrary dependent variable on the explanatory variables of interest, and get the VIFs from the post-estimation of those models.
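The "any arbitrary dependent variable" point can be checked with a short sketch. It is only an illustration under assumptions: the nlsw88 example dataset ships with Stata, the explanatory variables grade, south and ttl_exp are stand-ins, and anyy is a hypothetical outcome made of pure noise.

```stata
* Illustrative only: nlsw88 ships with Stata; the variables are stand-ins
sysuse nlsw88, clear
set seed 12345

* An arbitrary outcome: pure noise, unrelated to anything
generate double anyy = rnormal()

* Linear regression on the explanatory variables of interest, then VIFs
regress anyy grade south ttl_exp
estat vif

* By hand for one variable: VIF = 1/(1 - R2) from regressing it on the others
regress ttl_exp grade south if e(sample)
display "VIF for ttl_exp: " 1/(1 - e(r2))
```

The hand computation never touches the outcome, which is why the choice of dependent variable (or of model) does not affect the VIFs, provided the same observations are used.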

Kazim Hashim

Join Date: Jan 2020

Posts: 6
#10

30 Apr 2020, 08:36

Thank you Maarten! This has helped me a lot.
Maria Kaneva

Join Date: Sep 2021

Posts: 3
#11

09 Oct 2024, 00:58

Originally posted by Joao Santos Silva

Dear James Park,

Just perform a RESET test with squares and cubes.

Dear Joao,

My name is Maria and I am interested in the topic of this discussion, namely testing normality in the Heckman model. I used the following code to test normality in the probit part of the model:

heckman zreal age agesq exper expnewsq .... other Xs... if sex==1, select(age mard uni i.badsah richl1 numkids18)

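* RESET normality test for the probit (selection) equation
* Linear prediction from the selection equation
predict pred, xbsel
gen pred2 = pred^2
gen pred3 = pred^3

* Re-estimate with the squared and cubed index added to the selection equation
heckman zreal age agesq exper expnewsq .... other Xs... if sex==1, select(age mard uni i.badsah richl1 numkids18 pred2 pred3)

* Joint Wald test of the two added terms
test pred2 pred3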

where zreal is the real hourly wage. The result was as follows:

test pred2 pred3

( 1) [select]pred2 = 0
( 2) [select]pred3 = 0

chi2( 2) = 16.28
Prob > chi2 = 0.0003

pred2 and pred3 are not significant in the first regression.
I interpret this result as evidence of an incorrect functional form for the probit model. Does it mean the estimates of my Heckman model are biased? How can I improve the model?

Thank you very much in advance!
Maria
Joao Santos Silva

Join Date: Apr 2014

Posts: 3018
#12

09 Oct 2024, 02:27

Dear Maria Kaneva,

Your results suggest that the assumption of normality is not valid, and therefore the conditions for the consistency of Heckman's correction are not met. Unfortunately, that is often the case and the problem is often ignored. It is also unfortunate that there does not seem to be any simple way to fix the problem.

Best wishes and I am sorry for not being more positive,

Joao