Hi all,
This is more of a methodological / econometric question than a technical Stata question.
I have repeated cross-sectional data: thousands of different apprentices enrolling in apprenticeships at different start dates. There is therefore a time component and a cross-sectional component, but each apprentice is observed only once.
My dependent variable is binary, and so is my regressor of interest, CB.
I can include three vectors of fixed effects in my estimation: month, industry and state.
I am aware of the literature on the linear probability model versus logit and probit debate (Wooldridge, 2010, Lewbel et al., 2012, Angrist and Pischke, 2009, Maddala, 1985, Long, 1997, etc.). However, my question concerns the incidental parameters problem, and whether it would apply here, given that I do not have panel data.
I understand that the incidental parameters problem arises because the number of certain parameters (e.g. unit fixed effects) increases with the sample size, while only a fixed number of time periods T is available to estimate each unit FE, and conversely for time FE.
So I guess the question is, does the incidental parameters problem also arise with repeated cross-sectional data?
I am also given to understand that the incidental parameters problem arises solely in nonlinear models, correct? That is, models in which the fixed effects do not get averaged out, so that the log-likelihood function is maximised over every parameter, the fixed-effect estimates are biased, and this bias then propagates to the estimates of the other parameters.
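To make the question concrete, this is roughly how I would write the model. The index notation ($m(i)$, $j(i)$, $s(i)$ and the counts $M$, $J$, $S$) is my own shorthand, not taken from any of the references above, so treat it as a sketch of the setup rather than a result.

```latex
% Apprentice i = 1, ..., N starts in month m(i), industry j(i), state s(i).
% F is the identity for the LPM and the logistic cdf for logit.
\Pr(y_i = 1 \mid CB_i) = F\bigl(\beta\, CB_i + \alpha_{m(i)} + \gamma_{j(i)} + \delta_{s(i)}\bigr)

% Nuisance parameters in my setting: one per month, industry and state.
\#\{\alpha, \gamma, \delta\} = M + J + S

% Panel setting with unit effects \alpha_i and fixed T: one per unit.
\#\{\alpha_i\} = N
```

In the panel case the number of unit effects grows one-for-one with N, and each is estimated from only T observations. In my case, M, J and S would presumably stay fixed (or grow slowly) as more apprentices are added, so each effect is estimated from a growing number of observations. What I am unsure about is whether that is enough to rule out the problem.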
Bottom line: is it possible for the LPM in this case (with repeated cross-sections) to suffer from the incidental parameters problem? Will a logit model suffer from it here?
Code:
input float(y CB Industry) str2 progState float month
0 0 . "AK"  1
0 0 . "AL"  1
0 0 . "FL" 11
1 0 . "FL"  5
0 0 . "FL" 11
0 0 . "FL"  2
0 0 . "FL"  8
1 0 . "FL" 11
0 0 . "FL" 11
1 0 . "FL"  5
0 0 . "FL"  8
0 0 . "FL"  1
0 0 . "FL" 11
1 0 . "FL"  3
0 0 . "FL"  6
0 0 . "FL"  9
1 0 . "FL"  8
0 0 . "FL"  9
0 0 . "FL"  9
1 0 . "FL"  8
1 0 . "FL"  9
0 0 . "FL"  6
1 0 . "FL"  2
1 0 . "FL"  2
1 0 . "FL"  2
0 0 . "FL"  2
0 0 . "FL"  2
0 0 . "FL"  2
0 0 . "FL"  2
0 0 . "FL"  9
end
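For reference, the two specifications I am comparing would look roughly like this. It is only a sketch: it assumes the full dataset is in memory (Industry is missing for every row in the excerpt above, so these commands would drop all of those observations), that Industry holds non-negative integer codes, and state_id is just a name I made up for the numeric version of progState.

```stata
* Sketch only; state_id is a hypothetical variable name
encode progState, generate(state_id)

* How many levels does each set of fixed effects have, relative to N?
foreach v of varlist month Industry state_id {
    quietly tabulate `v'
    display "`v': " r(r) " levels, " r(N) " observations"
}

* Linear probability model with month, industry and state dummies
regress y i.CB i.month i.Industry i.state_id, vce(robust)

* Logit with the same three sets of dummies
logit y i.CB i.month i.Industry i.state_id, vce(robust)

* Average marginal effect of CB, for comparison with the LPM coefficient
margins, dydx(CB)
```

Comparing the number of levels with the number of observations is what I have in mind when asking whether these fixed effects count as incidental parameters.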