
  • Panel data: time-invariant dependent variable

    Hi there,
    I'm fitting a panel-data multinomial logit model (T=37, N=267). The dependent variable is a time-invariant categorical variable (indicating which cluster each individual belongs to), so it varies only across individuals. I have several important time-varying regressors, and the only model I can estimate is a random-effects model using Stata's gsem command. But I'm not sure whether it is correct to fit a panel model to such a dependent variable, that is, whether this model conflicts with classical econometric assumptions.
    In particular, I found a thread on the forum that discussed a similar problem: http://www.statalist.org/forums/forum/general-stata-discussion/general/506473-dependent-variable-does-not-vary-within-a-panel But it does not seem to reach a conclusion.
    The dependent variable (cluster) comes from a time-series model-based clustering, so it is important to look into the time-varying covariates in the panel. May I ask the experts on Statalist to shed more light on my confusion? If it's not correct to fit such a random-effects model, are there any other models I can consider?
    Many many thanks!
    Last edited by Carrie Chao; 04 Jan 2017, 19:10.

  • #2
    You didn't get a quick answer. You may want to look at the FAQ on asking questions and provide your Stata code, Stata output, and example data using dataex.

    I am not an expert in these matters, but if your dv does not vary within panels, then it seems questionable to use multiple time-based observations to explain it. You need to think about the time structure of what is generating your data. If you use all the years of data to generate your dv (e.g., t to t+n), it looks really questionable to explain it as a separate observation with data in t.
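
One quick way to confirm that the dependent variable really is constant within each panel is to count its distinct values per unit. A minimal Python sketch, assuming long-format rows with hypothetical field names `id`, `year`, and `cluster` (the toy values are made up and are not the poster's data):

```python
from collections import defaultdict

# Toy long-format panel; the field names and values are hypothetical.
rows = [
    {"id": 1, "year": 1970, "cluster": "A"},
    {"id": 1, "year": 1971, "cluster": "A"},
    {"id": 2, "year": 1970, "cluster": "B"},
    {"id": 2, "year": 1971, "cluster": "B"},
]

def distinct_values_per_unit(rows, unit_key, var_key):
    """Map each unit to the set of values the variable takes over time."""
    seen = defaultdict(set)
    for row in rows:
        seen[row[unit_key]].add(row[var_key])
    return dict(seen)

values = distinct_values_per_unit(rows, "id", "cluster")
# True only if every unit has exactly one distinct outcome value.
time_invariant = all(len(v) == 1 for v in values.values())
print(time_invariant)
```

If the check passes, the outcome contributes only one distinct value per unit (267 in the original post), no matter how many time periods are stacked under it.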

    You may also run into complex issues when you use one statistical technique to generate a variable and then apply other statistical techniques to the same data to analyze it.


    • #3
      Originally posted by Phil Bromiley
      You didn't get a quick answer. You may want to look at the FAQ on asking questions and provide your Stata code, Stata output, and example data using dataex.

      I am not an expert in these matters, but if your dv does not vary within panels, then it seems questionable to use multiple time-based observations to explain it. You need to think about the time structure of what is generating your data. If you use all the years of data to generate your dv (e.g., t to t+n), it looks really questionable to explain it as a separate observation with data in t.

      You may also run into complex issues when you use one statistical technique to generate a variable and then apply other statistical techniques to the same data to analyze it.

      Thank you so much for your kind help, Phil.
      My dependent variable (cluster) is generated from the time series of GDP growth of the 267 individuals from the 1970s to the 2010s, and the covariates are common explanatory variables for regional GDP growth, such as policy indicators and regional characteristics, over the same period. So I'm using the same period but different data in the two steps. I want to prove the model is correct, but I don't know how. The estimated results do make economic sense, so I don't want to give up this model setting.


      • #4
        I'm still having trouble understanding what you want to do. You talk about GDP and regional GDP but then refer to individuals. Gross Domestic Product (which is the normal meaning of GDP) as I know it is a macroeconomic variable, not an individual-level variable. You seem to be crossing levels of analysis without being careful.

        You can't cluster on GDP growth over 40 years and then use annual data to explain it. At a minimum, you'd be explaining a variable that depends on much earlier data with much later data. There is also a high probability of endogeneity problems - GDP growth in the 1980s probably influences many of your explanatory variables in the 1990s. I suppose you could aggregate all your variables over the 40 years, giving you an N of 267. But this throws away an immense amount of information.
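
The aggregation mentioned above (one row per unit, N = 267) can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the field names `id`, `cluster`, `policy`, and `invest` are hypothetical, the toy values are made up, and averaging over time is just one possible way to aggregate.

```python
from collections import defaultdict
from statistics import mean

# Toy long-format panel; field names and values are hypothetical.
rows = [
    {"id": 1, "year": 1970, "cluster": "A", "policy": 0.0, "invest": 2.0},
    {"id": 1, "year": 1971, "cluster": "A", "policy": 1.0, "invest": 4.0},
    {"id": 2, "year": 1970, "cluster": "B", "policy": 0.0, "invest": 1.0},
    {"id": 2, "year": 1971, "cluster": "B", "policy": 0.0, "invest": 3.0},
]

def collapse_to_units(rows, unit_key, outcome_key, covariates):
    """Return one row per unit: the constant outcome plus covariate means."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row[unit_key]].append(row)
    collapsed = []
    for unit, unit_rows in sorted(grouped.items()):
        outcomes = {r[outcome_key] for r in unit_rows}
        if len(outcomes) != 1:
            # The outcome is supposed to be time-invariant within a unit.
            raise ValueError(f"outcome varies within unit {unit}")
        out = {unit_key: unit, outcome_key: outcomes.pop()}
        for cov in covariates:
            out[f"{cov}_mean"] = mean(r[cov] for r in unit_rows)
        collapsed.append(out)
    return collapsed

cross_section = collapse_to_units(rows, "id", "cluster", ["policy", "invest"])
for row in cross_section:
    print(row)
```

The collapsed rows could then be passed to an ordinary cross-sectional multinomial logit (Stata's `mlogit`, for instance). As noted above, this discards all within-unit variation over time.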

        I don't know what "individuals" means here. If it is just however many units you happen to observe, is your sampling strategy legitimate?

        Try to explain more clearly exactly what you're doing. First, what is the observation unit? Despite this being the second round, I still don't even know this basic fact. Provide your Stata code and Stata output. (Look at the FAQ on asking questions.)

        By the way, you are posing the wrong problem. If you really want to "prove the model is correct", rather than test the model or estimate the parameters of the model, you are doing bad research. You're starting with the conclusion you want and working backward, rather than starting with the question and seeing what the data tell you about that question.
