  • Running Multinomial Probit on unbalanced Panel data

    I have unbalanced panel data on household cooking energy, with total observations equal to 1762 as follows:
                Year
    Count   2010   2013   2016   2019
    1         62     12     17    422
    2         57     55     20      8
    3        307    308    302      7
    4         46     46     46     46
    Total    472    421    385    483
    I would like to use the multinomial Probit to analyze the choice determinants in STATA 16 (I have 4 alternatives). I set my time variable, Year. First, I need to know whether I can go ahead and use this unbalanced data, or whether I should use only the households with count==3, as that is the largest group available. I also thought of analyzing only the count==4 data (46 households, 184 observations), just to be able to use the most recent data, but I am not sure whether this will be enough for estimation.

    Also, I haven't seemed to find a command for the multinomial Probit on panel data for STATA 16. Please assist with this as well.
    Thank you
    Last edited by Glory Sibale; 28 May 2022, 00:02.

  • #2
    I would like to use the multinomial Probit to analyze the choice determinants in STATA 16 (I have 4 alternatives).

    In the case of binary choice, probit and logit give comparable results and there is little to choose between them. In the multinomial case, in which there are \(m > 2\) possible responses, probit is more complicated and computationally demanding, because the choice probabilities are multivariate normal integrals. With \(m > 3\), estimation quickly becomes impractical. Therefore, I would advise that you switch to multinomial logit. The command is mlogit and it accommodates missing values.

    Code:
    help mlogit
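
    As a rough illustration of the pooled approach, the sketch below fits a multinomial logit with four alternatives and clusters the standard errors on the household. The data are simulated, and every variable name (hhid, year, fuel, income, hhsize) is an assumption made for the example, not taken from the original dataset. Run it from a do-file.

```stata
* Toy data only: hhid, year, fuel, income and hhsize are hypothetical names
clear
set seed 12345
set obs 200
generate hhid = _n
expand 4                                          // four survey waves per household
bysort hhid: generate year = 2010 + 3*(_n - 1)    // 2010, 2013, 2016, 2019
drop if runiform() < 0.2                          // drop some household-years to mimic gaps
generate income = rnormal()
generate hhsize = 1 + floor(6*runiform())
generate fuel   = 1 + floor(4*runiform())         // four cooking-fuel alternatives, coded 1-4

* Pooled multinomial logit, alternative 1 as base,
* standard errors clustered on the household identifier
mlogit fuel income hhsize i.year, baseoutcome(1) vce(cluster hhid)
```

    Clustering on the household is one common way to allow for repeated observations on the same unit in a pooled model; it does not model household effects explicitly.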

    Also, I haven't seemed to find a command for the multinomial Probit on panel data for STATA 16.
    Panel multinomial logit was introduced in Stata 17 as the command xtmlogit. See https://www.stata.com/manuals/xtxtmlogit.pdf. Prior to that, you could use gsem as outlined in https://www.stata.com/stata-news/news29-2/xtmlogit/.
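
    A minimal sketch of both routes follows, on simulated data with household-specific tastes. All variable names (hhid, year, fuel, income) and the latent-variable names RI2, RI3, RI4 are assumptions for illustration; the linked article remains the reference for the gsem specification. With three random intercepts the gsem fit may be slow. Run it from a do-file.

```stata
* Toy random-effects multinomial logit data (all names hypothetical)
clear
set seed 12345
set obs 150
generate hhid = _n
generate u2 = rnormal()              // household tastes for alternatives 2-4
generate u3 = rnormal()              // (alternative 1 is the base)
generate u4 = rnormal()
expand 4
bysort hhid: generate year = 2010 + 3*(_n - 1)
generate income = rnormal()
* Utilities with Gumbel errors, drawn as -ln(-ln(U))
generate v1 = -ln(-ln(runiform()))
generate v2 =  0.5*income + u2 - ln(-ln(runiform()))
generate v3 = -0.5*income + u3 - ln(-ln(runiform()))
generate v4 =  0.2*income + u4 - ln(-ln(runiform()))
generate fuel = 1
replace fuel = 2 if v2 == max(v1, v2, v3, v4)
replace fuel = 3 if v3 == max(v1, v2, v3, v4)
replace fuel = 4 if v4 == max(v1, v2, v3, v4)

* Stata 16 route: gsem with one household-level random intercept per
* non-base alternative (latent names must start with a capital letter)
gsem (2.fuel <- income RI2[hhid]) ///
     (3.fuel <- income RI3[hhid]) ///
     (4.fuel <- income RI4[hhid]), mlogit

* Stata 17 or later: the dedicated panel command
if c(stata_version) >= 17 {
    xtset hhid year, delta(3)
    xtmlogit fuel income, re baseoutcome(1)
}
```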


    I have unbalanced panel data on household cooking energy, with total observations equal to 1762 as follows:
    Year
    Count 2010 2013 2016 2019
    Note that if your variables are collected at 3-year intervals, the gaps between survey years do not by themselves make the panel unbalanced in the strict sense of the term (which refers to missing observations within the available time periods). What you have is a time unit of 3 years, which you can specify when you xtset the data:

    Code:
    xtset hhid year, delta(3)
    where "hhid" is the household identifier.
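
    To see how many waves each household actually appears in, something along these lines may help. It is a sketch on simulated data; hhid, year and nwaves are hypothetical names, and the random attrition is only there to produce gaps. Run it from a do-file.

```stata
* Toy panel with random attrition (all names hypothetical)
clear
set seed 2022
set obs 100
generate hhid = _n
expand 4
bysort hhid: generate year = 2010 + 3*(_n - 1)   // waves 2010, 2013, 2016, 2019
drop if runiform() < 0.25                        // remove some household-years

xtset hhid year, delta(3)
xtdescribe                                       // participation patterns across waves

bysort hhid: generate nwaves = _N                // number of waves per household
tabulate nwaves year                             // waves-by-year cross-tabulation

* To restrict an analysis to households observed in every wave:
* keep if nwaves == 4
```

    The final tabulate gives a count-by-year layout like the table in #1, which makes it easy to check how much of the sample a balanced subsample would discard.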
    Last edited by Andrew Musau; 28 May 2022, 06:41.


    • #3
      Thank you very much, Andrew Musau, for your response. I will switch to logit as you advise. Thank you for the links too. Concerning the data, can I still run an estimation on panel data in which only 46 households have complete data? From my table, only 46 households participated in every wave from 2010 to 2019. I read the article about the percentage of available data, but it was still not clear to me whether my 46 households will be sufficient. Please advise, and thank you once again.


      • #4
        Originally posted by Glory Sibale
        Concerning the data, can I still run an estimation on panel data that has only 46 households with whole data? From my table, only 46 households participated from 2010 to 2019.
        Yes you can, there is no issue with that.


        • #5
          Originally posted by Andrew Musau
          Yes you can, there is no issue with that.
          Thank you very much Sir.