  • panel data - panel id

    Hi,

    I have an unbalanced panel dataset (N=3836, T=13), using survey responses.
    My dependent variable is the household's ability to save (saving=1 if able to save, 0 otherwise).
    hhid is the Household's unique identifier.
    nomem is the household member's number within the household (e.g. the first member to participate in the survey is nomem=1; if another member of that household participates, they are nomem=2, and so on).

    For some households (e.g. hhid=979), just one household member has participated in the survey:
    Code:
    hhid  nomem  year
    979    1    2004
    979    1    2005
    979    1    2006
    979    1    2008
    979    1    2009
    979    1    2010
    However, in other households (e.g. hhid=986), member 1 dropped out after 2014 and member 2 participated in 2016.
    Code:
    hhid  nomem  year
    986    1    2004
    986    1    2005
    986    1    2006
    986    1    2007
    986    1    2008
    986    1    2009
    986    1    2010
    986    1    2011
    986    1    2012
    986    1    2014
    986    2    2016
    When I try to run the regression, I understandably get an error message:
    Code:
    . xtprobit saving age income, re vce(cluster hhid nomem) nolog
    panels are not nested within clusters
    r(498);
    I think dropping observations if nomem is not 1 would be a poor choice because this would bias my sample.

    I am unsure of how to run the regression though. Is there a simple way perhaps to combine hhid and nomem, to create a unique identifier for each household member? E.g. when hhid=986 and nomem=2, this can be combined to be hhid+nomem?

    Or is this idea not the right way to go about this?

    Thanks in advance for your suggestions
    Last edited by Rose Simmons; 06 Apr 2017, 11:49.

  • #2
    Upon reflection, I wonder whether the following would be more suitable than simply adding hhid and nomem, since a plain sum could coincide with another household's ID.
    Would the following be appropriate?
    Code:
    gen id=(hhid*100)+nomem
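
A minimal sketch of how this arithmetic approach could be checked before relying on it. The toy data below only mirror a few rows from the two households shown earlier, and the check that nomem stays below 100 is an assumption about the survey, not something stated in the thread. Storing the result as long is a precaution, because Stata's default float type represents integers exactly only up to 16,777,216.

```stata
* Toy data based on households 979 and 986 from the question (illustrative only)
clear
input long hhid byte nomem int year
979 1 2004
979 1 2005
986 1 2014
986 2 2016
end

* Assumption: no household has more than 99 members;
* otherwise hhid*100 + nomem could give two people the same id
assert inrange(nomem, 1, 99)

* Store as long so large household numbers are not rounded
generate long id = hhid*100 + nomem

* Stops with an error unless id and year jointly identify each observation
isid id year

list hhid nomem year id, sepby(hhid)
```

If isid stops with an error, some person-year appears more than once and the identifier needs another look.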



    • #3
      I would just use a composite here:

      Code:
      egen id = group(hhid nomem), label
      For discussion at excruciating length see http://www.stata-journal.com/sjpdf.h...iclenum=dm0034

      http://www.stata.com/support/faqs/da...p-identifiers/
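
The suggestion above could be tried out along the following lines. This is only a sketch on toy data copied from the question. The commented-out regression at the end is an assumption about how the model might then be specified, with each person as the panel and the household as the cluster. It is not a command confirmed anywhere in this thread.

```stata
* Toy data based on households 979 and 986 from the question (illustrative only)
clear
input long hhid byte nomem int year
979 1 2004
979 1 2005
986 1 2012
986 1 2014
986 2 2016
end

* One identifier per distinct (hhid, nomem) pair, numbered 1, 2, 3, ...
* The label option attaches the original values as value labels
egen id = group(hhid nomem), label

* Declare each person as a panel; gaps in year are allowed
xtset id year

* Every person must belong to exactly one household for
* panels to be nested within household clusters
bysort id (hhid): assert hhid[1] == hhid[_N]

* Assumed specification with the poster's variables (not run here):
* xtprobit saving age income, re vce(cluster hhid) nolog
```

Because group() assigns consecutive integers, it avoids the overlap and storage-precision questions that arise when an identifier is built by arithmetic.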



      • #4
        Thank you for your help, Nick Cox.
