
  • Within Variation with nested data

    Dear Stata Community,

    I am trying to figure out how much within variation there is in one of my variables (essentially what the xttab command reports). My data are jobs nested within people over time (see the data below), so to declare a panel with xtset I have to use a panel identifier that uniquely combines the individual and the job. But xttab with that identifier shows the variation within jobs, not within individuals. I want to know how many people have held only jobs for which the contact variable always takes the same value (0 or 1), as opposed to people who have held both kinds of jobs, some with contact=0 and some with contact=1 (the contact variable does not change for a single job over time).

    Can somebody help me out with this? Thanks so much.



    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int(caseid year) float(jobnumber contact)
     4 1993  8 0
     4 1996  8 0
     4 1996  9 0
     4 1998  9 0
     6 1996 12 0
     7 1993  8 1
     7 1994  8 1
     7 1996  9 0
     7 1998  9 0
     7 2000  9 0
     7 2002  9 0
     7 2004  9 0
     8 1988  8 0
     8 1989  8 0
     8 1990  8 0
     8 1991  8 0
     8 1992  8 0
     8 1993  8 0
     8 1994  8 0
     8 1996  8 0
     8 1996  9 1
     8 1998  8 0
     8 1998  9 1
     8 2000  8 0
     8 2000  9 1
     8 2002  8 0
     8 2004  8 0
     8 2006  8 0
     9 1994  5 1
     9 1996  5 1
     9 1998  5 1
     9 2000  5 1
     9 2002  5 1
     9 2004  5 1
     9 2006  5 1
     9 2008  5 1
     9 2010  5 1
     9 2012  5 1
     9 2014  5 1
    10 1994  9 1
    11 1993 11 1
    11 1994 11 1
    11 1996 11 1
    11 1998 12 1
    11 1998 13 1
    11 2000 13 1
    11 2000 14 0
    11 2002 14 0
    12 1982  6 0
    12 1983  6 0
    12 1984  6 0
    12 1985  6 0
    12 1993  6 0
    12 1994  6 0
    13 1994  7 0
    13 1994  8 0
    13 1996  7 0
    13 1998  7 0
    13 2000  7 0
    13 2002  7 0
    14 1990 15 0
    14 1991 15 0
    14 1992 15 0
    14 1993 15 0
    14 1994 15 0
    14 1994 17 1
    14 1994 18 0
    14 1996 17 1
    14 1996 18 0
    14 1996 19 1
    14 2000 21 0
    14 2002 21 0
    14 2004 21 0
    14 2006 21 0
    14 2008 21 0
    15 1993  9 0
    15 1994  9 0
    15 1996  9 1
    15 1998  9 0
    end

  • #2
    I think this does what you want:
    Code:
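    * Sort by contact within person; contact varies if the first and last values differ
    by caseid (contact), sort: gen byte contact_varies = (contact[1] != contact[_N])
    * Flag one observation per person so that each person is counted once
    by caseid (year jobnumber), sort: gen flag = (_n == 1)
    tab contact_varies if flag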
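
    An equivalent route, sketched here on a few rows taken from the example in #1, compares each person's minimum and maximum of contact. It first checks the premise that contact is constant within a job. The variable names job_varies, contact_min, contact_max, contact_varies2 and person_tag are made up for illustration and are not part of the original data or of the code above.

    ```stata
    * Self-contained sketch on a few rows of the posted example data
    clear
    input int(caseid year) float(jobnumber contact)
     4 1993  8 0
     4 1996  9 0
     7 1993  8 1
     7 1996  9 0
     9 1994  5 1
     9 1996  5 1
    15 1994  9 0
    15 1996  9 1
    end

    * Step 1: check the premise that contact is constant within a job
    bysort caseid jobnumber (contact): gen byte job_varies = contact[1] != contact[_N]
    list caseid jobnumber year contact if job_varies, sepby(caseid)

    * Step 2: person-level indicator from the minimum and maximum of contact
    bysort caseid: egen byte contact_min = min(contact)
    bysort caseid: egen byte contact_max = max(contact)
    gen byte contact_varies2 = contact_min != contact_max

    * Tag one observation per person so that each person is counted once
    egen byte person_tag = tag(caseid)
    tab contact_varies2 if person_tag
    ```

    Two things to keep in mind. In the data as posted, caseid 15 appears with both contact values for job 9, so that person is flagged as varying although only one job is involved; Step 1 lists such cases. Also, egen's min() and max() ignore missing values, whereas sorting on contact places missing values last, so the two approaches can disagree if contact is ever missing.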

    • #3
      Thanks Clyde. Much appreciated.
