
  • generating marital history variable

    Hi,

    I wanted to generate a marital history variable in Stata. The dataset includes identifiers (pidlink), marital status (marstat), number of marriages (kwn_num), year of marriage (kw10yr), and year of separation (kw18yr). As an example, considering highlighted rows for the same person who married in 1975, separated in 2007, and remarried in 2009 until 2014, I aim to create a dummy variable set to 1 during marriage periods. Specifically, for this individual, the dummy variable should be 1 from 1988 to 2007, 0 in 2008, and 1 again from 2009 to 2014. Assistance with coding in Stata would be highly appreciated. Thank you!

    [Image: Untitled.png]



  • #2
    Please provide a data example using the dataex command (see FAQ Advice #12 for details). You may e.g. copy and paste the result of

    Code:
    dataex pidlink year age marstat kwn_num kw10yr kw18yr if pidlink=="004030001"



    • #3
      Hi Andrew,

      Thanks for referring me to the FAQ Advice. Here's the data example:

      Code:
      clear
      input str10 pidlink float year double(age marstat kwn_num kw10yr kw18yr)
      "004030001" 2014 62 2 2 1975 2007
      "004030001" 2014 62 2 1 2009 .
      end

      Thanks once again!



      • #4
        I believe the following will do it:
        Code:
        //    EXPAND DATA TO YEARLY OBSERVATIONS
        gen `c(obs_t)' obs_no = _n
        gen int from = kw10yr
        gen int to = min(kw18yr, year)
        expand to-from+1
        by obs_no, sort: gen yr = from + _n - 1
        gen byte wanted = (marstat == 2)
        egen `c(obs_t)' numeric_id = group(pidlink)
        
        //    CARRY FORWARD ANY VARIABLES OTHER THAN wanted TO FILL IN GAPS NOT COVERED BY DATA
        xtset numeric_id yr
        tsfill
        order pidlink, last
        ds numeric_id yr wanted, not
        foreach v of varlist `r(varlist)' {
            capture confirm numeric var `v', exact
            if c(rc) == 0 {
                replace `v' = L1.`v' if missing(pidlink)
            }
            else {
                by numeric_id (yr), sort: replace `v' = `v'[_n-1] if missing(`v')
            }
        }
        
        //    FILL IN WANTED DEPENDING ON WHETHER kw18yr WAS MISSING OR NOT PRIOR TO THE GAP
        by numeric_id (yr): replace wanted = missing(L1.kw18yr) if missing(wanted) & !missing(L1.wanted)

        Note: Here, wanted is set to 0 in 2008 because kw18yr records that the first marriage ended in 2007. If a gap in the data follows a year in which kw18yr is missing, this code will presume that the person remained married throughout that gap. I imagine this is what you intend, but it does presume that the person did not become unmarried during the gap without our knowing about it.
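
        As a rough check, assuming the example data from #3 and that the code above ran without error, something like the following should confirm the pattern for that one person. Note that the expanded data start at kw10yr (1975 here). The asserts are specific to this example and would need adapting before being run on the full dataset.

        Code:
        //    SANITY CHECK FOR THE SINGLE-PERSON EXAMPLE ONLY (ASSUMES THE DATA FROM #3)
        assert wanted == 1 if inrange(yr, 1975, 2007)
        assert wanted == 0 if yr == 2008
        assert wanted == 1 if inrange(yr, 2009, 2014)
        //    EYEBALL THE YEARS AROUND THE GAP
        list pidlink yr wanted if inrange(yr, 2006, 2010), noobs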
