  • Re-shaping double/triple wide with non-constant indices

    Hello,

    I am working with data that include a set of variables for each member of a household (HH). The variables are indexed in the following manner:

    intern_12m_[i=1-55] --> a variable for whether someone migrated, i indexes the household member number.

    intern_months_[i=1-34]_[j=1-12] --> a variable that lists the months in which the episode occurred; i indexes the member number, and j indexes the member's migration episode number (max 12 because an episode is defined as an absence of at least 1 month).
    Example: intern_months_3_4 lists the months of member 3's 4th migration episode.

    Unfortunately, the data also contain dummies for each member x episode x month combination, and I'd like to drop these because I already get this information from the variable above. These variables have the following format:

    intern_months_[j=1-12]_[i=1-34]_[k=1-12] --> a dummy for whether a member's migration episode occurred in month k.

    I've transformed the data from wide to long with

    Code:
     reshape long name_first_ name_last_ intern_12m_ , i(hhid) j(member)
    and the data are now in the following format (I omitted the remaining intern_months_ variables because they exceeded the dataex limits):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str9 hhid byte(member intern_12m_) str26 intern_months_1_1
    "1--66-1" 1 0 ""
    "1--66-1" 2 0 ""
    "1--66-1" 3 0 ""
    "1-1-1"   1 0 ""
    "1-1-1"   2 0 ""
    "1-1-1"   3 1 ""
    "1-1-1"   4 0 ""
    "1-1-2"   1 0 ""
    "1-1-2"   2 0 ""
    "1-1-2"   3 0 ""
    "1-1-2"   4 0 ""
    "1-1-2"   5 0 ""
    "1-1-2"   6 0 ""
    "1-1-3"   1 0 ""
    "1-1-3"   2 0 ""
    "1-1-3"   3 0 ""
    "1-1-3"   4 0 ""
    "1-1-3"   5 1 ""
    "1-1-3"   6 0 ""
    "1-1-3"   7 0 ""
    end
    ------------------ copy up to and including the previous line ------------------

    Now I want to reshape the intern_months_[i=1-34]_[j=1-12] variables to long format as well, after first dropping the intern_months_[j=1-12]_[i=1-34]_[k=1-12] dummy variables. I don't know how to reference these because the variables do not exist for every combination of the indices. For example, the maximum number of episodes is 7 for the 1st household member and 6 for the 2nd member, so doing something like:

    Code:
    forvalues x = 1/12 {
        forvalues y = 1/34 {
            forvalues z = 1/12 {
                drop intern_months_`x'_`y'_`z'
            }
        }
    }
    results in error messages that certain variables are not found. Is there another way to approach this transformation?
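
    For reference, here is a minimal sketch of two directions that might work. It has not been run on the full data. It assumes the only variables with three indices after the intern_months_ prefix are the dummies to be dropped. The toy variables below are made up to mimic the naming pattern, and the wildcard intern_months_*_*_* should only match names with at least two further underscores after the prefix, so intern_months_1_1 should be left alone. Running -ds- first shows what would be dropped.

```stata
* Toy data standing in for the real file (names follow the pattern above; values are made up)
clear
set obs 2
generate str9 hhid = "1-1-1"
generate byte member = _n
generate str26 intern_months_1_1 = ""
generate byte intern_months_1_1_1 = 0
generate byte intern_months_1_1_2 = 0
generate byte intern_months_2_1_5 = 0

* Option 1: list what the wildcard matches, then drop only variables that exist
ds intern_months_*_*_*
drop intern_months_*_*_*

* Option 2: keep the nested loop, but let -capture- suppress the
* "variable not found" error for combinations that do not exist
forvalues x = 1/12 {
    forvalues y = 1/34 {
        forvalues z = 1/12 {
            capture drop intern_months_`x'_`y'_`z'
        }
    }
}
```

    The wildcard version avoids looping over combinations that are not in the data, while the -capture- version keeps the original loop structure. Either way, it is probably worth checking the -ds- output before dropping anything.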

    Thank you.
