  • Observe change between first two waves of panel data while held constant over following years

    Dear Statalist,

    I am using four years of a panel data set (waves 11-14). I am trying to capture change in my focal independent variable (job status) between the first two years (waves 11-12), while holding that change constant over the following years, to observe the lasting effect that change has on my dependent variable (satisfaction). Also, I only want to keep observations that maintain the same category of job status after wave 12. Basically, I want to observe respondents who may or may not have changed job status between the first two years, but who do not experience any change in job status in the years thereafter.

    I have set my data set as time series:

    tsset pid wave
    tsspell, cond(wave> 10 & wave < 15)
    by pid: egen maxrun = max(_seq)
    gen wave11_14 = maxrun if maxrun==4
    tab maxrun
    keep if wave11_14==4


    I cannot figure out if my issue would best be resolved through coding (i.e. generating a variable for change in job status) or through Time Series tools (i.e. using some sort of tsspell specification)?
    I intend to use a fixed effects regression, which was favored over a random effects model, per the results of a Hausman test. However, I'm not sure if using another model, or lag variable, would be more appropriate to solve my issue?

    I am new to the Stata forum, so I hope this is an appropriate question to ask here. Furthermore, I hope I've made my question clear. Please let me know if I should provide additional information.

    Thank you in advance for any assistance or insight you can provide!

    Best,
    Wyatt


  • #2
    It is difficult to write real code for imaginary data. Descriptions are never sufficient. Please show an example of your data, and be sure to use the -dataex- command to do that.

    If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.


    • #3
      Dear Clyde,

      Thank you for your feedback and for helping me understand the best way to utilize this forum! I apologize for not knowing to include a sample of my data. Below is data for four respondents with four waves each. I have manually edited the data (I hope this is okay) so that the first and last respondents represent who I mean to capture, as they have a change in job status between waves 11 and 12, and they also maintain that status over the following three years. These are the type of respondents I aim to capture; I would drop the 2nd and 3rd respondents.

      I am somewhat new to stats and to Stata, so please pardon my ignorance for not knowing the best way to handle this problem. Thank you again for your help!

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input byte(losoverall p_job_status wave)
      4 1 11
      3 3 12
      3 3 13
      3 3 14
      2 3 11
      3 . 12
      3 3 13
      3 3 14
      3 1 11
      3 . 12
      3 3 13
      3 3 14
      3 2 11
      2 3 12
      2 3 13
      3 3 14
      end
      label var losoverall "Overall Life Satisfaction" 
      label var p_job_status "Job Status" 
      label var wave "Wave of Survey"

      Best,
      Wyatt


      • #4
        In order to do this you need to have a variable that identifies respondents. I've modified your data example to include such a variable, which I call id. If your data set doesn't have one, you need to create it.

        I notice that in your data example, every id has exactly four observations, one for each wave 11, 12, 13, and 14. If this is true in your whole data set, it greatly simplifies the code, so I assume that it's true and include a couple of -assert- statements that verify that assumption. If those assumptions are not true, then the code will break on one of the -asserts- and it won't blunder on and give you wrong results. Then you have to decide whether there is a problem with your data, or if that assumption is simply not warranted. If the latter, post back for code that does not require those assumptions.

        Code:
        clear
        input byte(losoverall p_job_status wave) float id
        4 1 11 1
        3 3 12 1
        3 3 13 1
        3 3 14 1
        2 3 11 2
        3 . 12 2
        3 3 13 2
        3 3 14 2
        3 1 11 3
        3 . 12 3
        3 3 13 3
        3 3 14 3
        3 2 11 4
        2 3 12 4
        2 3 13 4
        3 3 14 4
        end
        
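        * Verify one observation per id-wave pair, and sort by id and wave
        isid id wave, sort
        * Verify that every id has exactly waves 11, 12, 13, and 14
        by id (wave): assert wave == 10 + _n
        by id (wave): assert _N == 4
        
        * 1 if job status differs between waves 11 and 12
        by id (wave): gen byte has_change = (p_job_status[2] != p_job_status[1])
        * 1 if job status in waves 13 and 14 equals wave 12 (waves 11-12 are set to 1)
        by id (wave): gen byte sustained = cond(_n > 2, p_job_status == p_job_status[2], 1)
        * Spread the minimum of sustained to all of an id's observations
        by id (sustained), sort: replace sustained = sustained[1]
        isid id wave, sort
        keep if has_change & sustained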
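
        One caveat on missing values, offered as a sketch under stated assumptions rather than as part of the code above: Stata treats missing (.) as a value in comparisons, so . == . is true and . != 1 is true. A respondent whose job status is missing from wave 12 onward would therefore count as both changed and sustained. The sketch below assumes such respondents should be dropped. It adds a hypothetical id 5 to the example data to show the case, and n_miss is a variable name introduced only for illustration.

        ```stata
        * Sketch only: example data from above plus a hypothetical id 5
        * whose job status is missing in waves 12-14.
        clear
        input byte(losoverall p_job_status wave) float id
        4 1 11 1
        3 3 12 1
        3 3 13 1
        3 3 14 1
        2 3 11 2
        3 . 12 2
        3 3 13 2
        3 3 14 2
        3 1 11 3
        3 . 12 3
        3 3 13 3
        3 3 14 3
        3 2 11 4
        2 3 12 4
        2 3 13 4
        3 3 14 4
        3 1 11 5
        3 . 12 5
        3 . 13 5
        3 . 14 5
        end

        isid id wave, sort
        by id (wave): assert wave == 10 + _n
        by id (wave): assert _N == 4

        * Assumption: any missing job status disqualifies a respondent.
        * Count the waves with missing job status for each id.
        by id: egen n_miss = total(missing(p_job_status))

        * Same two indicators as in the code above
        by id (wave): gen byte has_change = (p_job_status[2] != p_job_status[1])
        by id (wave): gen byte sustained = cond(_n > 2, p_job_status == p_job_status[2], 1)
        by id (sustained), sort: replace sustained = sustained[1]
        isid id wave, sort

        * Additionally require complete job-status information
        keep if has_change & sustained & n_miss == 0
        ```

        With this extended example, ids 1 and 4 remain. The hypothetical id 5 passes both has_change and sustained and is removed only by the n_miss condition, so whether to add that condition depends on how missing job status should be treated in the real data.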


        • #5
          Dear Clyde,

          This worked perfectly, thank you very much!

          Much appreciation,
          Wyatt
