Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Convert data - Merge

    Hello all

    I've monthly cross-sectional data - only for some months for 2020 and 2021( April 2020, May 2020, June 2020, July 2020, Sep 2020, Nov 2020, Jan 2021, March 2021, and Sep 2021). I would like to convert the data to yearly. Is this possible with keeping the variable "wah" is a categorical variable because this is my interest variable and needs to be a categorical variable, not a mean
    I want to convert to yearly because I have other data from 2019 to 2021 yearly, and I need to merge both datasets. I think the best way to merge these two datasets is by converting the monthly to yearly and merging them. Can my idea work? If not, could you please suggest any way to merge both data?

    I will provide a sample of both data in a wide format :

    Monthly data :
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long pidp byte(ca_wah cb_wah cc_wah cd_wah ce_wah cf_wah cg_wah ch_wah ci_wah)
       76165  1  1  1  1  1  1  1  1  3
      280165  4  4  4  4  4  .  .  .  .
      469205  4  .  .  .  4  .  .  .  4
      599765  1  1  1  .  .  .  .  . -8
      732365 -8 -8 -8  . -8 -8 -8 -8 -8
     1587125  3  2  2  2  2  3  3  3  3
     3424485 -8 -8  .  .  .  .  . -8 -8
     4849085  4  3  2  3  3  2  1  1  2
    68002725  . -8 -8 -8 -8 -8 -8 -8 -8
    68008847  4  . -8  4  4  4  4  4  4
    68010887  4  4  4  4  3  3  2  2  .
    68029931  .  .  .  1  .  .  1  .  .
    68031967 -8 -8 -8 -8 -8 -8 -8 -8  .
    68035365 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68035367  1  1  2  3  3  2  2  2  2
    68037411  . -8  4  .  .  .  .  .  .
    68041487  1  1  1  1  1  1  1  1  1
    68041491  4 -8  4  .  .  .  1  .  2
    68042171  1  .  .  .  .  .  .  .  .
    68044207  1  .  .  .  .  .  .  .  .
    68045567  1  1  1  1  1  1  1  1  1
    68046247 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68046251 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68051007  1  2  1  1  1  1  2  1  1
    68051011  4  4  4  4  4  4  4  4  4
    68056463 -8  .  .  .  .  .  .  .  .
    68058487 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68058491 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68060527  1  1  1  .  .  1  .  .  .
    68060531  1  1  1  1  1  1  1  1  1
    68060533 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68060537 -8 -8 -8 -8 -8  . -8 -8 -8
    68061288  4  4  4  .  4  4  .  .  .
    68063247  4  4  4  4  4  4  4  4  4
    68063927  4  4  4  4  4  4  4  4  4
    68063931  4  4  4  .  4  .  4  4  4
    68064605 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68064609 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68068007  4  4  4  4  4  4  4  4  4
    68068011  4  4  4  4  4  4  .  4  4
    68068015  1  1  1  1  .  1  .  .  4
    68068082  4  4  3  3  3  4  4  4  .
    68071407  4  .  .  .  .  .  .  .  .
    68076171 -8  .  .  .  .  .  .  .  .
    68091139  . -8 -8 -8  . -8  .  .  .
    68097245 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68097927 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68106767 -8  .  .  .  .  .  .  .  .
    68111531 -8  .  .  .  .  .  .  2  2
    68112211  4  3  2  2  3  3  .  3  2
    68120367 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68120375  1  1  2  2  2  3  1  2  3
    68121047 -8  .  .  .  .  .  .  . -8
    68125127  4  4  4  4  4  4  4  4  4
    68125131  1  1  1  1  2 -8  2  2  .
    68125135  3  3  2  3  3  4  3  4  2
    68131931  . -8  .  .  .  .  .  .  .
    68133285 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68133289 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68136009  1  1  1  1  1  .  1  1  1
    68137365 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68138045 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68138049 -8 -8 -8 -8 -8  . -8 -8 -8
    68138051 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68141447 -8  .  .  .  .  .  . -8  .
    68142139  4  .  .  .  .  .  .  .  .
    68142890  1  3  .  .  .  .  .  1  2
    68144847  1  1  1  1  1  1  1  1  1
    68144851  1  1  2  2  2 -8 -8 -8 -8
    68148247 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68148251 -8 -8 -8 -8 -8 -8  . -8  .
    68150967  4  4  4  4  4  4  .  4  4
    68150971  1  1  1  1  1  1  1  1  1
    68150975  1  1  1  1  1  1  1  1  1
    68155047 -8 -8 -8 -8 -8 -8 -8  .  .
    68155051  2  2  4  4  4  3 -8 -8  .
    68155055  4  4  4  .  .  .  .  .  .
    68155727  4  .  4  .  .  4  .  4  4
    68155731  1  1  .  1  .  .  .  .  4
    68157166  1  .  .  .  .  .  .  .  .
    68157767  1  4  .  .  .  .  .  .  .
    68157771  1  .  .  .  1  1  1  1  1
    68159131  3  4  4  3  3  3  2  2  2
    68160485  4  4  4  4  4  4  4  4  4
    68160489  1  1  1  1  1  .  1  1  2
    68163887 -8  .  .  .  .  .  .  .  .
    68173407 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68174767 -8  .  . -8 -8 -8  .  . -8
    68180887  4  3  4  4  4  4  4  4  4
    68180891  1  1  1  1  1  1  .  1  .
    68180895  4  .  4  4  . -8  .  .  4
    68184967  2  .  .  1  .  .  .  . -8
    68184971  4  4  4  4  .  .  1  1  4
    68185647 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68187687 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68187691 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68188367  3  3  3  4  4  4  3  3  3
    68191771  1  1  1  1  1  1  1  1  1
    68193127 -8 -8 -8 -8 -8 -8 -8 -8 -8
    68193131  .  .  . -8  .  .  .  .  .
    end
    label values ca_wah ca_wah
    label def ca_wah -8 "inapplicable", modify
    label def ca_wah 1 "Always", modify
    label def ca_wah 2 "Often", modify
    label def ca_wah 3 "Sometimes", modify
    label def ca_wah 4 "Never", modify
    label values cb_wah cb_wah
    label def cb_wah -8 "Inapplicable", modify
    label def cb_wah 1 "Always", modify
    label def cb_wah 2 "Often", modify
    label def cb_wah 3 "Sometimes", modify
    label def cb_wah 4 "Never", modify
    label values cc_wah cc_wah
    label def cc_wah -8 "Inapplicable", modify
    label def cc_wah 1 "Always", modify
    label def cc_wah 2 "Often", modify
    label def cc_wah 3 "Sometimes", modify
    label def cc_wah 4 "Never", modify
    label values cd_wah cd_wah
    label def cd_wah -8 "Inapplicable", modify
    label def cd_wah 1 "Always", modify
    label def cd_wah 2 "Often", modify
    label def cd_wah 3 "Sometimes", modify
    label def cd_wah 4 "Never", modify
    label values ce_wah ce_wah
    label def ce_wah -8 "inapplicable", modify
    label def ce_wah 1 "Always", modify
    label def ce_wah 2 "Often", modify
    label def ce_wah 3 "Sometimes", modify
    label def ce_wah 4 "Never", modify
    label values cf_wah cf_wah
    label def cf_wah -8 "inapplicable", modify
    label def cf_wah 1 "Always", modify
    label def cf_wah 2 "Often", modify
    label def cf_wah 3 "Sometimes", modify
    label def cf_wah 4 "Never", modify
    label values cg_wah cg_wah
    label def cg_wah -8 "inapplicable", modify
    label def cg_wah 1 "Always", modify
    label def cg_wah 2 "Often", modify
    label def cg_wah 3 "Sometimes", modify
    label def cg_wah 4 "Never", modify
    label values ch_wah ch_wah
    label def ch_wah -8 "inapplicable", modify
    label def ch_wah 1 "Always", modify
    label def ch_wah 2 "Often", modify
    label def ch_wah 3 "Sometimes", modify
    label def ch_wah 4 "Never", modify
    label values ci_wah ci_wah
    label def ci_wah -8 "inapplicable", modify
    label def ci_wah 1 "Always", modify
    label def ci_wah 2 "Often", modify
    label def ci_wah 3 "Sometimes", modify
    label def ci_wah 4 "Never", modify

    Yearly data :
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input double pidp int j_dvage byte j_jbsat double j_jbhrs int l_jbsoc10_cc
       22445 33  6   30  -8
       29925 41  7   23 415
       76165 35  5   35 313
      280165 39  5   40  -8
      333205 28  6   40   .
      469205 28  7   16 623
      599765 31  6 36.5 113
      665045 36  4   15 543
      732365 33 -8   -8  -8
      760925 37 -8   -8   .
     1587125 52  3   -8 244
     1697285 45  5 37.5   .
     2626845 40 -7   35   .
     2888645 29  6   38  -8
     3229325 49 -8   -8   .
     3424485 82 -7   -8  -8
     3667245 30  7   40   .
     3705325 64 -8   -8  -8
     3914085  .  .    .   .
     3914765  .  .    .   .
     3916125  .  .    .   .
     4454005 73 -8   -8  -8
     4849085 35  1 37.5 113
     4853165 48  6   37 511
    68002725 64 -8   -8  -8
    68004087 68 -8   -8   .
    68006127 48 -8   -8   .
    68008847 60  7   39 415
    68009527 40  4 36.5 312
    68010887 54  6   32 614
    68011567 44  6 37.5   .
    68014287 49 -8   -8   .
    68014291 25  6   35   .
    68014295 18 -8   -8   .
    68020407 81 -8   -8   .
    68020564 47 -8   -8   .
    68021765 59  7   30   .
    68021781 24  6   35   .
    68028575 27 -8   -8  -8
    68029927 46 -8   -8  -8
    68029931 49  7   37   .
    68029935  .  .    .   .
    68031967 70 -8   -8  -8
    68035365 66 -8   -8  -8
    68035367 37  6   37 242
    68036727 86 -8   -8  -8
    68037407 49  6 37.5 813
    68037411 19  5   37   .
    68041487 48  6   37 354
    68041491 44  6   40  -8
    68041495 17 -8   -8   .
    68041499  .  .    .   .
    68042167 48  3   37 231
    68042171 47  6   40 231
    68043527 64 -8   -8  -8
    68044207 42  4   21  -9
    68044211 45  6   -8  -9
    68044887 71 -8   -8  -8
    68045567 56  6   37 118
    68045571 58 -8   -8  -8
    68046247 75 -8   -8  -8
    68046251 73 -8   -8  -8
    68046927 45  6   37   .
    68046935 18  4   12   .
    68049647 60  6   35 231
    68049651 58  4   38 411
    68051007 57  2 37.5 212
    68051011 50  6 37.5 223
    68056447 55  4   30  -9
    68056451 53 -8   -8 923
    68056455 23  1 32.5 356
    68056459 18 -8   -8  -8
    68056463  .  .    . 911
    68058485 73 -8   -8  -8
    68058487 78 -8   -8  -8
    68058489 73 -8   -8  -8
    68058491 69 -8   -8  -8
    68059171 28  4   -8   .
    68060525 91 -8   -8   .
    68060527 44  6   37 356
    68060531 45  5   37  -9
    68060533 62  5   15   .
    68060537 74 -8   -8   .
    68061288 32  5 22.5 711
    68063247 51 -8   -8 923
    68063251 53  7   39 822
    68063255 18 -8   -8 924
    68063927 48  6   30 223
    68063931 50  4   40 614
    68064605 69 -8   -8  -8
    68064609 66 -8   -8  -8
    68068007 51 -8   -8 913
    68068011 51  3   30  -9
    68068015 26 -8   -8 354
    68068019 21  7 37.5   .
    68068082  .  .    .  -9
    68069367 90 -8   -8  -8
    68071407 29  2   40 544
    68072087 64 -8   -8  -8
    68076167 69 -8   -8  -8
    end
    label values j_dvage j_dvage
    label values j_jbsat j_jbsat
    label def j_jbsat -8 "inapplicable", modify
    label def j_jbsat -7 "proxy", modify
    label def j_jbsat 1 "completely dissatisfied", modify
    label def j_jbsat 2 "mostly dissatisfied", modify
    label def j_jbsat 3 "somewhat dissatisfied", modify
    label def j_jbsat 4 "neither satisfied or dissatisfied", modify
    label def j_jbsat 5 "somewhat satisfied", modify
    label def j_jbsat 6 "mostly satisfied", modify
    label def j_jbsat 7 "completely satisfied", modify
    label values j_jbhrs j_jbhrs
    label def j_jbhrs -8 "inapplicable", modify
    label values l_jbsoc10_cc l_jbsoc10_cc
    label def l_jbsoc10_cc -9 "missing", modify
    label def l_jbsoc10_cc -8 "inapplicable", modify
    label def l_jbsoc10_cc 113 "Functional Managers and Directors", modify
    label def l_jbsoc10_cc 118 "Health and Social Services Managers and Directors", modify
    label def l_jbsoc10_cc 212 "Engineering Professionals", modify
    label def l_jbsoc10_cc 223 "Nursing and Midwifery Professionals", modify
    label def l_jbsoc10_cc 231 "Teaching and Educational Professionals", modify
    label def l_jbsoc10_cc 242 "Business, Research and Administrative Professionals", modify
    label def l_jbsoc10_cc 244 "Welfare Professionals", modify
    label def l_jbsoc10_cc 312 "Draughtspersons and Related Architectural Technicians", modify
    label def l_jbsoc10_cc 313 "Information Technology Technicians", modify
    label def l_jbsoc10_cc 354 "Sales, Marketing and Related Associate Professionals", modify
    label def l_jbsoc10_cc 356 "Public Services and Other Associate Professionals", modify
    label def l_jbsoc10_cc 411 "Administrative Occupations: Government and Related Organisations", modify
    label def l_jbsoc10_cc 415 "Other Administrative Occupations", modify

  • #2
    It is difficult to convert monthly data to yearly. Doing that requires somehow collapsing all of the monthly values for a given year into a single value on the same scale. No matter how you choose to do it, information is being discarded.

    Why not instead combine the data sets by combining each yearly observation with all of the months of that year in the monthly data set? That way no information is discarded.

    Your example data contains no chronology information--you don't show your month or year variables, so I can't illustrate the code for you. Suffice it to say that you will have to extract the year from the monthly date (-yofd()- and -dofm()- functions are useful for this) and then do a m:1 or 1:m (depending on which data set is master and which is using) -merge- on pidp and year to then combine the data sets.

    This is likely to be more useful than crunching the monthly observations into yearly summaries. Moreover, if, in the end, it turns out you really need to aggregate everything up to yearly in some way, you can still do that later in the combined data. By contrast if you aggregate up to yearly now, there is no going back if, as I suspect you will find, that turns out to be a mistake.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      It is difficult to convert monthly data to yearly. Doing that requires somehow collapsing all of the monthly values for a given year into a single value on the same scale. No matter how you choose to do it, information is being discarded.

      Why not instead combine the data sets by combining each yearly observation with all of the months of that year in the monthly data set? That way no information is discarded.

      Your example data contains no chronology information--you don't show your month or year variables, so I can't illustrate the code for you. Suffice it to say that you will have to extract the year from the monthly date (-yofd()- and -dofm()- functions are useful for this) and then do a m:1 or 1:m (depending on which data set is master and which is using) -merge- on pidp and year to then combine the data sets.

      This is likely to be more useful than crunching the monthly observations into yearly summaries. Moreover, if, in the end, it turns out you really need to aggregate everything up to yearly in some way, you can still do that later in the combined data. By contrast if you aggregate up to yearly now, there is no going back if, as I suspect you will find, that turns out to be a mistake.
      Thanks, Clyde, for your clarifying. Now it is clearer to me, but I need your help with coding I am trying to do it, but I need to make sure because I have a time variable, which is the time of the interview. I am not sure how I can generate and link this variable with other data.


      In my Yearly data, the variable wave_num is defining 1 "2019", 2 "2020", 3 "2021", 4 "2022."
      and in Monthly data, the variable time defining 1 "April 2020" 2 "May 2020" 3 "June 2020" 4 "July 2020" 5 "Sep 2020" 6 " Nov 2020" 7 "Jan 2021" 8 "Mar 2021" 9 "Sep 2021."

      This means I can merge the months ( 1,2,3,4,5, and 6) with the year (2) and the rest with year (3)

      This is my sample of both data. Could you help with the code and how to generate a time variable appropriately

      Yearly data:

      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input double pidp float(tq wave_num int_date) double jbhrs int dvage byte jbsat
         22445 231 1 21115   35 33  4
         22445 237 3 21643   45 34  2
         22445 233 2 21277   30 33  6
         22445 241 4 22007    . 35 -8
         29925 230 1 21053   24 40  6
         29925 234 2 21394   23 41  7
         29925 242 4 22166   28 43  5
         29925 238 3 21766   28 42  6
         76165 232 2 21260   35 35  5
         76165 241 4 22006   35 37  5
         76165 236 3 21623   35 36  5
         76165 229 1 20911    7 34  6
        280165 235 2 21520   40 39  5
        280165 239 3 21900   38 40  5
        280165 243 4 22261    . 41 -8
        280165 232 1 21215   40 38  5
        333205 229 1 20978   37 26  6
        333205 237 3 21698   37 29  6
        333205 233 2 21319   40 28  6
        469205 233 2 21363   16 28  7
        469205 241 4 22040   16 30  6
        469205 230 1 21028   42 27  5
        469205 237 3 21677   16 29  6
        599765 232 2 21198 36.5 31  6
        599765 228 1 20845 37.5 30  6
        599765 240 4 21928 36.5 33  6
        599765 236 3 21584 36.5 32  6
        665045 233 2 21305   15 36  4
        665045 241 4 22032   20 38  4
        665045 237 3 21651   20 37  3
        732365 234 2 21426    . 33 -8
        732365 230 1 21073    . 32 -8
        732365 242 4 22156    . 35 -8
        732365 238 3 21804    . 34 -8
        760925 232 2 21246    . 37 -8
        760925 237 3 21642    . 38 -7
       1587125 238 3 21797    . 53  3
       1587125 230 1 21063    . 51  5
       1587125 242 4 22157    . 54  5
       1587125 234 2 21438    . 52  3
       1697285 229 1 20911 37.5 44  5
       1697285 233 2 21282 37.5 45  5
       1833965 231 1 21140 37.5 52  5
       2270525 232 1 21267    . 20 -8
       2626845 236 2 21609   35 40 -7
       2626845 232 1 21209   20 39  7
       2888645 228 1 20879   38 28  6
       2888645 241 4 22032    . 31 -8
       2888645 232 2 21259   38 29  6
       3229325 234 2 21399    . 49 -8
       3229325 238 3 21772    . 50 -8
       3424485 233 2 21295    . 82 -7
       3424485 237 3 21664    . 83 -8
       3424485 241 4 22019    . 84 -8
       3587685 231 1 21130   39 36 -7
       3663845 229 1 20992    . 33  6
       3667245 233 2 21347   40 30  7
       3667245 230 1 21005   40 29  4
       3667245 238 3 21803   40 31  6
       3705325 232 2 21209    . 64 -8
       3705325 228 1 20858    . 63 -8
       3705325 236 3 21588    . 65 -8
       3705325 240 4 21942    . 66 -8
       3914085 239 3 21882   30 57  5
       3914765 239 3 21882    . 56  5
       3916125 239 3 21882    . 21  6
       4454005 242 4 22166    . 75 -8
       4454005 230 1 21081    . 72 -8
       4454005 234 2 21434    . 73 -8
       4454005 238 3 21804    . 74 -8
       4849085 237 3 21645 37.5 36  2
       4849085 233 2 21302 37.5 35  1
       4849085 241 4 22008 37.5 37  3
       4849085 229 1 20921 37.5 34  2
       4853165 244 4 22312   40 50  7
       4853165 240 3 21917   37 49  5
       4853165 231 1 21162   37 47  5
       4853165 236 2 21579   37 48  6
      68002045 229 1 20964    . 74 -8
      68002725 240 4 21976    . 66 -8
      68002725 228 1 20893    . 63 -8
      68002725 236 3 21610    . 65 -8
      68002725 232 2 21238    . 64 -8
      68004087 228 1 20900   35 67  3
      68004087 232 2 21239    . 68 -8
      68006127 236 3 21619    . 49 -8
      68006127 228 1 20890    . 47 -8
      68006127 232 2 21260    . 48 -8
      68006807 228 1 20888    . 80 -8
      68008847 240 4 21982   39 62  7
      68008847 228 1 20882    . 59 -8
      68008847 233 2 21343   39 60  7
      68008847 236 3 21619   39 61  7
      68009527 228 1 20895   37 39  5
      68009527 240 4 21995 36.5 42  5
      68009527 232 2 21246 36.5 40  4
      68009527 236 3 21610   39 41  5
      68010887 228 1 20886   32 53  6
      68010887 236 3 21611   32 55  6
      68010887 232 2 21266   32 54  6
      end
      format %tq tq
      format %td int_date
      label values wave_num waves
      label values jbhrs i_jbhrs
      label values dvage i_dvage
      label values jbsat i_jbsat
      label def i_jbsat -8 "inapplicable", modify
      label def i_jbsat -7 "proxy", modify
      label def i_jbsat 1 "completely dissatisfied", modify
      label def i_jbsat 2 "mostly dissatisfied", modify
      label def i_jbsat 3 "somewhat dissatisfied", modify
      label def i_jbsat 4 "neither satisfied or dissatisfied", modify
      label def i_jbsat 5 "somewhat satisfied", modify
      label def i_jbsat 6 "mostly satisfied", modify
      label def i_jbsat 7 "completely satisfied", modify

      Monthly data:
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long pidp byte(time wah)
         76165 1  1
         76165 2  1
         76165 3  1
         76165 4  1
         76165 5  1
         76165 6  1
         76165 7  1
         76165 8  1
         76165 9  3
        280165 1  4
        280165 2  4
        280165 3  4
        280165 4  4
        280165 5  4
        280165 6  .
        280165 7  .
        280165 8  .
        280165 9  .
        469205 1  4
        469205 2  .
        469205 3  .
        469205 4  .
        469205 5  4
        469205 6  .
        469205 7  .
        469205 8  .
        469205 9  4
        599765 1  1
        599765 2  1
        599765 3  1
        599765 4  .
        599765 5  .
        599765 6  .
        599765 7  .
        599765 8  .
        599765 9 -8
        732365 1 -8
        732365 2 -8
        732365 3 -8
        732365 4  .
        732365 5 -8
        732365 6 -8
        732365 7 -8
        732365 8 -8
        732365 9 -8
       1587125 1  3
       1587125 2  2
       1587125 3  2
       1587125 4  2
       1587125 5  2
       1587125 6  3
       1587125 7  3
       1587125 8  3
       1587125 9  3
       3424485 1 -8
       3424485 2 -8
       3424485 3  .
       3424485 4  .
       3424485 5  .
       3424485 6  .
       3424485 7  .
       3424485 8 -8
       3424485 9 -8
       4849085 1  4
       4849085 2  3
       4849085 3  2
       4849085 4  3
       4849085 5  3
       4849085 6  2
       4849085 7  1
       4849085 8  1
       4849085 9  2
      68002725 1  .
      68002725 2 -8
      68002725 3 -8
      68002725 4 -8
      68002725 5 -8
      68002725 6 -8
      68002725 7 -8
      68002725 8 -8
      68002725 9 -8
      68008847 1  4
      68008847 2  .
      68008847 3 -8
      68008847 4  4
      68008847 5  4
      68008847 6  4
      68008847 7  4
      68008847 8  4
      68008847 9  4
      68010887 1  4
      68010887 2  4
      68010887 3  4
      68010887 4  4
      68010887 5  3
      68010887 6  3
      68010887 7  2
      68010887 8  2
      68010887 9  .
      68029931 1  .
      end
      label values wah ci_wah
      label def ci_wah -8 "inapplicable", modify
      label def ci_wah 1 "Always", modify
      label def ci_wah 2 "Often", modify
      label def ci_wah 3 "Sometimes", modify
      label def ci_wah 4 "Never", modify


      Comment


      • #4
        So these data are a bit strange, and I don't know if that is because they are wrong, or just odd.

        For one thing, your "monthly" data has only years 2020 and 2021 according to what you say in #3. But nearly all of the observations in the "yearly" data are from years 2017-2019. So most of the data will go unmatched in the end.

        Also, the "monthly" data are an unevenly spaced series of "months," not the usual sequence, perhaps with occasional gaps, that one normally means by "monthly."

        Next, there is an anomaly in the yearly data. Pidp 280165 has two observations in year 2018. So this is not true yearly data. I suppose since you have no monthly 2018 data to match this with, there is no real problem. But if the full data set has instances of Pidp's with two or more observations in 2020 or 2021 then your data is unsuitable for combining in this way: how would we know which of the 2020 (or 2021 as the case may be) observations goes with the "monthly" observations where time is in the 1-6 range (resp time > 6).

        So before you even try to proceed, you need to check whether you have this same problem in the years that really matter:
        Code:
        use yearly_dataset
        gen year = yofd(int_date)
        keep if inlist(year, 2020, 2021)
        duplicates tag pidp year, gen(flag)
        browse if flag
        This will show you any observations where the combination of pidp and year repeats for year 2020 or 2021, if there are any. If so, you need to figure out whether those are data errors, or whether this is a real fact about your data. If it is a real fact about your data, then you need to consider the question of what to pair with what and decide on an answer. Only then will it be possible to combine them correctly.

        If you find there are no pidp-year anomalies in 2020 and 2021, then you can do this to combine these data sets:
        Code:
        use monthly_data, clear
        gen year = cond(time <= 6, 2020, 2021) if !missing(time)
        tempfile holding
        save `holding'
        
        use yearly_data, clear
        gen year = yofd(int_date)
        keep if inlist(year, 2020, 2021)
        merge 1:m pidp year using `holding'

        Comment


        • #5
          Originally posted by Clyde Schechter View Post
          For one thing, your "monthly" data has only years 2020 and 2021 according to what you say in #3. But nearly all of the observations in the "yearly" data are from years 2017-2019. So most of the data will go unmatched in the end.
          I think the data sample did not show the rest of the years because my complete data has more than 50000 observations. Therefore, in the full data, I have the years 2020,2021, and 2022.

          Originally posted by Clyde Schechter View Post
          Also, the "monthly" data are an unevenly spaced series of "months," not the usual sequence, perhaps with occasional gaps, that one normally means by "monthly."
          Yes, the institution that collected this data changed the way of collecting data after the fourth month (wave) to every two months instead of each month, and the last wave had a gap of five months, so in this case, I left with some missing months.

          Originally posted by Clyde Schechter View Post
          Next, there is an anomaly in the yearly data. Pidp 280165 has two observations in year 2018. So this is not true yearly data. I suppose since you have no monthly 2018 data to match this with, there is no real problem. But if the full data set has instances of Pidp's with two or more observations in 2020 or 2021 then your data is unsuitable for combining in this way: how would we know which of the 2020 (or 2021 as the case may be) observations goes with the "monthly" observations where time is in the 1-6 range (resp time > 6).
          Yes, I found the combination of pidp and year repeats in all years as well. I think this is not an error in the data because the yearly data contains "4" waves ("1" (2017-2018), "2" (2018-2019), "3" (2020-2021) , and "4" (2021- 2022), so I append these waves on one log file as it showed in the in #3. I think this way, I have repeated individuals because most the individuals appear in all waves.

          In this scenario, what is the best way to combine them correctly?
          Last edited by Hassan zahey; 18 Dec 2023, 17:05.

          Comment


          • #6
            In this scenario, what is the best way to combine them correctly?
            So, with this additional explanation, it seems to me that the characterization of the "yearly" data is misleading and leads us down a false path of reasoning. The "yearly" data actually has an interview date variable that is a complete date. So it seems to me that what you need to do is join the observations in the "yearly" data set to the "monthly" data when the interview date falls in the month of the monthly observation.

            Code:
            use monthly_data, clear
            gen mdate = tm(2020m4) if time == 1
            replace mdate = tm(2020m5) if time == 2
            replace mdate = tm(2020m6) if time == 3
            replace mdate = tm(2020m7) if time == 4
            replace mdate = tm(2020m9) if time == 5
            replace mdate = tm(2020m11) if time == 6
            replace mdate = tm(2021m1) if time == 7
            replace mdate = tm(2021m3) if time == 8
            replace mdate = tm(2021m9) if time == 9
            format mdate %tm
            tempfile holding
            save `holding'
            
            use yearly_data, clear
            gen mdate = mofd(int_date)
            format mdate %tm
            merge 1:m pidp mdate using `holding'
            But really, to keep thinking clearly about how to use this data, I recommend abandoning the terms "yearly" and "monthly," because they are not only not descriptive, they evoke all the wrong connotations. The "yearly" data set is just a series of survey waves with specific interview dates, and the "monthly" data is a series of episodic cross-sectional observations of some other variables.

            Comment


            • #7
              Originally posted by Clyde Schechter View Post
              So, with this additional explanation, it seems to me that the characterization of the "yearly" data is misleading and leads us down a false path of reasoning. The "yearly" data actually has an interview date variable that is a complete date. So it seems to me that what you need to do is join the observations in the "yearly" data set to the "monthly" data when the interview date falls in the month of the monthly observation.

              Code:
              use monthly_data, clear
              gen mdate = tm(2020m4) if time == 1
              replace mdate = tm(2020m5) if time == 2
              replace mdate = tm(2020m6) if time == 3
              replace mdate = tm(2020m7) if time == 4
              replace mdate = tm(2020m9) if time == 5
              replace mdate = tm(2020m11) if time == 6
              replace mdate = tm(2021m1) if time == 7
              replace mdate = tm(2021m3) if time == 8
              replace mdate = tm(2021m9) if time == 9
              format mdate %tm
              tempfile holding
              save `holding'
              
              use yearly_data, clear
              gen mdate = mofd(int_date)
              format mdate %tm
              merge 1:m pidp mdate using `holding'
              But really, to keep thinking clearly about how to use this data, I recommend abandoning the terms "yearly" and "monthly," because they are not only not descriptive, they evoke all the wrong connotations. The "yearly" data set is just a series of survey waves with specific interview dates, and the "monthly" data is a series of episodic cross-sectional observations of some other variables.
              Thanks for your help I merged both data and want to use the DiD model. However, I got an error on the parallel test. I tried to know where my mistake was, but I could not. Could you help with that?

              I want to imply the causal effect of job satisfaction on individuals who work from home pre/post-COVID-19. After I merged the data, I created my treatment/control group, which is Individuals working from home or not.
              Code:
              gen wfh_always1 = (wfh_always==1 | jbfxuse7 ==1)
              gen wfh_never1 = (wfh_never==1 |jbfxuse7 ==0 )
              gen treat_group = 0
              replace treat_group = 1 if (wfh_always1 == 1 | wfh_can1 == 1) 
              gen control_group = 0
              replace control_group = 1 if wfh_never1 == 1
              gen pre_covid = 0
              replace pre_covid =  1 if int_date < date("2020-03-23", "YMD")
              gen post_covid = 0
              replace post_covid = 1 if int_date >= date("2020-03-23", "YMD") 
              
              gen treatment_post = treat_group * post_covid
              Then I apply DiD :
              Code:
              didregress (jbsat) ( treatment_post ), group(pidp) time(int_date)
              estat trendplots
              I got this error "treatment assignment times vary; not allowed with estat trendplots"

              Do you think this is because of a mistake when I merge the data?
              This is my data :
              Code:
              * Example generated by -dataex-. For more info, type help dataex
              clear
              input double pidp byte jbsat float(int_date mdate wfh_always wfh_never wfh_can)
                22445  6 21277 699 . . .
                22445  2 21643 711 . . .
                22445 .b 22007 723 . . .
                29925  7 21394 702 . . .
                29925  6 21766 715 . . .
                29925  5 22166 728 . . .
                76165  5 21260 698 . . .
                76165  5 21623 710 . . .
                76165  5 22006 723 1 0 0
                76165  .     . 724 1 0 0
                76165  .     . 725 1 0 0
                76165  .     . 726 1 0 0
                76165  .     . 728 1 0 0
                76165  .     . 730 1 0 0
                76165  .     . 732 1 0 0
                76165  .     . 734 1 0 0
                76165  .     . 740 0 0 1
               280165  5 21520 707 . . .
               280165  5 21900 719 . . .
               280165  .     . 723 0 1 0
               280165  .     . 724 0 1 0
               280165  .     . 725 0 1 0
               280165  .     . 726 0 1 0
               280165  .     . 728 0 1 0
               280165 .b 22261 731 . . .
               333205  6 21319 700 . . .
               333205  6 21698 712 . . .
               469205  7 21363 701 . . .
               469205  6 21677 712 . . .
               469205  .     . 723 0 1 0
               469205  6 22040 724 . . .
               469205  .     . 728 0 1 0
               469205  .     . 740 0 1 0
               599765  6 21198 696 . . .
               599765  6 21584 709 . . .
               599765  6 21928 720 . . .
               599765  .     . 723 1 0 0
               599765  .     . 724 1 0 0
               599765  .     . 725 1 0 0
               599765  .     . 740 0 0 0
               665045  4 21305 700 . . .
               665045  3 21651 711 . . .
               665045  4 22032 723 . . .
               732365 .b 21426 703 . . .
               732365 .b 21804 716 . . .
               732365  .     . 723 0 0 0
               732365  .     . 724 0 0 0
               732365  .     . 725 0 0 0
               732365 .b 22156 727 . . .
               732365  .     . 728 0 0 0
               732365  .     . 730 0 0 0
               732365  .     . 732 0 0 0
               732365  .     . 734 0 0 0
               732365  .     . 740 0 0 0
               760925 .b 21246 698 . . .
               760925 .c 21642 711 . . .
              1587125  3 21438 704 . . .
              1587125  3 21797 716 . . .
              1587125  .     . 723 0 0 1
              1587125  .     . 724 0 0 1
              1587125  .     . 725 0 0 1
              1587125  .     . 726 0 0 1
              1587125  5 22157 727 . . .
              1587125  .     . 728 0 0 1
              1587125  .     . 730 0 0 1
              1587125  .     . 732 0 0 1
              1587125  .     . 734 0 0 1
              1587125  .     . 740 0 0 1
              1697285  5 21282 699 . . .
              2626845 .c 21609 710 . . .
              2888645  6 21259 698 . . .
              2888645 .b 22032 723 . . .
              3229325 .b 21399 703 . . .
              3229325 .b 21772 715 . . .
              3424485 .c 21295 699 . . .
              3424485 .b 21664 711 . . .
              3424485 .b 22019 723 0 0 0
              3424485  .     . 724 0 0 0
              3424485  .     . 734 0 0 0
              3424485  .     . 740 0 0 0
              3667245  7 21347 701 . . .
              3667245  6 21803 716 . . .
              3705325 .b 21209 696 . . .
              3705325 .b 21588 709 . . .
              3705325 .b 21942 720 . . .
              3914085  5 21882 718 . . .
              3914765  5 21882 718 . . .
              3916125  6 21882 718 . . .
              4454005 .b 21434 704 . . .
              4454005 .b 21804 716 . . .
              4454005 .b 22166 728 . . .
              4849085  1 21302 699 . . .
              4849085  2 21645 711 . . .
              4849085  3 22008 723 0 1 0
              4849085  .     . 724 0 0 1
              4849085  .     . 725 0 0 1
              4849085  .     . 726 0 0 1
              4849085  .     . 728 0 0 1
              4849085  .     . 730 0 0 1
              4849085  .     . 732 1 0 0
              end
              format %td int_date
              format %tm mdate
              label values jbsat j_jbsat
              label def j_jbsat 1 "completely dissatisfied", modify
              label def j_jbsat 2 "mostly dissatisfied", modify
              label def j_jbsat 3 "somewhat dissatisfied", modify
              label def j_jbsat 4 "neither satisfied or dissatisfied", modify
              label def j_jbsat 5 "somewhat satisfied", modify
              label def j_jbsat 6 "mostly satisfied", modify
              label def j_jbsat 7 "completely satisfied", modify

              Comment


              • #8
                It is not possible to apply your code for creating the treatment and post-covid variables because your example data does not include some of the variables needed. Please post a new example that runs that code to completion. Your post-covid variable looks like it is properly constructed and should provide a single time of treatment, namely 23 Mar 2020. But it's not possible to tell what your treatment variable looks like, and that is likely the source of the problem. In particular, bear in mind that for the DID commands you are using to work properly, each pidp should have all of its observations as treat_group == 1 or all of them as treat_group == 0. There should never be any change in that variable within pidp.

                Let me also point out that there is no need for the variables pre_covid and control_group: these are just the opposites of treat_group and pre_covid, though they do no harm other than cluttering up your data set and wasting memory.

                Comment


                • #9
                  Thanks for your reply.

                  Originally posted by Clyde Schechter View Post
                  In particular, bear in mind that for the DID commands you are using to work properly, each pidp should have all of its observations as treat_group == 1 or all of them as treat_group == 0. There should never be any change in that variable within pidp.
                  .
                  The pidp variable has missing data in terms of the treat variable (Work from home). This means some individuals did not work from home, so we coded it as a missing value. I'm not sure if this is what you meant by your reply above. If this is correct, do you mean I have to drop all missing values of the treat variable?

                  This is the updated example:
                  [CODE]
                  * Example generated by -dataex-. For more info, type help dataex
                  Code:
                  * Example generated by -dataex-. For more info, type help dataex
                  clear
                  input double pidp float(int_date mdate) byte(jbsat jbflex7) float(wfh_always wfh_never wfh_can)
                    22445 21277 699  6  1 . . .
                    22445 21643 711  2  . . . .
                    22445 22007 723 .b .b . . .
                    29925 21394 702  7  0 . . .
                    29925 21766 715  6  . . . .
                    29925 22166 728  5  0 . . .
                    76165 21260 698  5  0 . . .
                    76165 21623 710  5  . . . .
                    76165 22006 723  5  0 1 0 0
                    76165     . 724  .  . 1 0 0
                    76165     . 725  .  . 1 0 0
                    76165     . 726  .  . 1 0 0
                    76165     . 728  .  . 1 0 0
                    76165     . 730  .  . 1 0 0
                    76165     . 732  .  . 1 0 0
                    76165     . 734  .  . 1 0 0
                    76165     . 740  .  . 0 0 1
                   280165 21520 707  5  1 . . .
                   280165 21900 719  5  . . . .
                   280165     . 723  .  . 0 1 0
                   280165     . 724  .  . 0 1 0
                   280165     . 725  .  . 0 1 0
                   280165     . 726  .  . 0 1 0
                   280165     . 728  .  . 0 1 0
                   280165 22261 731 .b .b . . .
                   333205 21319 700  6  0 . . .
                   333205 21698 712  6  . . . .
                   469205 21363 701  7  0 . . .
                   469205 21677 712  6  . . . .
                   469205     . 723  .  . 0 1 0
                   469205 22040 724  6  0 . . .
                   469205     . 728  .  . 0 1 0
                   469205     . 740  .  . 0 1 0
                   599765 21198 696  6  0 . . .
                   599765 21584 709  6  . . . .
                   599765 21928 720  6  0 . . .
                   599765     . 723  .  . 1 0 0
                   599765     . 724  .  . 1 0 0
                   599765     . 725  .  . 1 0 0
                   599765     . 740  .  . 0 0 0
                   665045 21305 700  4  0 . . .
                   665045 21651 711  3  . . . .
                   665045 22032 723  4  0 . . .
                   732365 21426 703 .b .b . . .
                   732365 21804 716 .b  . . . .
                   732365     . 723  .  . 0 0 0
                   732365     . 724  .  . 0 0 0
                   732365     . 725  .  . 0 0 0
                   732365 22156 727 .b .b . . .
                   732365     . 728  .  . 0 0 0
                   732365     . 730  .  . 0 0 0
                   732365     . 732  .  . 0 0 0
                   732365     . 734  .  . 0 0 0
                   732365     . 740  .  . 0 0 0
                   760925 21246 698 .b .b . . .
                   760925 21642 711 .c  . . . .
                  1587125 21438 704  3 .b . . .
                  1587125 21797 716  3  . . . .
                  1587125     . 723  .  . 0 0 1
                  1587125     . 724  .  . 0 0 1
                  1587125     . 725  .  . 0 0 1
                  1587125     . 726  .  . 0 0 1
                  1587125 22157 727  5 .b . . .
                  1587125     . 728  .  . 0 0 1
                  1587125     . 730  .  . 0 0 1
                  1587125     . 732  .  . 0 0 1
                  1587125     . 734  .  . 0 0 1
                  1587125     . 740  .  . 0 0 1
                  1697285 21282 699  5  0 . . .
                  2626845 21609 710 .c .c . . .
                  2888645 21259 698  6  0 . . .
                  2888645 22032 723 .b .b . . .
                  3229325 21399 703 .b .b . . .
                  3229325 21772 715 .b  . . . .
                  3424485 21295 699 .c .c . . .
                  3424485 21664 711 .b  . . . .
                  3424485 22019 723 .b .b 0 0 0
                  3424485     . 724  .  . 0 0 0
                  3424485     . 734  .  . 0 0 0
                  3424485     . 740  .  . 0 0 0
                  3667245 21347 701  7  0 . . .
                  3667245 21803 716  6  . . . .
                  3705325 21209 696 .b .b . . .
                  3705325 21588 709 .b  . . . .
                  3705325 21942 720 .b .b . . .
                  3914085 21882 718  5  . . . .
                  3914765 21882 718  5  . . . .
                  3916125 21882 718  6  . . . .
                  4454005 21434 704 .b .b . . .
                  4454005 21804 716 .b  . . . .
                  4454005 22166 728 .b .b . . .
                  4849085 21302 699  1  1 . . .
                  4849085 21645 711  2  . . . .
                  4849085 22008 723  3  1 0 1 0
                  4849085     . 724  .  . 0 0 1
                  4849085     . 725  .  . 0 0 1
                  4849085     . 726  .  . 0 0 1
                  4849085     . 728  .  . 0 0 1
                  4849085     . 730  .  . 0 0 1
                  4849085     . 732  .  . 1 0 0
                  end
                  format %td int_date
                  format %tm mdate
                  label values jbsat j_jbsat
                  label def j_jbsat 1 "completely dissatisfied", modify
                  label def j_jbsat 2 "mostly dissatisfied", modify
                  label def j_jbsat 3 "somewhat dissatisfied", modify
                  label def j_jbsat 4 "neither satisfied or dissatisfied", modify
                  label def j_jbsat 5 "somewhat satisfied", modify
                  label def j_jbsat 6 "mostly satisfied", modify
                  label def j_jbsat 7 "completely satisfied", modify
                  label values jbflex7 j_jbflex7
                  label def j_jbflex7 0 "Not mentioned", modify
                  label def j_jbflex7 1 "Yes mentioned", modify
                  Last edited by Hassan zahey; 22 Dec 2023, 17:56.

                  Comment


                  • #10
                    The example data is still not usable: jbfxuse7 is not there. Neither is wfh_can1.

                    Before you post back with another example, check it out. Do that by pasting the -dataex- output into your do-file editor. Then paste in the code you wrote in #7 and run it all. Make sure that the code runs without error messages all the way down to and through the -didregress- command and that the -estat trendplots- command produces the same error message you are getting when you run the code with your full data set. That way you can be sure that that example will reproduce your problem. Once you have an example that does that, post the -dataex- output here.
                    Last edited by Clyde Schechter; 22 Dec 2023, 19:33.

                    Comment


                    • #11
                      Originally posted by Clyde Schechter View Post
                      The example data is still not usable: jbfxuse7 is not there. Neither is wfh_can1.

                      Before you post back with another example, check it out. Do that by pasting the -dataex- output into your do-file editor. Then paste in the code you wrote in #7 and run it all. Make sure that the code runs without error messages all the way down to and through the -didregress- command and that the -estat trendplots- command produces the same error message you are getting when you run the code with your full data set. That way you can be sure that that example will reproduce your problem. Once you have an example that does that, post the -dataex- output here.
                      Apologize for the previous miskicks by me. I tried to get a correct 100 observation sample showing the same error message that I got with full data set but I could not, with 100 run smoothly without error for applying - estat trendplots-. However sometimes give me error message for -didregress- "model is not identified The treatment variable treatment_post1 was omitted because of collinearity." for solve this I used a -randomtag- and made a random sample just to give me the same error message when i run the full data set, this what I got :




                      Code:
                      * Example generated by -dataex-. For more info, type help dataex
                      clear
                      input double pidp float(int_date mdate) byte(jbsat jbfxuse7) float(wfh_always1 wfh_never1 wfh_can1)
                           22445 21277 699  6  1 1 0 0
                           22445 21643 711  2  . 0 0 0
                           22445 22007 723 .b .b 0 0 0
                           29925 21394 702  7  0 0 1 0
                           29925 21766 715  6  . 0 0 0
                           29925 22166 728  5  0 0 1 0
                           76165 21260 698  5  0 0 1 0
                           76165 21623 710  5  . 0 0 0
                           76165 22006 723  5  0 1 1 0
                          280165 21520 707  5  1 1 0 0
                          280165 21900 719  5  . 0 0 0
                          280165 22261 731 .b .b 0 0 0
                          333205 21319 700  6  0 0 1 0
                          333205 21698 712  6  . 0 0 0
                          469205 21363 701  7  0 0 1 0
                          469205 21677 712  6  . 0 0 0
                          469205 22040 724  6  0 0 1 0
                          599765 21198 696  6  0 0 1 0
                          599765 21584 709  6  . 0 0 0
                          599765 21928 720  6 .b 0 0 0
                          665045 21305 700  4  0 0 1 0
                          665045 21651 711  3  . 0 0 0
                          665045 22032 723  4  0 0 1 0
                          732365 21426 703 .b .b 0 0 0
                          732365 21804 716 .b  . 0 0 0
                          732365 22156 727 .b .b 0 0 0
                          760925 21246 698 .b .b 0 0 0
                          760925 21642 711 .c  . 0 0 0
                         1587125 21438 704  3 .b 0 0 0
                         1587125 21797 716  3  . 0 0 0
                         1587125 22157 727  5 .b 0 0 0
                         1697285 21282 699  5 .b 0 0 0
                         2626845 21609 710 .c .c 0 0 0
                         2888645 21259 698  6 .b 0 0 0
                         2888645 22032 723 .b .b 0 0 0
                         3229325 21399 703 .b .b 0 0 0
                         3229325 21772 715 .b  . 0 0 0
                         3424485 21295 699 .c .c 0 0 0
                         3424485 21664 711 .b  . 0 0 0
                         3424485 22019 723 .b .b 0 0 0
                         3667245 21347 701  7 .b 0 0 0
                         3667245 21803 716  6  . 0 0 0
                         3705325 21209 696 .b .b 0 0 0
                         3705325 21588 709 .b  . 0 0 0
                         3705325 21942 720 .b .b 0 0 0
                         3914085 21882 718  5  . 0 0 0
                         3914765 21882 718  5  . 0 0 0
                         3916125 21882 718  6  . 0 0 0
                         4454005 21434 704 .b .b 0 0 0
                         4454005 21804 716 .b  . 0 0 0
                         4454005 22166 728 .b .b 0 0 0
                         4849085 21302 699  1  0 0 1 0
                         4849085 21645 711  2  . 0 0 0
                         4849085 22008 723  3  0 0 1 0
                         4853165 21579 708  6  0 0 1 0
                         4853165 21917 720  5  . 0 0 0
                         4853165 22312 733  7 .b 0 0 0
                        68002725 21238 697 .b .b 0 0 0
                        68002725 21610 710 .b  . 0 0 0
                        68002725 21976 722 .b .b 0 0 0
                        68004087 21239 697 .b .b 0 0 0
                        68006127 21260 698 .b .b 0 0 0
                        68006127 21619 710 .b  . 0 0 0
                        68008847 21343 701  7  0 0 1 0
                        68008847 21619 710  7  . 0 0 0
                        68008847 21982 722  7  0 0 1 0
                        68009527 21246 698  4 .b 0 0 0
                        68009527 21610 710  5  . 0 0 0
                        68009527 21995 722  5 .b 0 0 0
                        68010887 21266 698  6 .b 0 0 0
                        68010887 21611 710  6  . 0 0 0
                        68010887 21984 722  6  0 0 1 0
                        68011567 21195 696  6  1 1 0 0
                        68014287 21320 700 .b .b 0 0 0
                        68014287 21677 712 .b  . 0 0 0
                        68014291 21355 701  6  0 0 1 0
                        68014295 21346 701 .b .b 0 0 0
                        68020407 21329 700 .b .b 0 0 0
                        68020407 21678 712 .b  . 0 0 0
                        68020564 21245 698 .b .b 0 0 0
                        68020564 21619 710 .b  . 0 0 0
                        68021765 21325 700  7  0 0 1 0
                        68021765 21692 712  7  . 0 0 0
                        68021781 21332 700  6  0 0 1 0
                        68028575 21255 698 .b .b 0 0 0
                        68028575 21620 710 .b  . 0 0 0
                        68028575 21968 721 .b .b 0 0 0
                        68029927 21264 698 .b .b 0 0 0
                        68029927 21612 710 .b  . 0 0 0
                        68029927 21992 722 .b .b 0 0 0
                        68029931 21264 698  7  0 0 1 0
                        68029931 21613 710  6  . 0 0 0
                        68029935 21615 710  4  . 0 0 0
                        68029939 21992 722 .b .b 0 0 0
                        68031967 21265 698 .b .b 0 0 0
                        68031967 21628 710 .b  . 0 0 0
                        68031967 21999 722 .b .b 0 0 0
                        68035365 21235 697 .b .b 0 0 0
                        68035365 21605 709 .b  . 0 0 0
                        68035365 21922 720 .b .b 0 0 0
                        68035367 21304 699  6  0 0 1 0
                        68035367 21657 711  6  . 0 0 0
                        68035367 21986 722  6  0 0 1 0
                        68036727 21301 699 .b .b 0 0 0
                        68036727 21677 712 .b  . 0 0 0
                        68036727 22029 723 .b .b 0 0 0
                        68037407 21238 697  6 .b 0 0 0
                        68037407 21600 709  6  . 0 0 0
                        68037407 22059 724  6 .b 0 0 0
                        68037411 21214 696  5  0 0 1 0
                        68037411 21581 709  5  . 0 0 0
                        68041487 21277 699  6 .b 0 0 0
                        68041487 21683 712  6  . 0 0 0
                        68041487 21949 721  6  1 1 0 0
                        68041491 21277 699  6 .b 0 0 0
                        68041491 21683 712  6  . 0 0 0
                        68041491 21944 720 .b .b 0 0 0
                        68041495 21277 699 .b .b 0 0 0
                        68041499 21683 712 .b  . 0 0 0
                        68042167 21222 697  3  0 0 1 0
                        68042167 21588 709  3  . 0 0 0
                        68042167 21985 722  4  0 0 1 0
                        68042171 21221 697  6  0 0 1 0
                        68042171 21692 712 .b  . 0 0 0
                        68042171 21956 721  5 .b 0 0 0
                        68043527 21234 697 .b .b 0 0 0
                        68043527 21664 711 .b  . 0 0 0
                        68043527 21964 721 .b .b 0 0 0
                        68044207 21227 697  4  0 0 1 0
                        68044207 21597 709  4  . 0 0 0
                        68044207 21949 721  6  1 1 0 0
                        68044211 21240 697  6 .b 0 0 0
                        68044211 21609 710  5  . 0 0 0
                        68044211 21978 722  6 .b 0 0 0
                        68044887 21210 696 .b .b 0 0 0
                        68044887 21584 709 .b  . 0 0 0
                        68044887 21956 721 .b .b 0 0 0
                        68045567 21212 696  6  0 0 1 0
                        68045567 21587 709  6  . 0 0 0
                        68045567 21926 720  6  1 1 0 0
                        68045571 21233 697 .b .b 0 0 0
                        68045571 21668 711 .b  . 0 0 0
                        68045571 21987 722 .b .b 0 0 0
                        68046247 21255 698 .b .b 0 0 0
                        68046247 21652 711 .b  . 0 0 0
                        68046247 21987 722 .b .b 0 0 0
                        68046251 21255 698 .b .b 0 0 0
                        68046251 21647 711 .b  . 0 0 0
                        68046251 21987 722 .b .b 0 0 0
                        68046927 21280 699  6  0 0 1 0
                        68046935 21280 699  4  0 0 1 0
                        68049647 21196 696  6 .b 0 0 0
                        68049647 21560 708  6  . 0 0 0
                        68049647 21922 720  6 .b 0 0 0
                        68049651 21196 696  4  0 0 1 0
                        68049651 21560 708  5  . 0 0 0
                        68049651 21922 720  5  0 0 1 0
                        68051007 21193 696  2 .b 0 0 0
                        68051007 21553 708 .b  . 0 0 0
                        68051007 21922 720  5 .b 0 0 0
                        68051011 21194 696  6 .b 0 0 0
                        68051011 21554 708  5  . 0 0 0
                        68051011 21927 720  6  0 0 1 0
                        68056447 21221 697  4  0 0 1 0
                        68056447 21620 710  1  . 0 0 0
                        68056447 21936 720  1  0 0 1 0
                        68056451 21220 697 .b .b 0 0 0
                        68056451 21620 710  6  . 0 0 0
                        68056451 21924 720  7  0 0 1 0
                        68056455 21193 696  1 .b 0 0 0
                        68056455 21627 710  6  . 0 0 0
                        68056455 21929 720  6  0 0 1 0
                        68056459 21220 697 .b .b 0 0 0
                        68056459 21620 710  7  . 0 0 0
                        68056459 21938 720 .b .b 0 0 0
                        68056463 21622 710  7  . 0 0 0
                        68056463 21927 720  4 .b 0 0 0
                        68058485 21245 698 .b .b 0 0 0
                        68058485 21612 710 .b  . 0 0 0
                        68058485 21975 722 .b .b 0 0 0
                        68058487 21206 696 .b .b 0 0 0
                        68058487 21565 708 .b  . 0 0 0
                        68058487 21925 720 .b .b 0 0 0
                        68058489 21245 698 .b .b 0 0 0
                        68058489 21612 710 .b  . 0 0 0
                        68058489 21975 722 .b .b 0 0 0
                        68058491 21198 696 .b .b 0 0 0
                        68058491 21562 708 .b  . 0 0 0
                        68058491 21925 720 .b .b 0 0 0
                        68059171 21222 697  4 .b 0 0 0
                        68060525 21250 698 .b .b 0 0 0
                        68060527 21220 697  6  1 1 0 0
                        68060527 21600 709  6  . 0 0 0
                        68060527 21955 721  5  0 0 1 0
                        68060531 21220 697  5  0 0 1 0
                        68060531 21593 709  3  . 0 0 0
                        68060531 21955 721  5  1 1 0 0
                        68060533 21239 697  5  0 0 1 0
                        68060537 21240 697 .b .b 0 0 0
                        68061288 21238 697  5  0 0 1 0
                      1020017015 21605 709 .b .b 0 0 0
                      1020412767     . 728  .  . 0 1 0
                      1020494375 21553 708  6 .b 0 0 0
                      1020551487 22032 723 .b  . 0 0 0
                      1020607927 21919 720 .b  . 0 0 0
                      1020622211 21954 721 .b  . 0 0 0
                      1020666419 22315 733 .b .b 0 0 0
                      1020742567 21553 708  4  0 0 1 0
                      1020745971 21921 720  1  . 0 0 0
                      1021105007     . 726  .  . 0 0 0
                      1021162896     . 723  .  . 0 1 0
                      1022699607     . 724  .  . 0 0 1
                      1088167295 22024 723  3  . 0 1 0
                      1088228495     . 723  .  . 0 1 0
                      1088894207     . 724  .  . 1 0 0
                      1089189327     . 724  .  . 0 1 0
                      1108638050 21730 713 .b .b 0 0 0
                      1156071407 22122 726  6  . 1 0 0
                      1156578007     . 723  .  . 0 1 0
                      1156656207 22404 736 .b .b 0 0 0
                      1156840487     . 723  .  . 0 0 0
                      1157012527     . 740  .  . 0 0 1
                      1157081899 22102 726  6  . 0 1 0
                      1157167575     . 728  .  . 0 1 0
                      1157185247     . 730  .  . 0 1 0
                      1157207691     . 725  .  . 0 1 0
                      1157973367 22114 726 .b  . 0 0 0
                      1158606455 22384 735 .b .b 0 0 0
                      1224120375     . 724  .  . 0 0 0
                      1224433851     . 723  .  . 0 0 0
                      1224480091 21783 715 .b .b 0 0 0
                      1225079847     . 732  .  . 0 0 1
                      1225089367     . 726  .  . 0 0 0
                      1225227411 22010 723 .b  . 0 0 0
                      1225993091     . 726  .  . 0 1 0
                      1292557690 21795 716 .b .b 0 0 0
                      1292656891     . 723  .  . 0 0 0
                      1292682727     . 726  .  . 0 1 0
                      1293022047     . 723  .  . 0 0 0
                      1293153291 22218 729 .b  . 0 0 0
                      1293222647     . 734  .  . 0 1 0
                      1293236927 22190 729 .b  . 0 0 0
                      1360848655     . 723  .  . 0 0 0
                      1360946567 21861 718  6 .b 0 0 0
                      1361306291 21852 717 .b .b 0 0 0
                      1428245487 22146 727  5  . 0 0 0
                      1428635811     . 728  .  . 0 0 0
                      1428717407     . 724  .  . 0 0 0
                      1428943847     . 734  .  . 0 0 0
                      1429113167     . 723  .  . 1 0 0
                      1429113167     . 726  .  . 0 0 1
                      1429245087 22168 728 .b  . 0 0 0
                      1429340291 21814 716 .b .b 0 0 0
                      1430050211     . 740  .  . 0 1 0
                      1430362327     . 728  .  . 0 0 0
                      1496719447     . 726  .  . 0 0 0
                      1496734419     . 726  .  . 0 0 0
                      1496784047     . 732  .  . 1 0 0
                      1496971055 21894 719  6 .b 0 0 0
                      1497224007     . 726  .  . 0 1 0
                      1497351167 22339 733  5  . 0 0 0
                      1564200611 21818 716 .b .b 0 0 0
                      1564288327     . 734  .  . 1 0 0
                      1565079851     . 732  .  . 0 0 1
                      1576212850     . 723  .  . 1 0 0
                      1584012490     . 730  .  . 0 0 1
                      1632042847     . 732  .  . 0 0 0
                      1632352927 21867 718 .b .b 0 0 0
                      1632566451     . 740  .  . 1 0 0
                      1632633087     . 723  .  . 0 1 0
                      1633881571     . 732  .  . 0 0 0
                      1644675290 22190 729 .b  . 0 0 0
                      end
                      format %td int_date
                      format %tm mdate
                      label values jbsat j_jbsat
                      label def j_jbsat 1 "completely dissatisfied", modify
                      label def j_jbsat 2 "mostly dissatisfied", modify
                      label def j_jbsat 3 "somewhat dissatisfied", modify
                      label def j_jbsat 4 "neither satisfied or dissatisfied", modify
                      label def j_jbsat 5 "somewhat satisfied", modify
                      label def j_jbsat 6 "mostly satisfied", modify
                      label def j_jbsat 7 "completely satisfied", modify
                      label values jbfxuse7 j_jbfxuse7
                      label def j_jbfxuse7 0 "Not mentioned", modify
                      label def j_jbfxuse7 1 "Yes mentioned", modify
                      Code:
                      gen treat_group = 0
                      replace treat_group = 1 if (wfh_always1 == 1 | wfh_can1 == 1) 
                      gen post_covid = 0
                      replace post_covid = 1 if int_date >= date("2020-03-23", "YMD") 
                      gen treatment_post = treat_group * post_covid
                      
                      didregress (jbsat) ( treatment_post ), group(pidp) time(mdate)
                      estat trendplots


                      Comment


                      • #12
                        As I suspected, your data are not suitable for this analysis. In order to do a DID of this type, each pidp must either have treat_group = 1 for all observations, or it must have treat_group = 0 for all observations. That is, a given pidp is either in the treatment group or not: this does not vary with time. The treatment group may be untreated before a certain time--in fact that must also be true to do a DID. But those pidp's are still in the treatment group even if they have not yet been treated. Here is a list of the pidp's in your example data that violate this requirement:
                        Code:
                        22445, 76165, 280165, 68041487, 68044207, 68045567, 68060527, 68060531
                        So you need to fix this before you can run a DID analysis.

                        Added: There is an additional problem in your data that you will probably run into even after you fix that. You have many dates that are only observed with observations in the treatment group, and other dates that are only observed in the control group. These dates are therefore colinear with the treatment_post variable and get omitted, along with the observations that contain them. It may be the case that when all of those are omitted the remaining sample no longer contains instances of all four combinations of treat_group and post_covid--another requirement for the DID analysis. I think that it is a mistake to use a daily date variable as the time fixed effect analysis here. It is too fine grained: most of the values of int_date appear with just one pidp, and only a handful with 2 or 3. That's a recipe for failiure. I suggest aggregating the date variable to a coarser unit, perhaps week or month or quarter for your analysis.
                        Last edited by Clyde Schechter; 24 Dec 2023, 11:20.

                        Comment


                        • #13
                          Originally posted by Clyde Schechter View Post
                          As I suspected, your data are not suitable for this analysis. In order to do a DID of this type, each pidp must either have treat_group = 1 for all observations, or it must have treat_group = 0 for all observations. That is, a given pidp is either in the treatment group or not: this does not vary with time. The treatment group may be untreated before a certain time--in fact that must also be true to do a DID. But those pidp's are still in the treatment group even if they have not yet been treated. Here is a list of the pidp's in your example data that violate this requirement:
                          Code:
                          22445, 76165, 280165, 68041487, 68044207, 68045567, 68060527, 68060531
                          So you need to fix this before you can run a DID analysis.

                          Added: There is an additional problem in your data that you will probably run into even after you fix that. You have many dates that are only observed with observations in the treatment group, and other dates that are only observed in the control group. These dates are therefore colinear with the treatment_post variable and get omitted, along with the observations that contain them. It may be the case that when all of those are omitted the remaining sample no longer contains instances of all four combinations of treat_group and post_covid--another requirement for the DID analysis. I think that it is a mistake to use a daily date variable as the time fixed effect analysis here. It is too fine grained: most of the values of int_date appear with just one pidp, and only a handful with 2 or 3. That's a recipe for failiure. I suggest aggregating the date variable to a coarser unit, perhaps week or month or quarter for your analysis.
                          I appreciate your comments and your clarification of these issues in my data. Still, there are some parts in your comments that I did not fully understand, and I may not understand how to apply them correctly.

                          Is there a problem with merging data? Because I merged master data for several years with data on a monthly basis for only eight months? Could this be the main reason why I have some pidp appear only at certain periods and not others?

                          I do not understand how I can fix this problem. I appreciate your comment clarifying this.
                          I also have a variable, “mdate ”, which is a monthly date variable and not a daily date variable. Can I use it instead of “ int_date” to solve this problem? How can I do that?

                          I apologize for reposting, but I did not understand exactly how to overcome this problem.

                          Comment


                          • #14
                            To fix the errors that I mentioned in #12, identify all of the pidps that have inconsistent marking for treat_group:
                            Code:
                            by pidp (treat_group), sort: gen byte problem = treat_group[1] != treat_group[_N]
                            browse if problem
                            Then you will have to check your original source data to say which of them should be 0 and which should be 1. And, important, don't just make a change in the existing data set. Go back over the data management leading up to this point to see why you ended up with these inconsistencies and fix the mistakes. You may uncover other mistakes along the way. If you do, fix those as well. Then try it all again.

                            Next, when you run your -didregress- command I recommend you use mdate rather than ind_date for the -time()- variable.

                            Next, I think that what you actually need to use here is -xtdidregress-, not -didregress-, because this is panel data, right?

                            Once you get that far, and -xtdidregress- runs, we can try to figure out why -estat ptrends- doesn't like your data. It's not obvious to me at this point.

                            Comment

                            Working...
                            X