  • Creating a variable to measure median wages across occupations/time

    Dear Statalist Users

    I am seeking some help. I would like to calculate the median wage for each occupation group (1-8) in my data. I am using longitudinal data from 2009 to 2019. I tried: sort occupation: egen medianoccwage = median(wages) but that gave me the same figure for every individual in every year, so I think a piece is missing here. I'm not sure what.

    I would also like to create a binary variable that flags respondents whose wages are more than 20% below the median for their occupation in each year of observation.

    Any help on creating these would be very much appreciated.

    Thanks in advance

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long pid int year float(occupation wages)
    100177 2009 1 1854
    100231 2009 1  900
    100246 2009 1  210
    100346 2009 1 2472
    100398 2009 1 1381
    100916 2009 1 1016
    100945 2009 1  400
    101060 2009 1 1841
    101196 2009 1 1565
    101229 2009 1 1407
    101296 2009 1 1119
    101358 2009 1  840
    101498 2009 1 1052
    101505 2009 1    0
    101759 2009 1 2301
    101848 2009 1 1266
    101903 2009 1  350
    102001 2009 1 3165
    102028 2009 1 1826
    102138 2009 1  600
    102190 2009 1    0
    102248 2009 1  750
    102373 2009 1 1578
    102374 2009 1 1057
    102567 2009 1 1335
    102568 2009 1 1611
    102587 2009 1  750
    102615 2009 1    0
    102654 2009 1 1900
    102750 2009 1 2250
    102796 2009 1  550
    103145 2009 1 1216
    103272 2009 1  670
    103279 2009 1 1538
    103312 2009 1  500
    103663 2009 1  550
    103974 2009 1  472
    103975 2009 1    0
    104000 2009 1 1200
    104112 2009 1  884
    104160 2009 1 1600
    104306 2009 1 1246
    104370 2009 1    0
    104665 2009 1  800
    104882 2009 1 1260
    105017 2009 1  925
    105305 2009 1 1200
    105517 2009 1 2685
    105562 2009 1 2100
    105649 2009 1    0
    105846 2009 1  750
    105857 2009 1  959
    105858 2009 1 1726
    105949 2009 1  656
    105958 2009 1 1266
    106162 2009 1  903
    106279 2009 1 1250
    106317 2009 1 1260
    106429 2009 1    0
    106463 2009 1  865
    106505 2009 1  778
    106556 2009 1 3125
    106945 2009 1  500
    106954 2009 1 1100
    106968 2009 1  760
    107143 2009 1 1200
    107275 2009 1 1014
    107296 2009 1  699
    107317 2009 1  760
    107474 2009 1  729
    107491 2009 1 1151
    107961 2009 1 1534
    108348 2009 1 3548
    108349 2009 1    0
    108503 2009 1  896
    108755 2009 1 2038
    108892 2009 1  807
    109102 2009 1  770
    109122 2009 1  715
    109206 2009 1  760
    109335 2009 1 1799
    109598 2009 1 1300
    109768 2009 1  901
    110292 2009 1 1600
    110499 2009 1  786
    110622 2009 1 1876
    110703 2009 1  974
    110976 2009 1 1600
    111206 2009 1  850
    111486 2009 1  926
    111638 2009 1 1036
    111849 2009 1 1496
    111962 2009 1  780
    112062 2009 1 2685
    112123 2009 1 1300
    112125 2009 1 2163
    112440 2009 1 1024
    112512 2009 1 1289
    112570 2009 1    0
    112660 2009 1    0
    end

  • #2
    Indeed, there is something missing.

    Code:
    sort occupation: egen medianoccwage = median(wages)
    should be

    Code:
    bysort occupation year: egen medianoccwage = median(wages)
    as otherwise all years will be pooled, as you report.

    Then I think you seek something like

    Code:
    gen lowwages = wages < 0.8 * medianoccwage if wages < .
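
As a quick check of the two commands above, here is a minimal sketch on a made-up six-row dataset (the pid values and wages are invented for illustration and are not from the thread). It assumes that "20% lower than the median" means a wage strictly below 0.8 times the occupation-year median.

```stata
* Toy data, invented for illustration only
clear
input long pid int year float(occupation wages)
1 2009 1 1000
2 2009 1 2000
3 2009 1  700
1 2010 1 1500
2 2010 1 3000
3 2010 1 2900
end

* Median wage within each occupation-year cell
bysort occupation year: egen medianoccwage = median(wages)

* 1 if the wage is more than 20% below the cell median, 0 otherwise;
* the if qualifier leaves the indicator missing when wages is missing
gen byte lowwages = wages < 0.8 * medianoccwage if wages < .

list, sepby(occupation year)
```

In this toy data the 2009 median is 1000 and the 2010 median is 2900, so pid 3 is flagged in 2009 only and pid 1 in 2010 only.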



    • #3
      Let's flag that the indicator compares individual wages with the median for the same occupation and year, so individuals may well change status from year to year.
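
If it helps to see how many people change status, one possible sketch follows. The variable names everlow, alwayslow, switcher and tagpid are hypothetical, and the toy data are invented; the idea is only to compare each person's minimum and maximum of the indicator across years.

```stata
* Toy data, invented for illustration only
clear
input long pid int year byte lowwages
1 2009 0
1 2010 1
2 2009 0
2 2010 0
3 2009 1
3 2010 1
end

* Within-person maximum and minimum of the indicator across years
egen byte everlow   = max(lowwages), by(pid)
egen byte alwayslow = min(lowwages), by(pid)

* 1 if the person is flagged in some years but not in others
gen byte switcher = everlow != alwayslow if !missing(everlow)

* Tag one row per person so each person is counted once
egen byte tagpid = tag(pid)
tabulate switcher if tagpid
```

Here only pid 1 changes status between years; pid 2 is never flagged and pid 3 is always flagged.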



      • #4
        Thanks Nick Cox

        It works! Much appreciated!
        Brendan
        Last edited by Brendan Churchill; 17 Aug 2024, 20:28.
