
  • Accounting for population change when creating a categorical city population size indicator in panel dataset

    Hello,

    I have a quarterly dataset of 3,000 city police departments for the years 2018-2022. ORI is an alphanumeric police department identifier, qdate indicates the quarter-year, totalpop is the annual population of the city the department serves, and popstrata is a categorical variable I created to group departments by population size (see code below).

    The problem is that over the 5 years, agencies like the one shown below switch strata when their population crosses a threshold (e.g., a city of 9,800 residents grows to 11,000 the next year, moving it from stratum 6 to stratum 5). I want to assign each police department to a single population stratum that does not change over the 5-year period, so I was thinking of changing my code to assign agencies to strata based on their 2020 populations.

    Could anyone provide code that adjusts what I have so that each agency gets a single population stratum for all 20 quarters, based on its totalpop value in one of the 2020 quarters? Alternatively, if you have an idea for a more accurate way to pick the population to use, given the population change over Covid, I'm all ears. Even if I switch to the 2018 or 2022 population value, I imagine the code won't change other than the quarter-year value. Thank you for any help you can offer!

    Code:
    gen popstrata = .
    * Stata treats missing as larger than any number, so exclude it explicitly
    replace popstrata=1 if totalpop>249999 & !missing(totalpop)
    replace popstrata=2 if totalpop<250000 & totalpop>99999
    replace popstrata=3 if totalpop<100000 & totalpop>49999
    replace popstrata=4 if totalpop<50000 & totalpop>24999
    replace popstrata=5 if totalpop<25000 & totalpop>9999
    replace popstrata=6 if totalpop<10000
    [Image: citypops.JPG]

  • #2
    So, if you want to use the population stratum based on total population in the first quarter of 2020:
    Code:
    by ori (qdate), sort: egen popstratum2020 = max(cond(qdate = tq(2020q1), popstrata, .))


    • #3
      I received the following error:

      by ori (qdate), sort: egen popstratum2020 = max(cond(qdate = tq(2020q1), popstrata, .))
      unknown function qdate=tq()
      r(133);

      It is likely my fault for not sharing my data. I did find a much less elegant solution:

      Code:
      * 240 is the internal numeric value of tq(2020q1) (quarters since 1960q1)
      tostring qdate, gen(qdate_string)
      gen popstrata = .
      replace popstrata=1 if (totalpop>249999 & totalpop!=.) & qdate_string=="240"
      replace popstrata=2 if (totalpop<250000 & totalpop>99999) & qdate_string=="240"
      replace popstrata=3 if (totalpop<100000 & totalpop>49999) & qdate_string=="240"
      replace popstrata=4 if (totalpop<50000 & totalpop>24999) & qdate_string=="240"
      replace popstrata=5 if (totalpop<25000 & totalpop>9999) & qdate_string=="240"
      replace popstrata=6 if (totalpop<10000 & totalpop!=.) & qdate_string=="240"
      * Copy the 2020q1 stratum to every row of the same agency
      * (xfill is a community-contributed command, not part of official Stata)
      encode ori, gen(ori_n)
      xfill popstrata, i(ori_n)
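
For comparison, the same result can probably be reached without tostring or xfill. This is only a sketch under assumptions: the toy values are made up, and pop2020q1 and popstrata_fixed are hypothetical names that do not appear in the thread. It carries each agency's 2020q1 population to all of its rows, then cuts that fixed population at the thresholds from post #1 using the built-in irecode() function.

```stata
* Toy panel with made-up values: two agencies, three quarters each
clear
input str4 ori int year byte q long totalpop
"A001" 2019 4   9800
"A001" 2020 1  11000
"A001" 2021 1  11200
"B002" 2019 4 260000
"B002" 2020 1 255000
"B002" 2021 1 249000
end
gen qdate = yq(year, q)
format qdate %tq

* Spread each agency's 2020q1 population across all of its rows
by ori (qdate), sort: egen pop2020q1 = max(cond(qdate == tq(2020q1), totalpop, .))

* irecode() returns 0 if x <= 9999, 1 if 9999 < x <= 24999, ..., 5 if x > 249999,
* and missing if x is missing; subtracting from 6 gives strata 6 (smallest) to 1 (largest)
gen popstrata_fixed = 6 - irecode(pop2020q1, 9999, 24999, 49999, 99999, 249999)

list ori qdate totalpop pop2020q1 popstrata_fixed, sepby(ori)
```

Because irecode() returns missing for a missing argument, agencies with no 2020q1 population end up with a missing stratum rather than being placed in stratum 1.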


      • #4
        Sorry, typo. Should be
        Code:
        by ori (qdate), sort: egen popstratum2020 = max(cond(qdate == tq(2020q1), popstrata, .))
        That was my error. But if you had shown your example data using -dataex-, instead of a screenshot (which cannot be imported to Stata), I would have tested the code before posting. Please always use -dataex- to show example data in the future.
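
A small worked example may help readers check the corrected command on their own data. It is a sketch on made-up values, not the poster's dataset: agency C003 is a hypothetical case with no 2020q1 row, included to show what happens when the reference quarter is absent.

```stata
* Toy panel with made-up values; C003 has no 2020q1 observation
clear
input str4 ori int year byte q long totalpop
"A001" 2019 4   9800
"A001" 2020 1  11000
"A001" 2020 2  11000
"B002" 2019 4 260000
"B002" 2020 1 255000
"C003" 2019 4  30000
"C003" 2020 2  31000
end
gen qdate = yq(year, q)
format qdate %tq

* Time-varying stratum, as in post #1 (with missing values excluded)
gen popstrata = .
replace popstrata = 1 if totalpop > 249999 & !missing(totalpop)
replace popstrata = 2 if totalpop < 250000 & totalpop > 99999
replace popstrata = 3 if totalpop < 100000 & totalpop > 49999
replace popstrata = 4 if totalpop < 50000  & totalpop > 24999
replace popstrata = 5 if totalpop < 25000  & totalpop > 9999
replace popstrata = 6 if totalpop < 10000

* Corrected command from this post: note == for comparison
by ori (qdate), sort: egen popstratum2020 = max(cond(qdate == tq(2020q1), popstrata, .))

* Check that the new variable is constant within each agency
by ori (popstratum2020), sort: assert popstratum2020[1] == popstratum2020[_N]

* Agencies without a 2020q1 row are left missing and need a decision
count if missing(popstratum2020)
sort ori qdate
list ori qdate totalpop popstrata popstratum2020, sepby(ori)
```

In this toy data A001 moves from stratum 6 to 5 in popstrata but sits in stratum 5 throughout in popstratum2020, while C003 stays missing because it has no 2020q1 observation.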
