
  • Degree of polycentricity based on rank-size distributions. Running multiple regressions for different groups of variables

    Hello
    I hope everyone is OK!
    I need to estimate the degree of polycentricity of 50 urban regions, each containing between 1 and 20 cities. The degree of polycentricity is derived from the rank-size distribution, estimated with this equation: Ln(size)=α+βLn(rank-1/2), where Ln(size) is the natural log of each city's population and Ln(rank-1/2) is the natural log of the city's rank within its region (1, 2, 3,...20) minus one half.

    The literature recommends calculating β as the average of the rank-size slopes estimated on the region's 2, 3, and 4 most populous cities. The degree of polycentricity is then the inverse of the absolute value of this average. I have run the regression in Stata with the following commands, by region:

    bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=2, r
    bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=3, r
    bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=4, r

    From there, I get a β for each region and each cutoff, but I don't know how to get the average of the three β's for each region (by hand, it would take a long time because there are 50 urban regions).

    I don't know whether I need a loop, and I am not sure how to write one. I tried:
    forvalues i = 2/4 {
    reg lnpob2020_sun lnrank2020 if rank2020 == `i', r
    local beta = _b[lnrank2020]
    local betas "`betas' `beta'"
    }

    I did not get the expected result. Any recommendation?
    Thank you,
    Diego

  • #2
    Diego:
    welcome to this forum.
    It would seem a task for -statsby-:
    Code:
    . use "https://www.stata-press.com/data/r17/nlswork.dta"
    . statsby, by(year): regress ln_wage age if year<=71
    (running regress on estimation sample)
    
          Command: regress ln_wage age if year<=71
               By: year
    
    Statsby groups
    ----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
    ....
    
    . egen average_cons=mean( _b_cons)
    
    . egen average_slope=mean( _b_age )
    
    . list
    
         +--------------------------------------------------+
         | year     _b_age    _b_cons   averag~s   averag~e |
         |--------------------------------------------------|
      1. |   68   .0515506   .3629596   .5123854   .0458598 |
      2. |   69   .0451916   .5636937   .5123854   .0458598 |
      3. |   70   .0438581   .5435218   .5123854   .0458598 |
      4. |   71   .0428387   .5793663   .5123854   .0458598 |
         +--------------------------------------------------+
    
    .
    Remember to save your original dataset before going -statsby-.
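
    One way to follow that advice without writing a separate copy of the file is to wrap the call in -preserve- / -restore-. This is only a sketch using the same example data as above; the -clear- option of -statsby- is used so that the command replaces the data in memory without complaining, and -restore- brings the original data back afterwards.

```stata
* Sketch: keep the original data safe while -statsby- replaces the data in memory.
use "https://www.stata-press.com/data/r17/nlswork.dta", clear

preserve                          // take a snapshot of the full dataset
statsby, by(year) clear: regress ln_wage age if year<=71
egen average_slope = mean(_b_age) // same average as in the listing above
list
restore                           // the original nlswork data are back in memory
```

    If the coefficients are needed later, the -saving()- option of -statsby- writes them to a file instead and leaves the data in memory untouched.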
    Last edited by Carlo Lazzaro; 28 Jun 2023, 00:12.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Thanks a lot, Carlo
      This command has been really useful. However, when I tried it, the average slope I obtained is the one for the single condition age <=71 (in my case, for the single cutoff of <=4 cities). I need the average of the slopes estimated with cutoffs of <=2, <=3, and <=4 cities.

      Do you recommend running the same procedure three times (with cutoffs of 2, 3, and 4 cities) and then computing the average, or is there some way to do this in a loop? I tried:
      forvalues i = 1/74 {
      forvalues j = 2/4 {
      statsby, by(id_sun): regress lnpob2020 lnrank2020_ if (id_sun == `i') & (rank2020 <= `j'), r
      }
      }

      *Here, it is not possible to continue after the first iteration because rank2020 disappears from the dataset. Then, I tried:
      forvalues i = 1/74 {
      forvalues j = 2/4 {
      bysort id_sun: regress lnpob2020 lnrank2020_ if (id_sun == `i') & (rank2020 <= `j'), r
      }
      }
      egen average_slope=mean( lnrank2020_ )
      by id_sun: egen average=mean( lnrank2020_ )

      However, the average obtained is not correct (I did it manually for two cities).

      Thanks for your support!
      Best,
      Diego




      • #4
        Diego:
        I think that the safest procedure is repeating -statsby- three times.
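
        As a rough sketch of what "three times" could look like, the calls below run -statsby- once per cutoff, save each set of slopes to a temporary file, and then merge the three files to average them by region. The variable names id_sun, lnpob2020, lnrank2020_ and rank2020 are taken from your post #3 (lnrank2020_ is assumed to already hold ln(rank-1/2)); beta2, beta3, beta4, beta_mean and polycentricity are made-up names for illustration. The robust option is left out because only the point estimates are used.

```stata
* Sketch under the assumptions stated above; not tested on the poster's data.
tempfile b2 b3 b4

* One -statsby- run per cutoff; saving() leaves the data in memory untouched.
forvalues j = 2/4 {
    statsby beta`j' = _b[lnrank2020_], by(id_sun) saving("`b`j''", replace): ///
        regress lnpob2020 lnrank2020_ if rank2020 <= `j'
}

preserve
* Combine the three files: one row per region, one slope per cutoff.
use "`b2'", clear
merge 1:1 id_sun using "`b3'", nogenerate
merge 1:1 id_sun using "`b4'", nogenerate

* Average of the three slopes, then the inverse of its absolute value.
egen beta_mean = rowmean(beta2 beta3 beta4)
generate polycentricity = 1/abs(beta_mean)
list id_sun beta2 beta3 beta4 beta_mean polycentricity
restore
```

        Two things to check by hand: regions with a single city cannot be fitted and should come out as missing, and regions with fewer cities than the cutoff simply reuse the cities they have, so their three slopes are not all distinct. Note also that rowmean() ignores missing slopes rather than returning missing.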
        Kind regards,
        Carlo
        (Stata 19.0)
