Degree of polycentricity based on rank-size distributions. Running multiple regressions for different groups of variables

Diego PF

Join Date: Jun 2023

Posts: 6
#1


27 Jun 2023, 08:49

Hello
I hope everyone is OK!
I need to estimate the degree of polycentricity of 50 urban regions, each containing between 1 and 20 cities. The degree of polycentricity is obtained from the rank-size distribution, estimated with this equation: ln(size) = α + β ln(rank - 1/2), where ln(size) is the log of each city's size and ln(rank - 1/2) is the log of the city's rank within its region (1, 2, 3, ..., 20), shifted by one half.

The literature recommends calculating β as the average of the rank-size slopes estimated on the region's 2, 3, and 4 most populous cities. We then take the absolute value of this average and invert it; the result is the degree of polycentricity. I have run the equation in Stata with the following commands per region:
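For anyone reproducing this setup from raw data, the two regression variables could be built roughly as in the sketch below. It assumes a population variable called pob2020 (a hypothetical name, not shown in the thread) and the region identifier cve_sun; rank2020, lnpob2020 and lnrank2020_ are the names used in the commands in this thread. Cities with equal population are ranked in arbitrary order here.

```stata
* Sketch only: pob2020 is an assumed variable name for city population.
* Skip this if rank2020, lnpob2020 and lnrank2020_ already exist,
* because generate refuses to overwrite existing variables.

* Sort cities by population within each region (ascending),
* then number them so the largest city gets rank 1.
bysort cve_sun (pob2020): generate rank2020 = _N - _n + 1

* Left-hand side: log of city size.
generate lnpob2020 = ln(pob2020)

* Right-hand side: log of (rank - 1/2).
generate lnrank2020_ = ln(rank2020 - 0.5)
```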

bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=2, r
bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=3, r
bysort cve_sun: regress lnpob2020 lnrank2020_ if rank2020<=4, r

From there, I get a β for each region and each cutoff, but I don't know how to get the average of the β's for each region (by hand, it would take a long time because there are 50 urban regions).

I don't know whether I need a loop, and if so, how to write it. I tried:
forvalues i = 2/4 {
reg lnpob2020_sun lnrank2020 if rank2020 == `i', r
local beta = _b[lnrank2020]
local betas "`betas' `beta'"
}

I did not get the expected result. Any recommendation?
Thank you,
Diego

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17734

28 Jun 2023, 00:08

Diego:
welcome to this forum.
It would seem a task for -statsby-:

Code:

. use "https://www.stata-press.com/data/r17/nlswork.dta"
. statsby, by(year): regress ln_wage age if year<=71
(running regress on estimation sample)

      Command: regress ln_wage age if year<=71
           By: year

Statsby groups
----+--- 1 ---+--- 2 ---+--- 3 ---+--- 4 ---+--- 5
....

. egen average_cons=mean( _b_cons)

. egen average_slope=mean( _b_age )

. list

     +--------------------------------------------------+
     | year     _b_age    _b_cons   averag~s   averag~e |
     |--------------------------------------------------|
  1. |   68   .0515506   .3629596   .5123854   .0458598 |
  2. |   69   .0451916   .5636937   .5123854   .0458598 |
  3. |   70   .0438581   .5435218   .5123854   .0458598 |
  4. |   71   .0428387   .5793663   .5123854   .0458598 |
     +--------------------------------------------------+

.

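Remember to save your original dataset before running -statsby-: unless you specify the saving() option, it replaces the data in memory with the collected results.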

Last edited by Carlo Lazzaro; 28 Jun 2023, 00:12.

Kind regards,
Carlo
(Stata 19.0)


Diego PF

Join Date: Jun 2023

Posts: 6
#3

28 Jun 2023, 05:47

Thanks a lot, Carlo
This command has been really useful. However, the average slope I get comes from a single condition (year <= 71 in your example; rank <= 4 in my case). I need the average of the slopes estimated with the top 2, top 3, and top 4 cities.

Do you recommend running the same procedure three times (for rank <= 2, 3, and 4) and then averaging the results, or is there a way to do this in a loop? I tried:
forvalues i = 1/74 {
forvalues j = 2/4 {
statsby, by(id_sun): regress lnpob2020 lnrank2020_ if (id_sun == `i') & (rank2020 <= `j'), r
}
}

*Here, it is not possible to continue after the first iteration, because -statsby- replaces the dataset and rank2020 disappears. Then, I tried:
forvalues i = 1/74 {
forvalues j = 2/4 {
bysort id_sun: regress lnpob2020 lnrank2020_ if (id_sun == `i') & (rank2020 <= `j'), r
}
}
egen average_slope=mean( lnrank2020_ )
by id_sun: egen average=mean( lnrank2020_ )

However, the average obtained is not correct (I did it manually for two cities).

Thanks for your support!
Best,
Diego
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17734
#4

28 Jun 2023, 08:36

Diego:
I think that the safest procedure is repeating -statsby- three times.

Kind regards,
Carlo
(Stata 19.0)
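The "three times" approach suggested above could be sketched as follows. This is a minimal sketch, not a tested solution for the original data: it assumes the region identifier is cve_sun (the later posts use id_sun), that "inverse" means the reciprocal of the absolute value of the average slope, and it introduces the names beta, beta_mean and polycentricity for illustration. It relies on the saving() option of -statsby-, which writes the results to a file and leaves the data in memory untouched, so rank2020 does not disappear between runs.

```stata
* Run as a single do-file: tempfiles and locals vanish when the run ends.
tempfile b2 b3 b4 poly

* One -statsby- run per cutoff (top 2, top 3, top 4 cities).
* The robust option is dropped because it changes only the standard
* errors, not the slope that is collected here.
forvalues j = 2/4 {
    statsby beta = _b[lnrank2020_], by(cve_sun) saving("`b`j''", replace): ///
        regress lnpob2020 lnrank2020_ if rank2020 <= `j'
}

* Stack the three sets of slopes and average them within region.
preserve
use "`b2'", clear
append using "`b3'" "`b4'"
collapse (mean) beta_mean = beta, by(cve_sun)

* Assumed definition: reciprocal of the absolute average slope.
generate polycentricity = 1 / abs(beta_mean)
save "`poly'"
restore

* Attach the region-level result to the original city-level data.
merge m:1 cve_sun using "`poly'", nogenerate
```

Two caveats are worth checking against the data. A region with a single city cannot be fitted, so its slope (and hence its polycentricity) should come out missing. A region with only two or three cities yields the same slope for the larger cutoffs, because the condition rank2020 <= 4 then selects the same cities as the smaller cutoff.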