Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • collapse municipality-level data and calculating a ratio

    Dear Statalisters,

    For better understanding and learning STATA, I would like to ask you the following question. I have a municipality-level panel dataset containing some information about the kindergarten services offered by each municipalities, like the following:

    Code:


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int code_province double code_municipality float year double(children nr_seats) float(pop0_2 pop_tot)
    1 1008 2004 71 75 437 16752
    1 1008 2005 73 75 416 16769
    1 1008 2006 73 75 392 16726
    1 1008 2007 73 75 391 16760
    1 1008 2008 74 90 411 17029
    1 1008 2009 87 90 415 16998
    end
    In particular, I'm interested in the ratio between nr_seats and pop0_2, that is the ratio between available kindergarten seats and population aged 0-2 (the kindergarten coverage rate), by municipalities-year.
    However, because many municipalities are too small to offer any kindergarten service and their demand for childcare service is basically served by the closest large municipality, I'm interested in aggregating those data at a higher geographical level, that in my case is the province level (code_province).

    For this purpose, I've tried 2 different codes in STATA, each one giving me a different result:

    First code:

    Code:
    . preserve
    
    . gen coverage=nr_seats/pop0_2
    (480 missing values generated)
    
    . recode coverage(.=0) if nr_seats==0|pop0_2==0
    (coverage: 480 changes made)
    
    . collapse coverage, by(code_province year)
    
    . sum coverage
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
        coverage |      1,320    .0502159    .0455661          0   .2749822
    
    . restore
    Second code:

    Code:
    . preserve
    
    . collapse nr_seats pop0_2, by(code_province year)
    
    . gen coverage=nr_seats/pop0_2
    
    . sum coverage
    
        Variable |        Obs        Mean    Std. Dev.       Min        Max
    -------------+---------------------------------------------------------
        coverage |      1,320    .0844444    .0592051          0   .2932529
    
    . restore

    as I understand it, however, the two codes should not give such a different results; indeed the first one is giving the average coverage rate by province, weighting all the municipalities belonging to the province the same, regardless of the population in each. The second one is calculating the average nr. of seats and the average nr. of kids aged 0-2 by province, again without any weighting, and then is calculating the ratio between the two.

    Could some one please give me some insights on why the two codes give such a different results?

    Thank you so much,

    Chiara
    Last edited by Chiara Puccioni; 15 Nov 2019, 01:48.

  • #2
    Chiara:
    welcome to this forum.
    I would propose a different take, since you have a panel dataset:
    Code:
    xtset code_province year
    xtpoisson nr_seats children i.year, exp( pop0_2 ) irr
    Kind regards,
    Carlo
    (Stata 18.0 SE)

    Comment


    • #3
      Thank you Carlo,

      but I am not sure this will solve the problem. I apologise in advance if I didn't understand correctly the idea of xtpoisson but I cannot set
      as panel var the provinces unless I first collapse the dataset; indeed if I try I obtain the following error:

      Code:
      repeated time values within panel
      Secondly, maybe I didn't express well the problem: my final goal is to evaluate if a certain policy did have an impact on the kindergarten coverage rate by province, and in particular in my analysis I would use a Difference in Difference model. As such, I need to first construct my outcome variable at province level, that is the kindergarten coverage rate by province-year.

      Best,

      Chiara

      Comment

      Working...
      X