count by

Paris Rira

Join Date: Dec 2022
Posts: 384

29 Apr 2023, 13:09

Hi dear profs and colleagues,

I would like to structure my dataset so that the following statement holds. Please share your ideas with me. Thanks.
'The unit of observation is one region in one year'

year: 2010-2020
region: 7 regions exist
firm ID: NPC_FIC. Each ID appears at most once per year, but the same firm can appear in several years.

Code:

clear
input double(year NPC_FIC) float region
2010 500135017 1
2019 501301917 1
2010 501833633 1
2020 501102337 1
2010 501022911 1
2014 502207708 2
2011 501129116 2
2012 501077767 2
2012 502230825 2
2018 501081500 2
2019 501346223 2
2017 501023486 2
2011 501829556 2
2016 501205066 2
2020 501170028 2
2018 501032576 3
2011 501031781 3
2020 501179930 3
2011 502216695 3
2011 501273750 3
2016 502228955 3
2010 502485654 3
2011 500985340 3
end

Cheers,
Paris


Carlo Lazzaro

Join Date: Apr 2014

Posts: 17741
#2

30 Apr 2023, 05:32

Paris:
I am not sure I got your question right, so please consider what follows as a tentative answer:

Code:

. egen wanted=group( region year)
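
To illustrate what this command does, here is a minimal sketch on six rows taken from the example in #1. It assumes the aim is only to label region-year cells; -cell_size- and -first_in_cell- are hypothetical helper names, not variables from the original dataset.

```stata
clear
input double(year NPC_FIC) float region
2010 500135017 1
2010 501833633 1
2019 501301917 1
2012 501077767 2
2012 502230825 2
2018 501032576 3
end

* Label each region-year cell with a running integer
egen wanted = group(region year)

* Hypothetical helper: number of firm rows in each region-year cell
bysort wanted: generate cell_size = _N

* Hypothetical helper: flag exactly one row per region-year cell
egen byte first_in_cell = tag(region year)

* Number of distinct region-year cells versus number of rows
count if first_in_cell
count
list, sepby(wanted)
```

Note that -group()- only labels the cells: the data still hold one row per firm and year, so a region-year panel needs a further aggregation step.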

Kind regards,
Carlo
(Stata 19.0)
Paris Rira

Join Date: Dec 2022

Posts: 384
#3

30 Apr 2023, 08:06

Thank you so much, Carlo.
Paris Rira

Join Date: Dec 2022

Posts: 384
#4

30 Apr 2023, 09:23

I am going to estimate a panel-data model in which 'the unit of observation is one region in one year'.
Having done this:

Code:

egen wanted=group( region year)

Should I follow the same procedure for the rest of the variables in the panel as well? Here the dependent variable is the number of firms (n_firms), and the explanatory variable is the immigrant share (immi_sh).
Is this what I should do?

Code:

egen n_firms = count(NPC_FIC), by(region year)
egen immi_sh_year = sum(immi_sh), by(region year)

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17741
#5

30 Apr 2023, 10:13

Paris:
If you have panel data, you have a sample of units (panels) that are measured on the very same variables at (theoretically) equally spaced time intervals.
Therefore, each region is measured on the same set of variables each year.
That said, I created a -depvar- and propose the following answer to your question:

Code:

. g depvar=runiform()*100000

. xtset region year
repeated time values within panel
r(451);

. xtset region

Panel variable: region (unbalanced)

. xtreg depvar  i.year, fe

Fixed-effects (within) regression               Number of obs     =         23
Group variable: region                          Number of groups  =          3

R-squared:                                      Obs per group:
     Within  = 0.4308                                         min =          5
     Between = 0.1999                                         avg =        7.7
     Overall = 0.2672                                         max =         10

                                                F(8,12)           =       1.14
corr(u_i, Xb) = -0.3574                         Prob > F          =     0.4068

------------------------------------------------------------------------------
      depvar | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
        year |
       2011  |   39895.72   24782.69     1.61   0.133    -14101.13    93892.57
       2012  |   26147.39   32056.69     0.82   0.431    -43698.14    95992.91
       2014  |   69350.06   38210.67     1.81   0.095    -13903.84      152604
       2016  |   65855.42   30179.24     2.18   0.050     100.4962    131610.3
       2017  |   2703.408   38210.67     0.07   0.945    -80550.49     85957.3
       2018  |   23746.54   30179.24     0.79   0.447    -42008.38    89501.47
       2019  |   43534.61   26814.96     1.62   0.130    -14890.17    101959.4
       2020  |   36116.05   24273.75     1.49   0.163    -16771.92    89004.02
             |
       _cons |   14984.22   18857.21     0.79   0.442    -26102.11    56070.55
-------------+----------------------------------------------------------------
     sigma_u |  18427.016
     sigma_e |  29408.298
         rho |  .28192802   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(2, 12) = 1.43                       Prob > F = 0.2768

.

You obviously have to add more predictors to the right-hand side of your regression equation.
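
The r(451) error from -xtset region year- above seems to arise because the firm-level data hold several rows for the same region and year. A minimal sketch of how one might check this, on six rows taken from the example in #1 (the subset is chosen here only for illustration):

```stata
clear
input double(year NPC_FIC) float region
2010 500135017 1
2010 501833633 1
2019 501301917 1
2012 501077767 2
2012 502230825 2
2018 501032576 3
end

* Firm-level data: region and year together do not identify rows
duplicates report region year

* -xtset- with a time variable therefore stops with r(451)
capture noisily xtset region year
display "return code: " _rc

* In this subset, firm ID and year together do identify rows
isid NPC_FIC year
```

Declaring only the panel variable, as in -xtset region-, avoids the error but leaves the data at the firm level; the regression then treats each firm row as a separate observation within its region.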

Kind regards,
Carlo
(Stata 19.0)


Paris Rira

Join Date: Dec 2022

Posts: 384
#6

30 Apr 2023, 10:34

Prof Carlo, Thank you for the clarification.

To aggregate ("sum firms in the same region"), should I -collapse- by region, or does your approach already aggregate the data?
I ask because the main point is to estimate at the aggregated district-year level.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17741
#7

30 Apr 2023, 11:01

Paris:
First, Carlo is enough. Thanks.
Second, you need to -collapse-.
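
A minimal sketch of what the -collapse- step could look like, on six rows taken from the example in #1. Two assumptions here are not in the thread: -immi_sh- is filled with placeholder random values (as was done for -depvar- earlier in the thread), and it is averaged within each region-year. Whether a mean, a sum, or a weighted share is appropriate depends on how -immi_sh- is defined in the real data.

```stata
clear
input double(year NPC_FIC) float region
2010 500135017 1
2010 501833633 1
2019 501301917 1
2012 501077767 2
2012 502230825 2
2018 501032576 3
end

* ASSUMPTION: placeholder values; the real immi_sh comes from your data
set seed 2023
generate immi_sh = runiform()

* One row per region-year: count firms, average the share (assumed choice)
collapse (count) n_firms = NPC_FIC (mean) immi_sh, by(region year)

* (region, year) now identifies rows, so -xtset- accepts both variables
isid region year
xtset region year
list, sepby(region)
```

Unlike -egen ..., by(region year)-, which keeps every firm row and repeats the cell value on each of them, -collapse- replaces the data in memory with the aggregated file, so it may be wise to save the firm-level data first.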

Last edited by Carlo Lazzaro; 30 Apr 2023, 11:04.

Kind regards,
Carlo
(Stata 19.0)
