Dear Stata members
I would like to tabulate total unique firms and total firm-years based on sample. I will illustrate this with an example
* Example generated by -dataex-. For more info, type help dataex
When we use this data, Stata says there are 5 variables and 21 observations. However when we regress sales on advt, the no: of observations is only 11 and there are 7 groups
I would like to present this information that is total firm-years (observations) are 11 and total unique firms are 7
I surfed Stata, and found post https://www.statalist.org/forums/for...84#post1639484 by Nick Cox.
I followed the same
So 1+6+4=11, I could get firm-year observations, but how to get the number 7 which is the unique number of firms in each of these countries? I tried various ways but somehow I couldnt tell Stata to count the distinct firms that meet the firm-year criteria. I know it is 3 from UK, 3 From US, and 1 from Japan. But how to count/create that var?
I would like to tabulate total unique firms and total firm-years based on sample. I will illustrate this with an example
* Example generated by -dataex-. For more info, type help dataex
Code:
clear input str1 id str5 country int(year sales) byte advt "a" "UK" 2001 50 12 "a" "UK" 2002 90 14 "a" "UK" 2003 118 49 "b" "UK" 2001 . . "b" "UK" 2002 . . "b" "UK" 2003 82 55 "c" "UK" 2001 98 65 "c" "UK" 2002 . . "c" "UK" 2003 71 45 "d" "USA" 2001 . . "d" "USA" 2002 22 9 "d" "USA" 2003 78 . "e" "USA" 2001 39 18 "e" "USA" 2002 . . "e" "USA" 2003 98 33 "f" "USA" 2001 78 . "f" "USA" 2002 69 29 "f" "USA" 2003 . . "g" "Japan" 2001 . . "g" "Japan" 2002 . . "g" "Japan" 2003 100 65 end
Code:
encode id, gen (id2)
. xtset id2 year
Panel variable: id2 (strongly balanced)
Time variable: year, 2001 to 2003
Delta: 1 unit
. xtreg sales advt, fe vce(r)
Fixed-effects (within) regression Number of obs = 11
Group variable: id2 Number of groups = 7
R-squared: Obs per group:
Within = 0.6930 min = 1
Between = 0.6427 avg = 1.6
Overall = 0.4994 max = 3
F(1,6) = 24.58
corr(u_i, Xb) = -0.6179 Prob > F = 0.0026
(Std. err. adjusted for 7 clusters in id2)
------------------------------------------------------------------------------
| Robust
sales | Coefficient std. err. t P>|t| [95% conf. interval]
-------------+----------------------------------------------------------------
advt | 1.616037 .3259711 4.96 0.003 .8184149 2.41366
_cons | 18.20739 11.67569 1.56 0.170 -10.362 46.77678
-------------+----------------------------------------------------------------
sigma_u | 20.00314
sigma_e | 21.317105
rho | .46823257 (fraction of variance due to u_i)
------------------------------------------------------------------------------
I surfed Stata, and found post https://www.statalist.org/forums/for...84#post1639484 by Nick Cox.
I followed the same
Code:
egen tag = tag(id country sales advt )
egen ndistinct = total(tag), by( country )
tabdisp country , c(ndistinct )
----------------------
country | ndistinct
----------+-----------
Japan | 1
UK | 6
USA | 4

Comment