Could someone kindly help with plotting a graph showing trends over time? This is a study looking at the age-adjusted prevalence of asthma from 2005/6 to 2017/18. I am trying to create a trend graph with proportion on the y-axis and year on the x-axis, with a line connecting the proportions for each cycle. Also, how does one plot trends stratified by a third variable, race for example?

I'm using the NHANES database and, because it uses 2-year cycles, I'm unclear how to label the x-axis to reflect them.

I would be grateful if anyone in this community could help.

Rows: row number

Year: NHANES cycle years

Proportion: national estimate per cycle

SE: Standard error

low: Lower limit of CI

high: Upper limit of CI

Code:

* Example generated by -dataex-. To install: ssc install dataex
clear
input byte Rows str7 Year double(Proportion SE low high)
1 "2005-6"  12.47344 1.391377 9.715774  15.2311
2 "2007-8"  11.66398 .6140856 10.44689 12.88108
3 "2009-10" 13.22848 1.052125  11.1432 15.31376
4 "2011-12"  13.7035 1.013246 11.69528 15.71172
5 "2013-14" 14.25656 .8406718 12.59038 15.92275
6 "2015-16" 16.80714 1.920966 12.99985 20.61443
7 "2017-18" 19.08004 1.282349 16.53846 21.62161
end

Code tried:

set scheme s1mono

encode Year, gen(year)

twoway (rcap low high Rows, vert) (scatter Proportion year, sort connect(l))
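One possible approach, offered as a sketch rather than a definitive answer: since encode attaches the cycle strings as value labels, the x-axis can be labelled with those labels through xlabel(, valuelabel). The axis titles and legend text below are my own assumptions (including reading Proportion as a percentage), not something taken from the study.

```stata
* Sketch: trend plot with the 2-year NHANES cycles as x-axis labels
clear
input byte Rows str7 Year double(Proportion SE low high)
1 "2005-6"  12.47344 1.391377 9.715774  15.2311
2 "2007-8"  11.66398 .6140856 10.44689 12.88108
3 "2009-10" 13.22848 1.052125  11.1432 15.31376
4 "2011-12"  13.7035 1.013246 11.69528 15.71172
5 "2013-14" 14.25656 .8406718 12.59038 15.92275
6 "2015-16" 16.80714 1.920966 12.99985 20.61443
7 "2017-18" 19.08004 1.282349 16.53846 21.62161
end

* encode sorts the strings alphabetically, which here matches chronological order
encode Year, gen(year)

* Use the same x variable (year) for both the CI caps and the connected line
twoway (rcap low high year)                      ///
       (connected Proportion year, sort),        ///
       xlabel(1/7, valuelabel angle(45))         ///
       xtitle("NHANES cycle")                    ///
       ytitle("Age-adjusted prevalence (%)")     ///
       legend(order(2 "Estimate" 1 "95% CI"))
```

For trends stratified by race, one option would be to keep the estimates in long form with one row per cycle and race group (a hypothetical variable race), and overlay one connected plot per group with an if race == ... qualifier, or use the by(race) option for separate panels.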


However, a difference here is that I would like to compare the coefficients on the endogenous variable across models.

It seems like this might be related to how Stata stores estimation results, but I don't know how to get around it. I may also be setting up the GMM incorrectly. I'm attaching three cases: (a) the comparison runs fine because I'm not conditioning on a specific sample; (b) the comparison does not run because I am conditioning; and (c) the comparison does not run when I reshape the data, as I am told that I have fewer observations than parameters. I don't think that is right; I suspect it is related to missing data, since with other datasets I was told that a positive definite matrix could not be reached. But again, I might just be missing something about the gmm command.

Any help would be much appreciated!

Code:

* This works
cls
sysuse auto, clear
ivregress 2sls mpg (turn = weight), robust
ivregress 2sls mpg (turn = length), robust
gmm (eq1: mpg - {b1}*turn - {b0}) ///
    (eq2: mpg - {c1}*turn - {c0}), ///
    instruments(eq1: weight) ///
    instruments(eq2: length) ///
    onestep winitial(unadjusted, indep)
test [b1]_cons = [c1]_cons

* This does not work
cls
sysuse auto, clear
ivregress 2sls mpg (turn = weight) if foreign==1, robust
ivregress 2sls mpg (turn = weight) if foreign==0, robust
gmm (eq1: mpg - {b1}*turn - {b0}) ///
    (eq2: mpg - {c1}*turn - {c0}), ///
    instruments(eq1: weight) ///
    instruments(eq2: weight) ///
    onestep winitial(unadjusted, indep)
test [b1]_cons = [c1]_cons

* Nor does this
cls
sysuse auto, clear
reshape wide mpg turn weight, i(make) j(foreign)
ivregress 2sls mpg0 (turn0 = weight0), robust
ivregress 2sls mpg1 (turn1 = weight1), robust
gmm (eq1: mpg0 - {b1}*turn0 - {b0}) ///
    (eq2: mpg1 - {c1}*turn1 - {c0}), ///
    instruments(eq1: weight0) ///
    instruments(eq2: weight1) ///
    onestep winitial(unadjusted, indep)
test [b1]_cons = [c1]_cons
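A possible workaround for case (b), as a sketch only: in that example both gmm equations are identical and use the full sample, so the stacked moment conditions are duplicates. One way to condition on a subsample inside a single gmm call is to multiply each residual by a group indicator, so that each equation's moments are nonzero only for its own group. The indicator names fgn and dom are my own, and the test syntax follows the original post (newer Stata versions may prefer _b[/b1]).

```stata
* Sketch: subsample-specific IV coefficients in one stacked gmm call
sysuse auto, clear

* Group indicators (hypothetical names)
gen byte fgn = foreign == 1
gen byte dom = foreign == 0

* Each residual is zeroed outside its own group, so eq1 reproduces the
* IV moment conditions for foreign cars and eq2 those for domestic cars
gmm (eq1: fgn*(mpg - {b1}*turn - {b0}))  ///
    (eq2: dom*(mpg - {c1}*turn - {c0})), ///
    instruments(eq1: weight)             ///
    instruments(eq2: weight)             ///
    onestep winitial(unadjusted, indep)

* Compare the coefficients on the endogenous variable across groups
test [b1]_cons = [c1]_cons
```

Because each equation is just-identified, the point estimates should coincide with the two ivregress 2sls fits run with if foreign==1 and if foreign==0, which gives a quick way to check the setup.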

We are interested in estimating the impact of elite college attendance on expected wage income. Abstracting from other factors, suppose we have omitted an important variable, ability (this may capture both physical and intellectual characteristics of a person). For this exercise, rather than using actual data, simulate data for 900 respondents. For this analysis I need to proceed with these steps.

• Generate variable ability as follows: ability = 0 (poor) for the first 300 respondents, ability = 1 (moderate) for the observations in 301-600, and ability = 2 (talented) in observations 601-900.

• Create an indicator (dummy) variable for those people who live near an elite college. This will be used as an instrument. [Hint: You can use the following command: gen nearcol = mod(_n, 2)].

• Generate a college attendance dummy variable, such that the person attends a college if he/she is talented or has a moderate ability and lives near an elite college.

• Generate a white noise variable εi, following a normal distribution with a standard deviation of 0.01 and a mean of zero, εi ∼ N(0, 0.1).

• Generate an income variable such that the true return to ability is 1 and the return to college is 2. Add the white noise to the constructed income variable so that income has a small random component [Hint: Incomei = 1 ∗ Abilityi + 2 ∗ AttendCollegei + εi ].

- Regress income on college attendance and ability and verify that OLS uncovers the data generating process you created.
- Now suppose ability is unobservable to the econometrician but remains an important determinant of income (still has a return of 1). Regress income on college attendance and state your conclusions.
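The steps above could be sketched roughly as follows. Several choices here are assumptions on my part: the seed, the variable names college and eps, reading the attendance rule as "talented, or (moderate ability and near an elite college)", and using a standard deviation of 0.01 as stated in words (the notation N(0, 0.1) in the task suggests a different value, so adjust as needed).

```stata
* Sketch of the simulation; names, seed and sd are assumptions
clear
set seed 12345
set obs 900

* Ability: 0 for obs 1-300, 1 for obs 301-600, 2 for obs 601-900
gen byte ability = cond(_n <= 300, 0, cond(_n <= 600, 1, 2))

* Instrument: lives near an elite college (alternating observations)
gen byte nearcol = mod(_n, 2)

* Attends college if talented, or moderate ability and near a college
gen byte college = (ability == 2) | (ability == 1 & nearcol == 1)

* White noise with mean 0 and sd 0.01 (rnormal takes the sd)
gen double eps = rnormal(0, 0.01)

* Income with true returns of 1 to ability and 2 to college
gen double income = 1*ability + 2*college + eps

* OLS with ability observed: should recover roughly 2 and 1
regress income college ability

* OLS with ability omitted: the college coefficient absorbs part of ability
regress income college
```

With this design the short regression should give a college coefficient of roughly 3.33 rather than 2, since college attendees have mean ability about 1.33 higher than non-attendees. That gap is the omitted variable bias the exercise is asking about.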

Need some help on how to tag a dataset's observations based on its histogram.

Code:

. webuse auto
(1978 Automobile Data)

. histogram price, frequency addlabel
(bin=8, start=3291, width=1576,875)

I went a little further:

Code:

. frame create histogram

. frame histogram: {
.     serset use
.     gen bin = _n
.     list

     +----------------------------------------+
     | __000009   __00000A   __00000B   bin   |
     |----------------------------------------|
  1. |       35          0    4.079,4     1   |
  2. |       21          0    5.656,3     2   |
  3. |        4          0    7.233,2     3   |
  4. |        2          0    8.810,1     4   |
  5. |        4          0     10.387     5   |
     |----------------------------------------|
  6. |        3          0     11.964     6   |
  7. |        3          0     13.541     7   |
  8. |        2          0     15.118     8   |
  9. |        .          0      3.291     9   |
     +----------------------------------------+
. }

thks, Luis (Stata MP 16.1)
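A sketch of one way to do the tagging, assuming the goal is a variable holding the bin number of each observation: histogram leaves the number of bins, the start value and the bin width in r(), so the bin index can be computed directly from price. The variable name bin is my own, and I use sysuse auto and nodraw here only so that the example runs without a download or a graph window.

```stata
* Sketch: tag each observation with the histogram bin it falls into
sysuse auto, clear

* Same histogram as in the post; nodraw suppresses the graph
histogram price, frequency addlabel nodraw

* Bin settings stored by histogram
local start = r(start)
local width = r(width)
local nbins = r(bin)

* Bin index: 1 for [start, start+width), and so on;
* min() puts the maximum price into the last bin
gen int bin = min(floor((price - `start')/`width') + 1, `nbins')

* Frequencies per bin, for comparison with the bar labels
tabulate bin
```

The tabulated frequencies should line up with the heights listed from the serset (35, 21, 4, ...), which is a simple check that the tagging matches the graph.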


I am dealing with panel data on banks over 63 time periods (N = 4736, T = 63). I used the Hausman test to determine that I needed a FE model. But this is where the trouble and my questions begin! I ran a basic xtreg, fe model and got this result:

Code:

xtreg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, fe

Fixed-effects (within) regression               Number of obs     =    298,355
Group variable: cert                            Number of groups  =      4,736

R-sq:                                           Obs per group:
     within  = 0.0179                                         min =         62
     between = 0.0242                                         avg =       63.0
     overall = 0.0227                                         max =         63

                                                F(8,293611)       =     670.17
corr(u_i, Xb)  = -0.0196                        Prob > F          =     0.0000

------------------------------------------------------------------------------
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168   .0420687     9.97   0.000     .3368634    .5017702
   lnassetsq |  -.0071421   .0016294    -4.38   0.000    -.0103357   -.0039485
     diverse |   .0000122    .000058     0.21   0.834    -.0001015    .0001259
    leverage |   -.013206   .0004068   -32.46   0.000    -.0140033   -.0124087
      eeffqr |  -.0001518   8.33e-06   -18.23   0.000    -.0001681   -.0001355
       DGS10 |   .0478444   .0021882    21.86   0.000     .0435557    .0521332
CPIAUCSL_PCH |   .0676272   .0032808    20.61   0.000     .0611969    .0740575
   GDPC1_PC1 |   .0310941   .0008967    34.68   0.000     .0293366    .0328516
       _cons |  -1.795671   .2740217    -6.55   0.000    -2.332746   -1.258596
-------------+----------------------------------------------------------------
     sigma_u |  1.7745564
     sigma_e |  .99459682
         rho |  .76095759   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4735, 293611) = 196.07              Prob > F = 0.0000

Code:

xttest3

Modified Wald test for groupwise heteroskedasticity
in fixed effect regression model

H0: sigma(i)^2 = sigma^2 for all i

chi2 (4736)  =     379.43
Prob>chi2 =      1.0000

Now, running xtserial, I get:

Code:

xtserial zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1

Wooldridge test for autocorrelation in panel data
H0: no first-order autocorrelation
    F(  1,    4735) =    226.237
           Prob > F =      0.0000

Secondly, if I use vce(robust), why do I get such different significance levels for some of my variables when I run the model with areg rather than xtreg? My impression was that the two commands are similar enough that the results should not differ by much.

Results from areg (note that cert is just a unique identifier for each individual bank):

Code:

areg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, a(cert) vce(robust)

Linear regression, absorbing indicators         Number of obs     =    298,355
Absorbed variable: cert                         No. of categories =      4,736
                                                F(   8, 293611)   =     343.16
                                                Prob > F          =     0.0000
                                                R-squared         =     0.7691
                                                Adj R-squared     =     0.7654
                                                Root MSE          =     0.9946

------------------------------------------------------------------------------
             |               Robust
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168   .0686047     6.11   0.000     .2848535    .5537801
   lnassetsq |  -.0071421   .0025061    -2.85   0.004    -.0120541   -.0022301
     diverse |   .0000122   .0000432     0.28   0.778    -.0000725    .0000969
    leverage |   -.013206   .0102436    -1.29   0.197    -.0332831    .0068711
      eeffqr |  -.0001518   .0001045    -1.45   0.146    -.0003566     .000053
       DGS10 |   .0478444   .0046113    10.38   0.000     .0388064    .0568824
CPIAUCSL_PCH |   .0676272   .0041492    16.30   0.000     .0594948    .0757596
   GDPC1_PC1 |   .0310941   .0011823    26.30   0.000     .0287769    .0334113
       _cons |  -1.795671   .4056216    -4.43   0.000    -2.590678   -1.000664
------------------------------------------------------------------------------

Code:

xtreg zscore lnasset lnassetsq diverse leverage eeffqr DGS10 CPIAUCSL_PCH GDPC1_PC1, fe vce(robust)

Fixed-effects (within) regression               Number of obs     =    298,355
Group variable: cert                            Number of groups  =      4,736

R-sq:                                           Obs per group:
     within  = 0.0179                                         min =         62
     between = 0.0242                                         avg =       63.0
     overall = 0.0227                                         max =         63

                                                F(8,4735)         =     184.31
corr(u_i, Xb)  = -0.0196                        Prob > F          =     0.0000

                                 (Std. Err. adjusted for 4,736 clusters in cert)
------------------------------------------------------------------------------
             |               Robust
      zscore |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     lnasset |   .4193168    .134761     3.11   0.002     .1551224    .6835111
   lnassetsq |  -.0071421   .0051472    -1.39   0.165    -.0172331    .0029488
     diverse |   .0000122   .0000445     0.27   0.784     -.000075    .0000994
    leverage |   -.013206   .0102906    -1.28   0.199    -.0333804    .0069684
      eeffqr |  -.0001518   .0001059    -1.43   0.152    -.0003594    .0000558
       DGS10 |   .0478444   .0067429     7.10   0.000     .0346253    .0610636
CPIAUCSL_PCH |   .0676272   .0037723    17.93   0.000     .0602317    .0750227
   GDPC1_PC1 |   .0310941   .0014933    20.82   0.000     .0281667    .0340216
       _cons |  -1.795671   .8540467    -2.10   0.036        -3.47   -.1213422
-------------+----------------------------------------------------------------
     sigma_u |  1.7745564
     sigma_e |  .99459682
         rho |  .76095759   (fraction of variance due to u_i)
------------------------------------------------------------------------------
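One likely reason for the difference, offered as a sketch rather than a diagnosis of this particular dataset: the xtreg, fe vce(robust) output above reports standard errors "adjusted for 4,736 clusters in cert", so it is clustering on the panel variable, whereas areg with vce(robust) only corrects for heteroskedasticity. Asking areg for vce(cluster cert) should bring the two much closer. The example below uses the auto data with rep78 as a stand-in group variable, purely as an illustration; it is not the bank panel.

```stata
* Sketch: xtreg, fe vce(robust) versus areg with robust and cluster VCEs
* (auto data with rep78 as a stand-in "panel" identifier)
sysuse auto, clear
xtset rep78

* xtreg treats vce(robust) as clustering on the panel variable
xtreg mpg weight, fe vce(robust)
scalar b_xt  = _b[weight]
scalar se_xt = _se[weight]

* areg with vce(robust): heteroskedasticity-robust only, no clustering
areg mpg weight, absorb(rep78) vce(robust)
scalar se_areg_rob = _se[weight]

* areg with clustering on the absorbed variable: comparable to xtreg
areg mpg weight, absorb(rep78) vce(cluster rep78)
scalar b_areg     = _b[weight]
scalar se_areg_cl = _se[weight]

display "xtreg robust SE:  " se_xt
display "areg robust SE:   " se_areg_rob
display "areg cluster SE:  " se_areg_cl
```

The coefficients are the same in all three fits; only the standard errors change. The clustered areg and xtreg standard errors may still differ somewhat because the two commands apply different degrees-of-freedom adjustments.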
