Scatter plot with fixed effects

Alessandro Sovera

Join Date: Jun 2016

Posts: 67
#1

19 Jul 2019, 06:52

I am trying to draw a scatter plot of two variables after taking fixed effects (state and year) into account. The idea is to analyse the correlation between gonorrhea rates and broadband diffusion, so I ran the following regression:

Code:

tsset state year
xtreg gonorrhea internet i.year, fe

Then I collected the residuals using

Code:

predict gonorrhea_res, e

and I plotted the residuals against broadband by state, running

Code:

scatter internet gonorrhea_res, by(state)

Is this the right way to proceed? I tried to follow some other sources but I am not 100% sure about the code I wrote here.

Last edited by Alessandro Sovera; 19 Jul 2019, 07:22.
Tags: fixed effects, graph, panel data, scatterplot
Alessandro Sovera

Join Date: Jun 2016

Posts: 67
#2

20 Jul 2019, 06:53

UPDATE:

I realized I probably made a mistake in my code above. After the regression, I think I should run

Code:

predict gonorrhea_hat, xb

but in this case I would obtain the fitted values of gonorrhea, my dependent variable, computed as b_1*ratio + b_2*state_1+...+b_x*year_1+... Instead, my plan was to discount for the fixed effect and scatter plot ratio vs gonorrhea.

How should I proceed?

Carlo Lazzaro

Join Date: Apr 2014
Posts: 17712
#3

20 Jul 2019, 07:25

Alessandro:
I've read your original post more than once, but I'm still not clear about what you're after (and I suspect this is why you have not received any reply so far).
That said, I assume you want to visually inspect whether your regression model suffers from heteroskedasticity.
Maybe something along the following lines can help:

Code:

. use "http://www.stata-press.com/data/r15/nlswork.dta"
. xtreg ln_wage age, fe

Fixed-effects (within) regression               Number of obs     =     28,510
Group variable: idcode                          Number of groups  =      4,710

R-sq:                                           Obs per group:
     within  = 0.1026                                         min =          1
     between = 0.0877                                         avg =        6.1
     overall = 0.0774                                         max =         15

                                                F(1,23799)        =    2720.20
corr(u_i, Xb)  = 0.0314                         Prob > F          =     0.0000

------------------------------------------------------------------------------
     ln_wage |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .0181349   .0003477    52.16   0.000     .0174534    .0188164
       _cons |   1.148214   .0102579   111.93   0.000     1.128107     1.16832
-------------+----------------------------------------------------------------
     sigma_u |  .40635023
     sigma_e |  .30349389
         rho |  .64192015   (fraction of variance due to u_i)
------------------------------------------------------------------------------
F test that all u_i=0: F(4709, 23799) = 8.81                 Prob > F = 0.0000

. predict fitted, xb

. predict idiosyncratic, e

. scatter idiosyncratic fitted

. scatter idiosyncratic age

I would also check whether your model with two predictors only is correctly specified.

Kind regards,
Carlo
(Stata 19.0)


Alessandro Sovera

Join Date: Jun 2016

Posts: 67
#4

21 Jul 2019, 01:51

Thank you Carlo.

The fact is that it is not 100% clear to me either. I have been asked to draw a scatter plot of the two variables, x and y, after taking state and year fixed effects into account, and the code above is what I came up with. But it seems quite weird to me to run a regression and then check for correlation.

I think this should be some kind of preliminary analysis to see whether it is worth moving on with the project.
Wouter Wakker

Join Date: Nov 2018

Posts: 621
#5

21 Jul 2019, 02:32

You might want to have a look at avplot, which I think does what you want. The only problem is that it only works after regress. But you can change

Code:

xtreg gonorrhea internet i.year, fe

to

Code:

reg gonorrhea internet i.year i.state

and then use avplot.
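Put together, the sequence might look like the sketch below. It assumes the variable names from post #1 (gonorrhea, internet, state, year) and that state is stored as a numeric variable, since factor-variable notation does not accept strings.

```stata
* Sketch only: variable names are taken from post #1
* State and year fixed effects enter as dummies
reg gonorrhea internet i.year i.state

* Added-variable plot for internet, conditional on the dummies
avplot internet
```

The slope of the line drawn by avplot is the coefficient on internet from the regression.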
Alessandro Sovera

Join Date: Jun 2016

Posts: 67
#6

22 Jul 2019, 06:23

Originally posted by Wouter Wakker

You might want to have a look at avplot, which I think does what you want. The only problem is that it only works after regress. But you can change

Code:

xtreg gonorrhea internet i.year, fe

to

Code:

reg gonorrhea internet i.year i.state

and then use avplot.

I think this makes sense. Thanks for the advice!
Kye Lippold

Join Date: Jun 2019

Posts: 67
#7

22 Jul 2019, 16:12

Just to clarify--what avplot is doing is equivalent to plotting the residuals of the y and x variables after removing the means with fixed effects. Remember that using fixed effects is equivalent to demeaning both the x and y values in the data by the fixed effect groups. So what you are after with such a plot is to examine the variation in the x and y data after they have been demeaned, which is the variation giving you the coefficient of interest.
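In symbols, this is the Frisch-Waugh-Lovell result. The notation below is not from the posts above and is only a sketch: $y_{it}$ is the outcome, $x_{it}$ the regressor of interest, $\alpha_i$ and $\gamma_t$ the unit and year effects, and $\tilde y_{it}$, $\tilde x_{it}$ the residuals from regressing $y$ and $x$ on the full set of unit and year dummies.

```latex
y_{it} = \beta x_{it} + \alpha_i + \gamma_t + \varepsilon_{it}
\quad\Longrightarrow\quad
\hat\beta = \frac{\sum_{i,t} \tilde x_{it}\,\tilde y_{it}}{\sum_{i,t} \tilde x_{it}^{2}},
\qquad
\tilde x_{it} = x_{it} - \bar x_{i\cdot} - \bar x_{\cdot t} + \bar x
\ \ \text{(balanced panel)}
```

The closed form for $\tilde x_{it}$ holds only in a balanced panel; with an unbalanced panel the residuals still come from the dummy regressions, but they no longer reduce to simple differences from means.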

Here is a quick example illustrating how the avplot command works, and equivalent ways of getting the same graph with -reg- or -xtreg-

Code:

clear all
webuse grunfeld
xtset

*0. Final model where "invest" is coefficient of interest
xtreg mvalue invest i.year, fe

*1. avplot approach
reg mvalue invest i.year i.company
avplot invest, name(avplot)

*2. manual avplot with reg
reg mvalue i.year i.company //get demeaned y
predict y_res, res
reg invest i.year i.company //get demeaned x
predict x_res, res
scatter y_res x_res || lfit y_res x_res, name(not_avplot) ytitle("e( mvalue | X)") xtitle("e( invest | X)") legend(off) note("The same!")

*3. manual avplot with xtreg
xtreg mvalue i.year, fe //get demeaned y
predict y_resfe, e
xtreg invest i.year, fe //get demeaned x
predict x_resfe, e
scatter y_resfe x_resfe || lfit y_resfe x_resfe, name(not_avplot_fe) ytitle("e( mvalue | X)") xtitle("e( invest | X)") legend(off) note("Still the same!")

*Note all these graphs show the same thing!

So Alessandro, you were on the right track with your first post--but the key is that you want to plot residuals for the x variable also. In other words, you want to know the variation in gonorrhea and internet after removing averages by year and state for both variables.
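Applied to the variables in post #1, the manual approach might look like the sketch below. The variable names gonorrhea, internet, state and year come from that post; the residual names gon_res and int_res are made up for illustration, and state is assumed to be numeric. The if condition is an added precaution so that both regressions use the same observations when either variable has missing values.

```stata
* Sketch only: names follow post #1; gon_res and int_res are hypothetical

* Remove state and year means from y
reg gonorrhea i.year i.state if !missing(gonorrhea, internet)
predict gon_res if e(sample), residuals

* Remove state and year means from x
reg internet i.year i.state if !missing(gonorrhea, internet)
predict int_res if e(sample), residuals

* Plot residualized y against residualized x with a fitted line
scatter gon_res int_res || lfit gon_res int_res, ///
    ytitle("e( gonorrhea | X )") xtitle("e( internet | X )") legend(off)
```

The line continuation with /// works in a do-file; when typing in the Command window, put the scatter command on a single line.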