No Base Level for Factor Variable Operators

Jeff Thompson

Join Date: Feb 2018

Posts: 30
#1

07 Apr 2018, 09:03

Hi,

I've decided to use no base level (the ibn. operator) for one of my factor variables, size_cat:

Code:

xtreg log_avg_pay_change ibn.size_cat##c.emp_change, fe vce(cluster lbdnum)

*Networking restrictions prevent me from copy/pasting Stata outputs. Screenshots are the only option for me*

(if you can't read the regression output it's not that important)

However, the output omits the second-to-last level of size_cat and the last level of size_cat#c.emp_change.
I'd read somewhere that using -noconstant- is a way to keep these variables in there, but I receive: option noconstant not allowed

Two quick questions:

1) Is there anything I can do to remedy this problem?
2) How should these coefficients be interpreted; the changes are relative to what? The Stata manual wasn't particularly clear on this issue.

-Jeff
Tags: factor variables, ibn., no base level, panel data
Clyde Schechter

Join Date: Apr 2014

Posts: 30115
#2

07 Apr 2018, 10:43

You can emulate -xtreg, fe- by using -regress- and including i.group in the varlist (where group is the name of the variable you specified as the group variable in your -xtset- command). -regress- does allow the -noconstant- option. Now, Stata will still have to omit something. The number of collinearities in your model is two. -noconstant- will eliminate one of them, but Stata will still drop something else to resolve the other collinearity. There is no way around that: that's linear algebra.

In this situation, you would interpret the coefficients of the size_cat indicators as the expected value of the outcome in that size category. Or, rather, that is what they would mean if you didn't have an interaction in the model. Because you do have an interaction with emp_change, the interpretation is that these are the expected values of the outcome in that size category conditional on emp_change = 0.
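
A minimal sketch of the -regress- emulation described above. It assumes lbdnum is the panel variable declared in -xtset-; the thread only confirms it as the cluster variable, so substitute your own group variable if it differs.

```stata
* Assumption: lbdnum is the -xtset- panel variable (not confirmed in the thread).
* i.lbdnum adds one indicator per group, which stands in for the fixed effects.
regress log_avg_pay_change ibn.size_cat##c.emp_change i.lbdnum, ///
    noconstant vce(cluster lbdnum)
```

Even with -noconstant-, emp_change is still the sum of the size_cat#c.emp_change terms, so Stata should still flag one of those terms as omitted. With a very large number of groups this specification may also be slow to fit.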
Jeff Thompson

Join Date: Feb 2018

Posts: 30
#3

11 Apr 2018, 13:36

Thanks for the response, Clyde Schechter.

In this situation, you would interpret the coefficients of the size_cat indicators as the expected value of the outcome in that size category. Or, rather, that is what they would mean if you didn't have an interaction in the model. Because you do have an interaction with emp_change, the interpretation is that these are the expected values of the outcome in that size category conditional on emp_change = 0.

To be extra clear on this point: are you referring to the coefficients under size_cat#emp_change and not just the size_cat?

I'm really wanting to see how emp_change differs across the 10 size_cats, I suppose I would only be able to deduce this through -margins-*, correct? Would the "hole" in size_cat 10 inhibit being able to include it in my results?

*

Code:

margins ar.size_cat, dydx(emp_change)

Last edited by Jeff Thompson; 11 Apr 2018, 13:38.
Clyde Schechter

Join Date: Apr 2014

Posts: 30115
#4

11 Apr 2018, 13:48

I'm really wanting to see how emp_change differs across the 10 size_cats, I suppose I would only be able to deduce this through -margins-*, correct?

No. In fact, this has nothing to do with your regression or margins. If you want to see how emp_change differs across the 10 size_cats, just do -tab emp_change size_cat-.

To be extra clear on this point: are referring to the coefficients under size_cat#emp_change and not just the size_cat?

The coefficients under size_cat (not the ones under size_cat#emp_change) reflect the effect of size_cat, relative to the baseline value of size_cat, on your outcome variable conditional on emp_change = 0.

The coefficients you find under size_cat#emp_change are something different. They represent the rate of change in the effect of size_cat per unit of emp_change.

For any given value of emp_change, the effect of size_cat (relative to the base level of size_cat) on your outcome is equal to the coefficient of that value of size_cat + the value of emp_change * the coefficient of that value of size_cat#emp_change.
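
The arithmetic in the last paragraph can be written out as follows. The symbols are notation introduced here for illustration, not taken from the Stata output: $\beta_k$ is the coefficient on level $k$ of size_cat, $\delta$ the coefficient on emp_change, $\gamma_k$ the coefficient on level $k$ of size_cat#emp_change, and $\alpha_i$ the fixed effect. Any coefficient Stata omits is treated as zero.

```latex
% Model, with x = emp_change and s = size_cat:
%   E[y | s = k, x] = \alpha_i + \beta_k + \delta x + \gamma_k x
\begin{align*}
\text{effect of size\_cat } k \text{ relative to base, at } x
  &= \beta_k + \gamma_k \, x \\
\text{slope of emp\_change within size\_cat } k
  &= \delta + \gamma_k
\end{align*}
```

The first line is the sum described above; the second is the per-category effect of emp_change, which is the quantity -margins- with dydx(emp_change) reports for each level.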
Jeff Thompson

Join Date: Feb 2018

Posts: 30
#5

12 Apr 2018, 05:22

No. In fact, this has nothing to do with your regression or margins. If you want to see how emp_change differs across the 10 size_cats, just do -tab emp_change size_cat-...They represent the rate of change in the effect of size_cat per unit of emp_change.

Apologies, I wasn't clear enough. I wanted to see the effect of emp_change on log_avg_pay for each level of size_cat; I'm not really interested in the effect of size_cat on log_avg_pay. I'll go ahead and still assume the -margins- command I specified earlier is the way to go if no one replies.

It's in the ballpark of the advice you'd given in this post:
https://www.statalist.org/forums/for...nce-categories

relative to the baseline value of size_cat

I used the ibn. prefix, what is the baseline value then?
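
For readers following the -margins- question in this post, a hedged sketch of the two forms involved. It assumes the model with size_cat##c.emp_change has just been fit; the first command is a variant not posted in the thread.

```stata
* Run directly after the -xtreg- fit that includes size_cat##c.emp_change.

* Slope of emp_change within each level of size_cat
margins size_cat, dydx(emp_change)

* ar. is the reverse-adjacent contrast operator: this form reports the
* difference in slope between each level and the level before it
margins ar.size_cat, dydx(emp_change)
```

The first form lists one slope per category; the second, which is the command posted earlier, compares neighbouring categories rather than listing each slope.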
Marcos Almeida

Join Date: Apr 2014

Posts: 4047
#6

12 Apr 2018, 05:28

You may wish to read this FAQ.

Best regards,

Marcos
Jeff Thompson

Join Date: Feb 2018

Posts: 30
#7

12 Apr 2018, 08:20

Thanks Marcos, I'd come across this earlier, but it uses the -noconstant- option, which isn't available with -xtreg-. I could switch to using -reg- as Clyde suggested, but I'm working with panel data and don't feel I could create an analogous model because the i.group indicators would be far too numerous.

There is a link on that FAQ to info I found helpful: https://www.stata.com/manuals/u11.pdf#u11.4.3
Others with similar issues will find, as I did, that it explains a lot, but it gets pretty thin when dealing with interpretations, at least from a novice's perspective.

http://www.maartenbuis.nl/publications/ref_cat.pdf is really helpful with interpretations, but deals with the -noconstant- option so none of the variables were omitted.

I'm still trying to determine if I should leave the i. prefix or deal with the omitted values by arbitrarily adding an 11th size_cat that will get omitted. As far as interpreting results, I'm not sure how to treat response #4 because it mentions base levels and I assumed I didn't have any.

Last edited by Jeff Thompson; 12 Apr 2018, 08:22.
Jeff Thompson

Join Date: Feb 2018

Posts: 30
#8

21 Apr 2018, 03:12

For anyone with a similar question, and for what it's worth: the prefix you use before your categorical variable, be it i., ibn., io1., etc., will yield different coefficients that should be interpreted differently in the regression output. However, the values are all (miraculously) the same regardless of the prefix once you run the -margins- command. So in a statement not formally condoned by Stata experts: don't really fret about the prefix you use if you plan on interpreting your results from -margins- anyway because, more or less, it all comes out in the wash.
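
One way to check this for yourself is a sketch like the following. It uses Stata's shipped auto dataset rather than the data from this thread, so price, rep78, and weight are stand-ins chosen for illustration.

```stata
* Illustration on Stata's shipped auto dataset, not the data from this thread.
sysuse auto, clear

* Default base level (lowest category of rep78)
regress price i.rep78##c.weight
margins rep78, dydx(weight)

* Different base level (rep78 == 3): the coefficient table changes
regress price ib3.rep78##c.weight
margins rep78, dydx(weight)
```

The two coefficient tables differ because each is expressed relative to a different base, while the per-category slopes from -margins- should agree, since both fits are reparameterizations of the same model.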

Last edited by Jeff Thompson; 21 Apr 2018, 03:15.
