interaction term using a categorical and dummy variable

Mallika An

Join Date: Mar 2018

Posts: 1
#1

23 Mar 2018, 09:27

Hi,
I am trying to interpret an interaction term that I created with i.origpexpn#southasian, where origpexpn is a categorical variable coded from 1 (very high) to 4 (very low) and southasian is a dummy coded 1 for South Asian and 0 for not South Asian. I am fitting the model with meologit. The model runs, but Stata issues a note that 1.southasian#4.origpexpn has been omitted because of collinearity. The interaction output looks like this:
southasian#origpexpn    Coef.
1 1                     0.6
1 2                     -0.5
1 3                     -0.9
1 4                     (omitted)
Is this because Stata has treated 1.southasian#4.origpexpn as the base? I am unsure how to interpret this and would be grateful if someone could explain.
Thanks so much.
Clyde Schechter

Join Date: Apr 2014

Posts: 30100
#2

23 Mar 2018, 10:44

Is this because stata has assumed 1.southasian#4.origpexpn as the base?

No. Your base category is southasian = 0, and the automatically omitted levels of the interaction would all involve 0.southasian.

So you need to explore your data to figure out why you have this collinearity. If you want more specific advice, show the complete regression command, the complete output, and an example of your data. Be sure to show the commands and output between code delimiters so that they line up in an easily readable way. Read the Forum FAQ, with particular attention to #12, for instructions on how to use code delimiters if you are not familiar with them.

And use the -dataex- command to show your example data. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
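As a rough sketch of the kind of exploration and data example described above, something like the following could be a starting point. It assumes only the two variable names given in the question; the outcome and grouping variables used in the model are not known here and would need to be added to the -dataex- variable list.

```stata
* Cross-tabulate the two factors to see how observations are
* distributed across the eight southasian-by-origpexpn cells.
tabulate origpexpn southasian, missing

* Produce a copy-and-paste data example for the forum.
* Add the outcome and grouping variables from your model to this list.
dataex origpexpn southasian
```

The output of -dataex- can be pasted directly into a post between code delimiters.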

Added: I may be incorrect in my original statement that this is not omitted as a base category. If you simply coded the interaction as i.southasian#i.origpexpn then, based on your description of the variables, the reference categories for southasian and origpexpn should be 0.southasian and 1.origpexpn, respectively. So I would expect the interaction terms that show up to be 1.southasian#2.origpexpn, 1.southasian#3.origpexpn, and 1.southasian#4.origpexpn. So, while I am surprised that 1.southasian#4.origpexpn is omitted, I am also surprised that 1.southasian#1.origpexpn is present. In fact, you have a total of three interaction terms included, which is precisely the right number for an interaction between a 4-level and a 2-level variable. So I am thinking that you have perhaps somehow directed Stata to use 4 as the reference level of origpexpn. If you don't think you did that, then something is amiss, and I would advise you to show the complete code and output for your analysis. Include in that the code for any -fvset- statements you might have given.
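One way to check for a stored base-level setting, and to make the reference levels explicit, is sketched below. This is only an illustration: y and groupid are hypothetical names standing in for the outcome and grouping variables, which were not shown, and the ## operator (main effects plus interaction) is an assumption about the intended model rather than the command that was actually run.

```stata
* Show any factor-variable settings previously stored with -fvset-.
fvset report origpexpn southasian

* Remove stored settings so that Stata's default applies
* (by default the lowest level of each factor is the base).
fvset clear origpexpn southasian

* Refit with the base levels stated explicitly and displayed in the output.
* y and groupid are hypothetical; substitute your own variables.
meologit y ib0.southasian##ib1.origpexpn || groupid:, allbaselevels
```

With the base levels written out as ib0. and ib1., the terms reported for the interaction should be the ones listed above (1.southasian with levels 2, 3, and 4 of origpexpn), which makes any remaining collinearity note easier to diagnose.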

Last edited by Clyde Schechter; 23 Mar 2018, 10:50.