Continuous by continuous LOGIT interaction with missing values for control group

Roger Clements

Join Date: Jun 2017

Posts: 40
#1

22 Jun 2018, 02:05

Hello,

I am trying to estimate the probability of being treated using a logit model with a continuous-by-continuous interaction in Stata. I have three key variables:

Dependent variable = target (1 = targeted company; 0 = control company, not targeted)
Main predictor variable = company_size (size of company)
Interaction variable = activist_age (age of social activist)

Both the predictor and interaction variables are continuous.

In Stata, I run the following logit command:

Code:

logit target c.company_size##c.activist_age i.sic_industry_code i.year

The problem is that Stata returns an error because "the outcome does not vary". This happens because I only have activist_age data for the targeted firms; the control firms of course don't have this data because no activist targets them, so they are dropped from the estimation sample and only observations with target = 1 remain.
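A minimal sketch of how to confirm this, using simulated data in place of the real file (the values, the sample size, and the helper variables n_missing and in_sample are assumptions for illustration only):

```stata
* Toy data: all values are simulated; variable names follow the post
clear
set seed 12345
set obs 200
generate byte target = _n <= 100
generate double company_size = rnormal(10, 2)
* activist_age is observed only for targeted firms
generate double activist_age = rnormal(45, 10) if target == 1

* Flag the complete cases, i.e. the observations logit would keep
egen byte n_missing = rowmiss(target company_size activist_age)
generate byte in_sample = n_missing == 0

* Every control firm drops out, so target is constant among complete cases
tabulate target in_sample
count if in_sample & target == 0
```

With the real data, only the last four commands would be needed, and the final count should come back as zero if the missing values are the cause of the error.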

How do I solve this problem? Is there some way I should break the targeted firms into strata based on activist_age and compare the probability of being targeted just among targeted firms? Any advice on how to move forward would be greatly appreciated. Any other help is welcome as I am fairly new to Logit models.

Thank you!

Roger
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17702
#2

22 Jun 2018, 02:14

Roger:
probably the easiest fix is to remove -activist_age- from the set of predictors.

Kind regards,
Carlo
(Stata 19.0)
Roger Clements

Join Date: Jun 2017

Posts: 40
#3

22 Jun 2018, 02:58

Thanks for the quick reply Carlo. I agree (it works in that case), but then it prevents me from theorizing about the effect of activist age on targeting, which I'd like to avoid if possible.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17702
#4

22 Jun 2018, 03:06

Roger:
perhaps a different specification of the model is worth considering.
The usual recipe is to skim through the literature in your research field and see what others did in the past when presented with the same research topic.

Kind regards,
Carlo
(Stata 19.0)
Roger Clements

Join Date: Jun 2017

Posts: 40
#5

22 Jun 2018, 03:35

I agree completely. I have conducted a fairly exhaustive review and found that others conclude, "For future research, considering the characteristics of activists themselves would be worthwhile." Thus, I would like to be the first (at least as far as I can see in the journals...) to do this. I am simply asking how to estimate any specification of any model that incorporates activist characteristics when only half the sample (i.e., the treated/targeted companies) has values for activists.
Carlo Lazzaro

Join Date: Apr 2014

Posts: 17702
#6

22 Jun 2018, 03:42

Roger:
your request is clear but, as far as I understand your query, it cannot be satisfied given your data.

Kind regards,
Carlo
(Stata 19.0)
Igor Paploski

Join Date: Oct 2014

Posts: 174
#7

22 Jun 2018, 09:05

One approach to this situation is to use an outcome that is quantitative, not categorical. If you could score how targeted a company was (think of a scale from 1 to 10, for example), then you could check whether, among the companies that were targeted, their targeting score is associated with the activist's age.
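As a rough sketch of what that could look like in Stata, assuming a hypothetical 1-to-10 score named target_intensity (not a variable in the original data) and simulated values for targeted firms only:

```stata
* Simulated data for targeted firms only; all values are made up
clear
set seed 2018
set obs 100
generate double company_size = rnormal(10, 2)
generate double activist_age = rnormal(45, 10)
* target_intensity is a HYPOTHETICAL 1-10 score, not in the original data
generate double target_intensity = min(10, max(1, round(5 + 2*rnormal())))

* Linear model with the continuous-by-continuous interaction
regress target_intensity c.company_size##c.activist_age

* Slope of company_size at selected activist ages
margins, dydx(company_size) at(activist_age = (30 45 60))
```

Note that this answers a different question from the original logit: it describes how strongly a firm was targeted given that it was targeted, not the probability of being targeted at all.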
Roger Clements

Join Date: Jun 2017

Posts: 40
#8

25 Jun 2018, 01:03

Hi Igor, thanks for the possible solution. My limitation is the nature of the data: companies are either targeted by an activist or they are not. It is very difficult to conceptualize a continuous "targeting" variable that would make sense.

Can I do a sub-sample analysis on the targeted companies to test the moderating variable if the DV remains binary? That is, look at just the targeted companies to see if activist age strengthens the relationship between company size and likelihood of being targeted. I suppose then again my problem would be that my dependent variable (targeting) does not vary, which is exactly why you, Igor, suggest a continuous DV. So I am stuck...

I thought I had seen this type of analysis in other literature, so I am perplexed.

Thanks again.
Igor Paploski

Join Date: Oct 2014

Posts: 174
#9

25 Jun 2018, 07:47

Hi Roger,

If you want to know whether smoking is associated with lung cancer, epidemiologically speaking, you need a group of smokers and a group of non-smokers to compare, to see whether lung cancer incidence differs between the groups. Alternatively, you could take two groups of smokers who smoke different amounts and check whether there is an association between the amount a person smokes and the incidence of lung cancer.

There is a hypothesis that cell phone usage causes some forms of brain cancer because of the radiation phones emit while we hold them to our ears. One way to test this hypothesis is to take people who use cell phones and people who don't, and compare the incidence of these forms of brain cancer in the two groups. But those who don't use cell phones differ from those who do in many characteristics other than cell phone usage: they are older and live in places with no cell phone coverage (which reflects a lack of access to many other features of modern life). It's simply hard to say that any difference in brain cancer incidence between the groups is due to cell phone usage. There are many very obvious confounders there. One way out is to compare rates of brain cancer among people who use cell phones with different intensities (let's say, a different number of hours per day).

My point with both examples is that simply working with "smokers" and "cell phone users" just won't do the trick. You need a counterpart to determine what the baseline would be, in order to calculate how many times the prevalence/risk/odds of your outcome changed due to the exposure.

I honestly don't know of any way out of this situation.

Best;
