-mtreatreg- : Problem to define base category

Raux Morgan

Join Date: May 2015

Posts: 15
#1

13 May 2016, 09:30

Dear all,

I would like to use the user-written program -mtreatreg- with Stata 12.

I want to estimate multinomial treatment effects on wages relative to a specific control group.
My groups are defined by the variable "treatment" (a float variable), which takes 9 distinct values.

Here is a copy of the -tabulate- output for the treatment variable:

  treatment |      Freq.     Percent        Cum.
------------+-----------------------------------
          0 |    521,372       96.62       96.62
   1.945867 |      2,568        0.48       97.09
   3.619311 |        500        0.09       97.18
   3.891733 |      1,658        0.31       97.49
   5.000877 |        306        0.06       97.55
     5.8376 |      7,235        1.34       98.89
   6.674322 |        829        0.15       99.04
   6.946743 |      2,888        0.54       99.58
   7.783466 |      2,277        0.42      100.00
------------+-----------------------------------
      Total |    539,633      100.00

I managed to run the program, but by default it seems to choose the control group arbitrarily among my nine groups. It picks category 6 (= 5.8376), whereas I would like to estimate the effects relative to the first category (= 0).

So I tried to specify the base category as follows:

mtreatreg $ylist $xlist , mtreatment(treatment= AGE AGESQ chld married Tax_revenue_ji ) basecat(0) sim(200) density(normal) difficult

But Stata returns an error message: "basetreatment(0) is not an outcome of treatment; use tabulate for a list of values"

More surprisingly, the program works when I specify another variable, "lp1" (a double variable), as the treatment variable.
But "treatment" is just "lp1" in reverse order, since I created it with the following line:

gen treatment = 7.783466 - lp1

I tried changing the type of "treatment" from float to double, but it still does not work.

I know that my question concerns a user-written program, but I think the problem has more to do with the storage format of my variable.
I suspect this especially because I noticed another surprising result with "treatment":

drop if treatment == 1.945867
(0 observations deleted)

whereas it should have dropped 2,568 observations.

Does anyone have an idea of what the problem could be?

Best regards.

Morgan
Partha Deb

Join Date: Apr 2014

Posts: 22
#2

13 May 2016, 14:36

Morgan - it's definitely a "precision" issue. See http://blog.stata.com/2011/06/17/pre...-again-part-i/ and many other similar references. Vis-a-vis thinking of your variable as a multinomial, it would be much easier to work with if you recoded your variable into something that was truly integer valued. Hope this helps.

Partha
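
The precision issue described above can be reproduced on toy data. The following is only a minimal sketch: the four observations and the variable name treat_cat are made up for illustration, and only the value 1.945867 is taken from the tabulate output. A numeric literal in an expression is held in double precision, while a float variable stores a rounded version of it, so an exact == comparison typically matches nothing unless the literal is rounded with float() first. Recoding to integers with -egen, group()- avoids the comparison altogether.

```stata
* Toy data (hypothetical): two observations at 0, two at 1.945867
clear
set obs 4
generate float treatment = 0 in 1/2
replace treatment = 1.945867 in 3/4

* The literal is a double, the stored value a float: this typically finds no match
count if treatment == 1.945867

* Rounding the literal to float precision makes the two sides comparable
count if treatment == float(1.945867)

* Integer recode: codes 1, 2, ... assigned in ascending order of treatment
egen int treat_cat = group(treatment)
tabulate treat_cat
```

With an integer-coded variable such as treat_cat, a base category can be referred to by its code rather than by a decimal value.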
Raux Morgan

Join Date: May 2015

Posts: 15
#3

14 May 2016, 06:36

Thank you, Partha, for your advice.

I recoded my treatment variable as integers, but the problem was elsewhere, and I have finally found its source.
I still don't understand the link between the error message and the problem, but it does not matter.

To summarize, my treatment variable was explained by the following selection equation.

Treatment = Age Age_squared chld married Tax_revenue_ji

Age and Age_squared are continuous variables.
chld and married are dummy variables equal to 1 if the individual has a child and if the individual is married, respectively.
Finally, "Tax_revenue_ji" measures the difference in tax revenue, as a percentage of GDP, between the country of origin and the country of destination.

Since my control group (category 1) consists of native-born people, "Tax_revenue_ji" was missing for every observation in that group. I think this is why -mtreatreg- could not accept [ basecat(1) ].
I replaced the missing values with zero and it works.
Sorry for this question; it took me a long time to figure out.
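
A possible link between the error message and the missing values: if -mtreatreg- excludes observations with a missing value in any model variable, as Stata estimation commands generally do (this is an assumption about the program, not something confirmed in the thread), then no control-group observation remains in the estimation sample, and the requested base category is indeed "not an outcome" there. A minimal sketch on toy data, where the six observations and their values are hypothetical and only the variable names come from the posts above:

```stata
* Toy data (hypothetical values); 1 = control group, 2 = one treated group
clear
set obs 6
generate byte treatment = cond(_n <= 3, 1, 2)
* Covariate is defined for the treated group only, so it is missing for all controls
generate double Tax_revenue_ji = 0.05 * _n if treatment == 2

* Control observations with a usable covariate value
count if treatment == 1 & !missing(Tax_revenue_ji)

* Categories that remain once rows with a missing covariate are excluded
tabulate treatment if !missing(Tax_revenue_ji)

* Workaround reported in the post: set the covariate to zero for the control group
replace Tax_revenue_ji = 0 if treatment == 1 & missing(Tax_revenue_ji)
count if treatment == 1 & !missing(Tax_revenue_ji)
```

Running a check of this kind on each selection-equation variable before estimation shows which categories would be left in the sample.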

I have another question, about the computational demands of -mtreatreg-.
As you can see above, I have a fairly large dataset with almost 540,000 observations, of which about 520,000 belong to my control group (native-born people).
I have read your paper published in the Stata Journal (2006), in which you describe the use of -mtreatnb-. You report that estimation took 20 minutes on a dataset of only 5,000 observations.

Do you think it is realistic to use -mtreatreg- with a dataset of 540,000 observations?

Best regards.

Morgan
Partha Deb

Join Date: Apr 2014

Posts: 22
#4

15 May 2016, 12:41

Morgan - I'm guessing that your computer is much, much faster than whatever I had in 2005 or so. So why not give it a shot and see how things go? Given your really big sample, it's possible that you can reduce the number of simulation draws a bit and still get stable results.
Raux Morgan

Join Date: May 2015

Posts: 15
#5

21 May 2016, 11:30

Hello Partha,

I tried to use -mtreatreg- with my complete dataset. Even with a very small number of simulation draws (I tested -sim(20)-), my computer (a MacBook Pro) was not powerful enough to complete the estimation.

Nevertheless, thank you for your help.

Best regards.

Morgan
Chris Schrey

Join Date: Mar 2019

Posts: 1
#6

13 Mar 2019, 06:04

Dear Raux, dear Partha,

Similar to your case, Raux, I am using -mtreatreg- (endogenous treatment: health insurance; outcome: healthcare utilisation), inspired by, among others, your paper(s), Partha, on the effect of an endogenous treatment (health insurance) on a count outcome (visits to the doctor). So far, the number of observations I can use is drastically limited by my computational power.
Apart from the estimation being memory intensive, I wonder whether using Stata MP on, say, 2 or 4 cores would further reduce computation time. This command is not listed in Stata's MP white paper, so I am unsure whether spending money on Stata MP would actually speed up -mtreatreg- (as I wish to expand the number of observations used).

Best regards,

Chris