How to Specify A Large Number of Initial Values for (Multinomial) Logit Regression

Sherry Wu

Join Date: May 2017

Posts: 1
#1

23 May 2017, 14:24

Hi,

I am running logit regressions on establishment-level data where the LHS variable is the store's entry decision and the RHS variables are store characteristics and local market demographic characteristics. The issue is that, after the inclusion of about 300 MSA fixed effects, the logit regressions do not converge. The regressions work fine without the fixed effects, though.

Since the linear regression with exactly the same LHS and RHS variables runs fine with MSA fixed effects, I would like to save the estimates from the linear regression (coeff(OLS)) and use the OLS coefficients multiplied by a factor of 4, 4*coeff(OLS), as starting values for the multinomial logit regression. Do you have any pointers on the correct syntax Stata requires for specifying starting values for a large number of parameters? Below are the commands I used. None of the code that I have tried for specifying the initial values seems correct.

****************************************************************************************************

/*OLS regression with MSA fixed effects*/

xi: regress Entry i.ChainSizeCtgry*Online marketcontrols i.msa

/*Storing Estimates from Linear Specification to Use As Starting Values for Multinomial Logit Specification*/

matrix b0=e(b)
matrix b1=4*b0

/*Multinomial regression where I try to specify the initial value*/

xi: mlogit Entry i.ChainSizeCtgry*Online marketcontrols i.msa, baseoutcome(0) from(b1)

initial values not feasible /*<---Stata returns the following*/

xi: mlogit Entry i.ChainSizeCtgry*Online marketcontrols i.msa, baseoutcome(0) from(eq2: b1)

initial vector: matrix eq2: not found /*<---Stata returns the following*/

xi: mlogit Entry i.ChainSizeCtgry*Online marketcontrols i.msa, baseoutcome(0) from([#2]: b1)

initial vector: matrix [#2]: not found /*<---Stata returns the following*/

****************************************************************************************************

I'm sorry that, because I am working with the data through a secure server, I cannot be more specific by copying and pasting my exact Stata commands and output.

Many Thanks,
Sherry
Clyde Schechter

Join Date: Apr 2014

Posts: 30155
#2

23 May 2017, 22:42

Well, your first attempt

Code:

mlogit Entry i.ChainSizeCtgry*Online marketcontrols i.msa, baseoutcome(0) from(b1)

was correct. What Stata is telling you is that those actual initial values are not possible. I don't know why that would be (if you were able to show the actual values I might hazard a guess). But, no fancy footwork on the syntax is going to get you around that. You'll need to use a different set of initial values. And, of course, there is no guarantee that you will get the model to converge in any case. This sounds like a difficult model to fit.
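To make the mechanics concrete, here is a minimal sketch of passing OLS-based starting values to a binary logit through -from()-. It uses the shipped auto data rather than the poster's data, so the variables `foreign`, `mpg`, and `weight` are stand-ins. It also assumes that `_cons` is the last column of `e(b)` after -regress-, and it shifts the constant by 0.5 before scaling, since the logistic CDF is roughly 0.5 + xb/4 near xb = 0. That adjustment is a suggestion of this sketch, not something from the original post.

```stata
sysuse auto, clear

* Linear probability model with the same covariates as the logit
regress foreign mpg weight

* Scale the OLS coefficients by 4 to approximate logit coefficients
matrix b0 = e(b)
matrix b1 = 4*b0

* Assumption: _cons is the last column of e(b), as it is after -regress-.
* Near xb = 0 the logistic CDF is about 0.5 + xb/4, so the constant is
* shifted by 0.5 before scaling instead of simply being multiplied by 4.
local k = colsof(b0)
matrix b1[1, `k'] = 4*(b0[1, `k'] - 0.5)

* Inspect the names and values that will be passed to from()
matrix list b1

* copy uses the values by position instead of matching column names
logit foreign mpg weight, from(b1, copy)
```

With -mlogit- and more than two outcomes there is one set of coefficients per non-base outcome, so the starting vector would have to follow that layout. Comparing -matrix list b1- with `e(b)` from a smaller -mlogit- that does converge is one way to check it.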

As an aside, having no bearing on the question you posed, if you are using a recent version of Stata that supports factor-variable notation, you should stop using -xi-. Use factor-variable notation instead:

Code:

regress Entry i.ChainSizeCtgry##c.Online marketcontrols i.msa

etc., with no prefix. See -help fvvarlist- and the linked manual section for more information. Stata will create the necessary "virtual" indicator variables and also create the interaction terms for you automatically. Then, when you finally get your model to converge you will be able to use the wonderful -margins- command to get predicted probabilities, marginal effects, and graphs of those, with simple one-line commands. Using -xi- just clutters up your data set with all those ugly _I variables, and it also actually prevents you from using the -margins- command afterward (which means you'll have a lot more tedious, error-prone work to do). There are a couple of situations in Stata where -xi- is still needed. But they are uncommon, and most of them involve archaic commands whose functions have been superseded by more modern commands that do support factor-variable notation. So more or less try to forget you ever heard of -xi-.
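As an illustration of that workflow, here is a short sketch using the shipped nlsw88 data, since the poster's data are not available. The variables `union`, `race`, `grade`, and `age` are stand-ins for the outcome, the categorical variable, the continuous variable it is interacted with, and a control.

```stata
sysuse nlsw88, clear

* Factor-variable notation: main effects plus interaction, no -xi- prefix
logit union i.race##c.grade age

* Predicted probability of the outcome at each level of race
margins race

* Average marginal effect of the continuous variable
margins, dydx(grade)

* Predicted probabilities by race over a range of grade, then a graph
margins race, at(grade=(8(4)16))
marginsplot
```

No _I variables are created along the way, and each -margins- call is a single line after estimation.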
John Mullahy

Join Date: Dec 2016

Posts: 752
#3

24 May 2017, 06:22

Sherry: Without seeing the data one can only speculate, but I would guess that you might have a perfect-prediction problem with one or more of your fixed-effect covariates. In the simple case of binary logit this happens when you don't have all four combinations of the 0-1 outcome and the 0-1 dummy in your sample. In this case logit won't converge but you'll still be able to estimate the linear regression version of the model. Analogous issues arise in more complex settings (e.g. multinomial outcomes). It's hard to imagine ever needing to specify starting values for logit or mlogit if the data are well behaved.
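One way to look for that kind of problem before fitting anything is to check whether the outcome varies within every group. The sketch below uses the shipped auto data, with `rep78` standing in for the fixed-effect variable (msa in the original post) and `foreign` standing in for the binary outcome (Entry). The new variable names are made up for the illustration.

```stata
sysuse auto, clear

* Cross-tabulate the grouping variable against the binary outcome.
* A row with a zero cell is a group whose indicator predicts perfectly.
tabulate rep78 foreign

* Flag groups in which the outcome never varies
bysort rep78: egen byte y_min = min(foreign)
bysort rep78: egen byte y_max = max(foreign)
generate byte no_variation = (y_min == y_max)

* Tag one observation per group, then count and list the offending groups
egen byte group_tag = tag(rep78)
count if group_tag & no_variation
list rep78 y_min y_max if group_tag & no_variation
```

If groups show up in that list, the fixed effects for them cannot be estimated from the logit likelihood, which would be consistent with the convergence trouble described above.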