  • "too many variables specified" when running "reg" with a large number of dummies

    Dear Stata users,

    I am having trouble running the command “reg” for a model with thousands of dummies, and I would really appreciate your help. Thank you in advance. More details are below.

    Task: generate school value-added measures with standard errors for an education project
    Commands:
    1. Code: reg posttest pretest i.year_sch [pw=weight]
    2. year_sch identifies each school/year combination. The regression produces school value-added measures with standard errors.
    Error message:
    1. “too many variables specified”
    2. There are 6,400 unique school/years in total in my sample. “Reg” does work when I cut my sample and reduce the number of unique school/years to 5,000. So the problem seems clear: Stata does not allow “reg” to run with this many dummies.
    Other possible commands
    1. My current coding strategy involves two steps. First, I use “reg” to estimate school value-added measures. However, “reg” computes fixed effects relative to an arbitrary holdout unit (e.g. one school/year), which can produce incorrect standard errors. Thus, in the second step, I use “contrast” to normalize the fixed effects to the grand mean and compute their standard errors.
    2. For the estimation stage, I considered other Stata commands but could not find one that worked. “areg” does not produce standard errors. More importantly, areg does not work with contrast, because areg treats school/year as a factor variable. xtreg does not work either, because it requires the weight to be constant within each panel (school/year), which is not true in our case.
    3. For the normalization stage, I considered the command “felsdvregdm”, but it does not allow weights, which are important to our project.
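
    The two-step approach in item 1 above can be sketched roughly as follows. This is a minimal illustration, not the poster's actual code: the g. contrast operator (deviations from the balanced grand mean) and the effects and nowald display options are assumptions; gw. would instead use the observation-weighted grand mean.

```stata
* Step 1: estimate school/year effects relative to an omitted base category
regress posttest pretest i.year_sch [pw=weight]

* Step 2: re-express each effect as a deviation from the grand mean,
* with standard errors (assumed operator: g. = balanced grand mean)
contrast g.year_sch, effects nowald
```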
    Questions
    1. Is my understanding of the error message correct? “Reg” does work when I cut my sample and reduce the number of unique school/years to 5,000, so it seems that Stata does not allow “reg” to run with this many dummies.
    2. The online Stata guide says that the maximum number of right-hand-side variables for Stata MP is 10,998. In my sample, there are only 6,400 unique school/years. Why doesn’t “reg” work? FYI: http://www.stata.com/products/which-...-right-for-me/
    3. How can I fix the problem using “reg”?
    4. Or are there other commands I should consider?
    Version of stata: StataMP 14 (64-bit)
    Num. of obs. in my sample: about 887,000

    Thank you again for your help!

    Best,
    Lihan Liu

  • #2
    Lihan:
    welcome to the list.
    I would take a look at -help statamp-.
    Quoting an excerpt from it:
    2. maxvar
    The maximum number of variables allowed in a dataset. This limit is initially set to 5,000; you can increase it up to 32,767.
    Kind regards,
    Carlo
    (Stata 19.0)



    • #3
      Couldn't you just run this as a fixed effects regression using xtreg, fe? Alternatively, use reghdfe ..., absorb(year_sch). Or are you particularly interested in the coefficients of the dummies?
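
      As a rough sketch of the absorbing alternatives mentioned here, assuming the variable names from the original post: reghdfe is a community-contributed command, so the install lines are an assumption about the reader's setup. Neither command reports the absorbed school/year effects with individual standard errors, so this may not be enough if those are the quantities of interest.

```stata
* One-time install of reghdfe and its dependency from SSC (assumed not yet installed)
ssc install ftools, replace
ssc install reghdfe, replace

* Absorb the school/year effects instead of estimating 6,400 dummies
reghdfe posttest pretest [pw=weight], absorb(year_sch)

* Built-in alternative that also absorbs a single categorical variable
areg posttest pretest [pw=weight], absorb(year_sch)
```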



      • #4
        You can increase the maximum variables by typing:

        Code:
        set maxvar 32767, permanently
        Nevertheless, about 6,000 variables is a lot.
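
        A fuller session might look like the sketch below. Clearing the data first is a conservative assumption, since Stata may refuse to change maxvar while a dataset is in memory; mydata.dta is a hypothetical filename.

```stata
clear                        // start with no data in memory
set maxvar 32767             // raise the limit for this session
display c(maxvar)            // confirm the setting took effect

use "mydata.dta", clear      // hypothetical filename
regress posttest pretest i.year_sch [pw=weight]
```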



        • #5
          Thank you all for your help! set maxvar did work. Before posting, I had read the documentation for maxvar, which says that it increases the number of variables allowed in the data (not the number of right-hand-side variables in a regression), so I did not try it. Thank you again!
