  • Estrella Gomez
    started a topic Error 103 too many variables specified

    Error 103 too many variables specified

    Hi

    I am running an OLS regression on a weekly panel dataset covering 104 weeks, 32 exporter countries (ono) and 29 importer countries (cno). I want to include time-varying country fixed effects, so I interact the country identifiers with week:

    reg lq ldist dhome dlang i.ono#i.week i.cno#i.week

    In total I have 6346 dummy variables and one continuous variable. Every time I run this I get error 103, "too many variables specified". However, when I ran this regression two months ago on a very similar dataset (same size), I did not get any error message. I have set matsize to 11000 and maxvar to 20000.

    Any idea what the problem might be?

    Thank you very much,
    Estrella

  • yihan huang
    replied
    You could try using reghdfe instead of reg; Stata may be able to run the model that way.
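
    A minimal sketch of this suggestion, assuming the community-contributed reghdfe command is installed from SSC (it depends on the ftools package) and using the variable names from the original post. No vce() option is shown because the thread does not specify one. The idea is that the exporter-week and importer-week effects are absorbed rather than entered as thousands of explicit dummies:

    ```stata
    * One-time installation (community-contributed, not part of official Stata)
    ssc install ftools
    ssc install reghdfe

    * Absorb exporter-week and importer-week effects instead of
    * estimating a separate coefficient for each dummy
    reghdfe lq ldist dhome dlang, absorb(ono#week cno#week)
    ```

    Only the coefficients on ldist, dhome and dlang are reported; the absorbed effects are not estimated individually.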

  • Katherine Adams
    replied
    Hello,

    Here is an example of my data. The data are sorted by location, so the example covers only one household, with location id 600001 (the original data cover many households over 2017-2018).

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long location float(lconsum tp) byte randomgrp float(calday treatgr) int treat_numb_of_days
    600001  5.332319 0 0 20820 0 -403
    600001  5.397176 0 0 20821 0 -402
    600001 5.4483995 0 0 20822 0 -401
    600001 5.4339075 0 0 20823 0 -400
    600001 5.3300753 0 0 20824 0 -399
    600001  4.984716 0 0 20825 0 -398
    600001 5.2140517 0 0 20826 0 -397
    600001 5.4064755 0 0 20827 0 -396
    600001 5.2368565 0 0 20828 0 -395
    600001 5.3253976 0 0 20829 0 -394
    600001 5.2220044 0 0 20830 0 -393
    600001 5.4335613 0 0 20831 0 -392
    600001  5.122608 0 0 20832 0 -391
    600001  5.443592 0 0 20833 0 -390
    600001 5.1272955 0 0 20834 0 -389
    600001  5.153299 0 0 20835 0 -388
    600001  5.018543 0 0 20836 0 -387
    600001 4.8548176 0 0 20837 0 -386
    600001  4.744078 0 0 20838 0 -385
    600001  4.775974 0 0 20839 0 -384
    end
    format %td calday
    location: household's location id
    lconsum: log of energy consumption
    tp: post-treatment indicator; gen tp = (calday >= td(08feb2018))
    randomgrp: treatment group assignment (1, 2 or 3 for the three treatment groups; 0 for the control group)
    calday: calendar date (daily, displayed as e.g. 01jan2017)
    treatgr: treatment indicator;
    gen treatgr = randomgrp if calday >= td(08feb2018)
    replace treat = 0 if treat ==.
    treat_numb_of_days: the number of days before/after the treatment date; gen treat_numb_of_days = calday-td(08feb2018)


    I need to do the following two regressions using quite similar model specifications:
    qui tab calday, gen(day_dummy)
    areg lconsum i.treat tp i.treat#tp#c.treat_numb_of_days#i.day_dummy*, absorb(location) vce(cluster location)
    and
    areg lconsum i.treat tp i.treat#c.treat_numb_of_days#i.calday, absorb(location) vce(cluster location)

    However, the two regressions above use more than 5000 variables, and I get this error:
    maxvar too small
    You have attempted to use an interaction with too many levels or attempted to fit a model with too many variables.
    You need to increase maxvar; it is currently 5000


    So, I have two questions:
    1. Could you please tell me why I end up with so many variables in my regressions? There are only about 680 daily dummies in my original data, so I do not understand why so many variables are used.
    2. If I do not want to use “set maxvar”, how can I modify my code in order to avoid the error message above?


    Thank you.
    Last edited by Katherine Adams; 21 Feb 2019, 07:37.

  • Estrella Gomez
    replied
    Thank you!

  • Steve Samuels
    replied
    With 61 countries with 104 weeks each, there are 61 x 104 = 6344 observations. You've specified 6347 variables. No wonder Stata complains! And you are far from the rules of thumb that require 5-10 observations per fitted coefficient for OLS. The flaw is your attempt to fit all the week-country combinations; this is what your two interaction terms amount to. Note that you don't have a term for comparing importer and exporter countries.
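
    The counts in this reply can be reproduced with display, taking the 32 exporters, 29 importers and 104 weeks from the original post. This is only an arithmetic check of the numbers quoted above, not a diagnosis of the error:

    ```stata
    * Dummies implied by i.ono#i.week and i.cno#i.week together
    display (32 + 29) * 104

    * Adding ldist, dhome and dlang gives the total number of regressors
    display (32 + 29) * 104 + 3
    ```

    The two commands return 6344 and 6347 respectively.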

    I'm no expert in panel data analysis, but I recommend that you consider xtset and an analysis with xtreg or mixed. You can parameterize time with a spline function and fit quite complicated models, whether with "fixed effects" in the panel data sense (a separate parameter, not estimated, for every country) or with countries as random effects.

    Thank you for signing your real name. Now, please make it official by re-registering so that your name appears properly everywhere. You can do it via the CONTACT US button at the bottom right.
    Last edited by Steve Samuels; 11 Nov 2014, 17:20.

  • Nick Cox
    replied
    You could set trace on to try to see precisely what is biting, but at first sight you seem to be asking for an extraordinarily large number of terms to be fitted.
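
    A sketch of how the trace could be run, assuming the regression from the original post. The tracedepth value of 2 is an arbitrary choice to keep the output manageable, and capture noisily is used here only so that tracing is switched off again even if the regression stops with an error:

    ```stata
    * Limit how many nested program levels are traced (2 is an arbitrary choice)
    set tracedepth 2
    set trace on

    * capture noisily still shows output and the error message,
    * but lets the do-file continue to the next line
    capture noisily reg lq ldist dhome dlang i.ono#i.week i.cno#i.week

    set trace off
    ```

    The last traced lines before the error message indicate which step raised error 103.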

  • Estrella Gomez
    replied
    Dear Carlo

    Thank you for your answer. I have checked, and the two datasets are indeed different, but I still get the 103 error message. Any further ideas?

    Thanks,
    Estrella Gomez

  • Carlo Lazzaro
    replied
    Estrella (as per the FAQ, please note that this forum also prefers real full surnames):
    as a first step, you may want to take a look at -help cf- to investigate the differences between the datasets you mention in your query.
    Kind regards,
    Carlo
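
    A sketch of such a comparison with cf. The two file names are placeholders, since the thread does not name the datasets. cf compares variables with the same names and requires both files to have the same number of observations:

    ```stata
    * File names are placeholders for the current and the earlier dataset
    * Quick look at the earlier file's observation and variable counts
    describe using "earlier_data.dta", short

    * Compare every variable in memory against the earlier file
    use "current_data.dta", clear
    cf _all using "earlier_data.dta", verbose
    ```

    The verbose option lists the individual observations that differ rather than only the mismatch count per variable.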
