
  • Why not infinite number of variables in Stata?

    Dear All,

    I am curious to know if there is a way to have more than 120,000 variables (ideally infinite) in Stata, for instance by using extremely powerful machines. If not, why is such a limit set?

    This repeatedly becomes important for me because I generate a lot of variables (or tempvars) for the analysis I am currently doing. I cannot create all the variables up front because of the maxvar limit, so I have to generate them in batches and run the rest of the program on each batch. This solves my problem, but it increases the runtime of my program considerably, because a lot of things get repeated in the loops that generate the variables in batches.

    I am just wondering whether this limit could, ideally, be set to infinity.

    Sorry if my concern looks silly. I am just curious to know.

    Thanks
    Amit

  • #2
    help limits cites Stata's limits and at the time of writing 120,000 variables is the absolute upper limit.

    Otherwise this is about the way StataCorp writes low-level code and why -- or about why your programs need thousands of variables, and it is hard to say more about either.
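For reference, the current ceiling can be checked from within Stata itself. The following is only a minimal sketch: it relies on the c-class values c(maxvar), c(k) and c(N), and the commented-out set maxvar line assumes Stata/SE or Stata/MP (the value 32767 is just an illustration, not a recommendation).

```stata
* Report the current ceiling and how much of it is in use
display "maxvar setting:      " c(maxvar)
display "variables in memory: " c(k)
display "observations:        " c(N)

* Assumption: Stata/SE or Stata/MP. The ceiling can be raised up to the
* maximum documented in -help limits- for your flavor, for example:
* set maxvar 32767
```

Typing help limits remains the authoritative source for the maximum that each flavor allows.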



    • #3
      With the frames command, you can now have 12,000,000 variables at a time.

      Nick already hints at the lurking question: why do you need more than 120,000 variables? Are you sure this is the best approach to code what you want?
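A rough sketch of how frames spread variables over several data sets is given below. The figure quoted above presumably comes from each frame holding its own data set with its own variable limit. The frame name extra and the variables id, x1 and z1 are made up for illustration; frame create, frlink and frget are standard commands in Stata 16 or later.

```stata
clear all
set obs 5
generate id = _n
generate x1 = runiform()

* A second frame holds its own data set and therefore its own variables
* ("extra" is a hypothetical frame name)
frame create extra
frame extra {
    set obs 5
    generate id = _n
    generate z1 = rnormal()
}

* Link the two frames on id and copy a variable across only when needed
frlink 1:1 id, frame(extra)
frget z1, from(extra)
list
```

Whether this helps depends on the analysis: most estimation commands still see only the variables in the current frame.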
      Last edited by daniel klein; 02 Mar 2021, 11:17.



      • #4
        Thanks Nick and Daniel. I will look into both altering my programs and the frames command.

        On why I need thousands of variables: I have a variable representing the age of respondents (0 to 120 years) and was doing some analysis for each single-year population as well as for each combination of age groups, say people aged 1-2 years, 2-3 years, 3-4 years, ..., 1-3 years, 2-4 years, ..., 1-5 years, 2-6 years and so on. I was trying to create one new variable for each such age group. This resulted in more than 12,000-14,000 new age-group variables, and combining them with just a few more variables exceeded all the limits. I knew I had to redesign my code, but I posted here to seek help before rewriting it.

        Thanks.



        • #5
          "I was trying to create new variables one for each such age group."
          But that's a terrible idea! Once you had that data set, even if there were no limit imposed on the number of variables, it would be essentially unusable in Stata anyway because very few Stata commands are set up to work with that kind of wide layout data.

          What you should be doing instead is creating a long layout data set in which age_group is a single variable taking on the values 1-2 years, 2-3 years, 3-4 years, etc., with the corresponding data values stored alongside it in each observation. Nearly all Stata commands are either optimized for long layout or able to work only with long layout. And the limit on the number of observations is large enough to accommodate all of this.
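One way the long layout could be built is sketched below, under stated assumptions: the respondent data (id, age, y) are simulated, ages run only from 0 to 10 to keep the example small, and each age group is identified by hypothetical lower and upper bounds lo and hi. It pairs every group with every respondent using cross, keeps the pairs where the respondent's age falls inside the group, and then summarizes by group.

```stata
clear all
set seed 12345
local maxage 10
local n_ages = `maxage' + 1

* Hypothetical respondent data: one row per person
set obs 500
generate long id = _n
generate int age = floor(`n_ages' * runiform())
generate double y = rnormal()
tempfile people
save `people'

* One row per age group, defined by bounds lo <= hi
clear
set obs `n_ages'
generate int lo = _n - 1
expand `n_ages'
bysort lo: generate int hi = _n - 1
keep if hi >= lo
generate age_group = string(lo) + "-" + string(hi)

* Long layout: one row per (age group, respondent in that group)
cross using `people'
keep if inrange(age, lo, hi)

* One result row per age group instead of one variable per age group
collapse (mean) mean_y = y (count) n = y, by(lo hi age_group)
list in 1/5
```

With real data the cross step can become large, since it briefly holds every group paired with every respondent, but it stays within the observation limit far more easily than thousands of indicator variables stay within maxvar. Per-group models could be run on the long data before the collapse step using the by prefix or statsby.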

