
  • problem maxvar too small

    I have a data set with all the state abbreviations. First, I used -encode- with the gen() option to convert the variable from a string to a numeric variable. But when I ran -tab-, the state names showed up as weird symbols and random numbers and letters. When I tried to put the variable in a regression, I got the error "maxvar too small".
    Can someone tell me if I did something wrong or if I need to change anything? (I have only basic Stata knowledge.)

  • #2
    Show us the results of the -describe- command on your abbreviation variable and your name variable (sounds like it is another variable??), and your encoded version of the abbreviation variable. Also, show us the -tab- command you used that produces the "weird" results, and some of those results. Also, when you tabulate your encoded variable, how many different values does it show? How many state values (U.S.??) should there be in your dataset? Seeing the command you used to encode the variable and the regression command you used would also be important.
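
    The checks requested above might look like the following sketch. The variable names are assumptions: `state_abbr` stands in for the original string variable (its real name is not given in the thread), and `State` is the encoded variable.

    ```stata
    * Assumed names: state_abbr = original string variable (hypothetical),
    * State = encoded copy
    describe state_abbr State

    * One-way table of the original strings, including missing values
    tabulate state_abbr, missing

    * Compact summary that reports the number of unique values
    codebook State, compact
    ```

    If the "Unique" column for State shows far more than about 50 values, the source variable probably holds something other than clean state abbreviations.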

    General advice: I'd encourage you to take another look at the FAQ (top left of your screen) for new StataList members, as it might help you now and in the future post a question so as to get a quicker and more helpful answer.



    • #3
      This is the problem I am experiencing (I have a data set with a lot of observations, each with a corresponding U.S. state):

      ==>
      describe State

      Variable      Storage   Display    Value
          name         type    format    label      Variable label
      ------------------------------------------------------------------------------
      State           long    %8.0g      State      state

      . reg incident_nr i.State
      maxvar too small
      You have attempted to use an interaction with too many levels or
      attempted to fit a model with too many variables. You need to increase
      maxvar; it is currently 2048. Use set maxvar; see help maxvar.

      If you are using factor variables and included an interaction that has
      lots of missing cells, try set emptycells drop to reduce the required
      matrix size; see help set emptycells.

      If you are using factor variables, you might have accidentally treated a
      continuous variable as a categorical, resulting in lots of categories.
      Use the c. operator on such variables.
      r(907);

      And when I use -tab- I get this weird outcome:
      �L | 1 0.00 99.99
      �Y | 1 0.00 99.99
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �E | 1 0.00 100.00
      �Q | 1 0.00 100.00
      �Z | 1 0.00 100.00
      �o | 1 0.00 100.00
      �p | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      �� | 1 0.00 100.00
      ------------+-----------------------------------
      Total | 274,077 100.00



      • #4
        First of all, you didn't respond to some of the things I asked, so that makes it difficult to help you.

        1. "Show us the results of the -describe- command on your abbreviation variable"
        What you showed is for the State variable, not the abbreviation variable you said you encoded. Seeing -describe- for the abbreviation variable and a tabulation of it is essential. Please do that and post your results between "CODE delimiters" as prescribed in the FAQ, as that makes them much easier to read. In doing that, show us everything Stata produced, including your -tab- command.

        2. "Seeing the command you used to encode the variable ... would also be important."
        You didn't offer that to us.

        What appears to be going on here is that you *think* you are using -encode- on a variable that takes on the standard U.S. state abbreviations, i.e., AK, AL, AZ, ..., of which there should be only 50 (or 51, including DC) values. Encoding that should result in a variable with the values 1, ..., 50, which apparently is not what happened, since 50 different values would not provoke the "maxvar too small" error message. So, here are the possibilities I can imagine:

        a) Your "abbreviation" variable contains something other than what you think it does, presumably a lot of "junk." When you see that tabulation for your abbreviation variable per 1), it should show 50 values, each consisting of the two letters, and no junk. Looking at that may instantly show you the problem. The fact that you didn't mention this makes me think you haven't tried it.

        b) Perhaps your abbreviation variable has acceptable content, but you accidentally used -encode- on some other variable. That's why I asked you about your -encode- command. You might *retry* -encode- on your abbreviation variable. (I'd guess you did this, but you didn't say, so I don't know.)

        c) You correctly applied the -encode- command to the correct variable, and that variable has the right content, but some weird glitch of some unknown kind occurred. This is unlikely in the extreme, but the "retry" I suggested in b) should rule that out.
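
        One way to work through possibilities a) and b) is sketched below. It is only an illustration: `state_abbr`, `bad_abbr`, and `state_id` are hypothetical names, and it assumes that a valid value is exactly two uppercase letters after trimming blanks.

        ```stata
        * Assumption: state_abbr is the original string variable (hypothetical name)
        * Flag values that are not exactly two uppercase letters
        generate byte bad_abbr = !regexm(strtrim(state_abbr), "^[A-Z][A-Z]$")

        * How many observations hold something other than a clean abbreviation?
        count if bad_abbr

        * Show any flagged values that fall within the first 200 observations
        list state_abbr if bad_abbr in 1/200

        * Retry -encode- on the clean values only, into a new variable
        encode state_abbr if !bad_abbr, generate(state_id)

        * This table should now show roughly 50 (or 51) rows
        tabulate state_id

        * The regression from #3, using the re-encoded variable
        regress incident_nr i.state_id
        ```

        Observations flagged by `bad_abbr` get a missing `state_id` and drop out of the regression, so it is worth finding out where the junk came from rather than just discarding it.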

