  • Too Many Values

    Hi all,

    I'm working with a dataset of around 350,000 rows and 20 columns. Some variables are recognized as strings for some reason, and when I try to encode them Stata returns the error "too many values", r(134). All I'm trying to do is convert the variable into a numeric variable by using encode. I also tried destring, but it gives me the error "contains nonnumeric values", and I cannot find any such values in my dataset.

    Please help me out,

    Luc

  • #2
    Here is one way to find the nonnumeric values:
    Code:
    ta var if real(var)==.
    Replace "var" with your actual variable name(s). Note that if your variable is meant to be numeric, then -encode- is not the right command anyway. If this is insufficient, please read the FAQ and supply a data example using the -dataex- command, posting the results within CODE blocks.
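
To make the check in #2 concrete, here is a minimal sketch of how the diagnosis and conversion might fit together. The variable names price_str and price_num are hypothetical, and the assumption that stray spaces are the offending characters is only an illustration; the tabulation shows what actually needs handling in your data.

```stata
* Hypothetical variable: price_str, a string variable that should be numeric.

* Show the distinct values that cannot be read as numbers.
* Empty strings are excluded here so that real problems stand out.
tabulate price_str if missing(real(price_str)) & !missing(price_str)

* Count the affected observations, including empty strings.
count if missing(real(price_str))

* Assumption: the tabulation shows only stray spaces as the culprit.
* ignore() removes the listed characters before converting;
* generate() keeps the original string variable for comparison.
destring price_str, generate(price_num) ignore(" ")
```

Using generate() rather than replace leaves the original variable in place, so the conversion can be checked before the string version is dropped.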



    • #3
      To give Luc a better understanding of a likely origin of this problem, I'd point to a common and largely invisible issue that can show up when data come from a CSV or Excel file. Columns or rows can contain blank (or nonprinting) characters that the user doesn't see but that are included in the import. (This most commonly happens as a blank row at the end of the input file.) Stata sees a "blank" value within a column, which is nonnumeric, and so defaults to treating the whole variable as a string. The solution is to always clean up or exclude blank rows at the end of a CSV or Excel file and, less commonly, to check for and handle the same issue in blank columns on the right-hand side. I'd also say that understanding the difference in function between -encode- and -destring- is important here.
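
The cleanup described in #3 could be sketched roughly as follows. This assumes Stata 14 or later (for the Unicode functions), and the variable names id_str, price_str, price_num, region, and region_code are hypothetical. The last two commands illustrate the difference between -destring- and -encode-: -encode- creates one value label per distinct string, so applying it to a numeric-looking variable with many distinct values is a likely way to hit r(134).

```stata
* Hypothetical variables: id_str and price_str (strings), region (categorical string).

* Remove leading and trailing whitespace, including Unicode whitespace.
replace price_str = ustrtrim(price_str)

* Remove non-breaking spaces, which often survive an Excel or CSV export.
replace price_str = subinstr(price_str, uchar(160), "", .)

* Drop rows that are empty in the key variables, such as a trailing blank row.
drop if missing(id_str) & missing(price_str)

* destring: the string holds numbers, so the numeric values are preserved.
destring price_str, generate(price_num)

* encode: the string holds category names, so each distinct string
* becomes an integer code with a value label attached.
encode region, generate(region_code)
```

If -destring- still reports nonnumeric content after these steps, tabulating the values for which real() returns missing shows what remains.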
