  • When appending goes wrong because of leading zeros

    Hi everybody,

    I've spent hours trying to fix a problem that may be simple: when I append some datasets, variables that have been encoded turn into 3-digit values. To be precise:

    My dataset contains, among others, two variables that should be numeric but are stored as strings because of entries like "#NI#". Since I need them as numeric, I used "encode", which worked well [Stata didn't allow "destring"]. However, the new variables show leading zeros, and these seem to be the issue when I append the datasets: only where there are leading zeros do the observations become 3-digit numbers.

    Here is the script:

    encode v28, gen (v28_n)
    encode v14, gen (v14_n)

    rename v28 v28str
    rename v14 v14str

    rename v28_n v28
    rename v14_n v14

    drop v14str
    drop v28str

    recast double v28
    recast double v14

    How can I remove the leading zeros or avoid the 3-digit problem when I append the datasets?

    Thank you!

  • #2
    I don't understand all of this. You don't present a data example, which is the most useful part of a question for making clear what your data are like.

    My guess is that encode is quite wrong here. encode is for genuine strings that you want a numeric version of. It's not at all for fixing small problems with numeric variables that have been misleadingly read as string. It's not the case that Stata doesn't allow destring: it is just that destring needs to be specified with options tailored to whatever the problems in your data are.
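
    To illustrate why encode misleads here, a minimal sketch follows. The values are made up and only the variable name v14 comes from the original post. encode assigns integer codes 1, 2, 3, ... to the distinct strings in sorted order and attaches the strings as value labels, so the number stored is a category code and not the number that the string spells out.

    ```stata
    * Hypothetical data: the values below are invented for illustration.
    clear
    input str4 v14
    "007"
    "012"
    "#NI#"
    "150"
    end

    * encode stores category codes; the original strings survive only as labels.
    encode v14, gen(v14_n)

    * With labels shown, v14_n looks like the original strings.
    list v14 v14_n

    * With labels suppressed, the stored integer codes are visible.
    list v14 v14_n, nolabel
    ```

    Comparing the two listings shows the gap between what is displayed and what is stored, which is one plausible reason why encoded variables look wrong once datasets are combined.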

    I wouldn't encode at all. I would just append the string variables and then work at them.

    Suppose you have a variable strvar that "should be" numeric but is string.

    Code:
     
    list strvar if missing(real(strvar))
    lists values that can't be converted to numeric. If there are lots of them then

    Code:
     
    tab strvar if missing(real(strvar))
    classifies. It is that detail that we need to see to advise. What does #NI# mean? If it's just a code for missing then destring, force would solve that problem, but there may be others.
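
    Putting these steps together, here is a hedged sketch that assumes "#NI#" is only a code for missing. The data values and the names strvar_clean, num1 and num2 are hypothetical; strvar is the placeholder name used above.

    ```stata
    * Hypothetical data: invented values, with "#NI#" assumed to mean missing.
    clear
    input str4 strvar
    "007"
    "012"
    "#NI#"
    "150"
    end

    * Check first which values cannot be converted to numeric.
    list strvar if missing(real(strvar))

    * Option 1: blank out the known code explicitly, then convert.
    generate strvar_clean = strvar
    replace strvar_clean = "" if strvar_clean == "#NI#"
    destring strvar_clean, generate(num1)

    * Option 2: let destring turn every non-numeric value into missing.
    destring strvar, generate(num2) force

    list strvar num1 num2
    ```

    Option 1 only removes the one code you have inspected, so any other unexpected text still stops destring and gets noticed. Option 2 is shorter but silently maps anything non-numeric to missing, which is why the check with list or tab should come first.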



    • #3
      Dear Nick,

      The codes just gave the message "no missing values". However, I tried "destring" as you suggested and it solved my problems.

      Thank you so much!
