  • Problem for append datasets

    Hello,

    I'm new to Stata and I'm trying to append several datasets, but there is one variable that I cannot get to append correctly. In the original dataset this variable is stored as double, while in the last dataset to be appended, which was imported into Stata from a CSV file, it was imported as a string. The variable records each individual's income for a specific month. When I run the append and force it, Stata recodes the values to different ones. To solve this I tried:
    1. destring the variable (before the append) --> this did not work, because it stops with an error about nonnumeric values
    2. encode and recast --> this does not produce any error

    However, when I finally append the datasets with the variable converted as in option 2, the result is the same as at the beginning: Stata assigns new values to each observation.
    The dataex output after appending (observations 1/15):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input double P21
      2
      2
      2
      2
      2
      1
      2
     98
      2
      2
      1
      2
    376
    278
      2
    end
    label values P21 P21_NN
    label def P21_NN 1 "-9", modify
    label def P21_NN 2 "0", modify
    label def P21_NN 98 "16,000", modify
    label def P21_NN 278 "40,000", modify
    label def P21_NN 376 "8,000", modify
    How can I resolve this so that the datasets append correctly? I'm sure I'm overlooking some of Stata's power to resolve this kind of problem.
    Many thanks!


    Bruno

  • #2
    Welcome to Statalist.

    The second option is never a good idea. As the output of help encode instructs us

    Do not use encode if varname contains numbers that merely happen to be stored as strings; instead, use generate newvar = real(varname) or destring; see real() or [D] destring
    Your data - the part that is correct - consists of numbers stored as strings, so don't use encode. You need instead to deal with the incorrect data by correcting it, so that destring will work, or choose to replace it with missing values. But you must understand what the problem in the data is before you choose what to do.
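
    To see why encode produces values like those in your dataex, here is a minimal sketch on made-up data; the variable names income_str and income_code are hypothetical. encode numbers the distinct strings 1, 2, 3, ... in sort order and keeps the original text only as value labels, so the stored numbers are codes, not incomes.

```stata
* Made-up strings resembling the value labels in the dataex above
clear
input str6 income_str
"-9"
"0"
"16,000"
"8,000"
end

* encode assigns integer codes in sort order of the strings
encode income_str, generate(income_code)

* nolabel shows the stored codes instead of the value labels
list income_str income_code, nolabel
```

    With nolabel, the listing shows the codes 1 to 4 rather than the incomes themselves, which is the same pattern as P21 in your appended data.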

    Here is some example code that may lead you in a useful direction.
    Code:
    . describe x
    
                  storage   display    value
    variable name   type    format     label      variable label
    ------------------------------------------------------------------------------------------------
    x               str5    %9s                   
    
    . destring x, replace
    x: contains nonnumeric characters; no replace
    
    . tab x if missing(real(x))
    
              x |      Freq.     Percent        Cum.
    ------------+-----------------------------------
             NA |          1       50.00       50.00
            two |          1       50.00      100.00
    ------------+-----------------------------------
          Total |          2      100.00
    
    . replace x = "2" if x=="two"
    (1 real change made)
    
    . destring x, replace force
    x: contains nonnumeric characters; replaced as byte
    (1 missing value generated)
    
    . tab x, missing
    
              x |      Freq.     Percent        Cum.
    ------------+-----------------------------------
              1 |          1       20.00       20.00
              2 |          1       20.00       40.00
              3 |          1       20.00       60.00
              5 |          1       20.00       80.00
              . |          1       20.00      100.00
    ------------+-----------------------------------
          Total |          5      100.00
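
    Applied to your variable, the same steps might look like the sketch below. It assumes, judging only from the value labels in your dataex, that the nonnumeric characters in P21 are thousands separators such as the comma in "16,000"; check the tabulate output on your real data before relying on that. The five values are typed in by hand from those labels.

```stata
* Hand-typed values taken from the value labels in the dataex above
clear
input str6 P21
"-9"
"0"
"16,000"
"40,000"
"8,000"
end

* Show the values that real() cannot convert as they stand
tabulate P21 if missing(real(P21))

* Assumption: commas are the only nonnumeric characters, so strip them
destring P21, replace ignore(",")

* P21 is now numeric and holds the incomes themselves
assert P21 == 16000 in 3
describe P21
```

    If tabulate shows other entries (text such as NA, for example), decide first whether to correct them or set them to missing, as in the log above.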



    • #3
      Hello William,

      Thanks for your reply and the information. I had actually tried to understand the problem behind this issue, but thought there was another way to solve it instead of using the replace command. The steps you indicated work perfectly, and the append was done correctly.

      Thank you for your support, and again for the info. I will follow your suggestion next time and first work out what kind of problem I'm facing.

      bye!!

      Bruno V.-
