
  • Stata shows "wrong" value

    Hey,

    I have a dta file with several variables. The properties of the variables in the dta file are all the same (Type "long", Format "%9.0g"). According to the documentation the variables are different:
    Variable 1 has "Length of Byte = 2"
    Variable 2 has "Length of Byte = 12" (plus note "4 - decimal points")
    Variable 3 has "Length of Byte = 1"

    The file can be found here:
    https://www.dropbox.com/s/pcp79apult...ample.dta?dl=0

    For variables 1 and 3 everything seems to be fine. The value shown in the cell is the same as the value shown above. Plus, if I copy the value to the clipboard it also stays the same. However, for variable 2 the value in the cell is basically the same as in the clipboard, but not the same as the one shown above (= field on top of screenshot). What do I have to change to get the correct value (it should be the one in the clipboard)?

    In the end I want to use the data in R. Unfortunately R also gives me the value shown in the top field, which is wrong. So I assume I have to change the dta file first.

    Thanks in advance!

  • #2
    These are encoded variables, in other words numeric values (what Stata shows in the top box) labelled with the values shown in blue.
    See
    Code:
    help encode
    And in particular the decode bit, so you can get the values you want



    • #3
      Actually, you have the same problem with v1 and v3, you just don't recognize it because of a lucky coincidence.

      The information Stata is showing you in the array of data cells is not the values of the variables in the data set. These variables are all variables with value labels, and you are seeing the labels, not the underlying values. The values shown in the circled screen at the top are the actual data values. If the values you need are what is shown in the array of data cells then you need to -decode- those variables and then -destring- them.

      Code:
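      foreach v of varlist v1 v2 v3 {
          decode `v', gen(_`v')      // copy the value labels into a new string variable
          destring _`v', replace     // convert that string into a numeric variable
          drop `v'                   // drop the original labelled variable
      }
      rename _* *                    // remove the leading underscore to restore the names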
      If you are not familiar with value labels, read -help label- and the associated sections of the PDF manuals that are installed with your Stata. Also be sure to read -help decode- and -help destring- so you understand what these are doing.

      By the way, the way I can tell that v1 and v3 are also affected by the same problem is that they are all shown in blue in the Browser. When Stata is showing you the real values of numeric variables, they are displayed in black. Value labels are shown in blue. String variables are shown in a color that may appear either red or brown depending on your display. It just happens that in your data set for v1, the label for 1 is 01 and the label for 2 is 02, and similarly for v3, the label for 1 is 1 and the label for 2 is 2, so the problem isn't as obvious.
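If you would rather not rely on the colours, here is a minimal sketch of how to check for value labels from the command line. It uses a toy data set rather than your file: the label names (v1lbl, v2lbl), the label text for v2, and the new variable names (v2_text, v2_num) are made up for illustration.

```stata
clear
* Toy data: two long variables that both hold the codes 1 and 2
input long v1 long v2
1 1
2 2
end

* Attach value labels (hypothetical label names and label text)
label define v1lbl 1 "01" 2 "02"
label define v2lbl 1 "12.3456" 2 "7.0001"
label values v1 v1lbl
label values v2 v2lbl

* The "value label" column names the label attached to each variable
describe v1 v2

* Show the mapping from stored values to label text
label list v1lbl v2lbl

* Same data twice: first with labels, then with the stored values
list v1 v2
list v1 v2, nolabel

* Turn the label text of v2 into a real numeric variable
decode v2, generate(v2_text)
destring v2_text, generate(v2_num)
list v2 v2_text v2_num, nolabel
```

In this toy example v1 looks harmless because its label "01" reads as the stored value 1, while v2 makes the difference obvious. The -nolabel- option is also accepted by -browse-, so you can make the same comparison in the Browser.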

      Added: Crossed with #2.



      • #4
        Thanks a lot for your replies!

        True, I didn't think of them not being shown in black (probably because no variable is shown in black and I'm not using Stata that often).

        I also managed to solve the problem directly in R by creating factors from value labels and then converting them to numbers (without changing the dta file).
