  • Label values

    Hello, I have been trying to generate a variable called animal1 following this code:
    Code:
    gen byte animal1=(a34animal1==1|a34animal_2==1|a34animal_3==1|a34animal4==1|a34animal_5==1|a34animal_6==1|a34animal_7==1)
    which means that animal1 should equal 1 if at least one of the a34animal variables equals 1, and 0 otherwise. But when I check the values of animal1, something has clearly gone wrong.
    Code:
     
    list a34animal1 a34animal_2 a34animal_3 a34animal4 a34animal_5 a34animal_6 a34animal_7 animal1 in 6637/6646
    
          +--------------------------------------------------------------------------------------+
          | a34an~l1   a34ani~2   a34ani~3   a34an~l4   a34ani~5   a34ani~6   a34ani~7   animal1 |
          |--------------------------------------------------------------------------------------|
    6637. |        0          0          0          1          0          0          0         1 |
    6638. |        6          0          0          1          0          0          0         1 |
    6639. |        1          0          0          1          0          0          0         1 |
    6640. |        6          0          0          1          0          0          0         1 |
    6641. |        1          0          0          1          0          0          0         1 |
          |--------------------------------------------------------------------------------------|
    6642. |        6          0          0          1          0          0          0         1 |
    6643. |        6          0          0          1          0          0          0         1 |
    6644. |        0          0          0          1          0          0          0         1 |
    6645. |        1          6          0          1          0          0          0         1 |
    6646. |        4          6          0          1          0          0          0         1 |
          +--------------------------------------------------------------------------------------+
    I am guessing the problem lies in the value labels, but I don't know how to fix it.
    The Excel file and the Data Editor both show the data as listed above, but the dataex output below shows different underlying values for a34animal1 and a34animal4.
    These two variables were originally string variables (because they contain some non-numeric characters), and I encoded them. That is probably the source of the issue, but I can't pinpoint it yet. I would like to get the original values back into my encoded variables but don't know how.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long a34animal1 byte(a34animal_2 a34animal_3) long a34animal4 byte(a34animal_5 a34animal_6 a34animal_7 animal1)
     1 0 0 1 0 0 0 1
    11 0 0 1 0 0 0 1
     2 0 0 1 0 0 0 1
    11 0 0 1 0 0 0 1
     2 0 0 1 0 0 0 1
    11 0 0 1 0 0 0 1
    11 0 0 1 0 0 0 1
     1 0 0 1 0 0 0 1
     2 6 0 1 0 0 0 1
     9 6 0 1 0 0 0 1
    end
    label values a34animal1 a34animal1
    label def a34animal1 1 "0", modify
    label def a34animal1 2 "1", modify
    label def a34animal1 9 "4", modify
    label def a34animal1 11 "6", modify
    label values a34animal4 a34animal4
    label def a34animal4 1 "0", modify

  • #2
    There is another way to do this, which is

    Code:
    gen byte animal1= inlist(1, a34animal1, a34animal_2, a34animal_3, a34animal4, a34animal_5, a34animal_6, a34animal_7)
    Evidently some values that should be 0 are in fact 1, and other problems follow from that. That is what encode does: in the absence of instructions to the contrary, it assigns integer values from 1 upwards to the distinct string values, keeping the original text only as value labels.

    In a nutshell, you needed destring not encode.
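
    As a small sketch of the difference, using made-up toy data rather than your dataset (the names s, s_enc, s_num and s_back are purely illustrative), the same string variable can be converted both ways and compared. The decode step at the end suggests that the original text is still held in the value labels.

```stata
clear
input str2 s
"0"
"1"
"4"
"6"
"1"
end

* encode numbers the sorted distinct strings 1, 2, 3, ... and
* keeps the original text only as value labels
encode s, generate(s_enc)
list s s_enc, nolabel

* destring converts the text itself to numbers
destring s, generate(s_num)

* so the string "0" is stored as 1 after encode but as 0 after destring
assert s_enc == 1 if s == "0"
assert s_num == 0 if s == "0"

* decode turns the value labels back into the original strings
decode s_enc, generate(s_back)
assert s_back == s
```

    A test such as s_enc == 1 therefore picks out the rows labelled "0", not the rows whose value really is 1.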

    The problem is not easy to fix as your variables have evidently been encoded differently. Again, that is a consequence of what encode does: it treats variables separately unless you specify a standard set of labels.
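
    To illustrate with toy data (again not your dataset; the variable names and the label name common are hypothetical), two string variables encoded separately can end up with different codes for the same text, whereas pointing both calls at one value label through the label() option keeps the mapping consistent.

```stata
clear
input str1 a str1 b
"0" "1"
"1" "6"
"6" "1"
end

* encoded separately: each variable gets its own mapping
encode a, generate(a_sep)
encode b, generate(b_sep)
* "1" is code 2 in a_sep but code 1 in b_sep
assert a_sep == 2 if a == "1"
assert b_sep == 1 if b == "1"

* encoded with one shared label: the second call reuses
* the mapping created by the first
encode a, generate(a_com) label(common)
encode b, generate(b_com) label(common)
assert a_com == 2 if a == "1"
assert b_com == 2 if b == "1"
```

    Even with a shared label the codes are still arbitrary integers, so for data that are really numeric destring remains the more direct route.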

    You probably need to go back to the original dataset, import again, then fix non-numeric input, and then destring.
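
    A minimal sketch of that sequence follows. It assumes the offending entries look like "n/a", which is a pure guess since the actual non-numeric input is not shown, and it uses input in place of a real import excel call so that it runs on its own.

```stata
* stand-in for the freshly imported string variable
* ("n/a" is a hypothetical example of non-numeric input)
clear
input str3 a34animal1
"0"
"6"
"1"
"n/a"
"4"
end

* show the entries that cannot be read as numbers
tabulate a34animal1 if missing(real(a34animal1))

* fix them; here they are simply blanked out, which becomes
* numeric missing after destring
replace a34animal1 = "" if a34animal1 == "n/a"

* destring refuses to convert while non-numeric text remains,
* so this only succeeds once the fix above is complete
destring a34animal1, replace

* the test against 1 now behaves as intended
generate byte animal1 = inlist(1, a34animal1)
assert animal1 == (a34animal1 == 1)
```

    How the non-numeric entries should be fixed (blanked, recoded, or corrected at source) depends on what they mean in your data.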

    As an extra detail, using underscores in some variable names and not others exposes you to other small complications and confusions.

    Sorry not to have better news.
    Last edited by Nick Cox; 10 Nov 2022, 00:52.


    • #3
      Surprisingly, it worked perfectly after I imported the two variables in question again from Excel and removed the non-numeric input.

      Thank you loads!
