  • Problem with variable and its format.

    Hello, I have a mock data set (HFCS dataset) in which one variable is "country", showing the two letters "AT" for each of the few observations.

    Now the actual data is stored at the university (security reasons) so I have to write the code and then let it run through the data there, when I am done.

    I am having trouble with the aforementioned country variable, though. Even though it appears in the data set as letters (AT), it seems to be encoded as numeric or as something else.

    If I type keep if country == "AT" I get a type mismatch error. The variable is type long and the format is %8.0g (whatever that means).

    So I tried to change it via tostring and typed in the command

    tostring country , generate(count)

    But now the new variable doesn't have AT anymore; instead it has the number 1 for each entry. If I now enter the command

    keep if count == "1" (it doesn't work without the "", so it seems to be coded as non-numeric), it at least runs and seems to yield no errors.

    I was just wondering, is there a way to change the country variable so that I keep the two letters rather than turning them into a number? Otherwise it will get very confusing once I have access to the entire data set, which has over 20 different countries.
    Last edited by Oscar Weinzettl; 18 Mar 2019, 08:11.

  • #2
    You could type - codebook country -, check the numeric code for each category, and then use that code in the "if" qualifier without quotation marks.

    Logically, this advice assumes you won't transform the variable to string, but just use it as it is.

    Should you wish to use the variable as a string, the solution is exactly the opposite, i.e., put the number for each category in quotation marks whenever you deal with the string version. That is why it "worked" and gave no error when you did so.
    Last edited by Marcos Almeida; 18 Mar 2019, 08:37.
    Best regards,

    Marcos
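
    The first suggestion can be sketched as follows. This is only an illustration and assumes that codebook reports AT as the numeric code 1; the actual code has to be read off the codebook output.

    ```stata
    * Inspect the numeric codes and the labels attached to them
    codebook country

    * Filter on the numeric code, without quotation marks
    * (assumption: the codebook output shows AT stored as 1)
    keep if country == 1
    ```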


    • #3
      You are looking at a value label. To learn more, do read the relevant chapters on data in Stata in [U] (pdf documentation) and

      Code:
      help label
      help decode
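
      As a rough sketch of what working with a value label can look like: the label name sa0100 below is an assumption taken from the codebook output reported in #4, and describe shows the name actually attached to the variable.

      ```stata
      * Show the storage type and the name of the value label attached to country
      describe country

      * List the mapping from numeric codes to label text
      * (assumption: the value label is called sa0100)
      label list sa0100

      * Look up the numeric code from the label text, so the condition stays readable
      keep if country == "AT":sa0100
      ```

      The "text":labelname expression evaluates to the numeric code behind the label, so the variable can stay numeric while the code still mentions AT.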



      • #4
        In reply to Marcos Almeida (#2):
        Do you mean I should leave out the "" and type keep if country == AT? If I do that, it tells me AT not found, even though it is there.

        codebook country tells me its type is numeric (long), its unique values, and its label, sa0100 (the old variable name, which I changed to country).

        Then under tabulation it tells me freq = 12, numeric = 1 and label = AT.

        But, sorry, how do I not transform the variable to string? Is there a way to get Stata to recognize the AT? Right now it does not work whether I put it in "" or not. Even though the data editor tells me the variable's label is AT, the value really is 1, and if I type keep if country == 1, then it works. So I guess it isn't coded as AT at all, even though it appears that way?
        Last edited by Oscar Weinzettl; 18 Mar 2019, 10:16.



        • #5
          #4 Most of your questions are answered if you take the advice in #3 seriously.

          Evidently country is numeric with value 1 and value label "AT". That being so,

          Code:
          decode country, generate(COUNTRY)
          is a way to get a string variable out of your numeric variable.

          In giving you that example, I don't mean to imply that using variable names that are ALL CAPS is good style; I am just emphasising that you need a different name for the resulting variable.

          As you've found, tostring country just produces a string variable with value "1", which isn't very useful. But you shouldn't be surprised with

          Code:
          tostring country, gen(count)
          that you need thereafter to use qualifiers such as
          Code:
          ... if count == "1"
          as count is a string variable (you got what you asked for) and its values must be given within quotation marks.
          Last edited by Nick Cox; 18 Mar 2019, 10:38.
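
          The difference between decode and tostring can be reproduced on a toy data set. This is a sketch only: the three observations, the code 2 with label "BE", and the new variable names country_str and country_num are made up for illustration; only the label name sa0100 and the pairing of 1 with "AT" come from the thread.

          ```stata
          clear
          * Toy data: a numeric variable stored as long, as in the thread
          input long country
          1
          1
          2
          end

          * Attach a value label so that 1 displays as AT (2 = BE is hypothetical)
          label define sa0100 1 "AT" 2 "BE"
          label values country sa0100

          * decode copies the label text into a new string variable
          decode country, generate(country_str)

          * tostring copies the underlying numbers as text, ignoring the labels
          tostring country, generate(country_num)

          * nolabel shows the stored values rather than the labels
          list, nolabel

          * The decoded variable can be compared with the two-letter code
          keep if country_str == "AT"
          ```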
