  • Issue with dummy variable when creating descriptive statistics

    Hi,

    I am writing my thesis and have encountered an issue with a dummy variable. In the data (when I type br as a command) the values are either 0 or 1, but when I create the descriptive statistics the min value is reported as 1 and the max as 2. I have no idea why, since I've never had this issue before, and I would appreciate any guidance on how I should proceed.

    Here's my descriptive table:

    Summary statistics
                     N      Mean    Median        SD    Min     Max
    g gdp         1367       .04      0.04       .09   -.61     .58
    length        1391     29.54     29.00     16.83      1      63
    GB2           1391       1.4      1.00       .49      1       2
    g prim1       1342       .04      0.04       .07   -.53     .84
    g death       1360      -.02     -0.02       .02    -.1      .1
    patent1       1069      2.49      2.45       .46   1.05    4.25
    polity2       1372     -1.58     -4.00      5.94     -9      10
    g capital     1079         0      0.01       .25   -2.9    2.29
    g gov con     1254         0      0.00       .16   -.95    1.62
    g trade       1307         0      0.01       .14   -.98     .76
    g pop         1367       .03      0.03       .01   -.02     .07
    initial gdp   1391  17214.88   6350.00  28651.96    369  119367

    And here is the command that I typed in:

    local indvars g_gdp length GB2 g_prim1 g_death patent1 polity2 g_capital g_gov_con g_trade g_pop initial_gdp
    asdoc sum `indvars', stat(N mean p50 sd min max), replace dec(2)

  • #2
    We can't tell confidently what was done earlier to your data, but here is a guess. Your dummy variable was a string variable with values like "0" and "1". Then it was encoded.

    With nothing else said, encode would map strings "0" and "1" to numeric values 1 and 2 and the string values would become value labels.
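
    A minimal sketch of that guess, for anyone who wants to see it happen. The data are made up and GB2_str is a hypothetical name for the original string variable; nothing here is taken from the poster's dataset.

    ```stata
    * Toy data: a dummy stored as the strings "0" and "1" (assumed, not the real data)
    clear
    input str1 GB2_str
    "0"
    "1"
    "1"
    "0"
    end

    * encode sorts the distinct strings and numbers them 1, 2, ...
    encode GB2_str, generate(GB2)

    * The Data Browser and list show the labels "0" and "1" ...
    list GB2_str GB2

    * ... but the stored values are 1 and 2, which is what summarize reports
    summarize GB2
    label list GB2
    ```

    The value label created by encode is what makes the Browser show 0 and 1 while the stored numbers are 1 and 2.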

    That guess is consistent with the evidence you give.

    To be of any use for model fits, you need a (0, 1) variable which you can get just by subtracting 1. You should then fix the value labels (which could mean removing them).
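
    A sketch of that fix on toy data, assuming the encoded variable is called GB2 and carries a value label created by encode (GB2_str is a hypothetical name for the string original):

    ```stata
    * Toy setup that recreates the problem (assumed data, not the poster's)
    clear
    input str1 GB2_str
    "0"
    "1"
    "1"
    "0"
    end
    encode GB2_str, generate(GB2)

    * Shift 1 and 2 down to 0 and 1
    replace GB2 = GB2 - 1

    * Detach the old value label, which would otherwise show the stored 1 as "0"
    label values GB2 .

    * Check that only 0 and 1 (or missing) remain
    assert inlist(GB2, 0, 1) | missing(GB2)
    summarize GB2
    ```

    Detaching the label is only one option; attaching a corrected label with label define and label values would also do.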

    So, think back, did you or someone else ever encode that variable?

    encode is often applied when destring would fit the need more closely.



    • #3
      Thank you so much for your response. You are correct. I have previously used the encode command to destring a variable. I subtracted 1 from the variable and now my descriptive statistics look all right.



      • #4
        Good, but the wording "I used encode to destring a variable" is not the way to think about it at all. The commands have different goals. Your encode didn't do much damage, but if you ask Stata to encode "1000" "20" "3" those strings will be mapped to 1 2 3, which is quite wrong, and the results are not even in the right order if 1000, 20 and 3 are the desired numeric values.

        encode and destring are almost never alternatives.
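
        The "1000" "20" "3" example can be tried directly. This is a small sketch on made-up data; the variable names s, s_encoded and s_numeric are hypothetical.

        ```stata
        * Toy data: numbers held as strings
        clear
        input str4 s
        "1000"
        "20"
        "3"
        end

        * encode numbers the strings in alphabetical order: "1000" -> 1, "20" -> 2, "3" -> 3
        encode s, generate(s_encoded)

        * destring converts the text to the numbers it spells out: 1000, 20, 3
        destring s, generate(s_numeric)

        * nolabel shows the stored values rather than the value labels
        list s s_encoded s_numeric, nolabel
        ```

        The encoded values rise from 1 to 3 while the real numbers fall from 1000 to 3, so the encoded variable reverses the order as well as losing the magnitudes.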

