Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Preparing reshape - Dropping and renaming variables

    Dear forum,
    I currently have a problem working with a dataset in Stata as I wish to automize all reshaping steps I previously did in Excel to successfully reshape my dataset to a panel data set.
    As you can see in my data, I have an empty vector for the variable ERROR and H. Is there any way to drop (or replace them with '.') without the need of typing the variable names so that I could use the same logic in a much larger dataset?
    Furthermore before using the reshape command, is there any way to rename the variables in a systematic way, e.g. renaming them by creating a name that combines a special prefix (e.g. 'price') and the ISIN (line 2)?

    Your help is highly appreciated.
    Thanks,
    Tobias

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str10 A str12 SP500COMPOSITEMARKETVALUE str17 SP500COMPOSITEPRICEINDEX str25 ERROR str17 MSCIEUROPEUPRICEINDEX str8 STOXXEUROPE600EMARKETVALU str17 STOXXEUROPE600EPRICEINDEX str25 H str17 MSCIWORLDUPRICEINDEX
    "ISIN"       "US78378X1072" "US78378X1072" " "                         " "        " "       " "       " "                         " "                
    "  1/1/2002" "10463410"     "1148.08"      "$$ER: E100,ACCESS DENIED " "1085.759" "4976921" "265.993" "$$ER: E100,ACCESS DENIED " "1003.516"         
    "  1/2/2002" "10526670"     "1154.67"      ""                          "1086.953" "4968859" "265.558" ""                          "1006.929"         
    "  1/3/2002" "10623320"     "1165.27"      ""                          "1102.748" "5039285" "269.322" ""                          "1017.717"         
    "  1/4/2002" "10689290"     "1172.51"      ""                          "1099.113" "5031703" "268.917" ""                          "1023.486"         
    "  1/7/2002" "10622750"     "1164.89"      ""                          "1086.553" "4972515" "265.754" ""                          "1016.525"         
    "  1/8/2002" "10584610"     "1160.71"      ""                          "1076.244" "4924854" "263.207" ""                          "1008.681"         
    "  1/9/2002" "10533850"     "1155.14"      ""                          "1074.317" "4910837" "262.458" ""                          "1004.584"         
    " 1/10/2002" "10546710"     "1156.55"      ""                          "1067.723" "4890711" "261.382" ""                          "1001.982"         
    " 1/11/2002" "10448400"     "1145.6"       ""                          "1071.192" "4910638" "262.432" ""                          "997.014"          
    " 1/14/2002" "10382810"     "1138.41"      ""                          "1052.431" "4824006" "257.768" ""                          "988.179"          
    " 1/15/2002" "10453800"     "1146.19"      ""                          "1057.603" "4852454" "259.312" ""                          "992.22"           
    " 1/16/2002" "10283940"     "1127.57"      ""                          "1038.357" "4760685" "254.408" ""                          "977.39"           
    " 1/17/2002" "10391970"     "1138.88"      ""                          "1047.801" "4799221" "256.468" ""                          "985.299"          
    " 1/18/2002" "10288800"     "1127.58"      ""                          "1046.53"  "4799810" "256.462" ""                          "980.0839999999999"
    " 1/21/2002" "10288800"     "1127.58"      ""                          "1041.067" "4771558" "254.952" ""                          "978.569"          
    " 1/22/2002" "10213350"     "1119.31"      ""                          "1041.346" "4766634" "254.689" ""                          "971.6130000000001"
    " 1/23/2002" "10294360"     "1128.18"      ""                          "1048.473" "4815957" "257.325" ""                          "977.349"          
    " 1/24/2002" "10330580"     "1132.15"      ""                          "1056.707" "4841416" "258.685" ""                          "982.123"          
    " 1/25/2002" "10342030"     "1133.28"      ""                          "1037.78"  "4749456" "253.772" ""                          "978.1420000000001"
    end

  • #2
    A problem here has been created upstream: metadata have crept into your data that need to be pushed out again.

    Code:
    foreach v of var * {
        label var `v' "`=`v'[1]'"
    }
    
    drop in 1
    Note that the variable ERROR is not empty from Stata's point of view so -- although it does appear useless -- you need some other rule to get rid of it.

    The commands findname and missings (both Stata Journal) include machinery for finding variables for which all values are missing.

    The leading space in your string date variable could cause problems if it is not consistently present in other values. Indeed, for that and other reasons, I would turn it into a Stata numeric date variable at the earliest opportunity.

    That said, sorry, but don't understand your other questions and I don't understand the structure of your data enough to be confident about how best to reshape it. But my comments may remove the need for some other procedures you have in mind.

    Comment


    • #3
      Thank you Nick. Indeed your code helps to find a new approach. Is it possible to include a step within the loop that also gives a new name to the variable, e.g. rename var `v' concat('Price' `v'[1]), punct(-) ?
      By this, I would like to systematically rename the variables always having the same beginning which is needed for the reshape command in Stata as far as I understand.

      Comment


      • #4
        That is a long way from the syntax of rename. You're mixing in functions and options that don't exist or that are allowed only with other commands. Hyphens are not allowed in variable names, although underscores are.

        Code:
        rename `v'   Price`=`v'[1]'
        might work. You'd need to watch out that the new names are all legal and not more than 32 characters long.

        Comment


        • #5
          Thank you Nick. One additional question to the previously mentioned find name command: How is it possible to drop variables that are identified by the find name command?

          Comment

          Working...
          X