
  • #16
    The problem is that the variable ProductSector in #10 is too complicated to transform into a minerals variable like the one you show in #10. In addition to the names of some minerals, its values contain extraneous information such as product numbers and long modifiers. It is going to be difficult or impossible to extract just the name of the primary mineral from these, as I can see no structured pattern for finding it. And some values name multiple minerals: "HS - 8112 - Beryllium, chromium, germanium, vanadium, gallium, hafnium, indium, niobium (columbium), rhenium and thallium, and articles of these metals, including waste and scrap." What on earth are we to do with that?

    The best I can think of is for you to create a new data set containing two variables. The first variable is just the ProductSector variable you already have. Drop all the duplicates. Then, by hand, create a second variable, called minerals, that contains the short description you want, like the ones you show in the second example in #10. Save that data set. Then -merge- it with the original data, and drop the ProductSector variable. Then you can use -collapse- to get the aggregated (summed) data you want.
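
The steps above can be sketched roughly as follows. This is only an illustration on a made-up toy data set: the trade values, the two short names "iron" and "copper", and the temporary file names are assumptions, while ProductSector, minerals, year, iso3_o and tradeflow_wto_d are the variable names used in this thread. The two -replace- lines stand in for the by-hand coding.

```stata
* Toy data with made-up trade values, for illustration only
clear
input int year str3 iso3_o str60 ProductSector double tradeflow_wto_d
2019 "AUS" "HS - 2601 - Iron ores and concentrates" 100
2019 "AUS" "HS - 260111 - Iron ores, non-agglomerated" 50
2019 "AUS" "HS - 2603 - Copper ores and concentrates" 30
2020 "AUS" "HS - 2601 - Iron ores and concentrates" 70
end
tempfile trade xwalk
save `trade'

* Crosswalk: one row per distinct ProductSector value
keep ProductSector
duplicates drop
generate str20 minerals = ""
* Stand-in for the by-hand step: assign a short name to each value
replace minerals = "iron"   if strpos(ProductSector, "Iron ores") > 0
replace minerals = "copper" if strpos(ProductSector, "Copper ores") > 0
save `xwalk'

* Attach the short names to the original data, then aggregate
use `trade', clear
merge m:1 ProductSector using `xwalk', assert(match) nogenerate
drop ProductSector
collapse (sum) tradeflow_wto_d, by(year iso3_o minerals)
list, sepby(year)
```

The assert(match) option makes -merge- stop with an error if any ProductSector value is missing from the crosswalk, which is a cheap check that the hand-coded file is complete.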



    • #17
      Clyde Schechter, thank you so much for your concern. This means a lot to a struggling researcher like me. Yes, I agree that my dataset is complex, especially the ProductSector variable. The data were downloaded from the ITC (International Trade Centre) at the HS 4 and HS 6 levels, which makes the variable harder to read and understand.
      I finally managed, somehow, to get the required results. I used the command
      Code:
      by year iso3_o, sort: egen mineral_name = total(tradeflow_wto_d) if strpos(lower(ProductSector), "mineral_name")
      I ran this command 20 times, once per mineral, to get 20 different variables for 20 minerals.

      Then I combined the data from all 20 variables using the "stack" command into a new variable, "trade_minerals".

      Then I created a string variable, "minerals", and filled it with the names of the minerals using

      Code:
      replace minerals = "mineral_name" if ( mineral_name != .)
      I used this code 20 times to fill the minerals variable with all the minerals.

      I know this is not the proper way to do this, but being unable to come up with cleaner code, I used this workaround to get my desired results.

      Once again, thanks for your response and help. Stay blessed.

      Note: commands provided for future reference.
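
For future readers, the 20 repeated commands could probably be condensed into a loop. The following is a minimal sketch on made-up toy data, not the code actually run above: the list of names in the local macro and the trade values are assumptions. It fills minerals directly and then sums with -collapse-, rather than creating 20 variables and stacking them.

```stata
* Toy data with made-up trade values, for illustration only
clear
input int year str3 iso3_o str60 ProductSector double tradeflow_wto_d
2019 "AUS" "HS - 2601 - Iron ores and concentrates" 100
2019 "AUS" "HS - 260111 - Iron ores, non-agglomerated" 50
2019 "AUS" "HS - 2603 - Copper ores and concentrates" 30
2020 "AUS" "HS - 2601 - Iron ores and concentrates" 70
2020 "AUS" "HS - 1001 - Wheat and meslin" 10
end

* Hypothetical list of mineral names; the real list would have 20 entries
local names iron copper

* Label each row with the first listed mineral found in ProductSector
generate str20 minerals = ""
foreach m of local names {
    replace minerals = "`m'" if minerals == "" & strpos(lower(ProductSector), "`m'") > 0
}

* Rows that match no listed mineral are dropped before aggregating
drop if minerals == ""
collapse (sum) trade_minerals = tradeflow_wto_d, by(year iso3_o minerals)
list, sepby(year)
```

Two caveats: a row naming several minerals is counted only under the first match here, whereas the repeated -egen- approach would count it under every matching mineral; and strpos() matches substrings, so a short name can also match inside a longer, unrelated word.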

