Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Reshaping and Merging Multiple Excels and Saving as one single .dta file

    Hi all,
    I have numerous countries FDI data (1985-2022) with each couyntry in a seperate excel workbook named after respective country. When I import desired sheet from a given excel workbook for one country like Afghanistan the data looks like this

    import excel "Afghanistan.xlsx", sheet("inflow-M") firstrow clear

    dataex Afghanistan B-J

    ----------------------- copy starting from the next line -----------------------
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str75 Afghanistan int(B C D E F G H I J)
    "FDI flows in the host economy, by geographical origin"                          .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    "(Millions of US dollars)"                                                       .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    "Reporting Country"                                                           1985 1986 1987 1988 1989 1990 1991 1992 1993
    "Austria"                                                                        0    0    0    0    0    0    0    0    0
    "Bahrain"                                                                        0    0    0    0    0    0    0    0    0
    "China"                                                                          0    0    0    0    0    0    0    0    0
    "Croatia"                                                                        0    0    0    0    0    0    0    0    0
    "Denmark"                                                                        0    0    0    0    0    0    0    0    0
    "Finland"                                                                        0    0    0    0    0    0    0    0    0
    "France"                                                                         0    0    0    0    0    0    0    0    0
    "Germany"                                                                        0    0    0    0    0    0    0    0    0
    "Greece"                                                                         0    0    0    0    0    0    0    0    0
    "Hungary"                                                                        0    0    0    0    0    0    0    0    0
    "India"                                                                          0    0    0    0    0    0    0    0    0
    "Italy"                                                                          0    0    0    0    0    0    0    0    0
    "Kazakhstan"                                                                     0    0    0    0    0    0    0    0    0
    "Korea, Republic of"                                                             0    0    0    0    0    0    0    0    0
    "Latvia"                                                                         0    0    0    0    0    0    0    0    0
    "Luxembourg"                                                                     0    0    0    0    0    0    0    0    0
    "Netherlands"                                                                    0    0    0    0    0    0    0    0    0
    "Norway"                                                                         0    0    0    0    0    0    0    0    0
    "Poland"                                                                         0    0    0    0    0    0    0    0    0
    "Russian Federation"                                                             0    0    0    0    0    0    0    0    0
    "Spain"                                                                          0    0    0    0    0    0    0    0    0
    "Sweden"                                                                         0    0    0    0    0    0    0    0    0
    "Switzerland"                                                                    0    0    0    0    0    0    0    0    0
    "Thailand"                                                                       0    0    0    0    0    0    0    0    0
    "Türkiye"                                                                       0    0    0    0    0    0    0    0    0
    "United Kingdom"                                                                 0    0    0    0    0    0    0    0    0
    "United States"                                                                  0    0    0    0    0    0    0    0    0
    ""                                                                               .    .    .    .    .    .    .    .    .
    "Source: UNCTAD, FDI/MNE database."                                              .    .    .    .    .    .    .    .    .
    "Note: Data are based on information reported by the economies listed above."    .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    ""                                                                               .    .    .    .    .    .    .    .    .
    end
    ------------------ copy up to and including the previous line ------------------

    Listed 100 out of 495 observations
    Use the count() option to list more

    After cleaning, I wish to rehape the data with each Reporting country (Austria, Bahrain etc), i.e. source of FDI, in wide form; year and recipient country; i.e. Afghanistan in this case in long form. I will be thankful for some suggestion on reshaping and especially if through a loop/syntax where I import the each excel sheet, reshape and finally append the data by host country in a single .dta file?
    Kind regards

  • #2
    just to add that I want to reshape above data in the following form
    year country countrycode Austria Bahrain China Croatia Denmark Finland France Germany Greece Hungary India Italy Kazakhstan Korea, Republic of Latvia Luxembourg Netherlands Norway Poland Russian Federation Spain Sweden Switzerland Thailand Türkiye United Kingdom United States
    1985 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1986 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1987 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1988 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1989 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1990 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1991 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1992 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1993 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1994 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1995 Afghanistan AFG - - - - - - - 0.25 - - - - - - - - - - - - - - - - - - -
    1996 Afghanistan AFG - - - - - - - 3.38 - - - - - - - - - - - - - - - - - - -
    1997 Afghanistan AFG - - - - - - - -2.98 - - - - - - - - - - - - - - - - - - -
    1998 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    1999 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    2000 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - - - - - - -
    2001 Afghanistan AFG - - - - - - - - - - - - - - - - - - - - - -0.20 - - - - -
    2002 Afghanistan AFG - - - - - - - -0.76 0.22 - - - - - - - - - - - - -0.13 - - - - -
    2003 Afghanistan AFG - - 0.30 - - - - 1.27 - - - - - 0.21 - - - - - - - -0.16 - - - - -
    2004 Afghanistan AFG - - - - - - - 0.59 - - - - 0.10 -0.21 - - - - - - - -0.14 - - - - -
    2005 Afghanistan AFG - - - - - - - 0.40 - - 0.06 - - - - - - - - - - -0.25 - - - - -
    2006 Afghanistan AFG - - 0 - - - - 3 - - - - - 0 0 - - - - - - - - 0 - - - - -
    2007 Afghanistan AFG - - 0 - - - - - 5 - - - - - - - - - - - - - - 0 - - - - -
    2008 Afghanistan AFG - - 114 - - - - 1 - - - - 8 - 0 - - - - - - - - 0 - - - - -
    2009 Afghanistan AFG - - 16 - - - - 0 - - - - 9 - 0 - - - - - - - - 0 - - - - - 1
    2010 Afghanistan AFG - - 2 - 0 - - - - 0 - - - - 7 0 - 0 - - - - - - - - 0 - - - - - 1
    2011 Afghanistan AFG - - 296 - - - - - 1 - - - - 6 - 0 - - - 1 - 0 - 1 - - - - - - - 1
    2012 Afghanistan AFG - - 18 - - - 0 - 11 - - - - 5 - - - - 5 - - 0 - 1 - - 0 - - - - - 1
    2013 Afghanistan AFG - - - 1 - - - 0 1 - 0 - - 7 - - - - 1 - 0 - - 0 0 - - - - 0 - 1
    2014 Afghanistan AFG 0 - 28 - - - 0 - 8 - 0 - 1 - 0 - 0 7 0 - - 1 0 - 0 - - 0 -
    2015 Afghanistan AFG 0 - - 3 - - 3 - 0 - - - - 0 - - - 0 3 0 - - 0 - 0 - 1 - -
    2016 Afghanistan AFG 0 - 2 - - 10 0 0 - - - - - 0 - - - 0 - 0 - - 0 0 - 0 1 - - -
    2017 Afghanistan AFG 0 - 5 - - 3 - 0 1 - - - 0 - - - 0 - 0 - - - - 0 - 0 - 0 2
    2018 Afghanistan AFG - - - 0 - - - 0 - 1 - - - - 0 - - - - - - - - 0 - - - - - - -
    2019 Afghanistan AFG - - 24 - - - 0 0 - - - 0 - - - - - 15 - - - - - - - - - 1
    2020 Afghanistan AFG - - 0 3 - - - 0 - 1 - - - 0 - - - - - 0 - - - - - - 1 - 2
    2021 Afghanistan AFG - 0 - 3 - - - 0 - 0 - - - 0 - - - - - 2 - - - - - - - - 2
    2022 Afghanistan AFG - - - - - - - - 0 - - - 0 0 - - - - - - - - - - - - - 3

    Comment


    • #3
      OK! I figured this out in case of a single country as under:

      import excel "Afghanistan.xlsx", sheet("inflow-M") clear

      gen x=A if A==A[1]
      replace x=x[_n-1] if x==""
      rename x country
      keep if B!=.
      drop AN AO
      drop if A=="Reporting Country"
      rename (A) (source)

      local x B C D E F G H I J K L M N O P Q R S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM
      local y y1985 y1986 y1987 y1988 y1989 y1990 y1991 y1992 y1993 y1994 y1995 y1996 y1997 y1998 y1999 y2000 y2001 y2002 y2003 y2004 y2005 y2006 y2007 y2008 y2009 y2010 y2011 y2012 y2013 y2014 y2015 y2016 y2017 y2018 y2019 y2020 y2021 y2022
      foreach var of varlist `x' {
      gettoken new y : y
      rename `var' `new'
      }
      replace source = subinstr(source, ",","", .) //to remove special chracters
      replace source = subinstr(source, " ","", .) //to remove spaces from strings
      *ssc install sxpose2
      sxpose2, clear firstnames varname force
      gen x=UnitedStates if UnitedStates==A[39]
      replace x=x[_N] if x==""
      drop if _varname=="country"
      rename x country
      rename _varname year
      replace year = subinstr(year, "y","", .) //to remove special chracters
      destring, replace

      and the data looks perfect as below:


      . dataex country year Austria Bahrain China Croatia

      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str11 country int year double(Austria Bahrain China Croatia)
      "Afghanistan" 1985      0            0      0            0
      "Afghanistan" 1986      0            0      0            0
      "Afghanistan" 1987      0            0      0            0
      "Afghanistan" 1988      0            0      0            0
      "Afghanistan" 1989      0            0      0            0
      "Afghanistan" 1990      0            0      0            0
      "Afghanistan" 1991      0            0      0            0
      "Afghanistan" 1992      0            0      0            0
      "Afghanistan" 1993      0            0      0            0
      "Afghanistan" 1994      0            0      0            0
      "Afghanistan" 1995      0            0      0            0
      "Afghanistan" 1996      0            0      0            0
      "Afghanistan" 1997      0            0      0            0
      "Afghanistan" 1998      0            0      0            0
      "Afghanistan" 1999      0            0      0            0
      "Afghanistan" 2000      0            0      0            0
      "Afghanistan" 2001      0            0      0            0
      "Afghanistan" 2002      0            0      0            0
      "Afghanistan" 2003      0            0     .3            0
      "Afghanistan" 2004      0            0      0            0
      "Afghanistan" 2005      0            0      0            0
      "Afghanistan" 2006      0            0    .25            0
      "Afghanistan" 2007      0            0     .1            0
      "Afghanistan" 2008      0            0 113.91            0
      "Afghanistan" 2009      0            0  16.39            0
      "Afghanistan" 2010      0            0   1.91 -.0390986027
      "Afghanistan" 2011      0            0 295.54            0
      "Afghanistan" 2012      0            0  17.61            0
      "Afghanistan" 2013      0            0  -1.22            0
      "Afghanistan" 2014 .00001            0  27.92            0
      "Afghanistan" 2015 .00001            0  -3.26            0
      "Afghanistan" 2016 .00001            0   2.21            0
      "Afghanistan" 2017 .00001            0   5.43            0
      "Afghanistan" 2018      0            0   -.16            0
      "Afghanistan" 2019      0            0  24.08            0
      "Afghanistan" 2020      0 -.0003523936   2.54            0
      "Afghanistan" 2021      0  .0005433511  -2.55            0
      "Afghanistan" 2022      0            0      0            0
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 38 out of 38 observations

      BUT when I try to do in the following loop for all the stored excel files in my folder

      local filenames: dir "C:\UNCTAD_FDI\Country profiles" files "*.xlsx*"
      local x B C D E F G H I J K L M N O P Q R S T U V W X Y Z AA AB AC AD AE AF AG AH AI AJ AK AL AM
      local y y1985 y1986 y1987 y1988 y1989 y1990 y1991 y1992 y1993 y1994 y1995 y1996 y1997 y1998 y1999 y2000 y2001 y2002 y2003 y2004 y2005 y2006 y2007 y2008 y2009 y2010 y2011 y2012 y2013 y2014 y2015 y2016 y2017 y2018 y2019 y2020 y2021 y2022
      foreach f of local filenames {
      import excel using `f', sheet("inflow-M") clear
      gen x=A if A==A[1]
      replace x=x[_n-1] if x==""
      rename x country
      keep if B!=.
      drop AN AO
      drop if A=="Reporting Country"
      rename (A) (source)

      foreach var of varlist `x' {
      gettoken new y : y
      rename `var' `new'
      replace source = subinstr(source, ",","", .) //to remove special chracters
      replace source = subinstr(source, " ","", .) //to remove spaces from strings
      sxpose2, clear firstnames varname force
      gen x=UnitedStates if UnitedStates==A[39]
      replace x=x[_N] if x==""
      drop if _varname=="country"
      rename x country
      rename _varname year
      replace year = subinstr(year, "y","", .) //to remove special chracters
      save `f'.dta, replace

      }

      }

      I see an error saying c is an ambigious abbrevation in variable year. And the saved data file is as below:


      . dataex country year Austria Bahrain China Croatia

      ----------------------- copy starting from the next line -----------------------
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input str11 country str7 year double(Austria Bahrain China Croatia)
      "Afghanistan" "1985"      0            0      0            0
      "Afghanistan" "C"         0            0      0            0
      "Afghanistan" "D"         0            0      0            0
      "Afghanistan" "E"         0            0      0            0
      "Afghanistan" "F"         0            0      0            0
      "Afghanistan" "G"         0            0      0            0
      "Afghanistan" "H"         0            0      0            0
      "Afghanistan" "I"         0            0      0            0
      "Afghanistan" "J"         0            0      0            0
      "Afghanistan" "K"         0            0      0            0
      "Afghanistan" "L"         0            0      0            0
      "Afghanistan" "M"         0            0      0            0
      "Afghanistan" "N"         0            0      0            0
      "Afghanistan" "O"         0            0      0            0
      "Afghanistan" "P"         0            0      0            0
      "Afghanistan" "Q"         0            0      0            0
      "Afghanistan" "R"         0            0      0            0
      "Afghanistan" "S"         0            0      0            0
      "Afghanistan" "T"         0            0     .3            0
      "Afghanistan" "U"         0            0      0            0
      "Afghanistan" "V"         0            0      0            0
      "Afghanistan" "W"         0            0    .25            0
      "Afghanistan" "X"         0            0     .1            0
      "Afghanistan" "Y"         0            0 113.91            0
      "Afghanistan" "Z"         0            0  16.39            0
      "Afghanistan" "AA"        0            0   1.91 -.0390986027
      "Afghanistan" "AB"        0            0 295.54            0
      "Afghanistan" "AC"        0            0  17.61            0
      "Afghanistan" "AD"        0            0  -1.22            0
      "Afghanistan" "AE"   .00001            0  27.92            0
      "Afghanistan" "AF"   .00001            0  -3.26            0
      "Afghanistan" "AG"   .00001            0   2.21            0
      "Afghanistan" "AH"   .00001            0   5.43            0
      "Afghanistan" "AI"        0            0   -.16            0
      "Afghanistan" "AJ"        0            0  24.08            0
      "Afghanistan" "AK"        0 -.0003523936   2.54            0
      "Afghanistan" "AL"        0  .0005433511  -2.55            0
      "Afghanistan" "AM"        0            0      0            0
      end
      ------------------ copy up to and including the previous line ------------------

      Listed 38 out of 38 observations


      Any suggestion where I am doing the mistake in my loop please?

      regards

      Comment

      Working...
      X