Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Transposing selected observations from the same dataset

    Hello,
    I want to change the layout of my dataset for only SOME observations (from long to wide). Not sure if transpose is the right word but here is what I have and want to do in detail.
    I have data on country JAM for 14 variables over 20 years.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str20 indicatorcode str24 cntryindID double lastvalue int lastyear double value int year
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2000
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2001
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2002
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2003
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2004
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2005
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2006
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2007
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 33.6800003051758 2008
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 34.9199981689453 2009
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020             35.5 2010
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 35.4599990844727 2011
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 36.1100006103516 2012
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 38.0499992370605 2013
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 37.0299987792969 2014
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020                . 2015
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 32.3400001525879 2016
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020               31 2017
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 28.2900009155273 2018
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 24.6399993896484 2019
    "SL.UEM.NEET.FE.ZS"    "JAM-SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 31.4200000762939 2020
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2000
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2001
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2002
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2003
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2004
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2005
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2006
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2007
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 21.1700000762939 2008
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 25.6900005340576 2009
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 27.2999992370605 2010
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 27.0400009155273 2011
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 27.8400001525879 2012
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 28.6399993896484 2013
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 28.3999996185303 2014
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020                . 2015
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 26.5400009155273 2016
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 24.1800003051758 2017
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 23.0699996948242 2018
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 20.6700000762939 2019
    "SL.UEM.NEET.MA.ZS"    "JAM-SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 27.7700004577637 2020
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 98.1513290405273 2000
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 94.9265365600586 2001
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 94.4924011230469 2002
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 94.3367233276367 2003
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 102.390777587891 2004
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020                . 2005
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020                . 2006
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020                . 2007
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020                . 2008
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 98.8628234863281 2009
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 99.0639114379883 2010
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 95.7142333984375 2011
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020                . 2012
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 90.8172607421875 2013
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020  88.138557434082 2014
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020   90.54443359375 2015
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 89.5796279907227 2016
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 86.3697814941406 2017
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 83.4190826416016 2018
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020 84.7264099121094 2019
    "SE.SEC.CMPT.LO.FE.ZS" "JAM-SE.SEC.CMPT.LO.FE.ZS"  82.93396759033203 2020  82.933967590332 2020
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 97.1339416503906 2000
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 95.2227478027344 2001
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 90.3095779418945 2002
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 91.3424835205078 2003
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020  97.870979309082 2004
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020                . 2005
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020                . 2006
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020                . 2007
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020                . 2008
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 99.1570205688477 2009
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 99.9690704345703 2010
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 94.5564880371094 2011
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020                . 2012
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 90.4056167602539 2013
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 83.2212371826172 2014
    "SE.SEC.CMPT.LO.MA.ZS" "JAM-SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 86.2205581665039 2015
    end
    But then at the bottom of the same dataset I have data for the average value ("lastvalue") for Latin America (LCN...) and for Uppermiddle income countries (UMC...) for just 1 time period. I want to move the "last value" from both ASP and UMC to the right-hand side of the data for JAM, matching it around the same "indicatorcode".

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str20 indicatorcode str24 cntryindID double lastvalue int lastyear
    "SE.TER.GRAD.FE.SI.ZS" "ASP-SE.TER.GRAD.FE.SI.ZS"   31.5758056640625 2017
    "SL.UEM.NEET.FE.ZS"    "ASP-SL.UEM.NEET.FE.ZS"     7.716000080108643 2020
    "SL.UEM.NEET.MA.ZS"    "ASP-SL.UEM.NEET.MA.ZS"      8.11400032043457 2020
    "SE.SEC.CMPT.LO.FE.ZS" "ASP-SE.SEC.CMPT.LO.FE.ZS" 102.13663482666016 2019
    "SE.SEC.CMPT.LO.MA.ZS" "ASP-SE.SEC.CMPT.LO.MA.ZS" 102.65520477294922 2019
    "SP.ADO.TFRT"          "ASP-SP.ADO.TFRT"               7.61083984375 2020
    "SG.VAW.1549.ZS"       "ASP-SG.VAW.1549.ZS"                        .    .
    "SL.EMP.VULN.FE.ZS"    "ASP-SL.EMP.VULN.FE.ZS"     6.388000011444092 2019
    "SL.EMP.VULN.MA.ZS"    "ASP-SL.EMP.VULN.MA.ZS"    10.006000518798828 2019
    "IC.FRM.FEMO.ZS"       "ASP-IC.FRM.FEMO.ZS"       40.650001525878906 2020
    "FX.OWN.TOTL.FE.ZS"    "ASP-FX.OWN.TOTL.FE.ZS"     99.73249816894531 2017
    "FX.OWN.TOTL.MA.ZS"    "ASP-FX.OWN.TOTL.MA.ZS"     99.48999786376953 2017
    "SG.TIM.UWRK.FE"       "ASP-SG.TIM.UWRK.FE"                        .    .
    "SG.TIM.UWRK.MA"       "ASP-SG.TIM.UWRK.MA"                        .    .
    "SE.TER.GRAD.FE.SI.ZS" "LCN-SE.TER.GRAD.FE.SI.ZS"                  .    .
    "SL.UEM.NEET.FE.ZS"    "LCN-SL.UEM.NEET.FE.ZS"     29.49407386779785 2020
    "SL.UEM.NEET.MA.ZS"    "LCN-SL.UEM.NEET.MA.ZS"     18.25299072265625 2020
    "SE.SEC.CMPT.LO.FE.ZS" "LCN-SE.SEC.CMPT.LO.FE.ZS"  82.33609008789063 2020
    "SE.SEC.CMPT.LO.MA.ZS" "LCN-SE.SEC.CMPT.LO.MA.ZS"  77.62348937988281 2020
    "SP.ADO.TFRT"          "LCN-SP.ADO.TFRT"           60.26488494873047 2020
    "SG.VAW.1549.ZS"       "LCN-SG.VAW.1549.ZS"                        .    .
    "SL.EMP.VULN.FE.ZS"    "LCN-SL.EMP.VULN.FE.ZS"     33.77582931518555 2019
    "SL.EMP.VULN.MA.ZS"    "LCN-SL.EMP.VULN.MA.ZS"     33.37175750732422 2019
    "IC.FRM.FEMO.ZS"       "LCN-IC.FRM.FEMO.ZS"       49.907691955566406 2020
    "FX.OWN.TOTL.FE.ZS"    "LCN-FX.OWN.TOTL.FE.ZS"    52.130001068115234 2017
    "FX.OWN.TOTL.MA.ZS"    "LCN-FX.OWN.TOTL.MA.ZS"     58.68000030517578 2017
    "SG.TIM.UWRK.FE"       "LCN-SG.TIM.UWRK.FE"                        .    .
    "SG.TIM.UWRK.MA"       "LCN-SG.TIM.UWRK.MA"                        .    .
    "SE.TER.GRAD.FE.SI.ZS" "UMC-SE.TER.GRAD.FE.SI.ZS"                  .    .
    "SL.UEM.NEET.FE.ZS"    "UMC-SL.UEM.NEET.FE.ZS"                     .    .
    "SL.UEM.NEET.MA.ZS"    "UMC-SL.UEM.NEET.MA.ZS"                     .    .
    "SE.SEC.CMPT.LO.FE.ZS" "UMC-SE.SEC.CMPT.LO.FE.ZS"  89.98753356933594 2020
    "SE.SEC.CMPT.LO.MA.ZS" "UMC-SE.SEC.CMPT.LO.MA.ZS"  88.79283905029297 2020
    "SP.ADO.TFRT"          "UMC-SP.ADO.TFRT"          29.038925170898438 2020
    "SG.VAW.1549.ZS"       "UMC-SG.VAW.1549.ZS"                        .    .
    "SL.EMP.VULN.FE.ZS"    "UMC-SL.EMP.VULN.FE.ZS"     38.46841812133789 2019
    "SL.EMP.VULN.MA.ZS"    "UMC-SL.EMP.VULN.MA.ZS"     35.69723892211914 2019
    "IC.FRM.FEMO.ZS"       "UMC-IC.FRM.FEMO.ZS"        35.06666564941406 2020
    "FX.OWN.TOTL.FE.ZS"    "UMC-FX.OWN.TOTL.FE.ZS"      68.6500015258789 2017
    "FX.OWN.TOTL.MA.ZS"    "UMC-FX.OWN.TOTL.MA.ZS"     76.18000030517578 2017
    "SG.TIM.UWRK.FE"       "UMC-SG.TIM.UWRK.FE"                        .    .
    "SG.TIM.UWRK.MA"       "UMC-SG.TIM.UWRK.MA"                        .    .
    end

    What I eventually picture is to have variables "lastvalueASP" and "lastvalueUMC" for each corresponding JAMindicatorcode. This is because I eventually want to compare JAM's "lastvalue" with "lastvalueASP" and "lastvalueUMC" (all of which should be on the same vertical row). Deeply thankful for anyone who can help me achieve this!

  • #2
    This may be what you want:
    Code:
    gen country = substr(cntryindID, 1, 3)
    frame put _all if inlist(country, "UMC", "LCN", "ASP"), into(rhs)
    frame rhs {
        drop cntryindID
        reshape wide lastvalue lastyear, i(indicatorcode) j(country) string
    }
    drop if inlist(country, "UMC", "LCN", "ASP")
    frlink m:1 indicatorcode, frame(rhs)
    frget last*, from(rhs)
    drop rhs
    frame drop rhs
    Your post is unclear about important details, which I have resolved by making some arbitrary decisions. Hopefully if those are not what you want, you will be able to modify accordingly. For example, initially you talk about bringing LNC and UMC data to the "right" of the data set, but in thye end you talk only about ASP and UMC--talk of LNC has vanished. I chose to move the LNC value as well. Also "vertical row" is a contradiction in terms: rows are by definition horizontal, only columns are vertical. In any case, in Stataspeak, we do not refer to rows and columns in a Stata data set: we refer to observations and variables, respectively.

    The data shown here is very similar to what was created in an earlier thread of yours, in which I also participated. I do remember thinking to myself at the time that building the data set to include the LNC and UMC (I don't recall ASP) as separate observations in the same data set was a mistake that you would come to regret. Although I'm not sure why, as I am usually not particularly restrained about making unrequested suggestions when I see something that looks troublesome, I didn't say anything at the time. I'm not surprised, therefore, to see that you have come here today seeking, in essence, to undo that data design error.

    Comment


    • #3
      You're correct in your assumptions, Clyde Schechter. Apologies the sudden drop of LNC and addition of ASP. I tried running your code--it almost all worked. This is when it didn't work:
      Code:
       frlink m:1 indicatorcode, frame(rhs)
      As I got the following:
      Code:
       
      invalid match variables for 1:1 or m:1 match
          The variable you specified for matching
          does not uniquely identify the
          observations in frame rhs.  Each
          observation in the current frame
          default must link to one observation in
          rhs.
      I believe it is because there is not a single match for every "indicatorcode" given that the data for the country at hand (in this case, JAM) has indicatorcode values for 10 years. Any way to go around it?

      Also, thanks for sharing that variables and observations are the correct way to use Stataspeak here!

      Comment


      • #4
        Please post example data that reproduces this problem. The code runs without error messages in the example data given. In that data, there are also 10 years worth of observations for Jamaica, so that is not the source of the problem. When responding to this, please do not post the Jamaica data and the UMC etc. data as separate data sets. You stated in #1 that they are all part of the same data set. As a result, I patched the two -dataex- outputs together to create a single data set. But it may be that in doing that it did not faithfully replicate the kind of data organization you actually have. So be sure to post a single -dataex- that incorporates all of these exactly the way you have them in your full data set.

        I am actually mystified how the problem can arise at all. I say that because in frame -rhs- we do a -reshape wide, i(indicatorcode) j(country)- command. The -reshape wide- command always produces a data set in which the -i()- variable uniquely identifies observations. So if there is a problem with multiple appearances of the same indicatorcode value in frame rhs, the code should have broken at the -reshape wide- step. That it didn't is mysterious, and I don't see how you could have even gotten to the -frlink- command in that case.

        So, yes, please post back as asked above.

        Comment


        • #5
          Appreciate the patience Clyde Schechter. Here is a sample of my data using
          Code:
           randomtag if lastyear >= 2018, count(20) gen(pick)
          If I only select 1/5 or so you will only get JAM data and not for the other groups (ASP, LCN, and UMC). Let me clarify that, as you see below, JAM has multiple observations per indicatorcode (e.g. "SL.UEM.NEET.FE.ZS" for when "year is 2001 and 2019). However, the other groups only have 1 observation per indicatorcode.

          Code:
          * Example generated by -dataex-. For more info, type help dataex
          clear
          input str20 indicatorcode double lastvalue int(lastyear year) double value str3 countrycode str24 cntryindID
          "SE.TER.GRAD.FE.SI.ZS"                  .    . 2000                 . "JAM" "JAM-SE.TER.GRAD.FE.SI.ZS"
          "SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 2001                 . "JAM" "JAM-SL.UEM.NEET.FE.ZS"  
          "SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 2019  24.6399993896484 "JAM" "JAM-SL.UEM.NEET.FE.ZS"  
          "SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 2002                 . "JAM" "JAM-SL.UEM.NEET.MA.ZS"  
          "SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 2008  21.1700000762939 "JAM" "JAM-SL.UEM.NEET.MA.ZS"  
          "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2000  97.1339416503906 "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
          "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2002  90.3095779418945 "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
          "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2008                 . "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
          "SL.EMP.VULN.FE.ZS"    30.600000381469727 2019 2006 31.40999984741215 "JAM" "JAM-SL.EMP.VULN.FE.ZS"  
          "SL.EMP.VULN.FE.ZS"    30.600000381469727 2019 2010 30.42999958992002 "JAM" "JAM-SL.EMP.VULN.FE.ZS"  
          "IC.FRM.FEMO.ZS"                        .    . 2011                 . "JAM" "JAM-IC.FRM.FEMO.ZS"      
          "FX.OWN.TOTL.MA.ZS"                     .    . 2002                 . "JAM" "JAM-FX.OWN.TOTL.MA.ZS"  
          "FX.OWN.TOTL.MA.ZS"                     .    . 2008                 . "JAM" "JAM-FX.OWN.TOTL.MA.ZS"  
          "SG.TIM.UWRK.FE"                        .    . 2005                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
          "SG.TIM.UWRK.FE"                        .    . 2013                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
          "SG.TIM.UWRK.FE"                        .    . 2020                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
          "SE.SEC.CMPT.LO.MA.ZS"  77.62348937988281 2020    .                 . "LCN" "LCN-SE.SEC.CMPT.LO.MA.ZS"
          "SL.UEM.NEET.FE.ZS"                     .    .    .                 . "UMC" "UMC-SL.UEM.NEET.FE.ZS"  
          "SL.EMP.VULN.FE.ZS"     38.46841812133789 2019    .                 . "UMC" "UMC-SL.EMP.VULN.FE.ZS"  
          "SL.EMP.VULN.MA.ZS"     35.69723892211914 2019    .                 . "UMC" "UMC-SL.EMP.VULN.MA.ZS"  
          end
          Last edited by Daniel McAdams; 21 Nov 2022, 10:06.

          Comment


          • #6
            Thank you. But, as before, I cannot replicate the error in this example. It runs with no error messages and produces results that appear to be correct:

            Code:
            . * Example generated by -dataex-. For more info, type help dataex
            . clear*
            
            . input str20 indicatorcode double lastvalue int(lastyear year) double value str3 countrycode str24 cntryindID
            
                        indicatorcode   lastvalue  lastyear      year       value  country~e                cntryindID
              1. "SE.TER.GRAD.FE.SI.ZS"                  .    . 2000                 . "JAM" "JAM-SE.TER.GRAD.FE.SI.ZS"
              2. "SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 2001                 . "JAM" "JAM-SL.UEM.NEET.FE.ZS"   
              3. "SL.UEM.NEET.FE.ZS"    31.530000686645508 2020 2019  24.6399993896484 "JAM" "JAM-SL.UEM.NEET.FE.ZS"   
              4. "SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 2002                 . "JAM" "JAM-SL.UEM.NEET.MA.ZS"   
              5. "SL.UEM.NEET.MA.ZS"    27.860000610351563 2020 2008  21.1700000762939 "JAM" "JAM-SL.UEM.NEET.MA.ZS"   
              6. "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2000  97.1339416503906 "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
              7. "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2002  90.3095779418945 "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
              8. "SE.SEC.CMPT.LO.MA.ZS"   82.5367431640625 2020 2008                 . "JAM" "JAM-SE.SEC.CMPT.LO.MA.ZS"
              9. "SL.EMP.VULN.FE.ZS"    30.600000381469727 2019 2006 31.40999984741215 "JAM" "JAM-SL.EMP.VULN.FE.ZS"   
             10. "SL.EMP.VULN.FE.ZS"    30.600000381469727 2019 2010 30.42999958992002 "JAM" "JAM-SL.EMP.VULN.FE.ZS"   
             11. "IC.FRM.FEMO.ZS"                        .    . 2011                 . "JAM" "JAM-IC.FRM.FEMO.ZS"      
             12. "FX.OWN.TOTL.MA.ZS"                     .    . 2002                 . "JAM" "JAM-FX.OWN.TOTL.MA.ZS"   
             13. "FX.OWN.TOTL.MA.ZS"                     .    . 2008                 . "JAM" "JAM-FX.OWN.TOTL.MA.ZS"   
             14. "SG.TIM.UWRK.FE"                        .    . 2005                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
             15. "SG.TIM.UWRK.FE"                        .    . 2013                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
             16. "SG.TIM.UWRK.FE"                        .    . 2020                 . "JAM" "JAM-SG.TIM.UWRK.FE"      
             17. "SE.SEC.CMPT.LO.MA.ZS"  77.62348937988281 2020    .                 . "LCN" "LCN-SE.SEC.CMPT.LO.MA.ZS"
             18. "SL.UEM.NEET.FE.ZS"                     .    .    .                 . "UMC" "UMC-SL.UEM.NEET.FE.ZS"   
             19. "SL.EMP.VULN.FE.ZS"     38.46841812133789 2019    .                 . "UMC" "UMC-SL.EMP.VULN.FE.ZS"   
             20. "SL.EMP.VULN.MA.ZS"     35.69723892211914 2019    .                 . "UMC" "UMC-SL.EMP.VULN.MA.ZS"   
             21. end
            
            .
            . frame put _all if inlist(country, "UMC", "LCN", "ASP"), into(rhs)
            
            . frame rhs {
            .     drop cntryindID
            .     reshape wide lastvalue lastyear, i(indicatorcode) j(country) string
            (j = LCN UMC)
            
            Data                               Long   ->   Wide
            -----------------------------------------------------------------------------
            Number of observations                4   ->   4           
            Number of variables                   6   ->   7           
            j variable (2 values)       countrycode   ->   (dropped)
            xij variables:
                                          lastvalue   ->   lastvalueLCN lastvalueUMC
                                           lastyear   ->   lastyearLCN lastyearUMC
            -----------------------------------------------------------------------------
            . }
            
            . drop if inlist(country, "UMC", "LCN", "ASP")
            (4 observations deleted)
            
            . frlink m:1 indicatorcode, frame(rhs)
              (9 observations in frame default unmatched)
            
            . frget last*, from(rhs)
            (13 missing values generated)
            (13 missing values generated)
            (14 missing values generated)
            (14 missing values generated)
              (4 variables copied from linked frame)
            
            . drop rhs
            
            . frame drop rhs
            
            .
            . list, noobs clean
            
                       indicatorcode   lastvalue   lastyear   year       value   countr~e                 cntryindID   lastval~N   lastye~N   lastval~C   lastye~C  
                SE.TER.GRAD.FE.SI.ZS           .          .   2000           .        JAM   JAM-SE.TER.GRAD.FE.SI.ZS           .          .           .          .  
                   SL.UEM.NEET.FE.ZS   31.530001       2020   2001           .        JAM      JAM-SL.UEM.NEET.FE.ZS           .          .           .          .  
                   SL.UEM.NEET.FE.ZS   31.530001       2020   2019   24.639999        JAM      JAM-SL.UEM.NEET.FE.ZS           .          .           .          .  
                   SL.UEM.NEET.MA.ZS   27.860001       2020   2002           .        JAM      JAM-SL.UEM.NEET.MA.ZS           .          .           .          .  
                   SL.UEM.NEET.MA.ZS   27.860001       2020   2008       21.17        JAM      JAM-SL.UEM.NEET.MA.ZS           .          .           .          .  
                SE.SEC.CMPT.LO.MA.ZS   82.536743       2020   2000   97.133942        JAM   JAM-SE.SEC.CMPT.LO.MA.ZS   77.623489       2020           .          .  
                SE.SEC.CMPT.LO.MA.ZS   82.536743       2020   2002   90.309578        JAM   JAM-SE.SEC.CMPT.LO.MA.ZS   77.623489       2020           .          .  
                SE.SEC.CMPT.LO.MA.ZS   82.536743       2020   2008           .        JAM   JAM-SE.SEC.CMPT.LO.MA.ZS   77.623489       2020           .          .  
                   SL.EMP.VULN.FE.ZS        30.6       2019   2006       31.41        JAM      JAM-SL.EMP.VULN.FE.ZS           .          .   38.468418       2019  
                   SL.EMP.VULN.FE.ZS        30.6       2019   2010       30.43        JAM      JAM-SL.EMP.VULN.FE.ZS           .          .   38.468418       2019  
                      IC.FRM.FEMO.ZS           .          .   2011           .        JAM         JAM-IC.FRM.FEMO.ZS           .          .           .          .  
                   FX.OWN.TOTL.MA.ZS           .          .   2002           .        JAM      JAM-FX.OWN.TOTL.MA.ZS           .          .           .          .  
                   FX.OWN.TOTL.MA.ZS           .          .   2008           .        JAM      JAM-FX.OWN.TOTL.MA.ZS           .          .           .          .  
                      SG.TIM.UWRK.FE           .          .   2005           .        JAM         JAM-SG.TIM.UWRK.FE           .          .           .          .  
                      SG.TIM.UWRK.FE           .          .   2013           .        JAM         JAM-SG.TIM.UWRK.FE           .          .           .          .  
                      SG.TIM.UWRK.FE           .          .   2020           .        JAM         JAM-SG.TIM.UWRK.FE           .          .           .          .  
            
            .

            Comment


            • #7
              Clyde Schechter found the problem!! It looks like I had to drop "year" "value "countrycode" and "cntryindID" from the loop frame too. Huge thanks once again!

              Here is the final code

              Code:
              gen country = substr(cntryindID, 1, 3)
              frame put _all if inlist(country, "UMC", "LCN", "ASP"), into (rhs)
              frame rhs {
                  drop year value countrycode cntryindID 
                  reshape wide lastvalue lastyear, i(indicatorcode) j(country)string
              }
              drop if inlist(country, "UMC", "LCN", "ASP")
              frlink m:1 indicatorcode, frame (rhs)
              frget last*, from (rhs)
              drop rhs
              frame drop rhs

              Comment


              • #8
                A quick follow up Clyde Schechter . To expedite this process, I thought of integrating this code into my previous loop to make changes in all country files (referring to the previous posts of mine that you have commented on).

                If I start my code with:
                Code:
                foreach base in ARG BRA JAM {
                    use "${cntrs}/`base'_LAC.dta", clear
                and considering that the files for ARG BRA and JAM all have observations LCN and ASP BUT different income groups (HIC, UMC, LMC, and LIC), I thought of creating a local and running the rest of the code. However, I believe I am missing something since only STATA produced the variable "last*" for only ASP and LCN.

                Code:
                #delimit ;
                    local income             "     HIC
                                                UMC
                                                LMC
                                                LIC
                                                "
                                                ;
                    #delimit cr
                keep indicatorcode lastvalue lastyear year value countrycode cntryindID
                gen country = substr(cntryindID, 1, 3)
                frame put _all if inlist(country, "`income' ", "LCN", "ASP"), into (rhs)
                    frame rhs {
                        drop year value countrycode cntryindID
                        reshape wide lastvalue lastyear, i(indicatorcode) j(country)string
                    }
                drop if inlist(country, "`income'", "LCN", "ASP")
                frlink m:1 indicatorcode, frame (rhs)
                frget last*, from (rhs)
                drop rhs
                frame drop rhs
                    save "${cntrs}/`base'_LAC.dta", replace
                }
                Is it because I need to add a -forval- ? I attempted something like this but the syntax is terribly wrong I am sure
                Code:
                             local income             "     HIC
                                                  UMC
                                                LMC
                                                LIC
                                                "
                                                ;
                    #delimit cr
                local ncm : word  `income'
                Thank you so much. I am learning lots!
                Last edited by Daniel McAdams; 21 Nov 2022, 11:19.

                Comment


                • #9
                  I do not remember the code from your original assembly of these data sets in sufficient detail to tell you how to integrate it with the current code.

                  For a start, however, I can explain why you are getting the results you are getting. The problem is with your use of `income'.

                  You have defined local macro income to be the list "HIC UMC LMC LIC". So when you get to -if inlist(country, "`income' ", "LCN", "ASP")-, this is interpreted as -if inlist(country, "HIC UMC LMC LIC", "LCN", "ASP")-. But there aren't any observations where country is "HIC UMC LMC LIC". So only the LCN and ASP observations get picked up. What you want is not the entire list `income' but just the appropriate single element of it. You seem to have a sense of that with your second code block, but the syntax -local ncm : word `income'- is wrong.

                  While I don't remember the details of the original code that built these data sets, if you look through it you should find some commands that serve the purpose of picking out one of HIC, UMC, LMC and LIC--I believe it was based on matching something found in some variable for the current country's observations (or some of that country's observations). If you can replicate that code and refer to the local macro that was created to hold the appropriate one of HIC, UMC, LMC, and LIC instead of to `income' in that -inlist()- command, I believe you will get what you want.

                  Comment

                  Working...
                  X