Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Calculations within columns

    Hello,

    I'm relatively new to Stata, so this might be a dumb question. However, via google search I couldn't find anything helpful. What I want to do is to do calculations within columns. For a start I have the following table:

    Click image for larger version

Name:	shotstata.JPG
Views:	1
Size:	321.9 KB
ID:	1380159

    I'd like stata to multiply the values of those rows where indicator!="FOREX" with the values of the row where indicator="FOREX" for each y02-y15.

    Lateron I'll have more complicated within column calcuations coming up. Is there any smooth way to do this?

  • #2
    Columns is spreadsheet jargon for what Stata calls variables.

    Screenshots don't work well here and are not easy for people to use in replies. A data example posted with dataex would make a detailed reply easier and indeed more likely. Please see http://www.statalist.org/forums/help#stata which makes both points.

    Comment


    • #3
      Sebastian:
      as Nick said, screenshots can only delay helpful replies.
      That said, for calculations implying variables (as well as observations), you may wantb to take a look at -help egen-.
      Kind regards,
      Carlo
      (Stata 18.0 SE)

      Comment


      • #4
        Sebastian,

        I'd say your dataset is not probably organized for what you're trying to do (and for the use of Stata in general). In a panel like your, the indicators should generally be the titles of the "columns" and the years should be in the the "rows". There may be applications where your data structure is appropriate, but for your calculation it is not.

        I agree with Nick that using -dataex- to post (an excerpt from) your dataset here would allow helping you more efficiently.

        Comment


        • #5
          Okay, here is the thing again with dataex:

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input str5(report_ctry partner_ctry) str10 indicator double(y02 y03 y04)
          "World" "LU"    "TRADE_EX" 12969818209.4861 16388996370.2483 20836461349.9784
          "World" "LU"    "TRADE_IM" 9538604884.22017 14055627816.1563 18069828360.0666
          "World" "MT"    "TRADE_EX" 4614596764.64641 5009825058.13037 5713385118.56079
          "World" "MT"    "TRADE_IM" 2702252936.15928 3128412616.50197 3280143274.20478
          "World" "MU"    "TRADE_EX" 1983747281.66373 2511053529.37919 2391112255.71977
          "World" "MU"    "TRADE_IM" 1796622476.13251 2137414897.27811 1999299741.67798
          "World" "MX"    "TRADE_EX" 135047262444.391  136972793943.22 160863708860.937
          "World" "MX"    "TRADE_IM"  162069101298.93 168579464888.888 192671353448.253
          "World" "MS"    "TRADE_EX"     6087631.9124    11021493.0613     11636041.742
          "World" "MS"    "TRADE_IM"      563413.3245     3051935.2096     2752351.3697
          "World" "NL"    "TRADE_EX" 232626476956.291 276369808823.584  334520590460.48
          "World" "NL"    "TRADE_IM" 222145676603.598 263779781650.426 327220150528.765
          "World" "AN"    "TRADE_EX" 5892316033.16748 4675060797.64445 8880930027.08015
          "World" "AN"    "TRADE_IM" 1465877370.74905  1720680931.7799 1987228677.72018
          "World" "NZ"    "TRADE_EX" 13516204085.3143 16862223922.4027 20892654431.6302
          "World" "NZ"    "TRADE_IM" 15416254969.2359 17535020929.0479  20730191983.801
          "World" "NO"    "TRADE_EX" 32876033957.0066 39060590072.3085 46731624316.7547
          "World" "NO"    "TRADE_IM" 60203313098.8265  73251055049.313 87560846898.9662
          "World" "PA"    "TRADE_EX" 14173931334.0259 16105202440.1659  19256513777.337
          "World" "PA"    "TRADE_IM" 3259917289.00067 3543987213.27096 4127947587.46379
          "World" "PL"    "TRADE_EX" 50004152930.7436 62106322100.4722 79609268407.7751
          "World" "PL"    "TRADE_IM"  37867869821.675 50444872582.1192 65671103195.2793
          "World" "PT"    "TRADE_EX" 44338356958.5479 51703222682.9774 61353240417.2208
          "World" "PT"    "TRADE_IM" 27445012214.6618 34190966284.5788 39153218405.2984
          "World" "RO"    "TRADE_EX" 16392742636.4555 22140023251.0439 30586025575.3034
          "World" "RO"    "TRADE_IM" 14702949400.8464 18802120127.9128 24841932283.1235
          "World" "SG"    "TRADE_EX" 112281998527.146 121335549182.608 148579714622.988
          "World" "SG"    "TRADE_IM" 93206114638.2895 106886628169.315 129976867149.612
          "World" "SX"    "TRADE_EX"                0                0                0
          "World" "SX"    "TRADE_IM"                0                0                0
          "World" "SK"    "TRADE_EX" 15577113609.1831 21038408092.9947  26514730502.419
          "World" "SK"    "TRADE_IM" 14856325078.5359 22046380400.0622 27109371522.4665
          "World" "SI"    "TRADE_EX" 10815308634.4726 13363829453.4375 16869119796.1401
          "World" "SI"    "TRADE_IM" 10475775466.4725 12978992358.7531 15457638315.5701
          "World" "ES"    "TRADE_EX" 165539153382.637 211201906140.994 255813385042.523
          "World" "ES"    "TRADE_IM" 119876379991.362 148846313200.866 175652946697.795
          "World" "KN"    "TRADE_EX"   163021876.7102 214621851.389741 340068101.941628
          "World" "KN"    "TRADE_IM" 73455524.1262041 68327942.0577526 68236736.6823281
          "World" "SE"    "TRADE_EX" 68681962205.5293 82996371688.5168  99327091815.438
          "World" "SE"    "TRADE_IM" 83443163585.2272 102156498791.927  124280522134.66
          "World" "CH"    "TRADE_EX" 96083473694.7845 113160031149.342 131759895176.826
          "World" "CH"    "TRADE_IM" 102279710251.809 116256122468.568 138408656575.117
          "World" "TR"    "TRADE_EX" 45148873267.5326 61144461112.7778 88287171581.1571
          "World" "TR"    "TRADE_IM" 35821269081.1907 47341418882.6606 65363695434.3343
          "World" "GB"    "TRADE_EX" 348203595469.934 396372010649.907 466422497629.913
          "World" "GB"    "TRADE_IM" 269700471964.659 295262193835.884 339277631723.859
          "World" "US"    "TRADE_EX" 1127210225907.96 1212581412407.78 1414830589935.24
          "World" "US"    "TRADE_IM" 734558432759.211 763711827858.292 861678823309.837
          "World" "World" "TRADE_EX" 6449336848292.05  7524595024544.2 9135857054713.71
          "World" "World" "TRADE_IM"  6631685013247.4  7726807843172.1 9435905539281.74
          "EA"    "US"    "FOREX"    1.06255166666667 .886034166666667          .805365
          end
          Any line where indicator!="FOREX" should be divided by the value of the line for which indicator="FOREX" for each variable.

          If already though about flipping the table. But honestly I don't see how this would in the end make it easier for what I intend to do. Because in the end I always need Stata to find certain row values, because lateron I'll have more complicated calculations which also include the *_ctry variables as conditions.

          Comment


          • #6
            That helps, but in the example I see only one example of FOREX and the countries don't correspond to the rest of the data.

            Perhaps you want something like

            Code:
            foreach y in y02 y03 y04 { 
                egen `y'_ref = mean(`y'  / (indicator == "FOREX")), by(*ctry) 
                gen `y'_scaled = `y'/`y'_ref 
            }
            See also http://www.stata-journal.com/sjpdf.h...iclenum=dm0055 sections 9 and 10.

            Comment


            • #7
              Your solution works, but only if you remove the by option. What I used now specifically is:
              Code:
              foreach y of varlist y02-y15 {
                  egen fx`y' = mean(`y'  / (indicator == "FOREX"))
                  replace `y' = `y'*fx`y'
                  drop fx`y'
              }
              drop if indicator=="FOREX"
              Thank you for the help. But one other thing: what do you mean by "the countries don't correspond to the rest of the data"?

              Comment


              • #8
                I mean that in your data example there is just one observation with FOREX mentioned, namely

                Code:
                 
                 "EA"    "US"    "FOREX"    1.06255166666667 .886034166666667          .805365
                and the countries in that observation

                Code:
                 
                 "EA"    "US"
                don't appear in the rest of the example. I was guessing wildly that in the full dataset there was a line for FOREX mentioned for each distinct pair of countries.

                I don't know anything about this kind of data or its analysis that you have not explained.

                Economists joining the forum often seem to assume that everyone else is an economist, even their kind of economist, but that is not so!

                Note that Sebastian Geiger in #4 has an important point, in so far as observations with FOREX mentioned should probably be excluded from most analyses if they are quite different.

                Comment


                • #9
                  Originally posted by Nick Cox View Post
                  Economists joining the forum often seem to assume that everyone else is an economist, even their kind of economist, but that is not so!
                  You're right, I totally oversaw that. My apologies for that.

                  What I have here is bilateral trade data. The original data taken from the IMF is in US Dollars and what I did here was to convert it into Euros. This is why there is only one row with the FOREX indicator, that's the USD/EUR exchange rate for each year. That is why I deleted it after the conversion was done, because the exchange rate is no longer needed from here on.

                  And yes, after thinking it through again, putting the indicators as variables and having the years in the "rows" is a good idea for me. Any suggestion about how I best achieve this with how the data is organized now?

                  Comment


                  • #10
                    Nick:
                    is right.
                    Sometimes pretentious, always bad at forecasting, we (economists) bear on our shoulders the sad burden of the dismal science (https://en.wikipedia.org/wiki/The_dismal_science)!
                    Kind regards,
                    Carlo
                    (Stata 18.0 SE)

                    Comment


                    • #11
                      Something like

                      Code:
                      reshape long y, i(report_ctry partner_ctry indicator) j(Y) string 
                      gen year = 2000 + real(Y)
                      You may want then to reshape wide your trade into separate import and export variables.

                      Comment


                      • #12
                        What do they say of my own field? Mostly they don't think about it at all. But: Geography ... the superficial Earth science. That's my favourite.

                        Comment


                        • #13
                          Okay, I've now managed to do the reshape long thing. My specific .do file here looks as follows:
                          Code:
                          reshape long y, i(report_ctry partner_ctry indicator) j(Y) string
                          gen year = 2000 + real(Y), after (partner_ctry)
                          label variable year "YEAR"
                          drop Y
                          rename y value
                          label variable value "VALUE"
                          The data then looks like it should at this place:
                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input str5(report_ctry partner_ctry) float year str10 indicator double value
                          "AG" "AI" 2002 "TRADE_EX"                     0
                          "AG" "AI" 2003 "TRADE_EX"                     0
                          "AG" "AI" 2004 "TRADE_EX"    2.9640066500330566
                          "AG" "AI" 2005 "TRADE_EX"     6.815992947966456
                          "AG" "AI" 2006 "TRADE_EX"                     0
                          "AG" "AI" 2007 "TRADE_EX"    2.0419315919719936
                          "AG" "AI" 2008 "TRADE_EX"                     0
                          "AG" "AI" 2009 "TRADE_EX"    3.4424660014643673
                          "AG" "AI" 2010 "TRADE_EX"    .08853505923843384
                          "AG" "AI" 2011 "TRADE_EX"    .04214558395528793
                          "AG" "AI" 2012 "TRADE_EX"   .013992162513732911
                          "AG" "AI" 2013 "TRADE_EX"     .2988234304571152
                          "AG" "AI" 2014 "TRADE_EX"   .047987019657969476
                          "AG" "AI" 2015 "TRADE_EX"    .06530896122837067
                          "AG" "AI" 2002 "TRADE_IM"                     0
                          "AG" "AI" 2003 "TRADE_IM"                     0
                          "AG" "AI" 2004 "TRADE_IM"    .01226168252080679
                          "AG" "AI" 2005 "TRADE_IM"    .01507805419832468
                          "AG" "AI" 2006 "TRADE_IM"   .017252519129037858
                          "AG" "AI" 2007 "TRADE_IM"    .09252647055351734
                          "AG" "AI" 2008 "TRADE_IM"                     0
                          "AG" "AI" 2009 "TRADE_IM"   .014960505124568941
                          "AG" "AI" 2010 "TRADE_IM"    .34510612428092957
                          "AG" "AI" 2011 "TRADE_IM"    .10273687460446358
                          "AG" "AI" 2012 "TRADE_IM"    .01657220582962036
                          "AG" "AI" 2013 "TRADE_IM" .00016720133471488953
                          "AG" "AI" 2014 "TRADE_IM"   .007270486470341682
                          "AG" "AI" 2015 "TRADE_IM"   .003901478286325931
                          end

                          I then apply:
                          Code:
                          gen id=_n
                          reshape wide value, i(id) j(indicator) string
                          The id is necessary, because the other variables no longer clearly identify one single line/row. This does work, the outcome, though, does not look very nice:
                          Code:
                          * Example generated by -dataex-. To install: ssc install dataex
                          clear
                          input float id double(valueTRADE_EX valueTRADE_IM) str5(report_ctry partner_ctry) float year
                           1                   0                     . "AG" "AI" 2002
                           2                   0                     . "AG" "AI" 2003
                           3  2.9640066500330566                     . "AG" "AI" 2004
                           4   6.815992947966456                     . "AG" "AI" 2005
                           5                   0                     . "AG" "AI" 2006
                           6  2.0419315919719936                     . "AG" "AI" 2007
                           7                   0                     . "AG" "AI" 2008
                           8  3.4424660014643673                     . "AG" "AI" 2009
                           9  .08853505923843384                     . "AG" "AI" 2010
                          10  .04214558395528793                     . "AG" "AI" 2011
                          11 .013992162513732911                     . "AG" "AI" 2012
                          12   .2988234304571152                     . "AG" "AI" 2013
                          13 .047987019657969476                     . "AG" "AI" 2014
                          14  .06530896122837067                     . "AG" "AI" 2015
                          15                   .                     0 "AG" "AI" 2002
                          16                   .                     0 "AG" "AI" 2003
                          17                   .    .01226168252080679 "AG" "AI" 2004
                          18                   .    .01507805419832468 "AG" "AI" 2005
                          19                   .   .017252519129037858 "AG" "AI" 2006
                          20                   .    .09252647055351734 "AG" "AI" 2007
                          21                   .                     0 "AG" "AI" 2008
                          22                   .   .014960505124568941 "AG" "AI" 2009
                          23                   .    .34510612428092957 "AG" "AI" 2010
                          24                   .    .10273687460446358 "AG" "AI" 2011
                          25                   .    .01657220582962036 "AG" "AI" 2012
                          26                   . .00016720133471488953 "AG" "AI" 2013
                          27                   .   .007270486470341682 "AG" "AI" 2014
                          28                   .   .003901478286325931 "AG" "AI" 2015
                          end
                          Obviously, I want the two value variables to be next to each other for each respective year, without all those missings, like:
                          report_ctry partner_ctry trade_ex trade_im year
                          AD AI 134872389 239543498 2002
                          AD AI 34353958 239258954 2003
                          AD AI 324935 3498573485 2004
                          [fictional export/import values added in this table]

                          First I though it might work if I add a id that is unique for each report_ctry-partner_ctry-year combination (i.e. adding the id variable before the reshape long). But this does not work. In this example here, however, it seems to: http://stats.idre.ucla.edu/stata/mod...-long-to-wide/
                          Last edited by Sebastian Eiblmeier; 27 Mar 2017, 08:54.

                          Comment


                          • #14
                            Stata's just doing what you are asked. As each individual observation is either imports or exports, there is no question of both variables being populated at once in the result.

                            This works with your data example.

                            Code:
                            reshape wide value, i(*ctry year) j(indicator) string

                            Comment


                            • #15
                              Okay, it works now. Thanks again.

                              But just that I get it right: in the i() option I have to put variables that clearly identify a line in the new dataset, after the reshaping, yes? Because until now I thought those variables have to clearly identify a line before the reshaping.

                              Comment

                              Working...
                              X