Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • compute averages across firms by country

    Dear all,
    I have a little problem with creating averages across countries
    suppose I have data in the following format

    Code:
    date          name        country     ret
    01.10.20001   company1       UK       0.01
    ....
    30.10.20001   company1       UK       0.02
    
    01.10.20001   company2       UK       0.04
    ....
    30.10.20001   company2       UK       0.01
    
    01.10.20001   company3       US       0.03
    ....
    30.10.20001   company3       US       0.02
    
    
          
    is there a way to have Stata compute the following:
    
    
    01.10.20001   UnitedKingdom  UK       (0.01+0.04)/2
    ....
    30.10.20001   UnitedKingdom  UK       (0.02+0.01)2
    
    01.10.20001   USA            US       0.03
    ....
    30.10.20001   USA            US       0.02

    I want to create avgerages across countries. I have 25 countries, so it would be feasible to create names for fountries manually. but I have 160+ company, so summing across them manually would take a lot of time. Probably this is not the ideal way of approaching this problem I guess, but I am not sure how else to do it.

    Any advice is appreciated!

    Best,
    Harvey

  • #2
    though not completely clear (please read the FAQ), it looks like the following will do what you want:
    Code:
    egen meanret = mean(ret), by(date country)

    Comment


    • #3
      See also the help for collapse (even if it's not what you want now).

      Comment


      • #4
        hey,

        thanks guys
        @Rich: I guess this only creates a new variable "meanret" for each company. Bascially I would like to introduce new observations, but probably this is not how Stata is supposed to be used.

        @Nick: not sure about the collapse. The way I understand it is, I could compute the average for each firm.

        I should have posted a more livly example, consider of the following form

        Code:
        clear
        webuse grunfeld, clear
         g ret=invest/1500
         bys company (year): g cumul=ret if _n==1
         replace cumul=ret + l.cumul if missing(cumul)
         gen country = "empty"
         keep if company <6
         replace country = "DE" if company ==1
         replace country = "FR" if company ==2
         replace country = "DE" if company ==3
         replace country = "FR" if company ==4
         replace country = "US" if company ==5
            
         
         li year ret cumul if company==1, noo clean
        Ignore the generation of countries names. Country names are included in my data, I just wanted to use the grundfeld example and had to modify it a little such that it matches my data in important characertics, that is having multiple companies in multiple countrys and observing cumulative return "cumul" along time.

        Basically I would like to add "observations" (not variables) which are computed averaging across firms country by country for each period. Is there a way to do this?

        Comment


        • #5
          Actually not close, Stata keeps buzzing me.

          Code:
          clear
          webuse grunfeld, clear
          g ret=invest/1500
          bys company (year): g cumul=ret if _n==1
          replace cumul=ret + l.cumul if missing(cumul)
          gen country = "empty"
          keep if company <8
          replace country = "FR" if company ==1
          replace country = "FR" if company ==2
          replace country = "DE" if company ==3
          replace country = "FR" if company ==4
          replace country = "US" if company ==5
          replace country = "DE" if company ==6     
          replace country = "FR" if company ==7
              
          egen meancumul = mean(cumul), by(year country)
          levelsof country, local(localcountry)         
          display `localcountry'
           foreach i of local localcountry {
              local j `"Cumulative returns `i'"'
              twoway line meancumul year if country == `"`i'"', title(`"`j'"')
              graph export `"`i'.png"', replace
          }
          
          
          }

          It now does what it should, it just for some reason add this weird line that connects begining and end points. Would anyone know why it does that? :D



          Last edited by Jannic Cutura; 22 Feb 2016, 06:40.

          Comment


          • #6
            Yes; you could indeed add observations with totals but so far as using Stata is concerned your realisation that "this is not how Stata is supposed to be used" is if possible a drastic understatement.

            It's a very bad idea. Ever thereafter you have to remember to exclude all such observations from further calculations to avoid double counting if not complete nonsense. If the observations containing totals change position within the dataset through some sorting, that can become even more awkward.

            Hence I decline to explain how to do it, in your own best interests!

            If you want a spreadsheet, you know where to find one!

            Otherwise egen, tag() offers a solution to any problem in which you want to see group totals once and once only for each group.

            Comment


            • #7
              Harvey:
              why can't you simply elaborate a bit on the following toy-example?
              Code:
              webuse grunfeld, clear
              
              collapse (mean) invest, by( company )
              PS: I've just realized I chimed in later!
              Kind regards,
              Carlo
              (Stata 19.0)

              Comment


              • #8
                Yes, I know.

                Comment


                • #9
                  @Nick: Just read your statement above. I agree. Can you recommend a good way to (not only temporirily) be able to store the results?
                  I have data on companies, and would like to compute different kind of average for example US vs EMU, EMUvsNonEMU. Creating "new observations" is probably not a good way to go about this. For simple plotting my solution above works, except for the weird line connect the begining and the end (why does it do that?!).
                  But if I later want to compare the average cumulative returns in €-Zone vs non€-zone, I will be in trouble. Which way would you recommend I move from here?

                  Thank you so much for your help! I really appreciate it.

                  Comment


                  • #10
                    Or is there a way to export (to excel) the following:

                    Code:
                    date            FR                          DE
                    1935    meancumul(France,1935)      meancumul(Germany,1935)
                    1936    meancumul(France,1936)      meancumul(Germany,1939)
                    ...
                    based on the example from above:

                    Code:
                    clear
                    webuse grunfeld, clear
                    g ret=invest/1500
                    bys company (year): g cumul=ret if _n==1
                    replace cumul=ret + l.cumul if missing(cumul)
                    gen country = "empty"
                    keep if company <8
                    replace country = "FR" if company ==1
                    replace country = "FR" if company ==2
                    replace country = "DE" if company ==3
                    replace country = "FR" if company ==4
                    replace country = "US" if company ==5
                    replace country = "DE" if company ==6     
                    replace country = "FR" if company ==7
                        
                    egen meancumul = mean(cumul), by(year country)

                    Comment


                    • #11
                      I've already pointed you to collapse. You can collapse to group means and SDs.

                      Comment


                      • #12
                        Nick Cox:
                        I think I get what you mean, e.g.
                        Code:
                        collapse (mean) invest meancumul, by(country year)
                        collapses the data in the way I need it for countries. But now I lost my other identifiers (e.g. EMU, non-EMU, etc...), which are not in the mini example. I am pondering of using a second .do file to work with contry level data from here...

                        Thanks again Nick Cox, I really appreciate your help One last thing:
                        would you know why Stata create these lines that connect the first and the last observation in my example above (and how to turn that off)?

                        Comment

                        Working...
                        X