Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • average at sector level within state (excluding current sector)

    Hi all,

    I have a company level data as below and I want to create a new variable that is the average roa for all sectors in the same state excluding the current one, e.g. roa_avg for sector 12 in state 1 equals 0.34513567 which is the average of roa for sectors 25, 36, 37 and 44.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte stateid int(firmid year) byte sector float(roa avg)
    1 10010 1987 12  .076647 .034513567
    1 10010 1994 12 .0894412 .034513567
    1 10010 1990 12 .0190569 .034513567
    1 10010 1988 12 .0505933 .034513567
    1 10010 1991 12 .0207386 .034513567
    1 10010 1992 12 .0387597 .034513567
    1 10010 1989 12 .0626254 .034513567
    1 10010 1986 12 .1205143 .034513567
    1 10010 1993 12 .0966105 .034513567
    1 10006 1980 25 .0158416  .04894846
    1 10006 1983 25 .0162004  .04894846
    1 10006 1977 25 .0128079  .04894846
    1 10006 1982 25 .0284683  .04894846
    1 10006 1976 25 .0124178  .04894846
    1 10006 1975 25 .0145676  .04894846
    1 10006 1978 25 .0174595  .04894846
    1 10006 1981 25 .0220012  .04894846
    1 10006 1979 25 .0166106  .04894846
    1 10007 1989 36 .0988984  .03991843
    1 10012 2002 37 .0738338   .0395362
    1 10012 2000 37 .0391909   .0395362
    1 10012 1995 37 .0050087   .0395362
    1 10012 2003 37 .0986469   .0395362
    1 10012 1988 37 .0664791   .0395362
    1 10012 1997 37   .02295   .0395362
    1 10012 1991 37 .0143952   .0395362
    1 10012 1999 37  .008036   .0395362
    1 10012 1998 37 .0302348   .0395362
    1 10012 1996 37 .0177345   .0395362
    1 10012 1992 37 .0079708   .0395362
    1 10012 1993 37 .0124058   .0395362
    1 10012 2001 37 .0288672   .0395362
    1 10012 1994 37 .0182261   .0395362
    1 10012 1990 37 .1175252   .0395362
    1 10012 1989 37 .0830875   .0395362
    1 10012 1987 37 .1010276   .0395362
    1 10015 1983 44        0  .04377445
    1 10015 1985 44        0  .04164537
    end

    I understand that I can use the following code suggested by Clyde Schechter in another post. but it takes a significant time if the number of observations is large.

    Code:
    isid firmid year
    gen avg_roa= .
    forvalues i = 1/`=_N' {    // LOOP OVER OBSERVATIONS
        quietly summ roa if  stateid == stateid[`i'] & sector != sector[`i'], detail
        replace avg_roa= r(mean) in `i'
    }
    Thanks in advance

  • #2
    The following may or may not be faster:

    Code:
    preserve
    keep stateid sector roa
    rename (roa sector) =_U
    tempfile copy
    save `copy'
    
    restore
    gen long obs_no = _n
    joinby stateid using `copy'
    drop if sector == sector_U
    collapse (first) stateid firmid year sector roa (mean) roa_U, by(obs_no)

    Comment


    • #3
      The following is likely the fastest way to do this:

      Code:
      bysort stateid sector: egen double roasstot = total(roa)
      by stateid sector: gen long ssN = _N
      by stateid: egen double roastot = total(roa)
      by stateid: gen wanted = (roastot - roasstot) / (_N - ssN)
      Last edited by Robert Picard; 04 Dec 2019, 11:01.

      Comment


      • #4
        Perfect! Thank you both for this.

        Comment

        Working...
        X