  • Stuck while calculating moving median

    Hello,

    I am currently writing my thesis, for which I have to calculate the moving median of ROE within each of the 48 industry classifications of Fama and French (1997). The calculation is described as follows:

    "To compute a target industry ROE, we group all stocks into the same 48 industry classifications as Fama and French [1997]. The industry target ROE is a moving median of past ROEs from all firms in the same industry. We use at least five years, and up to ten years, of past data to compute this median."

    I already had a look at the last post: https://www.statalist.org/forums/for...for-industries

    I started from the code in that post and modified it, because I have already selected 5 years. However, I could not get it to work: I am getting an "if not found" error. Maybe someone knows how to get it right? It would be of great help.

    This is the code I now have:

    levelsof industry, local(industries)
    levelsof Date, local(years)

    gen moving_median_roe = .

    foreach ind of local industries {
        foreach y of local years {
            replace moving_median_roe = `r(p50)' if industry == `ind' & Date == `y'
        }
    }

    summ moving_median_roe


    This is my data:
    Attached Files

  • #2
    I am now using this code, which seems to give me the right results...

    egen industrymed = median(ROE), by (industry)
    egen tag = tag(industry)
    su industrymed if tag, detail
    gen wanted = industrymed > r(p50)
    tabdisp industry, c(industrymed wanted)



    • #3
      #1 loops over industries and years and repeatedly tries to assign an empty string to each cross-combination, because the summarize comes after all of that and there is no r(p50) to assign. The macro reference `r(p50)' expands to nothing, so each replace reads as replace moving_median_roe = if industry == ... Stata doesn't see the empty string, but rather sees the if and tries to interpret it as a variable or scalar name, which could be legal at that point. There is no such variable or scalar in your dataset, so Stata bails out, puzzled.
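
      A minimal sketch of how the loop in #1 could be repaired, not tested against your data: the summarize goes inside the loops, with the detail option, so that r(p50) exists when replace runs. It assumes that ROE, industry and Date are numeric and that Date holds calendar years; the 10-year look-back window is an assumption taken from the definition quoted in #1.

      ```stata
      levelsof industry, local(industries)
      levelsof Date, local(years)

      gen moving_median_roe = .

      foreach ind of local industries {
          foreach y of local years {
              * median ROE in this industry over the 10 years before the current year
              quietly summarize ROE if industry == `ind' & inrange(Date, `y' - 10, `y' - 1), detail
              * r(p50) evaluates to missing when no observations fall in the window
              quietly replace moving_median_roe = r(p50) if industry == `ind' & Date == `y'
          }
      }
      ```

      Referring to r(p50) directly, rather than through macro quotes, means that an empty result is assigned as missing instead of breaking the syntax of replace.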

      #2 compares each industry's median ROE with the median of those industry medians. Nothing moves there. If it's what you want, then fine.

      The thread you cited in #1 later gives one-line solutions for the question, which can be adapted here.

      Code:
      ssc install rangestat 
      
      rangestat (count) count10=ROE (median) median10=ROE, int(Date_y -10 -1) by(industry)
      gives a median over the previous 10 years for each year and industry. For the previous 5 years, adjust as needed. It's a good idea to keep track of how many values were used to calculate each median. Some researchers ignore results based on only a few values.
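
      For instance, a 5-year version with a screen on the number of values might look like the sketch below. The names count5 and median5 and the threshold of 5 are illustrative assumptions, not a recommendation, and the count is of nonmissing ROE values in the window, not of distinct years.

      ```stata
      * median and count of ROE over the previous 5 years, by industry
      rangestat (count) count5=ROE (median) median5=ROE, interval(Date_y -5 -1) by(industry)

      * hypothetical screen: discard medians based on fewer than 5 values
      replace median5 = . if count5 < 5
      ```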

      Your screenshot was readable and helpful, but https://www.statalist.org/forums/help#stata explains our request to use dataex, which is usually much more helpful.



      • #4
        Thank you for your quick answer! I will take a look at it and use the code most suitable for my research.
