
  • Variable for Mean by another variable

    Hi, I currently have two variables: O2 (measuring oxygen in the atmosphere - numerical) and AGA (which is a scale for geographical rurality - categorical)

    What code would I need to create a variable that holds the mean O2 for each AGA area?

  • #2
    Code:
    by AGA, sort: egen wanted = mean(O2)


    • #3
      Originally posted by Clyde Schechter
      Code:
      by AGA, sort: egen wanted = mean(O2)
      Thank you Clyde!

      And if I also wanted to create variables for the lower and upper confidence limits, would it be something like:

      gen O2_lowerci = wanted - invnorm(0.975)*O2

      gen O2_upperci = wanted + invnorm(0.975)*O2



      • #4
        No, those formulas aren't right. First, they produce a different result in every observation rather than one result per group. More importantly, multiplying the value of O2 by invnorm(0.975) does not give you lower and upper confidence limits; for that you need the standard error of the mean of O2. But there is no need to code the formula from scratch anyway, as we have the -ci means- command. So I would scrap the code shown in #2 and instead do this:

        Code:
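        * Placeholders, filled in group by group below
        gen mean_O2 = .
        gen lb_O2 = .
        gen ub_O2 = .
        * Distinct values of AGA (the loop assumes AGA is numeric)
        levelsof AGA, local(agas)
        foreach a of local agas {
            * Mean and confidence limits of O2 within this AGA group
            ci means O2 if AGA == `a'
            replace mean_O2 = r(mean) if AGA == `a'
            replace lb_O2 = r(lb) if AGA == `a'
            replace ub_O2 = r(ub) if AGA == `a'
        }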
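
        For comparison, here is a minimal sketch of the by-hand calculation that -ci means- makes unnecessary. It assumes the default t-based 95% interval and uses the auto dataset as a stand-in (weight in place of O2, rep78 in place of AGA). The variable names m_w, sd_w, n_w, se_w, lb_w and ub_w are illustrative only.

        ```stata
        sysuse auto, clear
        * Group mean, standard deviation and non-missing count
        * (observations with missing rep78 form a group of their own here)
        bysort rep78: egen m_w  = mean(weight)
        by rep78:     egen sd_w = sd(weight)
        by rep78:     egen n_w  = count(weight)
        * Standard error of the mean
        gen se_w = sd_w / sqrt(n_w)
        * t-based 95% limits with n - 1 degrees of freedom
        gen lb_w = m_w - invttail(n_w - 1, 0.025) * se_w
        gen ub_w = m_w + invttail(n_w - 1, 0.025) * se_w
        ```

        The limits should agree with r(lb) and r(ub) from -ci means- up to float precision, which is a quick way to check either approach.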


        • #5
          Compare also the use of statsby with ci. The syntax in https://www.stata-journal.com/articl...article=gr0045 is no longer current, but it's easy to adapt.
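
          A sketch of what the adapted call might look like with -ci means- in place of plain -ci-, assuming the results that -ci means- leaves in r(mean), r(lb), r(ub) and r(N). The auto dataset stands in for the real data (weight for O2, rep78 for AGA) and the new variable names are arbitrary. Note that the clear option replaces the data in memory with one observation per group.

          ```stata
          sysuse auto, clear
          * One observation per rep78 group: mean, confidence limits and n
          statsby mean=r(mean) lb=r(lb) ub=r(ub) n=r(N), by(rep78) clear: ci means weight
          list
          ```

          If the results are needed alongside the original observations, one option is to save this reduced dataset and merge it back on the grouping variable.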


          • #6
            And consider:

            Code:
            * Setup
            * https://econpapers.repec.org/software/bocbocode/S458928.htm
            ssc install _gwmean , replace
            
            * Example using cars from Chambers, J. M., Cleveland, W. S., Kleiner, B., & Tukey, P. A. (2018). Graphical methods for data analysis. Chapman and Hall/CRC. Page 352-355
            sysuse auto, clear
            replace rep78=. if rep78<3
            keep rep78 price weight
            
            * Using the community-contributed Stata module to compute optionally weighted means
            * Same mean as in #4, but without (as yet) the lower and upper confidence limits
            egen arimean = wmean(weight) if rep78!=. , by(rep78)
            
            * Using #4
            gen mean_weight = .
            gen lb_weight = .
            gen ub_weight = .
            levelsof rep78, local(cats)
            foreach a of local cats {
                ci means weight if rep78 == `a'
                replace mean_weight = r(mean) if rep78 == `a'
                replace lb_weight = r(lb) if rep78 == `a'
                replace ub_weight = r(ub) if rep78 == `a'
            }
            
            * Using Stata's built-in command ameans
            bysort rep78: ameans weight
            http://publicationslist.org/eric.melse
