Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Trying to make a plot with median and CI

    Hi,

    I would like to create a plot that shows the median of a variable within two different groups with the median confidence intervals.

    Specifically, with the data below, I would like create a plot that shows the median of sppbscore0 within the two age groups (age < 70.77 & age >=70.77) with the confidence intervals included, all on the same plot.

    This plot would be similar to the plots produced by the
    Code:
    ciplot
    command from SSC for the mean of a variable.

    I am using Stata 16.1 for Mac.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte sppbscore0 float age_hl
     8 0
    10 1
     6 1
    12 1
     9 0
     7 1
     7 1
    10 0
    10 0
     8 1
     5 0
     8 1
     7 1
     8 0
     8 0
     6 1
     9 1
     5 1
     9 0
     9 0
    end
    label values age_hl Age
    label def Age 0 "Age <70.77", modify
    label def Age 1 "Age >70.77", modify
    Any help would be appreciated.

    Thank you.
    Last edited by Harry Stern; 27 Sep 2020, 16:43.

  • #2
    I don't know -ciplot- myself, but I think you want something like this:

    Code:
    frame create medians age_hl median lb ub
    levelsof age_hl, local(ages)
    foreach a of local ages {
        centile sppbscore0 if age_hl == `a', centile(50)
        frame post medians (`a') (`r(c_1)') (`r(lb_1)') (`r(ub_1)')
    }
    
    frame medians: graph twoway (scatter median age_hl) (rcap ub lb age), ///
        xlabel(0 1) ytitle(sppbscore0) legend(off) note("Median and 95% CI")
    You can add other options to the -graph twoway- command to customize the appearance of the code to your taste.

    If in your real data the age_hl variable is really just 0 or 1, then you can skip the -levelsof- command and replace -foreach a of local ages- with -forvalues a = 0/1-. I made the guess that the full data includes more values, since it's a little odd to make a graph like this if there are only two categories--but there's no reason you can't do it if that's what you want.

    Comment


    • #3
      Originally posted by Clyde Schechter View Post
      I don't know -ciplot- myself, but I think you want something like this:

      Code:
      frame create medians age_hl median lb ub
      levelsof age_hl, local(ages)
      foreach a of local ages {
      centile sppbscore0 if age_hl == `a', centile(50)
      frame post medians (`a') (`r(c_1)') (`r(lb_1)') (`r(ub_1)')
      }
      
      frame medians: graph twoway (scatter median age_hl) (rcap ub lb age), ///
      xlabel(0 1) ytitle(sppbscore0) legend(off) note("Median and 95% CI")
      You can add other options to the -graph twoway- command to customize the appearance of the code to your taste.

      If in your real data the age_hl variable is really just 0 or 1, then you can skip the -levelsof- command and replace -foreach a of local ages- with -forvalues a = 0/1-. I made the guess that the full data includes more values, since it's a little odd to make a graph like this if there are only two categories--but there's no reason you can't do it if that's what you want.
      Hi Clyde,

      Thank you for the response.

      When I do the
      -frame medians: graph twoway (scatter median age_hl) (rcap ub lb age), /// xlabel(0 1) ytitle(sppbscore0) legend(off) note("Median and 95% CI")- command, I receive the error message
      Code:
      option / not allowed
      .

      Comment


      • #4
        Originally posted by Clyde Schechter View Post
        I don't know -ciplot- myself, but I think you want something like this:

        Code:
        frame create medians age_hl median lb ub
        levelsof age_hl, local(ages)
        foreach a of local ages {
        centile sppbscore0 if age_hl == `a', centile(50)
        frame post medians (`a') (`r(c_1)') (`r(lb_1)') (`r(ub_1)')
        }
        
        frame medians: graph twoway (scatter median age_hl) (rcap ub lb age), ///
        xlabel(0 1) ytitle(sppbscore0) legend(off) note("Median and 95% CI")
        You can add other options to the -graph twoway- command to customize the appearance of the code to your taste.

        If in your real data the age_hl variable is really just 0 or 1, then you can skip the -levelsof- command and replace -foreach a of local ages- with -forvalues a = 0/1-. I made the guess that the full data includes more values, since it's a little odd to make a graph like this if there are only two categories--but there's no reason you can't do it if that's what you want.
        Actually, I just deleted the /// from that command and it worked. Thank you!

        Comment


        • #5
          I'm guessing you entered the commands in the Command window. When I post code here, I always write the code for use in the do-editor. /// in the do-editor tells Stata that the command continues on the next line. In the Command window, /// is not allowed.

          Comment


          • #6
            ciplot was mentioned. The help carries this note:

            (note added August 2011)

            For fuller flexibility, consider using statsby first and then standard graphics commands.
            I don't remember the history of ciplot in detail. But it was probably roughly like this. In the late 1990s, there were many posts on Statalist from people wanting to plot confidence intervals for means, so programs gotwritten and posted. Then other people start commenting, wanting something different, and reasonably. At that point, the problem rather obviously is too big. Wanting to plot confidence intervals for estimated parameters is too big for one command, unless the command is called twoway, meaning just work out your confidence intervals first and then plot them how you wish. Nothing in that is meant to discourage convenient wrappers for intervals that suit each program author, which was ciplot too, back in the day.

            Clyde Schechter posted nice code that will work in Stata 16 up. Here's another solution using the same data example for people on Stata before 16.


            Code:
            preserve 
            
            statsby median=r(c_1) ub=r(ub_1) lb=r(lb_1), by(age_hl) clear: centile sppbscore0, centile(50)
            scatter median age_hl || rcap ub lb age_hl, xlabel(0 1) ytitle(sppbscore0) legend(off) note("Median and 95% CI")
            save median_ci 
            
            restore

            Note: I am hoping to get to age 70.77 and find out what happens then. I hope it's not a shock.

            Comment


            • #7
              Thank you both for your help. I understand now why that command wasn't working - was not using the do-editor. Indeed, I also hope age 70.77 is not a shocker. You raise a good point, though. Maybe would make more sense to round up...

              Comment


              • #8
                an alternative tht i have used routinely would be to use a regrression model followed by margins and marginsplot.
                an example would be:
                Code:
                qreg sppbscore0 i.age
                margins i.age
                marginsplot, x(age)
                Last edited by George Hoffman; 29 Sep 2020, 07:00.

                Comment

                Working...
                X