
  • How to generate 2 new variables consisting of the upper and lower confidence intervals of another variable

    Hello everyone,
    I have an issue regarding creating two new variables consisting of the upper and lower confidence intervals of another variable. I would also like to combine two bar graphs into one. Thank you in advance for your help.

  • #2
    Please read the Forum FAQ for excellent advice on how to post questions effectively and enhance your chances of a timely and helpful response. Your question is quite vague and general. You fail to show example data. You don't explain in what way you want to combine the bar graphs, nor give any indication of what the bar graphs themselves are like.



    • #3
      Thank you for your response. I am sorry that it has not been clear. Below are the commands I tried to use. What I want to know is
      1. How to calculate the lower and upper bounds for each penta over hh7 (variable has 11 strata) and then show it in a graph.
      2. How to draw a similar graph for two additional strata and add it to the first one. The two additional strata are the zones within which the 11 strata (states) in hh7 are located. In other words, the graph should show a bar summarising penta for each zone after the bars for the states in that zone.

      twoway (bar penta hh7, fcolor(gs15) barwidth(.7) ) (lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))



      • #4
        So something like this:

        Code:
        levelsof hh7, local(hh7s)
        gen lowerbound = .
        gen upperbound = .
        foreach h of local hh7s {
            ci means penta if hh7 == `h'
            replace lowerbound = r(lb) if hh7 == `h'
            replace upperbound = r(ub) if hh7 == `h'
        }
        Note: this code is untested, as you did not provide example data to test it on. Consequently it may contain typos or other errors. You'll have to fix it up if so. Also because you did not provide example data, I have made certain assumptions about your data that are necessary for this code to work properly: hh7 must be a numeric variable, penta is a continuous variable, not a proportion. You will need to modify the code accordingly if these assumptions are wrong.



        • #5
          Thank you so much Clyde. penta is a proportion and I think I will be able to fix that. Once again, thank you. I am grateful.



          • #6
            [Attached image: ci.png]
            Clyde gives excellent advice as always. Consider also http://www.stata-journal.com/sjpdf.h...iclenum=gr0045 which explains how statsby and ci can be combined to get a graph of means and confidence intervals. But note that the syntax of ci has changed since that paper.

            Here is a complete example that can be run.


            Code:
            sysuse auto, clear
            statsby, by(foreign) : ci means mpg
            twoway rcap lb ub foreign, lc(blue) || scatter mean foreign , mc(blue) yti(Mileage (mpg)) aspect(1) xla(0 "Domestic" 1 "Foreign", tlc(none)) legend(off) xsc(r(-0.2 1.2)) yla(, ang(h))



            • #7
              Thank you Nick for your response. penta is a proportion rather than a mean.



              • #8
                So, your command will be different accordingly. (I think you posted #5 while I was writing #6.)
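
                As a rough sketch of how the example in #6 might change for a proportion: this assumes Stata 14 or later (for the ci proportions syntax) and uses foreign and rep78 from the auto data as stand-ins for penta and hh7. The names prop, lb and ub are just labels chosen here, and Wilson intervals are one choice among several.

```stata
sysuse auto, clear

* One observation per rep78 group: proportion foreign with its Wilson limits
* (statsby drops observations with missing rep78 by default)
statsby prop=r(proportion) lb=r(lb) ub=r(ub), by(rep78) clear : ///
    ci proportions foreign, wilson

* Capped spikes for the intervals, markers for the point estimates
twoway rcap lb ub rep78, lcolor(blue) ///
    || scatter prop rep78, mcolor(blue) ///
    ytitle("Proportion foreign") xlabel(1(1)5) legend(off)
```

                Because of the /// line continuations, this needs to be run from a do-file rather than typed line by line.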



                • #9
                  I have downloaded the PDF from the link you shared and will be reading it. Yes, I believe so; below is what I did with the command Clyde gave. Now I have a problem putting the results on a bar graph. It would be nice if you could help fix it and suggest how to merge two such graphs. See the syntax I used below the code given by Clyde.

                  levelsof hh7, local(hh7s)
                  gen lowerbound = .
                  gen upperbound = .
                  foreach h of local hh7s {
                  ci penta if hh7 == `h', binomial wilson
                  replace lowerbound = r(lb) if hh7 == `h'
                  replace upperbound = r(ub) if hh7 == `h'
                  }

                  twoway (bar penta hh7, fcolor(gs15) barwidth(.7) ) (lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))

                  I get the following error: lowerbound is not a twoway plot type



                  • #10
                    Something like:

                    Code:
                    graph twoway (bar penta hh7) (rcap lowerbound upperbound hh7)
                    That, I think, would be the basic code. I do think that Nick's suggestion of using -scatter- instead of -bar- is better, but that's up to you. If this code gives you the basic graph you want, then start adding in the particular options you like. One issue with error bars on bar graphs is that the fill of the bars tends to obscure the descending error bar. Scatter graphs don't have that problem, and also use a lot less ink to convey the same information as a bar, which usually results in a visually more appealing graph as well.

                    Also, let me point out that before you graph this data, you should -collapse penta lowerbound upperbound, by(hh7)-; otherwise your graph is going to have a large number of central dots or overlapping bars. You want to reduce this to one observation per hh7 category.
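
                    Putting the pieces together (loop, collapse, then graph), a minimal sketch that runs on the auto data might look like the following. It assumes Stata 14 or later; rep78 and foreign are stand-ins for hh7 and penta, not the poster's data, and the graph options are only a starting point.

```stata
sysuse auto, clear
drop if missing(rep78)

* foreign is 0/1, so its mean within each group is a proportion
levelsof rep78, local(groups)
gen lowerbound = .
gen upperbound = .
foreach g of local groups {
    * In Stata 13 the equivalent call is: ci foreign if rep78 == `g', binomial wilson
    ci proportions foreign if rep78 == `g', wilson
    replace lowerbound = r(lb) if rep78 == `g'
    replace upperbound = r(ub) if rep78 == `g'
}

* Reduce to one observation per group before graphing
collapse (mean) foreign lowerbound upperbound, by(rep78)

twoway (bar foreign rep78, fcolor(gs15) barwidth(.7)) ///
       (rcap lowerbound upperbound rep78, lcolor(dknavy)), ///
       legend(off) ytitle("Proportion foreign")
```

                    Replacing bar with scatter in the first plot gives the dot-and-interval version instead.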



                    • #11
                      The ci command is out-of-date here unless you are using an out-of-date version of Stata, in which case you should tell us. See https://www.statalist.org/forums/help#version and indeed the entire document.


                      The subcommand


                      Code:
                      (lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))
                      should presumably be more like

                      Code:
                      (rspike lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))
                      or rcap (you already have an example of syntax that works in #6).

                      Note that lowerbound is being interpreted by graph as an attempt at a plot type, signalling the lack of a plot type.

                      This is all depending on what you want. (I can't recommend superimposing one bar graph on top of another.)

                      Otherwise much of what Clyde said in #2 remains true. It is hard to test your code without example data.



                      • #12
                        Thank you Clyde and Nick. I am using Stata 13.1. Let me try the suggestions given and I will let you know of the outcome.



                        • #13
                          Originally posted by Clyde Schechter
                          So something like this:

                          Code:
                          levelsof hh7, local(hh7s)
                          gen lowerbound = .
                          gen upperbound = .
                          foreach h of local hh7s {
                          ci means penta if hh7 == `h'
                          replace lowerbound = r(lb) if hh7 == `h'
                          replace upperbound = r(ub) if hh7 == `h'
                          }
                          Note: this code is untested, as you did not provide example data to test it on. Consequently it may contain typos or other errors. You'll have to fix it up if so. Also because you did not provide example data, I have made certain assumptions about your data that are necessary for this code to work properly: hh7 must be a numeric variable, penta is a continuous variable, not a proportion. You will need to modify the code accordingly if these assumptions are wrong.
                          Dear Mr Schechter,

                          I have plotted my empirical distribution, as you can see in the following diagram. The blue line is my empirical distribution (net earnings) and the red line is the reference distribution (RD).
                          [Attached image: Graph.png]

                          I want to calculate the upper and lower confidence limits of my empirical distribution (net earnings), but I am confused.
                          Could you explain further how to use your code and what I need to replace in order to compute my confidence intervals?

                          Thank you in advance
                          Kleon

