How to generate 2 new variables consisting of the upper and lower confidence intervals of another variable

Alieu Sowe

Join Date: Dec 2017

Posts: 8
#1

05 Dec 2017, 08:45

Hello everyone,
I have an issue regarding creating two new variables consisting of the upper and lower confidence intervals of another variable. I would also like to combine two bar graphs into one. Thank you in advance for your help.
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#2

05 Dec 2017, 09:00

Please read the Forum FAQ for excellent advice on how to post questions effectively and enhance your chances of a timely and helpful response. Your question is quite vague and general. You fail to show example data. You don't explain in what way you want to combine the bar graphs, nor give any indication of what the bar graphs themselves are like.
Alieu Sowe

Join Date: Dec 2017

Posts: 8
#3

05 Dec 2017, 09:35

Thank you for your response. I am sorry that my question was not clear. Below is the command I tried to use. What I want to know is:
1. How to calculate the lower and upper bounds of penta for each level of hh7 (a variable with 11 strata) and then show them in a graph.
2. How to draw a similar graph for two additional strata and add it to the first graph. The two additional strata are the zones within which the 11 strata (states) in hh7 are located. So the graph would show a bar summarising penta for each zone after the states in that zone.

twoway (bar penta hh7, fcolor(gs15) barwidth(.7) ) (lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#4

05 Dec 2017, 10:04

So something like this:

Code:

levelsof hh7, local(hh7s)
gen lowerbound = .
gen upperbound = .
foreach h of local hh7s {
    ci means penta if hh7 == `h'
    replace lowerbound = r(lb) if hh7 == `h'
    replace upperbound = r(ub) if hh7 == `h'
}

Note: this code is untested, as you did not provide example data to test it on. Consequently it may contain typos or other errors. You'll have to fix it up if so. Also because you did not provide example data, I have made certain assumptions about your data that are necessary for this code to work properly: hh7 must be a numeric variable, penta is a continuous variable, not a proportion. You will need to modify the code accordingly if these assumptions are wrong.
Alieu Sowe

Join Date: Dec 2017

Posts: 8
#5

05 Dec 2017, 10:31

Thank you so much Clyde. penta is a proportion and I think I will be able to fix that. Once again, thank you. I am grateful.
Nick Cox

Join Date: Mar 2014

Posts: 35696
#6

05 Dec 2017, 10:38

Clyde gives excellent advice as always. Consider also http://www.stata-journal.com/sjpdf.h...iclenum=gr0045 which explains how statsby and ci can be combined to get a graph of means and confidence intervals. But note that the syntax of ci has changed since that paper.

Here is a complete example that can be run.

Code:

sysuse auto, clear
statsby, by(foreign) : ci means mpg
twoway rcap lb ub foreign, lc(blue) ///
    || scatter mean foreign , mc(blue) yti(Mileage (mpg)) aspect(1) ///
    xla(0 "Domestic" 1 "Foreign", tlc(none)) legend(off) ///
    xsc(r(-0.2 1.2)) yla(, ang(h))
Alieu Sowe

Join Date: Dec 2017

Posts: 8
#7

05 Dec 2017, 10:57

Thank you Nick for your response. It is proportion instead of means.
Nick Cox

Join Date: Mar 2014

Posts: 35696
#8

05 Dec 2017, 11:21

So, your command will be different accordingly. (I think you posted #5 while I was writing #6.)
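
For readers following along, here is a minimal sketch of how the example in #6 might be adapted to a proportion. It is not taken from any post in this thread. It assumes Stata 14 or later (where the syntax is ci proportions), and it uses the auto data, with the 0/1 variable foreign standing in for penta and rep78 standing in for hh7. In Stata 13 the corresponding call is ci foreign, binomial wilson, and the estimate is returned as r(mean) rather than r(proportion).

```stata
* Sketch only: auto data shipped with Stata, not the poster's data.
* foreign (0/1) stands in for penta; rep78 stands in for hh7.
sysuse auto, clear

* Reduce to one observation per group, holding the proportion and its
* Wilson confidence limits. Groups with missing rep78 are left out.
statsby, by(rep78) clear: ci proportions foreign, wilson

* statsby saves r(proportion), r(lb) and r(ub) as variables of the same names.
twoway rcap lb ub rep78, lc(blue) || scatter proportion rep78, mc(blue) ///
    yti(Proportion foreign) legend(off) yla(, ang(h))
```

Because statsby replaces the data in memory with the results, save the original dataset first or wrap the commands in preserve and restore.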
Alieu Sowe

Join Date: Dec 2017

Posts: 8
#9

05 Dec 2017, 11:52

I have downloaded the PDF from the link you shared and will read it. Yes, I believe so. Below is what I did with the code Clyde gave. Now I have a problem putting the results on a bar graph. It would be nice if you could help fix it and suggest how to merge two such graphs. The graph command I used follows the loop adapted from Clyde's code.

levelsof hh7, local(hh7s)
gen lowerbound = .
gen upperbound = .
foreach h of local hh7s {
ci penta if hh7 == `h', binomial wilson
replace lowerbound = r(lb) if hh7 == `h'
replace upperbound = r(ub) if hh7 == `h'
}

twoway (bar penta hh7, fcolor(gs15) barwidth(.7) ) (lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))

I get the following error: lowerbound is not a twoway plot type
Clyde Schechter

Join Date: Apr 2014

Posts: 30097
#10

05 Dec 2017, 12:03

Something like:

Code:

graph twoway (bar penta hh7) (rcap lowerbound upperbound hh7)

That, I think, would be the basic code. I do think that Nick's suggestion of using -scatter- instead of -bar- is better, but that's up to you. If this code gives you the basic graph you want, then start adding in the particular options you like. One issue with error bars on bar graphs is that the fill of the bars tends to obscure the descending error bar. Scatter graphs don't have that problem, and also use a lot less ink to convey the same information as a bar, which usually results in a visually more appealing graph as well.

Also, let me point out that before you graph this data, you should -collapse penta lowerbound upperbound, by(hh7)-; otherwise your graph is going to have a large number of central dots or overlapping bars. You want to reduce this to one observation per hh7 category.
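
Put together, the collapse step and the basic graph might look like the sketch below. It is untested on the poster's data and assumes that penta (coded 0/1), lowerbound, upperbound, and a numeric hh7 already exist, as created by the loop in #9. The preserve and restore lines are an addition here so that the full dataset comes back after the graph is drawn.

```stata
* Assumes penta (0/1), lowerbound, upperbound and numeric hh7 already exist,
* as created by the loop in #9.
preserve

* One observation per hh7 level. The mean of a 0/1 variable is the proportion;
* the bounds are constant within each level, so their means are unchanged.
collapse (mean) penta lowerbound upperbound, by(hh7)

* Bars for the proportions with capped spikes for the confidence limits.
twoway (bar penta hh7, fcolor(gs15) barwidth(.7)) ///
       (rcap lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))

restore
```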
Nick Cox

Join Date: Mar 2014

Posts: 35696
#11

05 Dec 2017, 12:10

The ci command is out-of-date here unless you are using an out-of-date version of Stata, in which case you should tell us. See https://www.statalist.org/forums/help#version and indeed the entire document.

The subcommand

Code:

(lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))

should presumably be more like

Code:

(rspike lowerbound upperbound hh7, lcolor(dknavy) lwidth(medium))

or rcap (you already have an example of syntax that works in #6).

Note that lowerbound is being interpreted by graph as an attempt at a plot type, signalling the lack of a plot type.

This is all depending on what you want. (I can't recommend superimposing one bar graph on top of another.)

Otherwise much of what Clyde said in #2 remains true. It is hard to test your code without example data.
Alieu Sowe

Join Date: Dec 2017

Posts: 8
#12

05 Dec 2017, 12:16

Thank you Clyde and Nick. I am using Stata 13.1. Let me try the suggestions given and I will let you know of the outcome.
Kleon Marenas

Join Date: Apr 2020

Posts: 19
#13

02 May 2020, 08:19

Originally posted by Clyde Schechter

So something like this:

Code:

levelsof hh7, local(hh7s)
gen lowerbound = .
gen upperbound = .
foreach h of local hh7s {
    ci means penta if hh7 == `h'
    replace lowerbound = r(lb) if hh7 == `h'
    replace upperbound = r(ub) if hh7 == `h'
}

Note: this code is untested, as you did not provide example data to test it on. Consequently it may contain typos or other errors. You'll have to fix it up if so. Also because you did not provide example data, I have made certain assumptions about your data that are necessary for this code to work properly: hh7 must be a numeric variable, penta is a continuous variable, not a proportion. You will need to modify the code accordingly if these assumptions are wrong.

Dear Mr Schechter,

I have plotted my empirical distribution, as you can see in the following diagram. The blue line is my empirical distribution (net earnings) and the red line is the reference distribution (RD).

I want to calculate the upper and lower confidence limits of my empirical distribution (net earnings), but I am confused.
Could you explain further how to use your code and what I need to replace in it to compute my confidence intervals?

Thank you in advance
Kleon