Random normal variable between two PDF p-values

Ernesto Vincenti

Join Date: Mar 2015

Posts: 29
#1

05 Jul 2019, 14:43

I'd like to generate random draws from the right tail of a normal distribution, that is, draws whose cumulative probability lies between 0.84 and 1, for a mean and standard deviation that I provide.

Some background in case it helps. I'm working with survey data in Stata 15. There is an income variable, income, that is top-coded at a fixed nominal level of $100,000. Records below this are coded as-is; records above this are all coded at $100,000. The survey has a probability weight, weight. I've created a dummy variable tc equal to 1 if a record is top-coded. About 16% of records are so top-coded.

I would like to test the implications of imputing income for top-coded records under different distributional assumptions. The first imputation I would like to test is under the assumption that income is log-normal distributed. I've created a lincome variable for this purpose. Assume that the mean of lincome is 10.5 and the standard deviation is 1.5.

To do the imputation, I need a strategy for randomly generating a normal variable with a mean of 10.5 and a standard deviation of 1.5, restricted to values between the 84th and 100th percentiles of that distribution.

Here's code that generates a synthetic version of my data.

Code:

clear all
set seed 10016
set obs 1000
gen lincome = rnormal(10, 1.5)
gen income = exp(lincome)
gen weight = 100
gen tc = income > 100000
replace income = 100000 if income > 100000
replace lincome = ln(100000) if lincome > ln(100000)

Last edited by Ernesto Vincenti; 05 Jul 2019, 15:04.

Mike Lacy

Join Date: Apr 2014
Posts: 2421

05 Jul 2019, 15:04

Here's one approach: Create a column vector of such values, save it as a variable, and use the values as needed.

Code:

mat newinc = J(`=_N', 1, -1)  // -1 is out of range
forval i = 1/`=_N' {
   while newinc[`i', 1] < 0.84 {
        mat newinc[`i',1] = rnormal(0,1) 
   }
}
svmat newinc  
replace lincome = 10.5 + 1.5 * newinc1 if !tc
drop newinc

Ernesto Vincenti

Join Date: Mar 2015

Posts: 29
#3

07 Jul 2019, 20:15

Thanks, Mike Lacy. This is very close. Two corrections: the rnormal() call should be wrapped in normal(), so that the 0.84 cutoff applies to a cumulative probability rather than to a z-score, and lincome should be replaced if tc, not if !tc. Also, to ensure that the imputed values stay above the top code, I changed the final replace call so that it adds to the top-code limit the draw's excess over the 84th-percentile z-score, scaled by the standard deviation of 1.5.
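For anyone following the logic, here is a rough sketch of why the normal() wrap works. The notation is my own shorthand and not from the posts above: Z is a standard normal draw, \Phi is the standard normal CDF (normal() in Stata), and \Phi^{-1} is its inverse (invnormal()).

```latex
\begin{aligned}
Z \sim N(0,1) \;&\Rightarrow\; U = \Phi(Z) \sim \mathrm{Uniform}(0,1) \\
U \mid U \ge 0.84 \;&\sim\; \mathrm{Uniform}(0.84,\,1) \\
\Phi^{-1}(U) \;&\ge\; \Phi^{-1}(0.84) \approx 0.994 \\
\text{lincome}_{\text{new}} \;&=\; \ln(100000) + 1.5\,\bigl(\Phi^{-1}(U) - \Phi^{-1}(0.84)\bigr) \;\ge\; \ln(100000)
\end{aligned}
```

The last line is the final replace: the excess of the truncated z-score over the cutoff is never negative, so an imputed value cannot fall below the top code.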

Code:

mat newinc = J(`=_N', 1, -1) // -1 is out of range
forval i = 1/`=_N' {
    while newinc[`i', 1] < 0.84 {
        mat newinc[`i',1] = normal(rnormal(0,1))
    }
}
svmat newinc
replace lincome = lincome + (invnormal(newinc1)-invnormal(0.84))*1.5 if tc
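A possible simplification, offered only as a sketch: since the loop just keeps uniform values above 0.84, the same draws could probably be obtained without a matrix or a loop by drawing directly from a uniform on (0.84, 1) with runiform(a, b) and passing the result through invnormal(). The variable u and the locals p and sigma below are names I made up for illustration, and the synthetic data mirror the setup in #1 (with lincome stored as double to avoid float rounding at the top code).

```stata
clear all
set seed 10016
set obs 1000

* Synthetic log income, top-coded at ln(100000) as in post #1
gen double lincome = rnormal(10, 1.5)
gen tc = lincome > ln(100000)
replace lincome = ln(100000) if tc

* Assumed values taken from the thread: cutoff probability and SD
local p     = 0.84
local sigma = 1.5

* One uniform draw on (p, 1) per record; invnormal() maps it to a
* z-score in the right tail, above invnormal(p)
gen double u = runiform(`p', 1)

* Add the scaled excess over the cutoff z-score to the top-code limit
replace lincome = lincome + (invnormal(u) - invnormal(`p')) * `sigma' if tc
drop u
```

This uses the same final replace as above, so imputed values start at the top code and extend upward; only the way the tail probabilities are generated differs.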