Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Creating a variable that is chosen from two different distributions with probability=0.5

    Hi there,

    I am trying to create a variable u with 1000 observations that is distributed according to a mixture of normals: with probability 0.5 ui drawn from N (−5, 3), and with probability 0.5 it is drawn from N (5, 1). I am very new to Stata, so have been attempting the following code, but there is likely a much easier way to do this.


    set seed 9876
    set obs 1000

    gen e1 = rnormal(-5, 3)

    gen e2 = rnormal(5, 1)

    gen p = runiform()

    if p>0.5 {
    generate u=e1
    label variable u "ue1"
    }
    else if p<0.5 {
    generate u=e2
    label variable u "ue2"
    }

    The problem is that whatever the first value of p is, that is what the entire list of u becomes. For example, the first observation of p is <0.5, so my entire set of u are from the e2 distribution, as seen below


    +--------------------------------+
    | p e2 u |
    |--------------------------------|
    1. | .3038384 5.882499 5.882499 |
    2. | .0229448 5.948315 5.948315 |
    3. | .7245733 5.705297 5.705297 |
    4. | .1281148 4.831472 4.831472 |
    5. | .8212033 4.907266 4.907266 |
    |--------------------------------|
    6. | .3516393 6.40349 6.40349 |
    7. | .6429143 4.464439 4.464439 |
    8. | .522856 3.017771 3.017771 |
    9. | .1444471 4.213279 4.213279 |
    10. | .1331125 5.73031 5.73031 |
    |--------------------------------|
    11. | .0952627 3.310728 3.310728 |
    12. | .8458152 5.74537 5.74537 |
    13. | .1512675 2.610965 2.610965 |
    14. | .7207158 4.76138 4.76138 |
    15. | .4228746 4.465273 4.465273 |


    I am not sure if I am supposed to run this as 1000 simulations of a randomly generated p, or if there is a much simpler way of getting a bimodal distribution on Stata. Any help is incredibly appreciated.
    Thanks
    Mike

  • #2
    You are confusing the -if- command, which you are using inappropriately here, with the -if- condition, which is what you need to use. This is a subtlety that many people coming to Stata from other languages stumble over. What you want is:

    Code:
    clear*
    set seed 9876
    set obs 1000
    
    gen e1 = rnormal(-5, 3)
    
    gen e2 = rnormal(5, 1)
    
    gen p = runiform()
    
    gen wanted = e1 if p < 0.5
    replace wanted = e2 if p >= 0.5
    


    Actually, those last two commands can be compressed to a single command that avoids the use of explicit -if- entirely:
    Code:
    gen wanted = cond(p < 0.5, e1, e2)
    Suggested reading: -help if- and -help ifcmd-.

    Comment

    Working...
    X