  • How to draw a number from a given vector of probabilities?

    Hello everyone, and happy new year!

    I searched everywhere for an answer but could not find one, so I hope someone here can help me out.

    I have a vector where each entry is a probability between 0 and 1.

    What I need is a function, or some other way, to create a new dummy vector with values 0 or 1, where the probability of getting a 1 in each row is given by the corresponding row of the probability vector.
    That is: if the probability vector has an entry of 0.5, the corresponding cell should equal 1 in roughly half of 100 repetitions of this function.
    Something like the table below:
    Probability  iter1  iter2  iter3  iter4  iter5  iter6  iter7  iter8  iter9  iter10
    0.5              1      1      0      1      0      0      1      1      0       1
    1                1      1      1      1      1      1      1      1      1       1

    Do you know how I could achieve something like this?

    Please let me know if the question is not clear enough.
    Thank you all,
    D.

  • #2
    Perhaps this example using Mata vectors will start you in a useful direction.
    Code:
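    set seed 42
    mata:
    p1 = .5               // first probability
    p2 = 1                // second probability
    x = runiform(1,10)    // 1 x 10 row vector of uniform(0,1) draws
    y = x:<=p1            // 1 where the draw is at most p1, else 0
    z = x:<=p2            // 1 where the draw is at most p2, else 0
    x \ y \ z
    end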
    Code:
    : x \ y \ z
                     1             2             3             4             5             6
        +-------------------------------------------------------------------------------------
      1 |   .755155533   .6390313939   .7521452007   .1362726836   .9032689664   .0940683118
      2 |            0             0             0             1             0             1
      3 |            1             1             1             1             1             1
        +-------------------------------------------------------------------------------------
                     7             8             9            10
         ---------------------------------------------------------+
      1    .5745703041   .3728876995   .2738741017   .3902708814  |
      2              0             1             1             1  |
      3              1             1             1             1  |
         ---------------------------------------------------------+
    If not, your question really isn't clear without more detail, or at a minimum it is too difficult to guess at a good answer from what you have shared. Please help us help you. Show example data - your "probability vector". Is it a Stata dataset variable? Is it a Stata matrix? Is it a Mata matrix? In my example, I have just used two Mata scalars for convenience to demonstrate a technique you should be able to adapt.
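One way the adaptation to a whole vector might look is sketched below. It is only an illustration under assumptions: the column vector p and its values (.5, 1, .25) are hypothetical stand-ins for a real probability vector, and the names u and d are made up for this sketch. It relies on Mata's colon operators accepting a matrix on one side and a column vector with the same number of rows on the other.

```stata
set seed 42
mata:
p = (.5 \ 1 \ .25)           // hypothetical column vector of probabilities
u = runiform(rows(p), 10)    // one row of 10 uniform(0,1) draws per probability
d = u :<= p                  // d[i,j] is 1 when the draw is at most p[i], else 0
p, d                         // probabilities next to their 10 draws
mean(d')'                    // share of ones in each row
end
```

Each row of d then plays the role of one line of the table in the original question, with the ten columns as iter1 to iter10.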

    The Statalist FAQ provides advice on effectively posing your questions, posting data, and sharing Stata output.



    • #3
      Thanks a lot William and sorry for not being very clear.

      What I actually did was estimate a probit model and store the predicted probabilities in a standard Stata dataset variable (not Mata):
      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input double phat_af
          .94836003308945
        .8769213812682086
        .9215910212697133
        .9160083252126953
        .8136632295801208
        .6240468606644965
        .5985503567780064
       .49904546967474017
        .9675509892934736
        .9489425567744427
        .9239985676019286
        .9951993575259833
        .5700241203161944
        .8838109343484395
        .7709431776265225
        .8589910480484257
        .6265504674642524
        .5553053201189672
      .032334087049388546
         .974324634276116
      end
      Now I would like to create a new variable called dummy that takes the values 0 and 1, where the probability of a 1 is given by the variable phat_af. So if I create 10 such dummies, I expect about 5 of them to be 0 when phat_af is equal to 0.5.


      Your suggestion is interesting, though. Let me see if I understood it correctly: you use the uniform distribution to create a new variable, which we can call r_unif, and compare phat_af with r_unif. If r_unif<=phat_af then the dummy is equal to 1, and 0 otherwise. Is that what you meant?

      But, and this is of course my limitation, in this way am I actually drawing from the probability phat_af? I am not sure if it is statistically the same.


      Thank you a lot in any case for your time.
      D.



      • #4
        Thanks for clarifying that your question involved numbers stored in a variable, not in a vector.

        So, relying on: "create a new variable ...[with] 0 and 1 ... given by the probability of the variable phat_af."

        Code:
        gen byte y = (runiform() < phat_af)
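A slightly more defensive variant is sketched below, not as a correction but as one way to make the draw reproducible and to guard against missing probabilities. In Stata a missing value compares as larger than any number, so runiform() < . is true and an observation with a missing phat_af would otherwise be assigned 1. The three-row toy dataset (.5, 1, and a missing value) and the seed are assumptions made only so the sketch runs on its own; the name dummy follows post #3.

```stata
* Toy data standing in for the probit predictions (assumption)
clear
input double phat_af
.5
1
.
end

set seed 12345    // arbitrary seed, only for reproducibility

* One 0/1 draw per observation; left missing where phat_af is missing
generate byte dummy = runiform() < phat_af if !missing(phat_af)
list, clean noobs
```

With the real data in memory, only the set seed and generate lines would be needed.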



        • #5
          "in this way am I actually drawing from the probability phat_af? I am not sure if it is statistically the same"

          If we follow the code in post #4 to generate the variable y, then Pr{y=1} = Pr{runiform()<phat_af} = F(phat_af) where F is the cumulative distribution function of a uniform(0,1) random variable.

          But F(x) = x for 0 <= x <= 1 is the cumulative distribution function of a uniform(0,1) random variable, so F(phat_af) = phat_af.

          Thus Pr{y=1} = phat_af, which is what you seek.
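The argument above can also be checked empirically. The sketch below repeats the draw 100 times per observation, as in the original question, and compares the share of ones with the probability. The toy probabilities are taken from values that appear in this thread, while the seed, the number of repetitions, and the names iter1 to iter100 and share1 are assumptions for illustration.

```stata
* Toy data: a few probabilities that appear in the thread (assumption)
clear
input double phat_af
.5
1
.032334087049388546
end

set seed 2021     // arbitrary seed, only for reproducibility
local reps 100    // number of repeated draws per observation

* iter1 ... iter100: independent 0/1 draws with Pr(1) = phat_af
forvalues i = 1/`reps' {
    generate byte iter`i' = runiform() < phat_af if !missing(phat_af)
}

* Share of ones across the repetitions; should be close to phat_af
egen double share1 = rowmean(iter1-iter`reps')
list phat_af share1, clean noobs
```

With only 100 repetitions the share will wander around phat_af; raising reps should tighten the agreement.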

          Last edited by William Lisowski; 01 Jan 2021, 14:46.



          • #6
            That's great.
            Thank you Mike for this very neat code and thank you William for your explanations.

            Have a happy new year!
