  • Using runiformint(a,b) - Create each Value between a and b once?

    Hello everybody,

    I am currently looking for a solution for the creation of random integer values where each value appears once only. For example:

    gen no = runiformint(1,4280) if ....

    If the condition is met, I would like the numbers 1 to 4280 to be randomly distributed with no number occurring twice. I think this should be fairly easy, but for some reason I can neither find a solution nor work one out myself.

    Maybe you can help me with this! I am looking forward to your responses and already thank you very much in advance!

    All the best
    Sebastian

  • #2
    You may find this Stata Blog discussion helpful.

    https://blog.stata.com/2012/08/03/us...t-replacement/



    • #3
      I do not think this is a well-posed problem as the original poster states it. For example, is 4280 the total number of observations to be randomly ordered? What is the -if- condition?

      But if one wants to randomly reshuffle 4280 numbers:

      Code:
      . set obs 4280
      number of observations (_N) was 0, now 4,280
      
      . gen n = _n
      
      . gen u = runiform()
      
      . sort u
      and the 4280 numbers from 1 to 4280 are reshuffled in random order.
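
      A minimal sketch of the same idea as a self-contained do-file, in case the random numbers should end up in a variable while the data keep their original order. The seed and the variable names id, u, and no are illustrative assumptions, not part of the original question.

      ```stata
      clear
      set seed 4280                    // hypothetical seed, only so the run is reproducible
      set obs 4280
      generate long id = _n            // remember the original order
      generate double u = runiform()   // random sort key, stored in double precision
      sort u                           // put the observations in random order
      generate int no = _n             // position in the random order: 1..4280, each once
      sort id                          // restore the original order
      drop u

      * checks: no value occurs twice, and the values span 1 to 4280
      isid no
      quietly summarize no
      assert r(min) == 1 & r(max) == 4280
      ```

      Because no is taken from the observation number after sorting, it contains each value exactly once by construction, which runiformint() alone does not guarantee.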



      • #4
        I agree with Joro's interpretation of the problem, and his solution . . . with one caveat:
        Code:
        gen u = runiform()
        would be better off as
        Code:
        generate double u = runiform()
        which would reduce the chance of the following.

        .
        . version 16.1

        .
        . clear *

        .
        . set seed `=strreverse("1574573")'

        . quietly set obs 4280

        .
        . generate float u = runiform()

        . isid u

        .
        . forvalues i = 1/100 {
          2.     display in smcl as text `i'
          3.     quietly replace u = runiform()
          4.         isid u
          5. }
        1
        2
        3
        4
        5
        6
        7
        variable u does not uniquely identify the observations
        r(459);

        end of do-file

        r(459);

        .


        Chalk up yet another example arguing for this proposal to StataCorp.
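
        For what it is worth, the reason is storage precision: a float has a 24-bit significand and a double has 53 bits, so far fewer distinct values survive when runiform() is stored as a float. A rough sketch that counts how often ties occur under each storage type, assuming a hypothetical seed and the illustrative names uf, ud, ties_float, and ties_double:

        ```stata
        clear
        set seed 2020                    // hypothetical seed
        quietly set obs 4280
        generate float  uf = .
        generate double ud = .
        local ties_float  = 0
        local ties_double = 0
        forvalues i = 1/100 {
            quietly replace uf = runiform()
            quietly replace ud = runiform()
            capture isid uf              // _rc is nonzero if uf contains duplicates
            local ties_float = `ties_float' + (_rc != 0)
            capture isid ud
            local ties_double = `ties_double' + (_rc != 0)
        }
        display "draws with ties, float:  `ties_float' of 100"
        display "draws with ties, double: `ties_double' of 100"
        ```

        The counts depend on the seed and the Stata version, so none are quoted here.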



        • #5
          Thank you for all your responses. I will go through them and see whether they help. My apologies for holding back information you needed to understand what is supposed to happen. The sample has 42,000 observations and consists of ten subsamples, one per decile. For example, 4280 observations belong to decile one. Thus, Joro Kolev is right in the sense that the numbers 1 to 4280 should be created only for the observations of the subsample with Decile == 1, which is the if condition he asked about.

          This means that I need one variable that holds the numbers 1 to n, where n is the number of observations in the given decile:
          Decile Number
          1 1
          ... ...
          1 4280
          2 1
          ... ...
          2 4279
          3 1
          ... ...
          3 4285
          4 1
          ... ...
          4 4273
          5 1
          ... ...
          5 4280
          6 1
          ... ...
          6 4279
          and so on and so on
          The numbers need to be randomly ordered because afterwards I replace them with the values 0 to 3 according to a specific pattern, so that the replacement affects the observations at random. For example, in decile 1 the numbers 1 to 3766 (randomly placed over the subsample) are replaced with 0, the numbers 3767 to 4273 with 1, and so on.

          The example below shows the deciles. Each decile has ten observations, so within each decile the numbers 1 to 10 should be randomly assigned, all in a single variable.

          Code:
          * Example generated by -dataex-. To install: ssc install dataex
          clear
          input float Decile
           1
           1
           1
           1
           1
           1
           1
           1
           1
           1
           2
           2
           2
           2
           2
           2
           2
           2
           2
           2
           3
           3
           3
           3
           3
           3
           3
           3
           3
           3
           4
           4
           4
           4
           4
           4
           4
           4
           4
           4
           5
           5
           5
           5
           5
           5
           5
           5
           5
           5
           6
           6
           6
           6
           6
           6
           6
           6
           6
           6
           7
           7
           7
           7
           7
           7
           7
           7
           7
           7
           8
           8
           8
           8
           8
           8
           8
           8
           8
           8
           9
           9
           9
           9
           9
           9
           9
           9
           9
           9
          10
          10
          10
          10
          10
          10
          10
          10
          10
          10
          end

          I will look into your solutions now to see if they already help me with the issue. Thank you so much for responding and helping me.
          Last edited by Sebastian Schoen; 28 Sep 2020, 01:40.



          • #6
            I think gen n = _n does not work. Even if I use if conditions (if Decile == 1, ..., if Decile == 10), the numbering does not restart at 1 when the second, third, fourth, and later deciles are reached. I ran it with the conditions and it gave me numbers running over the total of 42,000 observations.

            Do you have an idea how I can do that? I appreciate your help!



            • #7
              Originally posted by Sebastian Schoen
              . . . restart with value 1 as soon as the second, third, fourth and so on decile is reached. . . . Do you have an idea how I can do that?
              Probably something like
              Code:
              bysort Decile: generate int n = _n
              Why not just sort the entire dataset on a random uniform variable, in the manner Joro showed (but stored as a double-precision variable)? You can then get both your deciles and the observation number without further ado.
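
              Putting the two pieces together, a minimal sketch on toy data shaped like the -dataex- example (ten deciles of ten observations each). The seed and the names id, u, and no are assumptions made for illustration.

              ```stata
              clear
              set seed 20200928                         // hypothetical seed
              set obs 100
              generate byte Decile = ceil(_n/10)        // toy data: 10 deciles, 10 observations each
              generate long id = _n                     // remember the original order
              generate double u = runiform()            // random sort key in double precision
              bysort Decile (u): generate int no = _n   // random order within each decile, restarting at 1
              sort id                                   // restore the original order
              drop u

              * checks: within each decile, no takes every value 1.._N exactly once
              isid Decile no
              bysort Decile (no): assert no == _n
              ```

              With real data only the two generate lines for u and no are needed. The deciles may have different sizes, because _n restarts at 1 and runs up to the group size within each by-group.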

