
  • interquartile regression

    Hi,
    I am having a problem with interquartile regression. Every time I rerun the analysis on the same data, I get a different p-value, which is quite frustrating. I was told to set the seed, but how do you know you are setting an appropriate seed? It is also not intuitive how to set the seed. The person from Stata who responded to my question suggested setting the seed, but the help files do not say anything about how to do it.

    Please help.

    -Cynthia
    Last edited by sladmin; 20 Aug 2018, 16:04. Reason: update username

  • #2
    Setting the seed is easy. Pick a positive integer, say 45982, and just run
    Code:
    set seed 45982
    As for how to pick a seed: it doesn't matter. Any number is as good as any other. All that matters is that you include the -set seed- command (or, some commands that rely on random numbers have a -seed()- option) once, and only once, in the do-file before you reach any commands that use random numbers. The purpose of doing that is so that if you need to re-run the analysis your results will be reproducible.
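
    A minimal sketch of that layout, assuming Stata's bundled auto dataset and an illustrative model (neither comes from the original question):
    Code:
    * top of the do-file: set the seed once, before any command that uses random numbers
    sysuse auto, clear
    set seed 45982
    * iqreg bootstraps its standard errors, so it draws on the random number generator
    iqreg price weight length, quantiles(.25 .75) reps(100)
    Re-running the whole do-file from the top should then reproduce the same standard errors and p-values.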

    Now, there is one other possibility for your non-reproducing results. It may have nothing to do with the random number generator. If you have any commands that explicitly or implicitly sort the data, and if the sort key variables do not uniquely identify observations in the data set, then the sort order is (partly) indeterminate, and in Stata the order of tied observations can differ each time you run the code. If subsequent to an indeterminate sort there are commands whose results depend on the order of the observations (e.g. if they use the first, or last, observation in a group), those results will differ from one occasion to the next. So scour your code for -sort- commands and for other commands that, in turn, sort the data themselves to see if you have this problem.
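
    One way to check whether a given sort key could be the culprit, sketched with the auto dataset and -foreign- standing in as a hypothetical sort key:
    Code:
    sysuse auto, clear
    * isid exits with an error if the listed variables do not uniquely identify observations
    capture isid foreign
    * a nonzero return code means a sort on foreign alone leaves ties in arbitrary order
    display _rc
    * show how many observations share each value of the key
    duplicates report foreign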

    If indeterminate sorting is your problem, there are three possible solutions:

    1. Add the -stable- option to the offending -sort- command(s): then the sort will preserve the existing order within groups corresponding to each combination of the sort key variables.

    2. Put a -set sortseed pick_a_positive_integer- command into the code, somewhere prior to any commands that sort the data. This will set the seed of the (separate) random number generator that Stata's -sort- command uses, and all sorts thereafter will be determinate (but would change if you used a different seed). As with -set seed-, any positive integer is as good as any other.

    3. Expand the sort key to include enough variables to uniquely identify observations in the data. That makes the sort order determinate.
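
    The three options might look like this in practice. This is only a sketch: the auto dataset, the key variables, and the seed value are assumptions chosen for illustration, and in a real do-file you would pick one option rather than all three.
    Code:
    sysuse auto, clear
    * 1. keep the existing order among observations tied on the sort key
    sort foreign, stable
    * 2. seed the sort's own random number generator before any sorting
    set sortseed 45982
    sort foreign
    * 3. extend the key until it is unique (make identifies each car in this dataset)
    isid foreign make
    sort foreign make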



    • #3
      Dear Cynthia Reed

      Just to add to Clyde's excellent advice, note that the p-value is computed by bootstrap, so it is not surprising to see small differences between runs. If the differences are not small, setting the seed will only disguise the problem; what you have to do is increase the number of bootstrap replications.

      Out of curiosity, could you please let us know why you are running such a regression?
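
      As a rough illustration, assuming the auto dataset and an arbitrary model, the number of replications is set with the -reps()- option of -iqreg- (the default is 20, which is quite small):
      Code:
      sysuse auto, clear
      set seed 45982
      * more bootstrap replications give more stable standard errors and p-values
      iqreg price weight length, quantiles(.25 .75) reps(1000)
      If the p-values still move noticeably when the seed is changed, that suggests the number of replications is still too low.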

      Best wishes,

      Joao
      Last edited by sladmin; 20 Aug 2018, 16:04. Reason: update username
