interquartile regression

Cynthia Reed

Join Date: Aug 2016

Posts: 1
#1

17 Aug 2018, 14:35

Hi,
I am having a problem with interquartile regression. Every time I rerun the analysis on the same data I get a different p-value. This is quite frustrating. I was told to set the seed, but how do you know you are setting an appropriate seed? It is also not intuitive how to set the seed. The help files say nothing about setting the seed as suggested by the person from Stata who responded to my question.

Please help.

-Cynthia

Clyde Schechter

Join Date: Apr 2014

Posts: 30064
#2

17 Aug 2018, 14:47

Setting the seed is easy. Pick a positive integer, say 45982, and just run

Code:

set seed 45982

As for how to pick a seed: it doesn't matter. Any number is as good as any other. All that matters is that you include the -set seed- command (or, some commands that rely on random numbers have a -seed()- option) once, and only once, in the do-file before you reach any commands that use random numbers. The purpose of doing that is so that if you need to re-run the analysis your results will be reproducible.
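As a rough illustration of where the command goes in a do-file, here is a minimal sketch. The auto dataset and the variables price, weight, and length are stand-ins chosen only for the example; substitute your own data and model.

```stata
* Sketch only: the auto dataset and these variables are placeholders
sysuse auto, clear

* Set the seed once, before any command that draws random numbers
set seed 45982

* iqreg bootstraps its standard errors, so it uses the random number generator
iqreg price weight length
```

Re-running this do-file from the top, with the same data in the same order, should reproduce the same standard errors and p-values.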

Now, there is one other possibility for your non-reproducing results. It may have nothing to do with the random number generator. If you have any commands that explicitly or implicitly sort the data, and if the sort key variables do not uniquely identify observations in the data set, then the sort order is (partly) indeterminate, and in Stata the results will be different each time you run it. If subsequent to an indeterminate sort there are commands whose results depend on the order of the observations (e.g. if they use the first, or last, observation in a group), those results will differ from one occasion to the next. So scour your code for -sort- commands and for other commands that, in turn, sort the data themselves to see if you have this problem.

If indeterminate sorting is your problem, there are three possible solutions:

1. Add the -stable- option to the offending -sort- command(s): then the sort will preserve the existing order within groups corresponding to each combination of the sort key variables.

2. Put a -set sortseed pick_a_positive_integer- command into the code, somewhere prior to any commands that sort the data. This will set the seed of the (separate) random number generator that Stata's -sort- command uses, and all sorts thereafter will be determinate (but would change if you used a different seed). As with -set seed-, any positive integer is as good as any other.

3. Expand the sort key to include enough variables to uniquely identify observations in the data. That makes the sort order determinate.
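The three options above can be sketched as follows. The auto dataset and the sort keys foreign and make are placeholders for illustration only, and -set sortseed- assumes a version of Stata recent enough to support it.

```stata
* Sketch only: the auto dataset and these sort keys are placeholders
sysuse auto, clear

* Diagnose: does the sort key uniquely identify observations?
capture isid foreign
display _rc      // a nonzero return code means ties exist on the sort key

* Option 1: keep the existing order within ties
sort foreign, stable

* Option 2: fix the seed of the generator that -sort- uses to break ties
set sortseed 45982
sort foreign

* Option 3: extend the sort key until it uniquely identifies observations
sort foreign make
isid foreign make      // no error here means the key is now unique
```

In practice you would pick one of the three; they are shown together here only for comparison.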
Joao Santos Silva

Join Date: Apr 2014

Posts: 3006
#3

17 Aug 2018, 17:57

Dear Cynthia Reed

Just to add to Clyde's excellent advice, note that the p-value is computed by bootstrap, so it is not surprising to see small differences between runs. If the differences are not small, setting the seed will only disguise the problem; what you have to do is increase the number of bootstrap replications.

Out of curiosity, could you please let us know why you are running such a regression?
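A minimal sketch of raising the number of replications with the -reps()- option of -iqreg- follows. The dataset, the variables, and the value 1000 are illustrative assumptions rather than recommendations; as far as I recall, the default is only 20 replications.

```stata
* Sketch only: the dataset, variables, and reps(1000) are illustrative choices
sysuse auto, clear
set seed 45982

* More bootstrap replications give more stable standard errors and p-values
iqreg price weight length, reps(1000) nodots
```

A simple check is to rerun with a different seed: if the p-values still move noticeably, the number of replications is probably still too small.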

Best wishes,

Joao
