How do I run a regression 1000 times and then save each of the coefficients and p-values?

Chris Sean

Join Date: Mar 2022
Posts: 18

24 Oct 2023, 13:49

First, I have a dataset (call it "count.dta") that looks like this:

Code:

input id    num_treat    num_control
1    2    2
6    1    2
end

Then I have another dataset (call it "fulldata.dta") that looks like this:

Code:

input id    gvkey    treatment    outcome_var    event_year    year    post    control_var
1    1004    0    4.20    2007    2005    0    1.55
1    1004    0    1.25    2007    2006    0    1.41
1    1004    0    1.38    2007    2007    1    1.47
1    1004    0    1.38    2007    2008    1    1.65
1    1004    0    2.24    2007    2009    1    1.73
1    1008    0    1.47    2007    2005    0    1.72
1    1008    0    2.04    2007    2006    0    0.58
1    1008    0    2.03    2007    2007    1    0.82
1    1008    0    1.35    2007    2008    1    0.98
1    1008    0    1.90    2007    2009    1    0.79
1    2013    0    1.12    2007    2005    0    0.95
1    2013    0    0.00    2007    2006    0    0.79
1    2013    0    0.00    2007    2007    1    0.75
1    2013    0    0.00    2007    2008    1    1.06
1    2013    0    0.00    2007    2009    1    1.03
1    4055    0    0.00    2007    2005    0    0.75
1    4055    0    0.00    2007    2006    0    0.77
1    4055    0    0.00    2007    2007    1    0.78
1    4055    0    0.00    2007    2008    1    0.74
1    4055    0    0.00    2007    2009    1    0.56
1    7544    1    1.82    2007    2005    0    0.62
1    7544    1    1.67    2007    2006    0    0.63
1    7544    1    4.00    2007    2007    1    0.73
1    7544    1    3.17    2007    2008    1    0.78
1    7544    1    3.85    2007    2009    1    0.98
1    8922    1    2.70    2007    2005    0    0.77
1    8922    1    1.89    2007    2006    0    0.62
1    8922    1    1.25    2007    2007    1    0.95
1    8922    1    1.28    2007    2008    1    1.28
1    8922    1    1.56    2007    2009    1    0.81
1    10334    1    1.52    2007    2005    0    1.09
1    10334    1    1.72    2007    2006    0    1.52
1    10334    1    2.00    2007    2007    1    0.30
1    10334    1    1.85    2007    2008    1    0.19
1    10334    1    1.96    2007    2009    1    0.88
6    1008    0    0.98    2014    2012    0    0.87
6    1008    0    1.45    2014    2013    0    0.36
6    1008    0    1.41    2014    2014    1    1.02
6    1008    0    1.49    2014    2015    1    1.26
6    1008    0    1.56    2014    2016    1    1.44
6    2742    0    1.64    2014    2012    0    0.18
6    2742    0    1.39    2014    2013    0    0.14
6    2742    0    1.12    2014    2014    1    0.18
6    2742    0    1.09    2014    2015    1    0.12
6    2742    0    1.37    2014    2016    1    0.10
6    6342    0    1.35    2014    2012    0    0.15
6    6342    0    2.63    2014    2013    0    0.06
6    6342    0    2.67    2014    2014    1    0.05
6    6342    0    2.67    2014    2015    1    0.07
6    6342    0    2.56    2014    2016    1    0.94
6    10334    1    2.63    2014    2012    0    0.94
6    10334    1    2.60    2014    2013    0    0.97
6    10334    1    1.39    2014    2014    1    0.99
6    10334    1    0.00    2014    2015    1    0.95
6    10334    1    0.00    2014    2016    1    1.09
6    74232    1    0.00    2014    2012    0    1.01
6    74232    1    2.78    2014    2013    0    1.04
6    74232    1    0.00    2014    2014    1    1.08
6    74232    1    0.00    2014    2015    1    0.11
6    74232    1    0.00    2014    2016    1    0.62
6    80892    1    0.00    2014    2012    0    0.29
6    80892    1    1.89    2014    2013    0    0.10
6    80892    1    1.89    2014    2014    1    0.15
6    80892    1    0.46    2014    2015    1    0.06
6    80892    1    0.51    2014    2016    1    0.05
end

In "count.dta", for each id, num_treat and num_control give the number of treatment and control gvkeys I want to randomly select (without replacement) from "fulldata.dta".

Let me give an example. Look at the first row of "count.dta". We have id=1 and num_treat=2, which means that from the "fulldata.dta" dataset (within id=1), I want to randomly select two gvkeys with treatment=1. There are three gvkeys to choose from (7544, 8922, and 10334), and I want to pick two of them randomly without replacement. Say I pick gvkeys 8922 and 10334. Similarly, num_control=2 means I want to randomly select two gvkeys with treatment=0 from "fulldata.dta". There are four gvkeys to choose from (1004, 1008, 2013, and 4055), again without replacement. Say I pick 1004 and 2013.

For id=6, we follow exactly the same procedure: num_treat=1 means we pick ONE gvkey randomly from "fulldata.dta" with treatment=1, and num_control=2 means we pick TWO gvkeys randomly from "fulldata.dta" with treatment=0 (without replacement). Say I randomly picked gvkey=80892 for treatment=1 and gvkey=2742 and 6342 for treatment=0.

This means the random sample I have constructed should look as follows (call this dataset "randomselect.dta"):

Code:

input id    gvkey    treatment    outcome_var    event_year    year    post    control_var
1    1004    0    4.20    2007    2005    0    1.55
1    1004    0    1.25    2007    2006    0    1.41
1    1004    0    1.38    2007    2007    1    1.47
1    1004    0    1.38    2007    2008    1    1.65
1    1004    0    2.24    2007    2009    1    1.73
1    2013    0    1.12    2007    2005    0    0.95
1    2013    0    0.00    2007    2006    0    0.79
1    2013    0    0.00    2007    2007    1    0.75
1    2013    0    0.00    2007    2008    1    1.06
1    2013    0    0.00    2007    2009    1    1.03
1    8922    1    2.70    2007    2005    0    0.77
1    8922    1    1.89    2007    2006    0    0.62
1    8922    1    1.25    2007    2007    1    0.95
1    8922    1    1.28    2007    2008    1    1.28
1    8922    1    1.56    2007    2009    1    0.81
1    10334    1    1.52    2007    2005    0    1.09
1    10334    1    1.72    2007    2006    0    1.52
1    10334    1    2.00    2007    2007    1    0.30
1    10334    1    1.85    2007    2008    1    0.19
1    10334    1    1.96    2007    2009    1    0.88
6    2742    0    1.64    2014    2012    0    0.18
6    2742    0    1.39    2014    2013    0    0.14
6    2742    0    1.12    2014    2014    1    0.18
6    2742    0    1.09    2014    2015    1    0.12
6    2742    0    1.37    2014    2016    1    0.10
6    6342    0    1.35    2014    2012    0    0.15
6    6342    0    2.63    2014    2013    0    0.06
6    6342    0    2.67    2014    2014    1    0.05
6    6342    0    2.67    2014    2015    1    0.07
6    6342    0    2.56    2014    2016    1    0.94
6    80892    1    0.00    2014    2012    0    0.29
6    80892    1    1.89    2014    2013    0    0.10
6    80892    1    1.89    2014    2014    1    0.15
6    80892    1    0.46    2014    2015    1    0.06
6    80892    1    0.51    2014    2016    1    0.05
end

Then I want to run the following regression on the above randomly selected sample:

Code:

egen firm_evt = group(gvkey event_year)
egen year_evt = group(year event_year)

reghdfe outcome_var  ib0.treatment##ib0.post control_var, absorb(firm_evt year_evt) vce(cluster gvkey)

One should get the following output:

[Image: output.png]

What I want to do is save the coefficient on treatment#post, which is 0.1613656, along with its t-statistic and p-value, which are 0.43 and 0.685.

Then I want to repeat the above process 1000 times, saving the coefficient, t-statistic, and p-value from each random sample in a dataset. Could someone help me code up this process? I am very new to Stata. Thank you.

Last edited by Chris Sean; 24 Oct 2023, 13:53.

Rich Goldstein

Join Date: Mar 2014

Posts: 4465
#2

24 Oct 2023, 13:53

Code:

h statsby
h runby

Note that -runby- is user-written and must be installed (it is available from SSC: -ssc install runby-). You should be able to do that from the attempt to find the help file; otherwise, use -search- to find, download, and install it.

Chris Sean

Join Date: Mar 2022

Posts: 18
#3

24 Oct 2023, 14:25

Thanks for that, but how do I incorporate those two commands into my process? E.g., I'm not sure how to code up the random sample selection part described above.

George Ford

Join Date: Aug 2014

Posts: 3152
#4

24 Oct 2023, 14:58

asreg will eat it up

Chris Sean

Join Date: Mar 2022

Posts: 18
#5

24 Oct 2023, 15:44

I need to use reghdfe because of clustering adjustments with interacted fixed effects. I guess my first question is how do I randomly pick the sample using the count.dta dataset?

George Ford

Join Date: Aug 2014

Posts: 3152
#6

24 Oct 2023, 15:58

how big is count.dta?

Chris Sean

Join Date: Mar 2022

Posts: 18
#7

24 Oct 2023, 16:11

The full count.dta data has 1362 rows, i.e., 1362 unique ids. But I just wanted to test it out on the dummy data I posted.

Clyde Schechter

Join Date: Apr 2014
Posts: 30101

24 Oct 2023, 16:30

Here's an illustration of the overall flow of the approach. You will need to modify it to run the actual regressions you want and to save the actual results you want.

Code:

clear*
input id    num_treat    num_control
1    2    2
6    1    2
end

frame create fulldata
frame change fulldata
input id    gvkey    treatment    outcome_var    event_year    year    post    control_var
1    1004    0    4.20    2007    2005    0    1.55
1    1004    0    1.25    2007    2006    0    1.41
1    1004    0    1.38    2007    2007    1    1.47
1    1004    0    1.38    2007    2008    1    1.65
1    1004    0    2.24    2007    2009    1    1.73
1    1008    0    1.47    2007    2005    0    1.72
1    1008    0    2.04    2007    2006    0    0.58
1    1008    0    2.03    2007    2007    1    0.82
1    1008    0    1.35    2007    2008    1    0.98
1    1008    0    1.90    2007    2009    1    0.79
1    2013    0    1.12    2007    2005    0    0.95
1    2013    0    0.00    2007    2006    0    0.79
1    2013    0    0.00    2007    2007    1    0.75
1    2013    0    0.00    2007    2008    1    1.06
1    2013    0    0.00    2007    2009    1    1.03
1    4055    0    0.00    2007    2005    0    0.75
1    4055    0    0.00    2007    2006    0    0.77
1    4055    0    0.00    2007    2007    1    0.78
1    4055    0    0.00    2007    2008    1    0.74
1    4055    0    0.00    2007    2009    1    0.56
1    7544    1    1.82    2007    2005    0    0.62
1    7544    1    1.67    2007    2006    0    0.63
1    7544    1    4.00    2007    2007    1    0.73
1    7544    1    3.17    2007    2008    1    0.78
1    7544    1    3.85    2007    2009    1    0.98
1    8922    1    2.70    2007    2005    0    0.77
1    8922    1    1.89    2007    2006    0    0.62
1    8922    1    1.25    2007    2007    1    0.95
1    8922    1    1.28    2007    2008    1    1.28
1    8922    1    1.56    2007    2009    1    0.81
1    10334    1    1.52    2007    2005    0    1.09
1    10334    1    1.72    2007    2006    0    1.52
1    10334    1    2.00    2007    2007    1    0.30
1    10334    1    1.85    2007    2008    1    0.19
1    10334    1    1.96    2007    2009    1    0.88
6    1008    0    0.98    2014    2012    0    0.87
6    1008    0    1.45    2014    2013    0    0.36
6    1008    0    1.41    2014    2014    1    1.02
6    1008    0    1.49    2014    2015    1    1.26
6    1008    0    1.56    2014    2016    1    1.44
6    2742    0    1.64    2014    2012    0    0.18
6    2742    0    1.39    2014    2013    0    0.14
6    2742    0    1.12    2014    2014    1    0.18
6    2742    0    1.09    2014    2015    1    0.12
6    2742    0    1.37    2014    2016    1    0.10
6    6342    0    1.35    2014    2012    0    0.15
6    6342    0    2.63    2014    2013    0    0.06
6    6342    0    2.67    2014    2014    1    0.05
6    6342    0    2.67    2014    2015    1    0.07
6    6342    0    2.56    2014    2016    1    0.94
6    10334    1    2.63    2014    2012    0    0.94
6    10334    1    2.60    2014    2013    0    0.97
6    10334    1    1.39    2014    2014    1    0.99
6    10334    1    0.00    2014    2015    1    0.95
6    10334    1    0.00    2014    2016    1    1.09
6    74232    1    0.00    2014    2012    0    1.01
6    74232    1    2.78    2014    2013    0    1.04
6    74232    1    0.00    2014    2014    1    1.08
6    74232    1    0.00    2014    2015    1    0.11
6    74232    1    0.00    2014    2016    1    0.62
6    80892    1    0.00    2014    2012    0    0.29
6    80892    1    1.89    2014    2013    0    0.10
6    80892    1    1.89    2014    2014    1    0.15
6    80892    1    0.46    2014    2015    1    0.06
6    80892    1    0.51    2014    2016    1    0.05
end

frame change default

capture program drop one_sample
program define one_sample
    local n_treat = num_treat[1]
    local n_ctrl = num_control[1]
    frame fulldata {
        frame put gvkey treatment, into(selections)
        frame selections {
            duplicates drop
            gen double shuffle = runiform()
            gen keepers = cond(treatment, `n_treat', `n_ctrl')
            sort treatment shuffle
            by treatment, sort: keep if _n <= keepers
            drop keepers shuffle
        }
        frlink m:1 gvkey, frame(selections)
        frame drop selections
        frame put _all if !missing(selections), into(sample)
        frame sample {
            regress outcome_var i.treatment control_var i.year
            matrix M = r(table)
        }
        drop selections
    }
    gen b_treatment = M["b", "1.treatment"]
    gen p_treatment = M["pvalue", "1.treatment"]
    frame drop sample
    exit
end

set seed 1234 // OR YOUR FAVORITE RANDOM NUMBER SEED
runby one_sample, by(id) status

The regression and the commands that store its results (italicized in the original post) are just placeholders to illustrate the approach. Replace them with the actual regression you want to run and the actual results you want to store. Everything else should be left as is.

Note: Both of your data sets have a variable called id. It's not clear to me what purpose it serves in either data set. The code above assumes that in the "count" data set it uniquely identifies observations. If that's not true, then you need to create a new variable in the count data set that does uniquely identify observations and use that variable, not id, in the -by()- option of the -runby- command.

Chris Sean

Join Date: Mar 2022

Posts: 18
#9

24 Oct 2023, 16:45

Thank you Clyde, I will have a look and test your sample code now. Just to clarify, the id in count and fulldata identifies the group of gvkeys to randomly select from. For example, id=1 has num_treat=2 and num_control=2 in count, which means that in fulldata we only look at the gvkeys with id=1, that is, gvkey=1004, 1008, 2013, 4055, 7544, 8922, and 10334. Within this set of gvkeys, num_treat=2 means we need to pick two gvkeys with treatment=1, and there are three options: gvkey=7544, 8922, and 10334. Likewise, num_control=2 means we look at all the gvkeys with treatment=0 (within id=1) and pick two at random, and there are four options: gvkey=1004, 1008, 2013, and 4055.

The id variable is important in linking count and fulldata because it tells you how many treatment=1 and treatment=0 gvkeys to randomly pick WITHIN each id. I'm sorry if I made it too confusing... but does this make sense?

Clyde Schechter

Join Date: Apr 2014
Posts: 30101

#10

24 Oct 2023, 17:13

I see. The code I offered does not do that; it selects gvkeys without regard to the corresponding id variable. With slight modification, though, it can do what you are looking for:

Code:

clear*
capture program drop one_sample
program define one_sample
    local n_treat = num_treat[1]
    local n_ctrl = num_control[1]
    local id = id[1]
    frame fulldata {
        frame put gvkey treatment if id == `id', into(selections)
        frame selections {
            duplicates drop
            gen double shuffle = runiform()
            gen keepers = cond(treatment, `n_treat', `n_ctrl')
            sort treatment shuffle
            by treatment, sort: keep if _n <= keepers
            drop keepers shuffle
        }
        frlink m:1 gvkey, frame(selections)
        frame drop selections
        frame put _all if !missing(selections) & id == `id', into(sample)
        frame sample {
            regress outcome_var i.treatment control_var i.year
            matrix M = r(table)
        }
        drop selections
    }
    gen b_treatment = M["b", "1.treatment"]
    gen p_treatment = M["pvalue", "1.treatment"]
    frame drop sample
    exit
end

set seed 1234 // OR YOUR FAVORITE RANDOM NUMBER SEED
frame create fulldata
frame fulldata {
    use fulldata
}
use count_data
gen `c(obs_t)' sample_num = _n
runby one_sample, by(sample_num) status

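The changes, shown in bold face in the original post, include the new local macro id, the id == `id' conditions on the two -frame put- commands, and the setup at the end, where sample_num replaces id in the -by()- option of -runby-.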

Chris Sean

Join Date: Mar 2022

Posts: 18
#11

24 Oct 2023, 21:24

Hi Clyde, thank you for that! The code almost works, but I actually wanted the regression to be run on the FULL sample across all ids, and then to repeat this process 1000 times. Right now, the code runs the regression on each id separately. Specifically, what I envisioned was that after you pick the random selection of gvkeys for id=1 and id=6 (and potentially more ids in the complete dataset that I have), these datasets are "stacked" together to form one big dataset. Then you run:

Code:

regress outcome_var i.treatment control_var i.year

or whatever regression that I want and then save the results ("b" and "pvalue" in your example) for this iteration. Then you redo the entire process again, i.e., go back to id=1, pick the random gvkeys, same for id=6, and then stack the dataset together, run the same regression and save the same output, then repeat this 1000 times. So at the end, one should have an output like this:

where replication_num lists the iteration of the replication, and the other columns are the saved outputs from that iteration.

Thank you so much, I really appreciate your time and effort on this.

Clyde Schechter

Join Date: Apr 2014

Posts: 30101
#12

25 Oct 2023, 11:14

Oh, that's actually much simpler than what I had understood you to want.

Code:

set seed 1234 // OR YOUR FAVORITE RANDOM NUMBER SEED
local reps 10
frame create results int replication_num float(b pvalue)

use countdata, clear
rename (num_control num_treat) num#, addnumber(0)
reshape long num, i(id) j(treatment)
merge 1:m id treatment using fulldata, assert(match) nogenerate
gen double shuffle = .
gen byte selected = .

forvalues i = 1/`reps' {
    quietly {
        replace shuffle = runiform()
        by id treatment (shuffle), sort: replace selected = _n <= num
        regress outcome_var i.treatment control_var i.year if selected
        matrix M = r(table)
        frame post results (`i') (M["b", "1.treatment"]) (M["pvalue", "1.treatment"])
    }
}

Notes:
1. In your example data, this particular regression produces erratic results. The problem arises because the number of years that get included in the sample varies from one replication to the next. When the number of years is large, the model df is too large for the tiny sample size and so you get no p-value, only a coefficient. Presumably this will not happen in the full data. However, depending on the circumstances, you might want to give some thought to revising the scheme to assure uniform representation of the years in all samples. I don't know if that's actually important for your research question(s) or not, but if it is, that will require some modification to the programming.

2. If your data set is very large, (order of magnitude 50,000,000 observations or more) you should create two shuffle variables, say shuffle1 and shuffle2. Each should be updated to runiform() in every iteration of the loop, and the -replace selected- command should be prefixed with -by id treatment (shuffle1 shuffle2), sort:-.

3. I have put -quietly- around the loop body on the assumption that you don't want to see the full output from many -replace-s and 1,000 regressions in the Results window and your log file. However, with 1,000 replications, you might want some kind of progress report while the loop is running so you don't have to wonder whether Stata has hung. So you could put something like

Code:

if mod(`i', 50) == 0 {
    display `"`i' replications completed"'
}

after the quietly block so you get an update every 50 reps.
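As for note 2, the two-key shuffle could be sketched as follows. This is only a minimal, self-contained illustration on made-up toy data: the 12 observations and num = 2 are hypothetical, and the comment about ties is my reading of why two keys help, not something stated in the note.

```stata
* Toy data: two ids, each with 3 treated and 3 control observations
clear
set seed 1234
set obs 12
gen byte id = ceil(_n/6)
gen byte treatment = mod(_n, 2)
gen byte num = 2                  // hypothetical: select 2 per id-treatment cell

* Two random sort keys instead of one; presumably this makes ties in the
* sort order (and hence an arbitrary ordering) vanishingly unlikely
gen double shuffle1 = runiform()
gen double shuffle2 = runiform()
by id treatment (shuffle1 shuffle2), sort: gen byte selected = _n <= num

* Check: exactly num observations are selected in every id-treatment cell
by id treatment: egen n_selected = total(selected)
assert n_selected == num
tabulate id treatment if selected
```

Inside the replication loop, both shuffle variables would be refreshed with -replace ... = runiform()- on every pass, and -gen byte selected- would become -replace selected-, as in the code above.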

Chris Sean

Join Date: Mar 2022

Posts: 18
#13

25 Oct 2023, 11:53

Thanks Clyde. I just tested your new code; however, it seems to randomly pick individual observations rather than a whole "bunch". What I wanted was to randomly pick unique gvkeys but include all of the years associated with each picked gvkey in that id. For example, for id=1, say I randomly picked gvkey=1004 and gvkey=2013 for treatment=0 (represented by num_control=2 for id=1) and gvkey=8922 and gvkey=10334 for treatment=1 (represented by num_treat=2 for id=1). Then for id=6, say I randomly picked gvkey=2742 and gvkey=6342 for treatment=0 (represented by num_control=2 for id=6) and gvkey=80892 for treatment=1 (represented by num_treat=1 for id=6). Then I want to keep all of the years associated with the gvkeys that were randomly picked and put them ALL together as follows:

Code:

input id    gvkey    treatment    outcome_var    event_year    year    post    control_var
1    1004    0    4.20    2007    2005    0    1.55
1    1004    0    1.25    2007    2006    0    1.41
1    1004    0    1.38    2007    2007    1    1.47
1    1004    0    1.38    2007    2008    1    1.65
1    1004    0    2.24    2007    2009    1    1.73
1    2013    0    1.12    2007    2005    0    0.95
1    2013    0    0.00    2007    2006    0    0.79
1    2013    0    0.00    2007    2007    1    0.75
1    2013    0    0.00    2007    2008    1    1.06
1    2013    0    0.00    2007    2009    1    1.03
1    8922    1    2.70    2007    2005    0    0.77
1    8922    1    1.89    2007    2006    0    0.62
1    8922    1    1.25    2007    2007    1    0.95
1    8922    1    1.28    2007    2008    1    1.28
1    8922    1    1.56    2007    2009    1    0.81
1    10334    1    1.52    2007    2005    0    1.09
1    10334    1    1.72    2007    2006    0    1.52
1    10334    1    2.00    2007    2007    1    0.30
1    10334    1    1.85    2007    2008    1    0.19
1    10334    1    1.96    2007    2009    1    0.88
6    2742    0    1.64    2014    2012    0    0.18
6    2742    0    1.39    2014    2013    0    0.14
6    2742    0    1.12    2014    2014    1    0.18
6    2742    0    1.09    2014    2015    1    0.12
6    2742    0    1.37    2014    2016    1    0.10
6    6342    0    1.35    2014    2012    0    0.15
6    6342    0    2.63    2014    2013    0    0.06
6    6342    0    2.67    2014    2014    1    0.05
6    6342    0    2.67    2014    2015    1    0.07
6    6342    0    2.56    2014    2016    1    0.94
6    80892    1    0.00    2014    2012    0    0.29
6    80892    1    1.89    2014    2013    0    0.10
6    80892    1    1.89    2014    2014    1    0.15
6    80892    1    0.46    2014    2015    1    0.06
6    80892    1    0.51    2014    2016    1    0.05
end

Then the regression is run on the above dataset and outputs are saved. This is counted as one iteration. Then the above process repeats for n iterations.

Also note that each gvkey picked must be unique within that id (i.e., without replacement). For example, for id=1 and num_treat=2, I can't pick gvkey=8922 twice: once 8922 is chosen, I must pick either 7544 or 10334 as the other treatment gvkey. The same goes for num_control=2.

Hopefully this makes more sense now.

Last edited by Chris Sean; 25 Oct 2023, 12:00.

Clyde Schechter

Join Date: Apr 2014
Posts: 30101

#14

25 Oct 2023, 12:58

OK. I hope we are on the same page about what is wanted now.

Code:

clear*
set seed 1234 // OR YOUR FAVORITE RANDOM NUMBER SEED
local reps 1000
frame create results int replication_num float(b pvalue)

use countdata, clear
rename (num_control num_treat) num#, addnumber(0)
reshape long num, i(id) j(treatment)
merge 1:m id treatment using fulldata, assert(match) nogenerate
frame put id treatment num gvkey, into(selections)
frame selections {
    duplicates drop
    gen double shuffle = runiform()
    gen byte selected = .
    isid id gvkey, sort
}
frlink m:1 id gvkey, frame(selections)

forvalues i = 1/`reps' {
    frame selections {
        quietly replace shuffle = runiform()
        quietly by id treatment (shuffle), sort: replace selected = _n <= num
    }
    quietly {
        frlink rebuild selections, frame(selections)
        frget selected, from(selections)
        regress outcome_var i.treatment control_var i.year if selected
        matrix M = r(table)
        frame post results (`i') (M["b", "1.treatment"]) (M["pvalue", "1.treatment"])
        drop selected
    }
}

The results you want are in frame results.
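To run the reghdfe model from the original question instead of the placeholder regression, the loop could be adapted roughly as follows. This is a minimal sketch under assumptions, not tested on the full data: reghdfe is user-written and assumed to be installed from SSC, the input files are assumed to be named countdata.dta and fulldata.dta as in the code above, sim_results is a hypothetical output file name, and the t-statistic and p-value are computed by hand from _b[], _se[], and e(df_r) rather than read from r(table).

```stata
clear*
set seed 1234 // OR YOUR FAVORITE RANDOM NUMBER SEED
local reps 1000
frame create results int replication_num double(b t pvalue)

use countdata, clear
rename (num_control num_treat) num#, addnumber(0)
reshape long num, i(id) j(treatment)
merge 1:m id treatment using fulldata, assert(match) nogenerate

* The fixed-effect groups do not depend on the random draw, so build them once
egen firm_evt = group(gvkey event_year)
egen year_evt = group(year event_year)

* One row per id-gvkey pair; the shuffling happens at this level
frame put id treatment num gvkey, into(selections)
frame selections {
    duplicates drop
    gen double shuffle = .
    gen byte selected = .
    isid id gvkey, sort
}
frlink m:1 id gvkey, frame(selections)

forvalues i = 1/`reps' {
    frame selections {
        quietly replace shuffle = runiform()
        quietly by id treatment (shuffle), sort: replace selected = _n <= num
    }
    quietly {
        frlink rebuild selections, frame(selections)
        frget selected, from(selections)
        reghdfe outcome_var ib0.treatment##ib0.post control_var if selected, ///
            absorb(firm_evt year_evt) vce(cluster gvkey)
        * Coefficient, t-statistic, and two-sided p-value of the interaction
        local b = _b[1.treatment#1.post]
        local t = `b' / _se[1.treatment#1.post]
        local p = 2 * ttail(e(df_r), abs(`t'))
        frame post results (`i') (`b') (`t') (`p')
        drop selected
    }
    if mod(`i', 50) == 0 {
        display `"`i' replications completed"'
    }
}

* Hypothetical file name; writes one row per replication to disk
frame results: save sim_results, replace
```

If a replication cannot estimate the interaction (for example, because it is omitted in a very small sample), the standard error is zero and the t-statistic and p-value come out as missing, so it is worth checking the results frame for missing values before summarizing.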

Chris Sean

Join Date: Mar 2022

Posts: 18
#15

25 Oct 2023, 13:31

This is perfect Clyde, thank you so much. I learnt so much as well, especially using frame which I had never used before. Thanks again!