
  • Multiple regressions in a loop and storing regression coefficients

    Hi all,

    For my master's thesis I am working with dynamic panel data with small T (13) and large N (20,000). I am estimating the long-term price elasticity of energy consumption using the Arellano-Bond estimator in xtabond2.
    What I want to investigate is whether there is heteroscedasticity with respect to these elasticities. To do this, I want to draw a random sample from my dataset, run the regression, calculate the long-term elasticity, store it somewhere(?), and repeat this many (100+) times.
    I then want to further inspect the samples that produced the maximum and minimum elasticity. I tried the following code:

    Code:
    forval i = 1/3 {
        set seed `i'
        tempfile holding
        save `holding'
        keep id
        duplicates drop
        sample 200, count
        merge 1:m id using `holding', assert (match using) keep(match) nogenerate
        sort id year
        xtabond2 consumption consumptionL1 consumptionL2 gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays, gmm(consumptionL1) iv(gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays) nolevel robust small
        matrix elast = (_b[gas] + _b[gasL1] + _b[gasL2])/(1-_b[consumptionL1] - _b[consumptionL2])
    }
    Indeed, three regressions are run, but they seem to be identical, so I guess something must be wrong with how I set the seed?
    Also, I cannot find how to inspect the elasticity values that I stored as a matrix. Should I store the results differently?

    Any help is welcome!

    Kind regards,


    Hein Willems

  • #2
    Nothing is wrong with the way you set the seed, but after you run the first regression, you don't bring the original data back in. Instead you "resample" the first sample. What you need to do is revise the management of the `holding' file. Moreover, there is no need to reset the random number seed on each iteration of the loop. You will do just as well setting it once at the top of the code. The random number generator will just keep going through its sequence on subsequent iterations: it will not reset itself and repeat what it did the first time.
    Code:
    tempfile holding
    save `holding'
    set seed 1234
    
    forval i = 1/3 {
        use `holding', clear
        keep id
        duplicates drop
        sample 200, count
        merge 1:m id using `holding', assert (match using) keep(match) nogenerate
        sort id
        xtabond2 consumption consumptionL1 consumptionL2 gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays, gmm(consumptionL1) iv(gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays) nolevel robust small
        matrix elast = (_b[gas] + _b[gasL1] + _b[gasL2])/(1-_b[consumptionL1] - _b[consumptionL2])
    }
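
    On inspecting the stored values: as written, elast is overwritten on every pass, so only the last iteration survives. A minimal sketch of one way to keep all of them, assuming the Grunfeld example data and -xtreg, fe- as stand-ins for your data and -xtabond2- (the variables and the long-run ratio here are purely illustrative, not your model):

    ```stata
    * Illustration only: Grunfeld data and -xtreg, fe- stand in for the
    * thread's data and -xtabond2-.
    clear all
    webuse grunfeld

    tempfile holding
    save `holding'
    set seed 1234

    forval i = 1/3 {
        use `holding', clear
        keep company
        duplicates drop
        sample 5, count
        merge 1:m company using `holding', assert(match using) keep(match) nogenerate
        xtset company year
        quietly xtreg invest L1.invest mvalue, fe
        * Append this iteration's long-run ratio as a new row instead of overwriting
        matrix elast = nullmat(elast) \ (_b[mvalue] / (1 - _b[L1.invest]))
    }

    matrix list elast

    * Turn the matrix into a data set to find the extreme iterations
    clear
    svmat elast
    generate iteration = _n
    sort elast1
    list iteration elast1
    ```

    After sorting, the first and last rows of the listing identify the iterations with the smallest and largest values.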



    • #3
      Clyde Schechter, thank you so much, it works perfectly.
      Is there also a way to see which sample is used in a given iteration? For example, if the results show that the 5th iteration gave the largest elasticity, I would like to know which IDs belong to that sample. What would be a way to do this?

      Thanks in advance!

      Hein Willems



      • #4
        You can build up a matrix containing the ids from each sample as you iterate through the loop. The code below ends up by listing that matrix, but you can also use the -svmat- command to turn it into a data set. See -help svmat- if you want to do that.

        Code:
        tempfile holding
        save `holding'
        set seed 1234
        
        forval i = 1/3 {
            use `holding', clear
            keep id
            duplicates drop
            sample 200, count
            sort id
            mkmat id, matrix(current_sample)
            matname current_sample sample`i', explicit columns(1)
            matrix all_samples = nullmat(all_samples), current_sample
            merge 1:m id using `holding', assert (match using) keep(match) nogenerate
            sort id
            xtabond2 consumption consumptionL1 consumptionL2 gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays, gmm(consumptionL1) iv(gas gasL1 gasL2 gdp gdpL1 gdpL2 heatdays) nolevel robust small
            matrix elast = (_b[gas] + _b[gasL1] + _b[gasL2])/(1-_b[consumptionL1] - _b[consumptionL2])
        }
        
        matrix list all_samples
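
        If you want the -svmat- route, here is a minimal sketch, assuming the Grunfeld example data with company in place of your id and 5 panels per sample. The regression step is left out, and I use -matrix colnames- rather than -matname- to label the columns; either should do.

        ```stata
        * Illustration only: Grunfeld data, with company standing in for id.
        clear all
        webuse grunfeld

        tempfile holding
        save `holding'
        set seed 1234

        forval i = 1/3 {
            use `holding', clear
            keep company
            duplicates drop
            sample 5, count
            sort company
            mkmat company, matrix(current_sample)
            * Label the column so it becomes the variable name later
            matrix colnames current_sample = sample`i'
            matrix all_samples = nullmat(all_samples), current_sample
        }

        * One variable per iteration (sample1, sample2, sample3), one row per sampled panel
        clear
        svmat all_samples, names(col)
        list
        ```

        Each variable then holds the ids drawn in that iteration, so -list sample5- (in a longer run) would show the panels behind the 5th result.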



        • #5
          Clyde Schechter, thanks for your help some time ago.

          I am now using the above code again, but to bootstrap standard errors I now want to sample with replacement.
          I thought this would be an easy adaptation, but I don't seem to be able to figure it out. Do you know how to resolve this?

          Thanks in advance,

          Hein Willems



          • #6
            Please show the exact code you are trying and the exact Results that Stata is giving you. Also, please show example data, using the -dataex- command.



            • #7
              Dear Clyde Schechter ,

              I was trying to adapt the code above by replacing sample with bsample (as this is said to sample with replacement). However, when I run it I get an error saying that the option count is not allowed. Removing count then gives a new error.
              This is my code:

              Code:
              clear all
              
              
              cd "C:\Users\wille\OneDrive\Afstuderen\Stata\data_enexis_1500"
              insheet using "quantile14.csv", comma clear
              
              format year %ty
              encode postcode, generate(id)
              egen newid = group(id)
              global id id
              global year year
              sort $id $year
              xtset $id $year
              
              matrix elast = (.)
              
              tempfile holding
              save `holding'
              set seed 1234
              
              forval i = 1/3 {
                  use `holding', clear
                  keep id
                  duplicates drop
                  bsample 200, count
                  merge 1:m id using `holding', assert (match using) keep(match) nogenerate
                  sort id
                  xtdpdgmm L(0/2).consumption L(0/2).gas L(0/2).gdp heatdays, model(diff) gmm(    consumption, lag(1 .)) gmm(gas, lag(1 .)) gmm(gdp, lag(1 .)) gmm(heatdays, lag(1 .)) two vce(r) overid
                  matrix elast = (elast \ (_b[gas] + _b[L1.gas] + _b[L2.gas])/(1-_b[L1.consumption] - _b[L2.consumption]))
              }
              I also tried to show my data using dataex, but the input statement exceeds the linesize limit.

              Thanks in advance,

              Hein Willems



              • #8
                The problem is that you are not using -bsample- correctly. You need to read the help file and manual section on -bsample- before proceeding. It is a more complicated command than -sample-, and its syntax and semantics are different. I don't know enough about your project, and know nothing at all about -xtdpdgmm- to fully correct your code. But it is a fair bet that you are dealing here with panel data and that you will want to sample whole panels, not randomly selected observations in panels, especially since randomly selected observations in panels will result in the lagged values you mention being mostly missing!

                Here is an example, using the online -grunfeld- data set, of bootstrapping a fixed effects regression by sampling 5 panels (companies) with replacement and creating a matrix containing an expression calculated from the regression coefficients:
                Code:
                clear*
                
                webuse grunfeld
                
                tempfile holding
                save `holding'
                set seed 1234
                
                forval i = 1/3 {
                    use `holding', clear
                    bsample 5, cluster(company) idcluster(new_id)
                    xtset new_id year
                    xtreg mvalue kstock L1.invest, fe
                    matrix elast = nullmat(elast) \ (_b[kstock] * _b[L1.invest])
                }
                
                matrix list elast
                Let me re-emphasize: the example I have shown here may not be appropriate for what you are trying to do. It is shown just to illustrate some of the features of -bsample- that you probably need to use. You need to read the support materials on -bsample- to use it properly. And what constitutes proper use will depend on details of your problem that have not been discussed in this thread.
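
                If the goal is a bootstrap standard error, one common approach is to take the standard deviation of the statistic across replications. A minimal sketch, assuming the same -grunfeld- setup as above with 50 replications (an arbitrary number chosen for illustration; a real bootstrap would normally use more replications and draw as many panels as the original data contain):

                ```stata
                * Illustration only: same Grunfeld setup as above, 50 replications.
                clear all
                webuse grunfeld

                tempfile holding
                save `holding'
                set seed 1234

                forval i = 1/50 {
                    use `holding', clear
                    * Resample whole panels with replacement; new_id keeps repeated panels distinct
                    bsample 5, cluster(company) idcluster(new_id)
                    xtset new_id year
                    quietly xtreg mvalue kstock L1.invest, fe
                    matrix elast = nullmat(elast) \ (_b[kstock] * _b[L1.invest])
                }

                * Convert the replicates to a variable and summarize them
                clear
                svmat elast
                summarize elast1
                display "Bootstrap SE (SD of replicates): " r(sd)
                ```

                Whether this is appropriate for your estimator is, again, something only the details of your problem can settle.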
