  • Regression coefficients vary with every run

    We are running a series of three regressions: two areg commands, followed by a regression of the residuals from the first areg on the residuals from the second. In total, we have 20,000 sets on which we perform these regressions in a loop. In certain instances where the coefficients are very small (for example, 10^-10), the sign, the magnitude, and the standard error of the coefficient all vary with every run. Why is this the case? We set type double and even set a seed, but neither helped. The base dataset is exactly the same, yet every run appears to produce different results for those regressions where the coefficients are very small.

    Thanks in advance for your help.

  • #2
    Showing the code and output could help. Is there any sort of sample selection going on, or any generation of random variables, or any data sorting? Things like that can sometimes produce differing results.
    -------------------------------------------
    Richard Williams, Notre Dame Dept of Sociology
    Stata Version: 17.0 MP (2 processor)

    EMAIL: [email protected]
    WWW: https://www3.nd.edu/~rwilliam



    • #3
      Hi Richard,

      Thanks for the response. We sorted the relevant variables in the same order each time, and we also tried both with and without sorting, with no luck. There is no random sampling, sample selection, or generation of random variables. The general code is:
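      set more off
      set type double
      * The locals total and file1, file2, ... are assumed to be defined earlier
      local a=1
      while `a'<`total'+1{
          use `file`a'', clear
          sort A B C D E g1 g2
          egen long fe=group(g1 g2)
          * First-stage regressions with absorbed fixed effects; keep the residuals
          areg A B C D, absorb(fe)
          predict y_t, res
          areg E B C D, absorb(fe)
          predict d_t, res
          * Residuals-on-residuals regression
          reg y_t d_t
          regsave using "filepath\file_`a'",replace t detail(scalars)
          local ++a   // advance the counter; without this the loop never ends
      }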

      Thanks,
      Nethra




      • #4
        You can try adding the -stable- option to sort. Although if that "solves" the problem, it may indicate you have some other sort of problem. See the first few posts of

        http://www.statalist.org/forums/foru...ing-3-datasets
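
        To illustrate what -stable- changes, here is a minimal sketch on a hypothetical toy dataset (the variables key and id are made up for illustration and do not come from the thread). Without -stable-, Stata may order observations that tie on the sort key differently from run to run; with it, tied observations keep their previous relative order.

        ```stata
        * Hypothetical toy data: six observations, with ties on key
        clear
        input byte key byte id
        1 1
        1 2
        1 3
        2 4
        2 5
        2 6
        end

        * Plain sort: the order of id within equal values of key is not guaranteed
        sort key
        list, sepby(key)

        * Stable sort: tied observations keep their prior relative order
        sort id              // restore a known starting order
        sort key, stable
        list, sepby(key)

        * After the stable sort, id must be ascending within each key
        by key: assert id >= id[_n-1] if _n > 1
        ```

        If results only become reproducible once -stable- is added, that suggests something downstream depends on the order of tied observations, which is worth tracking down in its own right.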
        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        Stata Version: 17.0 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://www3.nd.edu/~rwilliam



        • #5
          Why do you believe it necessary to sort? It shouldn't have any effect with the operations you're showing.

          Anyway, I cannot reproduce your problem: there is no flip of the sign or change in magnitude when performing the regression operations that you show in a series of datasets where more than half of the residuals-on-residuals regression coefficients are smaller than ±10^-10 (see below).

          Maybe the "general code" that you're showing us is omitting something important, the user-written command that you're using is doing something you're not aware of, or there is something about your data that you're not telling us, e.g., edge cases of collinearity between A, B, C and D, or E, B, C and D.
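
          As a rough way to look for the collinearity case mentioned above, one could check whether areg drops a regressor. The sketch below uses made-up data in which D is deliberately constructed as B + C; the variable names mirror the thread, but the data and the seed are assumptions for illustration only.

          ```stata
          * Hypothetical data: D is an exact linear combination of B and C
          clear
          set seed 2468
          set obs 200
          generate long fe = ceil(_n / 20)     // 10 groups of 20 observations
          generate double B = runiform()
          generate double C = runiform()
          generate double D = B + C            // deliberately collinear
          generate double A = rnormal()

          * Look for a note about a variable omitted because of collinearity
          areg A B C D, absorb(fe)

          * Model degrees of freedom; expected to be below 3 when a regressor is dropped
          display "model df: " e(df_m)

          * Pairwise correlations close to 1 in absolute value hint at near-collinearity
          correlate B C D
          ```

          This only catches exact or obvious dependence; near-collinear edge cases may need a closer look at the data themselves.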

          . version 14.2

          .
          . clear *

          . set more off

          . set seed 1379075

          .
          . *
          . * Create datasets
          . *
          . quietly set obs 10

          . generate byte g1 = _n

          . quietly expand 10

          . bysort g1: generate byte g2 = _n

          . quietly expand 10

          .
          . egen long fe=group(g1 g2)

          .
          . foreach var of newlist A B C D E {
            2.         quietly generate double `var' = .
            3. }

          .
          . program define fillem
            1.         version 14.2
            2.         syntax
            3.
          .         foreach var of varlist B-D {
            4.                 quietly replace `var' = runiform()
            5.         }
            6.
          .         foreach var of varlist A E {
            7.                 quietly replace `var' = rnormal()
            8.         }
            9. end

          .
          . forvalues i = 1/200 {
            2.         tempfile `i'
            3.         fillem
            4.         quietly save ``i''
            5. }

          .
          . *
          . * "The general code is:
          . *
          . program define regem
            1.         version 14.2
            2.         syntax anything(name=dataset)
            3.
          .         quietly use `dataset', clear
            4.
          .         quietly areg A B C D, absorb(fe)
            5.         predict double y_t, residual
            6.
          .         quietly areg E B C D, absorb(fe)
            7.         predict double d_t, residual
            8.
          .         quietly regress y_t c.d_t
            9. end

          .
          . tempname file_handle

          . tempfile tmpfil0 tmpfil1

          .
          . forvalues i = 0/1 {
            2.         postfile `file_handle' int set double (b_d`i' b_c`i' s_d`i' s_c`i') using `tmpfil`i''
            3.
          .         forvalues a = 1/200 {
            4.                 regem ``a''
            5.                 post `file_handle' (`a') (_b[d_t]) (_b[_cons]) (_se[d_t]) (_se[_cons])
            6.         }
            7.         postclose `file_handle'
            8. }

          .
          . /* "where the coefficients are very small (for example 10^-10) -
          >     the direction of the coefficient (i.e., positive/negative),
          >     magnitude as well as the standard error varies with every run" */
          .
          . use `tmpfil0', clear

          . merge 1:1 set using `tmpfil1', assert(match) nogenerate noreport

          .
          . foreach var0 of varlist *0 {
            2.         local var1 : subinstr local var0 "0" "1", all
            3.         generate byte `var0'_sign = sign(`var0') != sign(`var1')
            4.         generate double `var0'_delta = abs(`var0' - `var1')
            5. }

          .
          . tabulate b_d0_sign

            b_d0_sign |      Freq.     Percent        Cum.
          ------------+-----------------------------------
                    0 |        200      100.00      100.00
          ------------+-----------------------------------
                Total |        200      100.00

          .
          . generate byte small = min(b_d0, b_d1) < 1e-10

          . table small, contents(n set max b_d0_delta max s_d0_delta)

          -------------------------------------------------------
              small |        N(set)  max(b_d0_d~a)  max(s_d0_d~a)
          ----------+--------------------------------------------
                  0 |            88      4.510e-17      1.041e-17
                  1 |           112      4.163e-17      6.939e-18
          -------------------------------------------------------

          .
          . exit

          end of do-file


          .



          • #6
            Are the coefficients for which you're seeing the sign flips and magnitude changes the intercepts (constants)?

            For a regression of residuals on residuals with the same predictors, the intercept is expected to be exactly zero. Within the limits of floating-point precision it can show some variation, including a change of sign, but the magnitudes would all be at the limits of double precision, far smaller than ±10^-10. See below.

            For slope coefficients, which are far larger than the limits of double precision, there is no sign change and no change in magnitude even when the values are smaller than ±10^-10, which is the case in more than half of the datasets below.
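
            The reason the intercept should be exactly zero can be sketched as follows. The notation is mine, not from the thread: y_t and d_t are the two residual series, bars denote sample means, and alpha-hat and beta-hat are the intercept and slope from regress y_t d_t. Because each areg includes a constant and the absorbed group effects, its residuals average to zero, and the ordinary least-squares intercept is then zero as well.

            ```latex
            \[
              \bar{y}_t = \frac{1}{N}\sum_{i=1}^{N} y_{t,i} = 0,
              \qquad
              \bar{d}_t = \frac{1}{N}\sum_{i=1}^{N} d_{t,i} = 0
            \]
            \[
              \hat{\alpha} \;=\; \bar{y}_t - \hat{\beta}\,\bar{d}_t \;=\; 0 - \hat{\beta}\cdot 0 \;=\; 0
            \]
            ```

            In floating-point arithmetic the computed means are rounding-error-sized numbers rather than exact zeros, so the computed intercept is a tiny value whose sign is essentially arbitrary.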

            . version 14.2

            .
            . clear *

            . set more off

            . set seed `=strreverse("1379075")'

            .
            . *
            . * Create datasets
            . *
            . quietly set obs 10

            . generate byte g1 = _n

            . quietly expand 10

            . bysort g1: generate byte g2 = _n

            . quietly expand 10

            .
            . egen long fe=group(g1 g2)

            .
            . foreach var of newlist A B C D E {
              2.         quietly generate double `var' = .
              3. }

            .
            . program define fillem
              1.         version 14.2
              2.         syntax
              3.
            .         foreach var of varlist B-D {
              4.                 quietly replace `var' = runiform()
              5.         }
              6.
            .         foreach var of varlist A E {
              7.                 quietly replace `var' = B + C + D + rnormal(0, 1.05e-11)
              8.         }
              9. end

            .
            . forvalues i = 1/200 {
              2.         tempfile `i'
              3.         fillem
              4.         quietly save ``i''
              5. }

            .
            . *
            . * "The general code is:"
            . *
            . program define regem
              1.         version 14.2
              2.         syntax anything(name=dataset)
              3.
            .         quietly use `dataset', clear
              4.
            .         quietly areg A B C D, absorb(fe)
              5.         predict double y_t, residual
              6.
            .         quietly areg E B C D, absorb(fe)
              7.         predict double d_t, residual
              8.
            .         quietly regress y_t c.d_t
              9. end

            .
            . tempname file_handle

            . tempfile tmpfil0 tmpfil1

            .
            . forvalues i = 0/1 {
              2.         postfile `file_handle' int set double (b_d`i' b_c`i' s_d`i' s_c`i') using `tmpfil`i''
              3.
            .         forvalues a = 1/200 {
              4.                 regem ``a''
              5.                 post `file_handle' (`a') (_b[d_t]) (_b[_cons]) (_se[d_t]) (_se[_cons])
              6.         }
              7.         postclose `file_handle'
              8. }

            .
            . /* "where the coefficients are very small (for example 10^-10) -
            >     the direction of the coefficient (i.e., positive/negative),
            >     magnitude as well as the standard error varies with every run" */
            .
            . use `tmpfil0', clear

            . merge 1:1 set using `tmpfil1', assert(match) nogenerate noreport

            .
            . foreach var0 of varlist *0 {
              2.         local var1 : subinstr local var0 "0" "1", all
              3.         generate byte `var0'_sign = sign(`var0') != sign(`var1')
              4.         generate double `var0'_delta = abs(`var0' - `var1')
              5. }

            .
            . // Slope Coefficients
            . tabulate b_d0_sign

              b_d0_sign |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                      0 |        200      100.00      100.00
            ------------+-----------------------------------
                  Total |        200      100.00

            .
            . generate byte small = min(abs(b_d0), abs(b_d1)) < 1e-10 // corrected for ±1e-10

            . label define Magnitudes 0 "≥1e-10" 1 "<1e-10"

            . label values small Magnitudes

            . table small, contents(n set max b_d0_delta max s_d0_delta)

            -------------------------------------------------------
                small |        N(set)  max(b_d0_d~a)  max(s_d0_d~a)
            ----------+--------------------------------------------
               ≥1e-10 |            80      2.383e-06      6.329e-08
               <1e-10 |           120              0              0
            -------------------------------------------------------

            .
            . // Intercept Coefficients
            . label define Signs 0 Same 1 Flipped

            . label values b_c0_sign Signs

            . tabulate b_c0_sign

              b_c0_sign |      Freq.     Percent        Cum.
            ------------+-----------------------------------
                   Same |        104       52.00       52.00
                Flipped |         96       48.00      100.00
            ------------+-----------------------------------
                  Total |        200      100.00

            . table b_c0_sign, contents(n set max b_c0_delta max s_c0_delta)

            -------------------------------------------------------
            b_c0_sign |        N(set)  max(b_c0_d~a)  max(s_c0_d~a)
            ----------+--------------------------------------------
                 Same |           104      5.018e-29      4.827e-19
              Flipped |            96      8.236e-29      4.533e-19
            -------------------------------------------------------

            .
            . forvalues i = 0/1 {
              2.         generate double ac`i' = abs(b_c`i')
              3. }

            . table b_c0_sign, contents(n set max ac0 max ac1)

            ----------------------------------------------
            b_c0_sign |     N(set)    max(ac0)    max(ac1)
            ----------+-----------------------------------
                 Same |        104   5.271e-29   5.483e-29
              Flipped |         96   4.842e-29   5.783e-29
            ----------------------------------------------

            .
            . display in smcl as text epsdouble()
            2.220e-16

            .
            . exit

            end of do-file


            .

