
  • #16
    Pauline:
    what if you log the regressand only?
    Kind regards,
    Carlo
    (Stata 19.0)



    • #17
      Hello Carlo,

      I followed the advice from Clyde (see above), as log-transforming my regressand was my initial bet. If I do that the impact becomes very low.

      Code:
      Linear regression                               Number of obs     =        974
                                                      F(8, 139)         =      99.42
                                                      Prob > F          =     0.0000
                                                      R-squared         =     0.6090
                                                      Root MSE          =     .08558
      
                                       (Std. err. adjusted for 140 clusters in week_cluster)
      --------------------------------------------------------------------------------------
                           |               Robust
        lndownloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
      ---------------------+----------------------------------------------------------------
        daily_vaccinations |  -4.20e-08   6.43e-09    -6.53   0.000    -5.47e-08   -2.93e-08
                 new_cases |   9.96e-08   2.51e-08     3.97   0.000     5.00e-08    1.49e-07
       stay_at_home_policy |   .0059275   .0116929     0.51   0.613    -.0171914    .0290464
          facial_coverings |   .0162332   .0067554     2.40   0.018     .0028767    .0295897
               residential |  -.0174582   .0023205    -7.52   0.000    -.0220462   -.0128703
                workplaces |   .0042913   .0008878     4.83   0.000     .0025359    .0060467
      grocery_and_pharmacy |   .0027483   .0010341     2.66   0.009     .0007038    .0047929
                   leisure |  -.0092535   .0009861    -9.38   0.000    -.0112032   -.0073038
                     _cons |   12.64769   .0214231   590.38   0.000     12.60533    12.69004
      --------------------------------------------------------------------------------------
      Code:
      * Example generated by -dataex-. For more info, type help dataex
      clear
      input long downloads_market float lndownloads_market long new_cases float lnnew_cases long daily_vaccinations float lndaily_vaccinations byte(facial_coverings leisure residential)
      341189 12.740191 164238 12.009072 1482195 14.209035 4 -27  7
      276702 12.530697 138823 11.840955 1487034 14.212294 4 -29 14
      273405  12.51871 111938   11.6257 1493026 14.216315 4 -26 13
      261750 12.475145 129043   11.7679 1523847  14.23675 4 -20 11
      258194 12.461467 121330  11.70627 1579060  14.27234 4 -23 11
      271123 12.510328 108669 11.596062 1642312 14.311616 4 -24 11
      335024 12.721957 122620 11.716846 1683043 14.336114 4 -21  6
      331554 12.711546 124827 11.734684 1680143  14.33439 4 -31  6
      268429  12.50034 110180  11.60987 1704200 14.348606 4 -26 11
      258350  12.46207  92931 11.439612 1742649 14.370917 4 -25 12
      258774  12.46371  80726 11.298816 1766134 14.384303 4 -22 11
      258242 12.461653  91417 11.423186 1775339 14.389502 4 -26 13
      272266 12.514535  98339 11.496176 1809237 14.408416 4 -24 13
      337155 12.728298  99548 11.508395 1833702 14.421847 4 -23  8
      343975 12.748324  97431   11.4869 1854830 14.433304 4 -24  7
      278952 12.538795  85221 11.353004 1799377  14.40295 4 -34 18
      261930 12.475833  67413 11.118593 1748582 14.374315 4 -33 16
      253641 12.443675  58066 10.969336 1699474  14.34583 4 -27 13
      252157 12.437807  56871  10.94854 1619739 14.297775 4 -33 17
      265697 12.490112  63770 11.063038 1529221  14.24027 4 -26 14
      334151 12.719348  68568  11.13558 1505355  14.22454 4 -19  6
      339249  12.73449  73576 11.206074 1506831  14.22552 4 -19  5
      264264 12.484704  65510 11.089958 1545559 14.250896 4 -22 11
      249671   12.4279  56628  10.94426 1591287 14.280054 4 -18 10
      243892  12.40448  55821 10.929906 1671486 14.329224 4 -15 10
      242057  12.39693  70162 11.158562 1826222  14.41776 4 -17 10
      251278 12.434315  72358 11.189382 1990581 14.503937 4 -18 10
      311802 12.650124  71998 11.184394 2088405  14.55191 4 -15  5
      308777 12.640374  73274 11.201962 2124824   14.5692 4 -16  4
      239650 12.386935  68298 11.131636 2214511 14.610542 4 -17  9
      224399  12.32118  52570   10.8699 2289948  14.64404 4 -16 10
      220762  12.30484  51114 10.841814 2344331  14.66751 4 -12  9
      223473 12.317046  53761 10.892303 2381996  14.68345 4 -15  9
      239508 12.386342  66043 11.098062 2399133 14.690618 4 -16  9
      293563 12.589848  64859  11.07997 2409485 14.694923 4 -13  4
      293945 12.591148  64094 11.068106 2420479 14.699476 4 -12  3
      231257 12.351285  57710 10.963185 2431384  14.70397 4 -16  9
      219047 12.297042  43758  10.68643 2467021 14.718522 4 -14  9
      216059 12.283307  42703 10.662024 2499757 14.731704 4 -13  9
      223494  12.31714  54088 10.898368 2536720 14.746383 4 -14  9
      240834 12.391863  61755  11.03093 2566875   14.7582 4 -14  9
      300223  12.61228  59764 10.998158 2585933 14.765597 4 -10  3
      306224 12.632072  64222   11.0701 2593402  14.76848 4 -11  4
      247223 12.418046  52274 10.864254 2608866 14.774426 4 -14 10
      229951  12.34562  47230 10.762785 2633115 14.783678 4 -11  9
      219173 12.297617  43532 10.681252 2637533 14.785355 4  -4  8
      219060   12.2971  51806  10.85526 2628017  14.78174 4 -10 10
      237601 12.378348  59588  10.99521 2609056   14.7745 4 -10  8
      297082 12.601764  63099  11.05246 2603838 14.772497 4  -6  2
      303143  12.62196  62022 11.035244 2609841   14.7748 4  -4  2
      248635  12.42374  58456  10.97603 2620473 14.778866 4  -9  8
      258886 12.464143  45228 10.719472 2644185 14.787873 4  -9  8
      251465  12.43506  48228 10.783695 2698082 14.808052 4  -9  9
      244261 12.405993  57684 10.962735 2758421  14.83017 4 -10  9
      252701 12.439962  68021 11.127572 2828803 14.855364 4 -10  8
      315511  12.66195  65367 11.087772 2873156 14.870922 4  -8  2
      318479  12.67131  74834 11.223027 2899438 14.880028 4  -8  3
      255835 12.452288  63203 11.054107 2967615  14.90327 4  -8  8
      241676 12.395353  53256 10.882866 3043800 14.928617 4  -7  8
      235200 12.368192  58014  10.96844 3132209  14.95725 4  -6  8
      234764 12.366336  61789  11.03148 3240378   14.9912 4  -4  8
      279313  12.54009  69567 11.150045 3216591 14.983832 4  -6 10
      341929 12.742358  73685 11.207555 3192561 14.976334 4  -6  2
      347939 12.759783  70446 11.162601 3128761 14.956147 4 -28  1
      263302 12.481057  63695  11.06186 3167217 14.968364 4  -6  8
      247202  12.41796  45722 10.730335 3221788 14.985447 4  -6  7
      239936 12.388127  56400 10.940225 3248620  14.99374 4  -4  7
      240578   12.3908  63657 11.061265 3237451 14.990297 4  -6  8
      261429 12.473918  74627 11.220258 3333157  15.01943 4 -10  7
      326151 12.695116  75434 11.231013 3417042 15.044286 4  -7  2
      320046  12.67622  76415 11.243935 3508126 15.070593 4  -6  2
      258481 12.462578  68111 11.128894 3497188  15.06747 4  -9  7
      244370  12.40644  53239 10.882546 3427241 15.047266 4  -8  7
      243909  12.40455  60435 11.009324 3334168 15.019733 4  -7  8
      244612  12.40743  72206  11.18728 3228039 14.987386 4 -10  8
      259203 12.465366  70654  11.16555 3127921  14.95588 4 -11  8
      326913  12.69745  71498 11.177424 3052968 14.931624 4  -8  2
      331781  12.71223  71074 11.171477 3014943  14.91909 4  -5  2
      254606 12.447473  77212  11.25431 2952878  14.89829 4 -10  8
      239776  12.38746  49337  10.80643 2917771  14.88633 4  -8  8
      236824 12.375072  35281   10.4711 2886773  14.87565 4  -9  8
      237560 12.378176  57318  10.95637 2850998  14.86318 4  -9  8
      256040  12.45309  63049 11.051667 2807666 14.847864 4 -11  7
      323971  12.68841  61479  11.02645 2785657 14.839994 4  -9  2
      322041 12.682434  62419 11.041625 2765597 14.832767 4  -6  2
      265313 12.488666  51698 10.853174 2738301 14.822848 4  -9  7
      253175 12.441836  40847  10.61759 2690417 14.805206 4  -8  7
      249020 12.425288  30199 10.315564 2634173  14.78408 4  -7  7
      246365  12.41457  50313 10.826018 2570304 14.759535 4  -8  8
      260354 12.469797  54396 10.904046 2515052 14.737804 4  -8  6
      330265  12.70765  55643 10.926712 2466030  14.71812 4  -6  1
      332922 12.715664  55311 10.920727 2449788 14.711512 4  -4  1
      251693 12.435966  48352 10.786263 2388627  14.68623 4  -8  7
      239938 12.388136  36593 10.507612 2332468 14.662437 4  -7  7
      234723  12.36616  32953 10.402838 2256142 14.629167 4  -4  7
      231805 12.353652  42552 10.658483 2180771  14.59519 4  -6  6
      250938 12.432961  44541 10.704165 2123268 14.568467 4  -7  6
      304681  12.62702  42692 10.661767 2076177  14.54604 4  -3  0
      309428  12.64248  44641 10.706408 2038290 14.527622 4  -2  0
      258221  12.46157  37021  10.51924 1986770  14.50202 4  -9  7
      end
      leisure and residential are Google mobility data from the COVID period, included to see whether anything changes when public places are closed versus open.

      Thank you! This is such a great help.



      • #18
        [Attached images: Screenshot 2023-11-15 at 12.32.54.png; Screenshot 2023-11-15 at 12.34.36.png]

        If I insert the interaction term (i.e., if I assume a moderating effect of new cases combined with vaccinations), I get this result:

        Code:
         reg downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc, vce(cluster week_cluster)
        
        Linear regression                               Number of obs     =        628
                                                        F(3, 89)          =      65.19
                                                        Prob > F          =     0.0000
                                                        R-squared         =     0.1347
                                                        Root MSE          =      35297
        
                                          (Std. err. adjusted for 90 clusters in week_cluster)
        --------------------------------------------------------------------------------------
                             |               Robust
            downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        ---------------------+----------------------------------------------------------------
        lndaily_vaccinations |   29690.39   18415.31     1.61   0.110    -6900.442    66281.23
                 lnnew_cases |   61040.06   22296.62     2.74   0.007     16737.15      105343
              Int_cases_vacc |  -3768.773   1670.475    -2.26   0.027    -7087.972   -449.5736
                       _cons |    -231914   245800.3    -0.94   0.348      -720314    256486.1
        --------------------------------------------------------------------------------------
        When I introduce the other variables as well, this effect becomes insignificant again. What do you suggest? I plotted the residuals but can't figure out "the right way" of doing this.

        Code:
        . reg downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc stay_at_home_policy facial_coverings residential workplaces grocer
        > y_and_pharmacy leisure, vce(cluster week_cluster)
        
        Linear regression                               Number of obs     =        623
                                                        F(9, 89)          =      70.81
                                                        Prob > F          =     0.0000
                                                        R-squared         =     0.6277
                                                        Root MSE          =      23246
        
                                          (Std. err. adjusted for 90 clusters in week_cluster)
        --------------------------------------------------------------------------------------
                             |               Robust
            downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
        ---------------------+----------------------------------------------------------------
        lndaily_vaccinations |  -53346.56   22611.26    -2.36   0.020    -98274.65   -8418.473
                 lnnew_cases |  -40070.24   26212.51    -1.53   0.130    -92153.95    12013.47
              Int_cases_vacc |   3846.079   1985.015     1.94   0.056    -98.10229    7790.261
         stay_at_home_policy |   5121.802   3486.443     1.47   0.145    -1805.686    12049.29
            facial_coverings |   9676.553   2465.764     3.92   0.000     4777.134    14575.97
                 residential |  -8046.373   1100.117    -7.31   0.000    -10232.28   -5860.464
                  workplaces |   379.8358   390.1568     0.97   0.333    -395.3975    1155.069
        grocery_and_pharmacy |   474.6654   462.9559     1.03   0.308     -445.218    1394.549
                     leisure |  -2123.055   405.7492    -5.23   0.000     -2929.27    -1316.84
                       _cons |   873973.1   295998.8     2.95   0.004     285829.7     1462116
        --------------------------------------------------------------------------------------
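        The interaction in the two regressions above could also be specified with factor-variable notation, which lets -margins- report the effect of new cases at chosen vaccination levels. This is only a sketch: it assumes that Int_cases_vacc was generated as lnnew_cases * lndaily_vaccinations (the thread does not show how it was built), and the values 14, 14.5 and 15 are illustrative picks from the range of lndaily_vaccinations in the -dataex- excerpt.

        ```stata
        * Assumption: Int_cases_vacc == lnnew_cases * lndaily_vaccinations
        * c.x##c.z enters both main effects and their product
        regress downloads_market c.lnnew_cases##c.lndaily_vaccinations ///
            stay_at_home_policy facial_coverings residential workplaces ///
            grocery_and_pharmacy leisure, vce(cluster week_cluster)

        * Marginal effect of lnnew_cases at illustrative vaccination levels
        margins, dydx(lnnew_cases) at(lndaily_vaccinations = (14 14.5 15))
        marginsplot
        ```

        With an interaction in the model, the coefficient on lnnew_cases alone is the effect when lndaily_vaccinations equals zero, which lies far outside the data, so a sign flip in that coefficient between the two models is hard to interpret by itself. The -margins- output at realistic values is usually easier to read.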



        • #19
          Pauline:
          what did -linktest- tell you after each regression that you ran?
          Kind regards,
          Carlo
          (Stata 19.0)



          • #20
            If I do that the impact becomes very low
            When you are working with continuous variables, you cannot judge the "impact" by the absolute size of the regression coefficient. Yes, -4.20e-08 looks like a very small number. But you have to consider that it is a ratio of ln downloads to daily vaccinations. But daily vaccinations, once the vaccine was introduced, is a huge number, in the millions (10^6), and ln downloads is a very modest number, in the low teens (10^1). So any regression coefficient is going to have to be of order of magnitude 10^-5 just to bring the result into the right order of magnitude. Then when you consider that, especially early in the vaccination campaign, vaccines were targeted to the elderly, whereas downloads of dating apps would be targeted to the young, you can't expect the relationship to be all that strong.

            Never judge a coefficient of a continuous variable (or the coefficient of a discrete variable when the outcome variable is continuous) without thinking about the scale of the continuous variable(s).
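
            As a rough worked example of the scale argument, take the coefficient on daily_vaccinations from #17 and a change of one million daily vaccinations. The one-million step is an illustrative assumption, chosen because it is within the range seen in the -dataex- excerpt.

            ```latex
            \[
            \Delta \ln(\text{downloads}) \approx \hat\beta \cdot \Delta(\text{daily vaccinations})
              = (-4.20 \times 10^{-8}) \times 10^{6} = -0.042,
            \qquad
            e^{-0.042} - 1 \approx -0.041 .
            \]
            ```

            On that reading, a million more daily vaccinations goes with roughly 4% fewer downloads, holding the other regressors fixed. That is not negligible, even though the raw coefficient looks tiny.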



            • #21
              Carlo,

              This is the linktest if running only the two main independent variables:
              Code:
              . reg downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc, vce(cluster week_cluster)
              
              Linear regression                               Number of obs     =        628
                                                              F(3, 89)          =      65.19
                                                              Prob > F          =     0.0000
                                                              R-squared         =     0.1347
                                                              Root MSE          =      35297
              
                                                (Std. err. adjusted for 90 clusters in week_cluster)
              --------------------------------------------------------------------------------------
                                   |               Robust
                  downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              ---------------------+----------------------------------------------------------------
              lndaily_vaccinations |   29690.39   18415.31     1.61   0.110    -6900.442    66281.23
                       lnnew_cases |   61040.06   22296.62     2.74   0.007     16737.15      105343
                    Int_cases_vacc |  -3768.773   1670.475    -2.26   0.027    -7087.972   -449.5736
                             _cons |    -231914   245800.3    -0.94   0.348      -720314    256486.1
              --------------------------------------------------------------------------------------
              
              
              
              Linear regression                               Number of obs     =        628
                                                              F(2, 89)          =      86.34
                                                              Prob > F          =     0.0000
                                                              R-squared         =     0.1311
                                                              Root MSE          =      35341
              
                                                (Std. err. adjusted for 90 clusters in week_cluster)
              --------------------------------------------------------------------------------------
                                   |               Robust
                  downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              ---------------------+----------------------------------------------------------------
              lndaily_vaccinations |  -11962.97   1236.974    -9.67   0.000    -14420.81   -9505.128
                       lnnew_cases |   10441.25   1162.287     8.98   0.000     8131.814    12750.69
                             _cons |   327125.5   20953.09    15.61   0.000     285492.2    368758.9
              --------------------------------------------------------------------------------------
              
              . 
              end of do-file
              
              . linktest
              
                    Source |       SS           df       MS      Number of obs   =       628
              -------------+----------------------------------   F(2, 625)       =     47.88
                     Model |  1.1937e+11         2  5.9685e+10   Prob > F        =    0.0000
                  Residual |  7.7910e+11       625  1.2466e+09   R-squared       =    0.1329
              -------------+----------------------------------   Adj R-squared   =    0.1301
                     Total |  8.9847e+11       627  1.4330e+09   Root MSE        =     35307
              
              ------------------------------------------------------------------------------
              downloads_~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                      _hat |  -3.731911   4.256555    -0.88   0.381    -12.09079    4.626972
                    _hatsq |   8.37e-06   7.53e-06     1.11   0.267    -6.41e-06    .0000232
                     _cons |   666850.5   600391.9     1.11   0.267    -512179.2     1845880
              This is the linktest when including an interaction term:

              Code:
              . reg downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc, vce(cluster week_cluster)
              
              Linear regression                               Number of obs     =        628
                                                              F(3, 89)          =      65.19
                                                              Prob > F          =     0.0000
                                                              R-squared         =     0.1347
                                                              Root MSE          =      35297
              
                                                (Std. err. adjusted for 90 clusters in week_cluster)
              --------------------------------------------------------------------------------------
                                   |               Robust
                  downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              ---------------------+----------------------------------------------------------------
              lndaily_vaccinations |   29690.39   18415.31     1.61   0.110    -6900.442    66281.23
                       lnnew_cases |   61040.06   22296.62     2.74   0.007     16737.15      105343
                    Int_cases_vacc |  -3768.773   1670.475    -2.26   0.027    -7087.972   -449.5736
                             _cons |    -231914   245800.3    -0.94   0.348      -720314    256486.1
              --------------------------------------------------------------------------------------
              
              . linktest
              
                    Source |       SS           df       MS      Number of obs   =       628
              -------------+----------------------------------   F(2, 625)       =     49.13
                     Model |  1.2206e+11         2  6.1030e+10   Prob > F        =    0.0000
                  Residual |  7.7641e+11       625  1.2422e+09   R-squared       =    0.1359
              -------------+----------------------------------   Adj R-squared   =    0.1331
                     Total |  8.9847e+11       627  1.4330e+09   Root MSE        =     35246
              
              ------------------------------------------------------------------------------
              downloads_~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                      _hat |   -2.64874   4.038392    -0.66   0.512     -10.5792     5.28172
                    _hatsq |   6.41e-06   7.09e-06     0.90   0.366    -7.52e-06    .0000203
                     _cons |   517863.5   573703.6     0.90   0.367    -608756.5     1644484
              ------------------------------------------------------------------------------
              
              .
              And this is the linktest when including all variables

              Code:
              . reg downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc stay_at_home_policy facial_coverings residential  leisure, vce(clu
              > ster week_cluster)
              
              Linear regression                               Number of obs     =        623
                                                              F(7, 89)          =      74.41
                                                              Prob > F          =     0.0000
                                                              R-squared         =     0.6249
                                                              Root MSE          =      23294
              
                                                (Std. err. adjusted for 90 clusters in week_cluster)
              --------------------------------------------------------------------------------------
                                   |               Robust
                  downloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
              ---------------------+----------------------------------------------------------------
              lndaily_vaccinations |  -60499.93   22316.58    -2.71   0.008    -104842.5   -16157.36
                       lnnew_cases |  -48598.88      25800    -1.88   0.063    -99862.94    2665.175
                    Int_cases_vacc |   4546.102   1958.324     2.32   0.023      654.953    8437.251
               stay_at_home_policy |   6518.087   3319.346     1.96   0.053    -77.38212    13113.56
                  facial_coverings |   9684.859   2408.872     4.02   0.000     4898.482    14471.24
                       residential |  -9157.487   599.9015   -15.26   0.000    -10349.48   -7965.496
                           leisure |  -1787.785   359.5429    -4.97   0.000    -2502.189   -1073.381
                             _cons |   958239.1   292745.7     3.27   0.002     376559.6     1539919
              --------------------------------------------------------------------------------------
              
              . 
              end of do-file
              
              . linktest
              
                    Source |       SS           df       MS      Number of obs   =       623
              -------------+----------------------------------   F(2, 620)       =    869.55
                     Model |  6.5591e+11         2  3.2795e+11   Prob > F        =    0.0000
                  Residual |  2.3384e+11       620   377155106   R-squared       =    0.7372
              -------------+----------------------------------   Adj R-squared   =    0.7363
                     Total |  8.8974e+11       622  1.4305e+09   Root MSE        =     19420
              
              ------------------------------------------------------------------------------
              downloads_~t | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
              -------------+----------------------------------------------------------------
                      _hat |  -4.433885   .3349531   -13.24   0.000    -5.091665   -3.776104
                    _hatsq |   9.67e-06   5.94e-07    16.27   0.000     8.50e-06    .0000108
                     _cons |   755080.8   46990.94    16.07   0.000     662800.1    847361.5
              ------------------------------------------------------------------------------

              Maybe you can briefly explain how to compare these results and draw a conclusion from them; that would help me a lot in making sense of my struggles. Thank you all!



              • #22
                Clyde, this makes absolute sense. You are right.

                But isn't this explanation the perfect reason to switch the ln on the dependent variable? If the magnitudes are incredibly small, I cannot draw reasonable conclusions from them.

                Earlier in our discussion you explained that the relationship looks more linear if I log the independent variables. How do I recognize which transformation makes more "sense"?

                Thank you - This is very helpful - very much appreciated



                • #23
                  how do I recognize which transformation makes more "sense"?
                  As you are working with simple linear regression, in the absence of any good prior knowledge of what it should be, you should use the one that gives the most linear relationship. I think a good way to decide that is graphically. Use the -rvfplot- postestimation command to see which way of specifying your model gives you the best plot (which, in the ideal situation, is a more or less rectangular smear of points). If you have more than one model that gives a good -rvfplot- then you can focus more strongly on the role of specific variables with the -rvpplot- postestimation command. (See -help rvpplot- for details.)
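
                  A minimal sketch of that graphical comparison, using variable names that appear earlier in the thread. The graph names (rvf_levlog, rvf_loglog, rvp_cases) are made up for illustration, and the two specifications are only examples of candidates to compare.

                  ```stata
                  * vce(cluster ...) is left out here: residuals and fitted values
                  * do not depend on how the standard errors are computed

                  * Candidate A: outcome in levels, logged predictors
                  regress downloads_market lndaily_vaccinations lnnew_cases
                  rvfplot, yline(0) name(rvf_levlog, replace)

                  * Candidate B: logged outcome, logged predictors
                  regress lndownloads_market lndaily_vaccinations lnnew_cases
                  rvfplot, yline(0) name(rvf_loglog, replace)

                  * Residuals against one specific predictor, for the current model
                  rvpplot lnnew_cases, yline(0) name(rvp_cases, replace)

                  * Put the two residual-versus-fitted plots side by side
                  graph combine rvf_levlog rvf_loglog
                  ```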

                  If you are uncomfortable with graphical methods, the -linktest- command that Carlo has urged you to use can be helpful, though it only detects certain specific kinds of non-linearity. (I think the only methods that are sensitive to all kinds of non-linearity are the graphical ones.)
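
                  For reference, a sketch of how -linktest- is typically read, using the full model from #21. -linktest- refits the outcome on the fitted value (_hat) and its square (_hatsq); if the model is well specified, _hatsq should add nothing, so the row to look at is _hatsq.

                  ```stata
                  * Full model as posted in #21
                  regress downloads_market lndaily_vaccinations lnnew_cases Int_cases_vacc ///
                      stay_at_home_policy facial_coverings residential leisure, ///
                      vce(cluster week_cluster)

                  * A clearly significant _hatsq points to misspecification,
                  * for example a missing nonlinearity or a poorly chosen transformation
                  linktest
                  ```

                  In the outputs in #21, _hatsq is not significant for the two small models (p = 0.267 and p = 0.366) but is strongly significant for the full model (t = 16.27), which suggests that the full specification with the outcome in levels still has a functional-form problem.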

                  Turning to judging the strength of an effect, and the problem that the coefficient's magnitude is sensitive to the scale of continuous variables, another way to assess this is to run the model with and without the predictor (making sure not to change the estimation sample because of the vagaries of missing values in the data) and see how large the change in R-squared is. So
                  Code:
                  regress ... variable_with_small_coefficient ...
                  local full_r2 = e(r2)
                  regress ... if e(sample) // EXCLUDE variable_with_small_coefficient FROM VARLIST
                  local delta = `full_r2' - e(r2)
                  
                  display "Change in R2 = " %03.2f `delta'
                  This change in R-squared tells you how much of the variance in the outcome is attributable to your variable of interest over and above what all the other variables in the model contribute to it. This statistic is not sensitive to scale and is a much better measure of the "importance" of a variable's effect.
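
                  Filled in for the model in #17, the template might look like the sketch below. The variable list is an assumption read off the output table in #17, since the command itself was not posted.

                  ```stata
                  * Full model, including the variable with the small coefficient
                  regress lndownloads_market daily_vaccinations new_cases ///
                      stay_at_home_policy facial_coverings residential workplaces ///
                      grocery_and_pharmacy leisure, vce(cluster week_cluster)
                  local full_r2 = e(r2)

                  * Same estimation sample, daily_vaccinations dropped from the varlist
                  regress lndownloads_market new_cases ///
                      stay_at_home_policy facial_coverings residential workplaces ///
                      grocery_and_pharmacy leisure if e(sample), vce(cluster week_cluster)
                  local delta = `full_r2' - e(r2)

                  display "Change in R-squared = " %5.4f `delta'
                  ```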



                  • #24
                    Clyde,

                    Before applying the log, I added 1 to vaccinations and cases, as there are some observations that are obviously zero. I think I can thereby also solve the ratio problem; at least I've read about this approach in a comparable study about app downloads.
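
                    The transformation described here can be sketched as follows. The new variable names (ln1p_new_cases, ln1p_daily_vacc) are hypothetical and are not the names used in the regressions in this thread.

                    ```stata
                    * How many observations would be lost to ln(0)?
                    count if new_cases == 0 | daily_vaccinations == 0

                    * ln(x + 1) keeps zero observations in the sample (ln(1) = 0)
                    generate double ln1p_new_cases  = ln(new_cases + 1)
                    generate double ln1p_daily_vacc = ln(daily_vaccinations + 1)
                    ```

                    For counts in the tens of thousands or millions, as in the -dataex- excerpt, adding 1 barely changes the logged value; it mainly matters for the zero and near-zero observations.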

                    I followed your advice and ran a regression for each variable. Even though I detected some patterns, it doesn't really resolve my indecision about whether to use a log-log model.
                    For the variable new cases I would rather assume a U-shape; see attached 1) normal-log and 2) log-log.

                    For daily vaccinations I don't know what to do at all (pictures 3 and 4). Once I have solved this problem, I will continue by focusing on strength as you said, Clyde.

                    Thank you - I am really grateful to receive advice.
                    [Attached images: rvf - log-log (vaccinations) .png; rvf - normal-log (vaccinations).png; normal - log (newcasses).png; log-log (newcases) .png]



                    • #25
                      Update to my post above - I tried to test the U-shape relationship but R-squared decreases after application - see below:

                      Including squared term - linear coefficient and non-linear coefficient become insignificant:

                      Code:
                      . reg lndownloads_market lnnew_cases lnnew_cases_squared lndaily_vaccinations stay_at_home_policy residential  leisure, vce(cluster week_
                      > cluster)
                      
                      Linear regression                               Number of obs     =        974
                                                                      F(6, 139)         =      98.46
                                                                      Prob > F          =     0.0000
                                                                      R-squared         =     0.5806
                                                                      Root MSE          =     .08854
                      
                                                       (Std. err. adjusted for 140 clusters in week_cluster)
                      --------------------------------------------------------------------------------------
                                           |               Robust
                        lndownloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      ---------------------+----------------------------------------------------------------
                               lnnew_cases |  -.0165251    .013141    -1.26   0.211    -.0425072    .0094569
                       lnnew_cases_squared |   .0004549   .0006193     0.73   0.464    -.0007696    .0016794
                      lndaily_vaccinations |  -.0031087   .0012676    -2.45   0.015    -.0056149   -.0006025
                       stay_at_home_policy |   .0174767   .0120255     1.45   0.148    -.0062998    .0412532
                               residential |  -.0297468    .001476   -20.15   0.000    -.0326652   -.0268284
                                   leisure |  -.0091931   .0009454    -9.72   0.000    -.0110624   -.0073239
                                     _cons |   12.81617   .0480136   266.93   0.000     12.72124     12.9111
                      --------------------------------------------------------------------------------------
                      Including only the squared term:

                      Code:
                      Linear regression                               Number of obs     =        974
                                                                      F(5, 139)         =     102.10
                                                                      Prob > F          =     0.0000
                                                                      R-squared         =     0.5794
                                                                      Root MSE          =     .08862
                      
                                                       (Std. err. adjusted for 140 clusters in week_cluster)
                      --------------------------------------------------------------------------------------
                                           |               Robust
                        lndownloads_market | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      ---------------------+----------------------------------------------------------------
                       lnnew_cases_squared |  -.0002666   .0001164    -2.29   0.024    -.0004969   -.0000364
                      lndaily_vaccinations |   -.002627   .0011574    -2.27   0.025    -.0049154   -.0003387
                       stay_at_home_policy |   .0118005   .0113109     1.04   0.299    -.0105632    .0341642
                               residential |  -.0300365   .0014743   -20.37   0.000    -.0329515   -.0271216
                                   leisure |   -.009119   .0009289    -9.82   0.000    -.0109556   -.0072823
                                     _cons |   12.74831   .0170782   746.47   0.000     12.71455    12.78208
                      --------------------------------------------------------------------------------------
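
                      A minimal sketch of how the U-shape check could be set up with factor-variable notation, assuming lnnew_cases_squared is simply lnnew_cases squared (the thread does not show how it was generated). Note that a model with only the squared term forces the turning point to sit at lnnew_cases = 0, so the full quadratic is the more usual specification. The turning point -b1/(2*b2) is on the log scale of new cases. Run this from a do-file, since it uses /// line continuation.

                      ```stata
                      * Sketch only: variable names are taken from the output above
                      regress lndownloads_market c.lnnew_cases##c.lnnew_cases ///
                          lndaily_vaccinations stay_at_home_policy residential leisure, ///
                          vce(cluster week_cluster)

                      * Joint test: do the linear and squared terms together add anything?
                      test c.lnnew_cases c.lnnew_cases#c.lnnew_cases

                      * Estimated turning point of the quadratic, with a delta-method CI
                      nlcom -_b[c.lnnew_cases] / (2 * _b[c.lnnew_cases#c.lnnew_cases])
                      ```

                      The joint test is worth a look because the linear and squared terms are usually highly collinear, so each can look insignificant on its own even when the pair matters.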

                      Additionally, I tested the strength of the effect using the approach Clyde proposed (thank you!!):

                      Code:
                      . reg lnnew_cases lndaily_vaccinations stay_at_home_policy residential leisure, vce(cluster week_cluster)
                      
                      Linear regression                               Number of obs     =        974
                                                                      F(4, 139)         =      39.31
                                                                      Prob > F          =     0.0000
                                                                      R-squared         =     0.6058
                                                                      Root MSE          =     1.6514
                      
                                                       (Std. err. adjusted for 140 clusters in week_cluster)
                      --------------------------------------------------------------------------------------
                                           |               Robust
                               lnnew_cases | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      ---------------------+----------------------------------------------------------------
                      lndaily_vaccinations |   .2839621   .0450929     6.30   0.000     .1948053    .3731188
                       stay_at_home_policy |   1.710963   .5413445     3.16   0.002      .640629    2.781298
                               residential |  -.0454098   .0112976    -4.02   0.000    -.0677471   -.0230725
                                   leisure |  -.0219784   .0157013    -1.40   0.164    -.0530226    .0090658
                                     _cons |   10.67888   1.433494     7.45   0.000     7.844606    13.51315
                      --------------------------------------------------------------------------------------
                      
                      . local full_r2 = e(r2)
                      
                      . reg lndaily_vaccinations stay_at_home_policy residential leisure, vce(cluster week_cluster)
                      
                      Linear regression                               Number of obs     =        974
                                                                      F(3, 139)         =      27.70
                                                                      Prob > F          =     0.0000
                                                                      R-squared         =     0.4069
                                                                      Root MSE          =     7.0083
                      
                                                      (Std. err. adjusted for 140 clusters in week_cluster)
                      -------------------------------------------------------------------------------------
                                          |               Robust
                      lndaily_vaccinati~s | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      --------------------+----------------------------------------------------------------
                      stay_at_home_policy |  -7.859408   .9495217    -8.28   0.000    -9.736781   -5.982035
                              residential |  -.0214445   .0608805    -0.35   0.725     -.141816     .098927
                                  leisure |   .0419233   .0550766     0.76   0.448    -.0669729    .1508195
                                    _cons |   23.55035   1.547445    15.22   0.000     20.49077    26.60992
                      -------------------------------------------------------------------------------------
                      
                      . local delta = `full_r2' - e(r2)
                      
                      . display "Change in R2 = " %03.2f `delta'
                      Change in R2 = 0.20
                      So, judging from an R2 delta of 0.20, I assume the effect of lnnew_cases is quite strong. How can I explain this rationale in my thesis? Should I interpret the regression coefficients at all? Or should I instead explain the difficulty of judging the strength of an effect when the coefficient's magnitude is sensitive to the scale of the continuous variable "daily app downloads"?

                      (The impact of lndaily_vaccinations is even stronger; see below.)

                      Code:
                      . reg lnnew_cases lndaily_vaccinations stay_at_home_policy residential leisure, vce(cluster week_cluster)
                      
                      Linear regression                               Number of obs     =        974
                                                                      F(4, 139)         =      39.31
                                                                      Prob > F          =     0.0000
                                                                      R-squared         =     0.6058
                                                                      Root MSE          =     1.6514
                      
                                                       (Std. err. adjusted for 140 clusters in week_cluster)
                      --------------------------------------------------------------------------------------
                                           |               Robust
                               lnnew_cases | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      ---------------------+----------------------------------------------------------------
                      lndaily_vaccinations |   .2839621   .0450929     6.30   0.000     .1948053    .3731188
                       stay_at_home_policy |   1.710963   .5413445     3.16   0.002      .640629    2.781298
                               residential |  -.0454098   .0112976    -4.02   0.000    -.0677471   -.0230725
                                   leisure |  -.0219784   .0157013    -1.40   0.164    -.0530226    .0090658
                                     _cons |   10.67888   1.433494     7.45   0.000     7.844606    13.51315
                      --------------------------------------------------------------------------------------
                      
                      . local full_r2 = e(r2)
                      
                      .  reg lnnew_cases stay_at_home_policy residential leisure, vce(cluster week_cluster)
                      
                      Linear regression                               Number of obs     =        974
                                                                      F(3, 139)         =       6.95
                                                                      Prob > F          =     0.0002
                                                                      R-squared         =     0.0328
                                                                      Root MSE          =     2.5855
                      
                                                      (Std. err. adjusted for 140 clusters in week_cluster)
                      -------------------------------------------------------------------------------------
                                          |               Robust
                              lnnew_cases | Coefficient  std. err.      t    P>|t|     [95% conf. interval]
                      --------------------+----------------------------------------------------------------
                      stay_at_home_policy |  -.5208105   .3225056    -1.61   0.109    -1.158461    .1168403
                              residential |  -.0514992   .0180318    -2.86   0.005    -.0871513   -.0158472
                                  leisure |  -.0100738   .0254498    -0.40   0.693    -.0603927    .0402451
                                    _cons |   17.36628   .7748165    22.41   0.000     15.83433    18.89823
                      -------------------------------------------------------------------------------------
                      
                      . 
                      . 
                      . local delta = `full_r2' - e(r2)
                      
                      . display "Change in R2 = " %03.2f `delta'
                      Change in R2 = 0.57

                      Comment


                      • #26
                        Please ignore the part about strength: I did not use it in the correct way. I will restructure my assumptions.
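
                        For reference, a minimal sketch of an incremental R2 comparison that keeps the outcome fixed, assuming the goal is to see how much lnnew_cases adds to a model of lndownloads_market. In the runs in #25 the dependent variable was lnnew_cases or lndaily_vaccinations rather than lndownloads_market, so those deltas do not measure that. The variable names come from the output above; the exact set of covariates is an assumption.

                        ```stata
                        * Full model: outcome regressed on all predictors
                        regress lndownloads_market lnnew_cases lndaily_vaccinations ///
                            stay_at_home_policy residential leisure, vce(cluster week_cluster)
                        local full_r2 = e(r2)

                        * Reduced model: same outcome, same estimation sample, lnnew_cases dropped
                        regress lndownloads_market lndaily_vaccinations ///
                            stay_at_home_policy residential leisure if e(sample), ///
                            vce(cluster week_cluster)

                        * Share of variance in the outcome attributable to adding lnnew_cases
                        local delta = `full_r2' - e(r2)
                        display "Change in R2 = " %5.3f `delta'
                        ```

                        The if e(sample) restriction keeps both regressions on the same observations, so the two R2 values are comparable.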

                        Comment


                        • #27
                          Pauline:
                          adding a constant to zero values before logging is not a good idea.
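
                          One common reason is that the choice of constant is arbitrary and the estimates can depend on it. A rough sketch of how to check that sensitivity, assuming the raw count variable is new_cases (as in the earlier output) and using a hypothetical helper variable ln_shift and illustrative constants:

                          ```stata
                          * Sketch only: ln_shift and the constants 0.01, 0.5, 1 are illustrative
                          foreach c in 0.01 0.5 1 {
                              capture drop ln_shift
                              generate double ln_shift = ln(new_cases + `c')
                              quietly regress lndownloads_market ln_shift lndaily_vaccinations ///
                                  stay_at_home_policy residential leisure, vce(cluster week_cluster)
                              display "constant = `c'   coefficient = " %9.6f _b[ln_shift]
                          }
                          drop ln_shift
                          ```

                          If the coefficient moves noticeably across constants, the result is driven in part by that arbitrary choice.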
                          Kind regards,
                          Carlo
                          (Stata 19.0)

                          Comment
