
  • #16
    hello
    Last edited by saikat sarkar; 06 Jan 2024, 12:29.



    • #17
      Originally posted by Clyde Schechter View Post
      It's not as simple as that. To get the residuals you have to dance around a little bit with variables inside -myregress- just as was done with the coefficients and standard errors. So I would revise the code as follows:

      Code:
      capture program drop myregress
      program define myregress
      levelsof ISIN3, local(levels)
      foreach l of local levels {
      preserve
      drop if ISIN3 == `l'
      regress CFO_scLaggedAssets One_scLaggedAssets Sales_scLaggedAssets DeltaSales_scLaggedAssets
      restore
      replace b_One_scLaggedAssets = _b[One_scLaggedAssets] if ISIN3 == `l'
      replace b_Sales_scLaggedAssets = _b[Sales_scLaggedAssets] if ISIN3 == `l'
      replace b_DeltaSales_scLaggedAssets = _b[DeltaSales_scLaggedAssets] if ISIN3 == `l'
      replace b_cons = _b[_cons] if ISIN3 == `l'
      replace se_One_scLaggedAssets = _se[One_scLaggedAssets] if ISIN3 == `l'
      replace se_Sales_scLaggedAssets = _se[Sales_scLaggedAssets] if ISIN3 == `l'
      replace se_DeltaSales_scLaggedAssets = _se[DeltaSales_scLaggedAssets] if ISIN3 == `l'
      replace r2 = e(r2) if ISIN3 == `l'
      replace n_obs = e(N) if ISIN3 == `l'
      predict r, resid
      replace residual = r if ISIN3 == `l'
      drop r
      }
      
      exit
      end
      
      foreach v of varlist One_scLaggedAssets Sales_scLaggedAssets DeltaSales_scLaggedAssets {
      gen b_`v' = .
      gen se_`v' = .
      }
      gen b_cons = .
      gen r2 = .
      gen n_obs = .
      gen residual = .
      
      runby my_regress, by(Industry Year) status
      Hello Clyde Schechter,

      I tried to adapt your code to run the same kind of estimation, but I cannot work out why it fails.
      I would like to run the regression by industry and year, and to obtain the fitted values and residuals. In my case the code only produces errors.

      Could you please help me?

      Thanks in advance.


      capture program drop myregress
      program define myregress
      levelsof industry, local(levels)
      foreach l of local levels {
      preserve
      drop if industry == `l'
      regress y x1 x2 x3
      restore
      replace b_x1 = _b[x1] if industry == `l'
      replace b_x2 = _b[x2] if industry == `l'
      replace b_x3 = _b[x3] if industry == `l'
      replace b_cons = _b[_cons] if industry == `l'
      replace se_x1 = _se[x1] if industry == `l'
      replace se_x2 = _se[x2] if industry == `l'
      replace se_x3 = _se[x3] if industry == `l'
      replace r2 = e(r2) if industry == `l'
      replace n_obs = e(N) if industry == `l'
      predict r, resid
      replace residual = r if industry == `l'
      drop r
      }

      exit
      end

      foreach v of varlist x1 x2 x3 {
      gen b_`v' = .
      gen se_`v' = .
      }
      gen b_cons = .
      gen r2 = .
      gen n_obs = .
      gen residual = .

      runby my_regress, by(industry year) status

      When I run the code, the output is "(200 missing values generated)". The data are:

      firm industry year y x1 x2 x3
      1 a 1935 33.435 317.6 3078.5 2.8
      1 a 1936 44.893 391.8 4661.7 52.6
      1 a 1937 67.474 410.6 5387.1 156.9
      1 a 1938 75.918 257.7 2792.2 209.2
      1 a 1939 73.878 330.8 4313.2 203.4
      1 a 1940 97.601 461.2 4643.9 207.2
      1 a 1941 120.688 512 4551.2 255.2
      1 a 1942 133.084 448 3244.1 303.7
      1 a 1943 125.408 499.6 4053.7 264.1
      1 a 1944 116.107 547.5 4379.3 201.6
      1 a 1945 130.081 561.2 4840.9 265
      1 a 1946 189.161 688.1 4900.9 402.2
      1 a 1947 268.89 568.9 3526.5 761.5
      1 a 1948 303.893 529.2 3254.7 922.4
      1 a 1949 329.043 555.1 3700.2 1020.1
      1 a 1950 365.774 642.9 3755.6 1099
      1 a 1951 404.775 755.9 4833 1207.7
      1 a 1952 486.616 891.2 4924.9 1430.5
      1 a 1953 642.788 1304.4 6241.7 1777.3
      1 a 1954 797.979 1486.7 5593.6 2226.3
      2 a 1935 41.806 209.9 1362.4 53.8
      2 a 1936 65.614 355.3 1807.1 50.5
      2 a 1937 96.742 469.9 2676.3 118.1
      2 a 1938 99.491 262.3 1801.9 260.2
      2 a 1939 104.682 230.4 1957.3 312.7
      2 a 1940 113.841 361.6 2202.9 254.2
      2 a 1941 136.105 472.8 2380.5 261.4
      2 a 1942 142.109 445.6 2168.6 298.7
      2 a 1943 127.919 361.6 1985.1 301.8
      2 a 1944 109.276 288.2 1813.9 279.1
      2 a 1945 86.688 258.7 1850.2 213.8
      2 a 1946 96.533 420.3 2067.7 132.6
      2 a 1947 132.333 420.5 1796.7 264.8
      2 a 1948 159.367 494.5 1625.8 306.9
      2 a 1949 152.125 405.1 1667 351.1
      2 a 1950 156.436 418.8 1677.4 357.8
      2 a 1951 180.27 588.2 2289.5 342.1
      2 a 1952 218.556 645.5 2159.4 444.2
      2 a 1953 263.787 641 2031.3 623.6
      2 a 1954 238.13 459.3 2115.5 669.7
      3 b 1935 19.364 33.1 1170.6 97.8
      3 b 1936 14.942 45 2015.8 104.4
      3 b 1937 16.907 77.2 2803.3 118
      3 b 1938 27.573 44.6 2039.7 156.2
      3 b 1939 30.208 48.1 2256.2 172.6
      3 b 1940 40.208 74.4 2132.2 186.6
      3 b 1941 59.484 113 1834.1 220.9
      3 b 1942 74.45 91.9 1588 287.8
      3 b 1943 74.741 61.3 1749.4 319.9
      3 b 1944 74.813 56.8 1687.2 321.3
      3 b 1945 78.543 93.6 2007.7 319.6
      3 b 1946 96.397 159.9 2208.3 346
      3 b 1947 126.973 147.2 1656.7 456.4
      3 b 1948 149.066 146.3 1604.4 543.4
      3 b 1949 159.917 98.3 1431.8 618.3
      3 b 1950 164.445 93.5 1610.5 647.4
      3 b 1951 176.671 135.2 1819.4 671.3
      3 b 1952 192.188 157.3 2079.7 726.1
      3 b 1953 212.259 179.5 2371.6 800.3
      3 b 1954 232.546 189.6 2759.9 888.9
      4 b 1935 6.508 40.29 417.5 10.5
      4 b 1936 8.724 72.76 837.8 10.2
      4 b 1937 13.088 66.26 883.9 34.7
      4 b 1938 18.891 51.6 437.9 51.8
      4 b 1939 19.76 52.41 679.7 64.3
      4 b 1940 23.379 69.41 727.8 67.1
      4 b 1941 26.034 68.35 643.6 75.2
      4 b 1942 23.101 46.8 410.9 71.4
      4 b 1943 20.371 47.4 588.4 67.1
      4 b 1944 20.055 59.57 698.4 60.5
      4 b 1945 22.942 88.78 846.4 54.6
      4 b 1946 27.086 74.12 893.8 84.8
      4 b 1947 30.946 62.68 579 96.8
      4 b 1948 38.476 89.36 694.6 110.2
      4 b 1949 46.743 78.98 590.3 147.4
      4 b 1950 53.997 100.66 693.5 163.2
      4 b 1951 74.909 160.62 809 203.5
      4 b 1952 94.38 145 727 290.6
      4 b 1953 111.496 174.93 1001.5 346.1
      4 b 1954 131.191 172.49 703.2 414.9
      5 b 1935 52.159 39.68 157.7 183.2
      5 b 1936 59.467 50.73 167.9 204
      5 b 1937 71.919 74.24 192.9 236
      5 b 1938 82.06 53.51 156.7 291.7
      5 b 1939 87.391 42.65 191.4 323.1
      5 b 1940 93.441 46.48 185.5 344
      5 b 1941 102.209 61.4 199.6 367.7
      5 b 1942 107.839 39.67 189.5 407.2
      5 b 1943 117.586 62.24 151.2 426.6
      5 b 1944 126.087 52.32 187.7 470
      5 b 1945 135.295 63.21 214.7 499.2
      5 b 1946 143.195 59.37 232.9 534.6
      5 b 1947 150.764 58.02 249 566.6
      5 b 1948 160.648 70.34 224.5 595.3
      5 b 1949 168.961 67.42 237.3 631.4
      5 b 1950 174.322 55.74 240.1 662.3
      5 b 1951 183.762 80.3 327.3 683.9
      5 b 1952 195.811 85.4 359.4 729.3
      5 b 1953 207.971 91.9 398.4 774.3
      5 b 1954 213.854 81.43 365.7 804.9
      6 c 1935 3.727 20.36 197 6.5
      6 c 1936 7.043 25.98 210.3 15.8
      6 c 1937 9.882 25.94 223.1 27.7
      6 c 1938 13.139 27.53 216.7 39.2
      6 c 1939 14.206 24.6 286.4 48.6
      6 c 1940 15.853 28.54 298 52.5
      6 c 1941 21.288 43.41 276.9 61.5
      6 c 1942 25.961 42.81 272.6 80.5
      6 c 1943 26.294 27.84 287.4 94.4
      6 c 1944 26.367 32.6 330.3 92.6
      6 c 1945 27.637 39.03 324.4 92.3
      6 c 1946 29.565 50.17 401.9 94.2
      6 c 1947 34.146 51.85 407.4 111.4
      6 c 1948 40.564 64.03 409.2 127.4
      6 c 1949 46.135 68.16 482.2 149.3
      6 c 1950 49.83 77.34 673.8 164.4
      6 c 1951 56.591 95.3 676.9 177.2
      6 c 1952 62.878 99.49 702 200
      6 c 1953 70.444 127.52 793.5 211.5
      6 c 1954 77.546 135.72 927.3 238.7
      7 c 1935 28.556 24.43 138 100.2
      7 c 1936 33.891 23.21 200.1 125
      7 c 1937 40.055 32.78 210.1 142.4
      7 c 1938 46.171 32.54 161.2 165.1
      7 c 1939 52.413 26.65 161.7 194.8
      7 c 1940 61.016 33.71 145.1 222.9
      7 c 1941 70.619 43.5 110.6 252.1
      7 c 1942 74.986 34.46 98.1 276.3
      7 c 1943 82.843 44.28 108.8 300.3
      7 c 1944 92.528 70.8 118.2 318.2
      7 c 1945 91.609 44.12 126.5 336.2
      7 c 1946 96.029 48.98 156.7 351.2
      7 c 1947 101.908 48.51 119.4 373.6
      7 c 1948 106.059 50 129.1 389.4
      7 c 1949 110.445 50.59 134.8 406.7
      7 c 1950 114.473 42.53 140.8 429.5
      7 c 1951 123.814 64.77 179 450.6
      7 c 1952 129.48 72.68 178.1 466.9
      7 c 1953 134.454 73.86 186.8 486.2
      7 c 1954 143.8 89.51 192.7 511.3
      8 c 1935 1.121 12.93 191.5 1.8
      8 c 1936 0.22 25.9 516 0.8
      8 c 1937 1.57 35.05 729 7.4
      8 c 1938 3.499 22.89 560.4 18.1
      8 c 1939 4.444 18.84 519.9 23.5
      8 c 1940 6.054 28.57 628.5 26.5
      8 c 1941 13.381 48.51 537.1 36.2
      8 c 1942 18.256 43.34 561.2 60.8
      8 c 1943 22.332 37.02 617.2 84.4
      8 c 1944 24.095 37.81 626.7 91.2
      8 c 1945 23.582 39.27 737.2 92.4
      8 c 1946 24.587 53.46 760.5 86
      8 c 1947 33.073 55.56 581.4 111.1
      8 c 1948 35.939 49.56 662.3 130.6
      8 c 1949 36.02 32.04 583.8 141.8
      8 c 1950 34.271 32.24 635.2 136.7
      8 c 1951 36.063 54.38 723.8 129.7
      8 c 1952 42.09 71.78 864.1 145.5
      8 c 1953 49.781 90.08 1193.5 174.8
      8 c 1954 55.206 68.6 1188.9 213.5
      9 c 1935 42.92 26.63 290.6 162
      9 c 1936 45.267 23.39 291.1 174
      9 c 1937 48.53 30.65 335 183
      9 c 1938 51.218 20.89 246 198
      9 c 1939 54.194 28.78 356.2 208
      9 c 1940 58.238 26.93 289.8 223
      9 c 1941 62.234 32.08 268.2 234
      9 c 1942 66.309 32.21 213.3 248
      9 c 1943 72.156 35.69 348.2 274
      9 c 1944 79.252 62.47 374.2 282
      9 c 1945 85.592 52.32 387.2 316
      9 c 1946 83.416 56.95 347.4 302
      9 c 1947 91.195 54.32 291.9 333
      9 c 1948 94.884 40.53 297.2 359
      9 c 1949 96.239 32.54 276.9 370
      9 c 1950 99.95 43.48 274.6 376
      9 c 1951 105.649 56.49 339.9 391
      9 c 1952 111.948 65.98 474.8 414
      9 c 1953 119.012 66.11 496 443
      9 c 1954 122.123 49.34 474.5 468
      10 c 1935 0.9239 2.54 70.91 4.5
      10 c 1936 0.6981 2 87.94 4.71
      10 c 1937 0.7585 2.19 82.2 4.57
      10 c 1938 0.9508 1.99 58.72 4.56
      10 c 1939 0.6956 2.03 80.54 4.38
      10 c 1940 0.5498 1.81 86.47 4.21
      10 c 1941 0.6812 2.14 77.68 4.12
      10 c 1942 0.7079 1.86 62.16 3.83
      10 c 1943 0.4586 0.93 62.24 3.58
      10 c 1944 0.4703 1.18 61.82 3.41
      10 c 1945 0.441 1.36 65.85 3.31
      10 c 1946 0.5601 2.24 69.54 3.23
      10 c 1947 1.0873 3.81 64.97 3.9
      10 c 1948 1.797 5.66 68 5.38
      10 c 1949 1.9771 4.21 71.24 7.39
      10 c 1950 2.1785 3.42 69.05 8.74
      10 c 1951 2.3711 4.67 83.04 9.07
      10 c 1952 2.9383 6 74.42 9.93
      10 c 1953 3.5909 6.53 63.51 11.68
      10 c 1954 4.0253 5.12 58.12 14.33
      Last edited by saikat sarkar; 06 Jan 2024, 12:57.



      • #18
        You have missed the whole point of -runby-, which is that you don't loop over the values of industry (nor year). The code that you tried to model your solution on, from earlier in this thread, does not loop over industry; it loops over ISIN3. And the -drop- command leaves the data set empty, so the regression command gives you an error. And the problem that was being solved there was materially different from yours.

        I think what you want is this:
        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input byte firm str2 industry int year float(y x1 x2 x3)
        1 "a " 1935  33.435  317.6 3078.5    2.8
        1 "a " 1936  44.893  391.8 4661.7   52.6
        1 "a " 1937  67.474  410.6 5387.1  156.9
        1 "a " 1938  75.918  257.7 2792.2  209.2
        1 "a " 1939  73.878  330.8 4313.2  203.4
        1 "a " 1940  97.601  461.2 4643.9  207.2
        1 "a " 1941 120.688    512 4551.2  255.2
        1 "a " 1942 133.084    448 3244.1  303.7
        1 "a " 1943 125.408  499.6 4053.7  264.1
        1 "a " 1944 116.107  547.5 4379.3  201.6
        1 "a " 1945 130.081  561.2 4840.9    265
        1 "a " 1946 189.161  688.1 4900.9  402.2
        1 "a " 1947  268.89  568.9 3526.5  761.5
        1 "a " 1948 303.893  529.2 3254.7  922.4
        1 "a " 1949 329.043  555.1 3700.2 1020.1
        1 "a " 1950 365.774  642.9 3755.6   1099
        1 "a " 1951 404.775  755.9   4833 1207.7
        1 "a " 1952 486.616  891.2 4924.9 1430.5
        1 "a " 1953 642.788 1304.4 6241.7 1777.3
        1 "a " 1954 797.979 1486.7 5593.6 2226.3
        2 "a " 1935  41.806  209.9 1362.4   53.8
        2 "a " 1936  65.614  355.3 1807.1   50.5
        2 "a " 1937  96.742  469.9 2676.3  118.1
        2 "a " 1938  99.491  262.3 1801.9  260.2
        2 "a " 1939 104.682  230.4 1957.3  312.7
        2 "a " 1940 113.841  361.6 2202.9  254.2
        2 "a " 1941 136.105  472.8 2380.5  261.4
        2 "a " 1942 142.109  445.6 2168.6  298.7
        2 "a " 1943 127.919  361.6 1985.1  301.8
        2 "a " 1944 109.276  288.2 1813.9  279.1
        2 "a " 1945  86.688  258.7 1850.2  213.8
        2 "a " 1946  96.533  420.3 2067.7  132.6
        2 "a " 1947 132.333  420.5 1796.7  264.8
        2 "a " 1948 159.367  494.5 1625.8  306.9
        2 "a " 1949 152.125  405.1   1667  351.1
        2 "a " 1950 156.436  418.8 1677.4  357.8
        2 "a " 1951  180.27  588.2 2289.5  342.1
        2 "a " 1952 218.556  645.5 2159.4  444.2
        2 "a " 1953 263.787    641 2031.3  623.6
        2 "a " 1954  238.13  459.3 2115.5  669.7
        3 "b " 1935  19.364   33.1 1170.6   97.8
        3 "b " 1936  14.942     45 2015.8  104.4
        3 "b " 1937  16.907   77.2 2803.3    118
        3 "b " 1938  27.573   44.6 2039.7  156.2
        3 "b " 1939  30.208   48.1 2256.2  172.6
        3 "b " 1940  40.208   74.4 2132.2  186.6
        3 "b " 1941  59.484    113 1834.1  220.9
        3 "b " 1942   74.45   91.9   1588  287.8
        3 "b " 1943  74.741   61.3 1749.4  319.9
        3 "b " 1944  74.813   56.8 1687.2  321.3
        3 "b " 1945  78.543   93.6 2007.7  319.6
        3 "b " 1946  96.397  159.9 2208.3    346
        3 "b " 1947 126.973  147.2 1656.7  456.4
        3 "b " 1948 149.066  146.3 1604.4  543.4
        3 "b " 1949 159.917   98.3 1431.8  618.3
        3 "b " 1950 164.445   93.5 1610.5  647.4
        3 "b " 1951 176.671  135.2 1819.4  671.3
        3 "b " 1952 192.188  157.3 2079.7  726.1
        3 "b " 1953 212.259  179.5 2371.6  800.3
        3 "b " 1954 232.546  189.6 2759.9  888.9
        4 "b " 1935   6.508  40.29  417.5   10.5
        4 "b " 1936   8.724  72.76  837.8   10.2
        4 "b " 1937  13.088  66.26  883.9   34.7
        4 "b " 1938  18.891   51.6  437.9   51.8
        4 "b " 1939   19.76  52.41  679.7   64.3
        4 "b " 1940  23.379  69.41  727.8   67.1
        4 "b " 1941  26.034  68.35  643.6   75.2
        4 "b " 1942  23.101   46.8  410.9   71.4
        4 "b " 1943  20.371   47.4  588.4   67.1
        4 "b " 1944  20.055  59.57  698.4   60.5
        4 "b " 1945  22.942  88.78  846.4   54.6
        4 "b " 1946  27.086  74.12  893.8   84.8
        4 "b " 1947  30.946  62.68    579   96.8
        4 "b " 1948  38.476  89.36  694.6  110.2
        4 "b " 1949  46.743  78.98  590.3  147.4
        4 "b " 1950  53.997 100.66  693.5  163.2
        4 "b " 1951  74.909 160.62    809  203.5
        4 "b " 1952   94.38    145    727  290.6
        4 "b " 1953 111.496 174.93 1001.5  346.1
        4 "b " 1954 131.191 172.49  703.2  414.9
        5 "b " 1935  52.159  39.68  157.7  183.2
        5 "b " 1936  59.467  50.73  167.9    204
        5 "b " 1937  71.919  74.24  192.9    236
        5 "b " 1938   82.06  53.51  156.7  291.7
        5 "b " 1939  87.391  42.65  191.4  323.1
        5 "b " 1940  93.441  46.48  185.5    344
        5 "b " 1941 102.209   61.4  199.6  367.7
        5 "b " 1942 107.839  39.67  189.5  407.2
        5 "b " 1943 117.586  62.24  151.2  426.6
        5 "b " 1944 126.087  52.32  187.7    470
        5 "b " 1945 135.295  63.21  214.7  499.2
        5 "b " 1946 143.195  59.37  232.9  534.6
        5 "b " 1947 150.764  58.02    249  566.6
        5 "b " 1948 160.648  70.34  224.5  595.3
        5 "b " 1949 168.961  67.42  237.3  631.4
        5 "b " 1950 174.322  55.74  240.1  662.3
        5 "b " 1951 183.762   80.3  327.3  683.9
        5 "b " 1952 195.811   85.4  359.4  729.3
        5 "b " 1953 207.971   91.9  398.4  774.3
        5 "b " 1954 213.854  81.43  365.7  804.9
        end
        
        capture program drop myregress
        program define myregress
            regress y x1 x2 x3
            gen b_x1 = _b[x1]
            gen b_x2 = _b[x2]
            gen b_x3 = _b[x3]
            gen b_cons = _b[_cons]
            gen se_x1 = _se[x1]
            gen se_x2 = _se[x2]
            gen se_x3 = _se[x3]
            gen r2 = e(r2)
            gen n_obs = e(N)
            predict r, resid
            exit
        end
        
        runby myregress, by(industry year) status
        In the future, when showing data examples, please use the -dataex- command to do so, as I have here. If you are running version 18, 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
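
        As a minimal sketch of what that looks like for the example above (assuming the variables are named firm, industry, year, y, x1, x2 and x3, and that the first 20 observations are representative enough to show), the call would be something like:

        ```stata
        * List the first 20 observations of the regression variables
        * in a form that can be pasted directly into a forum post
        dataex firm industry year y x1 x2 x3 in 1/20
        ```

        The output in the Results window can then be copied into a post as is; anyone who runs it gets the same variables with the same storage types.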



        • #19
          Thank you for the quick and elaborate response. I do not know how to use the -dataex- command, but I will try to learn the procedure. Thank you very much for providing the code; it is working fine. Again, thanks a lot.
          Last edited by saikat sarkar; 06 Jan 2024, 16:40.



          • #20
            Dear Clyde Schechter,

            I apologize for reaching out once again.

            The code you provided works correctly with the example data. However, my original dataset is unbalanced and has some missing values, and when I applied the code to it, -runby- reported errors for many of the by-groups. Could you kindly provide guidance on resolving them?

            Thank you very much in advance.

            Regards,

            gen inc_B_Item = ib
            gen ocf = oibdp-txt-xint
            gen t_accruals = inc_B_Item-ocf
            gen ppegt1 = ppegt
            gen size1 = at
            gen lsize1 = l1.at
            gen sales1 = sale
            gen lsales1 = l1.sale

            gen y1 = t_accruals/lsize1
            gen x1 = 1/lsize1
            gen x2 = ppegt1/lsize1
            gen x3 = (sales1-lsales1)/lsize1

            summarize opt0 bubble1 crash1 bubble_duration1 crash_duration1 con_bubble con_crash size lv1 volum1 roa ret_1 std mb2 sensi kldge age1 maturity1 ///
            inc_B_Item ocf t_accruals ppegt size1 lsize1 sales1 lsales1 y1 x1 x2 x3 sic


            capture program drop myregress
            program define myregress
            regress y x1 x2 x3, noconstant
            gen b_x1 = _b[x1]
            gen b_x2 = _b[x2]
            gen b_x3 = _b[x3]
            //gen b_cons = _b[_cons]
            gen se_x1 = _se[x1]
            gen se_x2 = _se[x2]
            gen se_x3 = _se[x3]
            gen r2 = e(r2)
            gen n_obs = e(N)
            predict r, resid
            exit
            end

            runby myregress, by(sic fyear) status

            runby myregress, by(sic fyear) status
             elapsed  ----- by-groups ------  - observations --       time
                time  count  errors  no-data  processed   saved  remaining
            00:00:01    117      58        0        721     653   00:00:28
            00:00:02    273     106        0      1,405   1,264   00:00:28
            00:00:03    555     250        0      1,981   1,642   00:00:29
            00:00:04    875     439        0      2,648   2,079   00:00:28
            (now reporting every 5 seconds)
            00:00:09  2,289   1,168        0      6,581   5,104   00:00:20
            00:00:14  3,565   1,766        0     10,836   8,623   00:00:13
            00:00:19  4,783   2,301        0     14,471  11,529   00:00:09
            00:00:24  6,203   3,032        0     18,875  14,948   00:00:03
            00:00:27  6,872   3,354        0     21,253  16,923   00:00:00

            Number of by-groups    =     6,872
            by-groups with errors  =     3,354
            by-groups with no data =         0
            Observations processed =    21,253
            Observations saved     =    16,923
            .
            end of do-file
            Variable         Obs       Mean  Std. Dev.        Min       Max
            inc_B_Item    16,918   626.6237   2816.918   -56121.9     94680
            ocf           15,947    1130.26   3785.853     -16941    103061
            t_accruals    15,947  -474.7509    1722.73     -51559     22368
            ppegt         16,870   5818.044   23851.88          0    511400
            size1         16,919   10642.52    37219.8     10.842    797769
            lsize1        14,869   10049.07   36412.29     10.842    797769
            sales1        16,908   8253.417   27036.87          0    556933
            lsales1       14,859   7899.217   25961.86          0    521426
            y             14,070  -.0540334    .103519  -8.855349  3.586056
            x1            14,869   .0017065   .0041731   1.25e-06  .0922339
            x2            14,826   .5634085   .4523919   .0020807  6.629733
            x3            14,849   .0957242   .2843401  -12.86199  9.802354
            sic           16,923   4329.884   1956.546       1040      9997
            Last edited by saikat sarkar; 06 Jan 2024, 22:13.



            • #21
              Well, the first step in troubleshooting this is to identify just what the errors are. My best guess, but it is only a guess, is that due to the missing values, there are some combinations of sic and fyear for which there are insufficient observations to carry out the regression. But really we need to know if that is the problem or not.
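
              One way to gauge how plausible that guess is, before rerunning anything, is to count the usable observations in each sic-fyear combination. The following is only a sketch: it assumes it is run on the original data (before -runby- has replaced the data in memory), that y, x1, x2 and x3 are the regression variables, and the names complete, n_complete and tag are made up for illustration.

              ```stata
              * Flag observations that -regress- could use (no missing values)
              gen byte complete = !missing(y, x1, x2, x3)

              * Count usable observations within each sic-fyear combination
              bysort sic fyear: egen n_complete = total(complete)

              * Tabulate one record per combination to see how many are thin
              egen byte tag = tag(sic fyear)
              tabulate n_complete if tag
              ```

              If many combinations show zero or only a handful of usable observations, that would be consistent with the errors -runby- reported, though it does not by itself prove that this is the only cause.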

              It is easy enough to identify the combinations of sic and fyear where the errors are arising: they will have missing values for all of the coefficients and standard errors. So, with the results that you have for your code, you can do this:
              Code:
              egen int check = rownonmiss(b_*)
              keep if check == 0
              drop b_* se_* r2 n_obs r
              You have now reduced the data set to exactly the data going into program myregress in those combinations of sic and fyear for which there were errors. Next, modify the code in program myregress by putting everything in a -quietly- block. Now rerun the -runby- command, but this time add the -verbose- option. This will show the error messages as -runby- runs.
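
              As a rough sketch of that diagnostic version (assuming the data in memory are the sic-fyear combinations you want to examine, and that the program is otherwise unchanged from the one posted in #20), it might look like this:

              ```stata
              capture program drop myregress
              program define myregress
                  * Suppress ordinary output so that only error messages show
                  quietly {
                      regress y x1 x2 x3, noconstant
                      gen b_x1 = _b[x1]
                      gen b_x2 = _b[x2]
                      gen b_x3 = _b[x3]
                      gen se_x1 = _se[x1]
                      gen se_x2 = _se[x2]
                      gen se_x3 = _se[x3]
                      gen r2 = e(r2)
                      gen n_obs = e(N)
                      predict r, resid
                  }
              end

              * -verbose- makes -runby- display the error for each failing by-group
              runby myregress, by(sic fyear) verbose
              ```

              The -quietly- block keeps the regression tables out of the log, so the error messages are not buried in ordinary output.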

              If my guess is correct about the source of the problem, all of the messages will be "r(2000) no observations" or "r(2001) insufficient observations." There isn't actually anything you can do about these other than getting a better data set that doesn't have those missing values and gaps in the data. Also, in this situation, the good news is that there is no reason to worry about the results that you got originally for those sic-fyear combinations that didn't provoke these errors. So you can just move on. If you are concerned, however, to have completely clean output, you can modify program myregress as follows:
              Code:
              capture program drop myregress
              program define myregress
                  capture regress y x1 x2 x3, noconstant
                  if c(rc) == 0 { // SUCCESSFUL REGRESSION
                      gen b_x1 = _b[x1]
                      gen b_x2 = _b[x2]
                      gen b_x3 = _b[x3]
                      //gen b_cons = _b[_cons]
                      gen se_x1 = _se[x1]
                      gen se_x2 = _se[x2]
                      gen se_x3 = _se[x3]
                      gen r2 = e(r2)
                      gen n_obs = e(N)
                      predict r, resid
                  }
                  else if !inlist(c(rc), 2000, 2001) {    // SOME UNANTICIPATED ERROR
                      gen message = "Unexpected error `c(rc)' in regression:"
                  }
                  exit
              end
              With this code, sic-fyear combinations which lack sufficient (or any) observations for the regression will simply be skipped over. They will generate no output in the results at all, and -runby- will not count them as errors. In the event that some other problem arises during the regression, you will not get any results for that sic-fyear combination, but instead there will be a string variable called message that tells you what went wrong. So following -runby-ing this code you should run:
              Code:
              capture confirm var message, exact
              if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                  list sic fyear message if !missing(message)
              }
              to see which sic-fyear combinations are having which problems.
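
              If you want to see how the capture / c(rc) logic behaves before trying it on real data, here is a small stand-alone sketch. It uses made-up toy data in which x1 is entirely missing, so -regress- should stop with a "no observations" error. The variable names and the seed are only illustrative; nothing here comes from your data set.

              ```stata
              * Toy data (hypothetical): x1 is entirely missing,
              * so -regress- has no usable observations
              clear
              set obs 5
              set seed 12345
              gen y  = rnormal()
              gen x1 = .

              * -capture- suppresses the error and stores the return code in c(rc)
              capture regress y x1, noconstant
              display "return code: " c(rc)

              * Same three-way branching as in program myregress
              if c(rc) == 0 {
                  display "regression succeeded"
              }
              else if inlist(c(rc), 2000, 2001) {
                  display "skipped: no or insufficient observations"
              }
              else {
                  display "unexpected error " c(rc)
              }
              ```

              Because the error is captured, the do-file keeps running past the failed regression, which is the same reason -runby- no longer counts such groups as errors.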

              Comment


              • #22
                Originally posted by Clyde Schechter View Post
                Well, the first step in troubleshooting this is to identify just what the errors are. My best guess, but it is only a guess, is that due to the missing values, there are some combinations of sic and fyear for which there are insufficient observations to carry out the regression. But really we need to know if that is the problem or not.

                It is easy enough to identify the combinations of sic and fyear where the errors are arising: they will have missing values for all of the coefficients and standard errors. So, with the results that you have for your code, you can do this:
                Code:
                egen int check = rownonmiss(b_*)
                keep if check == 0
                drop b_* se_* r2 n_obs r
                You have now reduced the data set to exactly the data going into program myregress in those combinations of sic and fyear for which there were errors. Next, modify the code in program myregress by putting everything in a -quietly- block. Now rerun the -runby- command, but this time add the -verbose- option. This will show the error messages as -runby- runs.

                If my guess is correct about the source of the problem, all of the messages will be "r(2000) no observations" or "r(2001) insufficient observations." There isn't actually anything you can do about these other than getting a better data set that doesn't have those missing values and gaps in the data. The good news is that, in this situation, there is no reason to worry about the results that you got originally for those sic-fyear combinations that didn't provoke these errors, so you can just move on. If, however, you want completely clean output, you can modify program myregress as follows:
                Code:
                capture program drop myregress
                program define myregress
                    capture regress y x1 x2 x3, noconstant
                    if c(rc) == 0 { // SUCCESSFUL REGRESSION
                        gen b_x1 = _b[x1]
                        gen b_x2 = _b[x2]
                        gen b_x3 = _b[x3]
                        //gen b_cons = _b[_cons]
                        gen se_x1 = _se[x1]
                        gen se_x2 = _se[x2]
                        gen se_x3 = _se[x3]
                        gen r2 = e(r2)
                        gen n_obs = e(N)
                        predict r, resid
                    }
                    else if !inlist(c(rc), 2000, 2001) { // SOME UNANTICIPATED ERROR
                        gen message = "Unexpected error `c(rc)' in regression:"
                    }
                    exit
                end
                With this code, sic-fyear combinations which lack sufficient (or any) observations for the regression will simply be skipped over. They will generate no output in the results at all, and -runby- will not count them as errors. In the event that some other problem arises during the regression, you will not get any results for that sic-fyear combination, but instead there will be a string variable called message that tells you what went wrong. So after running this code with -runby-, you should run:
                Code:
                capture confirm var message, exact
                if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                    list sic fyear message if !missing(message)
                }
                to see which sic-fyear combinations are having which problems.
                Hello Clyde Schechter,

                Thank you very much for your guidance and help. I will follow your advice. Thanks again.

                Again thank you for your time and help.

                The first code
                ------------------
                egen int check = rownonmiss(b_*)
                keep if check == 0
                drop b_* se_* r2 n_obs r

                This generates 3 in all rows, including those with missing values. If I run this code, it completely deletes all the observations.

                Second code
                --------------------
                capture program drop myregress
                program define myregress
                    capture regress y x1 x2 x3, noconstant
                    if c(rc) == 0 { // SUCCESSFUL REGRESSION
                        gen b_x1 = _b[x1]
                        gen b_x2 = _b[x2]
                        gen b_x3 = _b[x3]
                        gen se_x1 = _se[x1]
                        gen se_x2 = _se[x2]
                        gen se_x3 = _se[x3]
                        gen r2 = e(r2)
                        gen n_obs = e(N)
                        predict r, resid
                    }
                    else if !inlist(c(rc), 2000, 2001) { // SOME UNANTICIPATED ERROR
                        gen message = "Unexpected error `c(rc)' in regression:"
                    }
                    exit
                end

                Unfortunately, it is not running. Perhaps my Stata is too old; I am using Stata 15.

                Third code
                --------------------
                I guess that it is related to the second step.

                capture confirm var message, exact
                if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                    list sic fyear message if !missing(message)
                }

                Any advice please.

                Thanks in advance.
                Last edited by saikat sarkar; 07 Jan 2024, 10:53.

                Comment


                • #23
                  Hello Clyde Schechter,

                  Thank you very much for your guidance and help. I will follow your advice. Thanks again.


                  The first code
                  -------------------

                  Code:
                  egen int check = rownonmiss(b_*)
                  keep if check == 0
                  drop b_* se_* r2 n_obs r
                  This generates 3 in all rows, including those with missing values. If I run this code, it completely deletes all the observations.

                  Second code
                  --------------------
                  Code:
                  capture program drop myregress
                  program define myregress
                      capture regress y x1 x2 x3, noconstant
                      if c(rc) == 0 { // SUCCESSFUL REGRESSION
                          gen b_x1 = _b[x1]
                          gen b_x2 = _b[x2]
                          gen b_x3 = _b[x3]
                          gen se_x1 = _se[x1]
                          gen se_x2 = _se[x2]
                          gen se_x3 = _se[x3]
                          gen r2 = e(r2)
                          gen n_obs = e(N)
                          predict r, resid
                      }
                      else if !inlist(c(rc), 2000, 2001) { // SOME UNANTICIPATED ERROR
                          gen message = "Unexpected error `c(rc)' in regression:"
                      }
                      exit
                  end
                  Unfortunately, it is not running. Perhaps my Stata is too old; I am using Stata 15.

                  Third code
                  --------------------

                  Code:
                  capture confirm var message, exact
                  if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                      list sic fyear message if !missing(message)
                  }
                  I guess that it is related to the second step. Any advice, please? Thanks in advance.

                  Comment


                  • #24
                    The first code
                    -------------------

                    Code:

                    egen int check = rownonmiss(b_*)
                    keep if check == 0
                    drop b_* se_* r2 n_obs r
                    This generates 3 in all rows, including those with missing values. If I run this code, it completely deletes all the observations.
                    Sorry, my mistake. Since the original program, after -runby- would drop all the sic-fyear combinations that led to an error, only the combinations that ran without a problem would be there. So this approach was misguided.
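
                    A more workable way to find the combinations that dropped out is to record every sic-fyear pair before -runby- and compare that list against what remains afterwards. The following is only a sketch, assuming that -runby- (from SSC) is installed, that program myregress is already defined, and that the group variables are named sic and fyear; n_in_group and all_groups are names made up for this illustration.

                    ```stata
                    * Before -runby-: store one row per sic-fyear combination, with its size
                    preserve
                    contract sic fyear, freq(n_in_group)
                    tempfile all_groups
                    save `all_groups'
                    restore

                    runby myregress, by(sic fyear) status

                    * After -runby-: combinations found only in the stored list were dropped
                    preserve
                    contract sic fyear
                    merge 1:1 sic fyear using `all_groups'
                    list sic fyear n_in_group if _merge == 2
                    restore
                    ```

                    Run this as a single do-file so that the temporary file is still available when the merge is reached. The n_in_group column shows how many observations each dropped combination had going in, which helps to judge whether insufficient observations is a plausible explanation.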

                    Second code... is not running
                    What exactly is happening? By itself, this code will not produce any output--it just defines a program for Stata and, until you call on the program with -runby- it does nothing. There is, by the way, nothing in this code that won't work with Stata 15.

                    Go back to your data set, create the variables y, x1, x2, and x3, and run
                    Code:
                    capture program drop myregress
                    program define myregress
                        capture regress y x1 x2 x3, noconstant
                        if c(rc) == 0 { // SUCCESSFUL REGRESSION
                            gen b_x1 = _b[x1]
                            gen b_x2 = _b[x2]
                            gen b_x3 = _b[x3]
                            //gen b_cons = _b[_cons]
                            gen se_x1 = _se[x1]
                            gen se_x2 = _se[x2]
                            gen se_x3 = _se[x3]
                            gen r2 = e(r2)
                            gen n_obs = e(N)
                            predict r, resid
                        }
                        else if !inlist(c(rc), 2000, 2001) { // SOME UNANTICIPATED ERROR
                            gen message = "Unexpected error `c(rc)' in regression:"
                        }
                        exit
                    end
                    
                    runby myregress, by(sic fyear) status
                    
                    capture confirm var message, exact
                    if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                        list sic fyear message if !missing(message)
                    }
                    This will do the regressions where they are possible and skip over the ones that fail due to absent or insufficient observations. If there are regressions that fail for other reasons, it will create and show you a variable called message that tells you what happened in those cases.

                    If this does not work as I describe, when posting back show what output and messages you get from the Results window and show examples of what is in the data in active memory after the code runs. Just saying it doesn't work won't help find the problem.




                    Comment


                    • #25
                      Originally posted by Clyde Schechter View Post
                      Sorry, my mistake. Since the original program, after -runby- would drop all the sic-fyear combinations that led to an error, only the combinations that ran without a problem would be there. So this approach was misguided.


                      What exactly is happening? By itself, this code will not produce any output--it just defines a program for Stata and, until you call on the program with -runby- it does nothing. There is, by the way, nothing in this code that won't work with Stata 15.

                      Go back to your data set, create the variables y, x1, x2, and x3, and run
                      Code:
                      capture program drop myregress
                      program define myregress
                          capture regress y x1 x2 x3, noconstant
                          if c(rc) == 0 { // SUCCESSFUL REGRESSION
                              gen b_x1 = _b[x1]
                              gen b_x2 = _b[x2]
                              gen b_x3 = _b[x3]
                              //gen b_cons = _b[_cons]
                              gen se_x1 = _se[x1]
                              gen se_x2 = _se[x2]
                              gen se_x3 = _se[x3]
                              gen r2 = e(r2)
                              gen n_obs = e(N)
                              predict r, resid
                          }
                          else if !inlist(c(rc), 2000, 2001) { // SOME UNANTICIPATED ERROR
                              gen message = "Unexpected error `c(rc)' in regression:"
                          }
                          exit
                      end
                      
                      runby myregress, by(sic fyear) status
                      
                      capture confirm var message, exact
                      if c(rc) == 0 { // THERE WAS AN UNEXPECTED PROBLEM
                          list sic fyear message if !missing(message)
                      }
                      This will do the regressions where they are possible and skip over the ones that fail due to absent or insufficient observations. If there are regressions that fail for other reasons, it will create and show you a variable called message that tells you what happened in those cases.

                      If this does not work as I describe, when posting back show what output and messages you get from the Results window and show examples of what is in the data in active memory after the code runs. Just saying it doesn't work won't help find the problem.



                      Thanks a lot, Clyde Schechter. Your program is working nicely. Again, thank you very much.

                      Comment
