  • #31
    The code runs, but the result is not what I expected.
    In the example, the regression for cik "1338749" should have 6 peers (1 deleted because of too much missing data).
    But when I run the code, all peers are deleted and only the intercept is left.



    • #32
      Ah, I see the problem. I processed the predictor variables starting with those having the fewest missing values first and then working up. That's backwards. The following code should do the trick:

      Code:
      capture program drop one_regression
      program define one_regression
          keep cik peerCIK mdate spr_*
          reshape wide spr_peer, i(cik mdate) j(peerCIK)
          //    ELIMINATE VARIABLES WITH MISSING DATA
          ds spr_peer*
          local dvs `r(varlist)'
          local mcounts
          foreach v of local dvs {
              count if missing(`v')
              local mcounts `mcounts' `:display %010.0f r(N)'#`v'
          }
          local mcounts: list sort mcounts
          //    REVERSE THE ORDER OF MCOUNTS
          local backwards `mcounts'
          local mcounts
          foreach b of local backwards {
              local mcounts `b' `mcounts'
          }
          
          //    NOW WEED OUT VARIABLES WITH LARGEST AMOUNTS
          //    OF MISSING VALUES UNTIL WE ARE LEFT WITH
          //    A SUFFICIENT SAMPLE SIZE
          local mcounts: subinstr local mcounts "#" " ", all
          tokenize `mcounts'
          local dvs_c: subinstr local dvs " " ", ", all
          count if !missing(spr_focal, `dvs_c')
          while `r(N)' < 30 & "`2'" != "" {
              local dvs: subinstr local dvs "`2'" ""
              local dvs: list retokenize dvs
              macro shift 2
              local dvs_c: subinstr local dvs " " ", ", all
              if `"`dvs_c'"' == "" {
                  continue, break
              }
              count if !missing(spr_focal, `dvs_c')
          }
          regress spr_focal `dvs'
          matrix M = r(table)
          local peer_ciks: colvarlist M
          local peer_ciks: subinstr local peer_ciks "spr_peer" "", all
          local i = 1
          foreach p of local peer_ciks {
              if substr("`p'", 1, 2) != "o." {
                  gen b_peer`i' = M[1, `i']
                  gen se_peer`i' = M[2, `i']
                  gen t_peer`i' = M[3, `i']
                  gen p_peer`i' = M[4, `i']
                  gen lb_peer`i' = M[5, `i']
                  gen ub_peer`i' = M[6, `i']
              }
              local ++i
          }
          gen rsq = e(r2)
          gen n_obs = e(N)
          drop spr_peer*
          exit
      end
      
      //    CONVERT CIK AND PEER CIK TO NUMERIC VARIABLES
      destring cik peerCIK, replace
      //    CREATE AN EMPTY MDATE RANGE FOR ALL BUT ONE OBSERVATION PER CIK-Startfiscalyear
      //    TO PREVENT UNNECESSARY REPETITIONS
      replace earliest = mdate+1 if !flag
      replace latest = mdate-1 if !flag
      
      
      rangerun one_regression, by(cik Startfiscalyear) interval(mdate earliest latest)
      Sorry for that error.

      Changes/additions shown in italics. Note that the line -local dvs: list retokenize dvs- is just a better way of doing what was there before; the original was not an error, just suboptimal. The -drop spr_peer*- line at the end of the program fixes a bug neither of us had noticed yet: the wide-form predictor variables spr_peer* were accumulating in the data set, so now they are removed. The information they contain is still present in the variable spr_peer, in long layout, from the input data, so if you need to work with it you can. This is in the spirit of your earlier request to have the b_, se_, etc. variables numbered sequentially rather than have a huge number of variables accumulating in the result data set.
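
To see why the counts are zero-padded before sorting, here is a minimal sketch on toy data. The variables x1, x2, and x3 are made up for illustration and are not part of the data in this thread. -list sort- orders the tokens as strings, so padding every count to the same width makes the string order agree with the numeric order. The loop then reverses the list so that the variable with the most missing values comes first.

```stata
* Toy data: x1, x2, x3 are hypothetical variables used only for illustration
clear
set obs 10
gen x1 = cond(_n <= 2, ., _n)    // 2 missing values
gen x2 = cond(_n <= 7, ., _n)    // 7 missing values
gen x3 = _n                      // no missing values

* Build tokens of the form <zero-padded count>#<variable name>
local mcounts
foreach v of varlist x1 x2 x3 {
    quietly count if missing(`v')
    local mcounts `mcounts' `:display %010.0f r(N)'#`v'
}

* String sort; thanks to the padding this is also ascending by count
local mcounts: list sort mcounts

* Reverse the list so the variable with the MOST missing values is first
local reversed
foreach m of local mcounts {
    local reversed `m' `reversed'
}
display "`reversed'"
```

Without the padding, a count of 10 would sort before a count of 7 because the comparison is character by character.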



      • #33
        Thanks.

        But when checking for the peer with the most missing values, does it consider the correct interval (mdate earliest latest)? It should only consider the missing values within this interval.



        • #34
          Yes, it does consider that. The program is never used in isolation; it is used only when called by -rangerun-. When -rangerun- calls a program, it first clears the data set from memory and then repopulates memory with only those observations that are in range for the current round. Those are therefore the only observations that the program sees. There is no need to write those range restrictions into the program itself; -rangerun- makes sure that the program only has in-range data in the first place.
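
A small way to convince yourself of this is to have the called program report what it was given. The sketch below is only an illustration on toy data: the program name show_window and the variables t, low, high, n_seen, t_min, and t_max are all hypothetical. It assumes -rangerun- is installed from SSC (as far as I know it also needs -rangestat-).

```stata
* Assumes: ssc install rangerun   and   ssc install rangestat
clear
set obs 5
gen t    = _n
gen low  = t - 1     // lower bound of each observation's window
gen high = t + 1     // upper bound of each observation's window

capture program drop show_window
program define show_window
    * Hypothetical helper: record what rangerun loaded for this round
    summarize t, meanonly
    gen n_seen = _N          // number of observations handed to the program
    gen t_min  = r(min)      // smallest t the program can see
    gen t_max  = r(max)      // largest t the program can see
end

rangerun show_window, interval(t low high)
list t low high n_seen t_min t_max
```

If the program only ever sees in-range data, t_min should never fall below low and t_max should never exceed high, even though the program itself contains no range condition.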



          • #35
            Ok, perfect. Thank you



            • #36
              Dear Clyde

              I am trying to run the same regressions, but this time with weekly share price returns (instead of monthly returns).
              How should I format the date variable? The variables in the dataset are currently named sprDDMMYYYY, where the date is the ending date of the week.

              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input long cik str30 Name str13 spr06012008 str12(spr13012008 spr20012008 spr27012008 spr03022008 spr10022008)
              1137411 "Rockwell Collins Inc" "-.03249549" "-.04664849" "-.0788919" ".01144165" ".03975436" "-.06141064"
              66382 "Herman Miller Inc" "-.1006098" "-.01355932" "-.04948453" ".006869125" ".1931777" "-.1149564"
              16732 "Campbell Soup Co" "-.03481894" "-.02597403" "-.05748148" "-.01886199" ".03876962" "-.04164096"
              1551182 "Eaton Corp PLC" "-.05730659" "-.06600087" "-.0576476" ".02022694" ".04618914" "-.06440363"
              1567892 "Mallinckrodt PLC" "-" "-" "-" "-" "-" "-"
              1326428 "Linn Energy LLC" "-.03265631" "-.06381227" "-.03166227" "-.003178928" "-.04373576" "-.02042558"
              896878 "Intuit Inc" "-.05311125" ".005642217" ".009570957" "-.03203661" ".05234718" "-.04525032"
              1166036 "MarkWest Energy Partners LP" ".006406523" ".04224537" "-.05607996" ".008823529" "-.01516035" "-.04994138"
              1552000 "MPLX LP" "-" "-" "-" "-" "-" "-"
              1591763 "Enable Midstream Partners LP" "-" "-" "-" "-" "-" "-"
              5272 "American International Group I" "-.04556437" ".04303797" "-.09760749" ".02247839" ".04716272" "-.09061546"
              944695 "Hanover Insurance Group Inc/Th" "-.02980495" "-.01784504" "-.06623735" ".01108374" ".1490865" "-.05236379"
              874766 "Hartford Financial Services Gr" "-.04819277" ".008679928" "-.0815107" "-.02238126" ".08478637" "-.08723927"
              932628 "AmeriGas Partners LP" ".002767017" ".005518764" "-.1454446" ".05716121" ".01852977" ".003424986"
              875159 "XL Group Ltd" "-.00259896" "-.004008819" "-.2046689" ".1308198" ".03915865" "-.08096469"
              22444 "Commercial Metals Co" "-.06026936" "-.02255028" "-.1115611" ".05967675" ".1392257" "-.04291109"
              890319 "Taubman Centers Inc" "-.08160754" ".04442956" "-.02885849" ".05348888" ".08608441" "-.07214313"
              1274057 "Hospira Inc" "-.05913606" ".05254112" "-.03149055" "-.03901734" ".04636591" "-.01652695"
              48465 "Hormel Foods Corp" "-.03853611" ".02041845" "-.03176274" "-.04332222" ".04421222" "-.002822684"
              end

              I am very thankful for any help.



              • #37
                Well, you don't really have weekly dates there. You have daily dates that happen to be spaced a week apart. The simplest approach is to do it in three steps. First -reshape- the data to long, then calculate daily dates, and then convert those to weekly dates.

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input long cik str30 Name str13 spr06012008 str12(spr13012008 spr20012008 spr27012008 spr03022008 spr10022008)
                1137411 "Rockwell Collins Inc" "-.03249549" "-.04664849" "-.0788919" ".01144165" ".03975436" "-.06141064" 
                66382 "Herman Miller Inc" "-.1006098" "-.01355932" "-.04948453" ".006869125" ".1931777" "-.1149564" 
                16732 "Campbell Soup Co" "-.03481894" "-.02597403" "-.05748148" "-.01886199" ".03876962" "-.04164096" 
                1551182 "Eaton Corp PLC" "-.05730659" "-.06600087" "-.0576476" ".02022694" ".04618914" "-.06440363" 
                1567892 "Mallinckrodt PLC" "-" "-" "-" "-" "-" "-" 
                1326428 "Linn Energy LLC" "-.03265631" "-.06381227" "-.03166227" "-.003178928" "-.04373576" "-.02042558" 
                896878 "Intuit Inc" "-.05311125" ".005642217" ".009570957" "-.03203661" ".05234718" "-.04525032" 
                1166036 "MarkWest Energy Partners LP" ".006406523" ".04224537" "-.05607996" ".008823529" "-.01516035" "-.04994138" 
                1552000 "MPLX LP" "-" "-" "-" "-" "-" "-" 
                1591763 "Enable Midstream Partners LP" "-" "-" "-" "-" "-" "-" 
                5272 "American International Group I" "-.04556437" ".04303797" "-.09760749" ".02247839" ".04716272" "-.09061546" 
                944695 "Hanover Insurance Group Inc/Th" "-.02980495" "-.01784504" "-.06623735" ".01108374" ".1490865" "-.05236379" 
                874766 "Hartford Financial Services Gr" "-.04819277" ".008679928" "-.0815107" "-.02238126" ".08478637" "-.08723927" 
                932628 "AmeriGas Partners LP" ".002767017" ".005518764" "-.1454446" ".05716121" ".01852977" ".003424986" 
                875159 "XL Group Ltd" "-.00259896" "-.004008819" "-.2046689" ".1308198" ".03915865" "-.08096469" 
                22444 "Commercial Metals Co" "-.06026936" "-.02255028" "-.1115611" ".05967675" ".1392257" "-.04291109" 
                890319 "Taubman Centers Inc" "-.08160754" ".04442956" "-.02885849" ".05348888" ".08608441" "-.07214313" 
                1274057 "Hospira Inc" "-.05913606" ".05254112" "-.03149055" "-.03901734" ".04636591" "-.01652695" 
                48465 "Hormel Foods Corp" "-.03853611" ".02041845" "-.03176274" "-.04332222" ".04421222" "-.002822684"
                end
                reshape long spr, i(cik) j(d1) string
                numdate daily ddate = d1, pattern("DMY")
                format ddate %td
                tab ddate
                gen int wdate = wofd(ddate)
                format wdate %tw
                tab wdate
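
Two follow-up points may be worth checking after the conversion; the sketch below is a suggestion under stated assumptions, not part of the code above. First, spr is still a string after the -reshape-, with "-" standing in for missing values, so it has to be made numeric before any regression. Blanking the bare "-" entries first avoids stripping the minus sign from negative returns. Second, Stata's weeks always start on 1 January and week 52 absorbs the leftover 8 or 9 days of the year, so two dates seven days apart can occasionally land in the same week; -isid- will complain if that happens. The sketch uses the built-in daily() function in place of -numdate- purely to stay self-contained, and it keeps only two firms and two weeks from the example data.

```stata
clear
input long cik str12(spr06012008 spr13012008)
1137411 "-.03249549" "-.04664849"
1567892 "-" "-"
end

reshape long spr, i(cik) j(d1) string

* Daily date from the DDMMYYYY suffix (built-in alternative to -numdate-)
gen ddate = daily(d1, "DMY")
format ddate %td

* "-" marks missing data: blank it, then convert the string to numeric
replace spr = "" if spr == "-"
destring spr, replace

* Weekly date, then verify that no firm has two rows in the same week
gen int wdate = wofd(ddate)
format wdate %tw
isid cik wdate
```

If -isid- fails on the full data, one option is to keep the daily date as the time variable and declare the spacing, for example with -tsset cik ddate, delta(7)-.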
