  • Code to Use Estimated Coefficient

    Hi all,
    I have the following model:
    TA/LTA = α1 (1/LTA) + α2 (ΔREV/LTA) + α3 (PPE/LTA) + ε
    I need to estimate it for each firm in the same industry_id in each year. After estimating the model, I need to use its estimated coefficients α1, α2, and α3 to calculate another variable as follows:
    NDA = α1 (1/LTA) + α2 (ΔREV/LTA) + α3 (PPE/LTA)
    So the new variable is NDA.
    Sorry if I wasn't clear, but I tried my best.
    Thank you in advance for your comments and suggestions.

  • #2
    If all you need to do is calculate NDA, there is no need to save the estimated coefficients separately, because -predict- will calculate NDA for you. If you do need the coefficients themselves for other reasons, see the help and manual sections for the -statsby- command. But to just get NDA you can do this (I'm calling your variables y x1 x2 and x3 for short, and I assume industry_id is a numeric variable, not a string):

    Code:
    levelsof industry_id, local(ids)
    levelsof year, local(years)
    gen nda = .
    foreach i of local ids {
         foreach y of local years {
              regress y x1 x2 x3 if year == `y' & industry_id == `i'
              predict xb, xb
              replace nda = xb if year == `y' & industry_id == `i'
              drop xb
         }
    }
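
    If the coefficients themselves are wanted, the -statsby- route mentioned above might look roughly like this. It is only a sketch under assumptions that are not in the original post: the file name coefs.dta and the variable name nda_sb are hypothetical, and the data are assumed to contain y, x1, x2, x3, industry_id, and year as above. With -statsby _b-, the saved coefficients are named _b_x1, _b_x2, _b_x3, and _b_cons, one row per industry-year.

    ```stata
    * Sketch: one regression per industry-year, coefficients saved to a file
    * (coefs.dta is a hypothetical file name)
    statsby _b, by(industry_id year) saving(coefs, replace): regress y x1 x2 x3

    * Attach the industry-year coefficients to every firm-year observation
    merge m:1 industry_id year using coefs, nogenerate

    * Fitted value built by hand from the saved coefficients
    * (nda_sb is a hypothetical variable name)
    gen double nda_sb = _b_cons + _b_x1*x1 + _b_x2*x2 + _b_x3*x3
    ```

    Industry-year groups in which the regression cannot be run end up with missing coefficients, so nda_sb is missing there as well.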

    • #3
      Dear Clyde,
      I ran your code, but I got an error message saying that there are no observations:
      no observations
      r(2000);

      • #4
        This could be a problem with missing data or with variables that should be numeric but are string. You may need to tell us more about your dataset to get more detailed advice.

        Also, you did not reply to the earlier comment that you need enough observations for the regression to run, i.e. replicates for each firm-year combination.

        http://www.statalist.org/forums/foru...-s-coefficient

        Preceding

        Code:
        regress y x1 x2 x3 if year == `y' & industry_id == `i'
        with

        Code:
        noi count if !missing(y, x1, x2, x3) & year == `y' & industry_id == `i'
        will show you how many observations there are for each regression you are trying.
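
        The same counts can also be obtained without a loop. A minimal sketch, assuming the variable names used in this thread; n_usable and cell_tag are hypothetical names, and the threshold of 30 is only an illustration:

        ```stata
        * Number of complete cases in each industry-year cell
        egen n_usable = total(!missing(y, x1, x2, x3)), by(industry_id year)

        * Flag one observation per cell so that each cell is listed once
        egen byte cell_tag = tag(industry_id year)

        * Cells with fewer complete cases than a chosen threshold (here 30)
        list industry_id year n_usable if cell_tag & n_usable < 30, noobs
        ```

        Any cell listed with n_usable equal to 0 would make -regress- stop with r(2000).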
        Last edited by Nick Cox; 28 Dec 2015, 04:41.

        • #5
          Dear Nick
          I have 6376 firm-year observations from 59 industries. In addition, my firm_id and industry_id variables are numeric.
          I hope this helps.
          Last edited by Issa Almaharmeh; 28 Dec 2015, 04:57.

          • #6
            That rules out a problem with string variables. Otherwise, that says nothing whatsoever about individual combinations of firm and year.

            • #7
              This is a sample of my data. It is an unbalanced panel of 6376 observations for about 650 firms from 59 industries for the period 1990-2013.
                     y        x1        x2        x3  year  industry_id  firm_id
              0.016866  5.44E-06  -0.04788  0.570729  2007           33     2002
              0.167458  4.24E-05  -0.18187  0.060609  2010           44     2003
              -0.00046  4.57E-05  0.155479  0.071096  2011           44     2003
               -0.1809  3.63E-05  -0.09731  0.064317  2012           44     2003
              -0.23769  5.24E-05  -0.13225  0.095747  2013           44     2003
                                                      1997           57     2005
               -0.1368  4.09E-06  0.205387  0.003296  2006           13     2011
              0.255565  0.000824   0.79967  0.299258  1999           73     2012
              -0.30504  0.000191  0.479206  0.137925  2000           73     2012
              -0.41297  2.93E-05  0.112503  0.024783  2001           73     2012
              -0.02425  6.73E-05         0  0.029903  2009           10     2014
               -0.1044  0.000053         0  0.026444  2011           10     2014
              -0.61345  1.32E-05   0.00286  0.917697  2006           49     2016
              0.796652  0.000138  0.553465  0.113432  2002           87     2018
               0.74341  7.96E-05  0.538345  0.128375  2003           87     2018
              0.366284  1.26E-05  0.251456     0.039  2006           87     2018
              0.503076  5.99E-06  0.384956  0.062444  2007           87     2018
              0.544914  3.79E-06  0.345077  0.016366  2008           87     2018
               0.05812  5.13E-05  0.174823  0.107623  2007           73     2024
              0.074219  4.13E-05  0.219148   0.10247  2008           73     2024
              0.144096  3.41E-05  0.479478  0.101892  2009           73     2024
              0.011866  1.58E-05  0.073173  0.055827  2010           73     2024
              -0.10689  0.000014  -0.01466  0.061827  2011           73     2024
              -0.27348  4.19E-06  -0.00067  1.980653  2010           10     2027

              • #8
                I see here (e.g.) one observation in which year is 2008 and industry id is 73, one with 2009 and 73, and so on. If that's correct, you can't run your regression for such combinations and you need more defensive code.

                • #9
                  Could you help me write the code, please?

                  • #10
                    I'd personally not want to fit a multiple regression with 3 predictors to fewer than, say, 30 observations. That being so, a code sketch based on Clyde's code is

                    Code:
                    levelsof industry_id, local(ids)
                    levelsof year, local(years)
                    gen nda = .
                    gen nobs = .
                    
                    foreach i of local ids {
                         foreach y of local years {
                              count if !missing(y, x1, x2, x3) &  year == `y' & industry_id == `i'
                              replace nobs = r(N) if year == `y' & industry_id == `i'
                        
                              if r(N) >= 30 {
                                   regress y x1 x2 x3 if year == `y' & industry_id == `i'
                                   predict xb, xb
                                   replace nda = xb if year == `y' & industry_id == `i'
                                   drop xb
                              }
                         }
                    }

                    • #11
                      May I add a clarification? I want to run the regression for each industry in each year (so the regression estimates industry-level coefficients), and then use these coefficients to calculate NDA for each firm in each year (so firms with the same industry_id have the same coefficients in the same year). Also, there is a condition for running the yearly regression for each industry: there must be at least 6 firms in each industry in each year.
                      Sorry again if this is not clear, but I did my best.

                      • #12
                        "I want to run the regression for each industry in each year (so the regression estimates industry-level coefficients), and then use these coefficients to calculate NDA for each firm in each year (so firms with the same industry_id have the same coefficients in the same year)."
                        The code that Nick gives in #10 will do this.

                        "Also, there is a condition for running the yearly regression for each industry: there must be at least 6 firms in each industry in each year."
                        So you need to count the number of firms in each industry in each year and then restrict the code to those industries where the number is at least 6. I assume that each combination of industry_id, firm_id, and year occurs only once in your data, which is true in your sample data. If it is not true in your data as a whole, a slightly different approach would be needed.

                        Code:
                        // VERIFY EACH COMBINATION OF industry_id, firm_id, & year
                        // OCCURS ONLY ONCE TO AVOID DOUBLE COUNTING
                        isid industry_id year firm_id, sort 
                        
                        // COUNT NUMBER OF FIRMS IN EACH INDUSTRY IN EACH YEAR
                        by industry_id year (firm_id): gen firm_count_this_year = _N
                        
                        // IDENTIFY SMALLEST NUMBER OF FIRMS IN ANY YEAR FOR EACH INDUSTRY
                        by industry_id (firm_count_this_year), sort: gen least_number_firms_any_year = firm_count_this_year[1]
                        Now, modify the code from #10 to add this restriction to the guard condition for the regression:
                        Code:
                        levelsof industry_id, local(ids)
                        levelsof year, local(years)
                        gen nda = .
                        gen nobs = .
                        
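                        foreach i of local ids {
                             // least_number_firms_any_year IS CONSTANT WITHIN INDUSTRY;
                             // FETCH ITS VALUE FOR THIS INDUSTRY
                             summarize least_number_firms_any_year if industry_id == `i', meanonly
                             local min_firms = r(min)

                             foreach y of local years {
                                  count if !missing(y, x1, x2, x3) &  year == `y' & industry_id == `i'
                                  local n = r(N)
                                  replace nobs = `n' if year == `y' & industry_id == `i'

                                  if `n' >= 30 & `min_firms' >= 6 {
                                       regress y x1 x2 x3 if year == `y' & industry_id == `i'
                                       predict xb, xb
                                       replace nda = xb if year == `y' & industry_id == `i'
                                       drop xb
                                  }
                             }
                        }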
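
                        The condition in #11 can also be read as requiring at least 6 firms in the particular industry-year being estimated, rather than in every year of that industry. A minimal sketch under that alternative reading, assuming one row per firm-year as checked by -isid- above; n_complete, cell, and nda_cell are hypothetical names, and the 30-observation guard is deliberately left out here:

                        ```stata
                        * Complete cases per industry-year; with one row per firm-year this
                        * is the number of firms usable in that cell's regression
                        egen n_complete = total(!missing(y, x1, x2, x3)), by(industry_id year)

                        * Number the industry-year cells 1, 2, ...
                        egen long cell = group(industry_id year)
                        summarize cell, meanonly
                        local ncells = r(max)

                        gen double nda_cell = .
                        forvalues c = 1/`ncells' {
                            * n_complete is constant within a cell, so r(max) is its value
                            summarize n_complete if cell == `c', meanonly
                            if r(max) >= 6 {
                                regress y x1 x2 x3 if cell == `c'
                                predict double xb if cell == `c', xb
                                replace nda_cell = xb if cell == `c'
                                drop xb
                            }
                        }
                        ```

                        Cells with fewer than 6 complete cases are skipped, so nda_cell stays missing for them.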

                        • #13
                          I received this error message
                          factor variables and time-series operators not allowed
                          r(101);
                          when I tried to calculate firm count in each industry for each year by running the following code,
                          by industry_id year( firm_id): gen firm_count _this_year= _N

                          • #14
                            Oh, sorry, there was a typo. There needs to be a space between year and (firm_id). (And no space needed between the open parenthesis and firm_id).

                            • #15
                              After the correction, I received the following error:
                              too many variables specified
                              r(103);
                              Could you help me with this, please?
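
                              One possible cause, judging only from the line quoted in #13: besides the missing space before (firm_id), there is also a space inside the new variable's name (firm_count _this_year), so -generate- is handed two names where it expects one. That would be consistent with r(103). A sketch of the line with both spaces corrected, assuming the data are still sorted by the preceding -isid ..., sort-:

                              ```stata
                              * Count firms in each industry-year; the variable name has no embedded space
                              by industry_id year (firm_id): gen firm_count_this_year = _N
                              ```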
