Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • generate beta variable for panel dataset

    Dear all,

    I hope I could get some help with the following problem. My dataset is a panel and contains asset returns from 45 companies, over a period from 1998-2016 and looks like the following:

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input int Date str23 CompanyName str11 Country float Recessionlag1 double(Xsector X)
    13880 "AIR LIQUIDE" "France" .   74.99411992971802   -.5188337554968327
    13881 "AIR LIQUIDE" "France" 0  .40916783846729227 .0031839828055408082
    13884 "AIR LIQUIDE" "France" 0   .1968211549754861   .00846651915332929
    13885 "AIR LIQUIDE" "France" 0  -.1809183794515586 -.003148172018527824
    13886 "AIR LIQUIDE" "France" 0 -.29311647324801804  -.03263164551346719
    end
    format %tdnn/dd/CCYY Date
    Now I am trying to generate a new variable containing the beta coefficient of the regression in the code below (cbeta) but I get a variable that only contains the beta from the first company in the sector for all companies and then Stata gives an error code:

    Code:
    forvalues i = 1(1)45 {
       qreg Xsector X, quantile(0.05)
       gen cbeta `i' = _b[X]
       }
    Code:
    variable cbeta already defined
    r(110);
    How do I adjust my command to generate the beta coefficient variable for each company separately in the panel dataset?

    I am using the Stata version 14.2
    I hope I have specified my problem clearly enough,
    Kind regards,
    Charlotte
    Last edited by Charlotte Koelemij; 07 Dec 2017, 02:25.

  • #2
    I see two problems.

    First, the loop will run the same qreg each time through the loop - there is nothing in the qreg command to restrict it to the data for a single company.

    Second, if you are trying to create variables cbeta1, cbeta2, ..., you need to remove the space between cbeta and `i' in your generate command. As it stands, it tries to generate the variable cbeta (with no digit after it) each time through the loop. That is the meaning of the error message you received.

    Comment


    • #3
      Dear William,

      Thank you very much for your quick response,

      I have adjusted my command to solve for the same qreg;
      Code:
      forvalues i = 1(1)45 {
         qreg Xsector X, quantile(0.05), if companyid==`i'
         gen cbeta`i' = _b[X]
         }
      Then I get a beta for each different company. However, I want to generate one variable "cbeta" which differs per company like this:

      company A 1
      company A 1
      company A 1
      company B 2
      company B 2
      company B 2
      company C 3
      company C 3
      company C 3

      Could you tell me how to adjust the command to do so?

      Kind regards,
      Charlotte

      Comment


      • #4
        You can use runby (from SSC) to perform a regression per company. This would look something like:

        Code:
        * create a demonstration dataset
        clear all
        set seed 312
        set obs 45
        gen company = _n
        expand 1000
        bysort company: gen Date = mdy(12,31,1997) + _n
        format %tdnn/dd/CCYY Date
        gen double Xsector = runiform()
        gen double X = runiform()
        
        * perform a regression per company
        program my_qreg
          qreg Xsector X, quantile(0.05)
          gen cbeta = _b[X]
        end
        runby my_qreg, by(company)
        
        * spot check for company 2
        qreg Xsector X if company == 2, quantile(0.05)
        sum cbeta if company == 2
        and the spot check results:
        Code:
        . * spot check for company 2
        . qreg Xsector X if company == 2, quantile(0.05)
        Iteration  1:  WLS sum of weighted deviations =  102.88653
        
        Iteration  1: sum of abs. weighted deviations =  103.15077
        Iteration  2: sum of abs. weighted deviations =  48.853891
        Iteration  3: sum of abs. weighted deviations =  45.019146
        Iteration  4: sum of abs. weighted deviations =  25.264765
        Iteration  5: sum of abs. weighted deviations =  24.678962
        Iteration  6: sum of abs. weighted deviations =  23.451682
        Iteration  7: sum of abs. weighted deviations =  23.352759
        Iteration  8: sum of abs. weighted deviations =  23.345709
        
        .05 Quantile regression                             Number of obs =      1,000
          Raw sum of deviations  23.3467 (about .05011088)
          Min sum of deviations 23.34571                    Pseudo R2     =     0.0000
        
        ------------------------------------------------------------------------------
             Xsector |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
        -------------+----------------------------------------------------------------
                   X |    .002263    .022248     0.10   0.919    -.0413952    .0459213
               _cons |   .0489107   .0128404     3.81   0.000     .0237134     .074108
        ------------------------------------------------------------------------------
        
        . sum cbeta if company == 2
        
            Variable |        Obs        Mean    Std. Dev.       Min        Max
        -------------+---------------------------------------------------------
               cbeta |      1,000     .002263           0    .002263    .002263

        Comment


        • #5
          Dear Robert,

          You are a hero! Thank you very much it works perfectly!

          Kind regards, and many thanks,
          Charlotte

          Comment


          • #6
            Dear Robert,

            I hope I can ask you an additional question to my previous one. I want to run the regression with the runby command per company yearly. How would I have to adjust the command?

            I tried a lot including the following,
            Code:
            program my_qreg  
            qreg Xsector X, quantile(0.05)  
            gen cbeta = _b[X]
            end
            runby my_qreg, by(company, year)
            but I get an error code which says: .

            Code:
             runby my_qreg, by(companyid, year)
            by() does not contain a valid varlist
            r(198);
            Thank you very much in advance,
            Kind regards,
            Charlotte

            Comment


            • #7
              You should not have a comma (,) within the -by()- option.

              Comment


              • #8
                thank you!

                Comment


                • #9
                  Hi, I have looked and the tread in this post and tried to implement the same..., however, when I run the runby command I get an error message....Can you please make a suggestion as my predict values are being repeated. Many Thanks

                  Attached Files

                  Comment


                  • #10
                    We can guess that your program my_qreg1 is buggy or that your dataset is not fit for purpose, but you need to show us the code and/or tell us about your data to get more detailed advice.

                    Comment


                    • #11
                      Hi Nick...Thanks for the response...my code is;

                      program my_qreg1
                      1. qreg MV IPGUK1 PDgrowth UKFTALLSHAREINDEXEPG FTSEGROWTH UKGDPQTRGROWTH UKGDPPERCAPITAQTR UKDSRealEstPrINDEXG LIQUIDITYSPREAD ChangeonslopeofyieldcurveCa ChangeinyieldcurveUK Defaultspread, quantile(0.01)
                      2. gen cbeta = _b{MV]
                      3. end

                      . runby my_qreg1, by(MV)

                      I have 10 banks with quarterly data from 2000 - 2018 that I want generate the predicted MV values for, however, when I use the predict command, it returns the predicted values for only one bank I guess and repeats the same values through the panel. Also the panel is unbalanced...
                      Many Thanks

                      Comment


                      • #12
                        Hi, I have also tried (below) and I get similar error....Many Thanks...

                        . egen countrynum = group (CompanyName)

                        .
                        . xtset countrynum
                        panel variable: countrynum (unbalanced)

                        .
                        . xtset countrynum Q, quarterly
                        panel variable: countrynum (unbalanced)
                        time variable: Q, 6960q2 to 7008q1, but with gaps
                        delta: 1 quarter

                        . program QREG
                        1. qreg MV IPGUK1 PDgrowth UKFTALLSHAREINDEXEPG FTSEGROWTH UKGDPQTR GROWTH UKGDPPERCAPITAQTR UKDSRealEstPrINDEXG LIQUIDITYSPREAD ChangeonslopeofyieldcurveCa ChangeinyieldcurveUK Defaultspread, quantile(0.01)
                        2. gen cbeta = _b[MV]
                        3. end

                        . runby QREG, by(ID)

                        --------------------------------------
                        Number of by-groups = 10
                        by-groups with errors = 10
                        by-groups with no data = 0
                        Observations processed = 547
                        Observations saved = 0
                        --------------------------------------

                        Comment


                        • #13
                          #12

                          I would first note that quantile regression with quantile 0.01 is hard work at best and futile at worst. If you have about 50 observations in each group, how well do you think that is going to work? (You're also using a large number of predictors for your implied sample size.)

                          More fatally to your problem you are asking for the coefficient on MV to be saved. But MV is not a predictor; it's the response or outcome (dependent variable, if you use that term).

                          If you try this directly, Stata complains

                          Code:
                          . sysuse auto, clear
                          (1978 Automobile Data)
                          
                          . qreg mpg weight
                          Iteration  1:  WLS sum of weighted deviations =  84.775676
                          
                          Iteration  1: sum of abs. weighted deviations =  84.941176
                          Iteration  2: sum of abs. weighted deviations =      83.08
                          
                          Median regression                                   Number of obs =         74
                            Raw sum of deviations      164 (about 20)
                            Min sum of deviations    83.08                    Pseudo R2     =     0.4934
                          
                          ------------------------------------------------------------------------------
                                   mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
                          -------------+----------------------------------------------------------------
                                weight |  -.0053333   .0004355   -12.25   0.000    -.0062016   -.0044651
                                 _cons |   36.94667   1.357403    27.22   0.000     34.24073     39.6526
                          ------------------------------------------------------------------------------
                          
                          . di _b[mpg]
                          [mpg] not found
                          r(111);
                          So I guess that runby is just eating the error, which is usually a feature. Read its help to find an option that will inform:

                          verbose specifies that the output from program_name is not suppressed.
                          Incidentally, your quarterly dates are nonsense and you should not have ignored what xtset told you.

                          I really doubt that you have data for the 2nd quarter of 6960, and so forth.

                          Those look to me like daily dates and assigning them a quarterly date display format doesn't convert them to quarterly dates. This is a common fallacy, and discussed at length in https://journals.sagepub.com/doi/pdf...867X1201200415

                          Code:
                          .  mata : yq(6960,2), yq(7008, 1)
                                     1       2
                              +-----------------+
                            1 |  20001   20192  |
                              +-----------------+
                          
                          .  mata : strofreal((yq(6960,2), yq(7008, 1)), "%td")
                                         1           2
                              +-------------------------+
                            1 |  05oct2014   14apr2015  |
                              +-------------------------+
                          I can't advise on what you should be doing, as for example there aren't 50 or so quarters between October 2014 and April 2015. So far as I can see your call on qreg pays no attention to any time variable, let alone to any xtset setting.

                          Comment

                          Working...
                          X