  • syntax help: to generate variables based on an equation

    Dear Statalisters,

    Hope my post finds you well.

    I need to generate two variables based on the equation ln(REVENUE_t) = b1 + b2*t + ε, where ln(REVENUE_t) is the natural logarithm of a firm's revenue in year t, t is the independent variable (year), and ε is the residual.

    First, I want to run a linear regression of the natural logarithm of revenue (REVENUE) on the year (t) within the annual window [t, t+4].

    Then I want to generate two variables:
    1) X1: the antilog of the regression coefficient (b2);
    2) X2: the antilog of the standard deviation of b2


    Thank you in advance!

    Best,
    Josh


    Here is an example of the data:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input id year REVENUE
    1 2010 1000
    1 2012 1200
    2 2010 0
    2 2011 900
    2 2012 100
    2 2013 500
    2 2014 990
    2 2015 1000
    2 2016 1100
    2 2017 1300
    2 2018 1450
    3 2010 1200
    3 2011 1345
    3 2012 0
    3 2013 1340
    3 2014 1250
    3 2015 1400
    3 2016 1568
    3 2017 1200
    3 2018 1900
    4 2010 0
    4 2011 0
    4 2012 0
    4 2013 400
    4 2014 670
    4 2015 760
    4 2016 560
    4 2017 800
    4 2018 890
    5 2010 110
    end


  • #2
    I don't understand what you are asking for. Your equation is written for a single time series. But your data is panel data. I'm going to guess that you want to do these calculations separately for each id--which is really the only way it would make sense to have a variable to contain the results. Also, regression coefficients do not have standard deviations; they have standard errors. Of course, perhaps you mean to take the standard deviation of the different values of the coefficient across the different id's. I can't really tell what you want.

    Anyway, my best guess as to what you are looking for is this:
    Code:
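    * log() returns missing when REVENUE is 0, so those years are excluded
    gen ln_revenue = log(REVENUE)
    * regress ln_revenue on year separately for each id, using all of its years
    rangestat (reg) ln_revenue year, by(id) interval(year . .)
    * antilogs of the slope and of its standard error
    gen X1 = exp(b_year)
    gen X2 = exp(se_year)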
    -rangestat- is written by Robert Picard, Nick Cox, and Roberto Ferrer. It is available from SSC.

    If this is not what you meant, please post back with a clearer explanation.



    • #3
      #1 looks like an assignment to me.

      If so, then please note our policy at #4 of https://www.statalist.org/forums/help#adviceextras

      If not, then Clyde Schechter's reply in #2 above gives good advice as always.



      • #4
        Hello Nick Cox,
        Thanks for your reply. It is not homework; I am writing a paper and need to generate a variable used in some of my references.

        Best,
        Josh



        • #5
          Thank you, Clyde, for your code. Sorry for the confusion. It is a variable constructed in a paper I cite. You can find it here:
          https://www.emerald.com/insight/cont...0239/full/html

          On page ten, the authors describe the operationalization of the variables.

          I have tried your syntax and it works. But I'm curious about the "within the annual window period [t, t+4]" part. This is stated in the paper: the authors used a dataset from 2008 to 2017 but applied this 5-year window. How can we include this in the syntax?

          Best,
          Josh



          • #6
            Oh, sorry, I forgot about the [t, t+4] window. Change the -rangestat- command to
            Code:
            rangestat (reg) ln_revenue year, by(id) interval(year 0 4)
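
            Putting the pieces of this thread together, the whole calculation might look like the sketch below. It assumes the example data from #1 are in memory and that -rangestat- is installed from SSC. The commented-out lines at the end are an optional extra that is not from the paper or from the posts above: they would blank out results from windows with fewer than five usable years, using the reg_nobs variable that -rangestat (reg)- creates.

            ```stata
            * One-time install of the community-contributed command (uncomment if needed)
            * ssc install rangestat

            * log() returns missing when REVENUE is 0, so those years drop out of each regression
            gen ln_revenue = log(REVENUE)

            * For each observation, regress ln_revenue on year over that id's years in [t, t+4]
            rangestat (reg) ln_revenue year, by(id) interval(year 0 4)

            * Antilog of the slope (b_year) and of its standard error (se_year)
            gen X1 = exp(b_year)
            gen X2 = exp(se_year)

            * Optional, an assumption rather than part of the thread: require full 5-year windows
            * replace X1 = . if reg_nobs < 5
            * replace X2 = . if reg_nobs < 5
            ```

            Near the end of each id's series the window [t, t+4] contains fewer years, so it is worth inspecting reg_nobs before using X1 and X2.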



            • #7
              Thank you for your help!

              Best,
              Josh
