Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Growth rates from Panel dataset.

    Hi Statalist,

    I have a panel dataset (monthly survey responses) which is defined by a firm id variable and a monthly time variable (format example: 2021m1) and has around 30 variables total. One of these variables are current firm sales. I'm looking to generate a new variable that calculates firm yearly sales growth rates from that current sales variables with the firms that have responses that are 12 months apart. For those firms that don't have responses 12 months apart, this variable would just be a dot so that these observations can be dropped later. I've used tsset FIRM_ID mdate to be able to use the time series operators but am not sure where to go from here.

    I've used dataex to show a sample of my data below, however the mdate column has been converted to numbers (e.g. 748) but in my data editor it's under the format 2021m1.

    input long FIRM_ID float(C_SALE mdate)
    38 1.40e+07 748
    100 3.50e+08 747
    210 2850000 758
    337 60000 746
    563 3000000 744
    799 3.50e+07 754
    1258 629927 755
    1612 150000 756
    1638 75000 751
    1936 300 752

    Best,
    Matt

  • #2
    Code:
    gen annual_growth_rate = 100*D12.C_SALE/L12.C_SALE
    will give you the annual growth_rate (in percent) during the year preceding the date of the current observation.

    I've used dataex to show a sample of my data below, however the mdate column has been converted to numbers (e.g. 748) but in my data editor it's under the format 2021m1.
    Yes. The numbers that -datatex- is showing are what is actually in your Stata data set. The 2021m1 and similar things you see in the editor are simply display formats that Stata uses when showing you these numbers. It is a convenience to you, but it is not what is actually in your data set. This is precisely how Stata is supposed to work with datetime variables: internally it has numbers that are suitable for calculations and sorting but are difficult for humans to understand as dates, but in data displays, Stata "translates" them into human-readable dates. Do read -help datetime- if you are going to be working with panel data on an ongoing basis.
    Last edited by Clyde Schechter; 12 Apr 2023, 13:53.

    Comment


    • #3
      Hi Clyde,

      Thank you for the info, I'll make sure to read more into datetime. As for the code you gave, I tried running it and it gave me all missing values. I tried reversing the values (L12.C_SALE/D12.C_SALE) as I want to get the realized growth rate the succeeding year of the current observation but again that just gave missing observations. Any idea why this is happening? I've copy and pasted another example below of a single firms data. The observation with the mdate of 2022m2 should have a realized_growth_rate value as there is also a 2023m2 observation.

      FIRM_ID mdate C_SALE
      1001063 2022m2 6000000
      1001063 2022m5 7500000
      1001063 2022m8 8000000
      1001063 2022m11 7250000
      1001063 2023m2 7000000

      Thanks,
      Matt

      Comment


      • #4
        The kind of interchanging you did will not, in any case, get what you are looking for. The formula for annual growth is (sales at end of year - sales at beginning of year)/(sales at beginning of year). The formula you came up with has the numerator and denominator switched. Here is revised code that will give the forward-looking growth instead of the one-year look back:

        Code:
        * Example generated by -dataex-. For more info, type help dataex
        clear
        input long(firm_id c_sale) float mdate
        1001063 6000000 745
        1001063 7500000 748
        1001063 8000000 751
        1001063 7250000 754
        1001063 7000000 757
        end
        format %tm mdate
        
        xtset firm_id mdate
        
        gen ann_pct_growth_rate = 100*(F12.c_sale - c_sale)/c_sale
        This code produces missing values for most of the observations, because they have no observation that is one year later. But it does produce the correct value for the first observation.

        In the future, when you post example data, always use the -dataex- command. You did that (if imperfectly) in #1. But in #3 you did not. It took me more time to translate those 2021m2 dates back to Stata numeric dates than it took me to solve the actual problem. Stick with -dataex-, always.

        Comment


        • #5
          This is great! So clear and now it works. Thank you Clyde, I'll make sure to always use dataex going forward. One question to wrap this up, how do you use dataex to get the data in the format that you have above? What I did in #1 was simply copying the data/text from stata into the chat box here but I know that isn't right. I've looked in the FAQ and it doesn't indicate exactly how to do it.

          Thanks
          Matt

          Comment


          • #6
            If you are using version 17 or 16, or fully updated versions 15 or 14, then -dataex- is already part of your Stata installation. If not, run -ssc install dataex- to get it.

            Then create your example data in Stata by selecting appropriate observations to show using -keep if/in- and -drop if/in- type commands to get what you want. Then just type -dataex-. Stata will then respond in the Results window with output that looks like what I have shown, preceded by a line that says "copy starting from the next line" and followed by a line that says "copy up to and including the previous line." So highlight everything between (but not including) those two lines. Copy to your clipboard. Then paste into the Forum edit box. That's all you have to do. The pasted material will not look exactly like that formatting, but when you click on Post Reply, it will get formatted nicely.

            Comment

            Working...
            X