  • missing years with dependent variable using lagging operator

    Dear Statalist users,

    My dataset consists of firm-years from 2009 to 2015. My dependent variable is built with the lag operator, so I assumed I would get output starting from 2010, but it actually starts from 2011. What can I do so that the observations from 2010 are included?

    Best

    Lilly

  • #2
    Lilly,
    Make sure that the 2009 values are actually present in the initial dataset.
    The main reason that could explain missing values in the 2010 lagged variable (assuming a one-year lag operator, L1.var) is missing values in 2009.

    Otherwise, perhaps you have specified a two-year lag (L2.var), or your panel is mis-specified, so that L1.var doesn't return the one-year lag.
    Anyway, it is hard to say much without seeing your data and the exact code you wrote.
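
    A minimal sketch of the two cases above, on a toy panel with made-up values (id, year and x are hypothetical names, not Lilly's variables). Firm 2 has no 2009 value, so its one-year lag is missing in 2010 as well as in 2009, and a two-year lag is missing in both 2009 and 2010 for every firm:
    Code:
    * Toy panel with hypothetical values; firm 2 has a missing 2009 value
    clear
    input id year x
    1 2009 10
    1 2010 12
    1 2011 15
    2 2009 .
    2 2010 22
    2 2011 26
    end

    * Declare the panel so that time-series operators know what "last year" means
    xtset id year

    * One-year and two-year lags
    gen x_lag1 = L1.x
    gen x_lag2 = L2.x

    * Inspect which firm-years end up missing
    list, sepby(id)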

    Best,
    Charlie



    • #3
      Hi Charlie,

      thanks for your quick response and input on that matter. I will specify my data and my dependent variable in the following:

      dependent variable=(var1+l.var1)/(var2+l.var2)*0.5 if (var1>0 & l.var2>0)

      My dataset consists of firm-years from 2009 to 2015 with variables var1 and var2 (among others). Some observations are missing, but I assume this should not be the reason why the output for regressions with this dependent variable starts from 2011.

      *** Edit:
      I might add the following info:
      Some of my control variables are scaled by l.assets, so could it be that running my regression somehow produces a lag of 2 years?

      dependent variable (components lagged) by 1 year= a + b*(var/l.assets)
      ****

      ****Edit 2:
      I ran tab YEAR if e(sample) (see attachment), and 2010 was included, but I only got results from 2011 onwards.


      ****



      I hope this clarifies my problem.

      Best

      Lilly
      Last edited by Lilly NG; 14 Oct 2016, 10:45. Reason: additional info



      • #4
        Additional info, because my previous attachment did not show: the data cover firm-years from 2009 to 2015. Because some variables have lagged components (l.var), I assumed I would get results from 2010 onwards, but I actually get results from 2011, even though my data for 2010 are included.
        [Attachment: Unbenannt.PNG (screenshot, not reproduced)]



        • #5
          Thanks Lilly.

          Concerning the outputs you posted in #4 :
          What is the command for the first table? Moreover, your model seems to include the year 2010, as your last picture shows, so I don't really understand your issue.

          It seems to me (if the first table you posted in #4 is the result of a regression) that the year 2010 is simply the base case, and the other years' coefficients are to be interpreted relative to 2010.

          However, I don't know much more about your initial data.
          Please post your data using dataex, or simply report the result of
          Code:
          tab year if var1!=., mi
          tab year if var2!=., mi
          tab year if depvar!=. , mi
          This will tell you whether var1, var2 and your dependent variable are non-missing to begin with (before you run your lagged-variable model).

          Also, you didn't say how you generated your lagged variable, nor how you declared your panel.

          Best,
          Charlie





          • #6
            Hi Charlie,

            Thanks for your input. I will try to describe my data in a little more detail:

            Panel data for firm-years from 2009 to 2015
            Code:
            xtset ID YEAR
            
            
            *control variables scaled by lagged total assets
            gen LAGGED_TOTAL_ASSETS=l1.TOTAL_ASSETS
            gen ROA=OPERATING_INCOME/LAGGED_TOTAL_ASSETS
            gen LEVERAGE = LONG_TERM_DEBT/LAGGED_TOTAL_ASSETS
            gen FI=INTERNATIONAL_OPERATING_INCOME/LAGGED_TOTAL_ASSETS
            gen PPE = PROPERTY_PLANT__EQUIP_NET/LAGGED_TOTAL_ASSETS
            gen INTANGIBLE_ASSETS = TOTAL_INTANGIBLE_OT_ASSETS_NET/LAGGED_TOTAL_ASSETS
            gen EQINC=l1.PRETAX_EQUITY_IN_EARNINGS
            gen SIZE = ln(l1.MARKET_CAPITALIZATION)
            gen MB = l1.MARKET_CAPITALIZATION/l1.COMMON_SHAREHOLDERS_EQUITY
            gen MI=MINORITY_INTEREST_PL/LAGGED_TOTAL_ASSETS
            
            
            *depvar
            gen TWO_YEAR_IFRS_ETR= ((INCOME_TAXES+l1.INCOME_TAXES)/ (PRETAX_INCOME+l1.PRETAX_INCOME))*0.5 if (PRETAX_INCOME>0 & l1.PRETAX_INCOME>0)
            replace TWO_YEAR_IFRS_ETR= . if TWO_YEAR_IFRS_ETR<0 
            
            reg TWO_YEAR_IFRS_ETR DAXPLUS_FAMILY ROA LEV NOL deltaNOL NOL_DUMMY FI PPE INTANGIBLE_ASSETS EQINC SIZE MB i.YEAR i.INDUSTRY
            
            tab YEAR if e(sample)
            
            tab YEAR if LAGGED_TOTAL_ASSETS!=., mi
            
            tab YEAR if TWO_YEAR_IFRS_ETR!=., mi
            gives me the following:
            [Attachment: 2016-10-15 20_47_27-Stata_SE 14.0.png (screenshot, not reproduced)]
            [Attachment: 2016-10-15 20_53_30-Stata_SE 14.0.png (screenshot, not reproduced)]
            [Attachment: 2016-10-15 20_49_01-Stata_SE 14.0.png (screenshot, not reproduced)]
            [Attachment: 2016-10-15 20_50_46-Stata_SE 14.0.png (screenshot, not reproduced)]

            So why does Stata start the regression output at 2011 when there are data from 2010 onwards? I wish you a nice weekend!

            Best

            Lilly



            • #7
              There is no problem here. Your -tab YEAR if e(sample)- shows that your analysis includes observations for 2010. Are you asking why you don't see a row in the output table for year 2010? That's just because the indicator variables for the levels of a categorical variable are, taken together, always collinear with the constant term, so one of them is treated as a reference category and is omitted. If you don't specify otherwise, the omitted one will be the lowest level in the estimation sample, namely 2010. So the 2010 data are in there, but there is no indicator variable for 2010, which is exactly as it should be. If you look carefully at your output, you will notice that one category of your Industry variable is also not represented in the regression output, for exactly the same reason. One indicator is always omitted to avoid collinearity with the constant.
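
              A minimal sketch of this behaviour, using the auto dataset that ships with Stata rather than the data in this thread (rep78 stands in for YEAR purely for illustration). The coefficient table has no row for the lowest level of rep78, yet the tabulation of the estimation sample still shows observations at that level:
              Code:
              * Illustration on Stata's shipped example data, not the thread's data
              sysuse auto, clear

              * Factor-variable notation: the lowest level of rep78 becomes the
              * omitted base category, so the table shows no row for it
              regress price weight i.rep78

              * The base-level observations are still part of the estimation sample
              tab rep78 if e(sample)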



              • #8
                Hi Clyde,

                thank you very much for your input! How can I specify otherwise so 2010 and the other industry categories are shown?

                Best

                Lilly



                • #9
                  Use the "allbase" option; see the help file for your estimation command (e.g., "h regress").
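
                  A short sketch of what that looks like, again on the shipped auto dataset with rep78 as a stand-in for YEAR (an assumption for illustration only). Note that allbase only displays the base row, labelled as the base with a coefficient of 0; it does not estimate an extra coefficient. The last two lines show two related possibilities: choosing a different base level with ib#., or dropping the constant so that every level gets its own coefficient:
                  Code:
                  * Illustration on Stata's shipped example data
                  sysuse auto, clear

                  * Show the base level as an explicit row in the coefficient table
                  regress price weight i.rep78, allbase

                  * Use level 3 instead of the lowest level as the base category
                  regress price weight ib3.rep78

                  * No base level at all: requires dropping the constant
                  regress price weight ibn.rep78, noconstant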



                  • #10
                    Thank you so much for your input!

