  • Calculation of Standard Deviation

    Good evening,

    I am working on my thesis on the capital structure of firms, and I am having trouble creating the standard deviation of a variable that I need. I have to calculate the standard deviation for each year and for each firm. Afterwards I have to take an average over 3 years for each firm (e.g., for Firm 1, I need the average Std for 2005-2008, 2006-2009, and so on).

    So far I have tried the following command:
    sort firm year
    egen SD=sd (business risk), by (firm year)

    but I received missing variables. Then I tried the following:

    sort firm year
    egen SD=sd(business risk) if firm[ _n]=firm[ _n-1], by(year)

    In this case, I end up with missing values for the first year of each firm, and I also doubt the results that I get.

    Thank you in advance for your help and for your time.

    With kind regards,

    Fotis Psipsilis

  • #2
    Can you add a data sample using -dataex-? Without seeing your actual data it's hard to see what's going wrong.
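
    A minimal sketch of what that could look like, assuming the variables are named Firm, Year and Businessrisk (the names given later in this thread) and that 20 observations are enough to show the problem. On Stata versions that do not ship with dataex, it has to be installed from SSC first.

    ```stata
    * Only needed on older Stata versions where dataex is not built in
    ssc install dataex

    * Print the first 20 observations as code that can be pasted into a post
    * (variable names are assumed from the example data in this thread)
    dataex Firm Year Businessrisk in 1/20
    ```

    The output can be copied from the Results window and pasted between CODE delimiters.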



    • #3
      Welcome to Statalist!

      We can better help you if we know exactly what commands you have tried and exactly what Stata told you to indicate that there was a problem. Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. See especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using CODE delimiters, as described in section 12 of the FAQ.

      In your first example there are several problems
      Code:
      sort firm year
      egen SD=sd (business risk), by (firm year)
      Your variable could not be named "business risk" because variable names cannot contain a space. We don't know what you mean by "received missing variables" - did you mean missing values were created, or did Stata tell you "businessrisk not found"?

      The more you can tell us about your data, your commands, and your results, the more likely we are to be able to help. Otherwise, we can only guess what is happening.
      Last edited by William Lisowski; 27 Jul 2016, 08:49.



      • #4
        The variable is named "Businessrisk". Indeed, it gives missing values. Stata did not report any error.
        My data look like the following:

        Firm       Year   Businessrisk
        Shell      2005   0.1435
        Shell      2006   0.0237
        ...        ...    ...
        Shell      2015   0.0654
        Unilever   2005   0.2375
        Unilever   2006   0.37654
        If I use the command I typed, it gives missing values. I want to calculate the standard deviation over 3 years (2005-2008, then 2006-2009, and so on) for each firm. I intend to use this in a regression with leverage, in order to check the correlation between business risk and firm leverage.

        I hope this information is helpful. If anything is still unclear, please let me know what else I should provide.

        PS: My knowledge of Stata is low, so I do not know how to provide the -dataex- sample that you asked for previously.



        • #5
          Read the FAQ on asking questions - it will help you with -dataex-.

          You can manually create the two lags and then use egen with rowsd to calculate the standard deviations.
          That is,
          g lag1br=L.Businessrisk
          g lag2br=L2.Businessrisk
          egen sd13=rowsd(Businessrisk lag1br lag2br)



          • #6
            In your original question, you wrote
            I have to calculate the standard deviation for each year and for each firm.
            In your example data, you show each firm as having one observation per year. The command you typed gave missing values because you asked it to compute the standard deviation separately for each firm for each year. With only one observation for each firm for each year, there is no standard deviation.
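
            This can be reproduced on a small made-up dataset. The numbers below are hypothetical and only mimic the layout shown in post #4 (one observation per firm per year).

            ```stata
            clear
            * Hypothetical toy data: one row per Firm-Year combination
            input str8 Firm int Year double Businessrisk
            "Shell"    2005 0.10
            "Shell"    2006 0.20
            "Shell"    2007 0.40
            "Unilever" 2005 0.30
            "Unilever" 2006 0.50
            "Unilever" 2007 0.60
            end

            * Each Firm-Year group holds a single observation, so the
            * standard deviation is undefined and SD is missing everywhere
            egen SD = sd(Businessrisk), by(Firm Year)
            count if missing(SD)
            ```

            Stata raises no error here, which matches what was described in post #4: the command is valid, but every group is too small to have a standard deviation.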

            Did you mean you need to calculate the standard deviation separately for each firm, across all years, and for each year, across all firms? Then the code you needed was
            Code:
            egen SDf = sd(Businessrisk), by(Firm)
            egen SDy = sd(Businessrisk), by(Year)
            Neither of these answers the task addressed by Phil in post #5. His answer will require that you xtset your data before running the code he suggests, because the lag (L) operator needs to know about your data structure so it can avoid combining data from separate firms.
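
            A rough sketch of that setup follows, assuming Firm is stored as a string (as the example data suggest) and using hypothetical toy values. The name firm_id and the if condition on the last egen are my additions, not part of post #5. The condition restricts the result to complete three-year windows and can be dropped if partial windows are acceptable.

            ```stata
            clear
            * Hypothetical toy data in the layout of post #4
            input str8 Firm int Year double Businessrisk
            "Shell"    2005 0.10
            "Shell"    2006 0.20
            "Shell"    2007 0.40
            "Shell"    2008 0.30
            "Unilever" 2005 0.30
            "Unilever" 2006 0.50
            "Unilever" 2007 0.60
            "Unilever" 2008 0.20
            end

            * xtset needs a numeric panel identifier, so map Firm to integers
            * (firm_id is a hypothetical name)
            egen firm_id = group(Firm)
            xtset firm_id Year

            * With the panel declared, lags never reach across firms;
            * the first two years of each firm get missing lags
            generate lag1br = L.Businessrisk
            generate lag2br = L2.Businessrisk

            * Standard deviation over the current year and the two before it,
            * kept only where all three values are present (an added assumption)
            egen sd13 = rowsd(Businessrisk lag1br lag2br) if !missing(lag1br, lag2br)

            list Firm Year Businessrisk sd13, sepby(Firm)
            ```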

            Regarding not knowing how to use dataex, you apparently did not follow the advice to read the Statalist FAQ that was previously given in post #3.

            Given your admittedly low knowledge of Stata, please consider the following advice. When I began using Stata in a serious way, I started by reading my way through the Getting Started with Stata manual relevant to my setup. Chapter 18 then gives suggested further reading, much of which is in the Stata User's Guide, and I worked my way through much of that reading as well. All of these manuals are included as PDFs in the Stata installation (since version 11) and are accessible from within Stata - for example, through Stata's Help menu. The objective in doing this was not so much to master Stata as to be sure I'd become familiar with a wide variety of important basic techniques, so that when the time came that I needed them, I might recall their existence, if not the full syntax.

            Stata supplies exceptionally good documentation that amply repays the time spent studying it.



            • #7
              Thank you very much for your answers. I will try to follow your advice as well.
