Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Finding mean value of a column based on year

    Hi,

    I am a newbie with STATA and working with it for my thesis. So, I have a dataset of companies from 2000-2019 with their information such as (industry codes, debt, equity, assets, etc.).

    Now, I have grouped the companies based on the SIC codes to find out the number of companies in each industry.

    I am having a problem in 2 areas:

    1. I want to find the mean value of debt in every industry.
    2. I want to find the mean value of debt for every year for the whole sample

    How can I go about doing these?

    egen tocount = tag(c_id SIC_ID) (This is what I have for now to count for companies within a specific SIC code).

    Thanks in advance!

  • #2
    The command
    Code:
    egen debtbyindustry = mean(debt), by(sic)
    will give you the mean debt by sick industry
    Code:
    egen debtbyyear = mean(debt), by(year)
    will give you mean debt by year, and

    Code:
    egen debtbyindsutryyear = mean(debt), by(sic year)
    will give you the mean by the cells formed by the combinations of industry and year.

    Comment


    • #3
      Thanks Joro! I did use your formulas and it worked. I now have another query - hoping you can help in this one too.

      So I am first creating a variable meanPEby firm = mean(PE), by(c_id)... c_id has the name of all the unique companies in the dataset.

      The next step is to divide this into 3 sub-sets... low, medium, and high.. I am calculating the centiles for (33), (67), and (100).. Then, I am recoding PE by using these centile values..

      Theoretically, all the 3 sub-sets should be equal in firm size but it is not the case.. Can you understand the problem? What is it that I am doing wrong?

      My current code for this problem is:

      egen PEbyfirm = mean(PE), by(c_id)
      centile (PEbyfirm), c(33)
      centile (PEbyfirm), c(67)
      centile (PEbyfirm), c(100)
      recode PE (0/19.84421=1) (19.84422/29.15539=2) (29.15540/432.3908=3), gen (PE_ID)
      tab PE_ID
      label var PE_ID "Firm Grouping Division"
      lab define PE_IDlabel 1 "Low-Growth Firms" 2 "Mid-Growth Firms" 3 "High-Growth Firms"
      tabulate PE_ID, generate(dPE_ID)

      Comment


      • #4
        There is nothing in your calculation to ensure that the firms are "all the 3 sub-sets should be equal in firm size ," therefore I do not think that there is anything wrong with your code.

        If you have to do this often, there is a shorter way:

        Code:
        egen PEbyfirm = mean(PE), by(c_id)
        
         xtile tercilePE = PEbyfirm, nq(3) altdef
        my tercilePE should be identical to your PE_ID.

        Comment


        • #5
          Thanks Joro again for the help - your xtile code worked! But there's an issue, It is dividing the total observations (45,000) in 3 equal sub-sets.

          However, I want it to be divided into 3 sub-sets of equal firms. The total firms are 6,821 (so ideally 2274 in each one). Any advice on how to solve it?

          Comment


          • #6
            You can tag one observation per firm, Babar, do the classification, and then extend the classification to the rest of the observations. So try something like this:
            Code:
            egen firmtag = tag(c_id)
            
            xtile temptercilePE = PEbyfirm if firmtag, nq(3) altdef
            
            egen tercilePE = mean(temptercilePE), by(c_id)
            Originally posted by Babar Javed View Post
            Thanks Joro again for the help - your xtile code worked! But there's an issue, It is dividing the total observations (45,000) in 3 equal sub-sets.

            However, I want it to be divided into 3 sub-sets of equal firms. The total firms are 6,821 (so ideally 2274 in each one). Any advice on how to solve it?

            Comment


            • #7
              Hi again Joro! This has really solved it!! Thanks a lot for all your help

              One final small question - after running this line:
              xtile temptercilePE = PEbyfirm if firmtag, nq(3) altdef the number of firms are being reduced by 1 (now I have 6820 firms in equal groups). I can put the last one manually in a group but the code will be checked by the professor (so I have to make it work by the code itself).

              Comment


              • #8
                This got resolved - thanks again Joro!

                Comment

                Working...
                X