
  • frequency table

    Hello,

    I have a sample of 158 firms, all belonging to a certain industry. I recoded the industry the firm is active in based on the first two digits of the NACE code and created a dummy for each industry.
    I ended up with 20 dummies. I now want to create a frequency table, to see how the firms are distributed over these different industries. Is there any way to do this using the dummies, or should I go back to my original data, in which I had the first two digits of the industry for each company?

    For clarity, this is how I constructed these dummies:

    Code:
    generate twodigitsNACE = int(NACEBEL2008codeprincipal/1000)
    generate industry08 = twodigitsNACE == 08 if !missing(twodigitsNACE)
    generate industry10 = twodigitsNACE == 10 if !missing(twodigitsNACE)
    generate industry11 = twodigitsNACE == 11 if !missing(twodigitsNACE)
    generate industry13 = twodigitsNACE == 13 if !missing(twodigitsNACE)
    generate industry16 = twodigitsNACE == 16 if !missing(twodigitsNACE) 
    generate industry17 = twodigitsNACE == 17 if !missing(twodigitsNACE) 
    generate industry19 = twodigitsNACE == 19 if !missing(twodigitsNACE)
    generate industry20 = twodigitsNACE == 20 if !missing(twodigitsNACE) 
    generate industry21 = twodigitsNACE == 21 if !missing(twodigitsNACE) 
    generate industry22 = twodigitsNACE == 22 if !missing(twodigitsNACE)
    generate industry23 = twodigitsNACE == 23 if !missing(twodigitsNACE) 
    generate industry24 = twodigitsNACE == 24 if !missing(twodigitsNACE)
    generate industry25 = twodigitsNACE == 25 if !missing(twodigitsNACE)
    generate industry27 = twodigitsNACE == 27 if !missing(twodigitsNACE)
    generate industry28 = twodigitsNACE == 28 if !missing(twodigitsNACE)
    generate industry29 = twodigitsNACE == 29 if !missing(twodigitsNACE)
    generate industry30 = twodigitsNACE == 30 if !missing(twodigitsNACE)
    generate industry35 = twodigitsNACE == 35 if !missing(twodigitsNACE)
    generate industry39 = twodigitsNACE == 39 if !missing(twodigitsNACE) 
    generate industry41 = twodigitsNACE == 41 if !missing(twodigitsNACE) 
    generate industry42 = twodigitsNACE == 42 if !missing(twodigitsNACE)
    generate industry46 = twodigitsNACE == 46 if !missing(twodigitsNACE)
    generate industry47 = twodigitsNACE == 47 if !missing(twodigitsNACE) 
    generate industry49 = twodigitsNACE == 49 if !missing(twodigitsNACE) 
    generate industry52 = twodigitsNACE == 52 if !missing(twodigitsNACE)
    generate industry61 = twodigitsNACE == 61 if !missing(twodigitsNACE)
    generate industry63 = twodigitsNACE == 63 if !missing(twodigitsNACE)
    generate industry70 = twodigitsNACE == 70 if !missing(twodigitsNACE)
    generate industry72 = twodigitsNACE == 72 if !missing(twodigitsNACE)
    generate industry81 = twodigitsNACE == 81 if !missing(twodigitsNACE) 
    generate industry82 = twodigitsNACE == 82 if !missing(twodigitsNACE)
    generate industry85 = twodigitsNACE == 85 if !missing(twodigitsNACE)
    generate industry94 = twodigitsNACE == 94 if !missing(twodigitsNACE)
    Thank you!
    Timea

  • #2
    The generation of all of those dummies can be greatly simplified. Then, if I understand your query, a loop over the dummies could be used.

    Code:
    tab twodigitsNACE, gen(industry)   // replace the manual indicator creation
    
    foreach v of varlist industry* {
      tab firm if industry==1
    }



    • #3
      Why do you want a table here? How about

      Code:
      sort twodigitsNACE firm
      
      list twodigitsNACE firm, sepby(twodigitsNACE) noobs
      In #2 I think Leonardo Guizzetti means

      Code:
      tab firm if `v'
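
      Putting this correction together with the code in #2, the whole loop might look like the sketch below. It assumes that twodigitsNACE and firm exist as in #1 and that no industry* variables are in memory yet. The sketch uses `v' == 1 instead of a bare `v' as the condition: the two are equivalent when there are no missing values, but Stata treats a missing value as true in an if condition, so the explicit comparison keeps firms with a missing industry code out of every table.

      ```stata
      * Sketch only: assumes twodigitsNACE and firm exist and that
      * no variables named industry* have been created yet.
      tabulate twodigitsNACE, generate(industry)

      foreach v of varlist industry* {
          * tabulate, generate() labels each indicator "twodigitsNACE==<code>",
          * so the label shows which industry the indicator stands for
          display as text "`v': `: variable label `v''"
          tabulate firm if `v' == 1
      }
      ```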



      • #4
        Thank you for the tip on how to create the dummies more efficiently; I will definitely keep this in mind. However, no matter how the dummies were created, I want to see how the firms are distributed across industries. The code you proposed, Nick Cox, lists the firms within each industry, but I only need the number of firms in each industry and their percentage share of the total.



        • #5
          I am sorry, I did not realize that this code does exactly what I want! Thank you so much!



          • #6
            Code:
             
             tab twodigitsNACE
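
            A one-way tabulate already reports the frequency, the percentage, and the cumulative percentage for each code. As a hedged sketch, the options below order the industries by frequency and show firms with a missing code as their own row. The second command answers the question in #1 about working from the dummies alone; it assumes the 0/1 indicators are the only variables whose names start with industry.

            ```stata
            * Counts and percentage shares per two-digit code,
            * largest industries first, missing codes listed separately
            tabulate twodigitsNACE, sort missing

            * Same information from the 0/1 dummies: the sum of each dummy is
            * the number of firms, the mean is the share among non-missing firms
            tabstat industry*, statistics(sum mean) columns(statistics)
            ```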



            • #7

              However, if I run the code
              Code:
                  
               tab twodigitsNACE, gen(industry)
              new variables are created, but they are not named with the numbers I need (I have to know the industry code number when regressing). If I run the second piece of code, however, Stata states that industry* is not found.
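
              One way to get indicators named after the industry code, rather than industry1, industry2, and so on, is to loop over the distinct values of twodigitsNACE. This is a minimal sketch, not something proposed elsewhere in this thread. It assumes that no industry* variables exist when it runs, since names such as industry10 would otherwise clash with those created by tabulate, generate(). The regress line is commented out because outcome is a hypothetical dependent variable.

              ```stata
              * drop industry*    // uncomment only if earlier industry* variables may be discarded

              * Collect the distinct two-digit codes present in the data
              levelsof twodigitsNACE, local(codes)

              foreach c of local codes {
                  * Format the code with a leading zero, e.g. 8 becomes 08
                  local suffix : display %02.0f `c'
                  generate byte industry`suffix' = (twodigitsNACE == `c') if !missing(twodigitsNACE)
              }

              * Alternative without any dummies: factor-variable notation keeps
              * the industry code visible in the regression output
              * regress outcome i.twodigitsNACE
              ```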



              • #8
                Originally posted by Nick Cox
                Why do you want a table here? How about

                [CODE]
                In #2 I think Leonardo Guizzetti means

                Code:
                tab firm if `v'
                Thanks for spotting this error. Nick is correct.
