
  • How to count the number of business segments in Stata? How do I write the code? The relevant data were downloaded from WRDS. Screenshots of the data are included.

    Hi:
    My data come from WRDS and my research question is about gender diversity and firm performance. The data are a 7-year panel (2007 to 2013) and I run the regression with fixed effects. I use Stata 13.1.

    However, because the WRDS data are quite tricky, I am unsure how to write the Stata code to count the number of business segments.

    The data downloaded from WRDS are shown below.
    DATADATE = Ending date of the fiscal period being reported
    Source Date = "Date" of the report; up to 3 SRCDATEs for each DATADATE
    [Image: sample 1.JPG]



    I am curious about how to use these data to count the number of business segments.

    In particular, the number of Segment Identifiers differs for the same company on the same date. How can I count them?
    [Image: sample2.JPG]



    Actually, I do not want to use the data sorted by data date; I want to use the data sorted by data year. I have already used data year and ticker as key variables to merge about 10 data sets. Therefore, if the new data set does not have data year, I cannot merge it with the old one.

    Or could you recommend another database that can be used to count the number of business segments?

    WRDS provides SAS code to count the number of business segments sorted by GVKEY, SRCDATE and DATADATE, which does not solve my problem.

    Thank you very much!

  • #2
    Carlo asked you here to read the FAQ to learn about attachments. Screenshots are not a good way to share your data because anyone who would like to experiment in Stata would have to type everything in. Few list members are willing to do that and as a result you are less likely to get a response than if you had shared your data in a more usable format.



    • #3
      OK, I will be more careful about this. Thank you for reminding me.



      • #4
        This is also very unlikely to get an answer, because: Excel files are not universally readable, people are afraid of malware in these files and nobody wants to do the extra work of putting this into Stata.

        The best way to get an answer is to provide a useful data example that can be put into Stata using input. See dataex from SSC.

        Code:
        ssc install dataex
        h dataex
        Last edited by Jorge Eduardo Perez Perez; 28 Jul 2015, 12:07.
        Jorge Eduardo Pérez Pérez
        www.jorgeperezperez.com



        • #5
          You are right, it doesn't seem to be possible to attach CSV files. When I tried to upload a CSV file with the "Upload Attachments" button I got the error message "Invalid File". However, you can change the suffix from CSV to TXT and it is then possible to attach the file.

          Section 12 of the FAQ contains advice against Excel files.

          You can attach datasets or other documents, but that is usually much less convenient than the methods [described earlier in the FAQ]. Note, in particular, that MS Word and MS Excel file formats are not universally readable by forum members.



          • #6
            Thank you very much! I used dataex and copied its output from Stata below; it is part of the whole data set on the number of business segments. I hope I can get an answer this time. For convenience, I have again attached the file I want to merge with the new data set.

            Code:
            clear
            input int globalcompanykeysegmentannualfun str6 segmenttype byte segmentidentifier long(datadatesegmentannualfundamental sourcedatesegmentannualfundament) int(siccodeprimary siccodesecondary) str37 segmentname str3 tickersymbol
            1045 "BUSSEG" 1 20071231 20091231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20081231 20101231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20091231 20111231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20101231 20121231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20111231 20131231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20121231 20141231 4512 4581 "Air Transportation" "AAL" 
            1045 "BUSSEG" 1 20131231 20141231 4512 4581 "Air Transportation" "AAL" 
            1075 "BUSSEG" 8 20071231 20091231 6552 . "Real Estate" "PNW" 
            1075 "BUSSEG" 11 20071231 20091231 4911 . "All Other" "PNW" 
            1075 "BUSSEG" 9 20071231 20091231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 9 20081231 20101231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 11 20081231 20101231 4911 . "All Other" "PNW" 
            1075 "BUSSEG" 9 20091231 20111231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 11 20091231 20111231 4911 . "All Other" "PNW" 
            1075 "BUSSEG" 9 20101231 20121231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 11 20101231 20121231 4911 . "All Other" "PNW" 
            1075 "BUSSEG" 9 20111231 20131231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 9 20121231 20141231 4911 . "Regulated Electricity" "PNW" 
            1075 "BUSSEG" 9 20131231 20141231 4911 . "Regulated Electricity" "PNW" 
            1078 "BUSSEG" 18 20071231 20091231 2834 . "Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 20 20071231 20091231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 6 20071231 20091231 3841 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 21 20071231 20091231 . . "Other" "ABT" 
            1078 "BUSSEG" 19 20071231 20091231 2023 2834 "Nutritionals" "ABT" 
            1078 "BUSSEG" 21 20081231 20101231 . . "Other" "ABT" 
            1078 "BUSSEG" 20 20081231 20101231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 6 20081231 20101231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 18 20081231 20101231 2834 . "Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 19 20081231 20101231 2023 2834 "Nutritionals" "ABT" 
            1078 "BUSSEG" 22 20091231 20111231 2834 . "Proprietary Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 19 20091231 20111231 2023 2834 "Nutritionals" "ABT" 
            1078 "BUSSEG" 21 20091231 20111231 2835 7372 "Other" "ABT" 
            1078 "BUSSEG" 6 20091231 20111231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 20 20091231 20111231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 23 20091231 20111231 2834 . "Established Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 6 20101231 20121231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 21 20101231 20121231 2835 7372 "Other" "ABT" 
            1078 "BUSSEG" 20 20101231 20121231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 22 20101231 20121231 2834 . "Proprietary Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 23 20101231 20121231 2834 . "Established Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 19 20101231 20121231 2023 2834 "Nutritionals" "ABT" 
            1078 "BUSSEG" 19 20111231 20131231 2834 2023 "Nutritionals" "ABT" 
            1078 "BUSSEG" 21 20111231 20131231 2835 7372 "Other" "ABT" 
            1078 "BUSSEG" 20 20111231 20131231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 23 20111231 20131231 2834 . "Established Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 24 20111231 20131231 . . "Corporate" "ABT" 
            1078 "BUSSEG" 6 20111231 20131231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 19 20121231 20141231 2834 2023 "Nutritionals" "ABT" 
            1078 "BUSSEG" 6 20121231 20141231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 21 20121231 20141231 2835 7372 "Other" "ABT" 
            1078 "BUSSEG" 20 20121231 20141231 3842 . "Vascular" "ABT" 
            1078 "BUSSEG" 23 20121231 20141231 2834 . "Established Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 24 20121231 20141231 . . "Corporate" "ABT" 
            1078 "BUSSEG" 19 20131231 20141231 2834 2023 "Nutritionals" "ABT" 
            1078 "BUSSEG" 6 20131231 20141231 2835 . "Diagnostics" "ABT" 
            1078 "BUSSEG" 24 20131231 20141231 . . "Corporate" "ABT" 
            1078 "BUSSEG" 23 20131231 20141231 2834 . "Established Pharmaceuticals" "ABT" 
            1078 "BUSSEG" 21 20131231 20141231 2835 7372 "Other" "ABT" 
            1078 "BUSSEG" 20 20131231 20141231 3842 . "Vascular" "ABT" 
            1177 "BUSSEG" 22 20071231 20091231 . . "Corporate & Interest" "AET" 
            1177 "BUSSEG" 13 20071231 20091231 6311 . "Group Insurance" "AET" 
            1177 "BUSSEG" 18 20071231 20091231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 17 20071231 20091231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 13 20081231 20101231 6311 6300 "Group Insurance" "AET" 
            1177 "BUSSEG" 18 20081231 20101231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 22 20081231 20101231 . . "Corporate & Interest" "AET" 
            1177 "BUSSEG" 17 20081231 20101231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 18 20091231 20111231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 17 20091231 20111231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 22 20091231 20111231 . . "Corporate Financing" "AET" 
            1177 "BUSSEG" 13 20091231 20111231 6311 6300 "Group Insurance" "AET" 
            1177 "BUSSEG" 17 20101231 20121231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 22 20101231 20121231 . . "Corporate Financing" "AET" 
            1177 "BUSSEG" 18 20101231 20121231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 13 20101231 20121231 6311 6321 "Group Insurance" "AET" 
            1177 "BUSSEG" 22 20111231 20131231 . . "Corporate Financing" "AET" 
            1177 "BUSSEG" 18 20111231 20131231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 17 20111231 20131231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 13 20111231 20131231 6311 6321 "Group Insurance" "AET" 
            1177 "BUSSEG" 18 20121231 20141231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 22 20121231 20141231 . . "Corporate Financing" "AET" 
            1177 "BUSSEG" 13 20121231 20141231 6311 6321 "Group Insurance" "AET" 
            1177 "BUSSEG" 17 20121231 20141231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 17 20131231 20141231 6324 6300 "Healthcare" "AET" 
            1177 "BUSSEG" 13 20131231 20141231 6311 6321 "Group Insurance" "AET" 
            1177 "BUSSEG" 18 20131231 20141231 6371 6311 "Large Case Pensions" "AET" 
            1177 "BUSSEG" 22 20131231 20141231 . . "Corporate Financing" "AET" 
            1209 "BUSSEG" 12 20070930 20090930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 16 20070930 20090930 2813 5169 "Electronics and Performance Materials" "APD" 
            1209 "BUSSEG" 15 20070930 20090930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 14 20070930 20090930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 13 20070930 20090930 . . "Other" "APD" 
            1209 "BUSSEG" 13 20080930 20100930 . . "Other" "APD" 
            1209 "BUSSEG" 16 20080930 20100930 2813 5169 "Electronics and Performance Materials" "APD" 
            1209 "BUSSEG" 14 20080930 20100930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 12 20080930 20100930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 15 20080930 20100930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 12 20090930 20110930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 14 20090930 20110930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 13 20090930 20110930 . . "Other" "APD" 
            1209 "BUSSEG" 16 20090930 20110930 2813 5169 "Electronics and Performance Materials" "APD" 
            1209 "BUSSEG" 15 20090930 20110930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 12 20100930 20120930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 15 20100930 20120930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 16 20100930 20120930 2813 5169 "Electronics Materials" "APD" 
            1209 "BUSSEG" 14 20100930 20120930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 13 20100930 20120930 . . "Other" "APD" 
            1209 "BUSSEG" 16 20110930 20130930 2813 5169 "Electronics Materials" "APD" 
            1209 "BUSSEG" 13 20110930 20130930 . . "Other" "APD" 
            1209 "BUSSEG" 12 20110930 20130930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 14 20110930 20130930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 15 20110930 20130930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 13 20120930 20140930 . . "Other" "APD" 
            1209 "BUSSEG" 15 20120930 20140930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 14 20120930 20140930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 16 20120930 20140930 2813 5169 "Electronics Materials" "APD" 
            1209 "BUSSEG" 12 20120930 20140930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 15 20130930 20140930 2813 5169 "Tonnage Gases" "APD" 
            1209 "BUSSEG" 13 20130930 20140930 . . "Other" "APD" 
            1209 "BUSSEG" 14 20130930 20140930 2813 5169 "Merchant Gases" "APD" 
            1209 "BUSSEG" 12 20130930 20140930 3559 3569 "Equipment and Energy" "APD" 
            1209 "BUSSEG" 16 20130930 20140930 2813 5169 "Electronics Materials" "APD" 
            end




            • #7
              Originally posted by Friedrich Huebler
              You are right, it doesn't seem to be possible to attach CSV files. When I tried to upload a CSV file with the "Upload Attachments" button I got the error message "Invalid File". However, you can change the suffix from CSV to TXT and it is then possible to attach the file.

              Section 12 of the FAQ contains advice against Excel files.
              Thank you very much! But my full sample is so large that the TXT file cannot be uploaded, so I have uploaded just part of my data using dataex.



              • #8
                The full sample isn't needed, the excerpt you shared is sufficient.

                Next step: Please explain what exactly you mean when you say "count the number of business segments". How is a business segment defined?

                For the record, the forum software was changed today and it is now possible to attach CSV files.



                • #9
                  It's not clear but, to count the number of "business segments", isn't it enough to count the number of distinct values of variable -segmentidentifier- ?

                  If so, see

                  http://www.stata.com/support/faqs/da...-observations/.
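A minimal sketch of that idea, assuming the goal is the number of distinct segment identifiers per firm and year. The small dataset and the names firm, year and segid are made up for illustration; they stand in for the variables in the posted example. One observation per distinct combination is tagged, and the tags are then totalled within each firm-year.

```stata
clear
input int firm int year byte segid
1 2007 1
1 2007 1
1 2007 2
1 2008 2
2 2007 5
2 2007 6
2 2007 7
end

* tag exactly one observation per distinct firm-year-segment combination
egen byte tag = tag(firm year segid)

* the number of distinct segments is the sum of the tags within firm-year
egen nseg = total(tag), by(firm year)

list, sepby(firm year) noobs
```

Because only tagged observations are counted, a segment that appears more than once in a firm-year (for example under several source dates) is counted once.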





                  You should:

                  1. Read the FAQ carefully.

                  2. "Say exactly what you typed and exactly what Stata typed (or did) in response. N.B. exactly!"

                  3. Describe your dataset. Use list to list data when you are doing so. Use input to type in your own dataset fragment that others can experiment with.

                  4. Use the advanced editing options to appropriately format quotes, data, code and Stata output. The advanced options can be toggled on/off using the A button in the top right corner of the text editor.



                  • #10
                    Here's my best interpretation of what is asked

                    Code:
                    clear
                    input int globalcompanykeysegmentannualfun str6 segmenttype byte segmentidentifier long(datadatesegmentannualfundamental sourcedatesegmentannualfundament) int(siccodeprimary siccodesecondary) str37 segmentname str3 tickersymbol
                    1045 "BUSSEG" 1 20071231 20091231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20081231 20101231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20091231 20111231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20101231 20121231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20111231 20131231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20121231 20141231 4512 4581 "Air Transportation" "AAL" 
                    1045 "BUSSEG" 1 20131231 20141231 4512 4581 "Air Transportation" "AAL" 
                    1075 "BUSSEG" 8 20071231 20091231 6552 . "Real Estate" "PNW" 
                    1075 "BUSSEG" 11 20071231 20091231 4911 . "All Other" "PNW" 
                    1075 "BUSSEG" 9 20071231 20091231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 9 20081231 20101231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 11 20081231 20101231 4911 . "All Other" "PNW" 
                    1075 "BUSSEG" 9 20091231 20111231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 11 20091231 20111231 4911 . "All Other" "PNW" 
                    1075 "BUSSEG" 9 20101231 20121231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 11 20101231 20121231 4911 . "All Other" "PNW" 
                    1075 "BUSSEG" 9 20111231 20131231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 9 20121231 20141231 4911 . "Regulated Electricity" "PNW" 
                    1075 "BUSSEG" 9 20131231 20141231 4911 . "Regulated Electricity" "PNW" 
                    1078 "BUSSEG" 18 20071231 20091231 2834 . "Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 20 20071231 20091231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 6 20071231 20091231 3841 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 21 20071231 20091231 . . "Other" "ABT" 
                    1078 "BUSSEG" 19 20071231 20091231 2023 2834 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 21 20081231 20101231 . . "Other" "ABT" 
                    1078 "BUSSEG" 20 20081231 20101231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 6 20081231 20101231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 18 20081231 20101231 2834 . "Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 19 20081231 20101231 2023 2834 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 22 20091231 20111231 2834 . "Proprietary Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 19 20091231 20111231 2023 2834 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 21 20091231 20111231 2835 7372 "Other" "ABT" 
                    1078 "BUSSEG" 6 20091231 20111231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 20 20091231 20111231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 23 20091231 20111231 2834 . "Established Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 6 20101231 20121231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 21 20101231 20121231 2835 7372 "Other" "ABT" 
                    1078 "BUSSEG" 20 20101231 20121231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 22 20101231 20121231 2834 . "Proprietary Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 23 20101231 20121231 2834 . "Established Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 19 20101231 20121231 2023 2834 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 19 20111231 20131231 2834 2023 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 21 20111231 20131231 2835 7372 "Other" "ABT" 
                    1078 "BUSSEG" 20 20111231 20131231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 23 20111231 20131231 2834 . "Established Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 24 20111231 20131231 . . "Corporate" "ABT" 
                    1078 "BUSSEG" 6 20111231 20131231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 19 20121231 20141231 2834 2023 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 6 20121231 20141231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 21 20121231 20141231 2835 7372 "Other" "ABT" 
                    1078 "BUSSEG" 20 20121231 20141231 3842 . "Vascular" "ABT" 
                    1078 "BUSSEG" 23 20121231 20141231 2834 . "Established Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 24 20121231 20141231 . . "Corporate" "ABT" 
                    1078 "BUSSEG" 19 20131231 20141231 2834 2023 "Nutritionals" "ABT" 
                    1078 "BUSSEG" 6 20131231 20141231 2835 . "Diagnostics" "ABT" 
                    1078 "BUSSEG" 24 20131231 20141231 . . "Corporate" "ABT" 
                    1078 "BUSSEG" 23 20131231 20141231 2834 . "Established Pharmaceuticals" "ABT" 
                    1078 "BUSSEG" 21 20131231 20141231 2835 7372 "Other" "ABT" 
                    1078 "BUSSEG" 20 20131231 20141231 3842 . "Vascular" "ABT" 
                    1177 "BUSSEG" 22 20071231 20091231 . . "Corporate & Interest" "AET" 
                    1177 "BUSSEG" 13 20071231 20091231 6311 . "Group Insurance" "AET" 
                    1177 "BUSSEG" 18 20071231 20091231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 17 20071231 20091231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 13 20081231 20101231 6311 6300 "Group Insurance" "AET" 
                    1177 "BUSSEG" 18 20081231 20101231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 22 20081231 20101231 . . "Corporate & Interest" "AET" 
                    1177 "BUSSEG" 17 20081231 20101231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 18 20091231 20111231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 17 20091231 20111231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 22 20091231 20111231 . . "Corporate Financing" "AET" 
                    1177 "BUSSEG" 13 20091231 20111231 6311 6300 "Group Insurance" "AET" 
                    1177 "BUSSEG" 17 20101231 20121231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 22 20101231 20121231 . . "Corporate Financing" "AET" 
                    1177 "BUSSEG" 18 20101231 20121231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 13 20101231 20121231 6311 6321 "Group Insurance" "AET" 
                    1177 "BUSSEG" 22 20111231 20131231 . . "Corporate Financing" "AET" 
                    1177 "BUSSEG" 18 20111231 20131231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 17 20111231 20131231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 13 20111231 20131231 6311 6321 "Group Insurance" "AET" 
                    1177 "BUSSEG" 18 20121231 20141231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 22 20121231 20141231 . . "Corporate Financing" "AET" 
                    1177 "BUSSEG" 13 20121231 20141231 6311 6321 "Group Insurance" "AET" 
                    1177 "BUSSEG" 17 20121231 20141231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 17 20131231 20141231 6324 6300 "Healthcare" "AET" 
                    1177 "BUSSEG" 13 20131231 20141231 6311 6321 "Group Insurance" "AET" 
                    1177 "BUSSEG" 18 20131231 20141231 6371 6311 "Large Case Pensions" "AET" 
                    1177 "BUSSEG" 22 20131231 20141231 . . "Corporate Financing" "AET" 
                    1209 "BUSSEG" 12 20070930 20090930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 16 20070930 20090930 2813 5169 "Electronics and Performance Materials" "APD" 
                    1209 "BUSSEG" 15 20070930 20090930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 14 20070930 20090930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 13 20070930 20090930 . . "Other" "APD" 
                    1209 "BUSSEG" 13 20080930 20100930 . . "Other" "APD" 
                    1209 "BUSSEG" 16 20080930 20100930 2813 5169 "Electronics and Performance Materials" "APD" 
                    1209 "BUSSEG" 14 20080930 20100930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 12 20080930 20100930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 15 20080930 20100930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 12 20090930 20110930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 14 20090930 20110930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 13 20090930 20110930 . . "Other" "APD" 
                    1209 "BUSSEG" 16 20090930 20110930 2813 5169 "Electronics and Performance Materials" "APD" 
                    1209 "BUSSEG" 15 20090930 20110930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 12 20100930 20120930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 15 20100930 20120930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 16 20100930 20120930 2813 5169 "Electronics Materials" "APD" 
                    1209 "BUSSEG" 14 20100930 20120930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 13 20100930 20120930 . . "Other" "APD" 
                    1209 "BUSSEG" 16 20110930 20130930 2813 5169 "Electronics Materials" "APD" 
                    1209 "BUSSEG" 13 20110930 20130930 . . "Other" "APD" 
                    1209 "BUSSEG" 12 20110930 20130930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 14 20110930 20130930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 15 20110930 20130930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 13 20120930 20140930 . . "Other" "APD" 
                    1209 "BUSSEG" 15 20120930 20140930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 14 20120930 20140930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 16 20120930 20140930 2813 5169 "Electronics Materials" "APD" 
                    1209 "BUSSEG" 12 20120930 20140930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 15 20130930 20140930 2813 5169 "Tonnage Gases" "APD" 
                    1209 "BUSSEG" 13 20130930 20140930 . . "Other" "APD" 
                    1209 "BUSSEG" 14 20130930 20140930 2813 5169 "Merchant Gases" "APD" 
                    1209 "BUSSEG" 12 20130930 20140930 3559 3569 "Equipment and Energy" "APD" 
                    1209 "BUSSEG" 16 20130930 20140930 2813 5169 "Electronics Materials" "APD" 
                    end
                    
                    * please consider using shorter, simpler names, with variable labels if needed
                    rename datadatesegmentannualfundamental date
                    label var date "data date segment annual fundamental"
                    
                    * the date variable is stored as a long integer (YYYYMMDD), use arithmetic to extract the year
                    gen int datayear = round(date/10000)
                    
                    * adjust name for upcoming merge
                    rename tickersymbol ticker
                    
                    * order the data, verify that each segment appears only once per ticker year
                    isid ticker datayear segmentidentifier, sort
                    
                    * the number of segments is the number of observations in the group
                    by ticker datayear: gen nsegments = _N
                    
                    * reduce to variables needed
                    keep ticker datayear nsegments
                    
                    * reduce to one observation per ticker datayear group
                    by ticker datayear: keep if _n == 1
                    
                    list, sepby(ticker) noobs
                    
                    * merge with the other data
                    merge 1:1 ticker datayear using "merge2.dta"
                    
                    * use the following information to decide if observations that
                    * did not match should be kept or dropped.
                    tab _merge
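The isid check above stops with an error if a segment appears more than once in a ticker-year, which could happen in the full data if several source dates are reported for one data date, as the first post mentions. A hedged sketch of one way to handle that, assuming the most recent source date is the report to keep (that choice is an assumption and depends on the research design). The data and the short names segid and srcdate are invented for illustration.

```stata
clear
input str3 ticker int datayear byte segid long srcdate
"ABC" 2007 1 20071231
"ABC" 2007 2 20071231
"ABC" 2007 1 20081231
"ABC" 2007 2 20081231
"ABC" 2007 3 20081231
"ABC" 2008 1 20081231
end

* within each ticker-year, sort by source date and keep only the
* observations that belong to the most recent source date
bysort ticker datayear (srcdate): keep if srcdate == srcdate[_N]

* each segment should now appear once per ticker-year
isid ticker datayear segid

* count the segments in each ticker-year
bysort ticker datayear: gen nsegments = _N

list, sepby(ticker datayear) noobs
```

In this toy example the 2007 report dated 20071231 is dropped in favor of the one dated 20081231, so ABC has three segments in 2007 and one in 2008.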



                    • #11
                      Originally posted by Friedrich Huebler
                      The full sample isn't needed, the excerpt you shared is sufficient.

                      Next step: Please explain what exactly you mean when you say "count the number of business segments". How is a business segment defined?

                      For the record, the forum software was changed today and it is now possible to attach CSV files.
                      Thank you for the advice.

                      The standard definition of business segments is as follows:
                      A subsection of a company's overall operations that is more commonly seen within larger, diversified companies. The costs and income of each business segment are accounted for independently, and to be considered a business segment, a subsection must create income for its company.

                      In the data I shared, these are the observations with segment type 'BUSSEG'. In my research, the number of business segments measures firm complexity: the more segments, the more complex the firm.

                      Reference: Adams, R. B. & Ferreira, D. 2009. Women in the boardroom and their impact on governance and performance. Journal of Financial Economics, 94, 291-309.
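To make the definition concrete, here is a rough sketch of how the count could be restricted to business segments. It assumes the full download may also contain other segment types; the "GEOSEG" row and the ticker "ABC" are invented for illustration, since the posted excerpt contains only "BUSSEG" rows.

```stata
clear
input str6 segmenttype byte segmentidentifier str3 ticker int datayear
"BUSSEG" 1 "ABC" 2007
"BUSSEG" 2 "ABC" 2007
"GEOSEG" 1 "ABC" 2007
"BUSSEG" 1 "ABC" 2008
end

* keep business segments only, other segment types are not counted
keep if segmenttype == "BUSSEG"

* number of business segments in each ticker-year
bysort ticker datayear: gen nsegments = _N

list, sepby(ticker datayear) noobs
```

Here ABC has two business segments in 2007 and one in 2008; the hypothetical "GEOSEG" row does not enter the count.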






                      • #12
                        Originally posted by Robert Picard
                        Here's my best interpretation of what is asked

                        Code:
                        clear
                        input int globalcompanykeysegmentannualfun str6 segmenttype byte segmentidentifier long(datadatesegmentannualfundamental sourcedatesegmentannualfundament) int(siccodeprimary siccodesecondary) str37 segmentname str3 tickersymbol
                        1045 "BUSSEG" 1 20071231 20091231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20081231 20101231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20091231 20111231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20101231 20121231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20111231 20131231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20121231 20141231 4512 4581 "Air Transportation" "AAL"
                        1045 "BUSSEG" 1 20131231 20141231 4512 4581 "Air Transportation" "AAL"
                        1075 "BUSSEG" 8 20071231 20091231 6552 . "Real Estate" "PNW"
                        1075 "BUSSEG" 11 20071231 20091231 4911 . "All Other" "PNW"
                        1075 "BUSSEG" 9 20071231 20091231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 9 20081231 20101231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 11 20081231 20101231 4911 . "All Other" "PNW"
                        1075 "BUSSEG" 9 20091231 20111231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 11 20091231 20111231 4911 . "All Other" "PNW"
                        1075 "BUSSEG" 9 20101231 20121231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 11 20101231 20121231 4911 . "All Other" "PNW"
                        1075 "BUSSEG" 9 20111231 20131231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 9 20121231 20141231 4911 . "Regulated Electricity" "PNW"
                        1075 "BUSSEG" 9 20131231 20141231 4911 . "Regulated Electricity" "PNW"
                        1078 "BUSSEG" 18 20071231 20091231 2834 . "Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 20 20071231 20091231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 6 20071231 20091231 3841 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 21 20071231 20091231 . . "Other" "ABT"
                        1078 "BUSSEG" 19 20071231 20091231 2023 2834 "Nutritionals" "ABT"
                        1078 "BUSSEG" 21 20081231 20101231 . . "Other" "ABT"
                        1078 "BUSSEG" 20 20081231 20101231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 6 20081231 20101231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 18 20081231 20101231 2834 . "Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 19 20081231 20101231 2023 2834 "Nutritionals" "ABT"
                        1078 "BUSSEG" 22 20091231 20111231 2834 . "Proprietary Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 19 20091231 20111231 2023 2834 "Nutritionals" "ABT"
                        1078 "BUSSEG" 21 20091231 20111231 2835 7372 "Other" "ABT"
                        1078 "BUSSEG" 6 20091231 20111231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 20 20091231 20111231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 23 20091231 20111231 2834 . "Established Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 6 20101231 20121231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 21 20101231 20121231 2835 7372 "Other" "ABT"
                        1078 "BUSSEG" 20 20101231 20121231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 22 20101231 20121231 2834 . "Proprietary Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 23 20101231 20121231 2834 . "Established Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 19 20101231 20121231 2023 2834 "Nutritionals" "ABT"
                        1078 "BUSSEG" 19 20111231 20131231 2834 2023 "Nutritionals" "ABT"
                        1078 "BUSSEG" 21 20111231 20131231 2835 7372 "Other" "ABT"
                        1078 "BUSSEG" 20 20111231 20131231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 23 20111231 20131231 2834 . "Established Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 24 20111231 20131231 . . "Corporate" "ABT"
                        1078 "BUSSEG" 6 20111231 20131231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 19 20121231 20141231 2834 2023 "Nutritionals" "ABT"
                        1078 "BUSSEG" 6 20121231 20141231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 21 20121231 20141231 2835 7372 "Other" "ABT"
                        1078 "BUSSEG" 20 20121231 20141231 3842 . "Vascular" "ABT"
                        1078 "BUSSEG" 23 20121231 20141231 2834 . "Established Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 24 20121231 20141231 . . "Corporate" "ABT"
                        1078 "BUSSEG" 19 20131231 20141231 2834 2023 "Nutritionals" "ABT"
                        1078 "BUSSEG" 6 20131231 20141231 2835 . "Diagnostics" "ABT"
                        1078 "BUSSEG" 24 20131231 20141231 . . "Corporate" "ABT"
                        1078 "BUSSEG" 23 20131231 20141231 2834 . "Established Pharmaceuticals" "ABT"
                        1078 "BUSSEG" 21 20131231 20141231 2835 7372 "Other" "ABT"
                        1078 "BUSSEG" 20 20131231 20141231 3842 . "Vascular" "ABT"
                        1177 "BUSSEG" 22 20071231 20091231 . . "Corporate & Interest" "AET"
                        1177 "BUSSEG" 13 20071231 20091231 6311 . "Group Insurance" "AET"
                        1177 "BUSSEG" 18 20071231 20091231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 17 20071231 20091231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 13 20081231 20101231 6311 6300 "Group Insurance" "AET"
                        1177 "BUSSEG" 18 20081231 20101231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 22 20081231 20101231 . . "Corporate & Interest" "AET"
                        1177 "BUSSEG" 17 20081231 20101231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 18 20091231 20111231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 17 20091231 20111231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 22 20091231 20111231 . . "Corporate Financing" "AET"
                        1177 "BUSSEG" 13 20091231 20111231 6311 6300 "Group Insurance" "AET"
                        1177 "BUSSEG" 17 20101231 20121231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 22 20101231 20121231 . . "Corporate Financing" "AET"
                        1177 "BUSSEG" 18 20101231 20121231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 13 20101231 20121231 6311 6321 "Group Insurance" "AET"
                        1177 "BUSSEG" 22 20111231 20131231 . . "Corporate Financing" "AET"
                        1177 "BUSSEG" 18 20111231 20131231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 17 20111231 20131231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 13 20111231 20131231 6311 6321 "Group Insurance" "AET"
                        1177 "BUSSEG" 18 20121231 20141231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 22 20121231 20141231 . . "Corporate Financing" "AET"
                        1177 "BUSSEG" 13 20121231 20141231 6311 6321 "Group Insurance" "AET"
                        1177 "BUSSEG" 17 20121231 20141231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 17 20131231 20141231 6324 6300 "Healthcare" "AET"
                        1177 "BUSSEG" 13 20131231 20141231 6311 6321 "Group Insurance" "AET"
                        1177 "BUSSEG" 18 20131231 20141231 6371 6311 "Large Case Pensions" "AET"
                        1177 "BUSSEG" 22 20131231 20141231 . . "Corporate Financing" "AET"
                        1209 "BUSSEG" 12 20070930 20090930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 16 20070930 20090930 2813 5169 "Electronics and Performance Materials" "APD"
                        1209 "BUSSEG" 15 20070930 20090930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 14 20070930 20090930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 13 20070930 20090930 . . "Other" "APD"
                        1209 "BUSSEG" 13 20080930 20100930 . . "Other" "APD"
                        1209 "BUSSEG" 16 20080930 20100930 2813 5169 "Electronics and Performance Materials" "APD"
                        1209 "BUSSEG" 14 20080930 20100930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 12 20080930 20100930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 15 20080930 20100930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 12 20090930 20110930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 14 20090930 20110930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 13 20090930 20110930 . . "Other" "APD"
                        1209 "BUSSEG" 16 20090930 20110930 2813 5169 "Electronics and Performance Materials" "APD"
                        1209 "BUSSEG" 15 20090930 20110930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 12 20100930 20120930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 15 20100930 20120930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 16 20100930 20120930 2813 5169 "Electronics Materials" "APD"
                        1209 "BUSSEG" 14 20100930 20120930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 13 20100930 20120930 . . "Other" "APD"
                        1209 "BUSSEG" 16 20110930 20130930 2813 5169 "Electronics Materials" "APD"
                        1209 "BUSSEG" 13 20110930 20130930 . . "Other" "APD"
                        1209 "BUSSEG" 12 20110930 20130930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 14 20110930 20130930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 15 20110930 20130930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 13 20120930 20140930 . . "Other" "APD"
                        1209 "BUSSEG" 15 20120930 20140930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 14 20120930 20140930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 16 20120930 20140930 2813 5169 "Electronics Materials" "APD"
                        1209 "BUSSEG" 12 20120930 20140930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 15 20130930 20140930 2813 5169 "Tonnage Gases" "APD"
                        1209 "BUSSEG" 13 20130930 20140930 . . "Other" "APD"
                        1209 "BUSSEG" 14 20130930 20140930 2813 5169 "Merchant Gases" "APD"
                        1209 "BUSSEG" 12 20130930 20140930 3559 3569 "Equipment and Energy" "APD"
                        1209 "BUSSEG" 16 20130930 20140930 2813 5169 "Electronics Materials" "APD"
                        end
                        
                        * please consider using shorter, simpler names, with variable labels if needed
                        rename datadatesegmentannualfundamental date
                        label var date "data date segment annual fundamental"
                        
                        * the date variable is stored as a long integer (YYYYMMDD); use arithmetic to extract the year
                        gen int datayear = floor(date/10000)
                        
                        * adjust name for upcoming merge
                        rename tickersymbol ticker
                        
                        * order the data, verify that each segment appears only once per ticker year
                        isid ticker datayear segmentidentifier, sort
                        
                        * the number of segments is the number of observations in the group
                        by ticker datayear: gen nsegments = _N
                        
                        * reduce to variables needed
                        keep ticker datayear nsegments
                        
                        * reduce to one observation per ticker datayear group
                        by ticker datayear: keep if _n == 1
                        
                        list, sepby(ticker) noobs
                        
                        * merge with the other data
                        merge 1:1 ticker datayear using "merge2.dta"
                        
                        * use the following information to decide if observations that
                        * did not match should be kept or dropped.
                        tab _merge
                        Wow, this is so cool! Thank you very much! But I am still confused: how can I explain to others why the number of observations in a group represents the number of segments? Why is it appropriate to handle the data this way? Is there any reference that supports it?
                        You really helped me a lot!
                        Last edited by SHIYI LI; 28 Jul 2015, 17:42.
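
On why the group size can stand for the number of segments, here is a minimal sketch using only the PNW rows for 2007 and 2008 from the listing above. It assumes, as the quoted code does, that segmentidentifier identifies a segment. The variable name position is made up for illustration.

```stata
* PNW rows for 2007 and 2008, copied from the listing above
clear
input str3 ticker int datayear byte segmentidentifier
"PNW" 2007 8
"PNW" 2007 9
"PNW" 2007 11
"PNW" 2008 9
"PNW" 2008 11
end

* isid exits with an error unless the three variables uniquely identify
* the observations, so getting past this line means each segment appears
* exactly once per ticker-year
isid ticker datayear segmentidentifier, sort

* within a by-group, _n is the position of the observation in the group
* and _N is the number of observations in the group
by ticker datayear: gen position  = _n
by ticker datayear: gen nsegments = _N

list, sepby(ticker datayear) noobs
```

Because every row in a ticker-year group is a different segment, counting rows and counting segments give the same number: nsegments is 3 for 2007 and 2 for 2008. If isid fails, the rows are not one per segment and _N would overcount.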



                        • #13

                          Robert Picard

                          Thank you very much!
                          However, when I type "isid ticker datayear segmentidentifier, sort" in Stata, it tells me:
                          variables ticker datayear segmentidentifier do not uniquely identify the observations
                          r(459)
                          How can I deal with this problem?

                          Last edited by SHIYI LI; 29 Jul 2015, 10:26.
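
One way to see what is behind an r(459) error like this is Stata's duplicates command. The sketch below runs on a few made-up rows; the sourcedate values are hypothetical and not taken from the WRDS file. One plausible cause, given the first post's note that there can be up to three source dates for each data date, is the same segment being reported under several source dates, but that is an assumption to check against the real data.

```stata
* hypothetical rows: two segments are repeated under different source dates
clear
input str3 ticker int datayear byte segmentidentifier long sourcedate
"AAL" 2009 1  20111231
"AAL" 2010 1  20111231
"AAL" 2010 1  20121231
"PNW" 2010 9  20121231
"PNW" 2010 11 20111231
"PNW" 2010 11 20121231
end

* summarize how many observations are surplus copies on the three keys
duplicates report ticker datayear segmentidentifier

* dup is 0 for unique observations, otherwise the number of other
* observations that share the same three key values
duplicates tag ticker datayear segmentidentifier, generate(dup)

* inspect the repeated rows before deciding which ones to keep
sort ticker datayear segmentidentifier sourcedate
list if dup > 0, sepby(ticker datayear) noobs
```

Looking at the listed rows first makes it easier to judge whether the repeats can simply be dropped or whether one particular source date should be preferred.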



                          • #14
                            I assume that your segments are identified by the segmentidentifier variable and that you want to count the number of distinct segments. If that's the case, then all you need to do is remove the duplicates within ticker datayear segmentidentifier groups. Here's a shorter example with two duplicates.

                            Code:
                            clear
                            input str3 ticker int datayear byte segmentidentifier
                            "AAL" 2007 1 
                            "AAL" 2008 1 
                            "AAL" 2009 1 
                            "AAL" 2010 1 
                            "AAL" 2010 1 
                            "AAL" 2011 1 
                            "AAL" 2012 1 
                            "AAL" 2013 1 
                            "PNW" 2007 8 
                            "PNW" 2007 9 
                            "PNW" 2007 11 
                            "PNW" 2008 9 
                            "PNW" 2008 11 
                            "PNW" 2009 9 
                            "PNW" 2009 11 
                            "PNW" 2010 9 
                            "PNW" 2010 11 
                            "PNW" 2010 11 
                            "PNW" 2011 9 
                            "PNW" 2012 9 
                            "PNW" 2013 9 
                            end
                            
                            * group by ticker datayear segmentidentifier and note obs in group
                            * and pick one representative observation for the group
                            sort ticker datayear segmentidentifier
                            by ticker datayear segmentidentifier: gen nobs = _N
                            by ticker datayear segmentidentifier: gen pick = _n == 1
                            
                            * groups with more than one observation contain duplicates
                            tab nobs pick
                            
                            * if all you want is to count the number of segments, only count
                            * each segment once per group
                            keep if pick
                            
                            * double-check that the 3 variables uniquely identify observations
                            isid ticker datayear segmentidentifier, sort
                            
                            * the number of distinct segments is the number of observations in the group
                            by ticker datayear: gen nsegments = _N
                            
                            * reduce to variables needed
                            keep ticker datayear nsegments
                            
                            * reduce to one observation per ticker datayear group
                            by ticker datayear: keep if _n == 1
                            
                            list, sepby(ticker) noobs
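
If the duplicate rows need to stay in the data, the same count can be obtained without dropping anything. This is a sketch of an alternative, not the approach used in the code above: it relies on the egen functions tag() and total(), and the variable name tagged is made up for illustration. The rows are a subset of the example above.

```stata
clear
input str3 ticker int datayear byte segmentidentifier
"AAL" 2010 1
"AAL" 2010 1
"PNW" 2010 9
"PNW" 2010 11
"PNW" 2010 11
"PNW" 2011 9
end

* tag() sets exactly one observation per distinct combination to 1
* and all other observations to 0
egen byte tagged = tag(ticker datayear segmentidentifier)

* summing the tags within each ticker-year counts the distinct segments
egen nsegments = total(tagged), by(ticker datayear)

sort ticker datayear segmentidentifier
list, sepby(ticker datayear) noobs
```

Here nsegments is 1 for AAL in 2010, 2 for PNW in 2010 and 1 for PNW in 2011, and every original observation is still present. Reducing to one row per ticker datayear for the merge can then be done as before.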



                            • #15
                              Thank you very much! The code works!

