  • Distinct countries and total frequencies by year

    I am working with two variables: year and country. How can I obtain the number of distinct countries and the total country frequency for each year?

    I tried
    Code:
      by year, sort: tab country
    This gives the country frequencies within each year, but not the number of distinct countries. Is there a way to obtain the desired output, i.e. the number of distinct countries and the total frequency for each year, in one table? Thanks.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input float year str38 country
    2007 "Australia"               
    2007 "Australia"               
    2007 "Australia"               
    2007 "Australia"               
    2007 "Australia"               
    2007 "Australia"               
    2007 "Brazil"                  
    2007 "Brazil"                  
    2007 "Brazil"                  
    2007 "Canada"                  
    2007 "Canada"                  
    2007 "Canada"                  
    2007 "Canada"                  
    2007 "Canada"                  
    2007 "China"                   
    2007 "China"                   
    2007 "China"                   
    2007 "Estonia"                 
    2007 "France"                  
    2007 "France"                  
    2007 "Germany"                 
    2007 "Germany"                 
    2007 "Guernsey"                
    2007 "Hong Kong"               
    2007 "Hong Kong"               
    2007 "India"                   
    2007 "India"                   
    2007 "Isle of Man"             
    2007 "Israel"                  
    2007 "Italy"                   
    2007 "Japan"                   
    2007 "Japan"                   
    2007 "Japan"                   
    2007 "Japan"                   
    2007 "Malaysia"                
    2007 "Mexico"                  
    2007 "Poland"                  
    2007 "Poland"                  
    2007 "Russian Federation"      
    2007 "Russian Federation"      
    2007 "Russian Federation"      
    2007 "Saudi Arabia"            
    2007 "South Korea"             
    2007 "United States of America"
    2007 "United States of America"
    2007 "United States of America"
    2007 "United States of America"
    2007 "Vietnam"                 
    2017 "Australia"               
    2017 "Chile"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "China"                   
    2017 "Egypt"                   
    2017 "Hong Kong"               
    2017 "Hong Kong"               
    2017 "Hong Kong"               
    2017 "Hong Kong"               
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "India"                   
    2017 "Indonesia"               
    2017 "Indonesia"               
    2017 "Israel"                  
    2017 "Israel"                  
    2017 "Italy"                   
    2017 "Japan"                   
    2017 "Japan"                   
    2017 "Japan"                   
    2017 "Japan"                   
    2017 "Malaysia"                
    2017 "Malaysia"                
    2017 "Philippines"             
    2017 "Singapore"               
    2017 "Singapore"               
    2017 "South Korea"             
    2017 "South Korea"             
    2017 "United Kingdom"          
    2017 "United Kingdom"          
    2017 "United Kingdom"          
    2017 "United Kingdom"          
    2017 "United States of America"
    2017 "United States of America"
    2017 "United States of America"
    2017 "Vietnam"                 
    end

  • #2
    Reviewed at length in https://www.stata-journal.com/sjpdf....iclenum=dm0042

    That gives an otherwise unpredictable search term, dm0042, which can be used to find many, many discussions of related problems in this forum.



    • #3
      Thanks, Nick Cox. Here is my attempt:

      Code:
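      * tag exactly one observation per country-year pair
      egen tag = tag(country year)
      * sum the tags within each year: the number of distinct countries
      egen distinct_countries = total(tag), by(year)
      * count the observations in each year
      bysort year: gen total_freq = _N
      tabdisp year, cell(distinct_countries total_freq)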
      --------------------------------------------------
           year | distinct_countries          total_freq
      ----------+---------------------------------------
           2007 |                 22                  48
           2017 |                 17                  52
      --------------------------------------------------
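
      If a summary dataset is wanted rather than a display table, the same two numbers can probably be reached with official commands only. The following is a minimal sketch, not taken from the thread; the variable name n is an assumption chosen for illustration. It first reduces the data to one row per year-country pair, then counts and sums those rows within year.

      Code:
      preserve
      * one row per year-country pair; n (hypothetical name) holds its frequency
      contract year country, freq(n)
      * rows per year = distinct countries; sum of n = total frequency
      collapse (count) distinct_countries = n (sum) total_freq = n, by(year)
      list, noobs
      restore

      With the example data this should reproduce the counts in the table above. Note that contract keeps missing values of country as a category unless its nomiss option is specified, whereas egen's tag() ignores them by default, so the two approaches can differ when country has missing values.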



      • #4
        Compare also:

        Code:
        . bysort year: distinct country
        
        -----------------------------------------------------------------------------------------
        -> year = 2007
        
        --------------------------------
                 |     total   distinct
        ---------+----------------------
         country |        48         22
        --------------------------------
        
        -----------------------------------------------------------------------------------------
        -> year = 2017
        
        --------------------------------
                 |     total   distinct
        ---------+----------------------
         country |        52         17
        --------------------------------
        Code:
        SJ-15-3 dm0042_2  . . . . . . . . . . . . . . . . Software update for distinct
                (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                Q3/15   SJ 15(3):899
                improved table format and display of large numbers of
                observations
        
        SJ-12-2 dm0042_1  . . . . . . . . . . . . . . . . Software update for distinct
                (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                Q2/12   SJ 12(2):352
                options added to restrict output to variables with a minimum
                or maximum of distinct values
        
        SJ-8-4  dm0042  . . . . . . . . . . . .  Speaking Stata: Distinct observations
                (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
                Q4/08   SJ 8(4):557--568
                shows how to answer questions about distinct observations
                from first principles; provides a convenience command
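
        Because distinct is community-contributed, it has to be installed before the command above will run. The listing above looks like the result of searching on dm0042, whose entries include installation links. As a hedged sketch (the if condition and the display wording are illustrative choices, not from the thread), the counts can also be picked up from the results that distinct leaves behind in r():

        Code:
        * find the Stata Journal entries for distinct, with links to install it
        search dm0042, entry
        * run distinct for a single year, then reuse its stored results
        distinct country if year == 2007
        display r(ndistinct) " distinct countries in " r(N) " observations"

        Under a by: prefix only the results for the last group remain in r(), which is why this sketch restricts to one year with if.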
