  • List Unique Data of One or More Variables

    I would like to see a list of the unique values of one or more variables. Can anyone please suggest how best to go about it?

  • #2
    -help levelsof-



    • #3
      Thanks Clyde. I tried that already, but it did not give me what I wanted. I want the values in a table, listed side by side with other variables of my choice. Can you advise on how to achieve this? Sorry, I am relatively new to Stata.



      • #4
        Since you provide no example data, here is general code crafted for some variables in the built-in auto.dta set. It shouldn't be too hard for you to modify it for your specific problem.

        Code:
        clear*
        sysuse auto
        
        //  VARIABLES TO LIST DISTINCT VALUES OF: foreign, rep78, mpg
        
        frame create wanted byte foreign byte rep78 float mpg
        
        count
        frame wanted: set obs `r(N)'
        
        local focal_variables foreign rep78 mpg
        
        foreach v of varlist `focal_variables' {
            levelsof `v', local(values)
            local n_values: word count `values'
            forvalues i = 1/`n_values' {
                frame wanted: replace `v' = `:word `i' of `values'' in `i'
            }
        }
        
        frame change wanted
        local nvars: word count `focal_variables'
        egen mcount = rowmiss(`focal_variables')
        drop if mcount == `nvars'
        drop mcount
        list, noobs clean
        Note: Requires version 16 or later.

        Also, at the end of this code, the results are in a data set in frame wanted. So you can save it or export it to some other file format if you wish.
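For saving or exporting the frame, a minimal sketch follows. It is self-contained, so it builds a stand-in frame wanted with -frame put- and -duplicates drop- (distinct combinations of foreign and rep78) rather than reusing the loop above. The file names distinct_values.dta and distinct_values.csv are hypothetical. Like the code above, it needs version 16 or later.

```stata
clear*
sysuse auto

* Stand-in for the frame built above: distinct foreign/rep78 combinations
frame put foreign rep78, into(wanted)
frame wanted: duplicates drop

* Save as a Stata dataset and export as CSV (file names are hypothetical)
frame wanted: save distinct_values, replace
frame wanted: export delimited using "distinct_values.csv", replace

* If you switched to frame wanted earlier, this returns to the original data
frame change default
```

The original auto data stay untouched in the default frame throughout.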



        • #5
          I am not very clear as to what you want and how -levelsof- fails to deliver on your demands.

          But you can also try my command -levelstovar- (the .ado file is attached to this message; you need to put it somewhere Stata can find it, e.g., in "C:\ado\plus\l").

          Code:
          . sysuse auto
          (1978 Automobile Data)
          
          . levelstovar replevels = rep78, missing
          
          . levelstovar turnlevels = turn
          
          . list rep78 replevels turn turnlevels in 1/20, sep(0)
          
               +------------------------------------+
               | rep78   replev~s   turn   turnle~s |
               |------------------------------------|
            1. |     3          1     40         31 |
            2. |     3          2     40         32 |
            3. |     .          3     35         33 |
            4. |     3          4     40         34 |
            5. |     4          5     43         35 |
            6. |     3          .     43         36 |
            7. |     .         .z     34         37 |
            8. |     3         .z     42         38 |
            9. |     3         .z     43         39 |
           10. |     3         .z     42         40 |
           11. |     3         .z     44         41 |
           12. |     2         .z     43         42 |
           13. |     3         .z     45         43 |
           14. |     3         .z     34         44 |
           15. |     4         .z     43         45 |
           16. |     3         .z     31         46 |
           17. |     2         .z     41         48 |
           18. |     2         .z     40         51 |
           19. |     3         .z     43         .z |
           20. |     5         .z     35         .z |
               +------------------------------------+
          
          .
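If you are unsure where Stata looks for ado-files on your machine, the official commands below will show you. This is only a sketch: -which- will report the file's location once levelstovar.ado has been copied into one of the listed directories, and -capture noisily- is used here so that the check does not stop a do-file if the file is not yet installed.

```stata
* Show the system directories, including PLUS and PERSONAL
sysdir

* Show the full search path that Stata uses to find ado-files
adopath

* The PLUS directory on its own; files starting with "l" go in its "l" subfolder
display "`c(sysdir_plus)'"

* Check whether Stata can find the command (reports an error if it cannot)
capture noisily which levelstovar
```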


          • #6
            How can I share the data, please? I tried the code but didn't quite get what I wanted.



            • #7
              I too am at a loss to know what you want but https://www.statalist.org/forums/for...updated-on-ssc may point to a useful command.



              • #8
                Abiye Kalaiwo The helpful way to show example data is with the -dataex- command. It is also the easiest. If you are running version 17, 16 or a fully updated version 15.1 or 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
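As a rough illustration of what that looks like, here is a sketch using the built-in auto data. The choice of variables and of the first 10 observations is arbitrary; substitute whatever is relevant to your question.

```stata
sysuse auto, clear

* Only needed if -dataex- is not already part of your installation:
* ssc install dataex

* Produce a copy-and-paste example of three variables, first 10 observations
dataex make foreign rep78 in 1/10
```

The output in the Results window can then be pasted directly into a forum post.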

                That said, I think the main reason that after 7 posts to the thread you still have no answer to your question is that nobody can figure out what you want! While example data would probably be helpful, I think you also need to work out by hand what the results you seek would look like from the example data, and then show that as well. Several people have tried to guess your meaning. I don't think anybody else is likely to try at this point. So please show both example data and what you're trying to get.



                • #9
                  Thanks Clyde. I appreciate your support. Still learning so many things including how to ask questions here. To your point, see the dataset below:

                  sysuse auto.dta
                  (1978 Automobile Data)

                  How do I see only a unique list or table of the values of the variable 'make', which has the label 'Make and Model'? While I know there are multiple cars of the same make, I just want a list of the different makes. Also assume that each make has a unique code, e.g. 1, 2, etc. How do I display this unique list of makes alongside their respective codes?

                  The problem with levelsof is that it is unable to provide this information as a table (see below):

                  levelsof make, clean
                  AMC Concord AMC Pacer AMC Spirit Audi 5000 Audi Fox BMW 320i Buick Century Buick Electra Buick LeSabre Buick Opel Buick Regal Bu
                  > ick Riviera Buick Skylark Cad. Deville Cad. Eldorado Cad. Seville Chev. Chevette Chev. Impala Chev. Malibu Chev. Monte Carlo C
                  > hev. Monza Chev. Nova Datsun 200 Datsun 210 Datsun 510 Datsun 810 Dodge Colt Dodge Diplomat Dodge Magnum Dodge St. Regis Fiat
                  > Strada Ford Fiesta Ford Mustang Honda Accord Honda Civic Linc. Continental Linc. Mark V Linc. Versailles Mazda GLC Merc. Bobca
                  > t Merc. Cougar Merc. Marquis Merc. Monarch Merc. XR-7 Merc. Zephyr Olds 98 Olds Cutl Supr Olds Cutlass Olds Delta 88 Olds Omeg
                  > a Olds Starfire Olds Toronado Peugeot 604 Plym. Arrow Plym. Champ Plym. Horizon Plym. Sapporo Plym. Volare Pont. Catalina Pont
                  > . Firebird Pont. Grand Prix Pont. Le Mans Pont. Phoenix Pont. Sunbird Renault Le Car Subaru Toyota Celica Toyota Corolla Toyot
                  > a Corona VW Dasher VW Diesel VW Rabbit VW Scirocco Volvo 260


                  I apologize once again for not making this clearer.



                  • #10
                    Despite repeated requests for an example of your data, we have yet to see one. And the auto data doesn't really provide an example of what you seem to want. But let's suppose that we do this:


                    Code:
                    . sysuse auto, clear
                    (1978 automobile data)
                    
                    . sort make
                    
                    . gen code = _n
                    Then make and code are in one-to-one correspondence. Using groups as mentioned in #7 you can see the correspondence. For brevity, let's look at just the foreign cars.
                    Code:
                    . groups make code if foreign, sep(0)
                    
                      +-----------------------------------------+
                      | make             code   Freq.   Percent |
                      |-----------------------------------------|
                      | Audi 5000           4       1      4.55 |
                      | Audi Fox            5       1      4.55 |
                      | BMW 320i            6       1      4.55 |
                      | Datsun 200         23       1      4.55 |
                      | Datsun 210         24       1      4.55 |
                      | Datsun 510         25       1      4.55 |
                      | Datsun 810         26       1      4.55 |
                      | Fiat Strada        31       1      4.55 |
                      | Honda Accord       34       1      4.55 |
                      | Honda Civic        35       1      4.55 |
                      | Mazda GLC          39       1      4.55 |
                      | Peugeot 604        53       1      4.55 |
                      | Renault Le Car     65       1      4.55 |
                      | Subaru             66       1      4.55 |
                      | Toyota Celica      67       1      4.55 |
                      | Toyota Corolla     68       1      4.55 |
                      | Toyota Corona      69       1      4.55 |
                      | VW Dasher          70       1      4.55 |
                      | VW Diesel          71       1      4.55 |
                      | VW Rabbit          72       1      4.55 |
                      | VW Scirocco        73       1      4.55 |
                      | Volvo 260          74       1      4.55 |
                      +-----------------------------------------+
                    
                    . groups make code if foreign, sep(0) show(none)
                    
                      +-----------------------+
                      | make             code |
                      |-----------------------|
                      | Audi 5000           4 |
                      | Audi Fox            5 |
                      | BMW 320i            6 |
                      | Datsun 200         23 |
                      | Datsun 210         24 |
                      | Datsun 510         25 |
                      | Datsun 810         26 |
                      | Fiat Strada        31 |
                      | Honda Accord       34 |
                      | Honda Civic        35 |
                      | Mazda GLC          39 |
                      | Peugeot 604        53 |
                      | Renault Le Car     65 |
                      | Subaru             66 |
                      | Toyota Celica      67 |
                      | Toyota Corolla     68 |
                      | Toyota Corona      69 |
                      | VW Dasher          70 |
                      | VW Diesel          71 |
                      | VW Rabbit          72 |
                      | VW Scirocco        73 |
                      | Volvo 260          74 |
                      +-----------------------+
                    As the last command example shows, you can do other things, such as suppress the display of frequencies. The groups command can be used to list combinations (groups!) of values that occur in the data, regardless of how often they occur.
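Note that groups is community-contributed, so it must be installed first (ssc install groups). If you prefer official commands only, one possible sketch is to tag the first observation of each distinct value with -egen, tag()- and list just those. The variable name first is hypothetical, and code is the same made-up variable as above.

```stata
sysuse auto, clear
sort make
gen code = _n                  // made-up code, one per make, as above

* Flag exactly one observation for each distinct value of make
egen byte first = tag(make)

* List each distinct make once, alongside its code (foreign cars only)
list make code if first & foreign, noobs sep(0)
```

In the auto data every make occurs once, so every observation is tagged; in data with repeated values only one observation per distinct value would be listed.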



                    • #11
                      Thanks Nick. I apologize for not sharing the data as previously requested. Unfortunately, there are data sharing confidentiality agreements in place around the dataset. It is absolutely beyond my jurisdiction to share. I will review your guidance and revert if I have any other issues. Thank you and apologies once again.



                      • #12
                        #6 asked about how to share data and #8 explained how, but the confidentiality problem now reported in #11 is addressed already in https://www.statalist.org/forums/help#stata

                        We naturally don't expect you to post confidential data, but you should be able to fake data similar in structure and/or use a standard dataset to illustrate, as we did and you did earlier.

                        We'll see what further question you may have.



                        • #13
                          Thanks everyone for your input. I have been able to resolve the issue by using levelsof. Using tabulate and tabdisp was also helpful. I think this thread has also helped me learn how best to frame questions on the forum. Thanks once again to everyone.
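For anyone finding this thread later, here is a sketch of how those three commands might be combined. The exact commands used were not posted, so this is an assumption, illustrated with the auto data and the made-up code variable from #10.

```stata
sysuse auto, clear
sort make
gen code = _n                  // made-up code variable, as in #10

* Show each distinct value on its own line instead of one long wrapped list
levelsof make if foreign, local(makes)
foreach m of local makes {
    display `"`m'"'
}

* Frequency table of the distinct values
tabulate make if foreign

* Table of each make alongside its code
tabdisp make if foreign, cellvar(code)
```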
