  • a list of distinct values where a second variable satisfies a certain condition

    I’m trying to list the distinct values of one variable for observations where a second variable satisfies a certain condition.
    I’m trying to follow the advice “After using egen, tag() to create a tagged variable, you can list tagged observations to again show the distinct combinations.” from Speaking Stata: Distinct observations, The Stata Journal (2008), 8, Number 4, pp. 557–568 (page 11 of my PDF version of the article).

    Here’s an attempt using the auto dataset, i.e. listing the distinct car makes that are foreign.
    Code:
     
     * list unique make with foreign attribute
    clear
    sysuse auto
    egen tag = tag(foreign make)
    egen nvals = total(tag), by(foreign)
    tabdisp foreign, cell(nvals)
    but I don’t know how to set up the list command to list the makes that have now been tagged.

    I can see the end goal with the bysort command, e.g.
    bysort foreign: list make
    However, the eventual data file I’ll be analysing has tens of thousands of unique variables and only a small number of distinct variables to be listed.
    I hope this makes sense, thank you for reading through the problem, Dan

  • #2
    I'm not entirely sure what you are trying to do with the first few lines of code (tag == 1 for all observations here, since every make occurs only once), but your question
    list the makes which have now been tagged
    Would be:
    Code:
    tabdisp make if tag==1, cell(nvals)
    Or if only the foreign vehicles:
    Code:
    tabdisp make if tag==1 & foreign==1, cell(nvals)
    Is that what you were after?



    • #3
      I am not clear what the problem is here beyond specifying if.

      make is an identifier in the auto data in any case, so

      Code:
      list make if foreign, noobs
      lists the distinct (different) values, all of which happen to be unique (occur once only). On the terminology here, see e.g. Section 2 in https://www.stata-journal.com/sjpdf....iclenum=dm0042
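
      When the variable of interest is not an identifier, values repeat and tagging does the work. A minimal sketch, assuming rep78 from the auto data as a stand-in for such a variable (chosen here for illustration only, not part of the original question):

      Code:
      sysuse auto, clear
      * tag one observation for each distinct combination of foreign and rep78
      * (observations with missing rep78 are not tagged unless the missing option is used)
      egen tag = tag(foreign rep78)
      * sort so the listing comes out in order, then list only tagged foreign cars
      sort foreign rep78
      list foreign rep78 if tag & foreign, noobs
      Each distinct value of rep78 among foreign cars should then appear once, however many observations share it.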

      The paper discusses technique. If interested in the distinct command, get the program files from .

      Code:
      SJ-15-3 dm0042_2  . . . . . . . . . . . . . . . . Software update for distinct
              (help distinct if installed)  . . . . . .  N. J. Cox and G. M. Longton
              Q3/15   SJ 15(3):899
              improved table format and display of large numbers of
              observations

      Code:
      search dm0042_2, entry sj
      gives you a clickable link.
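
      Once distinct is installed, a short sketch of its use on the auto data (assuming the syntax of the dm0042_2 version; check help distinct for the options your copy supports):

      Code:
      sysuse auto, clear
      * count distinct values of each variable separately
      distinct make foreign
      * count distinct combinations of the two variables taken together
      distinct foreign rep78, joint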



      • #4
        Thank you for your help, Jorrit and Nick.
        Yes Jorrit, that tabdisp command followed on naturally from my egen, tag() steps and translated easily to the context of my data.
        Thank you Nick for the references and the clarification of the term "unique". I can now see that choosing the auto data wasn't a good illustration of my problem/task. I'll build a better example using a sample of my data when I encounter this issue again.
        Thanks again, Dan
