  • Frequency in a variable

    Hi all,
    This seems to me a very easy thing to calculate, but I haven't been able to figure it out for hours now. I want to have in a separate variable: the count of the number of observations with the same value in a row of another variable. Preferably this would be combined with a "by" function. The command "tabulate" gives a table, the command "tablepc" gives it in relative terms, but I just want to have it in count terms.

    Isn't there a simple egen function or formula for this thing?

    Many thanks!

    Best wishes,
    Erik van der Marel

  • #2
    Erik:
    did you take a look at -help egen-?
    Kind regards,
    Carlo
    (Stata 19.0)


    • #3
      Yep, I did, and couldn't figure it out. I am an experienced user of STATA, but this is a tediously simple issue that I just can't resolve. I know it sounds too ludicrous for words. Please, if you know the simple command, you would make my day!


      • #4
        I don't understand your explanation. Perhaps you could show us some sample data (in a code block, of course!) along with hand-calculated results you would like to get from it?


        • #5
          Does something like this get you on the right track? It seems like there ought to be a more graceful way, but I think it does what you want. Obviously, if you only want the count for one value, you can take it out of the loop.
          Code:
          *=========make fake data
          clear
          set obs 1000
          set seed 1971
          
          gen var1=int(runiform()*8)+1
          
          *=======get counts of each value
          forvalues i=1/8 {
              gen var1_`i'=1 if var1==`i'
              egen var1Count_`i'=count(var1_`i')
              drop var1_`i'
          }
          sum


          • #6
            We are all still guessing, but I will start at a simpler end:

            Code:
             
            bysort x : gen count = _N
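
            A minimal sketch of what that line does, on made-up data (the values of x below are invented for illustration, and count2 is a hypothetical name). Within each by-group, _N is the number of observations in that group, so every observation receives the frequency of its own value of x. Since the original question asked about egen, the sketch also shows the egen count() function with the by() option; note that count() counts nonmissing values only, whereas _N counts all observations in the group.

            Code:
            * toy data: x takes the value 1 twice, 2 once and 3 three times
            clear
            input x
            1
            1
            2
            3
            3
            3
            end

            * _N under by: is the group size, i.e. the frequency of each value
            bysort x : gen count = _N

            * egen alternative; counts nonmissing x within each group
            egen count2 = count(x), by(x)

            * the two agree here because x has no missing values
            assert count == count2
            list, sepby(x)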


            • #7
              For both Clyde and Nick, here is what I am looking for. Below is a spreadsheet table taken from my STATA file. What I need is to compute the variable tab in an easy way. This variable tab counts the number of times the same value of com_4 appears within each value of the variable ind. For instance, for the ind value 111110 there are various values of com_4. Although the value 3334 appears only once for ind 111110, the value 3339 appears 4 times. You will see that the variable tab in this example accounts for that. Now I am looking for an easy way to compute this variable, also because I then need to split it into one variable for when _fillin is 0 and one for when _fillin is 1. Please let me know if you have any clue!
              com_4  ind     use  _fillin  tab
              3334   111110  0.1  0        1
              3335   111110  .    1        1
              3336   111110  .    1        1
              3339   111110  4.6  0        4
              3339   111110  5.7  0        4
              3339   111110  4.3  0        4
              3339   111110  0.7  0        4
              3341   111110  0.1  0        1
              3342   111110  0    0        2
              3342   111110  0    0        2
              3343   111110  .    1        1
              3344   111110  .    1        1
              3345   111110  0.2  0        1
              3346   111110  .    1        1
              Ben, thanks a lot! I will go ahead and have a careful look at it. But true, an easy egen code or something would be great indeed!

              Thanks a bunch guys!


              • #8
                This seems to me to be essentially what I suggested:

                Code:
                 
                bysort com_4: gen tab = _N
                The implication is that you want to do this separately by your ind variable, here a constant, but it's still counting subsets:

                Code:
                 
                bysort ind com_4: gen tab = _N
                and doing it separately by _fillin is then just the same idea. For a basic tutorial on by: see e.g. http://www.stata-journal.com/article.html?article=pr0004 and for the spelling "Stata" please see FAQ Advice here, Section 18.
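
                A minimal sketch of that last step, assuming the variable names from #7 are in memory; tab_fillin, tab0 and tab1 are hypothetical names that do not appear in the thread. Adding _fillin to the by-list counts observations separately for each ind, com_4 and _fillin combination, and the if qualifier then splits that count into one variable per _fillin value.

                Code:
                * count within each ind / com_4 / _fillin combination
                bysort ind com_4 _fillin: gen tab_fillin = _N

                * one variable per _fillin value; missing where it does not apply
                gen tab0 = tab_fillin if _fillin == 0
                gen tab1 = tab_fillin if _fillin == 1

                If every observation in a group should instead carry both counts, one possible variant is egen tab0 = total(_fillin == 0), by(ind com_4), and likewise with _fillin == 1 for tab1, since total() of a true/false expression sums the 1s within each group.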




                • #9
                  Thanks, Nick. I can't believe that this is so simple in the end. You should know that I am quite an experienced user of Stata (yes, I got it, thanks ;-)), but you might not believe it after this query of mine. Anyway, a great thank you!


                  • #10
                    I think it can take quite a lot of practice to see that using by: is the answer and to think through to the details. Same goes for many areas of Stata which I should be much more fluent in.
