  • Counting non-missing values - long panel data

    Hello,

    I am currently struggling with the following issue:

    I have a big dataset with around 12,000 individuals. For each individual, I have information on 18 consecutive years. Each individual is matched to his/her father.
    I would like to create a variable that allows me to assess the number of individuals (small n) who have non-missing values for income (of the individual) and non-missing values of father's income at the same time.

    In summary, I would like to calculate the small n for certain variables, subject to certain restrictions. The result should be something like the small n from xttab, but I need to do it manually.
    I thought of creating a dummy variable that is equal to 1 if the individual has at least one non-missing observation for income (over the 18 years) and at least one non-missing observation for father's income (over the 18 years), and 0 if income and father's income are missing in all 18 years.

    I am already struggling to create the dummy for each case separately, never mind with two restrictions.
    Could anyone help?

    Thank you.




  • #2
    Assume variables

    Code:
    id year income fincome
    then you want -- it seems --

    Code:
    egen totalbyyear = total(!missing(income) & !missing(fincome)), by(year)
    egen totalbyid = total(!missing(income) & !missing(fincome)), by(id)
    If father's income is not aligned, see e.g. http://www.statalist.org/forums/foru...in-a-household

    If this doesn't help (enough), please go back to FAQ Advice especially #12 and provide a data example.

    I don't see that you need dummies at all. You can count satisfactory observations directly. You just need to build on true = 1, false = 0.
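
    To get from these per-observation counts to the number of individuals (the small n), one possibility is to tag one observation per id and count the tagged observations that qualify. This is only a minimal sketch, assuming the variable names above; bothobs, anyinc, anyfinc and tagid are made-up names for illustration.

    ```stata
    * Assumes the variables id, year, income, fincome from the example above.
    * bothobs, anyinc, anyfinc and tagid are hypothetical names.

    * Observations per id with income and father's income both non-missing
    egen bothobs = total(!missing(income) & !missing(fincome)), by(id)

    * Tag exactly one observation per id so each individual is counted once
    egen byte tagid = tag(id)

    * Individuals with at least one year in which both are non-missing
    count if tagid & bothobs > 0

    * Looser reading: each variable is non-missing in some year,
    * not necessarily the same year
    egen byte anyinc  = max(!missing(income)), by(id)
    egen byte anyfinc = max(!missing(fincome)), by(id)
    count if tagid & anyinc & anyfinc
    ```

    The first count requires both incomes to be observed in the same year. The second only requires each to be observed at least once over the 18 years, which is closer to the dummy described in #1.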



    • #3
      Hello Nick,

      Thanks for the quick reply. It works perfectly!
