  • How do I compare a sub-sample of the data set with the full dataset?

    Hello

    I have a dataset of questionnaire responses from participants. My aim is to discover whether a subsample of this dataset is representative of the larger sample (which includes the sub-sample). I am unsure how I go about doing this. A simple t-test or ANOVA would not be appropriate as the sample sizes differ (120 vs 480) and the groups are not independent. I have searched forums and replies have suggested that you effectively split the larger sample into two sub-samples but that answers a different research question.
    On top of the statistical question, I would appreciate advice on how to do this in Stata. I have Stata 14.

    Thanks,
    Emma

  • #2
    You might find this useful:
    http://www.statalist.org/forums/foru...des-the-subset

    • #3
      This is how you can compare the subsamples:

      (1) Remove your subsample (n = 120) from your full sample (n = 480), if the latter is indeed your full sample. You will then have two non-overlapping subsamples, a (120) and b (360).
      (2) Take a random sample of the larger group b. Set the random seed first so the draw is reproducible, and use the count option of the sample command so that the number is read as a count of observations rather than a percentage.

      Code:
      * fix the seed so the random draw is reproducible
      set seed 02138
      use b, clear
      * keep 240 randomly chosen observations from b
      sample 240, count
      (3) Append the two datasets into one (see help append), with a variable indicating whether each observation belongs to a or b.

      (4) Test for differences between the two subsamples (a and b) you have now generated.
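
      As a rough sketch, steps (2) to (4) could be put together as below. It assumes the two subsamples are saved as a.dta and b.dta. The names from_a, score and item are made up for illustration: from_a is the indicator created by append, and score and item stand in for a continuous and a categorical questionnaire variable.

      Code:
      * step (2): reproducible random draw of 240 observations from b
      set seed 02138
      use b, clear
      sample 240, count

      * step (3): stack a under b; from_a = 0 for rows from b, 1 for rows from a
      append using a, generate(from_a)

      * step (4): compare the two groups (score and item are hypothetical names)
      * continuous variable, not assuming equal variances
      ttest score, by(from_a) unequal
      * categorical variable
      tabulate item from_a, chi2

      Because a and b no longer overlap, the two groups are independent, so the usual two-sample tests apply even though the group sizes differ.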

      Caveat: I'm not entirely confident of this approach; it's what I'd do, however.
      Nathan E. Fosse, PhD
