
  • creating two different count variables

    I am working on a rare disease dataset and preparing the data for Poisson regression. There are 4 predictor variables (age_cat, sex, inc_quint and rural) and the outcome is breast cancer (breast_ca). This is just a mock dataset.


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(age_cat sex inc_quint rural breast_ca)
    1 2 1 1 1
    2 1 2 2 1
    3 1 3 1 1
    1 2 4 2 1
    1 1 5 2 0
    2 2 4 1 0
    2 2 3 1 0
    4 2 3 1 0
    5 1 2 1 0
    2 1 1 2 0
    1 2 1 2 0
    1 2 2 1 0
    1 2 2 1 0
    1 2 3 1 1
    2 1 4 1 1
    2 1 5 2 1
    2 1 5 2 1
    2 2 5 1 1
    2 2 4 1 0
    4 2 3 1 0
    3 1 4 1 0
    5 1 5 2 0
    2 1 5 2 1
    7 1 4 2 1
    2 2 3 2 1
    3 2 2 2 1
    2 1 1 2 1
    1 1 3 1 1
    7 1 4 1 1
    8 1 4 1 1
    6 2 5 1 0
    5 2 5 1 0
    4 1 4 1 0
    6 1 3 1 1
    3 1 3 2 1
    2 1 4 2 1
    4 2 5 2 1
    5 2 5 2 0
    6 2 4 2 0
    7 1 3 1 0
    2 1 3 1 1
    3 1 2 1 1
    4 2 3 1 0
    5 2 4 1 0
    6 2 4 1 1
    7 1 5 1 1
    8 1 5 1 1
    9 2 5 1 1
    2 2 5 1 1
    3 1 4 1 1
    6 2 4 1 0
    8 2 3 2 0
    1 1 3 2 1
    2 1 2 2 1
    3 2 2 2 1
    4 1 2 2 1
    5 2 2 1 1
    3 1 4 1 1
    2 2 5 1 1
    1 2 5 1 1
    7 2 4 1 0
    7 1 4 1 0
    8 2 5 1 1
    1 2 1 1 0
    2 1 2 2 0
    3 1 3 1 0
    1 2 4 2 1
    1 1 5 2 0
    2 2 4 1 0
    2 2 3 1 0
    4 2 3 1 0
    5 1 2 1 0
    2 1 1 2 0
    1 2 1 2 0
    1 2 2 1 0
    1 2 2 1 0
    1 2 3 1 1
    2 1 4 1 1
    2 1 5 2 1
    2 1 5 2 1
    2 2 5 1 1
    2 2 4 1 0
    4 2 3 1 0
    3 1 4 1 0
    5 1 5 2 0
    2 1 5 2 0
    7 1 4 2 0
    2 2 3 2 0
    3 2 2 2 0
    2 1 1 2 0
    1 1 3 1 0
    7 1 4 1 0
    8 1 4 1 0
    6 2 5 1 0
    5 2 5 1 0
    4 1 4 1 0
    6 1 3 1 0
    3 1 3 2 0
    2 1 4 2 0
    4 2 5 2 0
    end
    I contracted the dataset using the command

    contract age_cat-breast_ca

    However, this gives a single frequency variable (_freq), with one row for each combination of the predictors and the outcome.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(age_cat sex inc_quint rural breast_ca _freq)
    1 1 3 1 0  4
    1 1 3 1 1  1
    1 1 3 2 0  8
    1 1 3 2 1  1
    1 1 5 2 0  2
    1 2 1 1 0  1
    1 2 1 1 1  1
    1 2 1 2 0  2
    1 2 2 1 0  4
    1 2 3 1 1  2
    1 2 4 2 1  2
    1 2 5 1 0  8
    1 2 5 1 1  1
    2 1 1 2 0  6
    2 1 1 2 1  1
    2 1 2 2 0  9
    2 1 2 2 1  2
    2 1 3 1 0  2
    2 1 3 1 1  3
    2 1 4 1 1  2
    2 1 4 2 0  4
    2 1 4 2 1  1
    2 1 5 2 0  4
    2 1 5 2 1  5
    2 2 3 1 0  2
    2 2 3 2 0  4
    2 2 3 2 1  1
    2 2 4 1 0  4
    2 2 5 1 0 16
    2 2 5 1 1  4
    3 1 2 1 0  4
    3 1 2 1 1  1
    3 1 3 1 0  1
    3 1 3 1 1  1
    3 1 3 2 0  4
    3 1 3 2 1  1
    3 1 4 1 0 19
    3 1 4 1 1  2
    3 2 2 2 0 12
    3 2 2 2 1  2
    4 1 2 2 0  8
    4 1 2 2 1  1
    4 1 4 1 0  5
    4 2 3 1 0 10
    4 2 5 2 0  4
    4 2 5 2 1  1
    5 1 2 1 0  2
    5 1 5 2 0  3
    5 2 2 1 0  8
    5 2 2 1 1  1
    5 2 3 1 0  2
    5 2 4 1 0  5
    5 2 5 1 0  5
    5 2 5 2 0  5
    6 1 3 1 0  4
    6 1 3 1 1  1
    6 2 4 1 0 13
    6 2 4 1 1  1
    6 2 4 2 0  5
    6 2 5 1 0  5
    7 1 3 1 0  5
    7 1 4 1 0 13
    7 1 4 1 1  1
    7 1 4 2 0  4
    7 1 4 2 1  1
    7 1 5 1 0  8
    7 1 5 1 1  1
    7 2 4 1 0  9
    8 1 4 1 0  4
    8 1 4 1 1  1
    8 1 5 1 0  8
    8 1 5 1 1  1
    8 2 3 2 0  9
    8 2 5 1 0  8
    8 2 5 1 1  1
    9 2 5 1 0  8
    9 2 5 1 1  1
    end
    I was wondering if it is possible to get two separate frequency variables: one for the number of breast cancer cases (those coded 1) for a specific combination of predictors, and another for the total number of those coded 0 for breast cancer.

    For example, I want a dataset like the one below:
    age_cat sex inc_quint rural breast_ca(1) breast_ca(0)
    1 1 3 1 1 4
    which means that, for that combination of predictor variables, there was 1 person with breast cancer and 4 without. I also want zero frequencies to be shown for breast_ca(1), but not for breast_ca(0).

    Thank you.
    Yuba

  • #2
    Start with the output of -contract- (i.e., what you show in the second code block) and then just run -reshape wide _freq, i(age_cat-rural) j(breast_ca)-.
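
    As a rough sketch, the two steps above could be run together as follows, starting from the one-row-per-person data in the first code block. The names cases and noncases are illustrative only. The -replace- line is an assumption about how zero counts should be handled: -reshape wide- leaves a missing value, not a zero, where a combination of predictors has no observations for one outcome level.

    ```stata
    * Assumes the original one-row-per-person data are in memory
    contract age_cat-breast_ca

    * One row per predictor combination; creates _freq0 and _freq1
    reshape wide _freq, i(age_cat sex inc_quint rural) j(breast_ca)

    * Combinations with no cases come out as missing; show them as 0 instead
    replace _freq1 = 0 if missing(_freq1)

    * Hypothetical, more readable names for the two count variables
    rename (_freq1 _freq0) (cases noncases)

    list in 1/5, noobs
    ```

    Here _freq0 (noncases) is deliberately left missing where no one is coded 0, since zeros were requested only for the case counts.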



    • #3
      Many thanks, Clyde. It works perfectly.
