  • Calculating a % variable for groups within a variable

    Hello,

    I have data split by wave (4 years available), age group, sex and life satisfaction. I would like to calculate the percentage of females within each age group for each year so that I can use it as a variable in a regression.

    So far I have written code that calculates the percentage for one particular age group and wave (code below). I am struggling to extend this to all age groups and waves, so that a single variable holds the separate percentage for each age group and wave.
    Are there any pointers on the next steps I could take?

    count if (Sex==1)&(Agegroup==1)&(Wave==1) // counts number of females for first age group and wave
    local numerator = r(N)
    count if (Sex>=0)&(Agegroup==1)&(Wave==1) // counts number of respondents for first age group and wave
    local denominator = r(N)
    gen percent = 100*`numerator'/`denominator' // generates a variable which is applied to ALL observations, including the ones that should have a different %

    Thanks







  • #2
    Perhaps this example code will start you in a useful direction.
    Code:
    sort Agegroup Wave
    by Agegroup Wave: egen n = count(Sex) if Sex==1
    by Agegroup Wave: egen d = count(Sex) if Sex>=0
    generate percent = 100 * n / d



    • #3
      Amazing, thank you. Last night I used this and figured out a slightly roundabout way of applying the calculated % to the whole column:
      Code:
      sort Agegroup Wave
      by Agegroup Wave: egen n = count(Sex) if Sex==1
      by Agegroup Wave: egen d = count(Sex)
      generate percent = 100 * n / d
      sort Country Agegroup Wave // sort order must match the by-list on the next line
      by Country Agegroup Wave: egen percent2 = max(percent)
      Last edited by Yumi Nito; 03 Dec 2020, 02:41.
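
      The max() step is needed because the "if Sex==1" qualifier fills n only on rows where Sex==1, so percent is missing on every other row. A minimal sketch of one way to avoid the extra step, assuming Sex is coded 1 for females and that observations with missing Sex should be left out of the denominator: egen's total() over a logical expression writes the group total to every row of the group. The variable names n_female, n_valid and pct_female are hypothetical, chosen so they do not clash with n, d and percent above.
      Code:
      * number of females in each Agegroup-Wave cell, stored on every row of the cell
      bysort Agegroup Wave: egen n_female = total(Sex == 1)
      * number of observations with non-missing Sex in the same cell
      bysort Agegroup Wave: egen n_valid = total(!missing(Sex))
      * percentage of females, identical for all rows in a cell
      generate pct_female = 100 * n_female / n_valid
      If the cells should also be split by Country, as in the last line of the code above, Country would need to be added to each by-list.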



      • #4
        Code:
        bysort Agegroup Wave: egen wanted = mean(100 * (Sex == 1))
        If you're worried about values of Sex other than 0 or 1, use this instead:

        Code:
        bysort Agegroup Wave: egen wanted = mean(100 * cond(inlist(Sex, 0, 1), Sex == 1, .))
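
        Both versions work because the mean of a 0/1 indicator is a proportion, and multiplying by 100 turns it into a percentage. The difference between them can be checked on a small made-up dataset. This is only a sketch: the eight observations below are invented for illustration, and naive is a hypothetical name for the result of the first version.
        Code:
        clear
        * toy data: cell (1,1) has 2 females out of 4; cell (2,1) has 1 female, 2 males and 1 missing Sex
        input Agegroup Wave Sex
        1 1 1
        1 1 0
        1 1 1
        1 1 0
        2 1 1
        2 1 0
        2 1 0
        2 1 .
        end
        * second version: observations with Sex not in {0, 1} are dropped from the mean
        bysort Agegroup Wave: egen wanted = mean(100 * cond(inlist(Sex, 0, 1), Sex == 1, .))
        * first version: (Sex == 1) is 0 when Sex is missing, so such rows stay in the denominator
        bysort Agegroup Wave: egen naive = mean(100 * (Sex == 1))
        list, sepby(Agegroup Wave)
        In the first cell both variables are 50. In the second cell wanted is 33.33 (1 of 3 valid observations) while naive is 25 (1 of 4 observations), which is why the second version is the safer choice when Sex has missing or unexpected values.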
