  • How to aggregate variables into a single variable

    Hi all,

    I am looking at sickness absence across occupations in LFS data. The dataset contains the variables ILLDAYS1-ILLDAYS7, which record the days of the reference week on which an individual was ill: ILLDAYS1 is the first day of the week the individual was ill, and ILLDAYS7 is the seventh. Ideally, I would like to aggregate ILLDAYS1-ILLDAYS7 into a single variable that tells me how many days of the reference week the individual was ill. Is there a way I can do this?

    Thanks

  • #2
    Welcome to Statalist.

    Is there a way I can do this?
    Yes.

    But exactly how it is done depends on how your data are coded, and you describe it incompletely.

    If we assume that the ILLDAYS1-ILLDAYS7 variables take the value 1 on those days where the individual was ill and 0 on the days when the individual was not ill, then the following example might start you in a useful direction.
    Code:
    . // input made-up data
    . * Example generated by -dataex-. To install: ssc install dataex
    . clear
    
    . input float(ILLDAYS1 ILLDAYS2 ILLDAYS3 ILLDAYS4 ILLDAYS5 ILLDAYS6 ILLDAYS7)
    
          ILLDAYS1   ILLDAYS2   ILLDAYS3   ILLDAYS4   ILLDAYS5   ILLDAYS6   ILLDAYS7
      1. 0 0 1 1 1 0 0
      2. 1 0 0 0 0 0 0
      3. 0 1 1 0 1 0 0
      4. 0 0 0 0 0 0 1
      5. 0 0 0 0 0 0 0
      6. end
    
    . 
    . egen DAYSILL = rowtotal(ILLDAYS*)
    
    . list, clean
    
           ILLDAYS1   ILLDAYS2   ILLDAYS3   ILLDAYS4   ILLDAYS5   ILLDAYS6   ILLDAYS7   DAYSILL  
      1.          0          0          1          1          1          0          0         3  
      2.          1          0          0          0          0          0          0         1  
      3.          0          1          1          0          1          0          0         3  
      4.          0          0          0          0          0          0          1         1  
      5.          0          0          0          0          0          0          0         0  
    
    .
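
    One detail worth checking before relying on the total: by default, egen's rowtotal() treats missing values as 0, so an individual with no information at all gets a total of 0 rather than missing. The sketch below uses made-up data (the all-missing third row is an assumption for illustration, not something taken from the LFS) to contrast the default with the missing option.

    ```stata
    * Sketch only: made-up data, with one row that is missing on every day
    clear
    input float(ILLDAYS1 ILLDAYS2 ILLDAYS3 ILLDAYS4 ILLDAYS5 ILLDAYS6 ILLDAYS7)
    0 0 1 1 1 0 0
    1 0 0 0 0 0 0
    . . . . . . .
    end

    * Default behaviour: missing values count as 0, so the third row totals 0
    egen DAYSILL = rowtotal(ILLDAYS1-ILLDAYS7)

    * With the missing option, a row that is missing on all inputs totals missing
    egen DAYSILL_m = rowtotal(ILLDAYS1-ILLDAYS7), missing

    list DAYSILL DAYSILL_m, clean
    ```

    Which behaviour is appropriate depends on whether an all-missing row means "not ill" or "not asked" in the data at hand.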


    • #3
      Originally posted by William Lisowski

      Hi William,

      Sorry for describing my data incompletely. The values in ILLDAYS1-ILLDAYS7 are stored as the actual days of the week on which the individual was ill. I ran the code you suggested and it worked to some extent: each day is assigned a number, and DAYSILL is the sum of those numbers. Thanks a lot for your help.

      As a side note, I'm not sure if I need to start a new thread for this, but I would like the values to be 0 if the person was not ill that day and 1 if they were. In this case, would I use the egen command?


      • #4
        I won't attempt an answer at this point. It is clear that I did not understand your original question, and I do not understand it after your clarification, and you have shown neither the code you wound up using nor a sample of the data you have.

        Please take a few moments to review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question. It's particularly helpful to copy commands and output from your Stata Results window and paste them into your Statalist post using code delimiters [CODE] and [/CODE], and to use the dataex command to provide sample data, as described in section 12 of the FAQ.

        The more you help others understand your problem, the more likely others are to be able to help you solve your problem.

        Even the best descriptions of data are no substitute for an actual example of the data.

        Be sure to use the dataex command to do this. If you are running version 15.1 or later, or a fully updated version 14.2, dataex is already part of your official Stata installation. If not, run ssc install dataex to get it. Either way, run help dataex and read the simple instructions for using it. dataex will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

        When asking for help with code, always show example data. When showing example data, always use dataex.
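
        As a rough illustration of the advice above, the commands below are one way to inspect how the variables are stored and to produce a postable extract. This is a sketch that assumes the LFS extract is already loaded in memory and has at least 10 observations; the choice of the first 10 observations is arbitrary.

        ```stata
        * Assumes the dataset containing ILLDAYS1-ILLDAYS7 is already in memory

        * Show storage types and any value labels attached to the variables
        describe ILLDAYS1-ILLDAYS7

        * Show the underlying numeric codes rather than the value labels
        tabulate ILLDAYS1, nolabel missing

        * Produce a copy-and-paste data example for the first 10 observations
        dataex ILLDAYS1-ILLDAYS7 in 1/10
        ```

        The dataex output can then be pasted between [CODE] and [/CODE] delimiters in a post.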


        • #5
          Sorry if anything isn't clear. I entered the code shown below, and below it is the output I got. You understood the first question perfectly well. As you can see, ILLDAYS contains the actual days of the week, and DAYSILL has aggregated the data by assigning values to the days. What I would like is simply a 0 if ILLDAYS = Does not apply and a 1 if ILLDAYS = a day of the week.
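
          Since no data example appears in the thread, the following is only a sketch of how the 0/1 recoding described above might look. It assumes that ILLDAYS1-ILLDAYS7 are numeric variables with value labels, that codes 1 to 7 stand for the days of the week, and that "Does not apply" is stored as -9. Those codes, the label name illday, and the new variable names ILL1-ILL7, DAYSILL01 and DAYSILL_ac are all hypothetical; the real coding should be checked first, for example with tabulate ILLDAYS1, nolabel.

          ```stata
          * Sketch only: made-up data under an ASSUMED coding
          * (1-7 = day of the week, -9 = "Does not apply")
          clear
          input byte(ILLDAYS1 ILLDAYS2 ILLDAYS3 ILLDAYS4 ILLDAYS5 ILLDAYS6 ILLDAYS7)
           3  4  5 -9 -9 -9 -9
           1 -9 -9 -9 -9 -9 -9
          -9 -9 -9 -9 -9 -9 -9
          end
          label define illday -9 "Does not apply" 1 "Monday" 2 "Tuesday" 3 "Wednesday" ///
              4 "Thursday" 5 "Friday" 6 "Saturday" 7 "Sunday"
          label values ILLDAYS1-ILLDAYS7 illday

          * Hypothetical indicators: 1 if the slot holds a day of the week, 0 otherwise
          forvalues i = 1/7 {
              generate byte ILL`i' = inrange(ILLDAYS`i', 1, 7)
          }

          * Number of days ill is then the row sum of the indicators
          egen DAYSILL01 = rowtotal(ILL1-ILL7)

          * The same count in one step, without creating the indicators
          egen DAYSILL_ac = anycount(ILLDAYS1-ILLDAYS7), values(1/7)

          list ILL1-ILL7 DAYSILL01 DAYSILL_ac, clean
          ```

          Note that inrange() returns 0 for missing values, so under this sketch a missing slot is treated the same as "Does not apply"; if that distinction matters, the indicators would need an explicit condition for missing values.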
          Attached Files
