
  • Multiplying two different sets of variables

    I have multiple sets of variables, e.g. "alone" and "sleeping", each split into 15-minute time blocks (alone is b1dd0101-b1dd0196 and sleep is b1da0201-b1da0296). I'm trying to work out the total time spent sleeping whilst alone. In the attached photo I have a way of doing it, but it's incredibly time-consuming and inefficient. Any pointers?


  • #2
    I am going to assume that the 15-minute blocks are numbered 1 through 96, and so the variables are in a series of the form b1dd01XX and b1da02XX. I am also going to assume that the b1dd01XX series is a set of binary variables indicating whether alone or not. Then you can do something like this:
    Code:
    gen wanted = 0
    forval i = 1/96 {
        local j `:dis %02.0f `i''
        replace wanted = wanted + b1dd01`j'*b1da02`j'
    }


    • #3
      If there are missing values in the b1dd01 and b1da02 variables, then the code in #2 will fail, leaving only a missing value for wanted: adding a missing value to wanted makes it missing, and it stays missing for the rest of the loop.
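
      For reference, a minimal sketch of how the loop in #2 could be guarded against that, assuming the variables really are named b1dd0101-b1dd0196 and b1da0201-b1da0296. The if qualifier skips any block where either variable is missing, so wanted keeps its running total. Note that an observation with every block missing ends up with 0 rather than missing.

      Code:
      gen wanted = 0
      forval i = 1/96 {
          // build the two-digit suffix 01, 02, ..., 96
          local j : display %02.0f `i'
          // add the product only when both variables are nonmissing in this block
          replace wanted = wanted + b1dd01`j'*b1da02`j' if !missing(b1dd01`j', b1da02`j')
      }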

      Here's a different approach that will just ignore missing values. It relies on -reshape-ing the data into long layout. It is likely that other things you will do with this data will prove easier (or only possible) in long layout, because that is the usual case in Stata.

      In addition to the assumptions about your data in #2, I also assume that you have a variable that is a person identifier, and I'll call it id.

      Code:
      reshape long b1dd01 b1da02, i(id) j(interval)
      by id, sort: egen wanted = total(cond(b1dd01 == 1, b1da02, .))
      As an aside, I advise you to rename your variables to something with mnemonic value. Even if you are not having any difficulty today remembering that b1dd01 is alone and b1da02 is sleep, if you have to return to this data after being away from it for several weeks, you will probably find your code unintelligible without a serious refresher course in the data names. In large scale surveys, there are good reasons why variables are given gobbledygook names, but for the end user, who generally uses only a subset of the variables, your work will be smoother and quicker if you change the names to something that describes what they are.
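
      A rough sketch of what that renaming might look like once the data are in long layout. The new names alone and sleep and the label text are only illustrative choices, not anything defined in the data, and any later commands would then have to refer to the new names.

      Code:
      // after the reshape there is one variable per stub, so rename both at once
      rename (b1dd01 b1da02) (alone sleep)
      // labels keep the link to the original survey names
      label variable alone "Alone in this 15-minute block (was b1dd01)"
      label variable sleep "Sleeping in this 15-minute block (was b1da02)"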


      • #4
        A small edit to the reshape command in #3: you'd want to add the string option, like so
        Code:
        reshape long b1dd01 b1da02, i(id) j(interval) string
        Also note that the one edge case where #3 fails is when all the b1dd01 and/or b1da02 variables for an id are missing. The wanted variable then would produce a value of 0 (thereby being indistinguishable from situations where zero time was actually spent sleeping alone), rather than missing. Fortunately, the fix is easy, involving another option:
        Code:
        by id, sort: egen wanted = total(cond(b1dd01 == 1, b1da02, .)), missing
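
        To see the two commands working together, here is a small self-contained sketch on made-up data: two people and only two blocks each, with every value invented purely for illustration. Person 1 is alone and asleep in block 01 and not alone in block 02; person 2 is missing on everything.

        Code:
        clear
        input id b1dd0101 b1da0201 b1dd0102 b1da0202
        1 1 1 0 1
        2 . . . .
        end
        // string keeps the suffixes "01" and "02" as they appear in the names
        reshape long b1dd01 b1da02, i(id) j(interval) string
        // missing returns missing, not 0, when nothing nonmissing is summed
        by id, sort: egen wanted = total(cond(b1dd01 == 1, b1da02, .)), missing
        list id interval b1dd01 b1da02 wanted, sepby(id)

        With these toy values, wanted should be 1 for id 1 and missing for id 2; dropping the missing option would turn the latter into 0.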


        • #5
          The loop works, but when I use another variable, e.g. b1dd03`j', I run into some issues. It works for the time frames before 9am (1/36) and 9am to 5pm (37/68), but once I get to after 5pm (69/96) it says "b1dd03 is ambiguous". What's the cause of this?


          • #6
            Kevin Bainbridge It's hard to troubleshoot this without more information. Your image did not come through. It would also help to see exactly the code you are working with.

            The error seems to suggest that for some reason the local `j' was empty and Stata is reading b1dd03`j' as just b1dd03. But I'm not sure how that can happen without seeing exactly the code you are trying to execute.
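
            In the meantime, one way to check that guess is a diagnostic loop like the sketch below. It assumes the same naming pattern as in #2 and changes no data: it only prints the suffix that gets built and stops at the first block where the full variable name does not exist.

            Code:
            forval i = 69/96 {
                local j : display %02.0f `i'
                // empty brackets here would mean the local j was not filled in
                display "block `i': suffix [`j']"
                // exact switches off abbreviation, so a bare b1dd03 cannot slip through
                confirm variable b1dd03`j' b1da02`j', exact
            }

            If the brackets come out empty, or confirm stops with an error, that would point to where the variable name is going wrong.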
