  • Difficulty understanding the difference when using bysort with parentheses

    Hi all,

    I have looked through several forums and found similar answers, but none that covers my particular application.


    I have sorted my data by a group variable, group, and want to keep a group if an indicator variable, indicatorvar, equals 1 at least once within that group. However, the following two commands leave me with datasets of different sizes, all other code being equal:

    bysort group: keep if indicatorvar[_N]

    bysort group (indicatorvar): keep if indicatorvar[_N]

    Browsing both results, each command appears to keep the entire group whenever indicatorvar is 1 in at least one observation of that group. However, the first result has far fewer observations than the second.

    I cannot understand why this happens. Any help would be appreciated.


    A similar thread is here: -by- syntax of adding another variable in brackets - Statalist

  • #2
    Code:
    bysort group: keep if indicatorvar[_N]
    keeps the groups whose last observation, in the current sort order within group, has indicatorvar equal to 1.
    Code:
    bysort group (indicatorvar): keep if indicatorvar[_N]
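    keeps the groups whose largest value of indicatorvar is 1, because the observations are first sorted by indicatorvar within each group, so the largest value comes last.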

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input byte(group indicatorvar)
    1 0
    1 0
    1 1
    2 0
    2 1
    2 0
    end
    
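    * a: last value of indicatorvar in each group, in the current sort order
    bysort group: g a = indicatorvar[_N]
    * b: largest value of indicatorvar in each group (sorted within group first)
    bysort group (indicatorvar): g b = indicatorvar[_N]
    
    list, sep(6)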
    
         +--------------------------+
         | group   indica~r   a   b |
         |--------------------------|
      1. |     1          0   1   1 |
      2. |     1          0   1   1 |
      3. |     1          1   1   1 |
      4. |     2          0   0   1 |
      5. |     2          0   0   1 |
      6. |     2          1   0   1 |
         +--------------------------+
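
    If the aim is to keep every group in which indicatorvar equals 1 at least once, a variant that does not rely on the sort order within group may be easier to reason about. The following is only a sketch, assuming indicatorvar is a 0/1 indicator that may contain missing values; anyflag is a hypothetical variable name that does not appear in the original post.

    ```stata
    * Sketch: flag groups where indicatorvar == 1 in at least one observation.
    * anyflag is a hypothetical helper variable (an assumption, not from the post).
    * (indicatorvar == 1) evaluates to 1 or 0, and to 0 when indicatorvar is missing.
    bysort group: egen byte anyflag = max(indicatorvar == 1)

    * Keep all observations of the flagged groups, then drop the helper.
    keep if anyflag
    drop anyflag
    ```

    Because missing values sort last and count as true in an if condition, indicatorvar[_N] after bysort group (indicatorvar) would also keep groups that contain a missing value; the explicit comparison indicatorvar == 1 avoids that.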
    Last edited by Øyvind Snilsberg; 22 Feb 2022, 00:02.

    • #3
      Thank you so much, extremely effective answer!
