Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Using a loop to compare observations of a variable

    Hi All,
    I have a dataset which looks something like below:
    ID Amount month max_amount max_month N
    1 10 1 20 . 2
    1 20 2 20 2 2
    2 10 1 10 1 3
    2 10 2 10 2 3
    2 5 3 10 . 3
    3 20 2 20 2 1
    4 30 4 30 4 1
    Note that I have calculated the max_amount, max_month columns and N using code as below:

    egen max_amount =max(Amount), by(ID)
    bysort ID: gen max_month = month if Amount>= max_amount
    bysort ID: gen N=_N

    My problem is with the variable max_month. For my analysis I need one unique value for each ID. Notice that for the ID 2, there are 3 observations. Two of these observations have same value for "amount", i.e. 10. So my maximum amount is 10 but the months corresponding to these 2 "maximums" are different. Now for practical purposes, I am willing to consider only the earliest month among the duplicate "maximums", i.e. I want to retain only the earliest month with a maximum amount for each ID . I tried several ways to do this. Following is a failed attempt at a loop :

    foreach num i=1/N {
    replace max_month[`i'] = max_month[`i'-1] if max_month[`i']>max_month[`i'-1]
    }

    I am also wondering if I am not searching for the correct "keywords" on Google to look for an answer to this. Any help would be appreciated.

    Thank you!

  • #2
    Here's sample code that shows a slightly different way of approaching the problem. The key is that if you sort your data by id, amount, and the negative of the value of month, for each id the earliest month having the largest value for the id will be the last observation for that id.
    Code:
    cls
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input byte(id amount month)
    1 10 1
    1 20 2
    2 10 1
    2 10 2
    2  5 3
    3 20 2
    4 30 4
    end
    generate negmon = -month
    bysort id (amount negmon) : generate max_amount = amount[_N]
    bysort id (amount negmon) : generate max_month = month[_N]
    drop negmon
    sort id month
    bysort id : egen n = total(amount==max_amount)
    list, sepby(id) noobs abbreviate(12)
    Code:
    . list, sepby(id) noobs abbreviate(12)
    
      +--------------------------------------------------+
      | id   amount   month   max_amount   max_month   n |
      |--------------------------------------------------|
      |  1       10       1           20           2   1 |
      |  1       20       2           20           2   1 |
      |--------------------------------------------------|
      |  2       10       1           10           1   2 |
      |  2       10       2           10           1   2 |
      |  2        5       3           10           1   2 |
      |--------------------------------------------------|
      |  3       20       2           20           2   1 |
      |--------------------------------------------------|
      |  4       30       4           30           4   1 |
      +--------------------------------------------------+

    Comment


    • #3
      Thank you William! I was just not thinking in the right direction.

      Comment

      Working...
      X