Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Problem with converting excel formula to STATA code

    Dear STATALIST Forum,

    I am having a trouble in converting an excel formula to STATA code. I know the logic behind the formula, but can not unfortunately think of how to write the code in STATA. I am attaching the excel file with the formula in the last column (the year should always be in descending order). Could you, please kindly help me in writing a STATA code based on excel formula? The reason is simple, I have millions of observations, that excel simply can't work with.

    Kind regards,
    Firangiz Aghayeva
    Attached Files

  • #2
    If anyone is pathologically curious like me, I've opened the file and got a screenshot:

    Click image for larger version

Name:	2023-05-10_13-15-37.png
Views:	1
Size:	45.2 KB
ID:	1713176


    If I understand correctly, OP wants to count how many repeated IDs (Col A) exist in the following years that also has a "Yes" in Col B.

    I do not understand the actual logic, but something like this may be a good start:

    Code:
    clear
    input id str5 invest year
    1  No 1990
    1 Yes 1991
    1 Yes 1992
    2  No 1990
    2  No 1991
    2 Yes 1992
    3 Yes 1990
    3  No 1991
    3  No 1992
    4  No 1990
    4  No 1991
    4  No 1992
    5 Yes 1990
    5 Yes 1991
    5 Yes 1992
    end
    
    gen isYes = invest=="Yes"
    gsort id invest -year
    
    bysort id: gen countYes = sum(isYes)
    
    gsort id year
    
    list, sepby(id)
    Results:

    Code:
         +---------------------------------------+
         | id   invest   year   isYes   countYes |
         |---------------------------------------|
      1. |  1       No   1990       0          0 |
      2. |  1      Yes   1991       1          2 |
      3. |  1      Yes   1992       1          1 |
         |---------------------------------------|
      4. |  2       No   1990       0          0 |
      5. |  2       No   1991       0          0 |
      6. |  2      Yes   1992       1          1 |
         |---------------------------------------|
      7. |  3      Yes   1990       1          1 |
      8. |  3       No   1991       0          0 |
      9. |  3       No   1992       0          0 |
         |---------------------------------------|
     10. |  4       No   1990       0          0 |
     11. |  4       No   1991       0          0 |
     12. |  4       No   1992       0          0 |
         |---------------------------------------|
     13. |  5      Yes   1990       1          3 |
     14. |  5      Yes   1991       1          2 |
     15. |  5      Yes   1992       1          1 |
         +---------------------------------------+
    In future, please refrain from using file attachment. Downloading files can have security risk and many users here don't do that. Try to explain your problem with Stata sample data.

    Comment


    • #3
      Ken's points are standard advice. Please see https://www.statalist.org/forums/help#stata

      Comment


      • #4
        Many thanks for your recommendations!

        Comment

        Working...
        X