Problem with foreach command using a local

Marc Peters

Join Date: Jan 2023

Posts: 8
#1

02 Jan 2023, 07:19

Hello!

I have panel data with id as panel variable and date as time variable. I want to create identifiers of groups of respondents based on the first value that each respondent has on the variable confidence. I use Stata 17.

Below is an excerpt with the relevant variables.

clear
input long id float(date confidence)
1 723 7
1 724 .
1 725 .
1 726 .
1 727 .
1 728 .
1 729 .
1 730 .
1 731 .
1 732 .
1 733 .
1 734 .
1 735 .
1 736 .
1 737 .
1 738 .
1 739 .
1 740 .
1 741 .
1 742 .
1 743 .
1 744 .
1 745 .
1 746 .
1 747 .
1 748 .
1 749 .
1 750 .
1 751 .
1 752 .
5 723 7
5 724 7
5 725 6
5 726 6
5 727 7
5 728 7
5 729 8
5 730 8
5 731 8
5 732 8
5 733 8
5 734 8
5 735 8
5 736 8
5 737 8
5 738 7
5 739 8
5 740 8
5 741 7
5 742 7
5 743 7
5 744 7
5 745 7
5 746 8
5 747 8
5 748 8
5 749 8
5 750 8
5 751 8
5 752 8
11 741 5
11 742 .
11 743 .
11 744 .
11 745 .
11 746 .
11 747 .
11 748 .
11 749 .
11 750 .
11 751 .
11 752 .
end
format %tmMon-YY date

So far, I have the following code:

by id (date): gen byte start1 = confidence[_n==1]==1
levelsof id if start1 = 1, local(unqID1)
gen byte groupidentifier1 = 0
foreach 1 of local unqID1 {
replace groupidentifier1 = 1
}

Unfortunately, the foreach command results in all values of the variable groupidentifier1 becoming 1, although I want this to be only the case for the subset of respondents in the local unqID1.

For further background, what I ultimately want to do is create time-series on confidence for each group of individuals that start out with a specific value on confidence. I suppose that would be bysort date: egen journey1 = mean(confidence) if groupidentifier1 == 1 and so forth.

Can somebody help me? Maybe also with a more efficient way?

Last edited by Marc Peters; 02 Jan 2023, 07:25.
Nick Cox

Join Date: Mar 2014

Posts: 35664
#2

02 Jan 2023, 07:32

You don’t need a loop for this problem. Search for the FAQ on first and last occurrences in panel data; I am answering by phone and can’t comfortably copy and paste links.

There are several reasons why your code didn’t work. The main one is that the line inside the loop is exactly the same each time around the loop: it never refers to the loop macro and has no if qualifier, so it changes every observation. Put the other way round, nothing in the loop machinery automatically selects a different element of the local macro for you.
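To illustrate the point about the loop macro, here is a minimal sketch of a loop that does restrict each replace to one respondent. The names start7, ids7, group7 and i are made up for the illustration, the data are a cut-down version of the example, and the target value 7 is an assumption chosen because it occurs in the example data. The loop is shown only to explain the mechanics; it is not needed to solve the problem.

```stata
clear
input long id float(date confidence)
1 723 7
1 724 .
5 723 7
5 724 6
11 741 5
11 742 .
end

* 1 on every row of a respondent whose first observed value is 7
bysort id (date) : gen byte start7 = confidence[1] == 7

* collect the ids of those respondents in a local macro
levelsof id if start7 == 1, local(ids7)

gen byte group7 = 0
foreach i of local ids7 {
    * `i' holds a different id on each pass; the if qualifier
    * limits the replace to that respondent's rows
    replace group7 = 1 if id == `i'
}
```

After this, group7 is identical to start7, which is why the loop can be dropped entirely.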
Marc Peters

Join Date: Jan 2023

Posts: 8
#3

02 Jan 2023, 08:39

Thanks a lot, Nick! I checked the FAQ and think I know how to identify first occurrences in panel data, but I don't get how I can accordingly classify entire respondents. Basically, I'm thinking about an indicator variable that is 1 for all rows of the respondent if the first row of the variable confidence for that respondent has a specific value. Do you know what I mean? If this is covered by the FAQ on first/last occurrences in panel data and I'm not seeing it, apologies in advance!

Nick Cox

Join Date: Mar 2014

Posts: 35664
#4

02 Jan 2023, 08:50

Code:

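* date of the first non-missing value of confidence in each panel
bysort id : egen when = min(cond(confidence < ., date, .))

* copy the value of confidence at that date to every row of the panel
by id : egen wanted = total(confidence * (date == when))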

is one way to do it.
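As a small check of how this behaves when a panel starts with a missing value, here is a sketch on made-up data (the ids and values are invented for the illustration and are not from the original example). Respondent 2 has a missing first observation, so the first non-missing value, 3, should be picked up.

```stata
clear
input long id float(date confidence)
1 723 7
1 724 .
2 723 .
2 724 3
2 725 4
end

* date of the first non-missing value of confidence in each panel
bysort id : egen when = min(cond(confidence < ., date, .))

* value of confidence at that date, spread over the whole panel
by id : egen wanted = total(confidence * (date == when))

list, sepby(id)
```

One caveat, as far as I understand egen: total() treats missing values as zero, so a panel with no non-missing confidence at all would get wanted equal to 0 rather than missing. If such panels can occur, they may need separate handling.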


Marc Peters

Join Date: Jan 2023

Posts: 8
#5

02 Jan 2023, 09:36

I see, thank you so much!

Nick Cox

Join Date: Mar 2014

Posts: 35664
#6

03 Jan 2023, 03:37

Here's another way to do it. By the way, egen groupies will note that the egenmore package on SSC includes a function first(), written in 2000.

Code:

clear
input long id float(date confidence)
1 723 7
1 724 .
1 725 .
1 726 .
1 727 .
1 728 .
1 729 .
1 730 .
1 731 .
1 732 .
1 733 .
1 734 .
1 735 .
1 736 .
1 737 .
1 738 .
1 739 .
1 740 .
1 741 .
1 742 .
1 743 .
1 744 .
1 745 .
1 746 .
1 747 .
1 748 .
1 749 .
1 750 .
1 751 .
1 752 .
5 723 7
5 724 7
5 725 6
5 726 6
5 727 7
5 728 7
5 729 8
5 730 8
5 731 8
5 732 8
5 733 8
5 734 8
5 735 8
5 736 8
5 737 8
5 738 7
5 739 8
5 740 8
5 741 7
5 742 7
5 743 7
5 744 7
5 745 7
5 746 8
5 747 8
5 748 8
5 749 8
5 750 8
5 751 8
5 752 8
11 741 5
11 742 .
11 743 .
11 744 .
11 745 .
11 746 .
11 747 .
11 748 .
11 749 .
11 750 .
11 751 .
11 752 .
end
format %tmMon-YY date

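* start from the first observation of confidence in each panel
bysort id (date) : gen wanted = confidence[1]
* if everything earlier is missing, take the current value; otherwise carry the earlier value forward
by id : replace wanted = cond(missing(wanted[_n-1]), confidence, wanted[_n-1]) if _n > 1
* the last observation now holds the first non-missing value; spread it over the panel
by id : replace wanted = wanted[_N]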

list , sepby(id)

     +---------------------------------+
     | id     date   confid~e   wanted |
     |---------------------------------|
  1. |  1   Apr-20          7        7 |
  2. |  1   May-20          .        7 |
  3. |  1   Jun-20          .        7 |
  4. |  1   Jul-20          .        7 |
  5. |  1   Aug-20          .        7 |
  6. |  1   Sep-20          .        7 |
  7. |  1   Oct-20          .        7 |
  8. |  1   Nov-20          .        7 |
  9. |  1   Dec-20          .        7 |
 10. |  1   Jan-21          .        7 |
 11. |  1   Feb-21          .        7 |
 12. |  1   Mar-21          .        7 |
 13. |  1   Apr-21          .        7 |
 14. |  1   May-21          .        7 |
 15. |  1   Jun-21          .        7 |
 16. |  1   Jul-21          .        7 |
 17. |  1   Aug-21          .        7 |
 18. |  1   Sep-21          .        7 |
 19. |  1   Oct-21          .        7 |
 20. |  1   Nov-21          .        7 |
 21. |  1   Dec-21          .        7 |
 22. |  1   Jan-22          .        7 |
 23. |  1   Feb-22          .        7 |
 24. |  1   Mar-22          .        7 |
 25. |  1   Apr-22          .        7 |
 26. |  1   May-22          .        7 |
 27. |  1   Jun-22          .        7 |
 28. |  1   Jul-22          .        7 |
 29. |  1   Aug-22          .        7 |
 30. |  1   Sep-22          .        7 |
     |---------------------------------|
 31. |  5   Apr-20          7        7 |
 32. |  5   May-20          7        7 |
 33. |  5   Jun-20          6        7 |
 34. |  5   Jul-20          6        7 |
 35. |  5   Aug-20          7        7 |
 36. |  5   Sep-20          7        7 |
 37. |  5   Oct-20          8        7 |
 38. |  5   Nov-20          8        7 |
 39. |  5   Dec-20          8        7 |
 40. |  5   Jan-21          8        7 |
 41. |  5   Feb-21          8        7 |
 42. |  5   Mar-21          8        7 |
 43. |  5   Apr-21          8        7 |
 44. |  5   May-21          8        7 |
 45. |  5   Jun-21          8        7 |
 46. |  5   Jul-21          7        7 |
 47. |  5   Aug-21          8        7 |
 48. |  5   Sep-21          8        7 |
 49. |  5   Oct-21          7        7 |
 50. |  5   Nov-21          7        7 |
 51. |  5   Dec-21          7        7 |
 52. |  5   Jan-22          7        7 |
 53. |  5   Feb-22          7        7 |
 54. |  5   Mar-22          8        7 |
 55. |  5   Apr-22          8        7 |
 56. |  5   May-22          8        7 |
 57. |  5   Jun-22          8        7 |
 58. |  5   Jul-22          8        7 |
 59. |  5   Aug-22          8        7 |
 60. |  5   Sep-22          8        7 |
     |---------------------------------|
 61. | 11   Oct-21          5        5 |
 62. | 11   Nov-21          .        5 |
 63. | 11   Dec-21          .        5 |
 64. | 11   Jan-22          .        5 |
 65. | 11   Feb-22          .        5 |
 66. | 11   Mar-22          .        5 |
 67. | 11   Apr-22          .        5 |
 68. | 11   May-22          .        5 |
 69. | 11   Jun-22          .        5 |
 70. | 11   Jul-22          .        5 |
 71. | 11   Aug-22          .        5 |
 72. | 11   Sep-22          .        5 |
     +---------------------------------+

If the first value for each panel is never missing, then go straight to

Code:

bysort id (date) : gen isfirst = confidence[1]
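For the ultimate goal stated in #1, a time series of mean confidence for each group of respondents sharing the same starting value, one possible sketch is below. It assumes the first value is never missing, uses made-up data, and the variable names first and journey are invented for the illustration.

```stata
clear
input long id float(date confidence)
1 723 7
1 724 .
5 723 7
5 724 6
11 723 5
11 724 4
end

* starting value of confidence, repeated on every row of the respondent
bysort id (date) : gen first = confidence[1]

* mean confidence at each date among respondents with the same starting value
bysort first date : egen journey = mean(confidence)

list, sepby(first)
```

This gives one variable covering all groups at once, rather than a separate journey1, journey2, and so on for each starting value; mean() ignores missing values of confidence within each group and date.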

Last edited by Nick Cox; 03 Jan 2023, 04:03.
