  • How to create a Binary Variable from Longitudinal Survey of Respondents

    Hi everyone,
    I am working with panel data that consists of three waves: 3, 4, and 5. Some individuals (pid) who were interviewed in wave 3 could not be reached or reinterviewed in the subsequent waves 4 or 5. I want to generate a binary variable that takes the value 1 if an individual who was interviewed in wave 3 was also successfully interviewed in both waves 4 and 5, and the value 0 if the individual interviewed in wave 3 could not be interviewed in at least one of the subsequent waves. Is there code that would help me accomplish this? Thanks very much in anticipation.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long pid float wave
    401011 3
    401011 4
    401013 3
    401014 3
    401014 4
    401014 5
    401016 3
    401016 4
    401016 5
    401017 3
    401017 4
    401018 3
    401018 4
    401018 5
    401019 3
    401020 3
    401020 4
    401020 5
    401021 3
    401021 4
    401022 3
    401022 4
    401023 3
    401023 4
    401023 5
    401024 3
    401024 4
    401024 5
    401026 3
    401026 4
    401027 3
    401028 3
    401028 4
    401028 5
    401030 3
    401030 4
    401030 5
    401031 3
    401032 3
    401032 4
    401032 5
    401033 3
    401034 3
    401035 3
    401035 4
    401035 5
    401036 3
    401036 4
    401036 5
    401037 3
    401037 4
    401037 5
    401038 3
    401038 4
    401038 5
    401039 3
    401039 4
    401040 3
    401040 4
    401040 5
    401041 3
    401042 3
    401042 4
    401043 3
    401043 4
    401043 5
    401044 3
    401044 4
    401044 5
    401045 3
    401045 4
    401045 5
    401046 3
    401046 4
    401047 3
    401047 4
    401048 3
    401048 4
    401048 5
    401050 3
    401050 4
    401050 5
    401052 3
    401052 4
    401052 5
    401056 3
    401056 4
    401056 5
    401060 3
    401064 3
    401064 4
    401064 5
    401065 3
    401065 4
    401065 5
    401066 3
    401066 4
    401066 5
    401068 3
    401068 4
    401068 5
    401069 3
    401070 3
    401070 4
    401070 5
    401071 3
    401073 3
    401073 4
    401073 5
    401075 3
    401075 4
    401075 5
    401076 3
    401076 4
    401077 3
    401077 4
    401077 5
    401079 3
    401079 4
    401080 3
    401080 4
    401080 5
    401083 3
    401084 3
    401084 4
    401084 5
    401085 3
    401085 4
    401085 5
    401087 3
    401087 4
    401087 5
    401088 3
    401088 4
    401088 5
    401089 3
    401089 4
    401089 5
    401091 3
    401091 4
    401091 5
    401092 3
    401095 3
    401095 4
    401095 5
    401097 3
    401097 4
    401097 5
    401101 3
    401101 4
    401101 5
    401102 3
    401102 4
    401103 3
    401104 3
    401104 4
    401105 3
    401106 3
    401106 4
    401106 5
    401108 3
    401108 4
    401108 5
    401111 3
    401111 4
    401111 5
    401112 3
    401112 4
    401112 5
    401114 3
    401114 4
    401114 5
    401115 3
    401115 4
    401115 5
    401118 3
    401118 4
    401119 3
    401120 3
    401120 4
    401120 5
    401123 3
    401123 4
    401123 5
    401124 3
    401124 4
    401124 5
    401125 3
    401125 4
    401129 3
    401129 4
    401129 5
    401130 3
    401133 3
    401133 4
    401133 5
    401135 3
    401135 4
    401135 5
    401136 3
    end

  • #2
    Code:
    // first check: first observation within each person is wave 3, second wave 4, etc.
    bysort pid (wave) : gen wanted = ( wave == 2+_n )
    bysort pid (wanted)  : replace wanted = wanted[1]
    
    // second check: wave 3,4, and 5 are present
    bysort pid (wave) : replace wanted = ( _N == 3 ) if wanted == 1
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------
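
    To see what the two checks in #2 do, here is a minimal sketch on a made-up dataset. The three pids and their wave patterns are hypothetical and are not taken from the thread; the three commands are copied from #2 unchanged.

    ```stata
    * Hypothetical toy data, not from the thread:
    * pid 1 has waves 3, 4, 5; pid 2 has waves 3 and 5; pid 3 has wave 3 only
    clear
    input long pid float wave
    1 3
    1 4
    1 5
    2 3
    2 5
    3 3
    end

    * the three commands from #2, unchanged
    bysort pid (wave) : gen wanted = ( wave == 2+_n )
    bysort pid (wanted) : replace wanted = wanted[1]
    bysort pid (wave) : replace wanted = ( _N == 3 ) if wanted == 1

    list, sepby(pid)
    ```

    Only pid 1 ends up with wanted equal to 1. pid 2 fails the first check because its second observation is wave 5 rather than wave 4, and pid 3 passes the first check but fails the second because it has a single observation.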


    • #3
      Code:
      bysort pid: gen wanted = _N == 3
      is an indicator for complete panels, which is not literally what you asked, but may be what you want.
      Last edited by Nick Cox; 22 Dec 2022, 01:27.
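
      The one-liner in #3 counts observations per pid, so it identifies complete panels only if each pid appears at most once per wave and no waves other than 3, 4, and 5 are in the data. A minimal sketch of how those two assumptions might be checked first, using hypothetical toy data:

      ```stata
      * Hypothetical toy data, not from the thread
      clear
      input long pid float wave
      1 3
      1 4
      1 5
      2 3
      2 4
      3 3
      end

      isid pid wave                    // stops with an error if any pid-wave pair is repeated
      assert inlist(wave, 3, 4, 5)     // stops with an error if any other wave is present
      bysort pid : gen byte wanted = _N == 3
      ```

      If either check fails, counting observations is no longer equivalent to "present in waves 3, 4, and 5", and the more explicit checks in #2 are the safer route.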


      • #4
        @Nick,
        My goal is to estimate the probability that an individual who was interviewed in wave 3 was subsequently interviewed in waves 4 & 5. Ultimately, I will be creating inverse probability weights for the balanced panel as you indicated. For the first step, I want to generate a binary variable with a value of 1 for those who remained in the balanced panel and a value of 0 for those who dropped out.
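
        A rough sketch of how such weights are often built once the indicator exists, under loud assumptions: the data below are simulated, and the covariates age and female are hypothetical placeholders for whatever wave 3 characteristics are actually available. The thread itself does not specify a model.

        ```stata
        * Simulated data: one row per wave-3 respondent (all values are made up)
        clear
        set seed 12345
        set obs 500
        gen long pid = _n
        gen age = 20 + floor(50*runiform())      // hypothetical covariate
        gen byte female = runiform() < 0.5       // hypothetical covariate
        * simulated indicator: 1 = stayed in waves 4 and 5, 0 = dropped out
        gen byte wanted = runiform() < invlogit(1 - 0.02*age + 0.3*female)

        * model the probability of staying, then invert it for those who stayed
        logit wanted age female
        predict double phat, pr
        gen double ipw = 1/phat if wanted == 1
        summarize ipw
        ```

        With the real panel, the model would be fitted on one observation per pid (for example the wave 3 row), which is why a single value of the indicator per person matters.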


        • #5
          Maarten Buis,
          Yes, some individuals (pid) such as 401013 were interviewed in wave 3 but subsequently could not be interviewed in waves 4 and 5. Such individuals will have a value of 0. On the other hand, individuals (pid) such as 401014 were interviewed in wave 3 and also successfully interviewed in waves 4 and 5, and should therefore have a value of 1. I will try your code and report back.


          • #6
            Nick Cox,

            The code you provided produced the following outcome, which is close to what I want except that the pids are duplicated. How can I remove the duplicates, i.e., have each pid appear only once with a value of 1 or 0? I am not sure if I am being clear. Thanks very much.

            Code:
            * Example generated by -dataex-. To install: ssc install dataex
            clear
            input long pid float(wanted wave)
            401011 0 3
            401011 0 4
            401013 0 3
            401014 1 3
            401014 1 5
            401014 1 4
            401016 1 4
            401016 1 5
            401016 1 3
            401017 0 4
            401017 0 3
            401018 1 3
            401018 1 5
            401018 1 4
            401019 0 3
            401020 1 5
            401020 1 4
            401020 1 3
            401021 0 3
            401021 0 4
            401022 0 3
            401022 0 4
            401023 1 3
            401023 1 4
            401023 1 5
            401024 1 5
            401024 1 3
            401024 1 4
            401026 0 3
            401026 0 4
            401027 0 3
            401028 1 4
            401028 1 5
            401028 1 3
            401030 1 3
            401030 1 5
            401030 1 4
            401031 0 3
            401032 1 4
            401032 1 5
            401032 1 3
            401033 0 3
            401034 0 3
            401035 1 5
            401035 1 4
            401035 1 3
            401036 1 3
            401036 1 4
            401036 1 5
            401037 1 3
            401037 1 5
            401037 1 4
            401038 1 4
            401038 1 3
            401038 1 5
            401039 0 4
            401039 0 3
            401040 1 3
            401040 1 4
            401040 1 5
            401041 0 3
            401042 0 4
            401042 0 3
            401043 1 5
            401043 1 3
            401043 1 4
            401044 1 3
            401044 1 5
            401044 1 4
            401045 1 4
            401045 1 3
            401045 1 5
            401046 0 3
            401046 0 4
            401047 0 3
            401047 0 4
            401048 1 3
            401048 1 4
            401048 1 5
            401050 1 3
            401050 1 5
            401050 1 4
            401052 1 4
            401052 1 5
            401052 1 3
            401056 1 3
            401056 1 5
            401056 1 4
            401060 0 3
            401064 1 4
            401064 1 5
            401064 1 3
            401065 1 3
            401065 1 4
            401065 1 5
            401066 1 5
            401066 1 3
            401066 1 4
            401068 1 5
            401068 1 3
            401068 1 4
            401069 0 3
            401070 1 4
            401070 1 5
            401070 1 3
            401071 0 3
            401073 1 5
            401073 1 4
            401073 1 3
            401075 1 4
            401075 1 3
            401075 1 5
            401076 0 4
            401076 0 3
            401077 1 5
            401077 1 3
            401077 1 4
            401079 0 4
            401079 0 3
            401080 1 4
            401080 1 5
            401080 1 3
            401083 0 3
            401084 1 3
            401084 1 4
            401084 1 5
            401085 1 5
            401085 1 3
            401085 1 4
            401087 1 4
            401087 1 3
            401087 1 5
            401088 1 5
            401088 1 3
            401088 1 4
            401089 1 3
            401089 1 4
            401089 1 5
            401091 1 4
            401091 1 3
            401091 1 5
            401092 0 3
            401095 1 3
            401095 1 4
            401095 1 5
            401097 1 5
            401097 1 3
            401097 1 4
            401101 1 4
            401101 1 3
            401101 1 5
            401102 0 4
            401102 0 3
            401103 0 3
            401104 0 3
            401104 0 4
            401105 0 3
            401106 1 5
            401106 1 4
            401106 1 3
            401108 1 3
            401108 1 4
            401108 1 5
            401111 1 3
            401111 1 4
            401111 1 5
            401112 1 5
            401112 1 3
            401112 1 4
            401114 1 3
            401114 1 4
            401114 1 5
            401115 1 3
            401115 1 4
            401115 1 5
            401118 0 4
            401118 0 3
            401119 0 3
            401120 1 5
            401120 1 3
            401120 1 4
            401123 1 5
            401123 1 4
            401123 1 3
            401124 1 4
            401124 1 3
            401124 1 5
            401125 0 3
            401125 0 4
            401129 1 4
            401129 1 3
            401129 1 5
            401130 0 3
            401133 1 5
            401133 1 3
            401133 1 4
            401135 1 5
            401135 1 3
            401135 1 4
            401136 1 5
            end
            ------------------ copy up to and including the previous line ------------------


            • #7
              From #4: "to estimate the probability that an individual who was interviewed in wave 3 was subsequently interviewed in waves 4 & 5."
              That's a single conditional probability, so a scalar to be calculated across identifiers. I am not clear that a binary indicator variable across observations is even a good step in the right direction.

              Here is some technique to reduce the dataset to profiles across identifiers. See also https://journals.sagepub.com/doi/epu...36867X20909698


              Code:
              * Example generated by -dataex-. To install: ssc install dataex
              clear
              input long pid float wave
              401011 3
              401011 4
              401013 3
              401014 3
              401014 4
              401014 5
              401016 3
              401016 4
              401016 5
              401017 3
              401017 4
              401018 3
              401018 4
              401018 5
              401019 3
              401020 3
              401020 4
              401020 5
              401021 3
              401021 4
              401022 3
              401022 4
              401023 3
              401023 4
              401023 5
              401024 3
              401024 4
              401024 5
              401026 3
              401026 4
              401027 3
              401028 3
              401028 4
              401028 5
              401030 3
              401030 4
              401030 5
              401031 3
              401032 3
              401032 4
              401032 5
              401033 3
              401034 3
              401035 3
              401035 4
              401035 5
              401036 3
              401036 4
              401036 5
              401037 3
              401037 4
              401037 5
              401038 3
              401038 4
              401038 5
              401039 3
              401039 4
              401040 3
              401040 4
              401040 5
              401041 3
              401042 3
              401042 4
              401043 3
              401043 4
              401043 5
              401044 3
              401044 4
              401044 5
              401045 3
              401045 4
              401045 5
              401046 3
              401046 4
              401047 3
              401047 4
              401048 3
              401048 4
              401048 5
              401050 3
              401050 4
              401050 5
              401052 3
              401052 4
              401052 5
              401056 3
              401056 4
              401056 5
              401060 3
              401064 3
              401064 4
              401064 5
              401065 3
              401065 4
              401065 5
              401066 3
              401066 4
              401066 5
              401068 3
              401068 4
              401068 5
              401069 3
              401070 3
              401070 4
              401070 5
              401071 3
              401073 3
              401073 4
              401073 5
              401075 3
              401075 4
              401075 5
              401076 3
              401076 4
              401077 3
              401077 4
              401077 5
              401079 3
              401079 4
              401080 3
              401080 4
              401080 5
              401083 3
              401084 3
              401084 4
              401084 5
              401085 3
              401085 4
              401085 5
              401087 3
              401087 4
              401087 5
              401088 3
              401088 4
              401088 5
              401089 3
              401089 4
              401089 5
              401091 3
              401091 4
              401091 5
              401092 3
              401095 3
              401095 4
              401095 5
              401097 3
              401097 4
              401097 5
              401101 3
              401101 4
              401101 5
              401102 3
              401102 4
              401103 3
              401104 3
              401104 4
              401105 3
              401106 3
              401106 4
              401106 5
              401108 3
              401108 4
              401108 5
              401111 3
              401111 4
              401111 5
              401112 3
              401112 4
              401112 5
              401114 3
              401114 4
              401114 5
              401115 3
              401115 4
              401115 5
              401118 3
              401118 4
              401119 3
              401120 3
              401120 4
              401120 5
              401123 3
              401123 4
              401123 5
              401124 3
              401124 4
              401124 5
              401125 3
              401125 4
              401129 3
              401129 4
              401129 5
              401130 3
              401133 3
              401133 4
              401133 5
              401135 3
              401135 4
              401135 5
              401136 3
              end
              
              bysort pid (wave) : gen history = strofreal(wave) if _n == 1
              by pid : replace history = history[_n-1] + strofreal(wave) if _n > 1
              by pid : replace history = history[_N]
              by pid : keep if _n == _N
              
              tab history
              
              count if substr(history, 1, 1) == "3"
              local denom = r(N)
              count if history == "345"
              
              di r(N) / `denom'
              
              .61445783
              EDIT: I think #6 is in effect answered by this.
              Last edited by Nick Cox; 22 Dec 2022, 03:48.
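
              For comparison, the same kind of proportion can be obtained without reducing the dataset, by combining the indicator from #3 with a flag for one observation per pid. This is only a sketch on hypothetical toy data, and it assumes that every pid has a wave 3 observation, as in the example posted in #1.

              ```stata
              * Hypothetical toy data: one complete panel among three wave-3 respondents
              clear
              input long pid float wave
              1 3
              1 4
              1 5
              2 3
              2 4
              3 3
              end

              bysort pid : gen byte wanted = _N == 3   // indicator from #3
              egen byte onepid = tag(pid)              // 1 for exactly one observation per pid
              summarize wanted if onepid
              display r(mean)
              ```

              Here the mean of wanted over tagged observations is the share of pids with a complete panel, one in three for this toy data, and all six observations remain in memory.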


              • #8
                Nick Cox,

                As indicated above, the code you suggested gave me the output below which is close to what I wanted except for the duplicates:

                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input long pid float(wave wanted)
                401011 3 0
                401011 4 0
                401013 3 0
                401014 3 1
                401014 4 1
                401014 5 1
                401016 3 1
                401016 4 1
                401016 5 1
                401017 3 0
                401017 4 0
                401018 3 1
                401018 4 1
                401018 5 1
                401019 3 0
                401020 3 1
                401020 4 1
                401020 5 1
                401021 3 0
                401021 4 0
                401022 3 0
                401022 4 0
                401023 3 1
                401023 4 1
                401023 5 1
                401024 3 1
                401024 4 1
                401024 5 1
                401026 3 0
                401026 4 0
                401027 3 0
                401028 3 1
                401028 4 1
                401028 5 1
                401030 3 1
                401030 4 1
                401030 5 1
                401031 3 0
                401032 3 1
                401032 4 1
                401032 5 1
                401033 3 0
                401034 3 0
                401035 3 1
                401035 4 1
                401035 5 1
                401036 3 1
                401036 4 1
                401036 5 1
                401037 3 1
                401037 4 1
                401037 5 1
                401038 3 1
                401038 4 1
                401038 5 1
                401039 3 0
                401039 4 0
                401040 3 1
                401040 4 1
                401040 5 1
                401041 3 0
                401042 3 0
                401042 4 0
                401043 3 1
                401043 4 1
                401043 5 1
                401044 3 1
                401044 4 1
                401044 5 1
                401045 3 1
                401045 4 1
                401045 5 1
                401046 3 0
                401046 4 0
                401047 3 0
                401047 4 0
                401048 3 1
                401048 4 1
                401048 5 1
                401050 3 1
                401050 4 1
                401050 5 1
                401052 3 1
                401052 4 1
                401052 5 1
                401056 3 1
                401056 4 1
                401056 5 1
                401060 3 0
                401064 3 1
                401064 4 1
                401064 5 1
                401065 3 1
                401065 4 1
                401065 5 1
                401066 3 1
                401066 4 1
                401066 5 1
                401068 3 1
                401068 4 1
                401068 5 1
                401069 3 0
                401070 3 1
                401070 4 1
                401070 5 1
                401071 3 0
                401073 3 1
                401073 4 1
                401073 5 1
                401075 3 1
                401075 4 1
                401075 5 1
                401076 3 0
                401076 4 0
                401077 3 1
                401077 4 1
                401077 5 1
                401079 3 0
                401079 4 0
                401080 3 1
                401080 4 1
                401080 5 1
                401083 3 0
                401084 3 1
                401084 4 1
                401084 5 1
                401085 3 1
                401085 4 1
                401085 5 1
                401087 3 1
                401087 4 1
                401087 5 1
                401088 3 1
                401088 4 1
                401088 5 1
                401089 3 1
                401089 4 1
                401089 5 1
                401091 3 1
                401091 4 1
                401091 5 1
                401092 3 0
                401095 3 1
                401095 4 1
                401095 5 1
                401097 3 1
                401097 4 1
                401097 5 1
                401101 3 1
                401101 4 1
                401101 5 1
                401102 3 0
                401102 4 0
                401103 3 0
                401104 3 0
                401104 4 0
                401105 3 0
                401106 3 1
                401106 4 1
                401106 5 1
                401108 3 1
                401108 4 1
                401108 5 1
                401111 3 1
                401111 4 1
                401111 5 1
                401112 3 1
                401112 4 1
                401112 5 1
                401114 3 1
                401114 4 1
                401114 5 1
                401115 3 1
                401115 4 1
                401115 5 1
                401118 3 0
                401118 4 0
                401119 3 0
                401120 3 1
                401120 4 1
                401120 5 1
                401123 3 1
                401123 4 1
                401123 5 1
                401124 3 1
                401124 4 1
                401124 5 1
                401125 3 0
                401125 4 0
                401129 3 1
                401129 4 1
                401129 5 1
                401130 3 0
                401133 3 1
                401133 4 1
                401133 5 1
                401135 3 1
                401135 4 1
                401135 5 1
                401136 3 1
                end
                What I want is shown below. I obtained it by running the following commands to get rid of the duplicate pids:

                duplicates drop pid if wanted==1, force
                duplicates drop pid if wanted==0, force


                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input long pid float(wave wanted)
                401011 3 0
                401013 3 0
                401014 3 1
                401016 3 1
                401017 3 0
                401018 3 1
                401019 3 0
                401020 3 1
                401021 3 0
                401022 3 0
                401023 3 1
                401024 3 1
                401026 3 0
                401027 3 0
                401028 3 1
                401030 3 1
                401031 3 0
                401032 3 1
                401033 3 0
                401034 3 0
                401035 3 1
                401036 3 1
                401037 3 1
                401038 3 1
                401039 3 0
                401040 3 1
                401041 3 0
                401042 3 0
                401043 3 1
                401044 3 1
                401045 3 1
                401046 3 0
                401047 3 0
                401048 3 1
                401050 3 1
                401052 3 1
                401056 3 1
                401060 3 0
                401064 3 1
                401065 3 1
                401066 3 1
                401068 3 1
                401069 3 0
                401070 3 1
                401071 3 0
                401073 3 1
                401075 3 1
                401076 3 0
                401077 3 1
                401079 3 0
                401080 3 1
                401083 3 0
                401084 3 1
                401085 3 1
                401087 3 1
                401088 3 1
                401089 3 1
                401091 3 1
                401092 3 0
                401095 3 1
                401097 3 1
                401101 3 1
                401102 3 0
                401103 3 0
                401104 3 0
                401105 3 0
                401106 3 1
                401108 3 1
                401111 3 1
                401112 3 1
                401114 3 1
                401115 3 1
                401118 3 0
                401119 3 0
                401120 3 1
                401123 3 1
                401124 3 1
                401125 3 0
                401129 3 1
                401130 3 0
                401133 3 1
                401135 3 1
                401136 3 1
                401137 3 1
                401138 3 0
                401139 3 1
                401140 3 1
                401142 3 0
                401145 3 1
                401147 3 0
                401148 3 1
                401149 3 1
                401150 3 1
                401151 3 1
                401152 3 1
                401153 3 0
                401155 3 0
                401156 3 1
                401159 3 1
                401160 3 1
                401161 3 1
                401163 3 1
                401165 3 0
                401166 3 1
                401167 3 1
                401170 3 0
                401171 3 1
                401172 3 0
                401174 3 0
                401175 3 1
                401178 3 1
                401179 3 0
                401180 3 1
                401185 3 0
                401186 3 0
                401188 3 0
                401189 3 0
                401190 3 1
                401193 3 0
                401194 3 1
                401196 3 0
                401197 3 0
                401198 3 0
                401199 3 1
                401203 3 1
                401205 3 1
                401206 3 1
                401208 3 1
                401210 3 1
                401211 3 1
                401212 3 0
                401214 3 1
                401215 3 0
                401217 3 1
                401218 3 1
                401219 3 0
                401222 3 1
                401224 3 0
                401226 3 1
                401228 3 0
                401229 3 1
                401230 3 0
                401235 3 0
                401236 3 0
                401237 3 0
                401238 3 1
                401241 3 0
                401242 3 0
                401243 3 0
                401244 3 1
                401247 3 0
                401248 3 1
                401249 3 1
                401250 3 0
                401253 3 1
                401254 3 0
                401255 3 0
                401256 3 0
                401257 3 0
                401258 3 1
                401264 3 1
                401266 3 1
                401267 3 0
                401268 3 0
                401269 3 1
                401270 3 1
                401271 3 1
                401272 3 1
                401274 3 0
                401275 3 1
                401276 3 1
                401277 3 0
                401279 3 0
                401280 3 1
                401282 3 1
                401283 3 0
                401285 3 1
                401286 3 0
                401288 3 1
                401290 3 0
                401292 3 0
                401293 3 1
                401294 3 0
                401295 3 1
                401296 3 1
                401297 3 1
                401299 3 0
                401302 3 1
                401304 3 1
                401306 3 1
                401307 3 0
                401308 3 0
                401310 3 1
                401312 3 1
                401313 3 1
                401316 3 0
                401319 3 0
                401320 3 1
                401321 3 1
                401323 3 0
                end
                ------------------ copy up to and including the previous line ------------------

                My question: Is there a way to obtain this output without dropping the duplicate pids from the dataset? In other words, I want the binary variable 'wanted' to contain only one value (either 0 or 1) per pid.
                Thanks very much.
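
                One possible way to get a single value per pid while keeping every observation is to copy the indicator to the first observation of each pid only and leave it missing elsewhere. The sketch below uses hypothetical toy data, and the variable name wanted_once is made up for illustration.

                ```stata
                * Hypothetical toy data, not from the thread
                clear
                input long pid float wave
                1 3
                1 4
                1 5
                2 3
                2 4
                3 3
                end

                bysort pid : gen byte wanted = _N == 3
                * nonmissing only on the earliest wave of each pid; missing on the other rows
                bysort pid (wave) : gen byte wanted_once = wanted if _n == 1

                tabulate wanted_once
                list pid wanted_once if !missing(wanted_once), noobs
                ```

                Because missing values are ignored by tabulate and by estimation commands, wanted_once then behaves like a person-level variable, and no duplicates drop is needed.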


                • #9
                  Sorry, but I can't follow #8 easily. Why don't you show a smaller example dataset as you have it and as you want and explain the logic in terms of those examples?
