  • Recoding Variable Assistance

    Hello Statalisters,

    I need to recode a variable to indicate the number of days that elapsed before the adoption of a particular policy. Currently, the variable first_public_healthpost is coded 0 or 1, with 1 indicating when the policy was adopted. I have a variable daily_cases_by_date that tracks the passage of time. I believed that to recode this variable I needed something like: gen daystopassage= daily_cases_by_date-first_public_healthpost==1 but when I do this, all I get is a bunch of zeros in the new variable, when I expected it to run from 1 up to the number of elapsed days to passage. I know this should be fairly simple, but I'm stuck. Any assistance would be greatly appreciated. A data sample is included below:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float daily_cases_by_date int day byte(fips2 first_public_healthpost mandate_order_dates) str14 name
    22085 150 1 0 0 "Alabama"
    22086 151 1 0 0 "Alabama"
    22087 152 1 0 0 "Alabama"
    22088 153 1 0 0 "Alabama"
    22089 154 1 0 0 "Alabama"
    22090 155 1 0 0 "Alabama"
    22091 156 1 0 0 "Alabama"
    22092 157 1 0 0 "Alabama"
    22093 158 1 0 0 "Alabama"
    22094 159 1 0 0 "Alabama"
    22095 160 1 0 0 "Alabama"
    22096 161 1 0 0 "Alabama"
    22097 162 1 0 0 "Alabama"
    22098 163 1 0 0 "Alabama"
    22099 164 1 0 0 "Alabama"
    22100 165 1 0 0 "Alabama"
    22101 166 1 0 0 "Alabama"
    22102 167 1 0 0 "Alabama"
    22103 168 1 0 0 "Alabama"
    22104 169 1 0 0 "Alabama"
    22105 170 1 0 0 "Alabama"
    22106 171 1 0 0 "Alabama"
    22107 172 1 0 0 "Alabama"
    22108 173 1 0 0 "Alabama"
    22109 174 1 0 0 "Alabama"
    22110 175 1 0 0 "Alabama"
    22111 176 1 0 0 "Alabama"
    22112 177 1 1 1 "Alabama"
    22113 178 1 1 1 "Alabama"
    22114 179 1 1 1 "Alabama"
    22115 180 1 1 1 "Alabama"
    22116 181 1 1 1 "Alabama"
    22117 182 1 1 1 "Alabama"
    22118 183 1 1 1 "Alabama"
    22119 184 1 1 1 "Alabama"
    22120 185 1 1 1 "Alabama"
    22121 186 1 1 1 "Alabama"
    22122 187 1 1 1 "Alabama"
    22123 188 1 1 1 "Alabama"
    22124 189 1 1 1 "Alabama"
    22125 190 1 1 1 "Alabama"
    22126 191 1 1 1 "Alabama"
    22127 192 1 1 1 "Alabama"
    22128 193 1 1 1 "Alabama"
    22129 194 1 1 1 "Alabama"
    22130 195 1 1 1 "Alabama"
    22131 196 1 1 1 "Alabama"
    22132 197 1 1 1 "Alabama"
    22133 198 1 1 1 "Alabama"
    22134 199 1 1 1 "Alabama"
    22135 200 1 1 1 "Alabama"
    end
    format %td daily_cases_by_date

    So basically all I need to do is create a new variable that counts the days from the start of daily_cases_by_date up to the date of first_public_healthpost, so that I can run a Cox proportional hazards model on these data. I'm interested in the differences in time to adoption among the states.

  • #2

    That code is legal but some distance from what you need, which sounds like

    Code:
    egen datefirst = min(cond(first_public_healthpost == 1, daily_cases_by_date, .)), by(name)
    
    gen wanted = daily_cases_by_date - datefirst
    or its negation.

    See also https://www.stata-journal.com/articl...article=dm0055 Section 9.
    Last edited by Nick Cox; 31 Oct 2022, 10:03.
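
Why the original attempt returned zeros, and what the egen line in #2 does instead, can be seen on a toy example. This is only a sketch: the variable names date and adopted and all the values are made-up stand-ins for daily_cases_by_date and first_public_healthpost. In Stata, subtraction is evaluated before ==, so the first attempt is read as (date - adopted) == 1, a true/false indicator that is 0 for any realistic daily date.

```stata
* Toy data (hypothetical): one state, four days, policy in place from day 3
clear
input float date byte adopted str7 name
22110 0 "Alabama"
22111 0 "Alabama"
22112 1 "Alabama"
22113 1 "Alabama"
end
format %td date

* Original attempt: parsed as (date - adopted) == 1, so it is 0 on every row
gen attempt = date - adopted == 1

* cond() yields the date on adopted rows and missing elsewhere;
* egen's min() ignores missings, so datefirst is the first adoption date
egen datefirst = min(cond(adopted == 1, date, .)), by(name)

* Days relative to adoption: negative before, 0 on the day, positive after
gen wanted = date - datefirst

list, sepby(name)
```

With these values, attempt is 0 throughout and wanted runs -2, -1, 0, 1, which is the pattern of negative counts discussed later in the thread.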


    • #3
      It is specific by state. Thanks for the assistance!


      • #4
        Indeed. I spotted that on the second reading.


        • #5
          Nick Cox I have one follow-up question. The date range in my data goes from January 22, 2020 to June 2, 2021, so using your code above I now have some negative integers that need to be adjusted, since there are 498 days in the data. Do I need to incorporate this somehow? I assume the answer is yes, but I'm not sure where in your code I would put it. Would it be something like
          Code:
          gen daystopassage= (daily_cases_by_date-datefirst)-498


          • #6
            As a follow-up: the new variable time_to_passage is correct, but the sign is negative (although I guess in terms of interpretation this is the correct way to show it). See below for a data snippet. I guess I'm wondering whether this variable is correctly censored for analysis.

            Code:
            * Example generated by -dataex-. For more info, type help dataex
            clear
            input float(daily_cases_by_date datefirst) int day str14 statename float time_to_passage
            22035 22112 100 "Alabama" -77
            22036 22112 101 "Alabama" -76
            22037 22112 102 "Alabama" -75
            22038 22112 103 "Alabama" -74
            22039 22112 104 "Alabama" -73
            22040 22112 105 "Alabama" -72
            22041 22112 106 "Alabama" -71
            22042 22112 107 "Alabama" -70
            22043 22112 108 "Alabama" -69
            22044 22112 109 "Alabama" -68
            22045 22112 110 "Alabama" -67
            22046 22112 111 "Alabama" -66
            22047 22112 112 "Alabama" -65
            22048 22112 113 "Alabama" -64
            22049 22112 114 "Alabama" -63
            22050 22112 115 "Alabama" -62
            22051 22112 116 "Alabama" -61
            22052 22112 117 "Alabama" -60
            22053 22112 118 "Alabama" -59
            22054 22112 119 "Alabama" -58
            22055 22112 120 "Alabama" -57
            22056 22112 121 "Alabama" -56
            22057 22112 122 "Alabama" -55
            22058 22112 123 "Alabama" -54
            22059 22112 124 "Alabama" -53
            22060 22112 125 "Alabama" -52
            22061 22112 126 "Alabama" -51
            22062 22112 127 "Alabama" -50
            22063 22112 128 "Alabama" -49
            22064 22112 129 "Alabama" -48
            22065 22112 130 "Alabama" -47
            22066 22112 131 "Alabama" -46
            22067 22112 132 "Alabama" -45
            22068 22112 133 "Alabama" -44
            22069 22112 134 "Alabama" -43
            22070 22112 135 "Alabama" -42
            22071 22112 136 "Alabama" -41
            22072 22112 137 "Alabama" -40
            22073 22112 138 "Alabama" -39
            22074 22112 139 "Alabama" -38
            22075 22112 140 "Alabama" -37
            22076 22112 141 "Alabama" -36
            22077 22112 142 "Alabama" -35
            22078 22112 143 "Alabama" -34
            22079 22112 144 "Alabama" -33
            22080 22112 145 "Alabama" -32
            22081 22112 146 "Alabama" -31
            22082 22112 147 "Alabama" -30
            22083 22112 148 "Alabama" -29
            22084 22112 149 "Alabama" -28
            22085 22112 150 "Alabama" -27
            22086 22112 151 "Alabama" -26
            22087 22112 152 "Alabama" -25
            22088 22112 153 "Alabama" -24
            22089 22112 154 "Alabama" -23
            22090 22112 155 "Alabama" -22
            22091 22112 156 "Alabama" -21
            22092 22112 157 "Alabama" -20
            22093 22112 158 "Alabama" -19
            22094 22112 159 "Alabama" -18
            22095 22112 160 "Alabama" -17
            22096 22112 161 "Alabama" -16
            22097 22112 162 "Alabama" -15
            22098 22112 163 "Alabama" -14
            22099 22112 164 "Alabama" -13
            22100 22112 165 "Alabama" -12
            22101 22112 166 "Alabama" -11
            22102 22112 167 "Alabama" -10
            22103 22112 168 "Alabama"  -9
            22104 22112 169 "Alabama"  -8
            22105 22112 170 "Alabama"  -7
            22106 22112 171 "Alabama"  -6
            22107 22112 172 "Alabama"  -5
            22108 22112 173 "Alabama"  -4
            22109 22112 174 "Alabama"  -3
            22110 22112 175 "Alabama"  -2
            end
            format %td daily_cases_by_date


            • #7
              Posts are crossing here. An edit to #2 said "or its negation", which was hedging bets. Perhaps what you want is

              Code:
              bysort name (daily_cases_by_date) : gen wanted = daily_cases_by_date - daily_cases_by_date[1] + 1 if daily_cases_by_date < datefirst


              • #8
                Yep. That's exactly what I wanted. Now the variable goes to missing "." once the policy was passed. Thank you again.


                • #9
                  So, this should have been simpler:

                  Code:
                  bysort name (daily_cases) : gen wanted = daily_cases - daily_cases[1] + 1 if first_public_healthpost == 0
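
To see what the bysort counter in #7 and #9 produces, here is a minimal sketch on made-up data with two states, one of which never adopts within the window. The names date and adopted and every value are hypothetical stand-ins for daily_cases_by_date and first_public_healthpost, not the poster's data.

```stata
* Toy data (hypothetical): Alabama adopts on its third day, Alaska never does
clear
input float date byte adopted str7 name
22000 0 "Alabama"
22001 0 "Alabama"
22002 1 "Alabama"
22003 1 "Alabama"
22000 0 "Alaska"
22001 0 "Alaska"
22002 0 "Alaska"
22003 0 "Alaska"
end
format %td date

* Within each state, sorted by date, date[1] is the first date observed,
* so the counter starts at 1 and is missing once the policy is in place
bysort name (date) : gen wanted = date - date[1] + 1 if adopted == 0

list, sepby(name)
```

Here wanted is 1, 2, ., . for Alabama and 1, 2, 3, 4 for Alaska. A state that never adopts simply keeps counting to the end of the window, which is probably worth keeping in mind when deciding how such states should be treated as censored in a survival analysis.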
