Hey,

I'm currently working on a weekly time series project and cleaning my dataset. As it spans 2020 to 2023, the 53rd week of 2020 is a constant pain.

I've already done this to create a numerical date variable in the format YYYYwWW and a week variable with ISO 8601 weeks:

Code:
* Convert the string date to a Stata date variable
gen date = date(date_str, "YMD")
format date %td
gen year = year(date)
* Get ISO8601 dates
gen ISOweek = int((doy(7*int((date-mdy(1,1,1900))/7) + mdy(1,1,1900) + 3) + 6)/7)
gen week = ISOweek
drop ISOweek date_str

When I use

Code:
tsset date

Stata puts my id 44, with the week 2020w53, at the end. I also tried this solution from Nick in an older post:

Code:
egen numweek = group(date), label
replace numweek = 3297 in 169
format numweek %tw
replace date = numweek if id == 44
sort id

It didn't help, because when I use

Code:
tsset date

afterwards, it just deletes the 2020w53 and sorts it at the end. If I use Nick's code after the tsset and try to generate first differences in a foreach loop, I get the error "not sorted":

Code:
local variables "wai tavg_Berlin tavg_Bremen tavg_Hamburg airbnb_Berlin airbnb_Bremen airbnb_Hamburg booking_Berlin booking_Bremen booking_Hamburg urlaub_topic_Berlin urlaub_topic_Bremen urlaub_topic_Hamburg anzfallvortag_Berlin anzfallvortag_Bremen anzfallvortag_Hamburg kumfall_Berlin kumfall_Bremen kumfall_Hamburg"
sort id
* Generate FD of LOGs
foreach v of local variables {
    gen dln_`v' = d.ln_`v'
    label variable dln_`v' "f. diff. log `v'"
}
not sorted

What am I doing wrong here, and is there a solution (and what would it be)?

I'm sorry if a solution has already been posted in the forum; I haven't found it yet, so please point me to it.

Best
Philipp

PS: Maybe unrelated, but when I first-difference the variables and ignore the problem with the missing date in 2020w53, I lose the first observation and the last one. Is this related to the 53-week problem? I would expect to lose only one observation from first differencing.