Hi everyone,
I would like to create a longitudinal dataset from an uncleaned dataset, and I don't know where to start. I need help, please. Basically, my dataset looks like:
I have some variables representing each municipality involved in a renewable construction project in Spain, from -municipality_1- to -municipality_26-. Some projects could contain 1 or more municipalities involved. I have the full dates of publication of the status of the project (-publication_date_boe_ddmmyyyy-), and the decision (-boe_resolution_date_ddmmyyyy-).
Here's what I'd ideally like to have. This is an example and does not necessarily reflect the reality given above:
I'd like to have the same number of quarters. In other words, I'd like the time variable for each municipality to start at 2017q1 up to 2023q4, even if there are a lot of 0s and missing values. Is this possible?
Otherwise, is it possible to have as many rows as quarters?
Thank you very much in advance for your help and time!
Best regards,
Michael
I would like to create a longitudinal dataset from an uncleaned dataset, and I don't know where to start. I need help, please. Basically, my dataset looks like:
Code:
* Example generated by -dataex-. For more info, type help dataex clear input str28 municipality_1 str32 municipality_2 str28 municipality_3 str32 municipality_4 str17 power_mw str7 power_kw str29 renewables_type int(publication_date_boe_ddmmyyyy boe_resolution_date_ddmmyyyy) "Yebes" " Horche" "" "" "55" "" "Solar" 23146 23120 "Yebes" " Horche" "" "" "67.5" "" "Solar" 23145 23120 "Yepes" " Numancia de La Sagra" " Pantoja" " Añover de Tajo" "242.35" "" "Solar" 23162 23121 "Yunquera de Henares" " Fontanar" " Tórtola de Henares" " Guadalajara" "42,5" "" "Solar" 23176 23166 "Zaragoza" " Teruel" " Tarragona" "Barcelona." "38.2" "" "Solar" 23224 23204 end format %td publication_date_boe_ddmmyyyy format %td boe_resolution_date_ddmmyyyy
I have some variables representing each municipality involved in a renewable construction project in Spain, from -municipality_1- to -municipality_26-. Some projects could contain 1 or more municipalities involved. I have the full dates of publication of the status of the project (-publication_date_boe_ddmmyyyy-), and the decision (-boe_resolution_date_ddmmyyyy-).
Here's what I'd ideally like to have. This is an example and does not necessarily reflect the reality given above:
municipality_1 | quarter | new_installations_dummy | cum_new_installations_dummy | new_capacity_installed_mw | cumul_new_capacity_installed_mw | new_capacity_installed_kw | cumul_new_capacity_installed_kw | renewables_type (encoded) | count_other_municipalities_wo_first | municipality_2 | municipality_3 |
"Yebes" | 2018q1 | 0 | 0 | . | . | . | . | . | 0 | ||
"Yebes" | 2018q2 | 0 | 0 | . | . | . | . | . | 0 | ||
"Yebes" | 2018q3 | 1 | 1 | 62.55 | 62.55 | . | . | Solar | 1 | "Barcelona" | |
"Yebes" | 2018q4 | 0 | 1 | . | . | . | . | . | 0 | ||
"Yebes" | 2019q1 | 0 | 1 | . | . | . | . | . | 0 | ||
"Yebes" | 2019q2 | 1 | 2 | 350.10 | 412.65 | . | . | Solar | 2 | "Zaragoza" | "Teruel" |
"Yebes" | 2019q3 | 0 | 2 | . | . | . | . | . | 0 | ||
"Yebes" | 2019q4 | 0 | 2 | . | . | . | . | . | 0 | ||
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
Otherwise, is it possible to have as many rows as quarters?
Thank you very much in advance for your help and time!
Best regards,
Michael
Comment