Hi everyone,
I have large datasets (about 3,500,000 observations on average in each monthly file, from January 2021 to July 2023), and I need to reduce them to a size I can work with.
I want to keep only the IDs, drop duplicates, and save the result as one file of unique IDs per month. Those files would be small enough to combine into a single dataset.
The monthly files always contain the same variables, and the file names always follow the same pattern ("export_telemedida_yyyymm", from export_telemedida_202101 to export_telemedida_202307).
I would like to do this along the lines of this nice post written by Daniel Schaefer:
Here is the post: https://www.statalist.org/forums/for...70#post1732070
Could anyone help me, please?
Here is a dataex:
Code:
* Example generated by -dataex-. For more info, type help dataex
clear
input long v1 str96 id long fecha_consumo float(v4 v5)
0 "C22D34923F76E418EBAF2DF336E314232FD494E42A38CB85B182821164E849971C7EEEDB9C49717C1E20F82A7127E51E" 20210122 .103 .103
end
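To make the goal concrete, the steps I describe above could be sketched roughly as follows. This is only a sketch under assumptions, not a tested solution: it assumes the monthly files are .dta files in the current working directory and that the ID variable is named id, as in the dataex. The variable yyyymm and the output file names (ids_yyyymm and ids_all_months) are hypothetical names chosen for illustration.

```stata
* Sketch only: assumes .dta files in the current working directory
* and an ID variable named id, as in the dataex above.

* Build the list of months from 202101 to 202307
local months
forvalues y = 2021/2023 {
    forvalues m = 1/12 {
        * stop after July 2023, the last available month
        if `y' == 2023 & `m' > 7 continue, break
        local ym = `y' * 100 + `m'
        local months `months' `ym'
    }
}

* Step 1: reduce each monthly file to its unique IDs
foreach ym of local months {
    use id using "export_telemedida_`ym'.dta", clear
    duplicates drop
    generate long yyyymm = `ym'    // hypothetical variable recording the source month
    save "ids_`ym'.dta", replace   // hypothetical output file name
}

* Step 2: append the small monthly ID files into one dataset
clear
foreach ym of local months {
    append using "ids_`ym'.dta"
}
save "ids_all_months.dta", replace // hypothetical output file name
```

Loading only id with use id using keeps memory use low, because the other variables of each 3,500,000-observation file are never read into memory.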
Thank you in advance.
Michael