Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Is there a way in stata to do a unique graph with multiple lines from multiple monthly files?

    Hi everyone,

    I have monthly DTA files about electricity consumption in Spain. These files run from January 2021 to November 2023. So 35 separate monthly files.
    Here is my aim:
    • I want to draw multiple lines from average consumption in January 2021, February 2021, March 2021, and so on in one and unique graph, if possible. Or one graph showing monthly average elec. consumption for 2021, one for 2022 and finally one for 2023.
    • In particular, in June 2021 a new policy was implemented about electricity tariffs in Spain, so it would be nice to see at least some months before this change, and some months after this change.
    • Also, if this is not recommended, I appreciate if you could give me some insights, please. Thank you.
    To give you an idea, here is a small -dataex- about the first monthly file: January 2021:


    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input long(id fecha_consumo) float(a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 a10 a11 a12 a13 a14 a15 a16 a17 a18 a19 a20 a21 a22 a23)
    1001 20210101 .179 .113 .149 .115 .179 .114 .148 .115 .179 .113 .146 .113 .171 .117 .137 .125 .159 .134 .125 .136 .137 .157 .113 .148
    1001 20210102 .115 .181 .114 .149 .115 .168 .131 .127 .134 .125 .166 .111 .142  .11 .175 .114 .148 .113 .143 .138 .146 .112 .146 .113
    1001 20210103 .177 .115 .147 .113 .162 .144 .111 .141 .115 .151 .139 .112 .142 .112 .157 .149 .115 .147 .113 .183 .112 .147 .113 .161
    1001 20210104 .136 .123 .144 .121 .175 .113 .147 .113 .172 .119 .135 .119 .135 .152 .112 .147 .112 .178 .111 .145 .113 .151  .14 .115
    1001 20210105 .146 .119 .177 .115 .149 .115 .173 .121 .134 .122 .137 .153 .111 .145 .111  .18 .113 .144  .11 .154 .134  .12  .14 .122
    end

    Basically:
    1. -id- is the Household ID,
    2. -fecha_consumo- is the date in "long" format,
    3. -a0- up to -a23- are hourly electricity consumption from midnight (-a0-) up to 11 p.m (-a23-).
    All the monthly files have the same structure and same variables.

    I would like to thank you in advance for your help.
    Best,

    Michael

    Edit: -merge- is not possible, since these monthly files are huge and exceed the memory when merged together.
    Last edited by Michael Duarte Goncalves; 20 Dec 2023, 00:58.

  • #2
    You can process one month to return its average consumption like so:
    Code:
    egen day = rowtotal(a*) // daily consumption per hh
    drop a*
    collapse (sum) day, by(id) // monthly consumption per hh
    rename day monthly
    sum monthly // average consumption in that month
    Than you can repeat that over all monthly files you have and store those means in a separate file, and graph that. Lets say we stored the code above in a .do file called compute_average.do, and that the only .dta files in a directory called data

    Code:
    local files : dir "/data" files "*.dta"
    frame create means average str50 fn
    foreach file of local files {
        use `file', clear
        do compute_average.do
        frame post means (r(mean)) ("`file'")
    }
    frame change means
    list
    Now you need to extract the month and year from the variable fn, and than you can graph
    ---------------------------------
    Maarten L. Buis
    University of Konstanz
    Department of history and sociology
    box 40
    78457 Konstanz
    Germany
    http://www.maartenbuis.nl
    ---------------------------------

    Comment


    • #3
      Maarten Buis: Thank you so much for your answer.

      Your explanations are perfect. Thank you so much!
      Lovely day.

      Michael

      Comment

      Working...
      X