
  • #16
    Originally posted by Katherine Adams
    Weiwen,


    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input long location str10 date str5 elec_consumption str2 treatgr
    600001 "2017-01-01" "66.1" "0"
    600003 "2017-01-01" "46.7" "3"
    600004 "2017-01-01" "10.8" "3"
    600005 "2017-01-01" "24" "3"
    600006 "2017-01-01" "42.5" "2"
    600007 "2017-01-01" "7.1" "3"
    600008 "2017-01-01" "41.1" "1"
    600009 "2017-01-01" "41.4" "2"
    600010 "2017-01-01" "96.1" "3"
    600011 "2017-01-01" "22.1" "0"
    600012 "2017-01-01" "31" "1"
    600013 "2017-01-01" "33.1" "2"
    600014 "2017-01-01" "139" "2"
    600015 "2017-01-01" "44.9" "0"
    600016 "2017-01-01" "76.9" "2"
    600017 "2017-01-01" "34" "0"
    600018 "2017-01-01" "4.9" "3"
    600019 "2017-01-01" "27.1" "3"
    600020 "2017-01-01" "50.5" "1"
    600022 "2017-01-01" "47.1" "3"
    end

    I have data for 2017-2018 (sorted by date). As it has been said, the following R code is supposed to build an event-study figure showing average daily electricity consumption (= average consumption for the relevant treatment group minus the average consumption for the control group) by households in the treated groups (3 treatment groups, variable ‘treatgr’, treatgr=1/2/3) compared to the control group (treatgr=0) over time. The date of the treatment is May 2, 2017 (it should be the vertical line on the figure).

    library(dplyr)    # group_by, summarise, mutate, select
    library(tidyr)    # spread, gather
    library(ggplot2)

    # Daily mean consumption per group, then each treated group minus control
    day_avg <- dat %>%
      group_by(date, treatgrp) %>%
      summarise(mean_consumption = mean(elec_consumption)) %>%
      spread(key = treatgrp, value = mean_consumption) %>%
      mutate(diff1 = `1` - `0`,
             diff2 = `2` - `0`,
             diff3 = `3` - `0`) %>%
      select(date, contains("diff")) %>%
      gather(key = treatment, value = mean_consumption, contains("diff"))

    # One line per comparison; the dashed vertical line marks May 2, 2017
    ggplot(day_avg,
           aes(x = date, y = mean_consumption, colour = factor(treatment))) +
      theme_bw() +
      xlab("") +
      geom_line() +
      ylab("Electricity consumption per day (kWh/day)") +
      geom_hline(yintercept = 0) +
      geom_vline(xintercept = as.numeric(as.Date("02-05-2017", "%d-%m-%Y")),
                 linetype = "dashed")


    I have a follow-up question on this. I guess that this figure will not actually be an event-study one... If I want to draw an event-study figure for my data (i.e., the one which typically displays point estimates from an event study regression of electricity consumption before and after the treatment), how can I do this? I have been desperately trying to find any information about an event study, but all I have found so far is related to financial data.

    I would appreciate any help!
    I'd post a new thread at this point. In general, you can do this by fitting the regression model and then using margins and marginsplot. However, I don't really know what an event study is. If it's a bit like a difference in differences setup, then what I said should hold.
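
A minimal sketch of the margins and marginsplot route, assuming that "event study" here means the difference between each treated group and the control group in each period, with confidence intervals. The data are simulated, and the names rel and relpos as well as the effect size are hypothetical. A fuller specification would usually add household and period fixed effects and normalize to a pre-treatment reference period.

```stata
* Sketch only: simulated data stand in for the real panel.
* The names rel and relpos and the simulated effect are hypothetical.
clear
set seed 12345
set obs 200
generate long location = 600000 + _n
generate byte treatgr = mod(_n, 4)            // 0 = control, 1-3 = treated
expand 12                                     // 12 monthly observations each
bysort location: generate month = tm(2017m1) + _n - 1
format month %tm

* Consumption drops by 3 kWh for treated households from May 2017 onward
generate ec = 40 + 5*rnormal() - 3*(treatgr > 0)*(month >= tm(2017m5))

* Event time in months. Factor variables must be non-negative integers,
* so shift it: the first month becomes 0 and the treatment month becomes 4
generate rel = month - tm(2017m5)
generate relpos = rel + 4

* Group-by-period interactions, standard errors clustered on household
regress ec i.treatgr##i.relpos, vce(cluster location)

* Treated-minus-control difference in each period, then plot with CIs
margins relpos, dydx(treatgr)
marginsplot, yline(0) xline(4)
```

In the resulting plot each line is one treated group compared with the control group, and the vertical line at relpos = 4 marks the treatment month.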
    Be aware that it can be very hard to answer a question without sample data. You can use the dataex command for this. Type help dataex at the command line.

    When presenting code or results, please use the code delimiters to format them. Use the # button on the formatting toolbar, between the " (double quote) and <> buttons.



    • #17
      Weiwen,

      Thank you so much for your help. I read the description of 'margins' and 'marginsplot' - it looks like these commands would be of great help to me. However, I did not get how to use them at all... :-) But I will keep studying.



      • #18
        Note I changed the input example dates to have more than one date, but otherwise this translates the R code, which I'm familiar with.

        Dave

        Code:
        * Example generated by -dataex-. To install: ssc install dataex
        clear
        input long location str10 date str5 elec_consumption str2 treatgr
        600001 "2017-01-01" "66.1" "0"
        600003 "2017-01-01" "46.7" "3"
        600004 "2017-01-01" "10.8" "3"
        600005 "2017-01-01" "24" "3"
        600006 "2017-01-01" "42.5" "2"
        600007 "2017-01-01" "7.1" "3"
        600008 "2017-01-01" "41.1" "1"
        600009 "2017-01-01" "41.4" "2"
        600010 "2017-01-01" "96.1" "3"
        600011 "2017-01-01" "22.1" "0"
        600012 "2017-03-01" "31" "1"
        600013 "2017-03-01" "33.1" "2"
        600014 "2017-03-01" "139" "2"
        600015 "2017-03-01" "44.9" "0"
        600016 "2017-03-01" "76.9" "2"
        600017 "2017-03-01" "34" "0"
        600018 "2017-03-01" "4.9" "3"
        600019 "2017-03-01" "27.1" "3"
        600020 "2017-03-01" "50.5" "1"
        600022 "2017-03-01" "47.1" "3"
        end
        
        rename elec_consumption ec
        destring ec, replace force
        destring treatgr, replace force
        collapse (mean) ec, by(treatgr date)
        reshape wide ec, i(date) j(treatgr)
        generate diff1 = ec1 - ec0
        generate diff2 = ec2 - ec0
        generate diff3 = ec3 - ec0
        keep date diff*
        reshape long diff, i(date) j(comparison)
        
        generate date2 = date(date, "YMD", 2019)
        format date2 %td
        
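        set scheme s2color
        * grstyle is community-contributed: ssc install grstyle
        * (grstyle set color also needs the palettes and colrspace packages)
        grstyle init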
        grstyle set imesh
        grstyle set color, opacity(50): p#markfill
        grstyle set color, opacity(50): p#markline
        
        graph twoway (line diff date2 if comparison == 1) ///
            (line diff date2 if comparison == 2) ///
            (line diff date2 if comparison == 3), ///
            yline(0, lcolor(black%50)) ///
            xline(`=date("2017-05-02", "YMD", 2019)', lcolor(black%50)) ///
            ytitle("Electricity consumption per day (kWh/day)") ///
            legend(label(1 "1-0") label(2 "2-0") label(3 "3-0") ///
            ring(0) position(11) col(1))



        • #19
          Dave,

          Thank you for your answer! I ran the code on my data and got a very interesting figure:

          [Attached image: Screen Shot 2019-02-16 at 12.11.35 PM.png]

          But I will try to figure this out.




          • #20
            Well that is fugly! It looks like in your graph you have just three lines, but each line is going all over the place. How many green lines are you expecting for example? Is the input variable unique per row in your dataset?



            • #21
              Dave,

              Well, the figure they got in R has only one line of each color (so, there are 3 lines in the figure). I am sorry I do not quite understand your last question...



              • #22
                Maybe the data need to be sorted by date and group before plotting?
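
A small sketch of why the sort order matters, using made-up values that are deliberately stored out of date order. twoway line connects points in the order they are stored, so unsorted dates make each line jump back and forth.

```stata
* Toy data (hypothetical values), deliberately stored out of date order
clear
input str10 date byte comparison float diff
"2017-03-01" 1  2.5
"2017-01-01" 1 -1.0
"2017-02-01" 1  0.5
"2017-03-01" 2  4.0
"2017-01-01" 2  1.5
"2017-02-01" 2  3.0
end
generate date2 = date(date, "YMD")
format date2 %td

* Sort within comparison by date so each line runs left to right
sort comparison date2
twoway (line diff date2 if comparison == 1) ///
       (line diff date2 if comparison == 2), ///
       yline(0) legend(label(1 "1-0") label(2 "2-0"))
```

Adding the sort option to each line plot, as in line diff date2 if comparison == 1, sort, should have the same effect without reordering the dataset.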



                • #23
                  Dave,

                  Yes, I sorted the data by date2, and the result is much better now! Thanks a lot!
                  [Attached image: Screen Shot 2019-02-16 at 4.55.34 PM.png]


                  I have found almost the same R code but for monthly averages (the code is in red), which might look even better on the graph. I am trying to convert the R commands into Stata using your code, but I am not sure about most of the code lines (my commands are in black). Could you please correct me one more time?



                  [Stata, black]
                  rename lconsum ec
                  destring ec, replace force
                  destring treatgr, replace force

                  [R, red]
                  month_avg <- dat %>%
                  mutate(month_of_year = format(date, "%m"),
                  year = format(date, "%Y")) %>%
                  group_by(month_of_year,year,treatrp) %>%
                  summarise(mean_consumption = mean(elec_consumption),

                  [Stata, black]
                  collapse (mean) ec, by(treatgr date)

                  [R, red]
                  date = first(date)) %>%
                  spread(key=treatrp, value=mean_consumption) %>%
                  mutate(diff1 = `1` - `0`,
                  diff2 = `2` - `0`,
                  diff3 = `3` - `0`) %>%

                  [Stata, black]
                  generate diff1 = ec1 - ec0
                  generate diff2 = ec2 - ec0
                  generate diff3 = ec3 - ec0

                  [R, red]
                  select(date, year, month_of_year, contains("diff")) %>%
                  gather(key=treatment, value=mean_consumption, contains("diff"))

                  [Stata, black]
                  keep date diff*



                  Thank you anyway!



                  • #24
                    I have to say that posting part of the R code mixed into part of the Stata code is not helpful towards understanding. Hopefully you can figure things out.



                    • #25
                      Going back to the original csv file, you need to create a monthly date indicator. Then, collapse over month - we both advised you to collapse over date, but I didn't recognize that your R code is in fact summarizing by month. You may have to debug this, but it should work or get you close:

                      Code:
                      rename elec_consumption ec
                      destring ec, replace force
                      destring treatgr, replace force
                      * Daily date from the "YYYY-MM-DD" string, then its month
                      gen date2 = date(date, "YMD", 2019)
                      gen month = mofd(date2)
                      format month %tm

                      * Monthly mean per group, one column per group
                      collapse (mean) ec, by(treatgr month)
                      reshape wide ec, i(month) j(treatgr)
                      generate diff1 = ec1 - ec0
                      generate diff2 = ec2 - ec0
                      generate diff3 = ec3 - ec0
                      keep month diff*
                      * Back to long form: one row per month and comparison
                      reshape long diff, i(month) j(comparison)
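
A possible next step, sketched under the assumption that the result is in long form with the variables month, comparison, and diff: plot the monthly differences with a vertical line at the treatment month (May 2017). The values generated here are made up purely so the example runs on its own.

```stata
* Toy long-form data: 3 comparisons x 6 months (values are hypothetical)
clear
set obs 18
generate byte comparison = 1 + mod(_n - 1, 3)
generate month = tm(2017m1) + floor((_n - 1)/3)
format month %tm
generate diff = cond(month >= tm(2017m5), -2, 0) + comparison/10

* Sort so each line runs left to right, then plot one line per comparison
sort comparison month
twoway (line diff month if comparison == 1) ///
       (line diff month if comparison == 2) ///
       (line diff month if comparison == 3), ///
       yline(0) xline(`=tm(2017m5)') ///
       ytitle("Difference in mean consumption (kWh/day)") ///
       legend(label(1 "1-0") label(2 "2-0") label(3 "3-0"))
```

Because month is a %tm variable, the reference line is placed with tm(2017m5) rather than with a daily date.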



                      • #26
                        Weiwen and Dave,

                        Thank you for your replies!

