  • Difference in Differences with multiple events

    Hey all,

    I'm trying to apply difference-in-differences in Stata to traders in stock markets, comparing the period before they use robot investing with the period after. My main dependent variable is portfolio performance, so I want to see how their portfolios perform before and after using robot investing. What concerns me is that most of the examples I saw on the internet use a calendar-time variable (for example, before 1994 vs. after 1994), whereas in my case I want to compare before switching to the robot with after switching to the robot.

    I have a variable for switching = 1 if investor switched to robot, 0 otherwise
    and a variable that tells me whether this trade was made by a robot or human. 1 = robot, 0 = human.

    Can someone please tell me how I can do the difference-in-differences in this case? Thanks.

  • #2
    This is a straightforward enough problem--diff-in-diff can be easily done with the treatment (robo-investing in this case) turning on at different times. But to make sure my suggestions are relevant, could you please post an example of your data, using the -dataex- command? (See -help dataex- for information).



    • #3
      Originally posted by Kye Lippold
      This is a straightforward enough problem--diff-in-diff can be easily done with the treatment (robo-investing in this case) turning on at different times. But to make sure my suggestions are relevant, could you please post an example of your data, using the -dataex- command? (See -help dataex- for information).
      Thank You Kye for responding,

      My data looks like this:

      Code:
      * Example generated by -dataex-. To install: ssc install dataex
      clear
      input InvestorId StockId Switching AutomaticTrading Performance
      20             32390   0                0              .
      20             32411   0                0               .09867152
      20             34733   0                0               .05027843
      20             37041   0                0               .5676744
      20             38587   0                0               .14006527
      20             40238   0                0              -.13145956
      20             42419   0                0               .03951561
      20             43284   0                0               .0186967
      20             44037   1                1              -.0415748
      20             44060   0                0              1.1997385
      20             44278   0                0              -.7503465
      20             44435   0                0               .8813246
      20             44953   0                0                .3260774
      20             47458   0                0                .27319214
      20             48892   0                0                .25726753
      20             49943   0                0                .2555361
      20             52181   0                0                 .252478
      20             54990   1                1                 .23942897
      20             55036   0                1                 .23070805
      20             55060   0                1                 .2226026
      20             55071   0                1                 .26550338
      20             55075   0                1                 .20598054
      end
      label values AutomaticTrading autotrade
      label def autotrade 0 "Human Trading", modify
      label def autotrade 1 "Robot Trading", modify
      Please note that Switching is a dummy: 1 = switched to robot, 0 = didn't switch. AutomaticTrading = 1 means the trade was made by the robot, and 0 means it was a human trade.
      Last edited by David HajHooj; 16 May 2020, 17:20.



      • #4
        Great, the example is very helpful. I see you have investors trading multiple stocks, but no time variable for when the trade took place. You will need some sort of time ordering to be able to talk about results "after" switching to robo-investing.

        Do you have any way to get a time or date for each trade? Or can we assume that stock IDs are assigned in order, so that a trade with a higher ID always took place after a trade with a lower ID? This information will determine how to set up the method.



        • #5
          Thank you Kye. Yes, I have the time of each trade and they are in order. I have a variable that holds the exact time of each trade, and based on that time variable I generated tradeId, where tradeId is 1, 2, 3, 4, ...: 1 means the first trade, and so on.



          • #6
            Ok, great. Then to set up a diff-in-diff with multiple events, you would first merge the time variable on to the dataset (I will assume you name it time). You will want to have observations across multiple investors for each time period, so defining time at an aggregated level (such as day) would be better than using a very short period (such as second).
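
            As a rough sketch of that aggregation step, assuming the exact trade time is stored as a Stata datetime (%tc) variable, here hypothetically named tradetime, a day-level time variable could be built as follows. If the timestamp is stored as a string instead, it would first need converting with clock().

            ```stata
            * tradetime is a hypothetical name for the %tc timestamp of each trade
            gen time = dofc(tradetime)      // daily date extracted from the clock value
            format time %td

            * check how many distinct investors are observed on each day
            bysort time InvestorId: gen byte first_obs = (_n == 1)
            egen n_investors = total(first_obs), by(time)
            summarize n_investors
            ```

            If many days contain only one investor, the time fixed effects absorb most of the variation, and a coarser period (week or month) may be the better choice.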

            You can then run the following code. (I assume from your other posts that you are familiar with reghdfe already).

            Code:
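            ssc install reghdfe // if needed
            * reghdfe does not need the data to be xtset; with several trades per investor
            * per period, -xtset InvestorId time- would stop with "repeated time values within panel"
            rename AutomaticTrading treated
            * variable names are case-sensitive in Stata, so use InvestorId exactly as in the data
            reghdfe Performance treated, absorb(InvestorId time) cluster(InvestorId)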
            In this case, the use of robo-investing is your indicator for being "treated". This indicator can turn on or off for each trade, but the coefficient on the treated variable will tell you the average increase in performance when trades are done while robo-investing, relative to what is expected for a non-robo trade by that investor at that time period (i.e. including investor and time fixed effects).
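
            Written out, this regression corresponds to a two-way fixed effects model along the following lines. The notation is an assumption introduced here for illustration: j indexes trades, i investors, and t time periods.

            ```latex
            \text{Performance}_{jit} = \beta \, \text{treated}_{jit} + \alpha_i + \gamma_t + \varepsilon_{jit}
            ```

            Here \alpha_i is the investor fixed effect, \gamma_t is the time fixed effect, and \beta is the coefficient on treated that serves as the diff-in-diff estimate.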

            The "Switching" variable could also be useful if you want to see how the effects of robo-trading change over time... but that is going beyond the basic diff-in-diff.
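
            One possible way to do that is an event-study style regression. The following is only a sketch under stated assumptions: time is the day-level variable described above, the first trade with Switching == 1 is taken as the investor's switch date, and the 5-day window and the names first_switch, rel_time, rel_bin and ev are arbitrary choices made for illustration.

            ```stata
            * date of each investor's first switch (missing if the investor never switches)
            egen first_switch = min(cond(Switching == 1, time, .)), by(InvestorId)
            gen rel_time = time - first_switch

            * bin event time into [-5, 5]; min() ignores missing values, hence the if qualifier
            gen rel_bin = max(-5, min(5, rel_time)) if !missing(rel_time)

            * factor variables must be nonnegative, so shift to 0..10 (4 = period before the switch)
            gen ev = rel_bin + 5
            replace ev = 4 if missing(ev)   // never-switchers are placed in the base category

            reghdfe Performance ib4.ev, absorb(InvestorId time) cluster(InvestorId)
            ```

            The coefficients for ev = 0 to 3 are the pre-switch periods (a check on pre-trends), and those for ev = 5 to 10 trace out performance after the switch, all relative to the period just before switching. Because robo-trading can turn on and off for the same investor in these data, anchoring on the first switch is a simplification.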

            Hope that helps!



            • #7
              Thank you so much Kye for your help! I appreciate it



              • #8
                Hello Kye,

                If someone ran that regression, how can we tell what the effect is before versus after using robot investing?

                Thanks



                • #9
                  Hi Said-- If I understand your question correctly, you would look at the coefficient on the treated variable. Here is an example using the Grunfeld data (of several US companies). We will pretend that there was a (fake) treatment that applied from 1945 forward to company 1.

                  Code:
                  webuse grunfeld, clear
                  xtset company year
                  gen treated = year>=1945 & company==1
                  reghdfe kstock treated, absorb(company year) cluster(company)
                  The coefficient on treated tells us that the capital stock (kstock) of company 1 was 748 units higher than expected (for that company and year) during the period of treatment. So that is the DiD estimate of the effect of the treatment.
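
                  To see where the "before versus after" comparison sits inside that coefficient, the same estimate can be sketched by hand from four cell means. This relies on the Grunfeld panel being balanced with a single treatment date, in which case the hand calculation should match the reghdfe coefficient; post, unit1 and the scalar names are introduced here for illustration only.

                  ```stata
                  webuse grunfeld, clear
                  gen byte post  = year >= 1945
                  gen byte unit1 = company == 1

                  * mean of kstock in each of the four (treated group x period) cells
                  foreach u in 0 1 {
                      foreach p in 0 1 {
                          summarize kstock if unit1 == `u' & post == `p', meanonly
                          scalar m`u'`p' = r(mean)
                      }
                  }

                  * (company 1: after - before) minus (other companies: after - before)
                  display "DiD by hand: " (m11 - m10) - (m01 - m00)
                  ```

                  The first difference is the before/after change for the treated company; subtracting the change for the untreated companies removes what would have happened over time anyway.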



                  • #10
                    Thank you Kye!
