  • How to use ttest for DiD method

    Hi all, I have a dataset with two variables: did_control, calculated as daily_income_2021 minus daily_income_2023 in the control group, and did_treatment, calculated as daily_income_2021 minus daily_income_2023 in the treatment group, as follows:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(did_control did_treatment)
             .          .
             .          .
             .          .
             .          .
     -6027.396          .
             .          .
             .          .
             .          .
       55068.5          .
             .          .
             0          .
             .          .
             0          .
             .          .
    -10958.904          .
             .          .
     16438.355          .
             .          .
             .          .
     107123.28          .
             .          .
             .          .
             .          0
             .          .
      41534.25          .
             .          .
             .          0
             .          .
             .          .
             .          .
     -46575.34          .
             .          .
      39726.02          .
             .          .
     23835.615          .
             .          .
             .          .
             .          0
             .          .
             0          .
             .          .
     -8767.123          .
             .          .
      391835.6          .
             .          .
      8383.561          .
             .          .
    -4383.5625          .
             .          .
             .          .
             .          .
     -32876.71          .
             .          .
             0          .
             .          .
             .          .
             .  -52054.79
             .          .
      47671.24          .
             .          .
             .          0
             .          .
             .          .
             .          .
             .          .
             . -38356.156
             .          .
             .          .
             .          .
             .   66904.11
             .          .
             .          0
             .          .
     142465.75          .
             .          .
     -32876.71          .
             .          .
             .  109589.04
             .          .
     -32876.71          .
             .          .
             .          .
    -141835.61          .
             .          .
             0          .
             .          .
             .  -60273.97
             .          .
     -6849.315          .
             .          .
     -7671.234          .
             .          .
    -10958.904          .
             .          .
     -80273.98          .
             .          .
     66301.375          .
             .          .
             .          .
      14794.52          .
    end
    Now I would like to test the DID effect using ttest in Stata, but I get errors with the following code:
    Code:
    // Initialize row counter
        local row = 2
    
        // Define the list of variables to analyze
        // Run T-Test for the DID effect
        di `row'
        di "DID Effect"
        ttest did_control = did_treatment if did_treatment < . & did_control < . // Perform DID t-test
        ret list
    
        local dif_m = `r(mu_2)' - `r(mu_1)' // difference of mean
        local nhan = "DID Effect"
    
        // Determine the significance star
        local star
        if r(p) < 0.01 {
            local star = "***"
        }
        else if r(p) < 0.05 {
            local star = "**"
        }
        else if r(p) < 0.1 {
            local star = "*"
        }
        else {
            local star = " "
        }
    
        // Format and display the p-value with the significance star 
        local p : display %5.2f `r(p)' "`star'" // add star symbol for significance 
        di "`p'"
    
        // Export results to Excel for the DID effect
        putexcel A`row' = ("DID Effect")
        putexcel B`row' = ("`nhan'")
        putexcel C`row' = `r(mu_2)', nformat(number_d2) 
        putexcel D`row' = `r(mu_1)', nformat(number_d2)
        putexcel E`row' = `dif_m', nformat(number_d2)
    
        // Add bold formatting for significant p-values for DID effect
        if strpos("`p'", "***") > 0 {
            putexcel F`row' = "`p'", bold
        }
        else {
            putexcel F`row' = "`p'"
        }
    
        putexcel G`row' = `r(N_2)'
        putexcel H`row' = `r(N_1)'
    I would appreciate any suggestions to help me find a solution. Thanks!

  • #2
    I don't know what you mean by "test DID method by using ttest in stata." Also, it isn't very helpful to say "I have some errors in these codes" while providing no clue as to what those errors might be, or how they manifest themselves as unexpected results or error messages.

    Nonetheless, from perusing the code, it appears that you want to do a t-test comparing the values of did_control and did_treatment. The problem is that you are using the paired t-test syntax on unpaired data. So you need to reorganize your data to long layout and then use the unpaired t-test.
    Code:
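    // Create an observation identifier to serve as the reshape key
    gen `c(obs_t)' obs_no = _n
    // Stack did_control and did_treatment into one variable, did; arm holds "_control" or "_treatment"
    reshape long did, i(obs_no) j(arm) string
    // Unpaired (two-sample) t-test comparing did across the two arms
    ttest did, by(arm)
    return list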
    The rest of your code appears to be exporting the results to a spreadsheet and I think it will not require any changes.
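    One detail may be worth checking before reusing the export code. With -ttest, by()-, groups are numbered in the sort order of the by-variable, so after the reshape above r(mu_1) and r(N_1) should refer to "_control" and r(mu_2) and r(N_2) to "_treatment". The sketch below uses a small subset of the example data and copies the results into locals right after -ttest-, as a precaution in case a later command overwrites r(). The local names (mu_control, mu_treat, n_control, n_treat, p_wide) are illustrative assumptions, not part of the original code. It also runs the wide-layout form with the -unpaired- option as a cross-check, since that should give the same test without reshaping.

```stata
clear
input float(did_control did_treatment)
 -6027.396          .
   55068.5          .
         0          .
-10958.904          .
         .          0
         .  -52054.79
         .   66904.11
         .  109589.04
end

// Cross-check: unpaired test directly on the wide layout
ttest did_control == did_treatment, unpaired
local p_wide = r(p)

// Long layout, as suggested above
gen `c(obs_t)' obs_no = _n
reshape long did, i(obs_no) j(arm) string
ttest did, by(arm)

// Copy results into locals immediately; group 1 is "_control", group 2 is "_treatment"
local mu_control = r(mu_1)
local mu_treat   = r(mu_2)
local n_control  = r(N_1)
local n_treat    = r(N_2)
local p          = r(p)
local dif_m      = `mu_treat' - `mu_control'

display "control mean   = " %12.2f `mu_control' "  (N = `n_control')"
display "treatment mean = " %12.2f `mu_treat' "  (N = `n_treat')"
display "difference     = " %12.2f `dif_m'
display "p (long)       = " %6.4f `p'
display "p (wide)       = " %6.4f `p_wide'
```

    The locals can then stand in for the `r(...)' references in the -putexcel- lines, so the export does not depend on r() still holding the -ttest- results at that point.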

    Added: On reflection, I note that your original paired -ttest- command includes an -if- qualifier requiring that both did_control and did_treatment be non-missing. In a sense, this is completely unnecessary because the paired -ttest- command will automatically exclude any observations where either of the variables being compared is missing. But it dawns on me that perhaps you really do intend for this to be paired data. In that case, the problem is not with the code you show but with whatever code created the data in the first place. In your example data, there aren't any observations that have non-missing values for both of the did_* variables. Perhaps in your full data set there are some, but the example, if representative, suggests they will be few and far between. In any case, if it is your expectation that you will be comparing paired values of did_control and did_treatment, then there is nothing wrong with your code--it is just that your data doesn't contain any pairs. So you will have to review how the data set got created and figure out why you didn't get the paired results you were expecting.
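    A quick way to see whether the data contain any pairs at all is to count the observations where both variables are non-missing. The following is a minimal sketch on a few rows in the same shape as the example data; the local name n_pairs is an illustrative assumption.

```stata
clear
input float(did_control did_treatment)
 -6027.396          .
   55068.5          .
         0          .
         .          0
         .  -52054.79
         .          .
end

// missing(a, b) is true if either argument is missing,
// so its negation selects rows where both are observed
count if !missing(did_control, did_treatment)
local n_pairs = r(N)

// For comparison: how many rows have each variable on its own
count if !missing(did_control)
count if !missing(did_treatment)

if `n_pairs' == 0 {
    display "No observation has both did_control and did_treatment; a paired t-test has nothing to compare."
}
```

    Running the same -count- on the full data set would show how many pairs a paired -ttest- could actually use.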
    Last edited by Clyde Schechter; 21 Sep 2023, 10:00.
