
  • Synthetic Control Method: looking for a better approach

    Hi experts,

    I am trying to estimate the impact of the 1999 coup on the economy of Ivory Coast using the synthetic control method.

    I obtained results with the following commands, but the synthetic control, which is built from the other countries in the sample, does not fit Ivory Coast (the treated unit).

    Code:
    tsset country_id year
    Code:
    synth rgdpe pop cn rconna csh_i csh_x, trunit(19) trperiod(1999) keep("ivory_coast.dta") replace fig

    Does anyone know a better specification? I assume there is room to improve my command.

    Thank you.

  • #2
    The problem here is that you need more variables. I don't think I've ever seen a fit so horrific (nothing against you at all, of course, but this is pretty shocking). You don't give details about what your outcome is, so I presume we're working with GDP per capita; fortunately for me, that's all I need. I use Africa as the donor pool. Note that I blindly fed this into my algorithm, so you'll need to modify it for your own needs, as I presume you used donor restrictions.




    Effect of the coup as estimated by my synthetic learner, scul (ssc inst scul, replace)
    Code:
    cd E:\Tests\EXs
    
    
    import delim "https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv", clear
    
    drop countrycode
    
    rename alpha3 countrycode
    
    tempfile codes
    
    sa `codes', replace
    
    
    cls
    
    
    u "https://www.rug.nl/ggdc/historicaldevelopment/maddison/data/mpd2020.dta", clear
    
    replace country = "Tanzania" if strpos(country,"U.R.")
    
    keep if inrange(year,1970,2018)
    
    
    
    cls
    
    merge m:1 countrycode using `codes', keepusing(region) keep(3) nogen
    
    
    keep if region=="Africa"
    
    egen id = group(country), label(country)
    
    
    qui levelsof id if gdp ==., loc(drops) sep(",")
    
    cap drop if inlist(id,`drops')
    
    
    
    loc int_time = 1999
    
    xtset id year, y
    
    local lbl: value label `r(panelvar)'
    
    
    loc unit ="Côte d'Ivoire":`lbl'
    
    g treat = cond(`r(panelvar)'==`unit' & `r(timevar)' >=`int_time',1,0)
    
    
    labvars gdp treat "GDP per Capita" "Coup"
    
    
    cls
    
    set tr off
    
    scul gdp, ///
        treated(treat) obscol(black) cfcol(red) /// 
        legpos(10) ///
        scheme(white_tableau) ///
        ahead(4) cv(adaptive) //
    Effect of the coup as estimated by Robust Synthetic Controls (the source code can be found here)
    Code:
    cd "E:\Robust SCM"
    clear *
    
    loc timevar year
    
    
    import delim "https://raw.githubusercontent.com/lukes/ISO-3166-Countries-with-Regional-Codes/master/all/all.csv", clear
    
    drop countrycode
    
    rename alpha3 countrycode
    
    tempfile codes
    
    sa `codes', replace
    
    
    cls
    
    
    u "https://www.rug.nl/ggdc/historicaldevelopment/maddison/data/mpd2020.dta", clear
    
    replace country = "Tanzania" if strpos(country,"U.R.")
    
    keep if inrange(year,1970,2018)
    
    
    
    cls
    
    merge m:1 countrycode using `codes', keepusing(region) keep(3) nogen
    
    
    keep if region=="Africa"
    
    
    qui su `timevar'
    
    loc mintime = r(min)
    
    loc maxtime = r(max)
    cls
    
    
    loc int_time = 1999
    
    g rel = `timevar' - `int_time'
    recast int rel
    
    qui su rel
    
    loc min = `r(min)'
    
    loc max = `r(max)'
    
    sa rsc, replace
    
    cls
    
    loc treated "Côte d'Ivoire"
    
    loc outcome gdppc
    
    loc panel country
    cls
    
    
    python:
    import sys, os
    from sfi import Macro
    import numpy as np
    import pandas as pd
    import copy
    
    from tslib.src import tsUtils
    from tslib.src.synthcontrol.syntheticControl import RobustSyntheticControl
    from tslib.tests import testdata
    
    filename = 'rsc.dta'
    
    df = pd.read_stata(filename)
    
    yvar = Macro.getLocal('outcome')
    
    panelvar = Macro.getLocal('panel')
    
    minrel = df['rel'].min()
    maxrel = df['rel'].max()+1
    
    pivot = df.pivot_table(values=yvar, index=panelvar, columns=['rel'])
    
    dfProp99 = pd.DataFrame(pivot.to_records())
    
    allColumns = dfProp99.columns.values
    
    pd.set_option('display.max_columns', 10)
    allColumns
    
    states = list(np.unique(dfProp99[panelvar]))
    years = np.delete(allColumns, [0])
    treat = Macro.getLocal('treated')
    states.remove(treat)
    donors = states
    
    yearStart = minrel
    yearTrainEnd = 0
    yearTestEnd = maxrel
    
    p = 1
    
    
    trainingYears = []
    for i in range(yearStart, yearTrainEnd, 1):
     trainingYears.append(str(i))
    
    testYears = []
    for i in range(yearTrainEnd, yearTestEnd, 1):
     testYears.append(str(i))
    
    trainDataMasterDict = {}
    trainDataDict = {}
    testDataDict = {}
    for key in donors:
     series = dfProp99.loc[dfProp99[panelvar] == key]
    
     trainDataMasterDict.update({key: series[trainingYears].values[0]})
    
     # randomly hide training data (with p = 1 above, every value is kept)
     (trainData, pObservation) = tsUtils.randomlyHideValues(copy.deepcopy(trainDataMasterDict[key]), p)
     trainDataDict.update({key: trainData})
     testDataDict.update({key: series[testYears].values[0]})
    series = dfProp99[dfProp99[panelvar] == treat]
    trainDataMasterDict.update({treat: series[trainingYears].values[0]})
    trainDataDict.update({treat: series[trainingYears].values[0]})
    testDataDict.update({treat: series[testYears].values[0]})
    
    trainMasterDF = pd.DataFrame(data=trainDataMasterDict)
    trainDF = pd.DataFrame(data=trainDataDict)
    testDF = pd.DataFrame(data=testDataDict)
    
    
    # number of singular values to keep when denoising the donor matrix
    singvals = 5
    rscModel = RobustSyntheticControl(treat, singvals, len(trainDF), probObservation=1.0, modelType='svd', svdMethod='numpy', otherSeriesKeysArray=donors)
    rscModel.fit(trainDF)
    denoisedDF = rscModel.model.denoisedDF()
    
    predictions = []
    predictions = np.dot(testDF[donors], rscModel.model.weights)
    actual = dfProp99.loc[dfProp99[panelvar] == treat]
    actual = actual.drop(panelvar, axis=1)
    actual = actual.iloc[0]
    model_fit = np.dot(trainDF[donors][:], rscModel.model.weights)
    
    #rscModel.model.weights
    
    combined = np.concatenate((model_fit, predictions))
    
    data = np.vstack([actual,combined])
    
    data = np.swapaxes(data,0,1)
    
    np.savetxt(treat+".csv", data, delimiter=",")
    
    
    end
    
    import delimited "E:\Robust SCM\\`treated'.csv", clear
    
    egen `timevar' = seq(), f(`mintime') t(`maxtime')
    
    g relative = `timevar' -`int_time'
    
    rename (v1 v2) (real rsc_cf)
    
    tsset `timevar', y
    
    g diff = real-rsc
    
    su diff if rel >=0, mean
    
    loc ATT: di %6.4g r(mean)
    
    cls
    
    twoway (tsline real, lcolor(black)) (tsline rsc_cf, lcolor(red)), ///
    legend(order(1 "`treated'" 2 "RSC `treated'") ///
    ring(0) pos(7)) xli(`int_time', lcol(black) lpat(solid)) caption("ATT = `ATT'")
    These analyses give us the following results:
    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input float(real rsc_cf) int year double cf
         2882 2920.9604 1970  2846.353258120823
         2874 2887.5334 1971 2923.5133928417526
         2887  2928.188 1972 2878.9391813514367
         2974  2909.834 1973 2896.3779467557015
         2931  2979.416 1974  3004.590031139447
         2812  2961.905 1975 2949.3854282743614
         3017 3007.2786 1976 2961.4707128916198
         3033  3021.058 1977 3009.6522901657336
         3202  3073.406 1978 3057.5960382509857
         3139 3133.7185 1979 3217.6590070162347
         3253  3149.622 1980 3268.5796382981243
         3242  3035.863 1981  3174.916195159876
         3131  3057.817 1982 3065.1707104283987
         2901 2959.0076 1983  2852.041196794824
         2703 2879.2014 1984 2742.7000272299065
         2735  2843.359 1985  2668.577686104908
         2718  2691.469 1986 2582.1101575346574
         2579  2550.323 1987  2529.830060447107
         2444 2481.6804 1988  2507.841487754652
         2343  2363.298 1989 2436.9024522570035
         2083 2088.6516 1990 2165.4991594684416
     2036.299 2096.5603 1991 2130.8059100235287
    2011.5695 2010.6012 1992  2067.135569782661
    1983.2336 1924.7153 1993  1969.408309307442
    1951.5408 1932.8026 1994 1969.2926478043707
    2035.1682  1993.308 1995 2109.9271527082583
    2179.0356 2130.1458 1996  2185.274497307327
    2288.8086 2334.8584 1997  2270.673367984331
     2393.499 2432.8628 1998  2319.930586796091
    2417.4575 2572.0984 1999 2389.4413419577095
    2352.8562  2728.975 2000   2503.53207721297
    2342.8274  3179.551 2001 2592.1418188162916
    2298.4565  3459.982 2002 2745.3197269731763
    2263.0925  3704.734 2003 2792.2450248864916
     2280.788  4123.507 2004  3012.188670405938
    2312.8425 4444.8086 2005 3205.8730267348274
     2343.927  4684.634 2006 3473.2608896611346
    2380.1155  5009.807 2007  3682.060837971817
    2435.2705  5574.875 2008 3860.6373500004765
     2508.822  5753.836 2009 3854.1356708218136
     2555.016  5793.815 2010  4057.835989237853
         2444  5066.266 2011 4256.0229237804915
         2636  6061.083 2012  4368.288410803053
         2823  5085.256 2013  4471.934358302735
         3011  4898.148 2014  4600.531908240408
         3218 4703.7715 2015  4688.217694802218
         3395  4463.973 2016  4666.467705151712
     3558.838  4588.723 2017  4685.459459893192
    3713.6436  4531.171 2018 4671.3934556806535
    end
    format %ty year
    
    
    tsset year, y
    
    cls
    
    
    twoway (tsline real, lcolor(black) lwidth(thick)) ///
        (tsline rsc, lcolor("255 130 0") lwidth(medium) lpat(dash)) ///
        (tsline cf , lcolor("0 154 68") lwidth(medium) lpat(dash)), ///
        legend(ring(0) pos(9) ///
        order(1 "Real Ivory Coast" 2 "RSC Ivory Coast" 3 "SCUL Ivory Coast") ///
        region(fcol(none))) ///
        tli(1999, lcol(blue) lpat(solid) lwidth(thick)) ///
        yti("GDP per Capita") tti(Year) xsize(4) ysize(4)
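For anyone who wants numbers alongside the plot, here is a minimal Python sketch that summarizes the listing above. It assumes treatment starts in 1999, as in the Stata code, and reports the pre-1999 root mean squared prediction error and the average post-1999 gap (real minus counterfactual) for each estimator. The helper names `rmspe` and `mean_gap` are my own and are not part of scul or tslib.

```python
import math

# (year, real, rsc_cf, scul_cf) copied from the dataex listing above
rows = [
    (1970, 2882, 2920.9604, 2846.353258120823),
    (1971, 2874, 2887.5334, 2923.5133928417526),
    (1972, 2887, 2928.188, 2878.9391813514367),
    (1973, 2974, 2909.834, 2896.3779467557015),
    (1974, 2931, 2979.416, 3004.590031139447),
    (1975, 2812, 2961.905, 2949.3854282743614),
    (1976, 3017, 3007.2786, 2961.4707128916198),
    (1977, 3033, 3021.058, 3009.6522901657336),
    (1978, 3202, 3073.406, 3057.5960382509857),
    (1979, 3139, 3133.7185, 3217.6590070162347),
    (1980, 3253, 3149.622, 3268.5796382981243),
    (1981, 3242, 3035.863, 3174.916195159876),
    (1982, 3131, 3057.817, 3065.1707104283987),
    (1983, 2901, 2959.0076, 2852.041196794824),
    (1984, 2703, 2879.2014, 2742.7000272299065),
    (1985, 2735, 2843.359, 2668.577686104908),
    (1986, 2718, 2691.469, 2582.1101575346574),
    (1987, 2579, 2550.323, 2529.830060447107),
    (1988, 2444, 2481.6804, 2507.841487754652),
    (1989, 2343, 2363.298, 2436.9024522570035),
    (1990, 2083, 2088.6516, 2165.4991594684416),
    (1991, 2036.299, 2096.5603, 2130.8059100235287),
    (1992, 2011.5695, 2010.6012, 2067.135569782661),
    (1993, 1983.2336, 1924.7153, 1969.408309307442),
    (1994, 1951.5408, 1932.8026, 1969.2926478043707),
    (1995, 2035.1682, 1993.308, 2109.9271527082583),
    (1996, 2179.0356, 2130.1458, 2185.274497307327),
    (1997, 2288.8086, 2334.8584, 2270.673367984331),
    (1998, 2393.499, 2432.8628, 2319.930586796091),
    (1999, 2417.4575, 2572.0984, 2389.4413419577095),
    (2000, 2352.8562, 2728.975, 2503.53207721297),
    (2001, 2342.8274, 3179.551, 2592.1418188162916),
    (2002, 2298.4565, 3459.982, 2745.3197269731763),
    (2003, 2263.0925, 3704.734, 2792.2450248864916),
    (2004, 2280.788, 4123.507, 3012.188670405938),
    (2005, 2312.8425, 4444.8086, 3205.8730267348274),
    (2006, 2343.927, 4684.634, 3473.2608896611346),
    (2007, 2380.1155, 5009.807, 3682.060837971817),
    (2008, 2435.2705, 5574.875, 3860.6373500004765),
    (2009, 2508.822, 5753.836, 3854.1356708218136),
    (2010, 2555.016, 5793.815, 4057.835989237853),
    (2011, 2444, 5066.266, 4256.0229237804915),
    (2012, 2636, 6061.083, 4368.288410803053),
    (2013, 2823, 5085.256, 4471.934358302735),
    (2014, 3011, 4898.148, 4600.531908240408),
    (2015, 3218, 4703.7715, 4688.217694802218),
    (2016, 3395, 4463.973, 4666.467705151712),
    (2017, 3558.838, 4588.723, 4685.459459893192),
    (2018, 3713.6436, 4531.171, 4671.3934556806535),
]

TREATMENT_YEAR = 1999  # assumption: 1999 counts as treated, as in the Stata code


def rmspe(pairs):
    """Root mean squared gap between observed and counterfactual values."""
    return math.sqrt(sum((obs - cf) ** 2 for obs, cf in pairs) / len(pairs))


def mean_gap(pairs):
    """Average of observed minus counterfactual (the ATT when taken post-treatment)."""
    return sum(obs - cf for obs, cf in pairs) / len(pairs)


pre = [r for r in rows if r[0] < TREATMENT_YEAR]
post = [r for r in rows if r[0] >= TREATMENT_YEAR]

rmspe_rsc = rmspe([(r[1], r[2]) for r in pre])
rmspe_scul = rmspe([(r[1], r[3]) for r in pre])
att_rsc = mean_gap([(r[1], r[2]) for r in post])
att_scul = mean_gap([(r[1], r[3]) for r in post])

print(f"RSC : pre-period RMSPE = {rmspe_rsc:.1f}, post-period mean gap = {att_rsc:.1f}")
print(f"SCUL: pre-period RMSPE = {rmspe_scul:.1f}, post-period mean gap = {att_scul:.1f}")
```

A small pre-period RMSPE relative to the level of GDP per capita is what makes the post-period gap worth interpreting. Both counterfactuals sit above the observed series after 1999, so both mean gaps come out negative.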
    Note that SCUL and RSC are big-boy commands that are conceptually and methodologically much more complicated than standard SCM, so use them with caution. That said, these commands (in Stata and in Python) will do what you are after, and they are much easier to use.
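To give a feel for what RSC does conceptually, here is a simplified NumPy sketch run on simulated data. It is not the tslib implementation. It assumes the core idea is to denoise the pre-treatment donor matrix by keeping only its largest singular values (the role `singvals` plays above), regress the treated unit on the denoised donors, and then apply those weights to the donors over the full period. The function name `robust_sc_weights` and all simulated quantities are hypothetical.

```python
import numpy as np


def robust_sc_weights(donors_pre, treated_pre, rank):
    """Denoise donors by truncated SVD, then fit least-squares weights.

    donors_pre  : (T_pre, n_donors) array of donor outcomes before treatment
    treated_pre : (T_pre,) array of treated-unit outcomes before treatment
    rank        : number of singular values to keep
    """
    U, s, Vt = np.linalg.svd(donors_pre, full_matrices=False)
    s_kept = np.where(np.arange(s.size) < rank, s, 0.0)  # zero out small singular values
    denoised = (U * s_kept) @ Vt
    weights, *_ = np.linalg.lstsq(denoised, treated_pre, rcond=None)
    return weights


# Simulated panel: two latent trends drive every unit (an assumption for illustration)
rng = np.random.default_rng(0)
T_pre, T_post, n_donors = 29, 20, 12
factors = rng.normal(size=(T_pre + T_post, 2)).cumsum(axis=0)
loadings = rng.normal(size=(2, n_donors))
signal = factors @ loadings
donors = signal + rng.normal(scale=0.1, size=signal.shape)

true_w = np.zeros(n_donors)
true_w[:3] = [0.5, 0.3, 0.2]
treated = signal @ true_w + rng.normal(scale=0.1, size=T_pre + T_post)

weights = robust_sc_weights(donors[:T_pre], treated[:T_pre], rank=2)
counterfactual = donors @ weights  # weights applied to the raw donors, all periods

pre_rmse = np.sqrt(np.mean((treated[:T_pre] - counterfactual[:T_pre]) ** 2))
print(f"pre-period RMSE: {pre_rmse:.3f}")
```

Unlike standard SCM, the weights here are not constrained to be non-negative or to sum to one, which is one reason this kind of estimator can fit the pre-period more closely and also why it deserves caution.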



    • #3
      Thank you so much, Jared Greathouse, for your great help!
