
  • Issue with coding

    Dear All,

    I am trying to create a command reiv_manual_multi using the code below:

    Code:
    capture program drop reiv_manual_multi
    program define reiv_manual_multi, rclass
        version 17.0
        syntax varlist(min=4) [if] [in], Endog(varlist) Instr1(varlist) Instr2(varlist) Panel(varname)
    
        // Parse variables
        tokenize "`varlist'"
        local depvar `1'
        local exogvars : list varlist - `depvar'
        gettoken y1 y2 : endog
    
        // First-stage regressions
        reg `y1' `instr1' `exogvars' i.time `if' `in'
        predict double f`y1', xb
        scalar r2_y1 = e(r2)
    
        reg `y2' `instr2' `exogvars' i.time `if' `in'
        predict double f`y2', xb
        scalar r2_y2 = e(r2)
    
        // Estimate RE model to get variance components
        xtset `panel'
        xtreg `depvar' `exogvars' i.time, re
        matrix V = e(V)
        scalar sig_u2 = V[1,1]
        scalar sig_e2 = V[2,1]
    
        // Compute individual-specific means and theta
        gen byte one = 1
        egen Ti = total(one), by(`panel')
        gen double theta_i = 1 - (sig_e2 / sqrt(Ti * sig_u2 + sig_e2))
    
        // Quasi-demean variables
        foreach var in `depvar' `exogvars' {
            egen mean_`var' = mean(`var'), by(`panel')
            gen double `var'_theta = `var' - theta_i * mean_`var'
        }
    
        foreach var in `y1' `y2' {
            egen mean_f`var' = mean(f`var'), by(`panel')
            gen double f`var'_theta = f`var' - theta_i * mean_f`var'
        }
    
        // Time fixed effects
        quietly levelsof time, local(tlevels)
        foreach t of local tlevels {
            gen byte time_`t' = (time == `t')
            egen mean_time_`t' = mean(time_`t'), by(`panel')
            gen double time_`t'_theta = time_`t' - theta_i * mean_time_`t'
        }
    
        // Build transformed variable lists
        local ftheta f`y1'_theta f`y2'_theta
        local exog_theta
        foreach var in `exogvars' {
            local exog_theta `exog_theta' `var'_theta
        }
        local time_theta
        foreach t of local tlevels {
            local time_theta `time_theta' time_`t'_theta
        }
    
        // Second-stage regression
        reg `depvar'_theta `ftheta' `exog_theta' `time_theta'
        predict double uhat, residuals
    
        // Clustered robust SEs
        preserve
        keep `panel' uhat
        tempfile clust
        save `clust', replace
        restore
        mkmat `ftheta' `exog_theta' `time_theta', matrix(X)
        matrix XtX = X'*X
        matrix Vcluster = J(colsof(X), colsof(X), 0)
        levelsof `panel', local(clusters)
        foreach c of local clusters {
            use `clust', clear
            keep if `panel' == `c'
            mkmat uhat, matrix(u_c)
            mkmat `ftheta' `exog_theta' `time_theta' if `panel' == `c', matrix(X_c)
            matrix Vcluster = Vcluster + X_c'*u_c*u_c'*X_c
        }
        matrix Vrobust = syminv(XtX) * Vcluster * syminv(XtX)
    
        // Export second-stage results
        matrix b = e(b)
        local k = colsof(b)
        tempfile results
        postfile handle str20 varname double coef double se double tstat double pval using `results'
        forvalues i = 1/`k' {
            local coef = b[1,`i']
            local se = sqrt(Vrobust[`i',`i'])
            local t = `coef'/`se'
            local p = 2 * ttail(e(df_r), abs(`t'))
            local name : colname b[`i']
            post handle ("`name'") (`coef') (`se') (`t') (`p')
        }
        postclose handle
        export excel using "reiv_manual_results.xlsx", sheet("SecondStage") firstrow(variables) replace
    
        // Hansen J test
        reg uhat `instr1' `instr2' `exogvars' i.time
        scalar N = e(N)
        scalar R2 = e(r2)
        scalar J = N * R2
        scalar df_J = wordcount("`instr1' `instr2'") - 2
        scalar pval_J = chi2tail(df_J, J)
    
        // Cragg-Donald statistic (approximate)
        scalar CD_stat = min(r2_y1, r2_y2)
        scalar df_CD = 2
        scalar pval_CD = chi2tail(df_CD, CD_stat)
    
        // Export diagnostics
        preserve
        clear
        set obs 2
        gen Test = ""
        gen Statistic = .
        gen df = .
        gen p_value = .
        replace Test = "Hansen J" in 1
        replace Statistic = J in 1
        replace df = df_J in 1
        replace p_value = pval_J in 1
        replace Test = "Cragg-Donald" in 2
        replace Statistic = CD_stat in 2
        replace df = df_CD in 2
        replace p_value = pval_CD in 2
        export excel using "reiv_manual_results.xlsx", sheet("Diagnostics") firstrow(variables) sheetmodify
        restore
    end
    I execute this command by running the following:

    Code:
    reiv_manual_multi lnfdipccurr lic Lgwtpol_corr legor_uk legor_fr legor_sc legor_ge, endog(Lpc Lpc2) instr1(desert tropical) instr2(desert tropical soil si501550) panel(id)
    Then something happens that is driving me crazy. The first predict line (marked in red in my post), predict double f`y1', xb, creates a new variable called fLpc. The second predict line (marked in green), predict double f`y2', xb, should then generate the variable fLpc2. However, I get an error message saying that varlist is not allowed. When I searched for the problem in the trace, I noticed the following:

    Code:
    version 6.0, missing
          - sret clear
          - gettoken ouser 0 : 0
          - local orig `"`0'"'
          = local orig `" double f Lpc2, xb"'
          - gettoken varn 0 : 0, parse(" ,[")
          - gettoken nxt : 0, parse(" ,[(")
          - if !(`"`nxt'"'=="" | `"`nxt'"'=="if" | `"`nxt'"'=="in" | `"`nxt'"'==",") {
          = if !(`"f"'=="" | `"f"'=="if" | `"f"'=="in" | `"f"'==",") {
          - local typ `varn'
          = local typ double
          - gettoken varn 0 : 0, parse(" ,[")
          - }
          - syntax [if] [in] [, `ouser' CONStant(varname numeric) noOFFset *]
          = syntax [if] [in] [, COVratio DFBeta(string) DFIts E(string) Hat Leverage Pr(string) Welsch YStar(string) CONStant(varname numeric
    > ) noOFFset *]
    varlist not allowed
    According to the line beginning with "= local orig" (marked in blue in my post), it seems that Stata reads "f Lpc2" instead of "fLpc2". I tried changing the variable names, but the problem persists. Why does the first predict line work fine and create fLpc, while the second fails when fLpc2 should be generated? How can this issue be fixed?

    I thank you for any help you may be willing to provide.

    Dario

  • #2
    Your local macro `endog' contains the string "Lpc Lpc2". The gettoken command then splits this string into the local macro `y1', which contains the string "Lpc", and the local macro `y2', which contains the string " Lpc2". Notice that the second string contains a leading blank character. Consequently, the second predict command attempts to create a new variable named "f Lpc2", with a blank character in between. Stata interprets this as a variable list of 2 variables, while only 1 variable name is allowed. One workaround is to insert the following line after your gettoken command:
    Code:
    local y2 : list retokenize y2
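
    The behaviour described above can be checked in isolation with a short do-file. This is only a minimal sketch, assuming the same two-variable endog() list as in the question; the bracketed display calls are an illustrative device to make blanks visible, and the second option (picking words by position) is an alternative that was not part of the original suggestion.

    ```stata
    * Sketch: reproduce the leading blank that gettoken leaves in the remainder
    local endog Lpc Lpc2
    gettoken y1 y2 : endog

    * Brackets make any leading or trailing blanks visible
    display "[`y1']"
    display "[`y2']"

    * Option 1 (the workaround above): remove surplus blanks from the macro
    local y2 : list retokenize y2
    display "[`y2']"

    * Option 2 (assumption: endog() always holds exactly two names):
    * pick the words by position instead of splitting with gettoken
    local y1 : word 1 of `endog'
    local y2 : word 2 of `endog'
    display "[`y1'] [`y2']"
    ```

    With either option, f`y2' should expand to the single name fLpc2, so predict receives one new variable name rather than a two-element varlist.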
    https://www.kripfganz.de/stata/



    • #3
      Sebastian Kripfganz Thank you very much!



      • #4
        Dear Prof. Sebastian Kripfganz

        - I am calibrating our driver-based panel model on 8 countries (2015–2022) and will use it to predict the target variable for 5 other countries (i.e., countries different from the ones used in the estimation step).
        - The countries I have in the estimation are diverse (a mix of developed, developing, and emerging countries)
        - A modelling issue we have: including country fixed effects in the calibration estimates country-specific intercepts that we do not have for the 5 target countries, so the FE model cannot be used directly to produce level predictions for them.

        In this case, could you please advise us on what we should do?

        1) Is estimating the coefficients on this sample, and then using these estimated coefficients to predict the target variable for the other 5 countries a good approach?
        2) Do we have to care about the fixed effects when estimating/calibrating the coefficients in our model? If yes, could you please guide us on which approach we should follow to address this issue?
        3) Do we also need to care about the dynamic effect when estimating the model?

        I look forward to your guidance.
