Create new variable for observations of another variable

Scott Rick

Join Date: May 2021
Posts: 242
#1

27 Sep 2023, 21:42

In the data below, for each value of Treated_State I want to generate a new variable named s_ followed by that value in lowercase. So for Treated_State == "Arun", I want a variable named s_arun; for Treated_State == "Biha", I want a variable named s_biha; and so on. I do not want any variables created for the observations where Treated_State == "".

The only way I know to do this is -reshape-, but that causes other problems, since I need the data structure preserved as is, with only these additional variables added. I'd greatly appreciate some help with this.

Code:


clear
input str50 state str4 Treated_State byte treated int cem_strata double(cem_matched cem_weights)
"Andaman and Nicobar Islands"              ""     0 1 1 1
"Andhra Pradesh"                           ""     0 1 1 1
"Arunachal Pradesh"                        "Arun" 1 1 1 1
"Assam"                                    ""     0 1 1 1
"Bihar"                                    "Biha" 1 1 1 1
"Chandigarh"                               ""     0 3 0 0
"Chhattisgarh"                             "Chha" 1 1 1 1
"Dadra and Nagar Haveli and Daman and Diu" ""     0 1 1 1
"Delhi"                                    "Delh" 1 4 0 0
"Goa"                                      ""     0 1 1 1
"Gujarat"                                  ""     0 1 1 1
"Haryana"                                  ""     0 1 1 1
"Himachal Pradesh"                         ""     0 1 1 1
"Jammu and Kashmir"                        ""     0 1 1 1
"Jharkhand"                                "Jhar" 1 1 1 1
"Karnataka"                                "Karn" 1 1 1 1
"Kerala"                                   ""     0 1 1 1
"Lakshadweep"                              ""     0 1 1 1
"Madhya Pradesh"                           "Madh" 1 1 1 1
"Maharashtra"                              "Maha" 1 1 1 1
"Manipur"                                  "Mani" 1 1 1 1
"Meghalaya"                                "Megh" 1 1 1 1
"Mizoram"                                  ""     0 1 1 1
"Nagaland"                                 "Naga" 1 1 1 1
"Odisha"                                   "Odis" 1 1 1 1
"Puducherry"                               "Pudu" 1 2 0 0
"Punjab"                                   "Punj" 1 1 1 1
"Rajasthan"                                ""     0 1 1 1
"Sikkim"                                   ""     0 1 1 1
"Tamil Nadu"                               ""     0 1 1 1
"Telangana"                                ""     0 1 1 1
"Tripura"                                  ""     0 1 1 1
"Uttar Pradesh"                            ""     0 1 1 1
"Uttarakhand"                              "Utta" 1 1 1 1
"West Bengal"                              ""     0 1 1 1
end


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#2

27 Sep 2023, 21:49

But you do not say what values you want these new variables to take on. The code below simply creates variables with the names you want, as numeric variables with all values missing. Modify the -gen- command to put whatever you need in those variables.

Code:

assert !missing(Treated_State) == treated
levelsof Treated_State, local(treated_states)
foreach t of local treated_states {
    local name = lower(`"`t'"')
    gen s_`name' = .
}

You don't indicate what you are going to do with these variables or why you want them. I feel I should point out that this kind of hybrid long-wide data layout is usually problematic, and it will not surprise me if it causes you trouble down the line. If you explain what you are seeking to accomplish, it may be possible to point out a better approach.

Scott Rick

Join Date: May 2021
Posts: 242
#3

28 Sep 2023, 19:46

Clyde Schechter Thanks a lot for this, Clyde.

I wanted each new variable to hold the name of the corresponding treated state, both in the observation where Treated_State matches the variable's name and in all observations with treated == 0. As you suggested, I modified the -gen- command to do so.

I didn't mention what I am going to do because I was playing around with several possibilities to address a data issue. However, you are very right about the problems with the hybrid long-wide data layout, and I've had to rule it out as a result.

Instead, what I am seeking to do is create a matched control set using the -cem- command for each treated unit in my data. I then estimate the effect size for each treated unit individually using an event-study design, and then average across all coefficients to get the average effect size. To match, I use the four sets of variables (matchlists) below, matching on each of them in turn.

I'd greatly appreciate your help with the following. The code below runs -cem- on one single dataset of all treated and control units. How can I first create individual datasets for the treated units, where each dataset has only one treated unit and all the untreated units? I would then run the -cem- code on each of these individual datasets for the four matchlists, finally producing four datasets per treated unit (one for each matchlist).

Thank you again for the help!

Code:



* four lists of matched vars
local matchList1 "pop_perkm2_2019"
local matchList2 "pop_perkm2_2019 per60_2011"
local matchList3 "pop_perkm2_2019 per60_2011 exphealth_percap1516"
local matchList4 "pop_perkm2_2019 per60_2011 hospital_beds_total"


local method "sturges"

forvalues j = 1/4 {


    * open the data 
        use "${outdir}\data_for_cohort_match.dta", clear

    * Run CEM command
            cem `matchList`j'', treatment(stepone) 
            
            * keep only matched controls (and treated)
            keep if cem_matched == 1
            count
            if r(N) == 0 continue // no matches found

            *keep only variables needed
            keep state stepone cem_matched cem_strata cem_weights
            
            
            * drop duplicate observations (shouldn't be any) 
            duplicates drop
            
            * keep track of cem variables
            gen cem_varlist = `j'
            
            * identify treated state
            gen Treated_State = state if treated== 1
            gsort -treated
            replace Treated_State = Treated_State[_n-1] if Treated_State == ""
            
            
       
            * save
            save "${outdir}\matched_cohorts\temp`statename'`y'_`j'.dta", replace

            }



Data:

clear
input str50 state byte treated float(pop_perkm2_2019 per60_2011 hospital_beds_total) int exphealth_percap1516
"Andaman and Nicobar Islands"              0  48.12704  6.7   .3259446 6201
"Andhra Pradesh"                           0  320.4234 10.1  .15938033 1013
"Arunachal Pradesh"                        1  17.95971  4.6   .1744681 5177
"Assam"                                    0  437.1988  6.7  .07050418 1546
"Bihar"                                    1 1269.2883  7.4 .025817437  491
"Chandigarh"                               0 10342.105  6.4   .4776081 2224
"Chhattisgarh"                             1 212.46977  7.8  .06068096 1354
"Dadra and Nagar Haveli and Daman and Diu" 0 1590.3815  4.3  .22846715 2286
"Delhi"                                    1 13351.752  6.8   .1991269 1992
"Goa"                                      0  415.9914 11.2  .29766235 3643
"Gujarat"                                  0  346.5698  7.9  .09547515 1189
"Haryana"                                  0  648.5117  8.7   .1260498 1119
"Himachal Pradesh"                         0  131.1228 10.2  .21972603 2667
"Jammu and Kashmir"                        0  133.1137  7.4  .05923977 2359
"Jharkhand"                                1  469.2149  7.1  .07083924  866
"Karnataka"                                1 343.07135  7.7  .39835405 1124
"Kerala"                                   0   903.816 12.6   .2824968 1463
"Lakshadweep"                              0 2084.6106  8.2   .6264706 6018
"Madhya Pradesh"                           1  266.7748  7.9  .07897048  716
"Maharashtra"                              1 396.97055  9.9  .18971208 1011
"Manipur"                                  1  138.9797    7  .05768611 2061
"Meghalaya"                                1 143.74248  4.7   .1626551 2223
"Mizoram"                                  0  56.54381  6.3  .20939597 5862
"Nagaland"                                 1 129.68213  5.2  .11911628 2450
"Odisha"                                   1  280.4691  9.5  .05873463  927
"Puducherry"                               1  3139.875  9.7    .343883 3340
"Punjab"                                   1  592.8875 10.3  .20428346 1173
"Rajasthan"                                0 225.76036  7.5  .12059432 1360
"Sikkim"                                   0  93.57384  6.7   .2939759 5126
"Tamil Nadu"                               0  582.0096 10.4  .20526455 1235
"Telangana"                                0  324.1031  9.2  .26845512 1322
"Tripura"                                  0  380.6981  7.9  .11690882 2183
"Uttar Pradesh"                            0  933.8018  7.7  .12507923  733
"Uttarakhand"                              1  208.3092  8.9   .2140113 1765
"West Bengal"                              0  1091.874  8.5  .11715993  778
end


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#4

28 Sep 2023, 22:34

I'm sorry, but I'm not familiar with the -cem- command, nor with the coarsened exact matching approach. I don't understand in what sense you want to create four data sets out of the one you have. Perhaps it would be best if somebody else following the thread knows about this and can respond. If nobody does respond within, say, 24 hours, I would suggest starting a new thread specifically about this. And when you do that, I recommend explaining as clearly as possible what you are looking for here (although perhaps it would be obvious to somebody familiar with -cem-.)

Scott Rick

Join Date: May 2021
Posts: 242
#5

29 Sep 2023, 06:10

Clyde Schechter Thanks, Clyde. Would you be able to tell me just this part: how can I create individual datasets for each treated unit, where each dataset has only one treated unit and all the untreated units?

I can figure out the cem part and what follows. Many thanks for your help.

Code:

  
  use "${outdir}\data_for_cohort_match.dta", clear    


input str50 state byte stepone float(pop_perkm2_2019 per60_2011 hospital_beds_total) int exphealth_percap1516
"Andaman and Nicobar Islands"              0  48.12704  6.7   .3259446 6201
"Andhra Pradesh"                           0  320.4234 10.1  .15938033 1013
"Arunachal Pradesh"                        1  17.95971  4.6   .1744681 5177
"Assam"                                    0  437.1988  6.7  .07050418 1546
"Bihar"                                    1 1269.2883  7.4 .025817437  491
"Chandigarh"                               0 10342.105  6.4   .4776081 2224
"Chhattisgarh"                             1 212.46977  7.8  .06068096 1354
"Dadra and Nagar Haveli and Daman and Diu" 0 1590.3815  4.3  .22846715 2286
"Delhi"                                    1 13351.752  6.8   .1991269 1992
"Goa"                                      0  415.9914 11.2  .29766235 3643
"Gujarat"                                  0  346.5698  7.9  .09547515 1189
"Haryana"                                  0  648.5117  8.7   .1260498 1119
"Himachal Pradesh"                         0  131.1228 10.2  .21972603 2667
"Jammu and Kashmir"                        0  133.1137  7.4  .05923977 2359
"Jharkhand"                                1  469.2149  7.1  .07083924  866
"Karnataka"                                1 343.07135  7.7  .39835405 1124
"Kerala"                                   0   903.816 12.6   .2824968 1463
"Lakshadweep"                              0 2084.6106  8.2   .6264706 6018
"Madhya Pradesh"                           1  266.7748  7.9  .07897048  716
"Maharashtra"                              1 396.97055  9.9  .18971208 1011
"Manipur"                                  1  138.9797    7  .05768611 2061
"Meghalaya"                                1 143.74248  4.7   .1626551 2223
"Mizoram"                                  0  56.54381  6.3  .20939597 5862
"Nagaland"                                 1 129.68213  5.2  .11911628 2450
"Odisha"                                   1  280.4691  9.5  .05873463  927
"Puducherry"                               1  3139.875  9.7    .343883 3340
"Punjab"                                   1  592.8875 10.3  .20428346 1173
"Rajasthan"                                0 225.76036  7.5  .12059432 1360
"Sikkim"                                   0  93.57384  6.7   .2939759 5126
"Tamil Nadu"                               0  582.0096 10.4  .20526455 1235
"Telangana"                                0  324.1031  9.2  .26845512 1322
"Tripura"                                  0  380.6981  7.9  .11690882 2183
"Uttar Pradesh"                            0  933.8018  7.7  .12507923  733
"Uttarakhand"                              1  208.3092  8.9   .2140113 1765
"West Bengal"                              0  1091.874  8.5  .11715993  778

Last edited by Scott Rick; 29 Sep 2023, 06:16.


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#6

29 Sep 2023, 09:47

In the example you show in #5 the variable treated no longer appears. I see in its place a variable called stepone which, as it turns out, exactly matches the treated variable from your example in #3. So I will assume that this is the same variable, which, for some reason, you have chosen to rename.

The following should be fairly efficient:

Code:

capture program drop one_treated_state
program define one_treated_state
    local filename = state[1]
    frameappend untreated
    save `"`filename'"', replace
    exit
end

frame put _all if !stepone, into(untreated)
drop if !stepone
runby one_treated_state, by(state)

-runby- is written by Robert Picard and me, -frameappend- by Jeremy Freese, with contributions from Daniel Fernandes and Roger Newson. Both are available from SSC.

This will create an eponymous file for each treated state, containing the data for that state plus all untreated states.
Scott Rick

Join Date: May 2021

Posts: 242
#7

30 Sep 2023, 21:06

Clyde Schechter Thank you very much for this. It works perfectly! A very naïve question: how do I change the location where the files are saved? I want to save them to the directory "${outdir}\matched_cohorts\states", and I seem to be tripping up on the quotation marks in filename.
Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#8

30 Sep 2023, 22:33

What's tripping you up is using \ as the path separator for your pathnames. While this is the convention in Windows, it plays very poorly with filenames defined in local macros. The reason is that the sequence \`, which you intend to be a path separator (\) followed by the opening quote of a local macro (`), has a different meaning to Stata. In Stata, this is one of the "escape sequences." The digraph \` is interpreted as a single literal ` character: not as the opening of a local macro reference, but as an actual ` character such as might appear in any other context.

The easy solution is to use / as your path separator. Even though this is not the usual convention in Windows, nevertheless Windows accepts it. There are other ways to get around this, but I don't recommend them. Using / as your path separator has an additional advantage: if you ever need to port your code to a Mac or Linux machine, you won't need to modify this convention.

Code:

capture program drop one_treated_state
program define one_treated_state
    local filename = state[1]
    frameappend untreated
    save `"${outdir}/matched_cohorts/states/`filename'"', replace
    exit
end

frame put _all if !stepone, into(untreated)
drop if !stepone
runby one_treated_state, by(state)
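
A minimal sketch of the escape behaviour described above, assuming a made-up folder C:/out and a made-up macro value (neither comes from the thread). The first -display- expands the macro; in the second, the \` pair suppresses the expansion, so the macro name is shown literally rather than its contents:

```stata
* Hypothetical value, for illustration only
local filename "Bihar"

* Forward slash as separator: the macro is expanded as intended
display "C:/out/`filename'.dta"

* Backslash immediately before the macro: \` is treated as a literal
* left quote, so `filename' is NOT expanded here
display "C:\out\`filename'.dta"
```

Running the two lines side by side in the Command window is a quick way to check whether a path built from a local macro is being expanded before handing it to -save- or -use-.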

Last edited by Clyde Schechter; 30 Sep 2023, 22:35.

Scott Rick

Join Date: May 2021
Posts: 242
#9

01 Oct 2023, 06:28

Clyde Schechter thanks a lot for that; the explanation really helped! If I may ask one more thing: I am trying to run the code in the loop on each of the individual files created, generating matches by four criteria (matchlists). The end result would be four datasets per state, one for each matching criterion. When I test the code on an individual state file, it works well. However, when I put it in the loop below to run on all the files, the loop fails with no error given. Without an error message, I'm at a loss. I've tried playing around with the quotation marks around 'file' and changing the order of the two loops, but to no effect. I'd greatly appreciate you taking a look at this. If I should be making a separate post for it, please do let me know.

Code:



capture program drop one_treated_state
program define one_treated_state
    local filename = state[1]
    frameappend untreated
    save `"${outdir}/matched_cohorts/treated states/`filename'"', replace
    exit
end

frame put _all if !stepone, into(untreated)
drop if !stepone 

runby one_treated_state, by(state)
//creates 15 datasets - one for each treated state

*frame drop untreated


*create matched controls for each treated unit

* four lists of matched vars
local matchList1 "pop_perkm2_2019"
local matchList2 "pop_perkm2_2019 per60_2011"
local matchList3 "pop_perkm2_2019 per60_2011 exphealth_percap1516"
local matchList4 "pop_perkm2_2019 per60_2011 hospital_beds_total"


local method "sturges"


local files : dir "${outdir}/matched_cohorts/treated states" files "*.dta"

cd "${outdir}/matched_cohorts/treated states"

forvalues j = 1/4 { 

    foreach file in `file' {

    * open the data 
        import delimited "`file'", clear

    * Run CEM command
            cem `matchList`j'', treatment(stepone) 
            
            * keep only matched controls (and treated)
            keep if cem_matched == 1
            count
            if r(N) == 0 continue // no matches found - check for variable to keep that is a group indicator
            *should have only treated unit in it
            *keep only variables needed
            keep state stepone cem_matched cem_strata cem_weights
            
            
            * drop duplicate observations (shouldn't be any) 
            duplicates drop
            
            * keep track of cem variables
            gen cem_varlist = `j'
            
            * identify treated state
            gen Treated_State = state if stepone == 1
            gsort -stepone // sort by SO and the group indicator
            replace Treated_State = Treated_State[_n-1] if Treated_State == ""
            
                    
            * save
            save "${outdir}/matched_cohorts/matched states/temp`file'_`j'.dta", replace

            }
}


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#10

01 Oct 2023, 09:23

I see two problems. There may be more, but first try fixing these.

1. You define a local macro files, capturing a list of the .dta files produced earlier in the code. But then your inner loop says -foreach file in `file'-. No local macro file has been defined: I think you just left off the final s by mistake. But what I don't understand is why you don't get an error message as a result of that. After all, `file' expands to an empty string, so the loop begins -foreach file in -, which is a syntax error.

2. Once you get the loop to actually run, you will encounter a problem because the loop iterator `file' will be a name of a .dta file, and trying to open it with -import delimited- is doomed.

If these things don't get your code working, when you post back, give a more detailed explanation of how things are failing. Is the loop executing at all? (Put a few -display- commands in at various places so you can see whether the code is even entering the loops.) Are you sure you are not getting any error messages? Show the actual output of the loops copy/pasted from your Results window or log file.
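
One way to add the -display- checks suggested above is sketched here. It assumes the folder name used earlier in the thread and that the global outdir is already defined; the loop body is reduced to diagnostics only, so it is a debugging aid and not the matching code itself:

```stata
* Sketch only: list the files, then report each step of both loops
local files : dir "${outdir}/matched_cohorts/treated states" files "*.dta"
local nfiles : word count `files'
display "number of files found: `nfiles'"

forvalues j = 1/4 {
    display "entering outer loop, j = `j'"
    foreach file of local files {
        display `"  opening file: `file'"'
        use `"${outdir}/matched_cohorts/treated states/`file'"', clear
        describe, short

        * rc = 0 means the treatment variable is present in this file
        capture confirm variable stepone
        display "  confirm stepone, rc = " _rc
    }
}
```

If the first -display- reports zero files, the problem lies in the directory listing and not in the loop body.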

Scott Rick

Join Date: May 2021
Posts: 242

#11

01 Oct 2023, 21:51

Clyde Schechter Thank you. Your suggestion in point 1 got the loop to execute, and I now get an error message, along with the multiple notes below. However, all the notes seem to pertain to parts of the code before the loop, except for "18,268 binary zeros were ignored in the source file.", and in some cases they refer to an empty line between code lines in my do-file.

The error message "variable stepone not found" pertains to the loop. However, each of the state files does include the variable stepone, so this is strange. There's also something weird happening with the data: I've included below the error message that -dataex- gives me. This error doesn't make sense to me, because when I break down the code and test it on a single state file, it appears to run perfectly. Since -dataex- wouldn't work, I've included a screenshot of the data below. I'm completely lost at this point!

Code:

(encoding automatically selected: windows-1252)
Note: Unmatched quote while processing row 39; this can be due to a formatting
    problem in the file or because a quoted data element spans multiple lines. You
    should carefully inspect your data after importing. Consider using option
    bindquote(strict) if quoted data spans multiple lines or option bindquote(nobind)
    if quotes are not used for binding data.
Note: Unmatched quote while processing row 41; this can be due to a formatting
    problem in the file or because a quoted data element spans multiple lines. You
    should carefully inspect your data after importing. Consider using option
    bindquote(strict) if quoted data spans multiple lines or option bindquote(nobind)
    if quotes are not used for binding data.
Note: Unmatched quote while processing row 43; this can be due to a formatting
    problem in the file or because a quoted data element spans multiple lines. You
    should carefully inspect your data after importing. Consider using option
    bindquote(strict) if quoted data spans multiple lines or option bindquote(nobind)
    if quotes are not used for binding data.
Note: Unmatched quote while processing row 49; this can be due to a formatting
    problem in the file or because a quoted data element spans multiple lines. You
    should carefully inspect your data after importing. Consider using option
    bindquote(strict) if quoted data spans multiple lines or option bindquote(nobind)
    if quotes are not used for binding data.
Note:  18,268 binary zeros were ignored in the source file.  The first instance
       occurred on line 1.  Binary zeros are not valid in text data.  Inspect your
       data carefully.
(5 vars, 53 obs)
variable stepone not found
(error in option treatment())
r(111);



*Dataex error message
clear
input strL(v1 v2 v3) str103 v4 str34 v5
data width (1372 chars) exceeds max linesize. Try specifying fewer variables
r(1000);


Clyde Schechter

Join Date: Apr 2014
Posts: 30164

#12

02 Oct 2023, 09:00

The screenshot you show does not look at all like I would expect it to. And you can clearly see in the screenshot that it does not contain a variable stepone. All of its variables are named v1, v2, etc. This suggests to me that you are trying to -import delimited- something that is a Stata data set. I notice that in your code, inside the loop, you have an -import delimited "`file'", clear- command, where `file' refers to a Stata .dta file created earlier in the code. So you need to change that to -use `file', clear-.

Here's what happens when you try to -import delimited- a Stata .dta file:

Code:

. import delimited auto.dta
(encoding automatically selected: windows-1252)
Note: Unmatched quote while processing row 25; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: Unmatched quote while processing row 27; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: Unmatched quote while processing row 28; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: Unmatched quote while processing row 40; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: Unmatched quote while processing row 48; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: Unmatched quote while processing row 50; this can be due to a formatting problem in the file or because a quoted data element spans
multiple lines. You should carefully inspect your data after importing. Consider using option bindquote(strict) if quoted data spans
multiple lines or option bindquote(nobind) if quotes are not used for binding data.
Note: 6,962 binary zeros were ignored in the source file. The first instance occurred on line 1. Binary zeros are not valid in text data.
Inspect your data carefully.
(3 vars, 54 obs)

That's what's going wrong here.


Scott Rick

Join Date: May 2021
Posts: 242

#13

02 Oct 2023, 19:14

Clyde Schechter Thank you! That was what was wrong and it works perfectly now, generating the output I need.

In what is hopefully my last question on this thread: I'm trying to save the final output using the value of the Treated_State variable. All observations of this variable have the same value. However, I run into the error below. Alternatively, I could use the filename in `files', but that includes a .dta in the name, which I haven't been able to strip off, and it keeps tripping me up here.

Code:

file C:\Users\dsouz\Dropbox\UNC\Dissertation\Dissertation\Aim
    1\Data\analysis\output\matched_cohorts\matched states\tempTreated_State[19]_1.dta
    could not be opened
r(603);



capture program drop one_treated_state
program define one_treated_state
    local filename = state[1]
    frameappend untreated
    save `"${outdir}/matched_cohorts/treated states/`filename'"', replace
    exit
end

frame put _all if !stepone, into(untreated)
drop if !stepone 

runby one_treated_state, by(state)
//creates 15 datasets - one for each treated state

*frame drop untreated


*create matched controls for each treated unit

* four lists of matched vars
local matchList1 "pop_perkm2_2019"
local matchList2 "pop_perkm2_2019 per60_2011"
local matchList3 "pop_perkm2_2019 per60_2011 exphealth_percap1516"
local matchList4 "pop_perkm2_2019 per60_2011 hospital_beds_total"

* When specifying cutpoints, several automatic methods may be chosen, including 
* "sturges" (Sturges' rule, the default), "fd" (Freedman-Diaconis' rule), 
* "scott" (Scott's rule) and "ss" (Shimazaki-Shinomoto's rule). See references 
* for a description of each rule.
local method "sturges"


local files : dir "${outdir}/matched_cohorts/treated states" files "*.dta"

cd "${outdir}/matched_cohorts/treated states"

forvalues j = 1/4 { 

    foreach file in `files' {

    * open the data 
        use `file', clear

    * Run CEM command
            cem `matchList`j'', treatment(stepone) 
            
            * keep only matched controls (and treated)
            keep if cem_matched == 1
            count
            if r(N) == 0 continue // no matches found - check for variable to keep that is a group indicator
            *should have only treated unit in it
            *keep only variables needed
            keep state stepone cem_matched cem_strata cem_weights
            
            
            * drop duplicate observations (shouldn't be any) 
            duplicates drop
            
            * keep track of cem variables
            gen cem_varlist = `j'
            
            * identify treated state
            gen Treated_State = state if stepone == 1
            gsort -stepone // sort by SO and the group indicator
            replace Treated_State = Treated_State[_n-1] if Treated_State == ""
            
            local N = _N
            forvalues i = 1/`N' {
                local myfilename Treated_State[`i']
            }
                    
            * save
            save "${outdir}\matched_cohorts\matched states\temp`myfilename'_`j'.dta", replace

            }
}


Clyde Schechter

Join Date: Apr 2014

Posts: 30164
#14

02 Oct 2023, 21:14

Well, I see several things wrong here. But I don't understand what you're trying to do, so I don't know what alternatives to suggest. So all I can do is show you what is going wrong, and perhaps you can figure out what to change.

First

Code:

local N = _N
forvalues i = 1/`N' {
    local myfilename Treated_State[`i']
}

pointlessly loops over all the values from 1 to _N, in the end just to set local myfilename to "Treated_State[#]" where # is the number of observations in memory (which, from what you show in the error message you get is 19). If you want to set local myfilename to "Treated_State[19]", then just do -local myfilename Treated_State[19]-. But I don't know if you really want to do that. It strikes me that perhaps you really want to set local myfilename to the value contained in observation 19 of the variable Treated_State. This seems to me to be more in keeping with the overall gist of this thread. That would be -local myfilename = Treated_State[19]-. Of course, it may also be that you really want to set local myfilename to the value contained in some observation `i' of the variable Treated_State. But I can't for the life of me figure out what `i' is supposed to be, because it appears nowhere in the code except in that pointless loop and there is nothing else obvious that it should refer to.
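
The difference between the two forms of -local- can be seen in a small sketch. It uses the auto dataset shipped with Stata as a stand-in, since the thread's own data files are not available here:

```stata
sysuse auto, clear

* Without =, the text after the macro name is copied verbatim
local a make[1]
display "`a'"

* With =, the expression is evaluated and its result is stored
local b = make[1]
display "`b'"
```

The first -display- shows the text make[1], while the second shows the contents of the first observation of make.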

Now, you have also said that you might want to use the name in local macro file, except that you want to get rid of the .dta it contains. That's simple enough to do:

Code:

local outname: subinstr local file ".dta" ""

and then use `outname' to refer to that in your -save- command.

Finally, I should point out that the error message you are getting may not be resolved by dealing with these matters. There are many possible causes for this kind of error message. One is that the directory does not exist. Did you ever create it? Did you misname it in your -save- command? Perhaps it is meant to be "${outdir}\matched_cohorts\treated states", which definitely already exists because you have read files from it. Or perhaps you do intend a separate directory here and have created it, but perhaps you didn't assign yourself write permission there. Or perhaps your hard drive is getting full. Or perhaps the full pathname is longer than your operating system will allow. These are all things you should look into if dealing with the aforementioned problems does not resolve that problem.
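
A hedged sketch that puts the last two points together. The values of file and j are hypothetical stand-ins for the loop iterators, the global outdir is assumed to be defined already, and the -save- line is left commented out because it needs the matched data in memory:

```stata
* Stand-ins for the loop iterators (hypothetical values)
local file "bihar.dta"
local j 1

* Strip the .dta extension from the file name
local outname : subinstr local file ".dta" ""

* Try to create the output folder; -capture- swallows the error raised
* when it already exists. -mkdir- creates one level only, so the
* parent folder must already be there.
capture mkdir "${outdir}/matched_cohorts/matched states"

* Build the target path with / as the separator and inspect it
local target `"${outdir}/matched_cohorts/matched states/temp`outname'_`j'.dta"'
display `"`target'"'

* Inside the real loop, with the matched data in memory:
* save `"`target'"', replace
```

Displaying the full path before saving makes it easier to spot an unexpanded macro or a misnamed folder.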