command to look up a variable in another file

Dan Daugaard

Join Date: Jul 2018
Posts: 61

command to look up a variable in another file

05 Aug 2018, 18:14

Hi there, I am having trouble finding the command to look up a variable in another file.
The context is running regressions over a time series panel of fund returns (in file: Q1_fund_return.csv).
I would like to see if a fund is listed in another file (ie Q1_fund_categories) and create a time series Dummy variable if the fund is listed in that file.
Thank you very much for any guidance

Code:

 
* importing fund data, look up Dummy characteristic and regressing returns on lagged returns
clear
import delimited Q1_fund_return.csv, numericcols(3) 
* Recognise date string and Set panel timeseries
tostring caldt, replace
gen month = mofd(date(caldt,"YMD"))
format month %tmYYMon
tsset id month
* to be done - Look up ID in another file to find characteristic
* to be done - Set Dummy variable for characteristic
* multiple regress ff on 1 month lag for each fund - add in Dummy variable here
statsby, by(id) clear: regress mret L.mret
list, clean

Code:

 Q1_fund_return.csv looks like:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long(id caldt) float(mret mnav)
2708 20010731        .     .
2708 20010831        . 13.38
2708 20010928 -.069507 12.45
2708 20011031  .025703 12.77
2708 20011130  .047768 13.38
2708 20011231   .00753 13.37
2708 20020131 -.011967 13.21
2708 20020228 -.005299 13.14
2708 20020328  .031963 13.56
2708 20020430 -.022124 13.26
end

Code:

 Q1_fund_categories looks like:
* Example generated by -dataex-. To install: ssc install dataex
clear
input long id str90 fund_name str5 ticker str1(retail_fund inst_fund) long first_offer_dt
 2708 "GuideStone Funds: Growth Allocation Fund; Investor Class Shares"                            "GCOZX" "N" "Y" 20010827
 2901 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class R Shares" "DDFRX" "N" "Y" 20051025
 2926 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class C Shares" "LCEVX" "Y" "N" 20011231
 2927 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class B Shares" "LCEDX" "Y" "N" 20011231
 2928 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class A Shares" "LCEAX" "Y" "N" 20011231
 2951 "AIM Equity Funds (Invesco Equity Funds): Invesco Charter Fund; Class B Shares"              "BCHTX" "Y" "N" 19950626
 2952 "AIM Equity Funds (Invesco Equity Funds): Invesco Charter Fund; Class C Shares"              "CHTCX" "Y" "N" 19970804
 2960 "AIM Equity Funds (Invesco Equity Funds): Invesco Summit Fund; Class P Shares"               "SMMIX" "Y" "N" 19821101
46863 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class A Shares"   "NRAAX" "Y" "N" 20090527
46864 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class C Shares"   "NRACX" "Y" "N" 20090527
46865 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class R3 Shares"  "NRARX" "N" "Y" 20090527
end

Attached Files

Tags: None

Dave Airey

Join Date: Apr 2014

Posts: 396
#2

06 Aug 2018, 06:55

This would seem like a task for "describing using..." or "preserve/restore".
1 like
Comment

Daniel Bela

Join Date: Apr 2014
Posts: 246

06 Aug 2018, 08:11

Hi,

I'm not absolutely sure that I understand what kind of variable you want to create. You want to insert a dummy variable that indicates if the observation is present in the categories dataset? If so, you can do this with a simple -merge-:

Code:

// create (temporary) example data
tempfile Q1_fund_return Q1_fund_categories
clear
input long(id caldt) float(mret mnav)
2708 20010731        .     .
2708 20010831        . 13.38
2708 20010928 -.069507 12.45
2708 20011031  .025703 12.77
2708 20011130  .047768 13.38
2708 20011231   .00753 13.37
2708 20020131 -.011967 13.21
2708 20020228 -.005299 13.14
2708 20020328  .031963 13.56
2708 20020430 -.022124 13.26
end
save `Q1_fund_return'
clear
input long id str90 fund_name str5 ticker str1(retail_fund inst_fund) long first_offer_dt
 2708 "GuideStone Funds: Growth Allocation Fund; Investor Class Shares"                            "GCOZX" "N" "Y" 20010827
 2901 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class R Shares" "DDFRX" "N" "Y" 20051025
 2926 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class C Shares" "LCEVX" "Y" "N" 20011231
 2927 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class B Shares" "LCEDX" "Y" "N" 20011231
 2928 "AIM Equity Funds (Invesco Equity Funds): Invesco Diversified Dividend Fund; Class A Shares" "LCEAX" "Y" "N" 20011231
 2951 "AIM Equity Funds (Invesco Equity Funds): Invesco Charter Fund; Class B Shares"              "BCHTX" "Y" "N" 19950626
 2952 "AIM Equity Funds (Invesco Equity Funds): Invesco Charter Fund; Class C Shares"              "CHTCX" "Y" "N" 19970804
 2960 "AIM Equity Funds (Invesco Equity Funds): Invesco Summit Fund; Class P Shares"               "SMMIX" "Y" "N" 19821101
46863 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class A Shares"   "NRAAX" "Y" "N" 20090527
46864 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class C Shares"   "NRACX" "Y" "N" 20090527
46865 "Neuberger Berman Equity Funds: Neuberger Berman Socially Responsive Fund; Class R3 Shares"  "NRARX" "N" "Y" 20090527
end
save `Q1_fund_categories'

// procedure starts here
use `Q1_fund_return'
merge m:1 id using `Q1_fund_categories' , keep(master match) keepusing(id)
generate dummy=(_merge==3)
drop _merge

list

Regards
Bela

Last edited by Daniel Bela; 06 Aug 2018, 08:13. Reason: formatting

Comment

Dan Daugaard

Join Date: Jul 2018

Posts: 61
#4

07 Aug 2018, 16:29

Thank you Bela and Dave, you are both champions!
The merge worked very well – thank you Bela for your efforts in setting out how it would work.
Dave, I’ll keep a note of your other suggested commands and investigate further because I might need something in addition to -merge- with the next stage – ie the next stage will be to assign the dummy variable if the fund is present on the list – but only assign the dummy as 1 between a unique start and end date for each fund on the list.
Here's the current status in case others encounter this query. Thanks again, Dan

Code:

* importing fund categories and data, look up Dummy characteristic and regressing returns on lagged returns clear import delimited Q1_fund_categories.csv * Save file save Q1_fund_categories, replace clear import delimited Q1_fund_return.csv, numericcols(3 4) * Save file save Q1_fund_return.dta, replace clear use Q1_fund_return * Recognise date string tostring caldt, replace gen month = mofd(date(caldt,"YMD")) format month %tmYYMon drop caldt * Set panel timeseries tsset id month * merge fund data with category merge m:1 id using Q1_fund_categories , keep(master match) keepusing(id) generate dummy=(_merge==3) drop _merge list in 1/20 * Save file save Q1_results.dta, replace

Attached Files

Q1_fund_categories.csv (1.2 KB, 1 view)

Q1_fund_return.csv (75.5 KB, 2 views)
Comment

Announcement

command to look up a variable in another file

Comment

Comment

Comment