The reason I am putting them all together is that I want to form a single index, which I am then going to regress further on various things such as FTSE data. I want to keep those with a negative relationship, as the literature suggests this is the best way to see an impact on the market.
The explanation for wanting a single index looks circular to me. It would be interesting to see references arguing that and why very weak negative relationships are worth examination.
Just for reference, this is what one of the papers does:
"We start with the first six months (January to June) in 2004. For each search term, we regress the adjusted daily changes in log SVIs on the contemporaneous market excess returns and keep the t‐value associated with the regression slope coefficient. We sort the t‐values across terms and pick the 30 terms with the most negative t‐values. So there is no look‐ahead bias, we then use these “Top 30” terms as our FEARS index for the following 6‐months (July 2004 – December 2004). We cumulate and continue in this fashion: the 30 most negative terms during the period January 2004 – December 2004 are used for the FEARS index for the period January 2005 – June 2005, the 30 most negative terms during the period January 2004 – June 2005 are used for the FEARS index for the period July 2005 – December 2005, and so on."
#19 seems consistent with my point. The authors are looking for most negative, not just any negative. What is not explained in the quotation is how many candidates they had to choose from.
To create the entire index, they started from a well-known finance dictionary and took the words carrying Econ and Finance tags. They then used Google Trends to find the top 10 related search terms for each of these words, which gives around 1500 words. After filtering by relevance and data availability, they were left with 118.
So if I were to apply this to my project, how would I implement it in Stata? Do I have to do this manually for each 6-month period, or are there shortcuts that can automate it and give me the final output at the end?
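For what it is worth, the expanding-window selection described in the quotation can probably be automated with a loop rather than done by hand. What follows is only a rough sketch under stated assumptions, not the paper's own code. It assumes a long-format dataset with one observation per search term per trading day, and the variable names are hypothetical: term (a numeric term identifier), date (a Stata daily date), dsvi (the adjusted daily change in log SVI) and mktrf (the market excess return). The final step, an equal-weighted mean of dsvi across the selected terms, is also an assumption, as the quotation does not say how the 30 terms are combined.

```stata
* Sketch only. Hypothetical variables: term, date, dsvi, mktrf.
* Assumes the long-format data are already in memory.

* Half-year identifier derived from the daily date
gen int half = hofd(date)
format half %th

tempfile main picks
save `main'

summarize half, meanonly
local first = r(min)
local stop  = r(max) - 1

forvalues h = `first'/`stop' {
    * Expanding window: everything from the start of the sample up to half-year h
    use `main', clear
    keep if half <= `h'

    * One regression per term, keeping only the slope t-value
    statsby tstat = (_b[mktrf]/_se[mktrf]), by(term) clear: regress dsvi mktrf

    * Keep the (up to) 30 terms with the most negative t-values
    drop if missing(tstat)
    sort tstat term
    keep in 1/`=min(30, _N)'

    * These terms are used in the FOLLOWING half-year, so there is no look-ahead
    gen int half = `h' + 1

    if `h' > `first' append using `picks'
    save `picks', replace
}

* Attach the selections to the daily data and aggregate by day
use `main', clear
merge m:1 term half using `picks', keep(match) nogenerate

* Assumption: equal-weighted mean across the selected terms
collapse (mean) fears = dsvi, by(date)
```

The first half-year of the sample has no selection and so drops out of the result. If term is held as a string, the same code should still work, since statsby and merge accept string keys.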