Hi all,
Apologies in advance if this is not quite the place for this question, but I seem to have exhausted all other avenues. For context, I am replicating a published paper for my thesis which essentially gathers key data from firms listed on the US and Canadian stock exchanges and runs a regression to show that, controlling for fundamentals, the US firms are valued higher. To start, the paper pulls loads of data from Compustat and forms an initial sample, from which it matches US firms to Canadian firms based on industry and firm size (please see the methodology in the attached image). My hope was to do this for a dozen countries; however, I have been unable to match firms as in the paper. Some people I spoke to suggested the 'range', 'merge' and 'psmatch2' commands, but I get a variety of error messages, from 'command not recognised' to 'incorrect data type'.
I am able to run a regression and test for basic things such as autocorrelation and heteroskedasticity, but this seems to be a little (a lot) outside my skillset, so any advice on how to match the data and replicate the paper would be hugely appreciated. I cannot seem to share the dataset I am currently experimenting with, possibly because it has nearly 1M observations.
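To make the matching step concrete, here is a minimal sketch of it, written in Python for illustration rather than Stata. It is not the paper's exact procedure (that is in the attached image): it assumes each Canadian firm is paired with the not-yet-used US firm in the same industry whose total assets are closest, and the field names (`id`, `industry`, `assets`) and the sample values are hypothetical.

```python
from collections import defaultdict


def match_firms(us_firms, ca_firms):
    """Pair each Canadian firm with the closest-sized unused US firm
    in the same industry (matching without replacement).

    Assumption: "firm size" is proxied by total assets and "industry"
    is a single code per firm; both are illustrative choices.
    """
    # Group the US firms by industry code so candidates are easy to look up.
    us_by_industry = defaultdict(list)
    for firm in us_firms:
        us_by_industry[firm["industry"]].append(firm)

    matches = []
    # Process the largest Canadian firms first; the order matters because
    # a US firm can only be used once.
    for ca in sorted(ca_firms, key=lambda f: f["assets"], reverse=True):
        candidates = us_by_industry.get(ca["industry"], [])
        if not candidates:
            continue  # no US firm left in this industry, so no match
        best = min(candidates, key=lambda us: abs(us["assets"] - ca["assets"]))
        candidates.remove(best)  # without replacement
        matches.append((ca["id"], best["id"]))
    return matches


# Hypothetical sample data, only to show the shape of the inputs.
us_sample = [
    {"id": "US1", "industry": "2000", "assets": 100.0},
    {"id": "US2", "industry": "2000", "assets": 500.0},
    {"id": "US3", "industry": "3000", "assets": 50.0},
]
ca_sample = [
    {"id": "CA1", "industry": "2000", "assets": 450.0},
    {"id": "CA2", "industry": "2000", "assets": 120.0},
    {"id": "CA3", "industry": "4000", "assets": 80.0},
]

matches = match_firms(us_sample, ca_sample)
print(matches)
```

In this sketch CA3 stays unmatched because no US firm shares its industry code. With panel data, the same pairing would presumably be done within a chosen matching year, which is a further assumption here.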
Thank you in advance!