
  • Combining datasets for yearly analysis with individual level observations

    Hi statalist,

    I'm trying to run a regression analysis of the effect of X on Y. I'm interested in the general relationship, but I will also be looking at how the relationship develops over the years 1989-2014.

    The X variable has been tracked in two different yearly surveys, one covering 1989-1994 and one covering 2004-2014, and the Y variable has been tracked from 1989-2014. For each of these I took the survey from one year, appended the survey from the next year, and so forth, so I now have X tracked in two datasets (one from 1989-1994, one from 2004-2014) and Y tracked in one dataset from 1989-2014.

    My problem is how to combine these three datasets for the analysis. I've tried append, but Stata tells me that there are no observations, which I think is because no individual-level observation has both the X and the Y variable. I've also tried a 1:1 merge on my "year" variable, but Stata tells me that "year does not uniquely identify observations in the master data".

    So my question is how I should combine the datasets to make the analysis possible.

    Thanks in advance,
    Kasper Lovén

  • #2
    I seriously doubt anybody can help you without seeing examples of the data sets. Please post back showing excerpts. Make sure the excerpts you show include some observations of X and Y that should be matched up with each other. Also be sure to include the year variable, and any other variables that help you decide which observations of X match with which observations of Y. When doing this, be sure to use the -dataex- command.

    If you are running version 15.1 or a fully updated version 14.2, -dataex- is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.

    When asking for help with code, always show example data. When showing example data, always use -dataex-.


    • #3
      Hi Kasper,

      As Clyde mentioned, you will be *far* more likely to get help if you post 20 observations from each of your datasets using Stata's dataex command (help dataex). If you're not familiar with dataex (and most Stata users aren't), there is a YouTube tutorial here.

      so now I have X tracked in two datasets, one from 1989-1994, one from 2004-2014
      The above will need to be appended to each other. Make sure the variables have the exact same name and datatype (at least numeric vs string). This one (presumably) will become your master dataset. Do these datasets have firm-year or country-year data?
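
      A minimal sketch of that append step, using made-up toy data. The variable names (id, year, x) and the tempfile names are assumptions for illustration only, not taken from your files. It shows one way to deal with x being a string in one file and numeric in the other before appending:

```stata
* Toy stand-in for the 1989-1994 X file (hypothetical variables id, year, x)
clear
input id year x
1 1989 2.5
2 1989 3.1
end
tempfile x_early
save `x_early'

* Toy stand-in for the 2004-2014 X file, where x was stored as a string
clear
input id year str4 x
1 2004 "4.0"
2 2004 "3.6"
end

* -append- refuses to mix string and numeric x (unless forced), so convert first
destring x, replace
confirm numeric variable x
tempfile x_late
save `x_late'

* Stack the two files; source is 0 for the early file and 1 for the late file
use `x_early', clear
append using `x_late', generate(source)
tabulate year source
```

      Run this from a do-file rather than line by line, because the tempfile macros only exist while the do-file is running.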

      Y tracked in one dataset from 1989-2014.
      This one you will merge into the master. You'll have to share sample data for others to help you with why the 1:1 merge failed.
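
      For what it's worth, here is a sketch of how the merge could look if both files share a respondent identifier. That is an assumption on my part: id, year, x and y are hypothetical names, and if your surveys interview different people each year there may be no such identifier, in which case an individual-level 1:1 merge is not possible. The sketch also shows why year on its own cannot be the key.

```stata
* Toy stand-in for the combined X file (hypothetical variables id, year, x)
clear
input id year x
1 1989 2.5
2 1989 3.1
1 1990 2.7
1 2004 4.0
2 2004 3.6
end
tempfile x_all
save `x_all'

* Toy stand-in for the Y file covering 1989-2014
clear
input id year y
1 1989 10
2 1989 12
1 1990 11
1 2004 15
2 2004 14
end
tempfile y_all
save `y_all'

use `x_all', clear

* Each year holds many respondents, so year alone has duplicates;
* this is what the "does not uniquely identify" message is about
duplicates report year

* A 1:1 merge needs a key that is unique in both files; -isid- errors out if not
isid id year

* Match X and Y for the same respondent in the same year
merge 1:1 id year using `y_all'
tabulate _merge
```

      In real data, look at the _merge tabulation before running any regression: only observations with _merge == 3 have both X and Y.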