  • Merging Variables

    Hi! I am very new to Stata and not at all confident with it. I am using the European Social Survey Dataset Round 9 (2018/9). I want to compare Finland with Britain in my analysis. One of my independent variables is level of education, and I've recoded Finland's education responses to either 'degree or higher' or 'no degree'. When I came to the UK's responses, I found that for some reason there are two variables for education (edubgb2 and educgb1, labelled 'highest level of education PhD or equiv' and 'highest level of education 2 or more A-levels', respectively). I want to merge these variables so I can recode them in the same way I recoded Finland's. The only help guides I can find are on merging whole datasets, not variables within the same dataset. Please help! (TIA)

  • #2
    It sounds like you need to append the datasets rather than merge them. If this is the case, you need the variable names and variable types (numeric or string) to match across both datasets. See

    Code:
    help append
    Here is an example:

    Code:
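    * toy Finland data, saved to a temporary file
    clear
    input str20(id education)
    "F001" "no degree"
    "F002" "no degree"
    "F003" "degree or higher"
    end
    tempfile Finland
    save `Finland'
    
    * toy UK data, left in memory
    clear
    input str20(id education)
    "GB012" "educgb2"
    "GB013" "educgb2"
    "GB014" "educgb1"
    end
    
    * flag the UK rows, stack the Finland rows underneath, then flag those as 0
    gen GB=1
    append using `Finland'
    replace GB=0 if missing(GB)
    * l is an abbreviation of list
    l, sepby(GB)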
    Res.:

    Code:
    . l, sepby(GB)
    
         +-------------------------------+
         |    id          education   GB |
         |-------------------------------|
      1. | GB012            educgb2    1 |
      2. | GB013            educgb2    1 |
      3. | GB014            educgb1    1 |
         |-------------------------------|
      4. |  F001          no degree    0 |
      5. |  F002          no degree    0 |
      6. |  F003   degree or higher    0 |
         +-------------------------------+


    • #3
      I'm interpreting your question and situation differently than Andrew. I'm thinking that you already do have the UK and Finland data appended into one data set, but that you need help to create a new education variable that harmonizes the several different variables. I'm also guessing that the variables are numeric, with value labels attached to them.

      If my interpretation is right, we would need a clearer and more detailed description of the variables involved, since most of us don't happen to work with that data set. If I'm on target, what you need to tell us is exactly how Finland's education variable is coded, and exactly how the two variables are coded for the UK. To give us this information, please re-read the Statalist FAQ for new users, with particular attention to section 12.2 on using the -dataex- command to create example data, and use that command to prepare a small example. Your example should include your country variable and the three education variables. We would also need to know how the numerical codings for these variables correspond to the verbal descriptions of such things as "highest level of education PhD."
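
      For illustration only, here is a minimal sketch of what such a harmonization might look like once the codings are known. Everything about the data is assumed: the toy values, the country variable name cntry, the name degree_fi for the already recoded Finnish variable, and above all the two coding rules, which are placeholders to be replaced with the real ESS codings. Only the names edubgb2 and educgb1 come from post #1.

      Code:
      clear
      input str2 cntry byte(degree_fi edubgb2 educgb1)
      "FI" 0 . .
      "FI" 1 . .
      "GB" . 1 .
      "GB" . 5 .
      "GB" . . 2
      end

      * Finland: carry over the variable that is already recoded to 0/1
      generate byte degree = degree_fi if cntry == "FI"

      * HYPOTHETICAL rule: edubgb2 codes 1 to 4 count as "degree or higher"
      replace degree = inrange(edubgb2, 1, 4) if cntry == "GB" & !missing(edubgb2)

      * HYPOTHETICAL rule: UK respondents with a value only on educgb1 have no degree
      replace degree = 0 if cntry == "GB" & missing(degree) & !missing(educgb1)

      label define degree 0 "no degree" 1 "degree or higher"
      label values degree degree
      tabulate cntry degree, missing

      The point of the sketch is the pattern, not the cutoffs: one new variable is filled country by country with -generate- and -replace-, so no -merge- or -append- is involved when everything is already in one data set.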


      • #4
        Hi both, thank you so much for reaching out. I have been able to locate two separate datasets, one for Finland and one for the UK, rather than a single dataset for all of Europe. They can share the same do-file, as the variables are coded exactly the same. Hopefully, this will allow me to contrast the results from each country's analysis, although I will have to run the do-file twice. Thanks again! :-)
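
        In case it is useful to anyone reading later, running the same steps on both files could be wrapped in a loop, roughly as sketched below. The two toy datasets and the variable degree are made up for the example; in practice the -use- line would point at the real Finland and UK files, and the -tabulate- line would be replaced by the analysis commands or a -do- call.

        Code:
        * toy stand-in for the Finland file
        clear
        input byte degree
        0
        1
        1
        end
        tempfile FI
        save `FI'

        * toy stand-in for the UK file
        clear
        input byte degree
        0
        0
        1
        end
        tempfile GB
        save `GB'

        * run the same commands on each file in turn
        foreach c in FI GB {
            * ``c'' expands first to the macro name (FI or GB), then to that tempfile's path
            use "``c''", clear
            display "Country: `c'"
            tabulate degree
        }

        The loop only saves running the do-file by hand twice. Comparing the countries in a single model would still need the files appended with a country indicator, as in #2.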
