Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Select & Merge

    Good morning!

    I have 2 files (observations = 43 617). In the first file are countries with a lot of information (in the example I will not write down them, just the necessary variables). In the second file is the unemployment information about the countries.

    I want to merge these file, BUT unemployment rates are expected to have a delayed impact, affecting one’s job situation after some time, I want to assign respondents with the quarterly unemployment rate that prevailed in their respective countries three months before the date (month/year).

    For example: Belgium 2004. April. (1. File) necessary Belgium 2004. Q1 -------> 4 (April) - 3 = 1 (January - Q1)

    I show an example:

    The first file:
    Country Year Month
    Austria 2004 January
    Austria 2004 June
    Belgium 2004 April
    Belgium 2004 May
    Germany 2004 May
    Germany 2011 May
    Germany 2013 May
    France 2011 September
    France 2013 December
    The second file (it contains unemployment rate of country):
    Country Time Value
    Austria 2003 - Q4 1.12
    Austria 2004 - Q1 1.14
    Belgium 2004 - Q1 1.23
    France 2011 - Q2 1.56
    Germany 2004 - Q1 1.33
    Germany 2011 - Q1 1.39
    Germany 2013 - Q1 1.30
    France 2013 - Q3 1.51
    • January, February, and March (Q1)
    • April, May, and June (Q2)
    • July, August, and September (Q3)
    • October, November, and December (Q4)
    Thank you

  • #2
    Are you going to use the lagged unemployment rate in a regression? In that case, after properly using -xtset-, you can just use the -lag- operator in your regression, i.e.

    Code:
    reg y x l3.unemployment

    Comment


    • #3
      Originally posted by Ulrich Wohak View Post
      Are you going to use the lagged unemployment rate in a regression? In that case, after properly using -xtset-, you can just use the -lag- operator in your regression, i.e.

      Code:
      reg y x l3.unemployment


      Hello
      Thank you for your answer

      Can you explain it more excatly, because I do not understand how I should do.

      Should not I merge these two files first?

      Comment


      • #4
        Yes, apologies for my incomplete response.

        If you work with quarterly data you need to aggregate your first data set to quarterly data as it is on a monthly basis at the moment. You can create a variable to indicate the quarter by

        Code:
        gen quarter= ""
        replace quarter = "Q1" if inlist(Month, "January", "February", "March")
        replace quarter = "Q2" if inlist(Month, "April", "May", "June")
        and so on.

        We can then create the Time variable that you have in your second data set in your first data set by
        Code:
        gen Time = Year + " - " + quarter
        assuming that the variable Year is string.

        You will then probably want to aggregate it from monthly to quarterly using

        Code:
        collapse (stat1) varlist1 (stat2) varlist2, by(Country Time)
        where stat1/stat2 depends on how you want to aggregate it (i.e. mean, sum, etc) and varlist1/varlist2 are the corresponding variables for the stat. You will need to check how you want to do this. You can then merge your data sets with

        Code:
        merge 1:1 Country Time using name_of_second_data_set.dta
        and you should then be able to apply whatever analysis you want. Note that this is not a very elegant suggestion as it does not use proper time formatting, but it should get you started. Hope that helps.

        Comment


        • #5
          Originally posted by Ulrich Wohak View Post
          Yes, apologies for my incomplete response.

          If you work with quarterly data you need to aggregate your first data set to quarterly data as it is on a monthly basis at the moment. You can create a variable to indicate the quarter by

          Code:
          gen quarter= ""
          replace quarter = "Q1" if inlist(Month, "January", "February", "March")
          replace quarter = "Q2" if inlist(Month, "April", "May", "June")
          and so on.

          We can then create the Time variable that you have in your second data set in your first data set by
          Code:
          gen Time = Year + " - " + quarter
          assuming that the variable Year is string.

          You will then probably want to aggregate it from monthly to quarterly using

          Code:
          collapse (stat1) varlist1 (stat2) varlist2, by(Country Time)
          where stat1/stat2 depends on how you want to aggregate it (i.e. mean, sum, etc) and varlist1/varlist2 are the corresponding variables for the stat. You will need to check how you want to do this. You can then merge your data sets with

          Code:
          merge 1:1 Country Time using name_of_second_data_set.dta
          and you should then be able to apply whatever analysis you want. Note that this is not a very elegant suggestion as it does not use proper time formatting, but it should get you started. Hope that helps.



          Hi
          I have problem here:
          CODE]
          gen Time = Year + " - " + quarter
          [/CODE]


          The problem: Quarter is string but the the Year is int, therefore I got error message (type mismatch)

          What should I do?

          Thank you for your patient

          Comment


          • #6
            Originally posted by Ulrich Wohak View Post
            Yes, apologies for my incomplete response.

            If you work with quarterly data you need to aggregate your first data set to quarterly data as it is on a monthly basis at the moment. You can create a variable to indicate the quarter by

            Code:
            gen quarter= ""
            replace quarter = "Q1" if inlist(Month, "January", "February", "March")
            replace quarter = "Q2" if inlist(Month, "April", "May", "June")
            and so on.

            We can then create the Time variable that you have in your second data set in your first data set by
            Code:
            gen Time = Year + " - " + quarter
            assuming that the variable Year is string.

            You will then probably want to aggregate it from monthly to quarterly using

            Code:
            collapse (stat1) varlist1 (stat2) varlist2, by(Country Time)
            where stat1/stat2 depends on how you want to aggregate it (i.e. mean, sum, etc) and varlist1/varlist2 are the corresponding variables for the stat. You will need to check how you want to do this. You can then merge your data sets with

            Code:
            merge 1:1 Country Time using name_of_second_data_set.dta
            and you should then be able to apply whatever analysis you want. Note that this is not a very elegant suggestion as it does not use proper time formatting, but it should get you started. Hope that helps.

            Hi
            I have already solved the previous question. I have Time variable and also the Merge is done.

            Just the last question:
            Does it contain also the delay impact?

            "I want to merge these file, BUT unemployment rates are expected to have a delayed impact, affecting one’s job situation after some time, I want to assign respondents with the quarterly unemployment rate that prevailed in their respective countries three months before the date (month/year).

            For example: Belgium 2004. April. (1. File) necessary Belgium 2004. Q1 -------> 4 (April) - 3 = 1 (January - Q1)"

            Thank you

            Comment

            Working...
            X