Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • merge daily and annual data

    Good evening,

    I have an assignment and now I should merge daily data ( stock prices from 2008 till 2015 retrieved from Datastream) and annual data ( from 2008-2015). My issue is that I cannot merge the data, when I try I got error.. Below I have an example for each of my data.
    Thank you very much in advance.
    Daily data:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str8 Country str15 Code double delta_log_price float(date SMP1 SMP2 LTRO1 LTRO2 OMT1 OMT2 OMT3 OMT4 APP)
    "Ireland" "AU000000JHX1(P)" -.016725128541631662 19877 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .002779709920497364 19870 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .004705891037412702 18060 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .004960054258250857 20390 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .024845998586530804 18252 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .006594465725280774 19634 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .04686430897621774 17730 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.011034594723709157 18175 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.0023174981403624173 20363 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.007317105817067127 18241 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .026623868712171372 17643 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.031252543504104446 18892 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .007968169649176904 17592 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.010695289116747997 20138 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.007393749024938279 18564 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .04271772968515419 18760 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .01959566294505022 19157 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .016973533379705863 18135 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.012409673215712402 19606 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.015345569674660275 19183 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .021779445264039854 18571 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.039716411212505465 17553 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .05011487012562332 17745 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .015504186535965303 18920 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" .011560822401076143 18044 0 0 0 0 0 0 0 0 0
    "Ireland" "AU000000JHX1(P)" -.026798193154724093 17982 0 0 0 0 0 0 0 0 0

  • #2
    Annual data:

    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str4 year str2 counntry str12 code double(Ebitda Equity Tassets)
    "2008" "ES" "ES0109658031" . . .
    "2008" "ES" "ES0109642035" -62.37994000000001 . .
    "2008" "ES" "ES0105157038" -41.7491 . 12285
    "2008" "ES" "ES0107390033" -79.65709 . 10879
    "2008" "ES" "ES0156303002" . . 11346.6403
    "2008" "ES" "ES0157076037" -36.71439 . 11493.35476
    "2008" "ES" "ES0109654030" -75.12382000000001 . 9690.67692
    "2008" "IT" "IT0005204729" . . 8734.12394
    "2008" "ES" "ES0105158036" -38.94843 . 16303.24247
    "2008" "ES" "ES0114493036" -51.08692 . 17968.53492
    "2008" "IT" "IT0001008876" 42404 10232 18121.68116
    "2008" "IT" "IT0001233417" 1128000 3874000 18060.6699
    "2008" "ES" "ES0184836031" -51.34407 . 19425.87729
    "2008" "ES" "ES0109659013" . . 24783.02222
    "2008" "ES" "ES0109653032" -69.19242999999999 . 27744.24308
    "2008" "IE" "IE0000020408" 39889 207940 32418.27782
    "2008" "ES" "ES0190053035" -51.0532 . 2794.627700000001
    "2008" "ES" "ES0190054009" . . 2765.10474
    "2008" "ES" "ES0105200416" 466070 406789 2851.83706
    "2008" "ES" "ES0105191037" -36.22676 . 3107.55266
    "2008" "IE" "IE00B1VS3F54" . . 3365.50097
    "2008" "ES" "ES0111845014" 2261886 3372601 3990.50988
    "2008" "ES" "ES0105202032" -98.53948 . 4182.01981
    "2008" "IT" "IT0005160996" . . 4715.43863
    "2008" "ES" "ES0105204038" -34.03163 . 5015.35067
    "2008" "ES" "ES0105261038" -17.139 . 5316.547030000001

    Comment


    • #3
      Thank you for using -dataex- to show the data. But without seeing what code you tried and what response you got from Stata, it isn't possible to troubleshoot what you are doing. Saying that you got an error is not helpful: we don't know what command you were trying, and we don't know what error you actually got. When looking for help with code, you should always show the exact commands you gave and the full output/response you got from Stata. That should be done by copying and pasting directly from the Results window or your log file--do not edit it in any way, as details are often crucial. And you should wrap all of that in code delimiters. (Your -dataex- output should also be in code delimeters--and -dataex- automatically provides them as part of its output. So you should be more attentive to copying precisely what the -dataex- output tells you to copy to get the best possible results.)

      I should also note that if this is a school assignment, there is a generally agreed upon policy here that we do not provide help with those because there is an expectation that students will complete their assignments on their own.

      Comment


      • #4
        Thank you very much for your response. You are right I should had added the command that I used. Please find it below as well as, the error message that I got:
        db merge

        merge 1:1 Code using "C:\Users\Evangelia\Desktop\\Data\Stata part II\final stock prices.dta"
        variable Code does not uniquely identify observations in the master data

        Comment


        • #5
          That error message means exactly what it says. When you use -merge 1:1 Code-, then each data set must contain at most one observation for any given value of Code. This is clearly not the case in your data. The daily data contain large numbers of observations having the same values of Code. This data is simply not suitable for -merge 1:1-. By contrast, in the yearly data, at least in the portion of it that you show, any given value of Code appears only once.

          So you need to do -merge m:1 Code using...-

          Comment


          • #6
            I tried also m:1, but I got the same error message.

            merge m:1 Code Country using "C:\Users\Evangelia\Data\Stata part II\all orbis.dta"
            variables Code Country do not uniquely identify observations in the using data
            r(459);

            Comment


            • #7
              Evangelia: read the error message, it is telling you in plain English what is wrong. This one says that Code and Country do not uniquely identify observations in the using data (all orbis.dta). Now, in the example data you showed, Code alone sufficed to do that. But perhaps that is not true in the full data set. Perhaps in the full data set each Code appears several times, in different years.

              So, we cannot go to -merge m:m- as a way out of this because -merge m:m- does something crazy that just produces a jumbled data set. And it is unlikely that what you want is -joinby-. I think the missing ingredient here is a date variable.

              Now, one of your data sets has yearly dates, and the other has only years. I would suggest that in the master data set you create a variable that has the year only:

              Code:
              gen year = yofd(date)
              Then I think you will have success with -merge m:1 Code year using...- or perhaps -merge m:1 Code Country year using...-.

              Note: Your variable names are changing from post to post. There is a variable Country in the daily data set shown in #1, but the yearly data shown in #2 does not have any such variable. It has a variable counntry, which may look like a good match to Country to human eyes, but to Stata they are different as night and day. I take it that you corrected counntry to Country in the interim. Good. Make sure you have identical variable names in both data sets for whatever is supposed to be the same variable: no alternative spellings, and no changes in case.

              If that is not right, post back with examples of your data that, when run, reproduce the problem you are encountering.

              Comment


              • #8
                Good evening,

                I created the year variable that you suggested me and I made sure that the variable have the same name at the two datastes.

                part 1:
                Code:
                * Example generated by -dataex-. To install: ssc install dataex
                clear
                input str8 Country str18 Code double delta_log_price float(date SMP1 SMP2 LTRO1 LTRO2 OMT1 OMT2 OMT3 OMT4 APP) int year
                "Ireland" "AU000000JHX1(P)"  -.010241494052196876 -638817 0 0 0 0 0 0 0 0 0  210
                "Ireland" "AU000000JHX1(P)" -.0006624710411213819 -637287 0 0 0 0 0 0 0 0 0  215
                "Ireland" "AU000000JHX1(P)"  -.008867926869660092 -637207 0 0 0 0 0 0 0 0 0  215
                "Ireland" "AU000000JHX1(P)"  -.005029348405002132   17652 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"                     0   17892 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .016807118316381375   17874 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .016949558313773094   17546 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .027715399846546918   17876 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"  -.004995847193371539   17646 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"  -.044137790825711454   17799 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .010810916104215584   17602 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"  -.019204979836050073   17863 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .042559614418795855   17857 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .004995847193371539   17650 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"  .0016934805063331115   17574 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"  -.005797117684325892   17694 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .014098924379501332   17812 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"   .023940207546008452   17744 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"    .06071903714955518   17532 0 0 0 0 0 0 0 0 0 2008
                "Ireland" "AU000000JHX1(P)"    .01271203558836187   17772 0 0 0 0 0 0 0 0 0 2008
                end
                format %td date

                Comment


                • #9
                  part 2:

                  CODE]
                  * Example generated by -dataex-. To install: ssc install dataex
                  clear
                  input int year str8 Country double(Q ROA ROE FSize SEquity FLeverage Inventory) str18 Code
                  2008 "Ireland" . . . 1393198 . . . "0(P)"
                  2008 "Ireland" . . . 1125854 . . . "0(P)"
                  2008 "Ireland" . . . 8346221 3815047 . . "0(P)"
                  2008 "Ireland" .661 .9043049 -1955842 1208198 -.462361 .7347131 9329713 "AU000000JHX1(P)"
                  2008 "Ireland" . . . 1306216 . . . "AU000000ONE9(P)"
                  2008 "Spain" . -.0481803 . 8445859 . . . "ES0100081035(P)"
                  2008 "Spain" . -.0013651 . 1107778 . . . "ES0100096033(P)"
                  2008 "Spain" . . . . . . . "ES0101382036(P)"
                  2008 "Spain" . -.0087624 . 8309425 . . . "ES0105000030(P)"
                  8 "Spain" . -.0057251 . 8564097 . . . "ES0105781035(P)"
                  2008 "Spain" . .5877839 . 8481194 . . . "ES0105786034(P)"
                  2008 "Spain" . -.0059476 . 8502783 . . . "ES0105787032(P)"
                  2008 "Spain" . . . 8397789 . . . "ES0105822037(P)"
                  end
                  [/CODE]

                  But again I got the same error message..
                  merge m:1 year Code using "C:\Users\Εvangelia\Data\Stata part II\final stock prices.dta"
                  variables year Code do not uniquely identify observations in the using data

                  Comment


                  • #10
                    OK, so looking at your part 2 data in #9, you need look no farther than your first three observations to see that this data set is not yearly data about whatever it is that Code stands for. The first three observations all purport to be about Code 0(P) in year 2008. The only other non-missing variable in those three records is FSize. FSize does not appear as a variable in the Part 1 data in #8, so it is not going to help with the merging.

                    So the problem here is that your part 2 data is simply not what you think and say it is. The next step is to figure out what's going on with it.

                    1. Perhaps the data is supposed to be yearly data about each Code and two of those first three observations aren't supposed to be there. Perhaps they are wrong and you need to drop them. (It seems odd that the code in these three observations is 0(P)--which does not resemble any of the other codes in either data set.) Or perhaps you need to convert this into genuine yearly data about each Code by somehow combining them into a single observation. (?Averaging Fsize or something like that.

                    2. Perhaps you understanding of this data set is just wrong; it isn't supposed to be yearly data per Code and you just misunderstood it.

                    So you need to figure out what this data set really is. As it stands, it simply isn't what you say it is and it can't be used for your stated purpose. You need to either fix it, or find a different data set that really is yearly data for each Code.

                    Comment

                    Working...
                    X