  • merging problem

    Hello,

    I'm not sure what the problem is with my code. I have created 2 datasets, and both have the same date variable in the same format, but when I try to merge the two, for some reason they are matched strangely. When I check with the tab function, I get these really weird results:

               |                       hml_group
     smb_group |         1          2          3          4          5 |     Total
    -----------+-------------------------------------------------------+----------
             1 |     5,587          0          0          0          0 |     5,587
             2 |         0      5,587          0          0          0 |     5,587
             3 |         0          0      5,587          0          0 |     5,587
             4 |         0          0          0      5,587          0 |     5,587
             5 |         0          0          0          0      5,587 |     5,587
    -----------+-------------------------------------------------------+----------
         Total |     5,587      5,587      5,587      5,587      5,587 |    27,935


    So I'm guessing the merge function is the problem.
    Any help, please?

  • #2
    Your question really isn't clear without more detail, or at a minimum it is too difficult to guess at a good answer from what you have shared. Please help us help you. Show example data. Show your code. Show us what Stata told you. Tell us what precisely is wrong. The Statalist FAQ provides advice on effectively posing your questions, posting data, and sharing Stata output.


    • #3
      What William Lisowski said, and this: your problem is said to arise when dates are present, but your question says absolutely nothing about dates, either by way of data example, or by showing how your code handles them.

      FWIW, merge and tabulate are commands, not functions.


      • #4
        @William Lisowski @Nick Cox, so sorry for the confusion.

        The first dataset looks something like this:

        date_m   com      date        com_cap  smb_group
        2018m12  Comp394  12/31/2018    77380          1
        2018m12  Comp398  12/31/2018    72000          1
        2018m12  Comp676  12/31/2018    58104          1
        2018m12  Comp408  12/31/2018    46985          1
        2018m12  Comp675  12/31/2018    49511          1

        The 2nd dataset looks like this:

        date_m   com      date        com_book  hml_group
        2018m12  Comp407  12/31/2018       .28          1
        2018m12  Comp152  12/31/2018       .46          1
        2018m12  Comp585  12/31/2018       .42          1
        2018m12  Comp257  12/31/2018       .37          1
        2018m12  Comp393  12/31/2018       .32          1


        When I try to merge these 2 datasets using the merge command, I get the following table:

        date_m   smb_group  smb_factor  hml_group  hml_factor
        2018m12          1   -.0357583          1   -.0357583
        2018m12          1   -.0357583          1   -.0357583
        2018m12          1   -.0357583          1   -.0357583
        2018m12          1   -.0357583          1   -.0357583
        2018m12          1   -.0357583          1   -.0357583
        2018m12          1   -.0357583          1   -.0357583


        (I dropped the unimportant variables.) Now when I run the tab command, I get a weird outcome.
        Each group should contain different numbers, not this, so I'm guessing the merge command is the problem.

                   |                       hml_group
         smb_group |         1          2          3          4          5 |     Total
        -----------+-------------------------------------------------------+----------
                 1 |     5,587          0          0          0          0 |     5,587
                 2 |         0      5,587          0          0          0 |     5,587
                 3 |         0          0      5,587          0          0 |     5,587
                 4 |         0          0          0      5,587          0 |     5,587
                 5 |         0          0          0          0      5,587 |     5,587
        -----------+-------------------------------------------------------+----------
             Total |     5,587      5,587      5,587      5,587      5,587 |    27,935


        Could you help, please?


        • #5
          I cannot tell you what's wrong with your merge command if you cannot tell us exactly what your merge command was.

          The more you help others understand your problem, the more likely others are to be able to help you solve your problem. And you will need to attract someone other than me, because I am away from my desk for several days starting in a few hours.

          Do follow the link to the Statalist FAQ in post #2.
          Last edited by William Lisowski; 03 Jun 2022, 19:15.


          • #6
            @William Lisowski
            My merge command is the following:

            use merge1, clear
            merge m:m date_m using merge2

            And then I used the following tab command:
            tab2 smb_group hml_group


            • #7
              merge m:m is almost always not what you want. Otherwise, to the best of my understanding, you need something more like

              Code:
              merge 1:1 date_m com using merge2
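
              To illustrate why merge m:m can be a trap, here is a minimal sketch on made-up data. The company names, group values, and tempfile names are hypothetical and are not taken from the poster's files. Within each value of date_m, merge m:m pairs observations by their position and ignores com, so if both files happen to be sorted by group within month, the first group in one file is paired with the first group in the other, and so on. Whether that is what happened in the original data is an assumption.

              ```stata
              * Toy master file: three hypothetical companies in one month
              clear
              input str5 com byte smb_group
              "CompA" 1
              "CompB" 2
              "CompC" 3
              end
              generate date_m = ym(2018, 12)
              format date_m %tm
              sort date_m smb_group
              tempfile smb
              save `smb'

              * Toy using file: same companies, different group assignment
              clear
              input str5 com byte hml_group
              "CompC" 1
              "CompA" 2
              "CompB" 3
              end
              generate date_m = ym(2018, 12)
              format date_m %tm
              sort date_m hml_group
              tempfile hml
              save `hml'

              * m:m on date_m alone: rows are paired by position within month
              use `smb', clear
              merge m:m date_m using `hml'
              tabulate smb_group hml_group

              * 1:1 on date_m and com: each company is paired with itself
              use `smb', clear
              merge 1:1 date_m com using `hml'
              tabulate smb_group hml_group
              ```

              Comparing the two tables shows whether the pairing follows row order or the company identifier.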


              • #8
                Nick Cox, I tried that one, but I get the same results:
                the tab command gives me the same outcome.


                • #9
                  I don't think we can help more without more details on why the merge didn't do what you expect.
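
                  One way to gather the kind of detail asked for here is sketched below. It reuses the file names merge1 and merge2 and the variable names from the earlier posts; that date_m and com are meant to identify each observation uniquely is an assumption, and the checks are only a starting point, not a diagnosis.

                  ```stata
                  * Check storage types, formats, and duplicate keys in each file
                  use merge1, clear
                  describe date_m com
                  duplicates report date_m com

                  use merge2, clear
                  describe date_m com
                  duplicates report date_m com

                  * Merge on both keys and look at the match status first
                  use merge1, clear
                  merge 1:1 date_m com using merge2
                  tabulate _merge

                  * Cross-tabulate matched observations only, then eyeball a few rows
                  tabulate smb_group hml_group if _merge == 3
                  list date_m com smb_group hml_group _merge in 1/10
                  ```

                  Posting the output of these commands would show how many observations matched and how the keys are stored in each file.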
