  • Comparing data

    Hi,

    I am comparing information from different data sources. The datasets have been merged already and the variable names are different across datasets.

    Dataset 1 includes the variables "vehicle, burglary, murder".

    Dataset 2 includes the variables "motorvehicle, housebreaking, homicides".

    I am trying to compare "vehicle" from Dataset1 to "motorvehicle" from Dataset2.

    Here is what I did:

    local data1 vehicle burglary murder
    local data2 motorvehicle housebreaking homicides
    compare `"`data1'"' `"`data2'"'

    However, I get this error:
    vehicle burglary murder invalid name

    I actually have dozens of variables, which is why I am trying to use locals.

    I would really appreciate your help.

    Thanks

  • #2
    Perhaps the following (untested) code will point in a useful direction.
    Code:
    local data1 vehicle burglary murder
    local data2 motorvehicle housebreaking homicides
    forvalues v = 1/3 {
        local v1 : word `i' of `data1'
        local v2 : word `i' of `data2'
        compare `v1' `v2'
    }
    Alternatively, the following can be easier to keep track of if you have dozens of variables:
    Code:
    local args1 vehicle motorvehicle
    local args2 burglary housebreaking
    local args3 murder homicides
    forvalues v = 1/3 {
        compare `args`v''
    }
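
    As a side note on why the original command failed: each local holds the whole list, so compare received the entire list as its first argument rather than a single variable name. A minimal sketch showing the expansion and how to pull out one pair (the macro names first1 and first2 are only illustrative, and no data need to be in memory to run it):

    ```stata
    local data1 vehicle burglary murder
    local data2 motorvehicle housebreaking homicides
    // each macro expands to the whole list, not to one variable name
    display "data1 expands to: `data1'"
    // extract one name at a time instead (first1 and first2 are illustrative)
    local first1 : word 1 of `data1'
    local first2 : word 1 of `data2'
    display "first pair: `first1' `first2'"
    ```

    The loops above do the same extraction for every position in the two lists.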



    • #3
      Hi William,

      Thank you so much for your help! I tried your suggestion, but there seems to be a syntax issue: "invalid syntax".

      Best,

      Elizabeth



      • #4
        Did that happen for the first suggestion, the second suggestion, or both?



        • #5
          The error for the first suggestion was "invalid syntax". The error for the second suggestion was "varlist required".



          • #6
            Below is demonstration code using invented data. You will see that I corrected two lines in the first suggestion (the loop index is `v', but the two "word" lines referred to an undefined `i'), while the second suggestion is unchanged other than the variable names. This suggests that you made an error when implementing the second suggestion, but since you haven't shown us the code you ran, I cannot tell you what that error was. I have also added a third version that some would consider an improvement on the first.

            If you copy this demonstration code into your do-file editor and run it, you will see that all three versions run as expected.
            Code:
            // create sample data
            clear
            set obs 10
            generate a = _n
            generate b = _n/3
            generate c = _n*_n
            generate x = a
            generate y = b if _n>1
            generate z = c
            replace  z = 42 in 1
            // first version
            local data1 a b c
            local data2 x y z
            forvalues v = 1/3 {
                local v1 : word `v' of `data1'
                local v2 : word `v' of `data2'
                compare `v1' `v2'
            }
            // second version
            local args1 a x
            local args2 b y
            local args3 c z
            forvalues v = 1/3 {
                compare `args`v''
            }
            // third version
            local data1 a b c
            local data2 x y z
            while "`data1'"!="" {
                gettoken v1 data1 : data1
                gettoken v2 data2 : data2
                compare `v1' `v2'
            }
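
            With dozens of variables, hard-coding 1/3 becomes error-prone. A minimal sketch, assuming the two lists are meant to pair up one-to-one, that counts the words in each list and stops if the lengths differ (n1 and n2 are illustrative names, and the data are the same kind of invented variables as above):

            ```stata
            // create sample data (invented, as in the demonstration above)
            clear
            set obs 10
            generate a = _n
            generate b = _n/3
            generate c = _n*_n
            generate x = a
            generate y = b if _n>1
            generate z = c
            // the two lists to be paired position by position
            local data1 a b c
            local data2 x y z
            local n1 : word count `data1'
            local n2 : word count `data2'
            // stop early if the lists cannot be paired one-to-one
            if `n1' != `n2' {
                display as error "lists differ in length: `n1' vs `n2'"
                exit 198
            }
            forvalues v = 1/`n1' {
                local v1 : word `v' of `data1'
                local v2 : word `v' of `data2'
                display as text _newline "comparing `v1' with `v2'"
                compare `v1' `v2'
            }
            ```

            Run the whole block at once from the do-file editor: local macros defined in one run are not available in a later, separate run.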
            With that said, some further advice. Please review the Statalist FAQ linked to from the top of the page, as well as from the Advice on Posting link on the page you used to create your post. Note especially sections 9-12 on how to best pose your question.

            When asking for help with code, always show the code you ran, using CODE blocks to present it, and provide example data, using dataex to present it. Both CODE blocks and dataex are described in the FAQ. If your original presentation had included 10 observations of sample data, you would have had this answer 16 hours sooner.

            The more you help others understand your problem, the more likely others are to be able to help you solve your problem.



            • #7
              Thank you so much William!!
