
  • How to create a vector for every observation and compute distance scores?

    Hi

    I have 100 observations and 4 variables. The first variable is a string and the remaining 3 contain numeric values. I need to create 100 vectors, one per observation (up to _N), each of dimension 1x3 (see below).

    var1 var2 var3 var4
    A1= [3 4 5]
    A2= [5 5 5]
    .
    .
    A100

    Then I need to do matrix operations to compute Euclidean distance between every pair. Let me know if there is a way this task can be achieved.

    Thanks for your help.

    Best
    Veeresh

  • #2
    There are a number of matrix-oriented ways to do this, but they aren't the easiest way, I don't think. Assuming that the values of each of your vectors A1, ... A_N come from the corresponding values of var2-var4, I would use the -cross- command to form all possible pairs and just write the distance expression you want:

    Code:
    // Example data
    clear
    input  var1 var2 var3 var4
    1 1 2 3
    2 4 5 6
    3 7 8 9
    end
    // Copy data set with different names.
    preserve
    rename (*) (other_*)
    tempfile other
    save `other'
    restore
    // Form all pairwise combinations of observations
    cross using `other'
    keep if var1 <= other_var1  // keep each unordered pair once (self-pairs remain, with dist = 0)
    gen dist = sqrt((var2 - other_var2)^2 + (var3 - other_var3)^2 + (var4 - other_var4)^2)
    If you are committed to a matrix solution, you can save your data in a Stata matrix with -mkmat- or in a Mata matrix with -putmata- and go from there, but I'm guessing that's not really what you want.
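
    For the matrix route, a minimal Mata sketch might look like the following. It assumes the original data (one row per observation, with numeric var2-var4) are in memory, so it should be run before -cross-, not after. The names X, D, n, i and j are illustrative only, and it reads the data with st_data() rather than -putmata-, which is just one of several ways to get the variables into Mata.

    ```stata
    mata:
    // Read var2-var4 into an n x 3 real matrix, one row per observation
    X = st_data(., ("var2", "var3", "var4"))
    n = rows(X)
    // D[i, j] will hold the Euclidean distance between observations i and j
    D = J(n, n, 0)
    for (i = 1; i <= n; i++) {
        for (j = i + 1; j <= n; j++) {
            D[i, j] = sqrt(sum((X[i, .] - X[j, .]) :^ 2))
            D[j, i] = D[i, j]   // the distance matrix is symmetric
        }
    }
    D
    end
    ```

    The result is a symmetric n x n matrix with zeros on the diagonal. With 100 observations that is small, but the double loop does grow with the square of the number of observations.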
