  • Convert R code into Mata

    Dear list members,

    I hope somebody can help me with this problem. I'd like to translate a short R script into Stata and Mata. The existing R code (see below) calculates a cosine distance matrix from the variables x, y, and z in a dataset (testdata.csv) and writes the distance matrix to a new file (distancematrix_test.csv).

    This is the R program, including some output based on a test dataset:

    > # load data
    > mydata = read.csv("testdata.csv")

    > # View the data
    > mydata
        x y   z
    1 0.5 0 0.5
    2 0.5 0 0.5
    3 0.5 0 0.5
    4 0.5 0 0.0
    5 1.0 0 0.0

    > # define distance function (cosine distance)
    > # and apply it to mydata (this is the step that should be done in Mata)

    > cosineDistance <- function(x){
        as.dist(1 - x %*% t(x) / (sqrt(rowSums(x^2) %*% t(rowSums(x^2)))))
    }

    > # convert the data frame to a numeric matrix
    > s <- data.matrix(mydata)

    > # view result (distance matrix)
    > cosineDistance(s)
              1         2         3         4
    2 0.0000000
    3 0.0000000 0.0000000
    4 0.2928932 0.2928932 0.2928932
    5 0.2928932 0.2928932 0.2928932 0.0000000

    > # write result (distance matrix) to disk
    > x <- cosineDistance(s)
    > m <- as.matrix(x)
    > write.table(m, file = "distancematrix_test.csv")
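
    As a sanity check on the output above, the value 0.2928932 can be reproduced by hand. The notation x_i for row i of the data and d(i, j) for the distance between rows i and j is introduced here for illustration only and does not appear in the R code.

    ```latex
    \begin{aligned}
    d(i,j) &= 1 - \frac{x_i \cdot x_j}{\lVert x_i \rVert \, \lVert x_j \rVert} \\[4pt]
    x_1 &= (0.5,\ 0,\ 0.5), \qquad x_4 = (0.5,\ 0,\ 0) \\[4pt]
    x_1 \cdot x_4 &= 0.25, \qquad
    \lVert x_1 \rVert = \sqrt{0.5}, \qquad
    \lVert x_4 \rVert = 0.5 \\[4pt]
    d(1,4) &= 1 - \frac{0.25}{0.5\,\sqrt{0.5}}
            = 1 - \frac{1}{\sqrt{2}}
            \approx 0.2928932
    \end{aligned}
    ```

    Rows 4 and 5 point in the same direction (both lie along the x axis), so their cosine is 1 and the distance is 0, which matches the last entry of the printed matrix.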


    Any help is much appreciated!

    Thanks a lot!!

    Mike

  • #2
    Try the following:

    Code:
    cscript
    mata:
    real matrix cosineDistance(real matrix X)
    {
        real rowvector norm
        real matrix res, rr
        
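        /* Euclidean norm of each row of X, stored as a 1 x n row vector */
        norm = sqrt(rowsum(X:^2))'
        /* rr[i,j] = norm[i]*norm[j], the outer product norm'norm */
        rr = cross(norm, norm)
        /* cross(X', X') is XX', the matrix of pairwise dot products */
        res = 1 :- cross(X', X') :/ rr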
        return(res)
    }
    end
    
    /* read the dataset from csv */
    import delimited testdata.csv, varname(1)
    
    /* put dataset into a Mata matrix */
    putmata X=(x y z)
    
    mata:
    X
    D=cosineDistance(X)
    D
    end
    
    /* put the distance matrix back into a Stata dataset */
    drop _all
    getmata (d*)=D
    
    /* save the dataset to a csv file */
    export delimited distancematrix_test.csv, replace
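
    For readers comparing the Mata function with the R one-liner, the computation can be written in matrix form. The symbols below are notation assumed for this sketch: n is the column vector of row norms, the circled slash denotes elementwise division, and 1 is a column of ones.

    ```latex
    \begin{aligned}
    n_i &= \lVert x_i \rVert = \sqrt{\textstyle\sum_{k} X_{ik}^{2}} \\[4pt]
    D &= \mathbf{1}\mathbf{1}^{\top} - \left( X X^{\top} \right) \oslash \left( n\, n^{\top} \right)
    \end{aligned}
    ```

    In the code, cross(X', X') corresponds to XX', and because the Mata variable norm holds the row vector n', cross(norm, norm) corresponds to nn'. Note that a row of X containing only zeros would make n_i = 0 and leave the ratio undefined; the test data contain no such row.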


    • #3
      It works!! Thank you so much -- you saved my life!!!

      Mike
