Question about rcall to call an R script from within Stata

Pierre Azoulay

Join Date: Sep 2014

Posts: 6
#1

11 Jun 2019, 11:02

I just downloaded rcall from GitHub to use R from within Stata. (E.F. Haghish; http://www.haghish.com/packages/Rcall.php)

I want to use it in batch mode. In my do file, I enter:

rcall vanilla: source("${F7}yellow_berets/analysis/penalized_logit.R")

The shell window opens, and the program starts to execute. However, at some point it stops with the following error message:

lhs not found
r(111);

lhs is the response that I am trying to predict in the R program. To be clear: penalized_logit.R runs just fine in RStudio and in R. It is a simple program, nothing fancy (see below). The error message (and the fact that it shows up 30 seconds in) tells me that R began executing the program. But under rcall it chokes in the middle and stops. Why might something work fine in RStudio, and not with rcall?

If someone has any idea about what might be going on, I would be grateful if you could let me know.
Regards,

Pierre

PS: it should not matter, but here is my R script:

#uncomment installs for first time
#install.packages("glmnet")
#install.packages("tidyverse")

#required libraries to run script
library(glmnet) #required for plogit
library(haven) #required to read in .dta files

#working directories
pierre <- "D:\\Dropbox (Personal)\\yellow_berets\\analysis"

setwd(pierre)

#begin log
sink("./penalized_logit.log")

#clear environment
rm(list=ls())

#data
load("pnlzd_logit_data.rdata")
pnlzd_logit_data <- data.matrix(pnlzd_logit_data)

#glmnet fit
rhs <- subset(pnlzd_logit_data, select = -c(person_uuid, treat))
lhs <- subset(pnlzd_logit_data, select = treat)
cvfit = cv.glmnet(rhs, lhs, family="binomial", intercept=TRUE)

#to probability (rhs)
betas <- coef(cvfit, s="lambda.min")
#odds_rhs <- exp(betas)[nonzero_betas]
#nonzero_betas <- nonzeroCoef(betas)
#prob_rhs <- odds_rhs/(1+odds_rhs)
#names(prob_rhs) <- row.names(betas)[nonzero_betas]

#to probability (person_uuid)
prob_person <- predict(cvfit, newx = rhs, type = "response", s = "lambda.min")
row.names(prob_person) <- as.character(pnlzd_logit_data[,"person_uuid"])

#stata export
doctor_probabilities <- data.frame(cbind(pnlzd_logit_data[,"person_uuid"], prob_person))
colnames(doctor_probabilities) <- c("person_uuid", "probability")
write_dta(doctor_probabilities, "doctor_probabilities.dta", version = 13)

#print values
#cat("\n") adds spaces to the log to separate printed values
options(max.print=10000)
print("Coefficient Values")
betas
cat(c("\n","\n","\n"))

print("Lambda Value Used")
cvfit$lambda.min
cat(c("\n","\n","\n"))

print("Doctor Probabilities")
doctor_probabilities

#end log
sink()
haghish

Join Date: Aug 2014

Posts: 201
#2

12 Jun 2019, 01:41

Your code looks fine. Mind you, RStudio is "more" than R. You might try calling R from the command line and sourcing the file to make sure the error is not coming from R itself. If command-line R can run your code, please send me an example of your data and I will look into it. [email protected].

Right now, I don't remember how rcall handles a change of working directory. rcall and Stata communicate data, and changing the R working directory within an R script file (i.e., Stata wouldn't know that R has changed its WD) could possibly mess things up at the end of the program. But I can't be certain about your code without having my hands on the data.
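
For the command-line check suggested above, one possible sketch is to launch a fresh, plain R process with Rscript and look at its exit status. The temporary script here is only a stand-in so the example runs anywhere; in practice you would point `script` at penalized_logit.R. The variable names are illustrative, not part of rcall.

```r
# Stand-in script (assumption): replace `script` with the path to
# penalized_logit.R when testing the real program.
script <- tempfile(fileext = ".R")
writeLines('cat("ran in plain R\\n")', script)

# Locate the Rscript binary that belongs to the running R installation.
rscript <- file.path(R.home("bin"), "Rscript")

# Run the script in a new R process, with no RStudio and no saved workspace.
# system2() returns the exit status; 0 means the script finished without error.
status <- system2(rscript, c("--vanilla", shQuote(script)))
status
```

A non-zero status would suggest the problem is in the R script itself rather than in rcall.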
haghish

Join Date: Aug 2014

Posts: 201
#3

12 Jun 2019, 07:05

After a test, it seems that changing the working directory is indeed troublesome: it causes rcall not to return any data to Stata. That is likely the reason you are getting the error. Try changing the WD in Stata and removing the setwd() call from the R code.
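
A minimal sketch of what the R side could look like without setwd(): file locations are built as full paths from a base directory, so R's working directory never changes. The helper `analysis_path` and the variable `base_dir` are hypothetical names, and `tempdir()` stands in for the real analysis folder so the example runs anywhere. On the Stata side, the working directory would be set with `cd` before the rcall line.

```r
# Hypothetical helper: build a full path under a base directory
# instead of changing the working directory with setwd().
analysis_path <- function(base_dir, ...) {
  file.path(base_dir, ...)
}

# Stand-in (assumption): in the original script this would be the
# yellow_berets/analysis folder.
base_dir <- tempdir()

wd_before <- getwd()

# Open the log by full path, write to it, and close it again.
log_file <- analysis_path(base_dir, "penalized_logit.log")
sink(log_file)
print("Coefficient Values")
sink()

# The working directory is unchanged, so whatever launched R
# still sees the directory it expects.
wd_after <- getwd()
identical(wd_before, wd_after)
```

The same pattern would apply to the load() and write_dta() calls in the script: pass them a full path built from the base directory rather than a bare file name.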