I just downloaded rcall from GitHub to use R from within Stata (E.F. Haghish; http://www.haghish.com/packages/Rcall.php).
I want to use it in batch mode. In my do file, I enter:
rcall vanilla: source("${F7}yellow_berets/analysis/penalized_logit.R")
The shell window opens, and the program starts to execute. However, at some point it stops with the following error message:
lhs not found
r(111);
lhs is the response that I am trying to predict in the R program. To be clear: penalized_logit.R runs just fine in RStudio and in R. It is a simple program, nothing fancy (see below). The error message (and the fact that it shows up about 30 seconds in) tells me that R began executing the program. But under rcall it chokes partway through and stops. Why might something work fine in RStudio but not with rcall?
If someone has any idea about what might be going on, I would be grateful if you could let me know.
Regards,
Pierre
PS: It should not matter, but here is my R script:
#uncomment installs for first time
#install.packages("glmnet")
#install.packages("tidyverse")
#required libraries to run script
library(glmnet) #required for plogit
library(haven) #required to write .dta files (write_dta)
#working directories
pierre <- "D:\\Dropbox (Personal)\\yellow_berets\\analysis"
setwd(pierre)
#begin log
sink("./penalized_logit.log")
#clear environment
rm(list=ls())
#data
load("pnlzd_logit_data.rdata")
pnlzd_logit_data <- data.matrix(pnlzd_logit_data)
#glmnet fit
rhs <- subset(pnlzd_logit_data, select = -c(person_uuid, treat))
lhs <- subset(pnlzd_logit_data, select = treat)
cvfit = cv.glmnet(rhs, lhs, family="binomial", intercept=TRUE)
#to probability (rhs)
betas <- coef(cvfit, s="lambda.min")
#odds_rhs <- exp(betas)[nonzero_betas]
#nonzero_betas <- nonzeroCoef(betas)
#prob_rhs <- odds_rhs/(1+odds_rhs)
#names(prob_rhs) <- row.names(betas)[nonzero_betas]
#to probability (person_uuid)
prob_person <- predict(cvfit, newx = rhs, type = "response", s = "lambda.min")
row.names(prob_person) <- as.character(pnlzd_logit_data[,"person_uuid"])
#stata export
doctor_probabilities <- data.frame(cbind(pnlzd_logit_data[,"person_uuid"], prob_person))
colnames(doctor_probabilities) <- c("person_uuid", "probability")
write_dta(doctor_probabilities, "doctor_probabilities.dta", version = 13)
#print values
#cat("\n") adds blank lines to the log to separate printed values
options(max.print=10000)
print("Coefficient Values")
betas
cat(c("\n","\n","\n"))
print("Lambda Value Used")
cvfit$lambda.min
cat(c("\n","\n","\n"))
print("Doctor Probabilities")
doctor_probabilities
#end log
sink()