  • R script from STATA

    Hello everyone,

    I am trying to run some R-code through STATA but I am experiencing some issues and I was hoping to find help here.

    At first I was using the user-written rsource command, but I think it was slower than what I am doing now and, in particular, it gave me an r(198) error when I tried to call it inside a loop.
    I was doing something like this:
    Code:
    levelsof region, local(regionlevel)
    foreach l of local regionlevel {
    preserve
    keep if regione=="`l'"
    rsource, terminator(END_OF_R) roptions(--vanilla) rpath(C:\Program Files\R\R-3.5.2\bin\R.exe)
    RCODE here
    END_OF_R
    save "$path/myfolder/`l'_Rcoded.dta", replace
    restore
    }
    But as soon as the loop starts I get the following error:

    Code:
    > 
    End of R output
    invalid 'list' 
    r(198);
    The R code itself is correct because it works without the loop, so I think that rsource has some issue when it is used inside a loop.

    As a workaround I found out that I can call R from shell and make it launch an Rscript with the following code:
    Code:
    levelsof region, local(regionlevel)
    foreach l of local regionlevel {
    preserve
    keep if regione=="`l'"
    shell "C:/Program Files/R/R-3.5.2/bin/R.exe" --vanilla <"C:/myfolder/Rscript.R"
    save "$path/myfolder/`l'_Rcoded.dta", replace
    restore
    }
    In this way everything works well, but I am annoyed by the fact that to change the R script I have to open the script with R, while with rsource I could modify it directly in the do-file.

    Do any of you know whether there is a way to make rsource work within the loop, or a way to make shell read the script directly from inside the do-file?
    Maybe something like:
    Code:
    shell "C:/Program Files/R/R-3.5.2/bin/R.exe" --vanilla <"Rcode_here"
    I hope to find help here,
    and thank you in advance for your time

  • #2
    You didn't show us your R code, but I will hazard a guess that the first token on the first line is "list".

    Some experimentation suggests to me that, like other Stata commands that "read" from the do-file, the rsource command cannot be used within a loop.
    Code:
    . rsource, terminator(END_OF_R) maxlines(5) rpath(/bin/cat) roptions(-b) lsource
    Assumed R program path: "/bin/cat"
    Beginning of listing of R source code 
    this is the first line of r code
    this is the second line of r code
    this is the final line of r code
    End of listing of R source code 
    
    Beginning of R output
         1  this is the first line of r code
         2  this is the second line of r code
         3  this is the final line of r code
    End of R output
    
    . 
    . forvalues i=1/2 {
      2. rsource, terminator(END_OF_R)  maxlines(5) rpath(/bin/cat) roptions(-b) lsource
      3. this is the first line of r code
      4. this is the second line of r code
      5. this is the final line of r code
      6. END_OF_R
      7. }
    Assumed R program path: "/bin/cat"
    Beginning of listing of R source code 
    
    
    
    
    End of listing of R source code 
    
    Beginning of R output
    
    
    
    
    End of R output
    command this is unrecognized
    r(199);
    About your workaround, you write

    "In this way everything works well, but I am annoyed by the fact that to change the R script I have to open the script with R"
    Why is that? On my installation, Stata 15.1 for Mac, the Open dialog box for the Do-file Editor has an "Options" button which when clicked allows me to choose the file format that the dialog box will allow me to select, and the last choice in the list is All Files (*.*). It seems to me you could use that to open your R source in a second Do-file Editor Window.



    • #3
      Thank you for your answer, William! And sorry for my late reply!

      Let me go point by point.
      I did not show my R code because it is user-written code, so I do not really know what it does behind the scenes. It may very well start with "list", as you suggest, but I can't do much about it.

      Your hint is actually very helpful: I had not noticed that I could open the R script through Stata. However, it does not fully solve my problem.

      I have five loops, each running slightly different R code, which means I would need one do-file that runs five different R scripts.
      What I would like instead (which is what I thought was possible with rsource) is a single do-file in which the five different R scripts are written directly inside the loops.

      Do you know how I could achieve this result?
      Any help would be greatly appreciated!



      • #4
        The problem lies in reading the lines of your source code. Stata's input command requires strings with embedded spaces to be surrounded by quotation marks, making your inline R source difficult to read and leading to potential difficulties if it contains quotation marks itself. The rsource command uses Mata techniques to read the inline R source, but it is beyond my Mata capabilities (and my current interests, honestly) to adapt that code into a new program to copy inline text to a disk file. But wow, that would be a useful standalone program. Unfortunately, the output of search inline does not turn up any such utility already written.

        But do understand that even if you wrote a command, say, copyR, to read inline R source following the copyR command and write it to a disk file, the copyR command could not be run within a loop. You would have to create your five R scripts as five separate invocations of copyR outside of any loop.
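
        A possible alternative, offered only as an untested sketch and not as anything rsource itself does: Stata's file command writes text to a disk file one line at a time and, because it does not read inline text from the do-file, it can sit inside a loop. The script path and the three R lines below are hypothetical placeholders; the R executable path is the one from post #1.

        ```stata
        * Sketch only: write a small R script to disk inside the loop, then run it.
        * The script path and the R statements are hypothetical placeholders.
        local rscript "C:/myfolder/Rscript_generated.R"
        tempname fh
        levelsof region, local(regionlevel)
        foreach l of local regionlevel {
            * (Re)create the script for this region; compound quotes allow
            * double quotes inside the R code, and `l' is expanded by Stata.
            file open `fh' using "`rscript'", write text replace
            file write `fh' `"print("Processing region `l'")"' _n
            file write `fh' `"x <- c(1, 2, 3)"' _n
            file write `fh' `"print(mean(x))"' _n
            file close `fh'
            * Run the generated script the same way as the shell workaround in post #1.
            shell "C:/Program Files/R/R-3.5.2/bin/R.exe" --vanilla < "`rscript'"
        }
        ```

        One caveat with this approach: Stata expands macros inside the strings, so R code containing $ (as in df$x) or a left single quote would probably need those characters escaped with a backslash before it is written out.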

        If I were in your position, I would most likely resign myself to storing the R scripts as separate disk files. Since I'm rigorous about creating log files for my production Stata jobs, I'd use Stata's type command to display each R script in my log just before running it with rsource using. (I note that in post #1 you use shell to invoke R to run an R script; I think rsource using would do that for you, but I haven't tested that idea.)

        That way, the Stata log would provide comprehensive documentation of the commands that were run in both Stata and R, along with the Stata output and, via rsource, the R output.
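
        A minimal sketch of that logging workflow, assuming the script file and R path from post #1 and the options used there. As noted above, rsource using has not been tested in this thread, so treat this as an illustration rather than a confirmed recipe.

        ```stata
        * Sketch only: echo the R script into the Stata log, then run it via rsource.
        * The script path is the one from post #1; adjust it to your own setup.
        local rscript "C:/myfolder/Rscript.R"

        * Display the script so the log records exactly what R was asked to run.
        type "`rscript'"

        * Run the script; the R output is echoed back and so also lands in the log.
        rsource using "`rscript'", roptions(--vanilla) rpath(C:\Program Files\R\R-3.5.2\bin\R.exe)
        ```

        Because no inline R source is read from the do-file here, this form avoids the mechanism that failed inside the loop in post #2.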

        Finally, since he's been active on Statalist recently, I'll mention Roger Newson, the author of rsource, to call this discussion to his attention in case he has any ideas to add or corrections to make to our understanding of rsource.
