  • Stata slow in a distributed environment

    Hello –

    I am not a Stata user, but I am trying to help someone who is. We recently migrated to a virtual machine (VM) environment in which Stata (version 12 for Windows) runs on a remote, geographically distant network that we log in to via Citrix. Our existing programs now run very slowly, even though the data are the same and the hardware is better. We suspect that internal references to folders or paths point back to our (geographically) local network, causing round-trip delays during processing, especially with large datasets. For example, the “My Documents” folder on both networks is mapped to the same volume, hosted on a local machine.

    I searched the forum for similar issues and came across STATATMP. I have a script that sets STATATMP, TMP and TEMP to folders on the remote server before invoking Stata. The hope was that Stata would then not need to access anything local, which would speed it up. Unfortunately, we have not seen any dramatic improvement so far. Memory use seems to be about 50%-75% of what is available on the 11GB VM. My questions are:
    • Are there other environment variables I need to set, or should I modify any of the “sysdir” folders?
    • Typing display "`c(tmpdir)'" shows a path that is local, which suggests there might be a problem. It seems unrelated to the TMPDIR environment variable. Should this be changed? How?

    I also typed “shell set” and ensured that STATATMP was indeed being set correctly. It is. Any thoughts or suggestions will be welcome.

    Thanks,

    --Suresh

  • #2
    Welcome to Statalist! You have my sympathy trying to set up Stata as a remote application. Been there, done that, although not with Stata.

    First, along with STATATMP, you should look into all of the Stata system directory locations, if you have not already done so. From within Stata, type help sysdir for further information. Better yet, for a full discussion of sysdir, see the Stata Programming Reference Manual PDF, which is part of the Stata installation and can be found through Stata's Help menu under PDF Documentation (at least, that's how I find it on my Mac).
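
    As a rough illustration of that check, the commands below print the locations Stata is currently using, so that any path pointing back to the local network stands out. This is only a sketch: it assumes the commands are typed at the Stata prompt (or run from a do-file) inside the Citrix session, and the local macro name statatmp is made up for the example.

```stata
* List the system directories (STATA, BASE, SITE, PLUS, PERSONAL, OLDPLACE)
sysdir list

* Show the order in which Stata searches those directories for ado-files
adopath

* Show the temporary directory Stata is actually using in this session
display "`c(tmpdir)'"

* Show what Stata sees for the STATATMP environment variable
* (the macro name statatmp is arbitrary)
local statatmp : environment STATATMP
display "STATATMP = `statatmp'"

* Show the current working directory
pwd
```

    If c(tmpdir) and STATATMP disagree, that suggests the variable was not visible to the Stata process when it started.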

    Also, I searched the Statalist pre-2014 archives for "Citrix" and came across this in a discussion from 2013. Perhaps there are ideas here for configuring your Stata setup. It appears that nothing in their setup points back to the user desktop, but rather to a local directory that the user copies files onto and off of.
    At Boston College, we are now using Stata in statistics and econometrics course labs for about 300 students per semester, who access Stata through a Citrix 'apps server'. When they connect to the app server (which they do with the downloaded Citrix Receiver app, which works much better than the web interface) they have a mapping to their "N:" drive, which is a web-accessible filestore that they can mount on any laptop (Win/Mac OS X). The "N" drive is set to be the current working directory for Stata, so any data files, log files, etc. that they create go there. Likewise, if they install anything from SSC or SJ, there is an \ado directory created on the N drive, so they have no problems installing user-written software and accessing it when they next open Stata (as they will again have their N drive linked).


    • #3
      Is the VM deployed on dedicated hardware, and has the user noticed any difference in speed or performance when connected directly to the local network? I had similar performance issues with Citrix and noticed pretty significant differences depending on whether I was using the application from home (~10-15 miles from the office) or while hard-wired to the local network. I noticed even better performance when using Remote Desktop Connection. My guess is that the Citrix server application may consume a lot of overhead.


      • #4
        Thank you to William Lisowski and wbuchanan for taking the time to post responses. I am proceeding on the assumption that I have to set SITE, PLUS, PERSONAL and OLDPLACE to volumes on the remote server as well (these were either set to local shares, or to the system C: drive of the VM which has very limited space). If Stata is attempting to reference these folders while it runs it might slow things down quite a bit. In any case I am learning Stata along the way - so not a bad thing!
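
        A minimal sketch of what that redirection might look like follows. The R: drive and the folder names are hypothetical placeholders for a volume on the remote server, not paths from this setup. As far as I know, sysdir set applies only to the current session, so the lines would need to go into a profile.do that Stata runs at startup to take effect on every launch.

```stata
* Hypothetical paths: R: stands in for a volume hosted on the remote server.
* Replace these with folders that actually exist in your environment.
sysdir set SITE     "R:/stata/ado/site/"
sysdir set PLUS     "R:/stata/ado/plus/"
sysdir set PERSONAL "R:/stata/ado/personal/"
sysdir set OLDPLACE "R:/stata/ado/oldplace/"

* Confirm the new settings and the resulting ado-file search path
sysdir list
adopath
```

        Note that this does not change c(tmpdir); the temporary directory is not one of the sysdir locations and is fixed when Stata starts.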

        wbuchanan - the VM is hosted (along with other VMs) on dedicated, server-class hardware. The Citrix connection is used purely to communicate with the VM and for screen updates, so even a slow network link should not really slow Stata running on the VM. We are noticing a significant slowdown (3x to 5x?) for what should be simple data management computations, e.g., adding a new variable to the dataset. Other applications (SAS) run fine over this link.
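
        One way to narrow this down might be to time a purely in-memory operation separately from a write to the temporary directory. The sketch below is an assumption about how such a test could be set up, not something that was run in this thread; the dataset size and variable names are arbitrary.

```stata
* Build an arbitrary test dataset in memory (size chosen only for illustration)
clear
set obs 1000000
generate double x = runiform()

timer clear

* Timer 1: in-memory work only (adding a new variable)
timer on 1
generate double y = 2*x
timer off 1

* Timer 2: disk work only (writing to Stata's temporary directory)
tempfile scratch
timer on 2
save "`scratch'"
timer off 2

* Report elapsed seconds for both timers
timer list
```

        If timer 1 is fast and timer 2 is slow, the storage or network path behind c(tmpdir) is the more likely culprit. If timer 1 itself is slow, the cause probably lies elsewhere, for example in how the VM's CPU or memory is provisioned.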

        I will let the forum know if I find any breakthrough solutions.

        --Suresh
