  • Do I have to specify a directory?

    Hello,

    I have some do-files that import Excel files, run a few commands, and then save a completely new .dta file in a particular directory.

    At my workplace they have moved our project folders and today they are moving them again.

    After the project folders were moved, I couldn't run my do-files without updating the directory that holds all of the Excel files I import into Stata.

    My workplace prefers that I save all of my files in the project folders so that, if anything is ever deleted, the IT department can recover it.

    So today the project folders will be moved. I'm thinking ahead and asking the community if there is a way to have Stata find my files without me having to specify a certain directory.

    If they move the project folders again in the near future, I would rather not have to update every do-file I have created just because Stata can't locate my Excel files where I originally had them.

    Any tips would be helpful.


  • #2
    The simple solution to this problem is to never use full pathnames in do-files. Just put all your stuff into a main project directory and use the "Change working directory..." item from Stata's File menu when you want to start working on the project. If the files are in the main directory, you can refer to them by filename only. If the files are in subdirectories, use a relative path. For example, say all your Excel files are in a subdirectory called "Excel" and, within that, in subdirectories by year:

    Code:
    import excel using "Excel/2010/auto2010.xlsx", clear
    As long as you do not change Stata's current directory, all the file references will work. If the whole project directory is moved or if you work on another computer (over Dropbox for example), all you need is to change Stata's current directory and you are good to go.
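
    As a rough sketch of that workflow, the lines below set the working directory once and then use only relative paths. The project location and the output filename are hypothetical placeholders, not anything taken from the original question.

    Code:
    * Hypothetical project location: the only line to edit if the folder moves
    cd "C:/Projects/myproject"
    * Show the current working directory as a sanity check
    pwd
    * Everything below is relative to the working directory
    import excel using "Excel/2010/auto2010.xlsx", clear
    save "auto2010.dta", replace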

    For a more powerful solution, look up project (from SSC):

    Code:
    ssc describe project



    • #3
      Hello Claudia,

      There are several solutions to this issue. The first thing to remember is that when you open a do-file by double-clicking on it, Stata (at least on Windows) should automatically set the working directory to the directory that the do-file is located in.

      Solution #1: Use relative file paths.

      Let's say you double click on your do-file located here: C:\Users\rchu\Stata\myfile.do
      The working directory of Stata will now be as follows: C:\Users\rchu\Stata

      Say your data is saved here: C:\Users\rchu\Stata\data\observations.xlsx

      Then use a relative file path inside of your do-file to reference your data:
      Code:
      import excel "data\observations.xlsx"
      Thus, no matter where you move the folder "Stata" to, myfile.do will always work, as long as the "data" folder stays within "Stata". (Or more precisely, "data" stays in the same folder as myfile.do.) This method will allow you to create self-contained project folders that you can move anywhere.

      Using an absolute file path would look like the following, and would break if you ever change your folder locations:
      Code:
      import excel "C:\Users\rchu\Stata\data\observations.xlsx" //If I ever move my Stata folder, this import excel would break
      There is more to learn about relative file paths that can be useful. Check this website for more info (I know, it's HTML, but it still applies). https://www.w3schools.com/html/html_filepaths.asp
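
      A small sketch of what else relative paths can do, assuming a hypothetical "shared" folder that sits next to the "Stata" folder (the folder name is made up for illustration). Two dots step up one level from the working directory, and forward slashes also work on Windows:

      Code:
      * Working directory: C:\Users\rchu\Stata
      * ".." refers to the parent folder, here C:\Users\rchu
      import excel "../shared/observations.xlsx", clear
      * c(pwd) holds the current working directory if you need to check it
      display "`c(pwd)'"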


      Solution #2: Save all of your directory locations or change directory codes in a single include file.

      If relative file paths just won't do, you could also create a single do-file that you include in all of your other do-files. This way, when you move your files, you just have to update one file.

      For example, say you simply use a -cd- command at the top of every do-file, and you don't want to have to change each one individually.

      Then have a do-file called "cd-include.do" with something like the following code in it:
      Code:
      cd MY_DIRECTORY_PATH
      And then include this include file at the top of every do-file:
      Code:
      include cd-include.do
      You could also do more complicated things like defining locals with specific directory paths to be used in different do-files, all within cd-include.do.
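
      A minimal sketch of that idea, with hypothetical local names and placeholder paths. Because -include- runs the file as if its lines were typed into the calling do-file, locals defined in cd-include.do remain visible afterwards (they would not with -do- or -run-).

      Code:
      * --- contents of cd-include.do (paths are placeholders) ---
      local datadir "C:/Users/rchu/Stata/data"
      local outdir  "C:/Users/rchu/Stata/output"

      * --- top of any other do-file ---
      include cd-include.do
      import excel "`datadir'/observations.xlsx", clear
      save "`outdir'/observations.dta", replace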


      Hope this helps.
      Last edited by Roger Chu; 16 Mar 2017, 13:53. Reason: Crossed with Robert's post.



      • #4
        I prefer to use macros to control file path changes. I avoid -cd- because I frequently access data from many directories across the network, and having to track the present working directory throughout my do-file/workflow is less than ideal. Also, the packages/programs I use that interface with LaTeX via Stata (e.g., sutex, texdoc) need to be CD'd to the directory with the dependent LaTeX files (.tex, .sty, graphics). So unless I keep all my LaTeX files in the same directory as my Stata files, or constantly CD back and forth, I usually run into issues. Instead, I explicitly define every file path in my do-file and control it with macros set at the top of the do-file. I find this granular control over subdirectories helps me keep better track of my files, which are organized into sub-folders by type.

        A short example of macros for file paths is:

        Code:
        global base `"C:/mydata/"'
        global output `"C:/mydata/output/"'

        Then you can use these paths throughout the do-file, and if you move the data (as you describe in the OP) you only need to change the globals above. (I've seen advice somewhere against globals for this purpose, but I've never had any issues. Also note that I use forward slashes since these can be read across multiple OSs.) Your data will still be found in the new location when you use commands like:

        Code:
        import excel using `"${base}/myrawdata.xlsx"', clear
        save `"${output}/myresults.dta"', replace

        I use Stata on several machines (e.g., a windows server at work, a mac laptop, and a windows laptop at home). The data are located on a synced folder and are the same no matter which machine I'm using (on a remote drive accessed via vpn) yet I need the start of the file path to change depending on the machine. The sub-folders for my analysis are the same no matter which machine I'm using, so this is at the top of my 1st do-file in my analysis workflow:

        Code:
        ************************************#
        **FILE LOCATIONS**
        if `"`c(os)'"' == "Windows" & `"`c(username)'"' == "ericbooth" global sf `"S:/Data//"'
        if `"`c(os)'"' == "Windows" & `"`c(username)'"' == "server" global sf `"/Volumes/shared/data//"'
        if `"`c(os)'"' == "MacOSX" & `"`c(username)'"' == "ebooth" global sf `"/volumes/sdrive/data//"'
        
        
        di `"${sf}"' //here's the base file
        
        **subdirectories::
        global raw `"${sf}/raw//"'
        global converted `"${sf}/converted//"'
        global final `"${sf}/final//"'
        global results `"${sf}/results//"'
        global syntax `"${sf}/syntax//"'
        global latex   `"${sf}/latex//"'
        
        
        di `"The raw data is here: ${raw}"'
        **on my mac laptop prints: /volumes/sdrive/data///raw//
        ************************************#
        Now you can send results and cleaned data to the right sub-directory depending on the machine/location, because the $sf base folder changes depending on your conditions. If you share data with multiple co-authors, you can include lines that select the base folder when they run it on their machines, e.g.:


        Code:
        if  `"`c(username)'"' == "claudia_mel"  global sf `"C:/claudia/data//"'
        and it will run on your machine.
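
        One possible safeguard, sketched under the assumption that the conditions above are the only place $sf is set: clear the global before the conditions and stop with an error afterwards if none of them matched. Globals persist for the whole Stata session, so without the reset a stale value from an earlier run could be picked up silently.

        Code:
        * Before the FILE LOCATIONS block: clear any value left from an earlier run
        global sf ""

        * ... the if-conditions that set $sf go here ...

        * After the block: stop if no condition matched this machine/user
        if `"${sf}"' == "" {
            di as error "No base folder defined for `c(username)' on `c(os)'"
            exit 198
        }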



        Eric A. Booth | Senior Director of Research | Far Harbor | Austin TX



        • #5
          Another alternative is fastcd: https://ideas.repec.org/c/boc/bocode/s429301.html
          with which you can create similar lists of directories on different machines. If folder structures change, you need to change the filepath in one central location (fastcd's list of paths) only.



          • #6
            Thank you all so much for the helpful tips.

            I'm currently at the beginning stages of Stata programming so I hope I'll learn about macros as I get there.

            Once again, thank you all!
