  • What is the best way to tie a data file to a do file?

    I usually have my data file in the same project directory as the do file. And I execute the do file normally via
    Code:
    do "/full/path/to/the/project/directory/mydofile.do"
    How can I load the data file in my do file without knowing the absolute path reference to the data file? I tried
    Code:
    use "./mydatafile.dta"
    but this doesn't work reliably, because the current working directory is not set to the directory the do-file is in.

    What's the best way to deal with this to make project folders portable in the file system and between machines?

    Many thanks for all suggestions.


  • #2
    One approach is to use project, a user-written program available from SSC. To install it, type in Stata's Command window
    Code:
    ssc install project
    and then type help project to read the help file.

    Here are snippets from the help file:

    project automates the execution of do-files, skipping do-files with unchanged dependencies. With project, you accumulate and organize all Stata code related to a project, from the early data management steps to the final analysis, in a web of interconnected do-files, all managed from a master do-file. Each time you build a project (i.e. run the master do-file), project knows what has changed and only runs do-files that are affected by these changes.

    [...]

    The directory where the master do-file resides becomes the project directory. If you defined a project called "abc", then typing project abc, build will run the master do-file "abc.do" located in the project directory irrespective of Stata's current working directory.

    [...]

    project automatically changes Stata's working directory to the directory of the current do-file. This means that you can always access files in the do-file's directory by file name only. Files elsewhere but still within the project directory can be accessed using a file path that is relative to the project directory (see example 2). A project that never uses full path names can be easily shared with others or moved to a new directory without having to update any file path.
    With project, all files related to a project are located in the master directory (folder). A well organized project has several sub-directories (e.g. "data", "tables", "figures", etc.), and these can also contain any number of nested sub-directories. In most cases, you load and save files by filename only as these are in the same directory as the do-file that refers to them. If a do-file needs to refer to a document that's located in another directory, you can use a path that is relative to the project's master directory. For example, if "final_data.dta" is located in the "data" directory, a do-file within the "tables" directory would include something like:

    Code:
        project, doinfo
        local pdir "`r(pdir)'"                            // the project's main dir.
        local dofile "`r(dofile)'"                        // do-file's stub name
        
        
    * Load the final data
    
        project, uses("`pdir'/data/final_data.dta")    
        use "`pdir'/data/final_data.dta"
    
    [...]
    
    * save the table
    
        project, creates("`dofile'.doc")
    With project, you can work on a project that's stored on Dropbox from different computers and not have to worry about file paths or different operating systems. Because the project is completely self-contained, you can submit the whole directory to a public repository (if a Journal requires publication of data and code) and anyone will be able to download and run the whole thing without having to change anything to the code.



    • #3
      Walter, first install the dodata module:

      Code:
      net from http://www.radyakin.org/stata/dodata/
      Then instead of your
      Code:
      do "/full/path/to/the/project/directory/mydofile.do"
      write
      Code:
      dodata "/full/path/to/the/project/directory/mydofile.do"
      For example (web):
      Code:
      dodata http://www.radyakin.org/stata/dodata/auto.do
      or another example (local), if both auto.do and auto.dta are saved in c:\test
      Code:
      dodata c:\test\auto.do
      Inside your do file refer to `1' to load the data file with the same name as the do-file.

      Best, Sergiy Radyakin




      • #4
        Thanks for the suggestions, but this looks like too many installed dependencies for this issue. What I do is reference the data file via "./data.dta"; if I run the do-file and the data cannot be found, I cd into the directory manually for now.
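
        A minimal sketch of that workflow, assuming the data file is named data.dta as above. The check and the error message are illustrative additions: they stop the do-file early with a readable message when the working directory is wrong, instead of failing at the use command.
        Code:
        * Stop early if the data file is not in the current working directory
        capture confirm file "./data.dta"
        if _rc {
            display as error "data.dta not found in `c(pwd)'; cd to the project directory first"
            exit 601
        }
        use "./data.dta", clear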



        • #5
          I start each dofile with
          Code:
          global path_start = "<root folder>"
          All references to dofiles and datafiles are then some form of
          Code:
          do "$path_start/Dofiles/<dofilename>"
          use "$path_start/Data/<dataname>"
          As certain folders get referenced many times, it's often easy to define e.g.
          Code:
          global path_data = "$path_start/Data"
          If you have multiple users working on the same project in different (computer) locations, you can start with something akin to

          Code:
          if "`c(username)'" == "Jesse Wursten" global path_start = "C:/MyProjects"
          if "`c(username)'" == "Someone Else" global path_start = "Z:/Stata stuff"
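
          Put together, the header of a do-file might look roughly like the sketch below. The fallback to the current working directory for unlisted users is an assumption added for illustration, and mydatafile.dta simply stands in for whatever data file the project uses.
          Code:
          * Pick the project root per user
          if "`c(username)'" == "Jesse Wursten" {
              global path_start "C:/MyProjects"
          }
          else if "`c(username)'" == "Someone Else" {
              global path_start "Z:/Stata stuff"
          }
          else {
              * Assumption: unlisted users run Stata from the project root
              global path_start "`c(pwd)'"
          }
          * Shortcut for the frequently used data folder
          global path_data "$path_start/Data"
          use "$path_data/mydatafile.dta", clear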
