
  • What is the Stata path in Python? Are there any problems in my Python code?

    Hello everyone,

    I'd like to ask a question. I have around a thousand multilevel mediation commands to execute (I list only four of them in the code below as a test), and I run the Stata programs from a Jupyter Notebook. I don't want to wait too long for each command, because some of these models do not converge and will iterate 300 times each. So I asked ChatGPT to help me write a code snippet that monitors the commands and terminates them when needed: each command should stop after 50 iterations of the model, and execution should then move on to the next command. However, there seems to be an issue with my Stata path in Python. Can anyone help me with this?

    Thank you!

    This is my first time posting, so if there are any shortcomings, please don't hesitate to point them out. Thank you!


    Stata version MP 17.0
    MacOS 13.5 (22G74)

    This is my main code to monitor and terminate. Before that, I have done some basic work, like cd, egen...

    Code:
    import subprocess
    
    # define
    commands = [
        'ml_mediation, dv(ciinno) iv(serlea) mv(poentre) l2id(swid)',
        'ml_mediation, dv(cipsm) iv(serlea) mv(poentre) l2id(swid)',
        'ml_mediation, dv(cohes) iv(serlea) mv(poentre) l2id(swid)',
        'ml_mediation, dv(poleff) iv(serlea) mv(poentre) l2id(swid)'
    ]
    
    # Traverse the list of commands to execute each command
    for cmd in commands:
        # Create a child process to execute the Stata command
        process = subprocess.Popen(['/Applications/Stata/StataMP', '-b', '-q', '-e', cmd], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
        
        # Monitor the standard output of commands
        for line in process.stdout:
            print(line, end='')  
            
            # If the output contains "Iteration 50", the execution of the current command stops
            if "Iteration 50" in line:
                process.terminate()  # Terminates the child process of the current command
                break  # Stop monitoring output and continue with the next command
        process.wait()  # Wait for the child process to complete

    And this is the error I get:

    Code:
    FileNotFoundError                         Traceback (most recent call last)
    Cell In[12], line 14
         11 # Traverse the list of commands to execute each command
         12 for cmd in commands:
         13     # Create a child process to execute the Stata command
    ---> 14     process = subprocess.Popen(['/Applications/Stata/StataMP', '-b', '-q', '-e', cmd], stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True)
         16     # Monitor the standard output of commands
         17     for line in process.stdout:
    
    File ~/anaconda3/lib/python3.11/subprocess.py:1026, in Popen.__init__(self, args, bufsize, executable, stdin, stdout, stderr, preexec_fn, close_fds, shell, cwd, env, universal_newlines, startupinfo, creationflags, restore_signals, start_new_session, pass_fds, user, group, extra_groups, encoding, errors, text, umask, pipesize, process_group)
       1022         if self.text_mode:
       1023             self.stderr = io.TextIOWrapper(self.stderr,
       1024                     encoding=encoding, errors=errors)
    -> 1026     self._execute_child(args, executable, preexec_fn, close_fds,
       1027                         pass_fds, cwd, env,
       1028                         startupinfo, creationflags, shell,
       1029                         p2cread, p2cwrite,
       1030                         c2pread, c2pwrite,
       1031                         errread, errwrite,
       1032                         restore_signals,
       1033                         gid, gids, uid, umask,
       1034                         start_new_session, process_group)
       1035 except:
       1036     # Cleanup if the child failed starting.
       1037     for f in filter(None, (self.stdin, self.stdout, self.stderr)):
    
    File ~/anaconda3/lib/python3.11/subprocess.py:1950, in Popen._execute_child(self, args, executable, preexec_fn, close_fds, pass_fds, cwd, env, startupinfo, creationflags, shell, p2cread, p2cwrite, c2pread, c2pwrite, errread, errwrite, restore_signals, gid, gids, uid, umask, start_new_session, process_group)
       1948     if errno_num != 0:
       1949         err_msg = os.strerror(errno_num)
    -> 1950     raise child_exception_type(errno_num, err_msg, err_filename)
       1951 raise child_exception_type(err_msg)
    
    FileNotFoundError: [Errno 2] No such file or directory: '/Applications/Stata/StataMP'
    Last edited by Ada Xu; 08 Sep 2023, 21:08.
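The monitoring loop in the post above can be sketched as a small reusable function. This is a minimal sketch, not a tested Stata integration: `run_until_marker` is a hypothetical helper name, and the child process in the usage example is a stand-in Python script that prints `Iteration N` lines, not Stata. One caveat worth verifying: as far as I know, Stata's batch mode (`-b`) writes its results to a log file in the working directory rather than to standard output, so a loop that reads `process.stdout` may see nothing even once the path is fixed.

```python
import subprocess
import sys


def run_until_marker(argv, marker):
    """Run argv, collect its stdout lines, and terminate it once `marker` appears.

    Returns (stopped_early, lines_seen).
    """
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so an unread pipe cannot fill up and block the child
        text=True,
    )
    lines = []
    stopped_early = False
    try:
        for line in proc.stdout:
            lines.append(line.rstrip("\n"))
            if marker in line:
                proc.terminate()  # ask the child to stop; the loop then moves on
                stopped_early = True
                break
    finally:
        proc.stdout.close()
        proc.wait()  # reap the child so it does not linger as a zombie
    return stopped_early, lines


if __name__ == "__main__":
    # Stand-in child process (an assumption for illustration, not Stata).
    child = [
        sys.executable, "-u", "-c",
        "import time\n"
        "for i in range(1000):\n"
        "    print('Iteration', i, flush=True)\n"
        "    time.sleep(0.01)\n",
    ]
    stopped, seen = run_until_marker(child, "Iteration 5")
    print(stopped, seen[-1])
```

Merging stderr into stdout is a deliberate choice here: the original code opens a separate stderr pipe but never reads it, which can stall a child that writes a lot of error output.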

  • #2
    Are you sure /Applications/Stata/StataMP shouldn't be pointing at a file with an extension, like a .exe?
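On macOS there is no `.exe` extension; an application is normally installed as an `.app` bundle, which is a directory with the actual binary inside it. A minimal sketch for checking which path actually exists before calling `subprocess.Popen` follows. The candidate paths are assumptions based on the usual bundle layout and are not confirmed anywhere in this thread, and `find_executable` is a hypothetical helper name.

```python
import shutil

# Assumed locations (not confirmed by the thread); adjust to the actual install.
CANDIDATES = [
    "/Applications/Stata/StataMP.app/Contents/MacOS/stata-mp",  # console binary inside the bundle
    "/Applications/Stata/StataMP.app/Contents/MacOS/StataMP",   # GUI binary inside the bundle
    "stata-mp",                                                 # bare name, looked up on PATH
]


def find_executable(candidates):
    """Return the first candidate that resolves to an executable file, else None."""
    for candidate in candidates:
        # shutil.which checks full paths directly and searches PATH for bare names.
        resolved = shutil.which(candidate)
        if resolved:
            return resolved
    return None


if __name__ == "__main__":
    stata = find_executable(CANDIDATES)
    print(stata if stata else "No Stata executable found; check the install location.")
```

Running `ls /Applications/Stata/` in a terminal shows the same information by hand; the point is only to fail early with a clear message instead of a `FileNotFoundError` deep inside `subprocess`.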

    [RANT]
    I frankly think this whole endeavor is misguided. In general, ChatGPT is okay at the basics, but bad at anything complicated or (especially) idiosyncratic. You should really understand how to program what you want yourself. ChatGPT should only be there for some light syntax help, and you should expect to problem-solve on your own most of the time. The intersection between Stata and Python is already rife with possible programming issues and complexities, so why bring Jupyter and IPython into this mess? What do you need Python for here? Why not code this in pure ado? Why not code this in pure Python?

    What you have here is an over-engineered solution to a relatively simple problem. Just drop Python, Jupyter notebooks, and all of the other trendy technology, and this becomes much simpler.
    [/RANT]

    This is really Tim Huegerich's wheelhouse, so he may have more helpful, less ranty advice.



    • #3
      Jared Greathouse also likes to take advantage of the Stata Python interface, and may also have some advice.



      • #4
        Originally posted by Daniel Schaefer
        Are you sure /Applications/Stata/StataMP shouldn't be pointing at a file with an extension, like a .exe?

        [RANT]
        I frankly think this whole endeavor is misguided. In general, ChatGPT is okay at the basics, but bad at anything complicated or (especially) idiosyncratic. You should really understand how to program what you want yourself. ChatGPT should only be there for some light syntax help, and you should expect to problem-solve on your own most of the time. The intersection between Stata and Python is already rife with possible programming issues and complexities, so why bring Jupyter and IPython into this mess? What do you need Python for here? Why not code this in pure ado? Why not code this in pure Python?

        What you have here is an over-engineered solution to a relatively simple problem. Just drop Python, Jupyter notebooks, and all of the other trendy technology, and this becomes much simpler.
        [/RANT]

        This is really Tim Huegerich's wheelhouse, so he may have more helpful, less ranty advice.
        Thank you for your suggestions. My advisor also recommended trying Stata's loop statements. I just didn't know about Stata's loops when I started. t_t I thought that since I was experimenting with Python, I'd give it a try, but I didn't anticipate it would be this complicated. Anyway, thanks for the information.



        • #5
          but bad at anything complicated or (especially) idiosyncratic.
          I generally agree, but in my personal experience, GPT is a complete godsend. It allows me to translate code from MATLAB/R to Python (pretty effectively, too). Literally, it helped me scrape the Chinese government's house price database, it's helped me scrape AQI air quality data. In my paper, it helped me translate much of the underlying code from R and write many of the estimators in the mlsynth Python package (which I'm developing with my coworker Mani).

          I say this to say, I generally agree with using GPT with caution, since you need to know how to ask it for help and how to diagnose bugs, but as someone who writes code, for me personally, it's been pretty useful.

          I do agree however that this is pretty over-engineered, if you will. I don't even know what the goal is beyond.... Running 1000 commands? I'm unsure of why you need to do this in (what looks like?) batch. What's going on here, anyways, why do you need to do all this? Can't you just do a Stata loop? I'm not asking to be mean, I'm asking because I'm having trouble understanding why all this is needed.
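The "just loop" idea can be sketched in Python as generating the commands from a list of variable names instead of typing a thousand lines by hand. This is a hedged sketch: `build_commands` and `build_do_file` are hypothetical helpers, the variable names are the four from the original post, and the `set maxiter` line assumes that `ml_mediation` (a user-written command) respects Stata's global iteration cap, which should be checked before relying on it. The `capture noisily` prefix lets a do-file continue past a command that ends in an error.

```python
def build_commands(dvs, iv="serlea", mv="poentre", l2id="swid"):
    """Build one ml_mediation command per dependent variable."""
    return [f"ml_mediation, dv({dv}) iv({iv}) mv({mv}) l2id({l2id})" for dv in dvs]


def build_do_file(commands, max_iter=50):
    """Return do-file text that caps iterations and keeps going after a failed command."""
    # Assumption: the estimation commands called by ml_mediation honor `set maxiter`.
    lines = [f"set maxiter {max_iter}"]
    lines += [f"capture noisily {cmd}" for cmd in commands]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    commands = build_commands(["ciinno", "cipsm", "cohes", "poleff"])
    print(build_do_file(commands))
```

The same structure maps directly onto a `foreach` loop over the dependent variables in a plain do-file, which avoids Python and the subprocess handling altogether.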
