
  • AI assistance for Stata programming?

    Some posts ago I asked how to determine in which Stata version a command was introduced, in order to specify the least restrictive version -- see also question 2 and the answers to that question here.

    The answers showed that this is no easy task. Today I used AI (Claude/Anthropic, see also here). To show what is possible (disclaimer: I am not affiliated with Anthropic in any way), I am showing you here the result as documented in Claude's Markdown document:
    Stata Version Analysis for html_cb.ado

    Task Summary

    Analysis of the `html_cb.ado` program to determine:
    1. In which Stata version the `vl` command was introduced
    2. The minimum required Stata version for the program to run (excluding lines 48-49)
    Analysis Results

    Question 1: When was the `vl` command introduced?

    Answer: Stata 16

    The `vl` command (for managing variable lists) was introduced in Stata 16. In the program, it appears on lines 48-49:
    Code:
    cap vl rebuild
    if !_rc vl dir

    Question 2: Minimum Stata version required (excluding lines 48-49)

    Answer: Stata 11

    Key Commands Analysis

    The following version-critical commands were identified in the program:
    Code:
    | Command/Function | Introduced In  | Usage in Program                             |
    |------------------|----------------|----------------------------------------------|
    | `regexm()`       | Stata 10       | String pattern matching (multiple locations) |
    | `usubinstr()`    | Stata 10       | String substitution (multiple locations)     |
    | `fileexists()`   | Stata 11       | File existence checking (line ~11)           |
    | `clonevar`       | Stata 11       | Variable cloning for type checking           |
    | `strpos()`,      |                |                                              |
    | `substr()`,      |                |                                              |
    | `strrpos()`      | Stata 9+       | String manipulation                          |
    | `matcell()`,     |                |                                              |
    | `matrow()`       | Stata 9+       | Matrix operations with tabulation            |
    | `tempname`,      |                |                                              |
    | `tempvar`,       |                |                                              |
    | `tempfile`       | Early versions | Temporary objects                            |

    Conclusion

    The minimum required Stata version is Stata 11 due to the use of:
    • `fileexists()` function
    • `clonevar` command
    Both of these commands were introduced in Stata 11 and are essential for the program's functionality.

    Notes
    • The program currently specifies `version 18` (line 4), which is more restrictive than necessary
    • Excluding the `vl` commands on lines 48-49, the program could run on Stata 11
    • The `vl` commands are wrapped in `capture` statements, so they won't cause errors on older versions, but the variable list management functionality would be unavailable
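
    As a minimal sketch (not part of html_cb.ado), the same two lines could also be guarded by the running Stata version instead of relying on `capture` alone. This assumes the Stata 16 figure given above is right. `c(stata_version)` reports the version of the Stata that is running, whereas `c(version)` reports what the `version` statement has set:

    ```stata
    * Sketch only: assumes -vl- first shipped with Stata 16, as stated above.
    * c(stata_version) is the running Stata, not the -version- setting.
    if c(stata_version) >= 16 {
        capture vl rebuild      // keep the original capture in case the rebuild fails
        if !_rc vl dir
    }
    ```

    On older releases the block is simply skipped, which makes the intent explicit to anyone reading the code.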
    File Information

    Program: html_cb.ado
    Current version statement: version 18
    Analysis date: August 23, 2025
    Lines excluded from analysis: 48-49 (vl commands)

    It took me about a minute to get this information and I believe that AI can be very useful for Stata programmers. My experience shows that if you are able to ask the appropriate questions and to evaluate the answers critically it can be useful, especially for these kinds of questions and for writing a first sketch of a Stata help file (see also the "By the way" here).
    Last edited by Dirk Enzmann; 23 Aug 2025, 12:06. Reason: Added tags

  • #2
    I'm away from my computer for the next two weeks and won't try looking up the details, but off the top of my head, clonevar was around way before Stata 11, probably around Stata 8. usubinstr(), on the other hand, was definitely not introduced in Stata 11 but in Stata 14, when StataCorp introduced Unicode support. So, sorry, this seems much more like yet another cautionary tale of AI hallucination.
    Last edited by daniel klein; 23 Aug 2025, 12:53.



    • #3
      Excellent observation.

      Conclusion of this small experiment: It is a common problem with AI that it can generate answers in a language and style that appear convincing because they sound authoritative, as if they came from an expert. Hence, it is ideal for manipulating attitudes and opinions. But AI does not think “intelligently” (although it can handle logic); rather, AI is essentially an artificial parrot able to handle huge data (which the parrot itself cannot verify for factuality) extremely fast. This applies to the answer produced by Claude (Anthropic) in this exercise, as well.



      • #4
        Originally posted by Dirk Enzmann
        Conclusion of this small experiment: It is a common problem with AI that it can generate answers in a language and style that appear convincing because they sound authoritative, as if they came from an expert. Hence, it is ideal for manipulating attitudes and opinions. But AI does not think “intelligently” (although it can handle logic); rather, AI is essentially an artificial parrot able to handle huge data (which the parrot itself cannot verify for factuality) extremely fast. This applies to the answer produced by Claude (Anthropic) in this exercise, as well.
        Nice way of putting it. I give several disturbing examples of AI hallucinations at

        https://academicweb.nd.edu/~rwilliam...s/AIPolicy.pdf

        A friend of mine claimed that even lawyers and judges don't realize AI hallucinates. Nor do many/most other people. If so, that is pretty scary. AI can do some great things, but people need to understand its limitations.

        -------------------------------------------
        Richard Williams, Notre Dame Dept of Sociology
        StataNow Version: 19.5 MP (2 processor)

        EMAIL: [email protected]
        WWW: https://academicweb.nd.edu/~rwilliam/



        • #5
          Code:
          help whatsnew8 
          
          
          -------- update 05oct2004 --------------------------------------------------------------------------------------------------
          
              Ado-files
          
               3. clonevar is a new command that makes an identical copy of an existing variable; see help clonevar.

          So clonevar was added as an update to Stata 8.
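
          A rough sketch of how such a claim can be cross-checked on an installed copy: `which` reports where an ado-file lives and prints its `*!` version lines. This helps only for ado-files such as clonevar. The date shown is that of the last revision, not of the first release, and functions such as usubinstr() are not commands, so this lookup does not apply to them.

          ```stata
          * Sketch: locate the installed ado-file and show its *! version lines
          which clonevar
          * The update history itself, as quoted above
          help whatsnew8
          ```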



          • #6
            For those interested, I tried to check the accuracy of parrot Claude's attempt (shown in #1) to determine in which Stata version certain version-critical commands or functions were introduced. If I am correct, 75% of Claude's answers (9 of 12) were wrong:
            Code:
                             | Introduced in Stata version          
            -----------------|----------------------------------                 
            Command/function | Parrot Claude  | probably correct
            -----------------|----------------|-----------------
             regexm()        | Stata 10       | Stata  8
             usubinstr()     | Stata 10       | Stata 14
             fileexists()    | Stata 11       | Stata 12
             clonevar        | Stata 11       | Stata  8.2
             strpos()        | Stata  9+      | Stata  1
             substr()        | Stata  9+      | Stata  4
             strrpos()       | Stata  9+      | Stata 14
             matcell()*      | Stata  9+      | Stata  7
             matrow()*       | Stata  9+      | Stata  7
             tempname        | early versions | Stata  5
             tempvar         | early versions | Stata  3
             tempfile        | early versions | Stata  5
            -----------------|----------------|-----------------
            * option of tabulate
            Mistakes happen, even though they shouldn't, but an error rate of 75% is alarming. To understand why I am calling Claude a parrot, as a starting point see here.

            I am not looking for an answer to the question in which version these Stata commands/functions were introduced, but if you find mistakes on my part, I am happy to know.
