
  • Stata and AI

    It would be good all round if AI tools were able to solve any Stata coding problems too. I haven't tried any myself, but I stumbled across this post.

    https://twitter.com/peternka/status/1598409508900196357

    No one seems to have noticed that -- apart from some confusion about what is or is not a command, a function and an option -- the recommended code

    Code:
    summarize var1, mean sd
    is illegal. If you remove the offending option sd the results don't include the SD, and you need to consult the help to find that meanonly is a programmer's option with r-class results but nothing displayed in the Results window.
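
    For what it's worth, here is a minimal sketch of legal alternatives. It assumes the auto dataset shipped with Stata, with price as a stand-in for var1; neither is in the tweet.

    ```stata
    * Assumption: auto.dta (shipped with Stata); price stands in for var1
    sysuse auto, clear

    * Plain summarize displays the mean and SD and also leaves them in r()
    summarize price
    display "mean = " r(mean) "   SD = " r(sd)

    * meanonly displays nothing and skips the variance, so r(sd) is not returned
    summarize price, meanonly
    display "mean = " r(mean)
    return list

    * If only the mean and SD are wanted in the Results window, tabstat is another route
    tabstat price, statistics(mean sd)
    ```

    The return list call after meanonly is there just to show which r-class results remain.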

    More examples, better, as good, or worse, surely welcome.

  • #2
    Yeah, I agree. This is why I'm never really sympathetic to "the robots are gonna take over (our jobs, more specifically)". GPT and other AI sources have the potential to be fantastic outlets that can help people do all manner of things, and it's a phenomenon that excites me... but! It's no substitute for human coders who are experienced. As you say all the time, commands, functions, and options aren't the same, and it may not be reasonable to expect GPT to know this.

    This is a small error, but imagine many small errors like this at scale (not just for Stata, but for any other language). You'd essentially have code being corrupted by lots of small errors, even though its general thrust is correct. Either way, though, having properly trained ML algorithms to help with stuff like this is awesome, and at some point I'll likely use it myself (namely for Python!!!)


    • #3
      I think there are different levels to this. Details like knowing the syntax of the various Stata commands and exactly what they do are, I think, well within the grasp of artificial intelligence. In fact, they are well within the grasp of lower levels of technology, such as Stata itself. If you look at modern C++ compilers, they are able to identify and alert you to all sorts of problems with your code that are well beyond just syntactic errors (e.g. memory leaks, accessing uninitialized data, races).

      But there is another level I am skeptical about AI ever being able to reach. My experience here on Statalist is that a large proportion of the questions raised here arise because the user has not sufficiently specified the question. A program, in any language, transforms inputs (usually subject to some validity constraints) into outputs that relate to those inputs in some specific way. But if you can't state precisely what the relationship of the outputs to the inputs is supposed to be, you cannot write code. In most of these cases, at some intuitive level the user knows what is wanted, but has difficulty setting it out completely and in precise language. These threads are usually characterized by a series of back and forth posts that elicit the problem specification clearly enough to enable code writing. But that back and forth is often, in part, guided by the responder's knowledge of the kinds of things that are typically wanted, so that concrete questions like "do you mean this or do you mean that" can be posed. Although I don't want to be on record as claiming that, in principle, an AI cannot be developed that could take over this kind of interrogation function, current technology is far from being able to do that and I don't think it will get there any time soon. Suffice it to say, such an AI would need an in-depth knowledge of programming, a working familiarity with the subject matter of the application to be coded, and excellent skills at eliciting clarification from users who have difficulty communicating their intent.

      It might happen eventually, but I don't expect it any time soon.
      Last edited by Clyde Schechter; 23 Feb 2023, 09:25.


      • #4
        I suspect that near-term AI tools will be much less capable with Stata than with Python and other such languages with copious amounts of high-quality, open-source code available for crawling. If that's right, this reality may prove to be a significant obstacle to growing/maintaining Stata's popularity.


        • #5

          I have used ChatGPT to catch errors in my code. It's not useful for generating new code from scratch, but it helps a lot in catching errors.


          • #6
            Originally posted by Tim Huegerich
            I suspect that near-term AI tools will be much less capable with Stata than with Python and other such languages with copious amounts of high-quality, open-source code available for crawling. If that's right, this reality may prove to be a significant obstacle to growing/maintaining Stata's popularity.
            There is SSC to crawl for Stata. The challenge with R is that there is a lot of code available, but the quality is a real mixed bag: there is absolutely exceptional stuff there, but also complete garbage, and everything in between. For any automated system you have to keep GIGO (Garbage In --> Garbage Out) in mind.
            ---------------------------------
            Maarten L. Buis
            University of Konstanz
            Department of history and sociology
            box 40
            78457 Konstanz
            Germany
            http://www.maartenbuis.nl
            ---------------------------------
