Anyone running benchmarks across LLMs for Stata skill?

CJ Libassi

Join Date: May 2020

Posts: 45
#1


20 Aug 2025, 17:17

Curious if anyone has already tried comparing how well different LLMs perform on Stata coding tasks, not just casually but using formal benchmarks. I’m considering whether it’s worth building one myself, but wanted to check whether something already exists or is in the early stages.

Here is my wishlist of things I'd like to know whether an LLM can do consistently:
Reshaping and aggregating data (e.g. reshape, collapse, egen, merge), with correct syntax and logically sound by() groupings

A wide range of regression tasks, including thinking clearly about standard errors and applying the syntax correctly

Post-estimation commands (margins, estimates, predict, lincom), including extracting and interpreting results

Looping or macro-driven routines (foreach, forvalues, local macros)

Creating formatted tables for publication (table and collect)

A wide range of plotting commands/techniques, including community-contributed commands and suites of commands (e.g. schemes, palettes, stata-schemepack, coefplot)

Writing functional, clean, reusable .do files with a modular design, not just one-offs

Correct data manipulation with string functions, date handling, and factor-level processing

Knowledge and use of community-contributed commands for newer quasi-experimental design techniques (e.g. rdrobust, csdid, ivreg2)

But I'm sure there are many ways to design a general-purpose test of Stata coding ability that isn't skewed toward my applied micro needs. Anyway, I'm just wondering whether anything is out there, whether this is something people would like to have, and whether anyone has thoughts on how to design it.
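
To make the idea a bit more concrete, here is a rough sketch of how results from such a benchmark might be tallied by model and task category. Everything in it is an assumption for illustration: the names TaskResult and pass_rates, the model labels, the category names, and the pass/fail criterion (whether a generated do-file reproduces a reference result) are all hypothetical, and the results are made up. It does not call Stata or any LLM.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TaskResult:
    model: str     # hypothetical model label
    category: str  # hypothetical task category, e.g. "reshape"
    passed: bool   # assumed criterion: output matched a reference result


def pass_rates(results):
    """Return {model: {category: share of tasks passed}}."""
    # counts[model][category] holds [number passed, number attempted]
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for r in results:
        cell = counts[r.model][r.category]
        cell[0] += int(r.passed)
        cell[1] += 1
    return {
        model: {cat: passed / total for cat, (passed, total) in cats.items()}
        for model, cats in counts.items()
    }


# Made-up results, only to show the shape of the output.
results = [
    TaskResult("model_a", "reshape", True),
    TaskResult("model_a", "reshape", False),
    TaskResult("model_a", "regression", True),
    TaskResult("model_b", "reshape", True),
    TaskResult("model_b", "regression", False),
]

rates = pass_rates(results)
for model, cats in sorted(rates.items()):
    for cat, rate in sorted(cats.items()):
        print(f"{model:8s} {cat:12s} {rate:.2f}")
```

Reporting pass rates per category rather than one overall score would keep a benchmark from hiding, say, weak post-estimation skills behind strong data-management skills.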
Tiago Pereira

Join Date: Jan 2016

Posts: 409
#2

22 Aug 2025, 06:21

Have you checked this?

https://www.statalist.org/forums/for...-dedicated-gpt