Introducing Estima – Your AI Companion for Statistical Work

Josh Zweig

Join Date: Nov 2024

Posts: 18
#1

13 May 2025, 10:23

Hi everyone,

After six months of tweaking, testing, and trying to get it right, I’m happy to share something I’ve been working on: Estima, an AI-powered tool designed to support users of statistical software like Stata.

Estima is not just about writing Stata scripts or translating them from Python or R. It’s about helping you think through your analysis, your models, and your assumptions. It acts like a companion during your work: whether you're debugging, planning a causal inference analysis, or deciding how to approach a problem, Estima is there to support your thinking, not just automate your typing.

It’s free to try: I have created a free plan of 15 uses per month, and I’d love your feedback as I keep improving it. This is still an MVP, built by someone with experience using Stata, and I know there’s plenty of room to grow.

Eventually, a native package within Stata might make more sense. But for now, I’m aiming to build something truly useful for analysts, researchers, and students who want more than just code generation — they want intelligent support.

You can try it here: Estima

Thanks for checking it out, and I’m all ears for feedback or suggestions.

Best,
Josh Zweig (See more about me in my Substack)
daniel klein

Join Date: Mar 2014

Posts: 3861
#2

14 May 2025, 12:11

Same prompt as here:

I'm working with Stata. I want to add the four smallest and largest values to the returned results of -summarize-. Can you suggest code to do that?

Answer:

For your specific request, you can use the 'extreme' option of the -summarize- command to show the extreme (smallest and largest) values. Here's how you can do it:

Obviously, there is no extreme option for summarize, so slightly hallucinating here. Also, summarize returns minimum and maximum even without option detail (and even with the meanonly option).

Code suggestion:

Code:

    summarize varname, detail
    return list // This shows what's currently stored in return memory

    // To store extremes in return memory
    local extremes
    forvalues i = 1/4 {
        sort varname
        local extremes `extremes' r(min`i')=`=varname[`i']'
        sort varname
        local extremes `extremes' r(max`i')=`=varname[-`i']'
    }

    // Now use the results
    return add `extremes'
    return list // This will show all results, including the extremes

Sorts twice [in each iteration!]; why? Negative indices aren't allowed in Stata. Doesn't handle situations with fewer than 4 observations. Treats missing values as valid ones. Overall, not super impressive, but somewhat better than competitors.
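
For anyone who wants a working starting point, here is a minimal sketch that tries to address the points above: it sorts once, uses no negative indices, copes with fewer than 4 observations, and excludes missing values. The program name sumextremes and the result names r(min1) to r(min4) and r(max1) to r(max4) are made up for illustration; they are not part of official Stata.

```stata
capture program drop sumextremes
program define sumextremes, rclass
    version 14
    syntax varname(numeric) [if] [in]

    // marksample flags the observations to use and excludes missing values of the variable
    marksample touse

    // run summarize as usual, note how many extremes can be stored,
    // then pass all of summarize's results through to r()
    summarize `varlist' if `touse'
    local k = min(4, r(N))
    return add

    // work on a sorted copy of the usable observations; the data are restored on exit
    preserve
    quietly keep if `touse'
    sort `varlist'                      // one sort only
    forvalues i = 1/`k' {
        return scalar min`i' = `varlist'[`i']
        return scalar max`i' = `varlist'[_N - `i' + 1]
    }
end

// example call on a dataset shipped with Stata
sysuse auto, clear
sumextremes price
return list
```

With fewer than 4 usable observations the loop simply stores fewer results, and with none it stores no extremes at all. Using preserve keeps the sort order of the data in memory untouched, at the cost of copying the dataset.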

Edit: I should add explicitly that this isn't to diminish the contribution; just pointing out that current limits are fairly easy to encounter.

Last edited by daniel klein; 14 May 2025, 12:51.
Nick Cox

Join Date: Mar 2014

Posts: 35728
#3

14 May 2025, 14:57

As a very small twist on #2 -- not disturbing its general message -- negative indices are legal in Stata, just no help for this problem, and wrong in not giving what is wanted. (A reference to an observation with a negative index is always evaluated as missing.)
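
A quick way to see this for yourself, sketched with the auto dataset shipped with Stata (the choice of dataset and variable is just for illustration):

```stata
sysuse auto, clear
sort price

display price[-1]    // negative index: evaluates as missing, not the largest value
display price[_N]    // last observation after sorting; the largest price here, as price has no missing values
```

So to count from the end, the index has to be written relative to _N, for example price[_N - 1] for the second largest value.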
Josh Zweig

Join Date: Nov 2024

Posts: 18
#4

15 May 2025, 02:17

Thank you both for your comments. I agree there's nothing particularly impressive at this stage. That said, if someone had asked three years ago whether this sort of thing would be possible, I don’t think many of us would have believed it.

On the point about the chatbot not being fully accurate or perfect: I understand the concern, though I personally don’t see that as a major problem. With Estima, I’m not trying to build a substitute for expertise, but rather a companion while working in Stata. The bot is quite helpful for structuring problems, checking that your code addresses all the relevant points, and suggesting ways forward. It's true that it might not get things 100% right on the first go, but we’ve seen plenty of examples where a short exchange leads users to the solution they were after.

In that sense, I see it as similar to messaging a colleague or posting a question here on the forum, except you get an answer instantly and can iterate from there.

Just to be clear: I’m not trying to oversell what’s possible here, and I don’t think there’s any need to.
Josh Zweig

Join Date: Nov 2024

Posts: 18
#5

15 May 2025, 05:54

Originally posted by FernandoRios

I see.
Yeah, the programming example that was pointed out reflects the tendency of most LLMs to refuse to say "I don’t know", and the fact that it is still hard for them to program in a language that is relatively less known.
I wonder if, for edge cases like that, one could add to the system prompt:
if unsure of how to do it in Stata and Mata alone, use Python for the back end

Just following up on this, FernandoRios: I thought about using Python on the back end and tried it. That did not fully work, but I should try again with the ability to compile, retry, etc. Regarding Stata being less well known: that's true, and it's a limitation, but system prompts are, in a way, the method for steering the model toward the right weights. I iterated through different system prompts, and I have a few examples as well, but giving examples as specific as the one discussed here can get tricky (do you think otherwise?).
FernandoRios

Join Date: Apr 2014

Posts: 2471
#6

15 May 2025, 16:41

Well, what I meant is to tell the LLM that, when in doubt, it could mix Stata with Python to get the desired outcome.
That would be more likely to work because LLMs do Python really well.
Josh Zweig

Join Date: Nov 2024

Posts: 18
#7

16 May 2025, 09:07

Originally posted by FernandoRios

Well, what I meant is to tell the LLM that, when in doubt, it could mix Stata with Python to get the desired outcome.
That would be more likely to work because LLMs do Python really well.

Ah, I see. The user can certainly do that, but as a product decision I would rule it out, since it could confuse a lot of people.
