Wish List for Stata 20

Dave Airey

Join Date: Apr 2014
Posts: 416

#106

22 Jan 2026, 05:29

Ahmed Abdalla, re #96: one helpful package for running many models across thousands of 'omics features is the parallel package. For example, to run mixed models across 5000+ proteins we can use code like that below. While Stata/MP is the multiprocessor version of Stata, I think parallel capabilities that let us use our cores manually when needed could be part of official Stata as well.

Code:

program define parfor
    // each child process sees only its own share of the genes
    levelsof gene, local(mygenes)
    global mygenes `"`r(levels)'"' // a global macro is required
    foreach i of global mygenes {
        // mixed model here for protein differential expression
        xsvmat, from(r(table)') rowname(parm) names(col) idstr("`i'") ///
            saving("./parallel/gene_`i'", replace)
    }
end

parallel initialize 8 // request 8 of 14 available cores
parallel, prog(parfor) by(gene): parfor // by(gene) stops splitting a gene between cores

// code to combine results files here

Comment

Matej Seifert

Join Date: Apr 2025

Posts: 5
#107

23 Jan 2026, 06:52

I very much like the navigator and bookmarks. Would it be possible to create two types of bookmarks, for example for "chapters" and "subchapters" in the code? I know I can use capitals etc., but maybe this could also be handled on the program side.
Comment
Bert Lloyd

Join Date: Apr 2014

Posts: 122
#108

23 Jan 2026, 07:44

Originally posted by Matej Seifert

I very much like the navigator and bookmarks. Would it be possible to create two types of bookmarks, for example for "chapters" and "subchapters" in the code? I know I can use capitals etc., but maybe this could also be handled on the program side.

You can create a hierarchy of bookmarks by using multiple # characters:

**# top level

**## second level

**### third level

I'm not sure how many levels are available; I've only gone up to 3 or maybe 4.
2 likes
Comment
Matej Seifert

Join Date: Apr 2025

Posts: 5
#109

23 Jan 2026, 10:18

Originally posted by Bert Lloyd

You can create a hierarchy of bookmarks by using multiple # characters:

**# top level

**## second level

**### third level

I'm not sure how many levels are available; I've only gone up to 3 or maybe 4.

Thank you, Bert. I was unaware of this.
1 like
Comment
Fahad Mirza

Join Date: Sep 2018

Posts: 263
#110

23 Jan 2026, 12:51

Allow users to split marker labels (in twoway graphs) across multiple lines, either by word or by character count. Additionally, allow the resulting lines to be left-, center-, or right-justified.
Comment
Fahad Mirza

Join Date: Sep 2018

Posts: 263
#111

23 Jan 2026, 12:53

Allow users to import JSON files into Stata. I see we have *.parquet now, but it would be nice to include JSON too.
1 like
Comment
daniel klein

Join Date: Mar 2014

Posts: 3914
#112

25 Jan 2026, 08:45

Re: #111: JSON allows arbitrary nesting of key-value pairs, with values that can be scalars, arrays, or objects. There's no single, standard way to represent a rectangular (observations x variables) dataset in JSON, which makes it impossible to write a general, structure-agnostic importer. That said, I'd love to see a JSON parser/writer in Mata.
1 like
Comment
Leonardo Guizzetti

Join Date: Jul 2016

Posts: 2464
#113

25 Jan 2026, 11:18

Originally posted by daniel klein

Re: #111: JSON allows arbitrary nesting of key-value pairs, with values that can be scalars, arrays, or objects. There's no single, standard way to represent a rectangular (observations x variables) dataset in JSON, which makes it impossible to write a general, structure-agnostic importer. That said, I'd love to see a JSON parser/writer in Mata.

A JSON writer shouldn't be too tricky to write for a simple rectangular dataset. What did you have in mind?
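
To make the rectangular case concrete, here is a minimal sketch that writes a small dataset as a JSON array of records using only file write. The file name auto_records.json, the three-observation subset of the auto data, and the one-object-per-observation layout are all assumptions chosen for illustration; a layout with one array per variable would be just as valid, which is the ambiguity raised in #112. The sketch assumes no missing values and does not escape quotes or backslashes inside strings.

```stata
* Sketch: write a rectangular dataset as a JSON array of records.
* Assumptions: no missing values; strings contain no quotes or backslashes.
sysuse auto, clear
keep make price mpg
keep in 1/3

tempname fh
file open `fh' using "auto_records.json", write text replace
file write `fh' "[" _n
forvalues i = 1/`=_N' {
    * comma after every record except the last
    local sep = cond(`i' < _N, ",", "")
    file write `fh' "  {"
    * _char(34) writes a literal double quote
    file write `fh' _char(34) "make" _char(34) ": " _char(34) (make[`i']) _char(34) ", "
    file write `fh' _char(34) "price" _char(34) ": " (string(price[`i'])) ", "
    file write `fh' _char(34) "mpg" _char(34) ": " (string(mpg[`i']))
    file write `fh' "}`sep'" _n
}
file write `fh' "]" _n
file close `fh'

type "auto_records.json"
```

A general writer would loop over a varlist, branch on storage type, and decide how to encode missing values and value labels; those choices are exactly where the design questions start.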
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#114

26 Jan 2026, 14:01

It would be easy to add flogit for fractional responses to telasso, and would make it comparable to teffects. The estimation is identical to logit but where in the objective function the y can be any value in [0,1] rather than just zero or one. Robust inference is already being used and so this is literally removing the data check that y must be binary in the logit version.
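
The point that the estimator is just the logit objective with a fractional y can be illustrated with commands that already exist. The sketch below does not involve telasso; it only shows that fracreg logit and a binomial GLM with a logit link give the same point estimates for an outcome in [0,1]. The variable share is a made-up rescaling of mpg, used purely as an illustrative fractional response.

```stata
* Sketch: the Bernoulli quasi-likelihood applied to a fractional outcome.
sysuse auto, clear
quietly summarize mpg
* illustrative outcome in [0,1]; not a real fractional response
generate double share = (mpg - r(min)) / (r(max) - r(min))

* Fractional logit; fracreg reports robust standard errors by default
fracreg logit share weight foreign
matrix b_frac = e(b)

* Same objective function via the binomial family with a logit link
glm share weight foreign, family(binomial) link(logit) vce(robust)
matrix b_glm = e(b)

display "max relative difference in coefficients: " mreldif(b_frac, b_glm)
```

The two coefficient vectors should agree up to optimizer tolerance, which is the sense in which only the binary-outcome data check separates the two cases.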
2 likes
Comment
Matej Seifert

Join Date: Apr 2025

Posts: 5
#115

07 Feb 2026, 02:47

Would it be possible to show a filtering history in the Data Editor window, to allow easy navigation among several previously created filters?
Comment
Clyde Schechter

Join Date: Apr 2014

Posts: 30360
#116

10 Feb 2026, 15:30

I would like to see some changes in how -collapse- handles value labels. Specifically, when we aggregate with (max), (min), (firstnm), (lastnm), (first), or (last), the values that appear in the collapsed variable are necessarily values that occurred in the original variable, or, at worst, are missing. Consequently it would be helpful if the value label of the original variable were carried over to the collapsed data set and applied to the variable.

Also, there is a defined sort order for string variables, so I don't understand why (max) and (min) are not allowed with them.
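
Until something like this is built in, the value label can be carried across by hand. The following is a minimal sketch under stated assumptions: it uses the auto data, where foreign carries the value label origin, and a scratch file name, _lbl_backup.do, that is purely hypothetical. It saves the label definition before collapsing and reattaches it afterward, so it works whether or not the definition survives the collapse.

```stata
* Sketch: carry a value label across -collapse- by hand.
sysuse auto, clear
local lbl : value label foreign          // name of the attached label (origin)
label save `lbl' using "_lbl_backup.do", replace

collapse (max) foreign, by(rep78)

run "_lbl_backup.do"                     // (re)define the label in the collapsed data
label values foreign `lbl'               // reattach it to the collapsed variable
erase "_lbl_backup.do"

tabulate foreign
```

This is only safe for statistics such as (max), (min), (first), and (last), where every collapsed value is one that occurred in the original variable.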
5 likes
Comment
Jeff Wooldridge

Join Date: Apr 2014

Posts: 2291
#117

05 Mar 2026, 10:26

I feel I've asked this before, but I could not locate it in the search. How come there is no option "vce(none)" for essentially any Stata command? When running a simulation to evaluate bias and efficiency, one does not need to, or want to, compute the standard errors. For some commands and scenarios the extra computation is trivial but I think for some it's enough to noticeably slow the simulations. If I'm wrong, I'd be glad to know that, too. I'm thinking specifically of teffects.
7 likes
Comment
Ahmad Abbadi

Join Date: Dec 2025

Posts: 2
#118

06 Mar 2026, 23:06

One workflow I often struggle with as a Stata user is extracting structured data from text. In medical research, we frequently receive patient information in the form of PDF medical charts. To process these data, I usually rely on colleagues who use Python to extract the text and convert it into variables, which I then import into Stata and further process with code to create structured variables.

With the increasing availability of large language models (LLMs), it would be extremely useful if Stata could support this workflow more directly. For example, Stata could allow users to extract text from PDFs into variables and then use integrated LLM-based tools to identify and extract relevant information (e.g., diagnoses, dates, laboratory values) and convert them into structured numerical or categorical variables.

Such functionality could significantly streamline the process of transforming unstructured clinical text into analysis-ready datasets, while keeping the entire workflow within Stata and maintaining reproducibility.
Comment
Nick Cox

Join Date: Mar 2014

Posts: 36074
#119

10 Mar 2026, 09:44

#118 Ahmad Abbadi Not speaking for StataCorp, but speaking for how as a user I want StataCorp to work!

I sense your need here -- which I guess is shared by others. But I think we all benefit long-term from extreme caution by StataCorp on what is supported. The company tends to be reluctant to adopt or to depend on third-party software because

1. StataCorp doesn't want to adopt functionality that it can't control.

2. StataCorp doesn't want to document functionality that may become rapidly out-of-date, not to say obsolete.

I have no special expertise or experience on LLMs, but just one wild guess, that the landscape is fast changing and that even major players may disappear with minimal notice.

In any case, which functionality do you have in mind?

It's possibly not a sensitive issue any more, but I can recall some years of requests that StataCorp give special support to external text editors, with two extremes to the request:

A. Support my own favourite (naturally).

B. Support all common text editors.

By and large, StataCorp's solution was just to work intermittently on its own text editor (I mean the do-file editor).
2 likes
Comment
Julian Reif

Join Date: Dec 2018

Posts: 54
#120

12 Mar 2026, 09:12

Re: #117 Jeff Wooldridge Adding a vce(none) option would also be useful for non-parametric power analyses. This was mentioned in the Stata 18 wishlist.

Associate Professor of Finance and Economics
University of Illinois
www.julianreif.com
1 like
Comment
