  • #91
    I sometimes log into someone else's server where they have Stata installed. Every time I log in, all settings are returned to default. I would dearly like to have the option to toggle settings like autocomplete, smart brackets, etc., from a .do file.
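
    The closest existing mechanism I know of is a profile.do, which Stata runs automatically at every launch; a minimal sketch of the idea, noting that the autocomplete and smart-bracket lines are hypothetical wished-for settings rather than current Stata syntax:
    Code:
    * profile.do runs automatically when Stata starts; existing -set- commands already work here
    set more off
    set scheme s1color
    * hypothetical settings I would like to be able to toggle (not real syntax today):
    * set autocomplete off
    * set smartbrackets off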

    My other wish concerns free text. I often work with data that includes free-text fields, and busy people seem to have a variety of ways of expressing the same thing. I know of and have used functions like soundex(), and I am aware of community-contributed options like ngram, but extracting the information I need can be quite laborious. So a slightly nebulous wish: for StataCorp to invest some time in developing functionality in this area.
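
    For what it's worth, a small example of the kind of fuzzy matching I currently piece together with the built-in soundex() function (variable names are purely illustrative):
    Code:
    * flag free-text entries that sound alike despite different spellings
    generate code_a = soundex(drug_name_a)
    generate code_b = soundex(drug_name_b)
    generate byte likely_same = (code_a == code_b) if !missing(drug_name_a, drug_name_b)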



    • #92
      Hi,

      I feel that Stata is lagging behind in machine-learning tools. To my knowledge, there is still no official Stata command implementing something as basic as a random forest (a method that dates back to 2001). This does not seem right.

      Recently, I have been looking at the work of Susan Athey and colleagues on heterogeneous treatment-effect estimation with causal forests, which dates back to 2016 and has become the norm in my field. Stata has still not done anything about it. While I am an advocate of the robust and consistent way we code in Stata, I find myself coding in R more and more. There was a time when Stata was the first to implement robust tools for statistical analysis; I wonder whether that time is over, and I find myself wondering why I am paying a thousand bucks for the product.

      Am I wrong about this?









      • #93
        Adrien Bouguen I grant that community-contributed modules are not the same as built-in ones. However, there are many fine community-contributed machine learning modules for Stata. Have you explored them? (Deft use of -search- would find many of them.) Are you missing any ML method(s) in particular?
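
        For example, something along these lines turns up several of them (-rforest- on SSC is one I happen to know of; I am quoting the package name from memory, so check the -ssc describe- output first):
        Code:
        search random forest, all     // lists community-contributed modules and Stata Journal articles
        ssc describe rforest          // Schonlau and Zou's random-forest plugin on SSC
        ssc install rforest, replace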



        • #94
          Hi Stephen,

          For instance, the grf package from Susan Athey and colleagues (granted, it is pretty new) for estimating heterogeneous treatment effects via causal forests is not available in Stata, and I don't believe anyone is working on adapting this new and super-useful tool to Stata anytime soon. My take is that Stata users are willing to pay for Stata because they know they will benefit from the work of professional coders. If it takes 20 years for Stata to implement something like random forests, I might be inclined to switch to R (and throw away 10 years of Stata coding skills). I am saying that while loving the product...



          • #95
            Ability to calculate R-squared and the proportion of variance explained in multilevel, mixed-effects generalized linear models such as melogit.
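
            At present the closest built-in quantity I know of is the latent-scale intraclass correlation from -estat icc-, which is not the same thing as an R-squared; a quick illustration with the manual's example data:
            Code:
            webuse bangladesh, clear
            melogit c_use urban age child* || district:
            estat icc    // share of latent-scale variance at the district level, not an overall R-squared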



            • #96
              Collections are a great way to gather information from many different commands into one place. Currently, collections are oriented toward preparing great human-readable tables. But, it would be fantastic to be able to access programmatically all the great information collected, such as by one or more of the following:
              • Provide -collect export- to a Stata .dta dataset, exporting the current layout in a tidy machine-readable form
              • Provide a similar export directly to a new frame in memory (avoiding need to access file storage)
              • Provide -collect export- that dumps the entire collection contents (rather than just the current layout) into either a dataset or frame
              • Provide accessor functions (in Stata or Mata) that allow programmatic reading of the collection contents, something like getCollectionValueFor( ... dim specifications here ...)
              It is, of course, currently possible to -collect export- to an Excel file and then read it back in, but if the current layout has more than one dimension on the rows or columns, the row and column labels are messy from a machine-readability perspective, requiring a bunch of tedious cleanup that could be avoided with a machine-focused export function.
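
              For concreteness, the round trip looks something like this (the file name is a placeholder, and it assumes a layout has already been defined with -collect layout-):
              Code:
              collect export myresults.xlsx, replace
              import excel using myresults.xlsx, clear   // multi-dimension row/column labels arrive messy here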

              It's also currently possible to use -frame post- after each command to accumulate the desired results in a frame, but if you want both machine-readable and pretty human-readable tables, it would require some duplication of effort with -frame post- and -collect-. This approach also requires knowing the desired frame layout in advance—one of the nice things about collections is that they can grow flexibly. Plus, -collect- has the very nice prefix option, which provides more streamlined code than having a bunch of extra -frame post- lines.

              (I searched the documentation for a while to see whether option 4, above, was already implemented and didn't see anything, but if I missed it and it's already there, please let me know!)



              • #97
                Originally posted by Michael Culbertson (see #96 above)
                I've also asked for a way to programmatically dump collections to datasets. There is a user-contributed command called -collect_to_frame- (on GitHub) that facilitates the first bulleted request. However, numeric values are stored raw, so post-processing of formats, etc., will still need to be (re-)applied.



                • #98
                  In Stata 17.0 for macOS, I can run the -sort- command after factor analysis, whereas in Stata 18.0 for Windows I cannot. Also, I wish I could use Stata as efficiently for structural equation modeling (SEM) as I can use Mplus. I hope Stata 19.0 will have those abilities.



                  • #99
                    When generating a variable with d = x > 1, Stata should return a missing value for d if x is missing.
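
                    To illustrate with the auto data, where rep78 has missing values (the second line is the usual workaround):
                    Code:
                    sysuse auto, clear
                    generate byte d1 = rep78 > 3                      // current behavior: d1 = 1 where rep78 is missing
                    generate byte d2 = rep78 > 3 if !missing(rep78)   // workaround: d2 stays missing there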



                    • Originally posted by George Ford:
                      when generating a variable d = x > 1, Stata should return a missing value if x is missing.
                      This would be tricky, because it would fundamentally change how Stata performs calculations: missing values are currently treated as essentially another (very large) number, so Stata would effectively have to treat missings as a special class of values for calculation purposes.



                      • It seems like a simple command could be written (-mgenerate-, or some such) that adds the "if !missing(varname)" qualifier automatically; a sketch of what I have in mind follows.
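
                        A minimal sketch, with the caveat that the command name, the = plus from() syntax, and everything else here are purely hypothetical illustrations rather than an existing package:
                        Code:
                        capture program drop mgen
                        program define mgen
                            version 17
                            // hypothetical syntax: mgen newvar = exp, from(varlist)
                            // newvar is left missing wherever any variable in from() is missing
                            gettoken newvar 0 : 0, parse(" =")
                            gettoken eq     0 : 0, parse(" =")        // discard the "="
                            syntax anything(name=rhs equalok), FROM(varlist)
                            confirm new variable `newvar'
                            tempvar nmiss
                            egen `nmiss' = rowmiss(`from')
                            generate `newvar' = `rhs' if `nmiss' == 0
                        end

                        * usage: mgen d = x > 1, from(x)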



                        • Enhance tools to work with remote databases.
                          -odbc- is OK but pretty basic. Admittedly I have not used -jdbc-.
                          It is often a challenge to build SQL queries as strings in local macros, and it is somewhat difficult to understand the code later on.
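
                          For instance, this is the kind of pattern I mean, with the query assembled in a local macro (the DSN, table, and column names are placeholders):
                          Code:
                          local qry "SELECT patient_id, visit_date, dose FROM clinic.visits WHERE dose > 0"
                          odbc load, exec("`qry'") dsn("MyDatabase") clear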

                          If there were a more seamless way to integrate remote databases (see, e.g., dbplyr in R), that would be great. Such a tool would translate Stata data-management commands (or a series of commands) into SQL and send them to the remote database. Perhaps that is one ambitious option.





                          • Let me freely move the sidebar and its associated windows to whatever location I'd like, as I used to be able to (at least on Windows with Stata 16); I can't do this with Stata 17 on a MacBook. It would also be nice to be able to change the background color of the Data Editor (as I can for the Results window), instead of only being able to put Stata and all of my apps in either dark or light mode.



                            • Packages to produce graphs like 3D plots or map plots would be really helpful.



                              • It would be nice if there were a way to estimate memory usage/consumption of commands (particularly for estimation commands) prior to execution to ensure sufficient memory resources are available (especially important in some *nix environments that do not page memory to temp locations). For example, something like:
                                Code:
                                memest: reg price mpg weight displacement
                                I think it would be possible to determine the exact memory needed for models with closed-form solutions, and estimates could be used for methods that require iteration, since the memory demands there vary depending on what specifically is being computed in the algorithm (e.g., first vs. second derivatives, retaining other matrices created during estimation, etc.).
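
                                In the meantime, a rough back-of-envelope number is possible by hand for OLS-type commands, assuming the dominant cost is holding the N x k design matrix (and X'X) in doubles:
                                Code:
                                sysuse auto, clear
                                local k = 4          // mpg weight displacement + constant
                                quietly count
                                display %12.0fc r(N)*`k'*8 " bytes for X, plus " %12.0fc `k'*`k'*8 " bytes for X'X"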

