  • Leonardo Guizzetti
    replied
    On the topic of making a small change, it would be nice to have an option to use (or a default of) -double- precision when using -sts list, saving()-.


  • Mario Ferri
    replied
    A full Bayesian toolbox, like the BEAR toolbox from the ECB.


  • JanDitzen
    replied
    As per usual, I might have missed a comment or a discussion somewhere, but I think it would be helpful, in order to save memory and computational power, to have a "sparse" matrix eltype. I believe that for machine learning applications such as lasso-style estimators, or for spatial econometrics with a large number of cross-sections, sparse matrices would be extremely helpful. I know that you can use arrays to create "sparse" matrices, but maintaining them and doing calculations with them is much more difficult.

    An example would be a spatial weights matrix in a dataset with, say, 100000 firms. Then I need to create a spatial weights matrix which has 100000^2 (or 1e+10) elements. This takes up space, and most of the matrix is 0 anyway since only a handful of elements are nonzero. For a sparse matrix eltype, I am thinking of the following:

    Code:
    N = 100000
    // initialise sparse matrix (proposed syntax)
    W = sparse(N,N)
    // two elements are nonzero
    W[1,1] = 0.5
    W[1000,200] = 0.2
    
    X = rnormal(N,1,0,1)
    
    WX = W * X
    In comparison to using arrays, this code is much cleaner and easier to read.

    Of course I can write my own function which maintains a list of non zero items in W and then just loops over those for any mathematical operations. But the result is likely to be sparse again.

    I understand that Stata and Mata already have checks for and algorithms to handle sparse matrices for matrix inversion, so I hope it would be an easy task to implement in a more efficient way.
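
    For comparison, the "write my own function" route mentioned above might look roughly like the following Mata sketch. It stores only the nonzero entries of W as coordinate triplets and never forms the N x N matrix. The function name coo_times_vector() and its argument layout are made up for illustration; this is not an existing Mata function, and it assumes W is square with as many rows as X.

```stata
mata:
// Hypothetical helper (not built into Mata).
// i, j, v : K x 1 vectors with the row index, column index, and value
//           of each nonzero element of W
// X       : N x 1 vector
// Returns W*X without ever creating the N x N matrix W.
real colvector coo_times_vector(real colvector i, real colvector j, real colvector v, real colvector X)
{
    real colvector WX
    real scalar    k

    WX = J(rows(X), 1, 0)
    for (k = 1; k <= rows(v); k++) {
        WX[i[k]] = WX[i[k]] + v[k] * X[j[k]]
    }
    return(WX)
}

// Same two nonzero elements as in the wished-for syntax above
N  = 100000
X  = rnormal(N, 1, 0, 1)
WX = coo_times_vector((1 \ 1000), (1 \ 200), (0.5 \ 0.2), X)
end
```

    This works for a single product, but every further operation (W'W, W*W, inversion) needs its own hand-written routine, which is the maintenance burden a native sparse eltype would remove.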


  • Clyde Schechter
    replied
    Re #383. Yes, my point was that the request in #378 sounded like a request for a modification to -predict-, not -margins-. And, yes, it is easy to code the equivalent of an -at()- option for predict: it actually comes up from time to time in my work and I have done it several times on an ad hoc basis. It would be a minor convenience to have it built in to the -predict- command. This isn't something I would have asked for myself, given the ease of doing it homebrew. I was just trying to clarify what I thought the poster of #378 had in mind, given that I didn't think it was unreasonable.


  • wbuchanan
    replied
    Fahad Mirza ,
    If you are familiar with Python and/or Java, it is possible to use the existing APIs and libraries in those languages to do web scraping. It is possible to do something using Mata, but it would require building out a parsing model for HTML and might also need other functionality that isn’t supported natively in Stata (e.g., setting parameter values for headers, passing cookies, etc.) but is already well supported in other languages.


  • daniel klein
    replied
    Originally posted by Clyde Schechter
    I'm not certain of this, but perhaps what is meant in #378 would be better described as requesting an -at()- option for the -predict- command which would allow the generation of observation-level predictions with a specified list of variables constrained to specified values. If that's what is being requested, it sounds like a reasonable ask to me.
    The at() option of margins allows the following as an atspec:

    Code:
    varname = generate(exp)
    Having this for the predict command would make it easy to store the observation-level predictions in the dataset. I never felt the need to do this, but I see that you might want to have something like that. This should be rather easy to implement yourself, though.


  • Fahad Mirza
    replied
    I am not sure if this will be a legit request, but I would really love to see web scraping made possible through Stata. There is a lot of data available online that would be useful to extract; one example is daily prices of goods.


  • Clyde Schechter
    replied
    I'm not certain of this, but perhaps what is meant in #378 would be better described as requesting an -at()- option for the -predict- command which would allow the generation of observation-level predictions with a specified list of variables constrained to specified values. If that's what is being requested, it sounds like a reasonable ask to me.


  • William Lisowski
    replied
    The request in #378 regarding margins reflects a discussion at

    https://www.statalist.org/forums/for...the-covariates

    which seems to reflect an incomplete understanding of the margins command.


  • daniel klein
    replied
    I do not quite understand #378. margins has an asobserved as well as a mean statistic that can be used in at(). There is also the atmeans option. What am I missing?


  • Oscar Ozfidan
    replied
    The margins "at" option allows only a subset of values for the variable that is allowed to vary while other variables are held at their means, etc., when producing predictions. It would be nice to have a way, within the "at" option, to set the variable that is allowed to vary to its observed values. With such an option, margins would work much like a predict command, except that it would produce a prediction for each observation in the sample while holding all other variables at specified values. This would be very handy where manual specification of values is not feasible, such as in a rolling regression. Right now, after estimation, I have to generate copies of the original values of the variables that are held at their means, replace those variables with their means, run predict, and then replace the means with the original values.
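
    A minimal sketch of the manual workaround described above, assuming the union example dataset used elsewhere in this thread and a simplified probit model. The choice of grade as the variable held at its mean, and the names grade_orig and pr_grade_mean, are illustrative only.

```stata
webuse union, clear
probit union age grade not_smsa south

// Keep a copy of the variable that will be held at its mean
clonevar grade_orig = grade

// Overwrite it with its estimation-sample mean
// (replace promotes the storage type if the mean is not an integer)
summarize grade if e(sample), meanonly
replace grade = r(mean)

// Observation-level predictions: grade at its mean, everything else as observed
predict double pr_grade_mean, pr

// Put the original values back
replace grade = grade_orig
drop grade_orig
```

    Every variable held fixed needs the same copy, replace, and restore steps, which is the bookkeeping the requested option would avoid.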


  • FernandoRios
    replied
    Something I would like, which seems simple enough, is to allow margins to label its outcomes rather than using the generic #._predict when more than one predicted outcome is generated.
    For example:
    Code:
    **currently this is what margins produces:
    . webuse union, clear
    . probit union age grade not_smsa south##c.year
    
    . margins, predict(pr) predict(xb)
    
    Predictive margins                              Number of obs     =     26,200
    Model VCE    : OIM
    
    1._predict   : Pr(union), predict(pr)
    2._predict   : Linear prediction, predict(xb)
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
        _predict |
              1  |    .221831   .0025368    87.44   0.000     .2168589    .2268031
              2  |  -.7857457   .0088079   -89.21   0.000     -.803009   -.7684825
    ------------------------------------------------------------------------------
    
    ** but it would be useful if it produced this instead:
    . margins, predict(pr) predict(xb)
    
    Predictive margins                              Number of obs     =     26,200
    Model VCE    : OIM
    
    1._predict   : Pr(union), predict(pr)
    2._predict   : Linear prediction, predict(xb)
    
    ------------------------------------------------------------------------------
                 |            Delta-method
                 |     Margin   Std. Err.      z    P>|z|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             pr  |    .221831   .0025368    87.44   0.000     .2168589    .2268031
             xb  |  -.7857457   .0088079   -89.21   0.000     -.803009   -.7684825
    ------------------------------------------------------------------------------


  • Fahad Mirza
    replied
    I would really like Stata 17 to add support for analysis using GeoTIFF images. I have been longing to use night light data by importing it directly into Stata; at present, this requires processing it in ArcGIS/QGIS first and then converting it to a CSV file or shapefile before analysis. It would be wonderful to have this feature.


  • Nick Cox
    replied
    #371 to #374

    If you want

    missing + 42

    to be treated as having the same total as

    42

    then you want missing to be treated as 0. There is a side-effect that it is hard to tell apart

    0 + 0

    0 + missing

    and

    missing + missing

    but you can't solve all problems at once. It's also hard to tell apart

    0 + 0

    -2 + 2

    and so on.


    The point often arises with groupings at different scales in which, say, totals over cross-classifications of X and Y are needed together with totals over categories of X. The latter can't be aggregated easily from the former if this rule isn't followed.

    As pointed out, for _final_ results, if you want the sum of all missings to be shown as missing, then you can have that optionally. It's just not the default.


  • Rich Goldstein
    replied
    daniel klein re: #371 - if all variables are missing when using egen's rowtotal() function, and you want the result to be missing, use the "missing" option; in addition, I agree with #372 and see nothing wrong with either command, with the way they deal with missing values, or with the totals
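
    The behaviour discussed in #371 to #374 can be checked on a small made-up dataset. The variable names a, b, total_default, and total_missing are illustrative only.

```stata
clear
input a b
.  42
0  0
0  .
.  .
end

// Default: missing values are treated as 0
egen total_default = rowtotal(a b)

// With the missing option: the total is missing only when every value in the row is missing
egen total_missing = rowtotal(a b), missing

list
```

    The two new variables differ only in the last row, where both a and b are missing: total_default is 0 and total_missing is missing.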
