  • Neural Network

    Hi everyone,

    I have just recently uploaded a neural network module for Stata.

    brain.ado is a neural network application that provides some remedy for the distinct lack of such methods in Stata.

    It is a no-frills implementation of a backpropagation algorithm in Mata, designed for a hassle-free setup of multi-layered neural networks. After training, the whole network can be saved and loaded using so-called brain files (default extension .brn). The network is represented by a set of reserved matrices to provide transparent access to all components and to support older Stata versions. Additional functions facilitate the calculation of pseudo-marginal effects or signal throughput, but the main utility is of course prediction, e.g. for propensity scores or classification.

    You can install it with:

    ssc install brain

    Cheers,
    Thorsten

  • #2
    Very interesting, thank you! (and nice examples in the doc)



    • #3
      I have updated the brain.ado file. It now has a more consistent output reflecting the real training error. You can specify the report interval to reduce the volume of output. The "best" option always retains the best network if the error fluctuates due to a large eta. Batch training is implemented, even though its full potential will only be available once Stata implements real parallel programming.

      Because of the cumbersome update process of the SSC repository, you can download the files from my GitHub page:

      https://github.com/ThorstenDoherr/brain

      Delete all traces of the former brain.ado and brain.sthlp from your local ado directory before copying the new version. Also check the "b" subdirectory.



      • #4
        Originally posted by Thorsten Doherr
        I have updated the brain.ado file. It now has a more consistent output reflecting the real training error. You can specify the report interval to reduce the volume of output. The "best" option always retains the best network if the error fluctuates due to a large eta. Batch training is implemented, even though its full potential will only be available once Stata implements real parallel programming.

        Because of the cumbersome update process of the SSC repository, you can download the files from my GitHub page:

        https://github.com/ThorstenDoherr/brain

        Delete all traces of the former brain.ado and brain.sthlp from your local ado directory before copying the new version. Also check the "b" subdirectory.
        May I ask how to install brain directly using the net install brain or github install brain command? Thanks.



        • #5
          Follow the link to an excellent description of the "github" command for Stata:

          https://haghish.github.io/github/#1-introduction

          I haven't used that tool yet, but thank you for leading me to it.
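
          For reference, the route asked about above might look like the following untested sketch. It assumes the "github" command is installed from the page linked above and that the brain repository contains the package files (a .pkg and a stata.toc) that Stata's net install machinery needs; the thread does not confirm this, so the manual copy described earlier remains the safe fallback.

          ```stata
          * One-time setup: install the "github" command itself
          net install github, from("https://haghish.github.io/github/")

          * Assumption: the repository ships the package files needed for installation
          github install ThorstenDoherr/brain

          * Check which copy of brain.ado Stata will actually run
          which brain
          ```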



          • #6
            A new version of brain is available on GitHub. This version supports weights for training (and for the error report) in case you have very skewed output variables, e.g. a binary variable with inflated zeroes or ones. Furthermore, the command "brain fit" calculates precision and recall. Though it is part of the brain package, you can also use it outside the context of brain, e.g. to compare different probit models. The help file now has even more examples.

            https://github.com/ThorstenDoherr/brain

            For the next larger update, I intend to replace the Mata code with C plugins to speed up training and prediction, but this will happen sometime in 2020. That version will be on SSC again.
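
            To make the two measures concrete, here is a minimal hand-computed sketch of precision and recall for a binary outcome. It does not use "brain fit" (its syntax is not shown in this thread); the variable names y (observed 0/1 outcome) and p (predicted probability), the derived yhat, and the 0.5 cutoff are all assumptions for illustration.

            ```stata
            * Assumes y holds the observed 0/1 outcome and p a predicted probability
            gen byte yhat = p >= 0.5 if !missing(p)

            * Count true positives, false positives and false negatives
            count if yhat == 1 & y == 1
            scalar tp = r(N)
            count if yhat == 1 & y == 0
            scalar fp = r(N)
            count if yhat == 0 & y == 1
            scalar fn = r(N)

            * Precision: share of predicted ones that are correct
            * Recall: share of actual ones that were found
            display "precision = " tp/(tp + fp)
            display "recall    = " tp/(tp + fn)
            ```

            With heavily inflated zeroes, accuracy alone can look good while recall is poor, which is why both measures are worth reporting next to the training error.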



            • #7
              A new version of the neural network application "brain" for Stata is available. I have stripped all Mata code and replaced it with C plugins that support multiprocessing based on OpenMP. The performance gain is beyond expectations. For the full benefit of multiprocessing you should have at least 4 cores. There is an option to switch back to single-core processing in case the etiquette of the server environment requires it (brain will occupy all cores it can get hold of).

              As usual, you can download this version from my GitHub page:

              https://github.com/ThorstenDoherr/brain

              Click the green "Clone or Download" button -> Download ZIP
              Copy brain.ado, brain.sthlp, brainwin.plugin, brainunix.plugin and brainmac.plugin into your ado directory (typically c:\ado). The plugin directory contains the source code for the plugins and a project directory for Visual Studio. If you are interested in plugins, please consult the Stata documentation: https://www.stata.com/plugins/
              Do not forget to make sure that no traces of older versions remain deeper in the ado directory.

              I have a request to the community: the Unix and Mac plugins are not yet tested because I lack the respective Stata versions. I would really appreciate it if somebody with access to one of those could give me feedback about the functionality of the plugins.

              Thanks in advance.



              • #8
                The latest version of the brain neural network module is now also available on SSC.
                Unfortunately, the Unix and Mac plugins do not support multiprocessing due to the inconsistent support of OpenMP by the Unix/Mac distributions. They are still fast (after all, it is C code), just not as fast as the Windows version.



                • #9
                  Hi,

                  I have installed the 'brain' command in STATA/SE 15.1 without problems.
                  But when I run the command:

                  .brain

                  I got this error message:

                  non-natively compiled windows plugin detected, e.g. cygwin/mingw
                  unable to load brainwin.plugin from directory c:\ado\plus/b/
                  perhaps additional dlls are required in that directory, e.g.:
                  brainwin.plugin
                  libgomp-1.dll
                  libwinpthread-1.dll
                  libgcc_s_seh-1.dll
                  r(999);

                  Any suggestions?

                  Thanks!



                  • #10
                    Hi Aitor,

                    It seems that you have some old plugins in your ado path that somehow got invoked after ssc install brain.

                    Just go into your c:\ado directory and delete all brain files and brain-related plugins: brain.ado, brain.sthlp, brainwin.plugin, brainmac.plugin, brainunix.plugin
                    All brain-related files and plugins that you downloaded with ssc install should reside in “C:\ado\plus\b”.

                    You can always open Stata and enter “which brain” to see which directory Stata uses when you call “brain”. If it is not the directory you expect, something is messed up. For instance, you may have old plugins in the active directory; Stata will always choose those instead of the ones in the proper install directory. Just delete all instances that are not in c:\ado\plus\b.
                    If that fails, delete the brain files in c:\ado\plus\b and reinstall them with ssc install brain.
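
                    The checks above can be run from within Stata roughly as follows. This is a sketch using standard Stata commands rather than anything brain-specific, and it assumes the package was installed from SSC under the name brain.

                    ```stata
                    * List every brain.ado along the ado-path, not just the first hit
                    which brain, all

                    * Show the order in which Stata searches its ado directories
                    adopath

                    * Locate every copy of the Windows plugin along the ado-path
                    findfile brainwin.plugin, all

                    * If stray copies remain, remove the installed package and reinstall it
                    ado uninstall brain
                    ssc install brain, replace
                    ```

                    Copies that were placed by hand (for example from the GitHub ZIP) are not tracked as an installed package, so ado uninstall will not remove them; those still have to be deleted manually.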

                    I hope this resolves your issue.
                    Remember, if all else fails, there is always the GitHub version at https://github.com/ThorstenDoherr/brain

                    Best,
                    Thorsten



                    • #11
                      Hi Thorsten,

                      Thank you very much for your detailed reply.

                      I have checked that the installation of "brain" is correct. The files are only in the folder "C:\ado\plus\b", and the command "which brain" in Stata indicates that the program being run is from that folder. But the problem persists.

                      I have tried installing it on a newer laptop and it works perfectly. On the older laptop, I suspect the problem is linked to the version of the .dll files in the operating system (an old version of Windows 10).

                      I have tested the command and I think you have done an excellent job. Congratulations!

                      Thank you very much for your help.

                      Best regards,
                      Aitor



                      • #12
                        Hi Thorsten,

                        Thanks for making this program. I am not a coding whiz, and also am relatively new to neural networks (coming from more traditional econometric techniques).

                        Two questions. 1) How do I open and then export the matrices that brain creates?
                        2) If I wanted to use the NN adjustors that were formed on my training data to predict for new data and then calculate an R-squared from that, would this be the appropriate command sequence?

                        use completedata.dta
                        sum _n
                        preserve
                        drop if _n <r(p50)
                        brain define, input(x x2) output(y) hidden(10)
                        brain train, iter(1000)
                        restore
                        sum _n
                        drop if _n>=r(p50)
                        brain think ybrain
                        egen rbrain = sum((y-ybrain)^2)
                        sum y
                        scalar ymean = r(mean)
                        egen sst = sum((y-ymean)^2)
                        gen rsquared= 1-rbrain/sst


                        Would this then give me the rsquared of the NN created by training but used on the non-training set?

                        Thanks for your help

                        Best
                        Phil



                        • #13
                          Hi Phil,

                          You can save and load the neural network matrices using the "brain save file_name" and "brain load file_name" commands, respectively (the extension is ".brn" by default).
                          I have adjusted your code (sum _n does not work):

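                          use completedata.dta
                          * Train on the second half of the observations
                          preserve
                          drop if _n < _N/2
                          brain define, input(x x2) output(y) hidden(10)
                          brain train, iter(1000)
                          restore
                          * Predict on the first half, which was not used for training
                          drop if _n >= _N/2
                          brain think ybrain
                          * Residual sum of squares of the prediction
                          egen rbrain = sum((y-ybrain)^2)
                          * Total sum of squares around the mean of y
                          sum y
                          scalar ymean = r(mean)
                          egen sst = sum((y-ymean)^2)
                          * Out-of-sample R-squared (constant across observations)
                          gen rsquared = 1-rbrain/sst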

                          And yes, this will return the r-squared of the out-of-sample prediction.
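
                          As a variation, the save and load commands mentioned above could be used to keep the trained network on disk instead of relying on preserve/restore. This is only a sketch: the file name mynet is hypothetical, and it assumes that "brain load" restores everything "brain think" needs, as the description of brain files in the first post suggests.

                          ```stata
                          * Train on the second half of the data and store the network on disk
                          use completedata.dta, clear
                          drop if _n < _N/2
                          brain define, input(x x2) output(y) hidden(10)
                          brain train, iter(1000)
                          * mynet is a hypothetical file name; ".brn" is the default extension
                          brain save mynet

                          * Later, possibly in a new session: reload the network and predict
                          use completedata.dta, clear
                          drop if _n >= _N/2
                          brain load mynet
                          brain think ybrain
                          ```

                          The saved file also makes it possible to apply the same network to a different dataset, provided it contains the input variables x and x2.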

                          Best,
                          Thorsten



                          • #14
                            Originally posted by Sean Fiedler
                            Very interesting, thank you! (and nice examples in the doc)
                            Sorry if I'm missing it, but where can I find those nice examples in doc?



                            • #15
                              Originally posted by Simra Temmy

                              Sorry if I'm missing it, but where can I find those nice examples in doc?
                              They are in the help file. After installing brain, just enter: help brain
