  • Is it sensible to use GitHub for partially Stata-generated websites?

    Hello All

    I have recently had a change of affiliation to King's College London, and my website has been successfully moved there from Imperial, though the domain name is still

    http://www.rogernewsonresources.org.uk/

    as before, and I still distribute Stata packages, of which I have over 100, often in multiple package versions written in multiple Stata versions. (That way, upgrading a package to a new Stata version does not cause users of the old Stata version to lose the old-version package that they could download before.)

    However, the King's College IT people have asked me to consider doing all further updates in GitHub, which they consider to be the sensible default way to maintain a website. And I have already downloaded Rodrigo Martell's SSC package git, which I understand that many Stata programmers use, and am ready to learn more about it if necessary.

    So, I would like to know a few opinions about GitHub.

    There are 2 reasons why I am being open-mindedly skeptical about GitHub at the moment:

    1. I have always done large numbers of updates, which are usually adding either a new package or updating an existing package. However, my website (or at least its Stata-related parts) is currently mostly computer-generated, using Stata itself to create the table of all versions of all my packages at

    http://www.rogernewsonresources.org.uk/stata.htm#all_versions

    and also the instasisay suite of install-wizard do-files described at

    http://www.rogernewsonresources.org....nload_from_ssc

    to install all my packages from SSC, and also the instasisay_x suites of packages described at

    http://www.rogernewsonresources.org.uk/stata.htm#download_from_here

    to upload all my packages available to a particular version of Stata from my website, in the latest versions compatible with that version of Stata. I am warned that, if I use GitHub, then I will no longer be able to see a new update just by clicking on index.htm on my PC, which sounds very bureaucratic and a pain to me. So, are there also other ways in which GitHub is inconvenient, such as making it difficult or impossible to update my Stata packages automatically, using a single Stata command calling a single ado-file in Stata? I would like to update my multitude of Stata packages at will, without introducing any new bureaucratic rituals, such as calling an additional package a "new version" of my whole website and having to give it a name and/or a number.

    2. I am aware that GitHub now belongs to Microsoft. After a quarter of a century of bad experiences with assorted Microsoft products, I think it is a fair generalisation that Microsoft tend to lock people into Microsoft by any means necessary, to no good effect, by making it time-consuming to escape from Microsoft when people have deadlines. In particular, in my new role, my standard email client was compulsorily downgraded from programmer-friendly Mozilla Thunderbird to programmer-hostile Microsoft Outlook, with all the pain that that implies to programmers like ourselves, and this is making me very cautious about diving any deeper into Microsoft dependency without checking first. Having said that, I have not had many serious problems with Windows 10 or OneDrive, both of which are also Microsoft products.

    Best wishes (and hoping for any feedback)

    Roger

  • #2
    PS to the above, another thing that is worrying me is that, according to

    help git

    the user's git repositories live "in the same location as the ado and personal directories". In my case, that location is somewhere on my C:/ drive, which is not a sensible location for the definitive version of anything. Or am I misunderstanding git and/or GitHub?

    Best wishes

    Roger


    • #3
      PPS to the above, I have found KCL Information Technology to be AMAZINGLY helpful, beyond the call of duty, but through no fault of their own, they probably know less than a lot of Statalisters about the practicalities of maintaining large, complicated Stata package repositories...


      • #4
        I'm by no means an expert, but I would consider using GitHub (or some other hosted git provider) as the definitive, master version. In essence, you set up a repository there, and then clone the work on your own computer(s). After making some modifications, you can then submit pull requests or push changes upstream to GitHub. As an aside, I wouldn't expect any kind of vendor lock-in vis-à-vis Microsoft: it's relatively easy to migrate git repositories from one host to another, since git is an open-source project.
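
The cycle described above (clone, modify, push) and the move from one host to another might look roughly like the sketch below. It is only an illustration: a local bare repository stands in for the hosted one so that it runs offline, and every path, file name and identity in it is hypothetical.

```shell
#!/bin/sh
# Sketch of the clone -> edit -> commit -> push cycle, followed by a host move.
# A local bare repository stands in for the hosted one (e.g. GitHub);
# all paths, file names and identities are hypothetical.
set -e

work=$(mktemp -d)
git init --quiet --bare "$work/hosted.git"            # stand-in for the hosted repository

# Make a local working copy (cloning an empty repository prints a warning).
git clone --quiet "$work/hosted.git" "$work/clone" 2>/dev/null
cd "$work/clone"
git config user.name "Example User"                   # identity for this clone only
git config user.email "user@example.org"

echo "<html>package list</html>" > index.htm          # edit files locally as usual
git add index.htm
git commit --quiet -m "Update package list"
git push --quiet origin HEAD                          # send the change upstream

# Moving to another host: point the remote at the new location and push again.
git init --quiet --bare "$work/other-host.git"
git remote set-url origin "$work/other-host.git"
git push --quiet origin HEAD

# The same commit is now present on the second host.
git --git-dir="$work/other-host.git" log --oneline --all
```

The working copy is an ordinary directory, so files such as index.htm can still be opened locally before anything is pushed.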


        • #5
          As someone who spent a decade working at the border between the IT organization and the research staff in a research institution, I recognize the sort of advice you've been given.

          I think you have received (well intentioned, and likely good) generic advice about software development and maintenance well suited to someone just setting out on the path that you are already years along on, using a system you have developed that is well tailored to the way you choose to support your software. As you say in post #3, they have no appreciation of what your processes accomplish.

          There are statisticians who would tell you that a GitHub repository of R programs is a sensible tool for developing and distributing statistical software.

          If what you do is not broken, why spend the effort to fix it?


          • #6
            I don't use GitHub but that doesn't mean much more than it says. Everyone is entitled to hope that they don't have to learn some (very) complicated stuff that enthusiasts rave about, but fear that they may have to. After all, this is probably the attitude of 98% of Stata learners... (As usual in these discussions, I make up the data I don't have.)

            From a distance I see three advantages among, I guess, very many more to GitHub. It is really good if any of the following applies strongly to you.

            1. At work you are developing software more or less rigidly in a team. You need a central place where everybody concerned can see the current version and also modify it if that is what they do.

            2. You are placing stuff in the public domain and hope and expect that others may pick it up and run with it.

            3. It is very important that successive versions of a program remain accessible.
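
On point 3, a minimal sketch of how git itself keeps an earlier version of a file retrievable after it has been overwritten. The file name mycmd.ado, the tag v1.0 and the identity are made up for illustration, and tagging is optional: any commit can be referred to by its hash instead.

```shell
#!/bin/sh
# Sketch: retrieving an earlier version of a file from git history.
# File name, tag name and identity are hypothetical.
set -e

repo=$(mktemp -d)
cd "$repo"
git init --quiet
git config user.name "Example User"
git config user.email "user@example.org"

printf '%s\n' '*! version 1.0' > mycmd.ado
git add mycmd.ado
git commit --quiet -m "mycmd 1.0"
git tag v1.0                          # optional human-readable label for this snapshot

printf '%s\n' '*! version 2.0' > mycmd.ado
git commit --quiet -am "mycmd 2.0"    # the working file now holds version 2.0

old=$(git show v1.0:mycmd.ado)        # file contents as of the tagged commit
new=$(cat mycmd.ado)                  # file contents in the working copy
printf 'tagged:  %s\n' "$old"
printf 'current: %s\n' "$new"
```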

            My personal experience may be extreme but publicising my Stata code does go back to 1994. I make use of the Stata Journal (SJ) and SSC.

            On 1, locally it's just me in the sense that my collaborators and graduate students and students typically expect me to write the commands.

            On 2, making my programs public doesn't mean that I expect people to run off with them. Some years ago, someone took one of my commands, threw away the help file (on the grounds that people should just look at the comments in the code) and then announced elsewhere that they had improved the command. (I forget what changes they made.) They then seemed very surprised when I asked them not to do that. No doubt this makes me seem very old-fashioned in being possessive about code I write, but people are welcome to clone the code and make changes, so long as they then work with it under a different command name. Or, simply, collaborators take over a program when they have much better ideas than I do on where to take it, and we agree on that.

            On 3, naturally this does bite from time to time, although far more often with my stuff the answer is just to use the latest version.

            Now, more generally, on the same three points:

            1. Stata Journal and SSC are not generally good solutions for team working. If you run commands or indeed do-files within a team, neither SJ nor SSC will publish your do-files and in any case for most commands the time scale and other barriers are quite inappropriate. You want to change code and post it accessibly and instantly.

            2. Stata Journal and SSC essentially don't support forking.

            3. The SJ is very open to publishing bug fixes and versions with enhancements. (In this respect, the Stata community has been sensitive to updates and rewrites for 30 years now.) SSC isn't geared to maintaining access to previous versions of a program unless you give them different names.

            So, at the broadest level, GitHub can be anywhere from ideal to unnecessary, depending very sensitively on exactly how and why you want to share your Stata or other software.


            • #7
              Have you looked at Haghish's Stata Journal article on GitHub: https://doi.org/10.1177/1536867X20976323 ? That gives you the perspective of a "GitHub believer" who uses it for Stata programs.

              I have tried Git and GitHub for a new package, and it's OK. I am not wildly enthusiastic, but I see its appeal. If you are using Git (and GitHub) for a project you are writing on your own, then its main advantage is as a way to automate your workflow. Workflow is very personal, so whether a tool helps or constrains differs from person to person. My experience is mixed: there were times I liked it and there were times I felt constrained by it. However, I certainly don't like it enough to transfer my entire portfolio of Stata commands to GitHub.
              ---------------------------------
              Maarten L. Buis
              University of Konstanz
              Department of history and sociology
              box 40
              78457 Konstanz
              Germany
              http://www.maartenbuis.nl
              ---------------------------------


              • #8
                My personal web pages are written with Microsoft Frontpage 2003. They are not going to win any design awards but they are perfectly functional. I have used Roger's web page a few times and it is perfectly functional too.

                If I were a graduate student today I might get into GitHub and R and maybe even Python. Seeing as how I have been a professor for 35 years, GitHub etc. are things I might look into if I get bored in retirement.
                -------------------------------------------
                Richard Williams, Notre Dame Dept of Sociology
                StataNow Version: 19.5 MP (2 processor)

                EMAIL: [email protected]
                WWW: https://www3.nd.edu/~rwilliam


                • #9
                  Many thanks to everybody for many helpful comments.

                  One immediate query. Would I be right in guessing that I cannot run Stata from inside GitHub to input and output files under GitHub? Because, if I can't, then I cannot really keep the definitive version of my website under GitHub, because the definitive version of my website needs to be generated using my personal Stata command tocgen.ado, which generates my .toc files in .toc format, my package lists in native HTML, and my install-wizard do-files in Stata (using the net and ssc commands). Technology aside, I would like to be seen to have explored thoroughly the possibilities that GitHub might offer me.

                  Best wishes

                  Roger


                  • #10
                    I haven't done so, but I suspect it is possible to issue Git commands from Stata by prefixing the Git command with an exclamation mark. These commands include the possibility to upload changes to GitHub. So I guess it is possible.
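
As a rough sketch of what that could involve, the Git commands for publishing freshly generated files are shown below as they would be typed at a shell prompt; from inside Stata, each one-line git command would be prefixed with `!` (or `shell`). The `publish` helper, the paths and the commit message are hypothetical, a local bare repository stands in for GitHub, and a plain `echo` stands in for whatever the Stata run writes.

```shell
#!/bin/sh
# Sketch: stage whatever a site generator changed, commit only if something
# differs, then push. Helper name, paths and messages are hypothetical.
set -e

publish() {
    # $1 = working copy holding the generated site, $2 = commit message
    cd "$1"
    git add --all                             # stage new, changed and deleted files
    if git diff --cached --quiet; then
        echo "nothing to publish"             # the generator produced no changes
    else
        git commit --quiet -m "$2"
        git push --quiet origin HEAD
        echo "published"
    fi
}

# Offline demonstration: a local bare repository stands in for the hosted one.
demo=$(mktemp -d)
git init --quiet --bare "$demo/hosted.git"
git clone --quiet "$demo/hosted.git" "$demo/site" 2>/dev/null
git -C "$demo/site" config user.name "Example User"
git -C "$demo/site" config user.email "user@example.org"

echo "v1" > "$demo/site/stata.toc"            # stands in for output of the Stata run

first=$(publish "$demo/site" "Regenerate package lists")
second=$(publish "$demo/site" "Regenerate package lists")
echo "$first / $second"
```

No version name or number is required at any point in this sequence; a commit needs only a message.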
                    ---------------------------------
                    Maarten L. Buis
                    University of Konstanz
                    Department of history and sociology
                    box 40
                    78457 Konstanz
                    Germany
                    http://www.maartenbuis.nl
                    ---------------------------------


                    • #11
                      Many thanks Maarten. I will read Ebad Haghish's article and investigate the possibilities.
