  • Encrypt data at rest for use with Stata

    I'm helping to review a data security protocol for some proprietary data storage. I'm wondering if Stata supports any data format that allows the data to be encrypted at rest. By "encrypted at rest" I mean that the data should be encrypted on the hard drive, and it should require a password to load the data into random access memory for analysis with Stata. I've done some light googling, but I haven't found anything yet. I'm wondering if someone here might know a little more.

  • #2
    As far as I know, Stata cannot do this. And while the people at StataCorp might see things differently, I can't see how any statistical package could be designed to do this and still remain usable. Consider: you could not execute non-trivial do-files under such a regime as Stata would have to interrupt execution every time you have a -use-, -merge-, or -append- command. Complicated data management would become a nightmare under this regime. I have read that the Spanish Armada was ultimately defeated because they weighed down their ships with so much defensive armor that the ships were slow and hard to maneuver, so they failed in battle. Let's not replicate that in computing.

    [Rant]
    I don't believe that password "protection" provides real data security. Passwords that are strong enough to really protect are not memorable; password manager devices can be lost; and password manager apps ultimately rely on a password to access them. In the real world, people don't use strong passwords unless forced to do so. So, ultimately they will use a weak password for the app that manages their would-be strong passwords. Or, if they are prevented from doing so, they will just keep a written record of the "strong" password, and typically leave it in an unsecured place.

    Passwords are, in my opinion, an idea whose time will never come. Ultimately, the security of computer data must rest on the security of the computer itself. If you are working with truly sensitive information, you need a computer and OS that have whole-disk encryption built in, and where access to the computer itself is secured by biometrics or two-factor authentication, or, in the most sensitive cases, even by physical sequestration with surveillance and isolation from the internet.
    [/Rant]


    • #3
      I'm not entirely sure I understand (security is not my area of expertise), but I have had clients request that I encrypt via FileVault (Mac), and Stata is fine with that.


      • #4
        I agree with all three paragraphs of what Clyde Schechter wrote in post #2. Except that to the end of his second paragraph, I would add that they will find a way to include the password in the unsecured do-file that accesses the data - I've seen this done by users who have encrypted ZIP archives. Honestly, we've seen users ask how to do that on Statalist.

        On my macOS system I have used external drives (flash drives, usually) which I have had macOS format as a macOS encrypted filesystem. Every time the drive is attached to the Mac, macOS throws a dialog box into which the password must be typed before macOS will mount the filesystem, and decrypt it on the fly when it is read, and encrypt new files written to it as they are written. Until the drive is ejected from the Mac, the filesystem is readable and writable as though it were an ordinary drive. So the data is vulnerable if I leave my computer unattended while the encrypted drive is mounted. In an office environment a locked door can help with that possibility. But I work from home, and my wife is trustworthy.

        Added in edit: crossed with post #3, which is what I describe.


        • #5
          Thank you all. I am going to suggest that we set up full disk encryption for any device where the data or its derivatives are stored. I use FileVault as a matter of course on my Mac, and the integration is so seamless I tend to forget about it. I know this is also fairly easy to set up in most Linux distributions, and I've just learned it is straightforward in Windows as well.

          Regarding #2: I agree with much of what Clyde says about good security practice. However, I'm not sure this kind of encryption would be such a disaster for usability. Instead of a single file, you might imagine an encrypted directory: it could be a directory on an internal hard drive, or on an external drive like William suggests. Now, suppose that the data is encrypted with a public key stored on the hard disk and decrypted with a private key that isn't stored at all (this is your password). When you want to use the content of the directory, you might use a command to decrypt the content of the directory, which should prompt you for your password, mount the directory by loading the content into memory, and decrypt the content in memory. You could then add files to the directory, and -use-, -merge-, or -append- the content of the directory seamlessly in memory until the end of the session. Saves or other writes to the disk are quietly encrypted with the public key as they are written, so the user is not asked for the password again. When you close Stata, the directory is unmounted from memory, leaving only the encrypted data on the disk. I was hoping Stata might have something like this automated with a command, but full disk encryption for the computer seems like a perfectly reasonable alternative. Of course the real Achilles heel of a setup like I describe is given in #4:

          they will find a way to include the password in the unsecured do-file that accesses the data
          Can't argue with that. This kind of bad practice is easily one of the biggest vectors for data breaches.
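
To make the "key that isn't stored at all" idea concrete, here is a minimal Python sketch under stated assumptions. It swaps in a password-derived key (PBKDF2 from the standard library) for the public/private key pair described above, and the names `derive_key`, `new_salt`, `prompt_for_key`, and the iteration count are hypothetical choices for illustration. It only derives the key; Python's standard library has no cipher, so the encryption itself would be left to a vetted library or the operating system.

```python
import getpass
import hashlib
import os

# Assumed work factor for illustration; tune it for your own hardware.
PBKDF2_ITERATIONS = 600_000


def derive_key(password: str, salt: bytes, iterations: int = PBKDF2_ITERATIONS) -> bytes:
    """Derive a 32-byte key from a password.

    Only the salt has to be kept on disk; the key is recomputed from the
    password each session and is never written anywhere.
    """
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations, dklen=32
    )


def new_salt() -> bytes:
    """Generate a random 16-byte salt to store next to the encrypted data."""
    return os.urandom(16)


def prompt_for_key(salt: bytes) -> bytes:
    """Ask for the password interactively so it never appears in a script."""
    return derive_key(getpass.getpass("Data password: "), salt)
```

Because the password is typed at a prompt in `prompt_for_key`, nothing secret needs to be written into a do-file or any other script that touches the data.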
          Last edited by Daniel Schaefer; 07 Feb 2023, 16:46.


          • #6
            Note that macOS also supports "disk images" - files that when "opened" appear as mounted filesystems on the macOS desktop. And disk image filesystems can be encrypted as well, in the same manner as external flash drives.

            My two cents is that security issues like encryption should be the province of the operating system, rather than the application, because the expertise is more focused at the OS level. On an application development team, security will most likely be just one of many hats that a few members wear. Not a good idea.


            • #7
              I once built an encrypted-at-rest data storage system as part of an application's intellectual property management system. It made sense to encrypt only a few files in that setting because the application couldn't necessarily enforce something like full disk encryption. Of course, I didn't try to write the cryptographic components myself. I consumed a well-vetted library that did the low-level work. The library wasn't technically developed by the authors of the operating system, but I can absolutely imagine taking advantage of operating-system-level encryption technologies programmatically in an application.


              • #8
                Regarding Linux: Every modern Linux system like Ubuntu or Mint supports either full-disk encryption or encryption of the home folder. This is highly convenient and only requires a single tick when installing the OS. While Stata is usually not set up in the home folder, as long as you place your do-files and data files in the home folder, they are encrypted. Is this super safe? No. Is this much better than nothing? Yes. Especially when you work on a laptop and it gets stolen or lost, this is really good. If you want to move Stata files safely, I would suggest VeraCrypt or TrueCrypt.
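
If you rely on home-folder encryption, it can help to check that a project's do-files and data files really do sit under the home folder. The following is a minimal Python sketch; the function name `is_under_home` and the example paths are hypothetical, and it only checks where a file is located. It does not verify that encryption is actually switched on for that folder.

```python
from pathlib import Path
from typing import Optional


def is_under_home(path: str, home: Optional[Path] = None) -> bool:
    """Return True if `path` resolves to a location inside the home folder.

    `home` defaults to the current user's home directory; passing it
    explicitly is just a convenience for checking other locations.
    """
    home_dir = (home or Path.home()).resolve()
    # resolve() normalises ".." components and follows symlinks, so a path
    # that merely looks like it is under the home folder is not accepted.
    target = Path(path).expanduser().resolve()
    return target == home_dir or home_dir in target.parents
```

For example, `is_under_home("~/projects/study/clean.dta")` checks a data file against your own home folder before you save sensitive data there.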
                Best wishes

                Stata 18.0 MP | ORCID | Google Scholar
