
  • strL variables not recognized and encode crashes program

    I am having trouble creating a new variable from my data. See screenshot for the variable that currently exists in the data, it is a strL according to the "describe" function, there are about 8,000,000 observations. It has all of these groups within it. Any time I try to use the generate command, the program doesn't recognize the word "Expired," even though "Expired" seems to be a valid group within this variable based on tabulate. I have tried to encode the variable but every time I type the command "encode hospdisp" the program just thinks (endless thinking wheel of death on my screen; I have let this go on for literally hours) and it never spits out a result. I have also tried "encode hospdisp generate finaldispo" and it does the same thing (thinks for hours and never gives me the new variable).
    I'm a newer user, be nice please, I am hoping this is just a silly error on my part

  • #2
    Can you copy and paste here the output of

    Code:
    dataex hospdisp
    ?



    • #3
      Please send the dataset to [email protected], and we will take a look. Given that it's a large dataset, and from your description, it seems the program is running for a long time without finishing. How big is the dataset, and how much RAM does your system have? Also, try the following to see if it makes a difference (if your Stata is version 17 or newer):

      Code:
      set sortmethod qsort
      encode hospdisp, generate(finaldispo)
      Last edited by Hua Peng (StataCorp); 25 Apr 2025, 16:08.



      • #4
        Originally posted by Andrew Musau View Post
        Can you copy and paste here the output of

        Code:
        dataex hospdisp
        ?
        The first 100 observations listed contained only these two group names.


        • #5
          Originally posted by Hua Peng (StataCorp) View Post
          Please send the dataset to [email protected], and we will take a look. Given that it's a large dataset, and from your description, it seems the program is running for a long time without finishing. How big is the dataset, and how much RAM does your system have? Also, try the following to see if it makes a difference (if your Stata is version 17 or newer):

          Code:
          set sortmethod qsort
          encode hospdisp, generate(finaldispo)
          16 GB of RAM, my version is Stata MP 16



          • #6
            You could also try to create a numeric variable using
            Code:
            egen finaldispo = group(hospdisp)
            Also, try running the command
            Code:
            compress
            before you do any string manipulations, just in case the data is artificially bloated.

            Finally: when providing dataex output, please do not post a screenshot. There is valuable information lost in doing that. Please instead post the text output of the dataex command using CODE delimiters. See the forum FAQ for more.
            Last edited by Hemanshu Kumar; 26 Apr 2025, 10:24.



            • #7
              Please note that the dataex output should be copied and pasted exactly as is, without any modifications. Also, once you’ve posted the dataex output, could you specify which generate command you ran from below and what error message you received (if any)?


              Originally posted by Kendall McEachron View Post
              Any time I try to use the generate command, the program doesn't recognize the word "Expired," even though "Expired" seems to be a valid group within this variable based on tabulate.
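
              While waiting for the dataex output, one way to narrow this down might look like the sketch below. It assumes (this is not confirmed anywhere in the thread) that the problem is either a missing pair of double quotes around the string literal, which would make Stata look for a variable named Expired, or stray whitespace in the stored values. The variable name expired is hypothetical.

              Code:
              * Sketch only: the causes tested here are assumptions, not a diagnosis
              * String literals need double quotes; unquoted, Expired is read as a variable name
              count if hospdisp == "Expired"
              * Same comparison after removing leading and trailing blanks
              count if strtrim(hospdisp) == "Expired"
              * Hypothetical indicator variable, left missing where hospdisp is empty
              generate byte expired = strtrim(hospdisp) == "Expired" if hospdisp != ""

              If the two counts differ, leading or trailing blanks are a likely explanation; if both are zero, the stored text probably differs from "Expired" in some other way, such as capitalization.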
