  • Formatting Strings: same words being recognized as "different"

    Hello,

    I imported a dataset from Excel, but string values that look identical are being treated as different. For example, in the data below there should be only one category for "In Force". All of my variables are currently strings, and every one of them has this problem.

    How do I get Stata to recognize that the text is the same? In the second screenshot the values even appear misaligned ("crooked") in browse, but, e.g.,
    Code:
     format Status %-14s
    didn't make a difference. I could also edit the Excel sheet, but I'd rather not do that without knowing exactly what the problem is.

    Code:
    * Example generated by -dataex-. For more info, type help dataex
    clear
    input str14 Model str16 EGA_In_Effect str29 Status
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "Model 1"        "​11/30/2014"   "​In Force "                
    "Model 1"        "​6/30/2014"    "In Force "                  
    "Model 1"        "​6/30/2014"    "​In Force "                
    "Model 2"        "​6/30/2014"    "In Force "                    
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 2"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1  " "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​​In Force "              
    "​Model 1 "   "​​6/30/2014" "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1 "   "​6/30/2014"    "​In Force "                
    "​Model 2"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​Signed "                  
    "​Model 1"     "​11/30/2014"   "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 1"     "​6/30/2014"    "​In Force "                
    "​Model 2"     "​6/30/2014"    "​Signed"                    
    "​Model 1"     "​6/30/2014"    "​Agreement in Substance"    
    "​Model 1"     "​6/30/2014"    "​In Force "      
            
    end
    [Screenshot: example.png]

    [Screenshot: example2.png]


    Edit 1: A potentially related post: https://www.statalist.org/forums/for...-string-values

    Edit 2: I tried
    Code:
    local varlist status status_date ega_in_effect model related_agreement1 related_agreement_2 correction_1 correction_2 superseding_model_1 understanding_1
    
    * Remove every ordinary space; subinstr() takes exactly four arguments
    foreach v of varlist `varlist' {
        replace `v' = subinstr(`v', " ", "", .)
    }
    and it didn't work either. I also tried importing the file as a CSV instead, and it made no difference. Even with the above code (which should have removed all spaces), the data looks like this:
    Attached Files
    Last edited by John Singer; 26 Oct 2021, 15:12.

  • #2
    Replace the replace command in your loop with
    Code:
    replace `v' = trim(`v')
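
    For context, here is a minimal sketch of the whole loop with that change. It assumes the variable names from the -dataex- example above (Model, EGA_In_Effect, Status) rather than the longer list in Edit 2. Note that trim() strips only leading and trailing blanks, so it would merge "Signed " with "Signed" but would not touch any other invisible characters.

    ```stata
    * Sketch only: variable names are taken from the -dataex- example above
    foreach v of varlist Model EGA_In_Effect Status {
        replace `v' = trim(`v')      // strip leading and trailing blanks
    }

    * Check whether the categories have collapsed as expected
    tabulate Status
    ```

    If the tabulation still shows several rows that look identical, the difference is probably not ordinary blanks.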

    • #3
      I think you have several non-ASCII characters. Use chartab from SSC for a tabulation.
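
      A rough sketch of how that might look, assuming Stata 14 or later (for the Unicode functions) and the variable names from the -dataex- example. The character being removed, U+200B (zero-width space, decimal 8203), is an assumption for illustration only; use whatever code point chartab actually reports.

      ```stata
      * Sketch only: install the community-contributed chartab and tabulate
      * the characters found in the string variables
      ssc install chartab
      chartab Model EGA_In_Effect Status

      * ASSUMPTION: chartab shows U+200B (zero-width space, decimal 8203).
      * Replace 8203 with the code point reported for your data.
      foreach v of varlist Model EGA_In_Effect Status {
          replace `v' = subinstr(`v', uchar(8203), "", .)
      }

      tabulate Status
      ```

      This removes only the one character named; trailing blanks still need trimming as suggested in #2.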
