
  • Chen Samulsion
    started a topic convert concatenate strings into numeric

    convert concatenate strings into numeric

    Dear Stata users,

    I have data like the example below, in which the researchers entered values as letters. I want to convert those strings to numbers, so that "A" becomes "1", "B" becomes "2", and "C" becomes "3". This is easy when a string holds a single letter, but how can I handle values that were concatenated, such as "A,B,C"? Thank you in advance for any advice.
    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str20 x1 str22 x2 str18 x3
    "A"     "A"       "C"      
    "A,B"   "A,C"     "B,D"    
    "B,C"   "C,G"     "A"      
    "A,B,C" "C"       "B,C,D,E"
    "B"     "B"       "B"      
    "C"     "C"       "A"      
    "A,B"   "A,C"     "B,D"    
    "B"     "E"       "E"      
    "A"     "B,C,D,F" "A"      
    "A,B"   "B"       "A,B,C,E"
    "B"     "A,F,G"   "B,C,E"  
    end
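
    For the single-letter case described above, a minimal sketch (not taken from any reply in this thread) is to look up each letter's position in the alphabet with strpos(). The variable name n1 is hypothetical, upper-case letters are assumed, and values holding more than one letter are simply left missing.

    ```stata
    * toy data: a few values like those in x1 above
    clear
    input str20 x1
    "A"
    "A,B"
    "B"
    "C"
    end

    * position of the letter in the alphabet: A = 1, B = 2, C = 3, ...
    * only applied when the value is exactly one character long
    gen byte n1 = strpos("ABCDEFGHIJKLMNOPQRSTUVWXYZ", x1) if strlen(x1) == 1

    list, sep(0)
    ```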

  • Nick Cox
    replied
    #6 and #7 have no bearing on the thread title. Please start a new thread with a good title.


  • Olayiwola Adetutu
    replied
    The subset of the data is:
    Code:
     
    Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
    0 0 1 1 1 1 1 1 1 1
    1 1 1 1 1 1 1 0 1 1
    0 0 1 1 1 1 0 1 1 1
    1 1 1 0 1 0 0 0 1 1
    1 0 1 1 1 1 0 0 0 0
    0 1 1 1 1 1 1 0 0 0
    1 1 1 1 1 0 1 0 0 1
    1 0 1 0 1 1 0 0 0 1
    1 1 1 0 1 0 1 0 1 0
    0 1 1 1 1 1 0 1 0 1
    1 0 1 1 1 0 0 1 0 0
    0 0 1 0 1 1 1 0 1 0
    1 0 1 0 1 1 0 0 0 0
    0 0 1 1 1 1 1 1 1 1
    0 1 1 1 1 0 0 0 1 1
    0 1 1 1 1 1 0 1 1 1
    0 1 1 1 1 1 0 0 1 1
    1 0 1 1 1 0 0 0 0 0
    0 0 1 1 1 1 0 0 1 0


  • Olayiwola Adetutu
    replied
    I am using Stata 16 SE. I want to fit one- and two-parameter logistic models using bayesmh to data like the sample below:
    Code:
    Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10
    0 0 1 1 1 1 1 1 1 1
    1 1 1 1 1 1 1 0 1 1
    0 0 1 1 1 1 0 1 1 1
    1 1 1 0 1 0 0 0 1 1
    1 0 1 1 1 1 0 0 0 0
    0 1 1 1 1 1 1 0 0 0
    1 1 1 1 1 0 1 0 0 1
    1 0 1 0 1 1 0 0 0 1
    1 1 1 0 1 0 1 0 1 0
    0 1 1 1 1 1 0 1 0 1
    1 0 1 1 1 0 0 1 0 0
    0 0 1 0 1 1 1 0 1 0
    1 0 1 0 1 1 0 0 0 0
    0 0 1 1 1 1 1 1 1 1
    0 1 1 1 1 0 0 0 1 1
    0 1 1 1 1 1 0 1 1 1
    0 1 1 1 1 1 0 0 1 1
    1 0 1 1 1 0 0 0 0 0
    0 0 1 1 1 1 0 0 1 0
    The original data consist of 35 questions answered by 403 examinees. I used this code:
    Code:
     
    . set maxvar 30000
    
    . set emptycells drop
    
    . import excel "C:\Users\MATTHEW ADETUTU\Documents\Result_Coding.xlsx", sheet("Sheet 1") firstrow
    (35 vars, 403 obs)
    
    . generate id = _n
    
    . 
    . quietly reshape long Q, i(id) j(item)
    
    . 
    . rename Q y
    
    . 
    . fvset base none id item
    
    . 
    . set seed 10
    program my1plllogit
        args lnf xb
        tempvar infj
        * log-likelihood contribution of each observation
        quietly generate double `infj' = ln(invlogit(`xb')) if $MH_y == 1 & $MH_touse
        quietly replace `infj' = ln(invlogit(-`xb')) if $MH_y == 0 & $MH_touse
        quietly summarize `infj', meanonly
        * return missing if any observation lacks a contribution
        if r(N) < $MH_n {
            scalar `lnf' = .
            exit
        }
        scalar `lnf' = r(sum)
    end
    
    bayesmh y i.item, noconstant reffects(id) llevaluator(my1plllogit)
                prior({y:i.id},normal(0,{var}))
                prior({y:i.item}, {y:1bn.item}, normal(0,10))
                prior({var}, igamma(0.01,0.01))
                block({var})block({y:i.item}, reffects)
                exclude({y:i.id})  dots
    The code did not work. The errors I encountered include:
    .
    .
    . bayesmh y i.item, noconstant reffects(id) llevaluator(my1plllogit)
    note:random effects ibn.id are shared between dependent variables
    invalid parameter name ibn.id
    r(198);

    .
    . prior({y:i.id},normal(0,{var}))
    command prior is unrecognized
    r(199);

    .
    . prior({y:i.item}, {y:1bn.item}, normal(0,10))
    command prior is unrecognized
    r(199);

    .
    . prior({var}, igamma(0.01,0.01))
    command prior is unrecognized
    r(199);

    .
    . block({var})block({y:i.item}, reffects)
    command block is unrecognized
    r(199);

    .
    . exclude({y:i.id}) dots
    command exclude is unrecognized
    r(199);

    Please, I need help. Thanks.
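
    A note on the r(199) messages above: they are what Stata reports when it reads prior(), block(), and exclude() as commands in their own right, which happens when a command's options sit on separate lines of a do-file with nothing joining them. A minimal sketch of joining lines with ///, using the auto dataset shipped with Stata as a stand-in rather than the data above. It illustrates line continuation only and does not address the r(198) error.

    ```stata
    sysuse auto, clear

    * /// tells Stata that the command continues on the next line
    regress price mpg weight, ///
        vce(robust)           ///
        level(90)
    ```

    The /// continuation works in do-files and ado-files, not when lines are typed one at a time in the Command window.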


  • Chen Samulsion
    replied
    Dear Jorrit Gosens and Nick Cox, thank you very much. Nick, I'm sorry for not making my query clear: I just want to replace strings with numbers, variable by variable. Your answer in #3 is enough to solve my problem. I'm also glad to see the further step using the concat function that you provided; I always learn a lot from you.
    Code:
    tokenize `c(ALPHA)' 
    forval x = 1/26 { 
        foreach v in x1 x2 x3 { 
            replace `v' = subinstr(`v', "``x''", "`x'", .) 
        }
    }


  • Nick Cox
    replied
    Here is a cleaned-up concatenation anyway (no duplicates, tidy order):

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str20 x1 str22 x2 str18 x3
    "A"     "A"       "C"      
    "A,B"   "A,C"     "B,D"    
    "B,C"   "C,G"     "A"      
    "A,B,C" "C"       "B,C,D,E"
    "B"     "B"       "B"      
    "C"     "C"       "A"      
    "A,B"   "A,C"     "B,D"    
    "B"     "E"       "E"      
    "A"     "B,C,D,F" "A"      
    "A,B"   "B"       "A,B,C,E"
    "B"     "A,F,G"   "B,C,E"  
    end
    
    tokenize `c(ALPHA)' 
    
    gen wanted = "" 
    
    quietly forval x = 1/26 { 
        replace wanted = cond(wanted == "", "`x'", wanted + ",`x'") if strpos(x1, "``x''") | strpos(x2, "``x''") | strpos(x3, "``x''") 
    }
    
    list , sep(0) 
        
         +-----------------------------------------+
         |    x1        x2        x3        wanted |
         |-----------------------------------------|
      1. |     A         A         C           1,3 |
      2. |   A,B       A,C       B,D       1,2,3,4 |
      3. |   B,C       C,G         A       1,2,3,7 |
      4. | A,B,C         C   B,C,D,E     1,2,3,4,5 |
      5. |     B         B         B             2 |
      6. |     C         C         A           1,3 |
      7. |   A,B       A,C       B,D       1,2,3,4 |
      8. |     B         E         E           2,5 |
      9. |     A   B,C,D,F         A     1,2,3,4,6 |
     10. |   A,B         B   A,B,C,E       1,2,3,5 |
     11. |     B     A,F,G     B,C,E   1,2,3,5,6,7 |
         +-----------------------------------------+


  • Nick Cox
    replied
    This may help. I wonder what you want to do about duplicates, but you say nothing about that, so no suggestions here.

    Code:
    * Example generated by -dataex-. To install: ssc install dataex
    clear
    input str20 x1 str22 x2 str18 x3
    "A"     "A"       "C"      
    "A,B"   "A,C"     "B,D"    
    "B,C"   "C,G"     "A"      
    "A,B,C" "C"       "B,C,D,E"
    "B"     "B"       "B"      
    "C"     "C"       "A"      
    "A,B"   "A,C"     "B,D"    
    "B"     "E"       "E"      
    "A"     "B,C,D,F" "A"      
    "A,B"   "B"       "A,B,C,E"
    "B"     "A,F,G"   "B,C,E"  
    end
    
    tokenize `c(ALPHA)' 
    forval x = 1/26 { 
        foreach v in x1 x2 x3 { 
            replace `v' = subinstr(`v', "``x''", "`x'", .) 
        }
    } 
    
    egen X = concat(x?) , p(,) 
    
    list 
    
         +---------------------------------------------+
         |    x1        x2        x3                 X |
         |---------------------------------------------|
      1. |     1         1         3             1,1,3 |
      2. |   1,2       1,3       2,4       1,2,1,3,2,4 |
      3. |   2,3       3,7         1         2,3,3,7,1 |
      4. | 1,2,3         3   2,3,4,5   1,2,3,3,2,3,4,5 |
      5. |     2         2         2             2,2,2 |
      6. |     3         3         1             3,3,1 |
      7. |   1,2       1,3       2,4       1,2,1,3,2,4 |
      8. |     2         5         5             2,5,5 |
      9. |     1   2,3,4,6         1       1,2,3,4,6,1 |
     10. |   1,2         2   1,2,3,5     1,2,2,1,2,3,5 |
     11. |     2     1,6,7     2,3,5     2,1,6,7,2,3,5 |
         +---------------------------------------------+
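
    The nested quotes in the loop above can look cryptic. As a small sketch of what they do: tokenize stores the letters in the local macros 1, 2, ..., 26, and ``x'' first expands `x' to a number and then expands the macro that has that number as its name.

    ```stata
    * c(ALPHA) holds the upper-case letters separated by spaces
    tokenize `c(ALPHA)'

    * after tokenize, `1' is A, `2' is B, ..., `26' is Z
    display "`1' `2' `26'"

    * with x set to 3, ``x'' expands to `3', which is C
    local x = 3
    display "``x''"
    ```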


  • Jorrit Gosens
    replied
    What would the desired end result look like?
    e.g., "A,B" should be turned into "1,2"? Or 2 separate numeric variables?
    Can you give explicit examples of what you'd want to have in the end for a few observations?
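
    If the second option, separate numeric variables, is what is wanted, one possible sketch is a 0/1 indicator per letter. The indicator names x1_A, x1_B, and x1_C are hypothetical, and the approach assumes every code is a single upper-case letter.

    ```stata
    * toy data: a few values like those in x1
    clear
    input str20 x1
    "A"
    "A,B"
    "B,C"
    "A,B,C"
    end

    * one byte indicator per letter: 1 if the letter occurs in x1, 0 otherwise
    foreach L in A B C {
        gen byte x1_`L' = strpos(x1, "`L'") > 0
    }

    list, sep(0)
    ```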
