  • Amalia Arcamone
    started a topic encode strings

    encode strings

    Hi, I'm having trouble encoding a string variable (months) as a numeric variable (I need jan to be 1, and so on).

    I used this code:

    enc month, gen(month_num)
    label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
    label values month_num label_mon
    tab month_num

    But it gives me seemingly random numbers, even though I ask it to define jan as 1.

    Could you help me?

  • Nick Cox
    replied
    Indeed. Stata is literal. With those definitions, it won't map jan or JAN or January or JANUARY or anything other than Jan to 1. So, you need to clean up before this approach can possibly work.

    Showing the results of

    Code:
    tab month
    would show you and us the extent of the mess, but it should be that

    Code:
    gen better = proper(month) 
    tab better
    improves all the problems you mention and perhaps some others.
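
    A minimal sketch of that cleanup followed by the encoding, assuming the variable is named month as in the question. The names month_clean and label_mon are illustrative, and cutting to the first three characters is an added assumption so that full names such as "january" also reduce to "Jan". strtrim() assumes Stata 14 or later. With noextend, encode refuses to encode if any value is still absent from the label, rather than silently adding new codes above 12.

    Code:
    * trim stray blanks, standardize case, keep the three-letter abbreviation
    generate month_clean = substr(proper(strtrim(month)), 1, 3)
    tab month_clean, missing

    * define the label first, then tell encode to use it
    * (drop month_num and label_mon first if they exist from an earlier attempt)
    label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
    encode month_clean, generate(month_num) label(label_mon) noextend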



  • Amalia Arcamone
    replied
    In reply to daniel klein:
    With this code, too, I get seemingly random values for each month. For example, aug is 24 instead of 8, while jul is something like 13, and I can't find a linear transformation that fixes it.



  • Amalia Arcamone
    replied
    In reply to William Lisowski:
    OK, this works! Thank you.



  • William Lisowski
    replied
    I prefer Daniel's approach to the following because the resulting numeric variable is created with a value label, but someone might find this approach useful in limited circumstances. It takes advantage of the fact that Stata monthly dates begin at 0 in January 1960. It also is robust to capitalization or its lack. I guess it's also robust to spelling out months other than May in full ... .
    Code:
    . generate month_num = monthly(var1+"1960","MY")+1
    
    . list, clean noobs abbreviate(12)
    
        var1   month_num  
         Jan           1  
         FEB           2  
         mar           3  
         Apr           4  
         May           5  
         Jun           6  
         Jul           7  
         Aug           8  
         Sep           9  
         Oct          10  
         Nov          11  
         Dec          12
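
    If a value label is wanted with this approach as well, it can be attached afterwards. A sketch, assuming the same var1 as in the listing above; label_mon is an illustrative name that must not already be defined. The count is only a check that every nonempty string was parsed into a month.

    Code:
    generate month_num = monthly(var1 + "1960", "MY") + 1

    * should report 0: nonempty strings that monthly() could not parse
    count if missing(month_num) & !missing(var1)

    label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
    label values month_num label_mon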



  • daniel klein
    replied
    More specifically, type

    Code:
    label define label_mon 1 "Jan" 2 "Feb" 3 "Mar" 4 "Apr" 5 "May" 6 "Jun" 7 "Jul" 8 "Aug" 9 "Sep" 10 "Oct" 11 "Nov" 12 "Dec"
    encode month , generate(month_num) label(label_mon)
    Not crucial to the question but perhaps useful in general, c(Mons) contains the list of months.
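
    As a sketch of how c(Mons) could be put to use here, the value label can be built in a loop instead of being typed out. The local macro names i and m are illustrative, and this assumes label_mon is not already defined.

    Code:
    * c(Mons) holds Jan Feb ... Dec, so the loop counter is the month number
    local i = 0
    foreach m in `c(Mons)' {
        local ++i
        label define label_mon `i' "`m'", add
    }
    label list label_mon
    encode month, generate(month_num) label(label_mon)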



  • Rich Goldstein
    replied
    If you ran the lines in the order shown, -encode- will already have numbered your values in alphabetical order before you defined the labels. Put the label define command before the encode command. If that doesn't solve the problem, show a data example using -dataex- (see the FAQ).
