
  • Identify similar string variables: strgroup command?

    Hello,

    I have a string variable whose values are sequences of digits. I would like to generate a new variable that identifies the strings that are identical or that differ by only one digit.
    All values of the string variable have the same length.

    (1) I have tried the strgroup command, but I get the following error message: "file strgroup.plugin not found (error occurred while loading strgroup.ado)"

    I read that in the past the program did not work on 64-bit processors, but this should no longer be the case. I am using Stata 12 MP.

    Could anybody help? Thanks a lot in advance,

    Silvia


    strgroup string, gen(match) threshold(.25)

    string
    00000000
    01011011
    00000000
    11111111

  • #2
    I don't understand what you want to obtain.
    You only show the input, what do you want as output?
    Suppose you have:
    A 0000
    B 0010
    C 0011

    Both A and C differ from B by one digit, so they should be in the same group with B, but A and C differ from each other by 2 digits, so they can't be together in one group. Sounds like some sort of tight-clustering problem.
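To make the A/B/C example concrete, here is a minimal sketch that lists the number of differing digits for every pair. The variable names (id, s, id2, s2, ndiff) and the tempfile name are only illustrative, and the string length of 4 is hard-coded for this toy data.

```stata
* Toy data from the example above
clear
input str1 id str4 s
"A" "0000"
"B" "0010"
"C" "0011"
end

* Save a renamed copy, then form every pair of observations
preserve
rename id id2
rename s s2
tempfile other
save `other'
restore
cross using `other'
keep if id < id2

* Count the positions at which the two strings differ
gen byte ndiff = 0
forvalues i = 1/4 {
    replace ndiff = ndiff + (substr(s, `i', 1) != substr(s2, `i', 1))
}
sort id id2
list id id2 ndiff, noobs
```

The A-B and B-C pairs come out at distance 1 and the A-C pair at distance 2, which is exactly why a single grouping variable is not well defined here.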

    -strgroup- is unknown to me. If it is a plugin, it is likely user-written, so the original author might provide more info.

    For a simpler task, "does X differ from Y by no more than 1 digit?" you can do:
    1) convert X from binary to decimal;
    2) convert Y from binary to decimal;
    3) subtract smaller from the bigger;
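    4) if the result is 0 (identical strings) or a power of 2 (1,2,4,8,16,...), then the answer to the question is yes, otherwise: no. Use the properties of logarithms to check for an exact power. Note that a borrow in the subtraction can make this test pass for strings that differ in more than one digit (for example, 0111 and 1000 differ by 1), so treat it as a quick screen.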
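The four steps above can be sketched in Stata roughly as follows. This is only a sketch under assumptions: the two strings are held in variables x and y, every string has the same length, and the names xd, yd, diff and close are just illustrative.

```stata
* Toy pairs of equal-length binary strings
clear
input str4 x str4 y
"0000" "0010"
"0010" "0000"
"0011" "0010"
"0000" "0011"
"1000" "1001"
"0011" "1001"
"1000" "1111"
end

* Steps 1-2: binary string -> decimal, one digit at a time
* (the length is taken from the first observation)
local n = length(x[1])
gen long xd = 0
gen long yd = 0
forvalues i = 1/`n' {
    replace xd = 2*xd + real(substr(x, `i', 1))
    replace yd = 2*yd + real(substr(y, `i', 1))
}

* Step 3: subtract the smaller from the bigger
gen long diff = abs(xd - yd)

* Step 4: flag 0 or an exact power of 2, checked with a base-2 log
gen byte close = diff == 0 | (diff > 0 & 2^round(ln(diff)/ln(2)) == diff)

list, noobs
```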

    This is how it works:
    Code:
           x      y   xd   yd   diff   close  
        0000   0010    0    2      2       1  
        0010   0000    2    0      2       1  
        0011   0010    3    2      1       1  
        0000   0011    0    3      3       0  
        1000   1001    8    9      1       1  
        0011   1001    3    9      6       0  
        1000   1111    8   15      7       0
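An alternative that may be safer is to count the mismatched positions directly, since a borrow in the subtraction can hide several changed digits (0111 and 1000 differ by 1 in decimal but in all four positions). A minimal sketch, again assuming equal-length strings in x and y, with ndiff and close as illustrative names:

```stata
* Toy pairs, including two where the decimal difference is 1
* but more than one digit differs
clear
input str4 x str4 y
"0000" "0010"
"0011" "1001"
"0111" "1000"
"0001" "0010"
end

* Compare the strings position by position and count mismatches
local n = length(x[1])
gen byte ndiff = 0
forvalues i = 1/`n' {
    replace ndiff = ndiff + (substr(x, `i', 1) != substr(y, `i', 1))
}

* "Close" means identical or differing in exactly one position
gen byte close = ndiff <= 1
list, noobs
```

This also works when the digits are not just 0 and 1, because nothing is converted to a number.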
    Best, Sergiy Radyakin



    • #3
      Thanks for the advice. You just made me realize a problem I hadn't considered. I'll contact the plugin author. Best, Silvia



      • #4
        I have a Mac and this problem also came up for me. For those who have this problem in the future, here is the response I received from the author. Now the command runs.
        "SSC sometimes does not download and/or rename the plugin file correctly when you install -strgroup-. Thus you will need to manually download the Mac version of the plugin ("strgroup.Macintosh.MacOSX.plugin"), rename it to "strgroup.plugin", and place it somewhere in your Stata adopath so that Stata can find it when you are running the command. (Type -which strgroup- or -adopath- to see where Stata is currently storing add-ons.)

        You can download the plugin from here:"
        https://ideas.repec.org/c/boc/bocode/s457151.html
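In Stata, the manual fix described in the quote might look roughly like this. It is a sketch under assumptions: the downloaded file is taken to sit in the current working directory, and strgroup.ado is taken to live in the "s" subfolder of the PLUS directory, so adjust both paths to whatever -which strgroup- reports on your machine.

```stata
* See where Stata looks for add-ons and where strgroup.ado currently lives
adopath
which strgroup

* ASSUMPTION: the plugin was downloaded to the current working directory
* and strgroup.ado is stored in the "s" subfolder of PLUS.
* Copy the Mac plugin next to the ado-file under the name Stata expects.
copy "strgroup.Macintosh.MacOSX.plugin" "`c(sysdir_plus)'s/strgroup.plugin", replace
```

After that, rerunning the original -strgroup- command should pick up the plugin, provided the copy landed in a directory listed by -adopath-.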

