  • Import excel

    Hi,
    I imported the following table from Excel:
    bc7 wc7 bc8 wc8 bc9 wc9 bc10 wc10 bc11 wc11 bc12 wc12
    b a a b a b a b b a b a
    b a a b a b b a b a a b
    b a a b a b b a b a b a
    b a a b a b a b b a b a
    b a a b a b b a c b b a
    I then ran the following:
    foreach v of var bc7-wc12{
    encode `v', gen(`v'_new)
    }

    drop bc7-wc12
    rename *_new *

    This encodes all the values, but inconsistently across variables. For example, it codes b = 1 for bc7 but b = 2 for wc8.

    Could anyone suggest how to fix this?

  • #2
    First, define your desired value label, and then pass it to -encode- through its label() option; see
    Code:
    h label
    h encode
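
    The advice above might be sketched as follows. This is only a minimal illustration, assuming the valid values are a, b, and c (c appears once in the posted data, in bc11; if that is a data-entry error, correct it before encoding). The label name abc is a hypothetical choice, not something from the original post.

    ```stata
    * Define one value label up front; the name "abc" is an arbitrary, hypothetical choice
    label define abc 1 "a" 2 "b" 3 "c"

    * Encode every variable against that same label so the codes match across variables.
    * noextend makes -encode- stop with an error if it meets a value not defined in abc,
    * rather than silently adding a new code to the label.
    foreach v of varlist bc7-wc12 {
        encode `v', generate(`v'_new) label(abc) noextend
    }

    * Replace the original string variables with the encoded ones
    drop bc7-wc12
    rename *_new *
    ```

    Under these assumptions, a = 1, b = 2, and c = 3 in every variable, and -label list abc- displays the mapping.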


    • #3
      Rich Goldstein gives excellent advice. I'd just like to make a more general point about Stata and computer programming in general. Computers do not think like human brains. (Most of them, at least, don't think at all.) A human brain sees a bunch of variables named bc7-bc12 and wc7-wc12, all of which take on the values a or b, and immediately thinks that it sees a pattern. This then creates an expectation that a and b should be assigned consistent numerical codes across all the variables. But a computer doesn't "see" it that way at all. To the computer, all variables are unrelated to each other, and there is no reason to think that the values of one variable have anything to do with the values of another. As far as it is concerned, there are no patterns in the data.

      So it follows that, when you program a computer, in Stata or any other language, the computer, left to its own devices, is going to treat each variable as a separate entity, independent of any other. And from there it follows that if consistency or patterning is expected, the programmer will have to build it into the code explicitly. To become an efficient programmer, it is important to always bear in mind that computers know nothing and understand nothing. If they are to handle their data in a knowledgeable manner, you have to build that knowledge directly into your code; you should not expect the computer to do anything "smart" on its own.

      You might think that StataCorp should have built this kind of knowledge/understanding into Stata themselves, so that similarly named variables with similar values would be treated the same. The problem is that there are many data sets where the variables have patterned names like V101a V102a V103a V101b V102b V103b, or something like that, even though those variables actually have nothing to do with each other. So that kind of "smart" behavior would just get in the way.
