
  • Help with Data Cleaning

    Hi everyone,

    I require urgent assistance!
    I have 3 variables on means of transport, each coded 1 if the person has that means of transport and 0 otherwise:
    X1: Bicycle
    X2: Auto
    X3: Motorcycle

    I need to generate a binary Means of Transport variable that equals 1 whenever the individual has at least one of them, even if they have a bicycle, car and motorbike at the same time. For example:
    X1: 1
    X2: 1
    X3: 0
    Means of Transport X4: 1, because the individual already fulfils the condition of having at least one means of transport.

    Another thing: I'm cleaning my data and have a problem. Many of my variables are binary or numeric, but some had "YES" mixed in with 1 and "no" mixed in with 0. I have already corrected that.
    But where I have Sales values, some entries are "???" or "Not yet".
    How do you identify this kind of thing in Stata? Or is it better to correct it in Excel?

    Translated with www.DeepL.com/Translator (free version)

  • #2
    gen X4=1 if X1==1 | X2==1 | X3==1
    Try the code and see if it's what you want.



    • #3
      Zhe Zhang's helpful answer gets you started, but will produce an indicator variable that is 1 or missing.

      Indicators that are 1 or 0 are immensely more useful.

      Code:
      gen wanted = max(X1, X2, X3)
      This will produce 1 if any input is 1; otherwise 0 if any input is 0; and missing if all inputs are missing.
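A minimal sketch of how max() behaves here, using a made-up four-row dataset (the values are illustrative assumptions, not the poster's data). It relies on max() ignoring missing arguments and returning missing only when every argument is missing.

```stata
* Toy data: made-up values for illustration only
clear
input X1 X2 X3
1 1 0
0 0 0
0 . 1
. . .
end

* max() ignores missing arguments unless all of them are missing
gen wanted = max(X1, X2, X3)

* Rows 1 and 3 get 1, row 2 gets 0, row 4 stays missing
list
```

Note that this works as an "any of these" indicator only because the inputs are coded 0 and 1.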

      You don't need it for your immediate question, but https://journals.sagepub.com/doi/pdf...36867X19830921 is a review of indicator variables in Stata.

      It's easy enough to look for inconsistent string values in Stata. Many active users here don't use Excel at all, or at all willingly, so there are differing views on its merits.
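One possible way to look for such values, sketched on a made-up string variable. The names sales and sales_num and the four example values are assumptions for illustration; the real variable may be named and stored differently.

```stata
* Toy data: a sales variable read in as string because of stray text
clear
input str10 sales
"1200"
"???"
"Not yet"
"850.5"
end

* Tabulate every distinct value so stray text stands out
tabulate sales, missing

* real() returns missing when a string cannot be read as a number
list sales if missing(real(sales)) & !missing(sales)

* Convert to numeric; force sends non-numeric text to missing.
* The original string variable is kept for checking.
destring sales, generate(sales_num) force
```

Because force discards information silently, it seems safer to list the offending values first and decide what each should become before converting.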



      • #4
        Originally posted by Nick Cox (#3)
        Thank you, Nick. I forgot the "replace" step. But your approach is much more desirable!
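For reference, a sketch of what the two-step version with the "replace" step might look like, on made-up data (the values and the name X4_alt are illustrative assumptions). Unlike max(), this version yields 0 rather than missing when all three inputs are missing.

```stata
* Toy data: made-up values for illustration only
clear
input X1 X2 X3
1 1 0
0 0 0
0 . 1
. . .
end

* Step 1: 1 where any input is 1, missing elsewhere
gen X4 = 1 if X1 == 1 | X2 == 1 | X3 == 1

* Step 2: the "replace" step turns the remaining missings into 0
replace X4 = 0 if missing(X4)

* One-line equivalent: a true-or-false expression evaluates to 1 or 0
gen X4_alt = (X1 == 1 | X2 == 1 | X3 == 1)

list
```

Whether an observation with no information at all should be 0 or missing depends on the analysis, so it is worth checking the last row of the listing against what you intend.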

