
  • Adding variables

    Dear Sir,
    I am a new user of Stata 15. I have imported an Excel datasheet into Stata and was trying to add some variables. The data set looks like:

    X rice production (metric ton)    Y rice production (metric ton)
    3456                              45
    2456                              777

    I was trying to add the two variables into a single column titled total rice production. But when I tried to generate a new variable (gen total = Xriceproduction + Yriceproduction), it did not give me the expected result. It appended the digits of column Y to the end of the digits in column X, so the result was 345645 for the first row and 2456777 for the second row. But I wanted to sum them. How can I do that?
    TIA

  • #2
    So, you didn't actually show an example of your data in Stata. This is an example where the details of the file structure are crucial, and simply drawing a table like this isn't really very helpful.

    But from what you describe the problem is almost certainly that these variables are stored as strings, not as numbers. You can verify that by running -describe Xriceproduction Yriceproduction- and you will probably see that their storage types begin with str.
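
    A minimal sketch that reproduces the symptom, assuming the variables did come in as strings. The two rows are taken from the table in the question; the storage types str4 and str3 are assumptions chosen for illustration.

    ```stata
    * Toy data: both variables are deliberately entered as strings (assumption)
    clear
    input str4 Xriceproduction str3 Yriceproduction
    "3456" "45"
    "2456" "777"
    end

    * Storage types beginning with str indicate string variables
    describe Xriceproduction Yriceproduction

    * With string operands, + concatenates instead of adding
    gen total = Xriceproduction + Yriceproduction
    list
    ```

    If -describe- reports numeric storage types (byte, int, long, float, double) instead, the cause is something else and the rest of this answer does not apply.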

    So you need to destring them.

    Code:
    * Convert each string variable to a new numeric variable
    destring Xriceproduction, gen(xrice)
    destring Yriceproduction, gen(yrice)
    * With numeric operands, + adds
    gen total = xrice + yrice
    But there may be a hitch. The question is why are these variables stored as strings. We don't know how this data set was originally created. But if it was imported into Stata using, say, -import delimited- or -import excel-, then these variables would have come in as numeric variables to begin with were it not for some "contamination" of their contents with non-numeric material.

    Anyway, if the -destring- commands tell you that they did not do the job, then you need to identify the non-numeric material that is contaminating these variables. You can do that with
    Code:
    browse if missing(real(Xriceproduction), real(Yriceproduction))
    Then you will see what the problem is, and you have to figure out how to deal with it. Sometimes it's easy things like commas that can be bypassed using -destring-'s -ignore()- option. Sometimes it's more complicated and requires some intensive data cleaning before you can proceed.
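
    As a sketch of the easy case only: assuming the -browse- shows that the only non-numeric characters are commas used as thousands separators (an assumption; your data may contain something else), the conversion might look like this.

    ```stata
    * Assumption: the contamination is limited to commas, e.g. "3,456"
    * ignore() strips the listed characters before converting
    destring Xriceproduction, gen(xrice) ignore(",")
    destring Yriceproduction, gen(yrice) ignore(",")
    gen total = xrice + yrice
    ```

    Only list characters in -ignore()- that you are sure are safe to drop. Stripping characters blindly can turn a bad value into a number that looks valid but is wrong.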

    In the future, when showing data examples, please use the -dataex- command to do so. If you are running version 15.1 or a fully updated version 14.2, it is already part of your official Stata installation. If not, run -ssc install dataex- to get it. Either way, run -help dataex- to read the simple instructions for using it. -dataex- will save you time; it is easier and quicker than typing out tables. It includes complete information about aspects of the data that are often critical to answering your question but cannot be seen from tabular displays or screenshots. It also makes it possible for those who want to help you to create a faithful representation of your example to try out their code, which in turn makes it more likely that their answer will actually work in your data.
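
    For this thread, a call along these lines would probably have been enough (the variable names are the ones used above):

    ```stata
    * Produces a copy-and-paste data example for the two variables;
    * an -in- range can be added to limit the number of observations shown
    dataex Xriceproduction Yriceproduction
    ```

    The output is then pasted into the post as is.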

    When asking for help with code, always show example data. When showing example data, always use -dataex-.



    • #3
      Thank you very much sir for your kind response.
