  • How to generate equations

    Hi,
    I'm creating a new variable such that a = b/c if d = chocolates or sweets, else a = b*c.
    Code:
    gen a = 0
    replace a = b/c if d == "chocolates" | d == "sweets"
    Trying to do the first part gives me the error "type mismatch" r(109). Where am I going wrong?
    Last edited by Laksh Noori; 17 Apr 2019, 10:39. Reason: had wrongly mentioned dummy variable previously

  • #2
    Well, either b or c is string or d is numeric. You tell us. See FAQ Advice #12.

    That said, I am hard pushed to see how this would lead to an indicator: you say dummy, but see https://journals.sagepub.com/doi/abs...urnalCode=stja for reasons not to use that term.

    It is true that b/c == 0 if b == 0 and b/c == 1 if b == c, but otherwise how does an indicator come out of a division?


    • #3
      Thank you for taking time to reply!
      Sorry, I made mistakes in my question, and I apologise for that. 'a' is not a dummy variable; it is a numeric variable.
      a, b, and c are numeric and d is a string. Am I getting the syntax wrong in my code?


      • #4
        OK, but I can't see anything wrong with your syntax.

        But there is no meaning to saying that a is not a dummy variable, but numeric, as dummy variables are numeric too and would be useless if not. But we know what you intend to say.

        As in #2 I recommend that you follow FAQ Advice #12 and give us a data example. https://www.statalist.org/forums/help#stata is the link to follow but every time you post you are reminded to read it first.

        My guess is that d is not string at all, but a numeric variable with value labels. But note how confident answers come much more slowly if you don't give us the information we ask for.


        • #5
          Here is the information -
          Code:
          input str1 d int b int c
          
                       d    b   c
            1. Chocolate   20 10
            2. Sweet       30 20
            3. Chocolate  100 50
            4. Snickers    10  2
          Not sure if I got the dataex right. But this is the data I have.
          I need to create a column a such that a = b/c if d = chocolate or sweet, else a = b*c.

          Last edited by Laksh Noori; 17 Apr 2019, 11:36.


          • #6
            That's not quite dataex output. More importantly, sorry, but this still makes no sense to me. How can you fit the string "Chocolate" inside a str1? I note that the values here are not the same as stated in #1 although that in itself would not account for a type mismatch.

            If you are just making this up and your example isn't the real data, or like it, then that is no use to us or yourself. Wanting to simplify to get to the real problem is admirable, but not if it obscures the real problem.

            I may be missing something simple, and if so others should be able to help out.


            • #7
              It is pretty clear that you did not use dataex to generate this, nor did you test what you provided. Defining d as a string with length 1 is going to result in a string with just 1 character, which is clearly not what you have nor what you want.

              Can you post (in code delimiters) the results of the following? Substitute whatever the variables are called in your real data if this is all an abstract example.
              Code:
              describe d b c
              I'm with Nick in suspecting that d is, in fact, not a string variable but there's really only one way to know for sure and that's to check.
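
              A minimal sketch of what that check would reveal if d were numeric with value labels. The data are made up for illustration, and the label name dlbl and the variable d_str are invented here, not taken from the real dataset:

              ```stata
              clear
              input byte d int b int c
              1  20 10
              2  30 20
              3  10  2
              end
              label define dlbl 1 "Chocolate" 2 "Sweet" 3 "Snickers"
              label values d dlbl

              * describe reports a numeric storage type (byte) and the value label dlbl
              describe d b c

              * comparing d with a string would fail with type mismatch r(109),
              * so decode the labels into a genuine string variable first
              decode d, generate(d_str)
              generate double a = cond(inlist(d_str, "Chocolate", "Sweet"), b / c, b * c)
              list
              ```

              In that hypothetical case the Data Editor shows text for d, but the storage type in the describe output is numeric, which is why checking is the only way to know for sure.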


              • #8
                Sorry about the lack of clarity, I just started learning Stata.

                So this is what Stata gives when I use the describe command.

                Code:
                              storage   display    value
                variable name   type    format     label      variable label
                ---------------------------------------------------------------
                b               str6    %9s
                c               str5    %9s
                d               str25   %25s


                • #9
                  That's progress. The problem is that b and c are string, so you can't divide them. This was given as an explanation in #2. The advice given one way or another by Sarah and myself in #2, #4, #6 and #7 still applies: please give us a data example using dataex.

                  Evidently b and c look numeric to you (e.g. in the Data Editor) but to Stata they are string variables. This can happen in various ways. For example, metadata or other text from a spreadsheet may have been read into Stata, and even if that was in a single observation only, it could cause a problem.

                  But speculation has few benefits. If you give a data example we might guess smartly at what has happened.

                  Another way to get at the problem is to show the results of

                  Code:
                  list b c if missing(real(b), real(c))
                  https://journals.sagepub.com/doi/abs...867X1801800413 gives a survey of this territory and may be accessible to you. Not for me to say if it is worth the money.
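
                  A minimal sketch of how a single stray text value can produce this situation. The data are made up for illustration (the "n/a" entry is an assumption, not something seen in the real dataset):

                  ```stata
                  clear
                  input str6 b str5 c
                  "20"  "10"
                  "30"  "20"
                  "n/a" "50"
                  "10"  "2"
                  end

                  * real() returns missing for any string that cannot be read as a number,
                  * so this lists only the observations that stop b or c being numeric
                  list b c if missing(real(b), real(c))

                  * destring leaves b as a string while such a value is present,
                  * but converts c, which contains only numeric text
                  destring b c, replace
                  describe b c
                  ```

                  Note that real() also returns missing for empty strings, so the list command flags blank entries as well as stray text.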


                  • #10
                    Well, your problem is that b and c are string variables. You need to make them numeric before you can complete your calculations. Assuming that there are no non-numeric characters in either variable, this should be a very straightforward task for destring.

                    Code:
                    destring b, replace
                    destring c, replace
                    Of course if there are non-numeric characters that you have not told us about this will be more complicated.

                    Also note, when you do your calculations, that text matching is case sensitive. In your example you present values of d with initial upper case letters. If that is what is in your data you will need to modify your if qualifier to match. Similarly, if your real data has a mix of upper and lower case values things will be more complicated.
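
                    Putting the pieces together, a minimal sketch on made-up data shaped like the example in #5, assuming b and c contain only numeric text and that d holds singular values such as "Chocolate" and "Sweet":

                    ```stata
                    clear
                    input str25 d str6 b str5 c
                    "Chocolate" "20"  "10"
                    "Sweet"     "30"  "20"
                    "Chocolate" "100" "50"
                    "Snickers"  "10"  "2"
                    end

                    * convert the string variables to numeric
                    destring b c, replace

                    * lower() makes the comparison insensitive to case;
                    * cond() picks b/c for chocolate or sweet and b*c otherwise
                    generate double a = cond(inlist(lower(d), "chocolate", "sweet"), b / c, b * c)
                    list
                    ```

                    Using cond() creates a in one step rather than a generate followed by a replace. If the real values of d differ in spelling (for example "chocolates" as in #1), the strings inside inlist() need to be changed to match.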


                    • #11
                      Thank you so much, Nick and Sarah! It is now working. Your comments were very helpful!
