Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • Odbc insert

    Statalisters,

    Could someone using odbc drivers tell me if this is the normal behavior (Stata15/Linux64):

    Code:
    . odbc insert, table("t") dsn("mydsn") sqlshow
    INSERT INTO t (dt, a, m) VALUES (?, ?, ?)
    [...]
    Why wouldn't "VALUES" be an enumerated list of values instead of "?"; or per say, what am I doing wrong?
    Note: odbc load/exec works fine.

    This is a snippet of my odbc logs:

    Code:
    [...]
                    Entry:
                            Statement = 0x6b44730
                            SQL = [INSERT INTO t (dt, a, m) VALUES (?, ?, ?)][length = 41 (SQL_NTS)]
    [...]
    Clearly "INSERT INTO t (dt, a, m) VALUES (?, ?, ?)" isn't a valid sql statement, afaik.
    Last edited by jerome falken; 21 Apr 2018, 17:17.

  • #2
    I've noticed that, too (Stata running under Windows). I was likewise a little curious when I first saw it. I don't recall if it has always been like that, but it has been for at least some time now. It seems to be limited for brevity in output, providing enough for diagnostic purposes only.

    Comment


    • #3
      I'm not certain it's purely cosmetic, because otherwise it shouldn't show in the odbc logs, no?
      Or do you think my odbc driver doesn't support block loading?
      Last edited by jerome falken; 21 Apr 2018, 19:19.

      Comment


      • #4
        I'm afraid that I don't follow anything you're saying, sorry.

        I wasn't trying to claim that it's cosmetic, purely or otherwise.

        I thought that you were referring to what shows in the Stata output window, and not whether it appears driver's log or trace files. I was addressing only the former.

        If your driver doesn't support block loading, then I would think that you'd receive an error message passed through to Stata's output if you tried to use the block option of the odbc insert command .Do you? If not, then as an alternative you could verify by using a timed insert, something like the following (again, I'm using Windows 10 Pro, and Microsoft SQL Server RDBMS, but I imagine that it would be directly analogous with whatever you're using on Linux).

        .ÿversionÿ15.1

        .ÿ
        .ÿclearÿ*

        .ÿ
        .ÿsetÿseedÿ`=strreverse("1440656")'

        .ÿ
        .ÿodbcÿexec("CREATEÿTABLEÿBlockTestÿ(my_pkÿCHAR(6)ÿNOTÿNULLÿPRIMARYÿKEY,ÿmy_stuffÿINTÿNOTÿNULL);"),ÿ///
        >ÿÿÿÿÿÿÿÿÿdsn(Sandbox11)


        .ÿ
        .ÿquietlyÿsetÿobsÿ500000

        .ÿgenerateÿstrÿmy_pkÿ=ÿstring(_n,ÿ"%05.0f")

        .ÿgenerateÿintÿmy_stuffÿ=ÿruniformint(-32767,ÿ32740)

        .ÿ
        .ÿtimerÿclear

        .ÿ
        .ÿtimerÿonÿ1

        .ÿodbcÿinsert,ÿtable(BlockTest)ÿdsn(Sandbox11)ÿinsertÿsqlshow
        INSERTÿINTOÿBlockTestÿ(my_pk,ÿmy_stuff)ÿVALUESÿ(?,ÿ?)

        .ÿtimerÿoffÿ1

        .ÿ
        .ÿodbcÿexec("DELETEÿFROMÿBlockTest;"),ÿdsn(Sandbox11)


        .ÿ
        .ÿtimerÿonÿ2

        .ÿodbcÿinsert,ÿtable(BlockTest)ÿdsn(Sandbox11)ÿinsertÿblockÿsqlshow
        INSERTÿINTOÿBlockTestÿ(my_pk,ÿmy_stuff)ÿVALUESÿ(?,ÿ?)

        .ÿtimerÿoffÿ2

        .ÿ
        .ÿtimerÿlist
        ÿÿÿ1:ÿÿÿÿÿ31.00ÿ/ÿÿÿÿÿÿÿÿ1ÿ=ÿÿÿÿÿÿ31.0020
        ÿÿÿ2:ÿÿÿÿÿÿ8.97ÿ/ÿÿÿÿÿÿÿÿ1ÿ=ÿÿÿÿÿÿÿ8.9690

        .ÿ
        .ÿodbcÿexec("DROPÿTABLEÿBlockTest;"),ÿdsn(Sandbox11)


        .ÿ
        .ÿexit

        endÿofÿdo-file


        .

        Comment


        • #5
          I guess that my curiosity wasn't sufficiently piqued before: Stata is understandably using parameterized SQL statements for INSERTs, at least for MS SQL Server, and I assume for whatever you're using on Linux, too. The question marks are real.

          Comment


          • #6
            Joseph,

            I understand now, is it possible not to use parameterized SQL statements for inserts, the odbc driver does not support this.
            I was hoping odbc insert if _n==`i' in a forvalues loop would allow default to non parameterized SQL statements as a workaround.


            Jerome

            Comment


            • #7
              I didn't appreciate that you believe that the command is failing. I would have expected an error message if your ODBC driver doesn't support what Stata is attempting. Are you sure that Stata's parameterizing the SQL statement is the problem? Can you confirm that there are no insertions?

              About the only alternative, if you want to to a -forvalues- loop is something like the following, but I would definitely look for other causes and solutions of the apparent problem before resorting to something like this:
              Code:
              forvalues row = 1/`=_N' {
                  foreach var of varlist dt a m {
                      local `var' = `var'[`row']
                  }
                  odbc exec("INSERT INTO t (dt, a, m) VALUES (``dt'', ``a'', ``m'');"), table(t) dsn(mydsn)
              }
              with the necessary modifications if the variables are strings.

              Comment


              • #8
                Joseph,

                I didn't appreciate that you believe that the command is failing. I would have expected an error message if your ODBC driver doesn't support what Stata is attempting. Are you sure that Stata's parameterizing the SQL statement is the problem? Can you confirm that there are no insertions?
                Apologies, the command isn't failing, the driver is very buggy/limited, Stata isn't at fault here.

                This is the error message it sends:

                Code:
                . odbc insert dt a m, table("t") dsn("mydsn") sqlshow
                INSERT INTO t (dt, a, m) VALUES (?, ?, ?)
                The ODBC driver reported the following diagnostics
                [db][ODBC]?
                SQLSTATE=HY000
                I confirm there are no insertions.

                Code:
                forvalues row = 1/`=_N' {
                     foreach var of varlist dt a m {
                        local `var' = `var'[`row']
                    }
                    odbc exec("INSERT INTO t (dt, a, m) VALUES (``dt'', ``a'', ``m'');"), table(t) dsn(mydsn)
                }
                You must mean:

                Code:
                forvalues row = 1/`=_N' {
                    foreach var of varlist dt a m {
                        local `var' = `var'[`row']
                    }
                    quiet odbc exec("INSERT INTO t (dt, a, m) VALUES (`dt', `a', `m')"), dsn(mydsn)
                }
                The problem with this approach is that it sends ~25000 observations then fails with this message:

                Code:
                The ODBC driver reported the following diagnostics
                [unixODBC][db][ODBC]10.64.28.13:10000
                SQLSTATE=HY000
                It does not fail if I insert a "sleep 5" after the "odbc exec"; although things get much slower, and there are no guarantees that it will not fail on tables with more observations or subsequent load on the database. Note that this does not happen during "odbc load", I've "loaded" tables with >100,000 observations into stata with this same driver.

                I believe I have two options, under the assumption that I want the lowest latency possible:

                * Check the return code of odbc exec and wait if not null? (you get a chance at improving my code)

                Code:
                forvalues row = 1/`=_N' {
                    foreach var of varlist dt a m {
                        local `var' = `var'[`row']
                    }
                    capture odbc exec("INSERT INTO t (dt, a, m) VALUES (`dt', `a', `m')"), dsn(mydsn)
                    while _rc > 0 {
                          capture odbc exec("INSERT INTO t (dt, a, m) VALUES (`dt', `a', `m')"), dsn(mydsn)
                          * could insert a "sleep 10" here to prevent stata from using cputime.
                    }
                }
                * Write my own C plugin.

                Ideas? Jerome
                Last edited by jerome falken; 22 Apr 2018, 09:40.

                Comment


                • #9
                  It might be that the RDBMS cannot keep up with the stream of INSERTs after about 25 000, because it is re-indexing the columns & checking constraints after each individual insert. If you can be sure that the inserted data won't violate constraints, then you should be able to temporarily turn off the constraints and automatic indexing until the Stata code has run, and then turn them back on.

                  If it is the driver and not the RDBMS at fault, there aren't other vendors of 64-bit Linux ODBC drivers? Developing and maintaining a C plugin sound like a more costly approach.

                  Those are the two alternatives that readily come to mind to the options you have already considered.

                  I imagine that you're not alone in this problem with your particular combination of ODBC driver and RDBMS brands, and I can't imagine that Stata is so unusually demanding as an application. Perhaps there are better solutions to the problem floating around the Internet.

                  Comment

                  Working...
                  X