Announcement

Collapse
No announcement yet.
X
  • Filter
  • Time
  • Show
Clear All
new posts

  • ODBC load error when declaring a temporary function

    (Cross-posting from Stack Exchange)

    I'm trying to load data into Stata from a SQL database hosted in Databricks, via the odbc load command. For part of the query I need to declare and run a small Python function (Databricks SQL, or at least our version, allows for User-Defined Functions written in SQL or Python, but not Stata, R etc.). This seems to yield an error independent of the content of the function.

    Consider this MWE:

    Code:
            odbc load, dsn("DBname") exec(`"CREATE TEMPORARY FUNCTION findrate(name STRING) RETURNS STRING LANGUAGE PYTHON as $$ return "Hello " + name $$; select findrate("World") as num;"') user("token") password("`dbpat'") clear
    Yields the error

    Code:
    Error running query: [PARSE_SYNTAX_ERROR] org.apache.spark.sql.catalyst.parser.ParseException: 
    [PARSE_SYNTAX_ERROR] Syntax error at or near 'select': extra input
    'select'. SQLSTATE: 42601 (line 1, pos 112) =
    SQLSTATE=42601
    While just taking the bit in quotes from the exec() option runs successfully on Databricks.

    Any way around this? Users on my system aren't allowed to add permanent functions (though I could see if my IT group would add it for me), so that isn't an option.

  • #2
    As a follow-up to this, it appears that ODBC (at least on databricks) does not allow multiple SQL statements separated by semi-colons. On my system I had the same problem using R. However, an issue with Stata ODBC is that each odbc load command starts a new session, so one couldn't do on odbc load command to declare the function, then another odbc load command to run the main query. The R odbc driver (at least with package odbc) does allow this, so it's not technically impossible.

    Is there a way for Stata's ODBC commands to run multiple SQL statements in the same session?

    Comment


    • #3
      Originally posted by Ryan Sandler View Post
      . . . an issue with Stata ODBC is that each odbc load command starts a new session . . . Is there a way for Stata's ODBC commands to run multiple SQL statements in the same session?
      Have you considered using odbc sqlfile()?

      According to odbc's help file, "The advantage in using this command, as opposed to odbc exec, is that only one connection is established for multiple SQL statements."

      Comment


      • #4
        Originally posted by Joseph Coveney View Post
        Have you considered using odbc sqlfile()?

        According to odbc's help file, "The advantage in using this command, as opposed to odbc exec, is that only one connection is established for multiple SQL statements."
        My understanding (also from the help file) is that odbc sqlfile and odbc exec as well won't actually load the output into Stata's memory at the end. One needs odbc load, which seems to not like multiple statements. Given that, I'm not entirely sure what the use case of either -sqlfile- or -exec- is.

        Comment


        • #5
          Yeah, you cannot directly load a dataset via either of these two approaches—I've used them for only DDL—and if you're not granted permissions by the database administrator to add user-defined functions or stored procedures, then you probably need to continue using Python as an interface between Stata and the database.

          At one time I became aware that the odbc commands returned an object Stata's old object-oriented programming feature, but I never pursued it and am not sure that it can be leveraged to do what you want.

          Comment

          Working...
          X