Databricks U2M OAuth CSRF Error when concurrent_tasks > 1 #5646

@gabepesco

Description

Hey all, I've been trying to use SQLMesh with Databricks through U2M OAuth and I seem to have hit a bug: if I set concurrent_tasks in the DatabricksConnectionConfig() parameters to anything greater than 1, I get a CSRF error during authentication.

Example:

import os

from sqlmesh.core.config import (
    DatabricksConnectionConfig,
    GatewayConfig,
    PostgresConnectionConfig,
)

gateways = {
    "databricks": GatewayConfig(
        # Please select one of ['azure-oauth', 'databricks-oauth', 'pat']
        connection=DatabricksConnectionConfig(
            type="databricks",
            server_hostname=os.getenv("CLUSTER_SERVER_HOSTNAME"),
            http_path=os.getenv("CLUSTER_HTTP_PATH"),
            catalog="sep_processing",
            auth_type="databricks-oauth",
            concurrent_tasks=4,
        ),
        state_connection=PostgresConnectionConfig(
            type="postgres",
            host=os.getenv("LAKEBASE_HOST"),
            port=int(os.getenv("LAKEBASE_PORT", "5432")),
            database=os.getenv("LAKEBASE_DBNAME"),
            user=os.getenv("LAKEBASE_USER"),
            password=get_postgres_token(),  # helper defined elsewhere in my config
            sslmode="require",
        ),
    ),
}

Error:

Apply - Backfill Tables [y/n]: y
--- Logging error ---
Traceback (most recent call last):
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\authenticators.py", line 102, in _initial_get_token
    (access_token, refresh_token) = self.oauth_manager.get_tokens(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 250, in get_tokens
    raise e
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 244, in get_tokens
    auth_response = self.__get_authorization_code(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 150, in __get_authorization_code
    raise e
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 145, in __get_authorization_code
    authorization_code_response = client.parse_request_uri_response(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\oauthlib\oauth2\rfc6749\clients\web_application.py", line 220, in parse_request_uri_response
    response = parse_authorization_code_response(uri, state=state)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\oauthlib\oauth2\rfc6749\parameters.py", line 277, in parse_authorization_code_response
    raise MismatchingStateError()
oauthlib.oauth2.rfc6749.errors.MismatchingStateError: (mismatching_state) CSRF Warning! State not equal in request and response.
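
If it helps narrow this down, I believe the state mismatch happens because each connection kicks off its own browser-based U2M authorization flow at the same time. Something like the following untested sketch (databricks-sql-connector only, no SQLMesh; it reuses the env vars from the config above) might reproduce it directly against the driver:

import os
from concurrent.futures import ThreadPoolExecutor

from databricks import sql


def run_query(_):
    # Each connection opened with auth_type="databricks-oauth" starts its own
    # OAuth authorization-code flow; several of these racing each other looks
    # like what trips the CSRF/state check.
    with sql.connect(
        server_hostname=os.getenv("CLUSTER_SERVER_HOSTNAME"),
        http_path=os.getenv("CLUSTER_HTTP_PATH"),
        auth_type="databricks-oauth",
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute("SELECT 1")
            return cursor.fetchall()


# Mimic concurrent_tasks=4 by opening four connections in parallel.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_query, range(4)))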

Other info:

  1. Setting concurrent_tasks=1 works around the issue, but it kills the concurrency of the pipeline.
  2. Setting auth_type="pat" also fixes it, but not all Databricks workspaces allow users to generate PATs (my production workspaces don't). A possible token-based workaround is sketched right after this list.
  3. Setting auth_type="azure-oauth" does not affect the issue, and neither does logging in with the az CLI first.
  4. I am using a serverless SQL warehouse for the connection and a Databricks Lakebase Postgres instance for the state connection.
  5. This could be an issue with the databricks-sql-connector package SQLMesh uses (the traceback above is entirely inside it), but I do not know how to diagnose the bug to that degree.
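
One possible workaround (an untested sketch; it assumes the new Databricks CLI is installed and already logged in via databricks auth login for this workspace, and that DatabricksConnectionConfig's access_token accepts a user OAuth token the same way it accepts a PAT) would be to run the U2M flow once up front and hand the resulting token to every connection as a plain bearer token, so no per-connection browser flow happens:

import json
import os
import subprocess

from sqlmesh.core.config import DatabricksConnectionConfig


def get_databricks_oauth_token() -> str:
    # Run the U2M OAuth flow once via the Databricks CLI and reuse the token
    # for every connection instead of letting each one open a browser.
    result = subprocess.run(
        [
            "databricks",
            "auth",
            "token",
            "--host",
            f"https://{os.environ['CLUSTER_SERVER_HOSTNAME']}",
        ],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["access_token"]


connection = DatabricksConnectionConfig(
    type="databricks",
    server_hostname=os.getenv("CLUSTER_SERVER_HOSTNAME"),
    http_path=os.getenv("CLUSTER_HTTP_PATH"),
    catalog="sep_processing",
    access_token=get_databricks_oauth_token(),
    concurrent_tasks=4,
)

I believe tokens minted this way expire after about an hour, so this would only suit shorter plans/backfills rather than being a real fix.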
