Hey all, I've been trying to use SQLMesh with Databricks through OAuth and I seem to have hit a bug: if I set `concurrent_tasks` in the `DatabricksConnectionConfig()` parameters to anything greater than 1, I get a CSRF error.
Example:
gateways = {
    "databricks": GatewayConfig(
        # Please select one of ['azure-oauth', 'databricks-oauth', 'pat']
        connection=DatabricksConnectionConfig(
            type="databricks",
            server_hostname=os.getenv("CLUSTER_SERVER_HOSTNAME"),
            http_path=os.getenv("CLUSTER_HTTP_PATH"),
            catalog="sep_processing",
            auth_type="databricks-oauth",
            concurrent_tasks=4,
        ),
        state_connection=PostgresConnectionConfig(
            type="postgres",
            host=os.getenv("LAKEBASE_HOST"),
            port=int(os.getenv("LAKEBASE_PORT", "5432")),
            database=os.getenv("LAKEBASE_DBNAME"),
            user=os.getenv("LAKEBASE_USER"),
            password=get_postgres_token(),
            sslmode="require",
        ),
    ),
}
Error:
Apply - Backfill Tables [y/n]: y
--- Logging error ---
Traceback (most recent call last):
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\authenticators.py", line 102, in _initial_get_token
    (access_token, refresh_token) = self.oauth_manager.get_tokens(
                                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 250, in get_tokens
    raise e
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 244, in get_tokens
    auth_response = self.__get_authorization_code(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 150, in __get_authorization_code
    raise e
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\databricks\sql\auth\oauth.py", line 145, in __get_authorization_code
    authorization_code_response = client.parse_request_uri_response(
                                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\oauthlib\oauth2\rfc6749\clients\web_application.py", line 220, in parse_request_uri_response
    response = parse_authorization_code_response(uri, state=state)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "c:\Users\gabe.pesco\Repos\DataAccessLayer\DataAccessLayer\.venv\Lib\site-packages\oauthlib\oauth2\rfc6749\parameters.py", line 277, in parse_authorization_code_response
    raise MismatchingStateError()
oauthlib.oauth2.rfc6749.errors.MismatchingStateError: (mismatching_state) CSRF Warning! State not equal in request and response.
Other info:
- Setting `concurrent_tasks=1` fixes the issue but kills the concurrency of the pipeline.
- Setting `auth_type="pat"` also fixes the issue, but not all Databricks workspaces allow users to generate PATs (my production workspaces don't).
- Setting `auth_type="azure-oauth"` does not impact the issue, and neither does logging in with the az CLI.
- I am using a serverless SQL warehouse for the connection, and a Databricks Lakebase Postgres instance for the state connection.
- This could be an issue with the Databricks Python package SQLMesh uses, but I do not know how to diagnose the bug to that degree.
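For context, the guard that fires in the traceback is oauthlib's standard OAuth 2.0 `state` check: if the `state` a flow sent in its authorization request doesn't match the `state` in the redirect it receives back, oauthlib raises `MismatchingStateError` (which is consistent with concurrent connections each starting their own OAuth flow and one flow consuming another's redirect). A minimal sketch of that check in isolation, using dummy values (`dummy-client-id`, the callback URL, and the state strings are all made up for illustration):

```python
from oauthlib.oauth2 import WebApplicationClient
from oauthlib.oauth2.rfc6749.errors import MismatchingStateError

client = WebApplicationClient("dummy-client-id")  # hypothetical client id

# Redirect URI as it would come back from the authorization server,
# carrying the state issued to flow "A".
redirect = "https://localhost:8020/callback?code=abc123&state=state-A"

# The flow that issued state-A parses this response just fine:
resp = client.parse_request_uri_response(redirect, state="state-A")
print(resp["code"])  # abc123

# But a concurrent flow expecting state-B trips the CSRF guard,
# producing exactly the error from the traceback above:
try:
    client.parse_request_uri_response(redirect, state="state-B")
except MismatchingStateError as e:
    print(type(e).__name__)  # MismatchingStateError
```

This doesn't prove where the mix-up happens in the Databricks connector, but it narrows the question to: with `concurrent_tasks > 1`, are multiple browser-based OAuth flows sharing one redirect listener?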