Cursor snapshots tx_context at creation time, causing surprising behavior across begin/commit #34

@vgvoleg

Summary

Cursor / AsyncCursor capture connection._tx_context at the moment
connection.cursor() is called and hold that reference for the whole
lifetime of the cursor. This diverges from how DB-API drivers like
psycopg2 behave, where a cursor is a lightweight handle that reads the
connection's current transaction state on every execute. The current
behavior produces silently wrong results or errors in fairly natural
usage patterns.

See Connection.cursor() in ydb_dbapi/connections.py (passes
tx_context=self._tx_context into the cursor constructor) and
Cursor.__init__ in ydb_dbapi/cursors.py where
self._tx_context = tx_context is stored as a snapshot and then used on
every execute.
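
To make the mechanism concrete, here is a minimal toy model of the
snapshot pattern described above. ToyConnection, SnapshotCursor and
tx_used_by_execute are hypothetical stand-ins written for this issue,
not the actual ydb_dbapi classes; only the _tx_context attribute name is
taken from the driver.

```python
# Toy model only: hypothetical stand-ins, not ydb_dbapi code.
class ToyConnection:
    def __init__(self):
        self._tx_context = None  # no interactive transaction yet

    def begin(self):
        self._tx_context = object()  # a fresh transaction handle

    def cursor(self):
        # Mirrors the described behavior: the current value is passed in once.
        return SnapshotCursor(self, tx_context=self._tx_context)


class SnapshotCursor:
    def __init__(self, connection, tx_context):
        self._connection = connection
        self._tx_context = tx_context  # captured at creation, never refreshed

    def tx_used_by_execute(self):
        # Stand-in for execute(): report which tx context it would use.
        return self._tx_context


conn = ToyConnection()
cur = conn.cursor()   # created before begin(), so the snapshot is None
conn.begin()

stale = cur.tx_used_by_execute()  # still None
current = conn._tx_context        # the transaction opened by begin()
```

In this model the cursor and the connection disagree about the active
transaction as soon as begin() is called after cursor().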

Repro / affected scenarios

Assume conn.set_isolation_level(SERIALIZABLE) so interactive
transactions are on.

1. Cursor created before begin() silently runs outside the transaction

cur = conn.cursor()                 # snapshot: tx_context = None
conn.begin()
cur.execute("INSERT INTO t ...")    # runs in autocommit, NOT in the tx
conn.commit()                       # commits an empty tx; the insert is
                                    # already persisted autonomously

No error is raised. The user thinks the insert was transactional; it
wasn't.

2. Reusing a cursor after commit() / rollback() breaks it

conn.begin()
cur = conn.cursor()
cur.execute("INSERT ...")
conn.commit()                       # tx_context is now dead
cur.execute("SELECT ...")           # cursor still references the closed
                                    # tx_context → SDK-level failure

In psycopg2 the equivalent just works: the cursor picks up the new
transaction automatically.
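
The connection-level model can be checked with the standard-library
sqlite3 driver, used here only as a readily available stand-in for
psycopg2 (the table name t and its column are made up for the example).
A single cursor is created up front and reused across a commit and a
rollback:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()                      # created before any transaction
cur.execute("CREATE TABLE t (x INTEGER)")

cur.execute("INSERT INTO t VALUES (1)")  # sqlite3 opens a transaction implicitly
conn.commit()                            # first transaction ends here

cur.execute("INSERT INTO t VALUES (2)")  # same cursor, second transaction
conn.rollback()                          # undoes only the second insert

rows = cur.execute("SELECT x FROM t ORDER BY x").fetchall()
conn.close()
```

The cursor keeps working after commit(), and rollback() affects only
the statement run in the second transaction, because transaction state
lives on the connection rather than on the cursor.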

3. Mixed creation/execute ordering gives inconsistent state

cur1 = conn.cursor()                # snapshot None
conn.begin()
cur2 = conn.cursor()                # snapshot = current tx_context
cur1.execute(...)                   # outside the tx
cur2.execute(...)                   # inside the tx

Two cursors on the same connection end up in different transaction
contexts. This is very hard to reason about.

Why this is surprising

The DB-API 2.0 ecosystem (psycopg2, sqlite3, pymysql, …) treats the
transaction as a property of the connection. Cursors are lightweight
handles that read the connection's current state on every execute.
Users and frameworks (SQLAlchemy, Django, Alembic) rely on that model.
The snapshot behavior here is an implementation detail that leaks into
user code.

Proposed fix

In Cursor.execute / AsyncCursor.execute, read
self._connection._tx_context at execute time instead of using the
snapshot stored on the cursor. The snapshot field can stay for internal
bookkeeping but should not be the source of truth.
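
A minimal sketch of the proposed lookup, again on a toy model rather
than the real classes. ToyConnection and LiveCursor are hypothetical,
and commit() resetting _tx_context to None is a simplification made for
the example, not a claim about what the driver does internally.

```python
# Toy model only: hypothetical stand-ins, not ydb_dbapi code.
class ToyConnection:
    def __init__(self):
        self._tx_context = None

    def begin(self):
        self._tx_context = object()  # a fresh transaction handle

    def commit(self):
        self._tx_context = None      # simplification: the tx is gone after commit

    def cursor(self):
        return LiveCursor(self)


class LiveCursor:
    def __init__(self, connection):
        self._connection = connection

    def execute(self, query):
        # Read the connection's tx context at execute time, not at creation.
        tx_context = self._connection._tx_context
        return tx_context  # stand-in for running the query under tx_context


conn = ToyConnection()
cur = conn.cursor()                  # created before begin()

conn.begin()
first_tx = conn._tx_context
first = cur.execute("INSERT ...")    # sees the transaction opened above
conn.commit()

conn.begin()
second = cur.execute("SELECT ...")   # same cursor, sees the new transaction
```

With this shape, scenarios 1 to 3 collapse into one rule: whatever
transaction the connection is in when execute runs is the one used.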

Existing tests should be audited for any assumption that relies on the
current snapshot-at-creation behavior; a quick scan didn't find any
intentional dependency.

Alternative (less invasive)

If changing semantics is undesirable, at minimum execute could
validate that the cursor's snapshot still matches
connection._tx_context and raise ProgrammingError with a clear
message when they diverge. This removes the silent wrong behavior, but
users face the same friction.
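
The validation variant could look roughly like the sketch below.
ToyConnection and CheckedCursor are hypothetical, ProgrammingError is
defined locally as a stand-in for the driver's DB-API exception, and the
error message wording is only a suggestion.

```python
class ProgrammingError(Exception):
    """Local stand-in for the DB-API 2.0 ProgrammingError exception."""


# Toy model only: hypothetical stand-ins, not ydb_dbapi code.
class ToyConnection:
    def __init__(self):
        self._tx_context = None

    def begin(self):
        self._tx_context = object()  # a fresh transaction handle

    def cursor(self):
        return CheckedCursor(self, self._tx_context)


class CheckedCursor:
    def __init__(self, connection, tx_context):
        self._connection = connection
        self._tx_context = tx_context  # snapshot kept, but validated on use

    def execute(self, query):
        # Fail loudly if the snapshot no longer matches the connection.
        if self._tx_context is not self._connection._tx_context:
            raise ProgrammingError(
                "cursor was created under a different transaction context; "
                "create a new cursor after begin()/commit()/rollback()"
            )
        return self._tx_context  # stand-in for running the query


conn = ToyConnection()
stale_cur = conn.cursor()   # snapshot: None
conn.begin()
fresh_cur = conn.cursor()   # snapshot: current tx context

try:
    stale_cur.execute("INSERT ...")
    stale_raised = False
except ProgrammingError:
    stale_raised = True

fresh_ok = fresh_cur.execute("INSERT ...") is conn._tx_context
```

Scenario 1 then fails with an explicit error instead of silently
running outside the transaction, while a cursor created after begin()
is unaffected.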
