Skip to content

Support multi-column cursors#1333

Open
ClaytonPassmore wants to merge 11 commits intoShopify:mainfrom
ClaytonPassmore:feature/json-cursor
Open

Support multi-column cursors#1333
ClaytonPassmore wants to merge 11 commits intoShopify:mainfrom
ClaytonPassmore:feature/json-cursor

Conversation

@ClaytonPassmore
Copy link

@ClaytonPassmore ClaytonPassmore commented Nov 30, 2025

What

This change adds support for multi-column cursors.

Fixes #1226

Adds a new option to serialize cursor values as JSON for new runs. Any existing runs will continue to use the old cursor encoding mechanism, which was purely taking the cursor value and coercing it to a string via ActiveRecord.

Upon resuming a job, JSON cursors from the run model are parsed before being handed back to the enumerator.

How

This is accomplished by tracking how the cursor is encoded on the run model via a new boolean column, cursor_is_json.

For existing runs, this value will be false and for new runs, we're setting the value based on the newly added MaintenanceTasks.json_cursors config value, which defaults to false.

Why

Multi-column cursors can occur when:

  • A task specifies multiple fields via #cursor_columns, or
  • Iterating over an ActiveRecord collection that has a composite primary key

Since #cursor_columns is a documented feature, it seems like there's some impetus to support multi-column cursor values.

Rails added support for models with composite primary keys back in 7.1. As someone that works on a multi-tenant app, being able to iterate over models with composite primary keys is a very important feature 😄

I just added support for batching over ActiveRecord relations with composite primary keys in the job-iteration gem (see Shopify/job-iteration#650). If we can get multi-column support in the maintenance_tasks gem, then it should "just work" with models that use composite primary keys once that job iteration change lands.

Risks

This change could introduce issues for cursor values that are not serializable as JSON, or that lose data/precision after going through the encode/decode process.

It's worth noting that this change increases the risk of producing a cursor value that is too large to fit into the cursor string column on the maintenance_tasks_runs table. All databases are different, but I believe that databases like MySQL have a hard character limit on string columns. By supporting multi-column cursors, we're enabling larger cursor values to be produced. There are probably going to be edge cases where a multi-column cursor value will not fit into a 255 character column.

Moving forward, encode cursors as JSON for new runs. Existing runs will
continue to use the old cursor encoding mechanism, which was purely
taking the cursor value and coercing it to a string via ActiveRecord.

This is accomplished by tracking how the cursor is encoded on the run
model via a new boolean column, `cursor_is_json`. This value will be
false-y for existing runs.

This change adds support for multi-column cursors, which can occur when:

* A task specifies multiple `cursor_columns`, or
* When iterating over an ActiveRecord collection that has a multi-column
  primary key

Fixes Shopify#1226
Copy link
Contributor

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @ClaytonPassmore , thanks for working on this (and getting that work shipped in job-iteration)! Agree that it would be really nice to get this working in the gem, specifically with the use case of iterating over CPK-backed Active Record relations.

A few things come to mind based on an initial pass:

  • We can't assume, when an app upgrades the maintenance_tasks gem, that the AddCursorIsJsonFlagToRuns migration will run imminently. We need to ensure that the code changes would be compatible with an older version of the schema. I think the best way to handle this is to add a config that users can enable after they're confident the database changes have executed that allows us to start setting that column value.
  • We should definitely add documentation around multi-column cursor support, and the fact that cursors are serialized as JSON (which also implies that cursors must be JSON-serializable. We should note this in the Custom Enumerators section as well).
  • It would be awesome to add a new fixture task that iterates over a model with a CPK, and add test coverage for that alongside the tests you added to task_job_test.rb.

cc @etiennebarrie @nvasilevski if either of you have feedback here as well.


class AddCursorIsJsonFlagToRuns < ActiveRecord::Migration[7.1]
def change
add_column(:maintenance_tasks_runs, :cursor_is_json, :boolean)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this default to false?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was just being cautious because I believe adding a default value in certain older versions of MySQL and Postgres is a heavy operation. It seems like this shouldn't be an issue for more recent databases though. I've gone ahead and updated it to have a default and a NOT NULL constraint.

Adds a configuration option that allows users of this gem to opt-in to
JSON cursors. This option is disabled by default.

Updates README with information about JSON cursors.

Adds a default value of `false` to the new `cursor_is_json` column on
the run model. As a result of this, had to change where `cursor_is_json`
is set for new records. Now instead of being set with an
`after_initialize` callback, it is set explicitly by the runner using
the newly added configuration value.

Finally, ensures that the code can run without the `cursor_is_json`
column existing yet. This allows the gem to function even when people
forget to generate migrations after an upgrade.
@ClaytonPassmore
Copy link
Author

Thanks for the feedback @adrianna-chang-shopify! I threw up some changes based on your suggestions. I haven't had a chance to write a new fixture task yet but hopefully I can get to that soon.

Adds a new model to the dummy app (Order) with a composite primary key.
Adds some fixtures for this model, along with a task that iterates over
this model. Added some tests to show that JSON cursors work with models
that have composite primary keys.
@ClaytonPassmore
Copy link
Author

@adrianna-chang-shopify alright, I went ahead and added another task that iterates over a model with CPK and added some test coverage like you suggested. If what I added wasn't quite what you were looking for, let me know!

I was going to write a test case that shows how the gem will fail when you try to iterate over a CPK model without JSON cursors enabled but I quickly realized that how the gem fails depends on a few factors, like the database adapter used, the column types, and the seeded records. So instead of doing that, I figured I would just leave it as "undefined behaviour".


class AddCursorIsJsonFlagToRuns < ActiveRecord::Migration[7.1]
def change
add_column(:maintenance_tasks_runs, :cursor_is_json, :boolean, default: false, null: false)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we avoid this by looking at the first character and assuming it's JSON if it's a [? Or is there a compatibility issue with other types of cursors?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right now in main, if the cursor is a multi-column cursor (which obviously doesn't work yet but is documented), the cursor value is an array. Since the cursor value gets serialized with to_s, it will still result in a string that starts with [, but its contents won't necessarily be JSON objects.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there's also the possibility that someone could have a task that uses a single string column for their cursor and that column contains strings that start with [.

@ClaytonPassmore
Copy link
Author

I'll take a look at these failures shortly 👍

Copy link
Contributor

@adrianna-chang-shopify adrianna-chang-shopify left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My main piece of feedback right now is that it adds a good amount of complexity to be checking both the config and the existence of the column via has_attribute?(:cursor_is_json). Can we assume that if the config is on, the migration has run? We could raise when the config is set instead:

# lib/maintenance_tasks.rb

mattr_reader :json_cursors, default: false

def self.json_cursors=(value)
  msg = "Run is missing the required :cursor_is_json column. Ensure migration to add JSON cursor support is run before enabling json_cursors."
  raise msg unless Run.has_attribute?(:cursor_is_json)
  @@json_cursors = value
end

# @return [Boolean]
# True when the cursor value should be treated as serialized JSON.
def cursor_is_json?
has_attribute?(:cursor_is_json) && cursor_is_json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly, here can we check the flag instead of checking has_attribute?(:cursor_is_json)?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can make that change if you like, though I would caution that it generates an interesting edge case -

It's possible someone may enable the configuration flag, perform a run part way, then change the configuration to disable the flag before the run has completed.

If we're only looking at the configuration flag, we won't see that the task has a JSON cursor. So when we go to resume the task, we'll assume that the cursor value isn't JSON and then we'll fail to resume the task (much like what happens in main right now if you try to use a multi-column cursor).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So when we go to resume the task, we'll assume that the cursor value isn't JSON and then we'll fail to resume the task

Wouldn't this be expected though? If you have a task that relies on a multi-column cursor but you've disabled JSON cursors, shouldn't we fail to resume the task?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a fair expectation, I'll go ahead and make that change 👍

Comment on lines +35 to +43
def serialized_cursor_position
cursor_position && @run.cursor_is_json? ? cursor_position.to_json : cursor_position
end

def deserialized_run_cursor
return JSON.parse(@run.cursor) if @run.cursor && @run.cursor_is_json?

@run.cursor
end
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of serializing / deserializing by hand, do you think something like this in run.rb would work?

if MaintenanceTasks.json_cursors
  serialize :cursor, coder: JSON
end

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this would enable the same edge case that I mentioned here.

If that's acceptable, I can go ahead and make these changes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would really tidy things up but after looking at it a little closer, I think it's going to create more issues than it might be worth.

There are two problems that I see:

  1. Since this is evaluated when the class is loaded, the test suite won't easily be able to switch between cases where JSON serialization is enabled and disabled
  2. For cases where the configuration flag is enabled but we're resuming a run that doesn't have a JSON encoded cursor (because the run was started prior to the flag being enabled), we would try to interpret that cursor as JSON.

If you would like to see this implemented, any guidance would be appreciated.

Renames the configuration flag `json_cursors` to
`serialize_cursors_as_json`.

Also adds a guard clause to the configuration flag so that it cannot be
set to true unless the migration that adds `cursor_is_json` to
`maintenance_tasks_runs` has been executed.

Simplifies the logic that determines when `cursor_is_json` gets set.
@ClaytonPassmore
Copy link
Author

@adrianna-chang-shopify thank you for the thoughtful suggestions!

I went ahead and implemented a few of the changes you suggested. There were a couple of comments that would introduce an edge case, I responded to those with some detail. Let me know if you would like me to make those changes anyway.

The CI failures should be fixed as well -

  1. Didn't realize there were system tests 😅
  2. Apparently minitest-mock was split out of minitest in v6, which affected a subset of the matrix. I added it as a test dependency and things seem to be back on track.

Also had to disable a rubocop rule on the setter for the class variable
since rubocop doesn't like it when you use them. However the established
pattern seems to be to use `mattr_accessor`, which is using class
variables under the hood.
@ClaytonPassmore
Copy link
Author

ClaytonPassmore commented Feb 2, 2026

@adrianna-chang-shopify apologies for missing that rubocop violation. Hopefully we're good for another run now.

I had to disable a rubocop rule when setting the class variable for serialize_cursors_as_json. It seems that rubocop doesn't care about mattr_attribute but as soon as you set a class variable explicitly, it gets flagged.

Edit: Oops I didn't see your reply about reading the configuration flag. I'll make that change as well.

Rather than compute cursor_is_json entirely from the state of the model,
we're now going to deem the cursor to be a non-JSON value if the
configuration flag for JSON cursor serialization is disabled.

This means that if a run is partially performed and has a JSON encoded
cursor value, and the configuration flag is subsequently disabled, when
the run is resumed the cursor value is interpreted as a string.
@ClaytonPassmore
Copy link
Author

GitHub Actions is having some issues today 😞

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Cursor is not deserialized correctly with multiple columns

3 participants