Skip to content

[SYNPY-1749]Allow quote, apostrophe and ellipsis in store_row_async#1316

Open
danlu1 wants to merge 19 commits intodevelopfrom
SYNPY-1749-allow-quote-apostrophe-in-store-rows
Open

[SYNPY-1749]Allow quote, apostrophe and ellipsis in store_row_async#1316
danlu1 wants to merge 19 commits intodevelopfrom
SYNPY-1749-allow-quote-apostrophe-in-store-rows

Conversation

@danlu1
Copy link
Contributor

@danlu1 danlu1 commented Feb 9, 2026

Problem:

A JSON serialization issue occurs when a DataFrame passed to store_row_async contains a list or dictionary with strings that include both double quotes and apostrophes.

Solution:

  1. Add default value "escapechar": "\\" to to_csv_kwargs in store_rows_async
  2. Add to_csv_kwargs to _stream_and_update_from_df so it can take the passed to_csv_kwargs values for downstream data processing.

Testing:

Unit test and integration test have been added.

@danlu1 danlu1 requested a review from a team as a code owner February 9, 2026 20:01
@andrewelamb
Copy link
Contributor

@danlu1 Is this still WIP, or are you looking for reviews?

@danlu1
Copy link
Contributor Author

danlu1 commented Feb 10, 2026

@andrewelamb sorry I should have marked this a draft.

@danlu1 danlu1 marked this pull request as draft February 10, 2026 00:48
@danlu1 danlu1 marked this pull request as ready for review February 18, 2026 18:36
@danlu1
Copy link
Contributor Author

danlu1 commented Feb 18, 2026

The integration test failures are in the recordset and submission modules and do not appear to be related to my changes.

Copy link
Contributor

@linglp linglp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think overall it looks good. The tests can be consolidated a bit to test all the edge cases in fewer integration tests to improve performance, and the docstring can be updated to reflect the new state of the code since json.dumps() was removed. There's also some logic that can be simplified in the redundant checks where sample_values is created but never actually used. The function name could also be more descriptive of what it actually does now (sanitizing special values rather than just converting dtypes)

@danlu1 danlu1 requested a review from linglp February 24, 2026 00:15
Copy link
Member

@BryanFauble BryanFauble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great, once we get the last few items handled (and develop merged in), I can approve!

Copy link
Member

@BryanFauble BryanFauble left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on this fix -- the backslash-escaping approach for embedded quotes makes sense, and the Ellipsis/pd.NA handling is solid. I flagged one issue that I think needs to be addressed before merge, plus a few cleanup nits.

Note: This comment was drafted with AI assistance and reviewed by me for accuracy.

@danlu1 danlu1 requested review from BryanFauble and linglp March 9, 2026 19:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants