-
-
Notifications
You must be signed in to change notification settings - Fork 46
Fix cell duplication when YStore is out-of-sync with file #526
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Fix cell duplication when YStore is out-of-sync with file #526
Conversation
This commit addresses the cell duplication bug reported in jupyterlab/jupyterlab#14031. Root cause analysis from production logs: - When initialize() detects an out-of-sync between YStore and file - The YDoc already contains content from YStore (potentially corrupted) - Calling set() tries to MERGE instead of REPLACE - This causes cell duplication Fixes: 1. handlers.py: Reorder initialize() BEFORE serve() to prevent race condition 2. rooms.py: Clear YDoc cells when out-of-sync is detected before loading from file Tested with production logs showing 100% correlation between out-of-sync and duplication.
krassowski
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the PR!
Looking at jupyter_ydoc/ynotebook.py, the set() method (lines 233-321):
The code which you highlight was only introduced 2 weeks ago, but issue that you are addressing was opened over two years ago.
If that code is really the culprit, can you confirm:
- it started two weeks ago for you
- downgrading to jupyter-ydoc v3.1.0 resolves the issue for you
But the Y.js internal state (clientID, clock) differs, causing the merge to create duplicates.
This is kind of irrelevant to set() implementation because it operates on serialized value.
| # FIX: Clear the document content BEFORE loading from file to prevent | ||
| # cell duplication. When out-of-sync is detected, the YDoc contains | ||
| # content from the YStore that may have corrupted state (e.g., from | ||
| # previous duplications). If we don't clear it, the set() method will | ||
| # try to MERGE the YStore content with the file content instead of | ||
| # REPLACING it, which causes cell duplication. | ||
| # See: https://github.com/jupyterlab/jupyterlab/issues/14031 | ||
| self._clear_document() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Doing it here would undo the benefits of work done to address #509 and jupyterlab/jupyterlab#18100, e.g.
- Fix cell modifications causing cell output reload and shift to active cell index jupyter-server/jupyter_ydoc#360
- Improve reloading of documents to avoid spurious side-effects jupyter-server/jupyter_ydoc#355
set() method will try to MERGE the YStore content with the file content instead of REPLACING it
This should be:
- tested (there should be a test showing the problem, failing now, passing after a test)
- addressed in
jupyter-ydocset()method for notebook, not in this package - hard reload should be only performed IF required to prevent UI side-effects
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be easier to test this by adding a UI test here, even if the fix ultimately goes to the jupyter-ydoc package because this repo has UI tests setup.
|
We've experienced this bug on both versions of jupyter-ydoc:
So the Looking more carefully at the log sequence:
In all 3 cases, the duplication is detected AFTER The actual mechanism:
About the About Some questions i have in mind if you can help with please :
|
That's something we've been thinking about, and it would also help with managing the size of the YStore (see here).
What do you mean by out-of-sync? Technically client and server are always out-of-sync. I think we could introduce some kind of session ID into the shared document, and when the room is cleared and the shared document recreated in the backend, change this ID. When the client sees that its session ID doesn't match, it recreates a fresh shared document which syncs with the backend. But this would require cooperation from the frontend. |
Fix cell duplication when YStore is out-of-sync with file
Fixes jupyterlab/jupyterlab#14031
Summary
This PR fixes a cell duplication bug that occurs when the SQLite YStore content differs from the file on disk (out-of-sync condition). The bug manifests after laptop sleep/wake cycles or prolonged disconnections.
Root Cause Analysis
Why is this bug INTERMITTENT (random)?
The bug only occurs when ALL of these conditions are met:
document_cleanup_delay(default: 60s)This explains why:
Evidence from Production Logs
We analyzed 15 reconnection events across two production incidents.
Statistical Correlation
Correlation: 100% - Every out-of-sync event led to duplication.
Log Extract: Case WITHOUT duplication (first connection)
Log Extract: Case WITHOUT duplication (first connection, no YStore)
Content loaded from file(direct load, no YStore) → NO DuplicateCellIdLog Extract: Case WITH duplication (reconnection after sleep, YStore out-of-sync)
Content loaded from the ystore SQLiteYStore→ YStore was loaded firstContent in file ... is out-of-sync with the ystore→ OUT-OF-SYNC DETECTED (TRIGGER)Content loaded from file→ File content applied viaset()which does MERGETechnical Deep Dive
The Feedback Loop (Why duplications accumulate)
This explains why duplications accumulate over multiple reconnections:
Why set() causes MERGE instead of REPLACE
Looking at
jupyter_ydoc/ynotebook.py, theset()method (lines 233-321):The problem: When YStore has cells
[A, B, C]and file has cells[A, B, C]with the same IDs,set()tries to retain them. But the Y.js internal state (clientID, clock) differs, causing the merge to create duplicates.The Fix
Fix 1: Clear YDoc on out-of-sync (
rooms.py)When out-of-sync is detected, we now clear the YDoc cells before calling
set():The
_clear_document()method:This ensures
set()receives an empty YDoc, so all cells come fresh from the file with no merge conflicts.Fix 2: Reorder initialize/serve (
handlers.py) - (DEFENSE IN DEPTH)The original code had a race condition:
This allowed SYNC_STEP1 to be sent before initialization completed. We now ensure initialization completes first:
Why Both Fixes Are Needed
rooms.py(clear on out-of-sync)handlers.py(reorder init/serve)The
rooms.pyfix is critical as it directly prevents the merge that causes duplication. Thehandlers.pyfix is defense in depth - good practice but won't help if YStore is already out-of-sync.Testing
ycells.clear()is the same method used internally byset()when no cells are retained