-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Fix folderManager NPE issue in WALNode when starting iotdb in a disk-full state #16869
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Signed-off-by: OneSizeFitsQuorum <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR addresses an NPE issue in WALNode creation that occurs when IoTDB starts with a full disk. When the disk is full at startup, the FolderManager initialization in the constructor fails, leaving it null. If disk space is later freed, subsequent WAL node creation attempts fail with NPE because the folderManager is still null.
Key Changes:
- Adds lazy initialization of FolderManager in the createWALNode method to handle recovery after disk space becomes available
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
.../apache/iotdb/db/storageengine/dataregion/wal/allocation/AbstractNodeAllocationStrategy.java
Outdated
Show resolved
Hide resolved
Signed-off-by: OneSizeFitsQuorum <[email protected]>
ca51b0d to
cd69d06
Compare
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #16869 +/- ##
============================================
- Coverage 38.87% 38.87% -0.01%
Complexity 207 207
============================================
Files 5021 5021
Lines 333052 333058 +6
Branches 42389 42389
============================================
- Hits 129487 129484 -3
- Misses 203565 203574 +9 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| // already in lock, so no need to synchronized | ||
| if (folderManager.get() == null) { | ||
| folderManager.set( | ||
| new FolderManager( | ||
| Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY)); |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Race condition: Multiple threads can simultaneously check folderManager.get() == null and proceed to create multiple FolderManager instances. The comment on line 61 states "already in lock, so no need to synchronized", but AtomicReference alone doesn't prevent this race condition. The check-then-set pattern is not atomic. Consider using synchronized double-checked locking:
if (folderManager.get() == null) {
synchronized (folderManager) {
if (folderManager.get() == null) {
folderManager.set(new FolderManager(...));
}
}
}Alternatively, use compareAndSet() for atomic updates.
| // already in lock, so no need to synchronized | |
| if (folderManager.get() == null) { | |
| folderManager.set( | |
| new FolderManager( | |
| Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY)); | |
| // Atomically initialize folderManager if null | |
| while (folderManager.get() == null) { | |
| FolderManager newManager = new FolderManager( | |
| Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY); | |
| if (folderManager.compareAndSet(null, newManager)) { | |
| break; | |
| } | |
| // else, another thread set it, retry check |
| return folderManager | ||
| .get() | ||
| .getNextWithRetry( | ||
| folder -> new WALNode(identifier, folder + File.separator + identifier)); |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential NPE: Even after the null check and re-initialization attempt (lines 62-66), folderManager.get() could still be null if the FolderManager constructor throws DiskSpaceInsufficientException again. In this case, line 67 will throw NPE when calling .get().getNextWithRetry(). The exception from line 63-65 is not caught here, so it would propagate up, but if it were caught elsewhere, this would be an NPE risk. Consider adding a null check after the initialization attempt or handling the case where re-initialization fails.
| if (folderManager.get() == null) { | ||
| folderManager.set( | ||
| new FolderManager( | ||
| Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY)); | ||
| } |
Copilot
AI
Dec 9, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing test coverage for the disk recovery scenario: The new lazy initialization logic (lines 62-66) that handles disk space recovery is not covered by tests. The existing tests in FirstCreateStrategyTest don't simulate a disk-full-at-startup followed by disk-space-recovery scenario. Consider adding a test that:
- Simulates disk full at construction time (folderManager becomes null)
- Verifies WALFakeNode is returned initially
- Frees disk space
- Verifies successful WAL node creation on retry (folderManager is recreated)



This PR addresses an NPE issue in WALNode creation that occurs when IoTDB starts with a full disk. When the disk is full at startup, the FolderManager initialization in the constructor fails, leaving it null. If disk space is later freed, subsequent WAL node creation attempts fail with NPE because the folderManager is still null.