@OneSizeFitsQuorum
Contributor

@OneSizeFitsQuorum OneSizeFitsQuorum commented Dec 4, 2025

This PR addresses an NPE issue in WALNode creation that occurs when IoTDB starts with a full disk. When the disk is full at startup, the FolderManager initialization in the constructor fails, leaving it null. If disk space is later freed, subsequent WAL node creation attempts fail with NPE because the folderManager is still null.

Signed-off-by: OneSizeFitsQuorum <[email protected]>
Contributor

Copilot AI left a comment

Pull request overview

This PR addresses an NPE issue in WALNode creation that occurs when IoTDB starts with a full disk. When the disk is full at startup, the FolderManager initialization in the constructor fails, leaving it null. If disk space is later freed, subsequent WAL node creation attempts fail with NPE because the folderManager is still null.

Key Changes:

  • Adds lazy initialization of FolderManager in the createWALNode method to handle recovery after disk space becomes available
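
The lazy initialization described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the PR's actual code: `LazyFolderManagerSketch`, `StubFolderManager`, the `diskFull` supplier, and `IllegalStateException` (standing in for the disk-space exception) are all hypothetical names introduced here.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.BooleanSupplier;

// Hypothetical stand-ins for illustration; these are not IoTDB's actual classes.
class LazyFolderManagerSketch {

  /** Stand-in for FolderManager: construction fails while the disk is full. */
  static class StubFolderManager {
    StubFolderManager(BooleanSupplier diskFull) {
      if (diskFull.getAsBoolean()) {
        throw new IllegalStateException("disk space insufficient");
      }
    }

    String nextFolder() {
      return "wal";
    }
  }

  private final BooleanSupplier diskFull;
  private final AtomicReference<StubFolderManager> folderManager = new AtomicReference<>();

  LazyFolderManagerSketch(BooleanSupplier diskFull) {
    this.diskFull = diskFull;
    try {
      folderManager.set(new StubFolderManager(diskFull));
    } catch (IllegalStateException e) {
      // Disk full at startup: the reference stays null until a later call succeeds.
    }
  }

  /** Builds a node path, re-attempting the manager's construction if startup failed. */
  synchronized String createNode(String identifier) {
    if (folderManager.get() == null) {
      // Throws again if the disk is still full; the reference then remains null.
      folderManager.set(new StubFolderManager(diskFull));
    }
    return folderManager.get().nextFolder() + "/" + identifier;
  }

  boolean isInitialized() {
    return folderManager.get() != null;
  }
}
```

In this sketch the method is `synchronized`, so the check and the set happen under one lock; whether the real `createWALNode` is guarded the same way is not shown in this conversation.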

Signed-off-by: OneSizeFitsQuorum <[email protected]>
@sonarqubecloud

sonarqubecloud bot commented Dec 4, 2025

@codecov

codecov bot commented Dec 4, 2025

Codecov Report

❌ Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 38.87%. Comparing base (f4d628a) to head (cd69d06).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...wal/allocation/AbstractNodeAllocationStrategy.java | 80.00% | 2 Missing ⚠️ |

Additional details and impacted files
@@             Coverage Diff              @@
##             master   #16869      +/-   ##
============================================
- Coverage     38.87%   38.87%   -0.01%     
  Complexity      207      207              
============================================
  Files          5021     5021              
  Lines        333052   333058       +6     
  Branches      42389    42389              
============================================
- Hits         129487   129484       -3     
- Misses       203565   203574       +9     

Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.


Comment on lines +61 to +65
    // already in lock, so no need to synchronized
    if (folderManager.get() == null) {
      folderManager.set(
          new FolderManager(
              Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY));
Copilot AI Dec 9, 2025

Race condition: Multiple threads can simultaneously check folderManager.get() == null and proceed to create multiple FolderManager instances. The comment on line 61 states "already in lock, so no need to synchronized", but AtomicReference alone doesn't prevent this race condition. The check-then-set pattern is not atomic. Consider using synchronized double-checked locking:

if (folderManager.get() == null) {
  synchronized (folderManager) {
    if (folderManager.get() == null) {
      folderManager.set(new FolderManager(...));
    }
  }
}

Alternatively, use compareAndSet() for atomic updates.

Suggested change
- // already in lock, so no need to synchronized
- if (folderManager.get() == null) {
-   folderManager.set(
-       new FolderManager(
-           Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY));
+ // Atomically initialize folderManager if null
+ while (folderManager.get() == null) {
+   FolderManager newManager = new FolderManager(
+       Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY);
+   if (folderManager.compareAndSet(null, newManager)) {
+     break;
+   }
+   // else, another thread set it, retry check
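
For comparison, the double-checked locking variant mentioned in this comment could look roughly like the sketch below, assuming the caller does not already hold a lock. `DoubleCheckedInitSketch`, `StubManager`, `initLock`, and `constructionsUnderContention` are hypothetical names used only for illustration; the counter exists solely to make the "constructed once" property observable.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; StubManager stands in for FolderManager.
class DoubleCheckedInitSketch {

  static class StubManager {
    StubManager(AtomicInteger constructed) {
      constructed.incrementAndGet(); // record every construction
    }
  }

  private final AtomicInteger constructed = new AtomicInteger();
  private final AtomicReference<StubManager> ref = new AtomicReference<>();
  private final Object initLock = new Object();

  StubManager getOrCreate() {
    StubManager manager = ref.get(); // first check, without the lock
    if (manager == null) {
      synchronized (initLock) {
        manager = ref.get(); // second check, under the lock
        if (manager == null) {
          manager = new StubManager(constructed);
          ref.set(manager);
        }
      }
    }
    return manager;
  }

  /** Starts the given number of threads at once and returns how many managers were built. */
  static int constructionsUnderContention(int threads) {
    DoubleCheckedInitSketch holder = new DoubleCheckedInitSketch();
    CountDownLatch start = new CountDownLatch(1);
    Thread[] workers = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      workers[i] = new Thread(() -> {
        try {
          start.await();
        } catch (InterruptedException e) {
          Thread.currentThread().interrupt();
          return;
        }
        holder.getOrCreate();
      });
      workers[i].start();
    }
    start.countDown();
    for (Thread worker : workers) {
      try {
        worker.join();
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        throw new IllegalStateException("interrupted while waiting for workers", e);
      }
    }
    return holder.constructed.get();
  }
}
```

Unlike the `compareAndSet` loop, this variant never builds a manager that is then discarded, which may matter if construction has side effects such as touching directories. If the enclosing method already runs under a lock, as the code comment on line 61 claims, neither variant should be needed.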

Copilot uses AI. Check for mistakes.
Comment on lines +67 to +70
return folderManager
.get()
.getNextWithRetry(
folder -> new WALNode(identifier, folder + File.separator + identifier));
Copilot AI Dec 9, 2025

Potential NPE: Even after the null check and re-initialization attempt (lines 62-66), folderManager.get() could still be null if the FolderManager constructor throws DiskSpaceInsufficientException again. In that case, line 67 would throw an NPE when calling .get().getNextWithRetry(). The exception from lines 63-65 is not caught here, so it propagates up; if it were caught elsewhere, however, this would be an NPE risk. Consider adding a null check after the initialization attempt or handling the case where re-initialization fails.
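
One way to handle a failed re-initialization is to catch the failure at the point of construction and return a fallback, so the reference is never dereferenced while null. The sketch below is an assumption-laden illustration: `ReinitFallbackSketch`, `DiskFullException`, `ManagerFactory`, `Manager`, and the `FAKE_NODE` marker are hypothetical, with `FAKE_NODE` loosely playing the role that `WALFakeNode` appears to play in the project.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical sketch; none of these types are IoTDB's actual classes.
class ReinitFallbackSketch {

  /** Stand-in for the checked exception thrown when the disk is full. */
  static class DiskFullException extends Exception {}

  /** Stand-in for FolderManager. */
  static class Manager {
    String nextFolder() {
      return "wal";
    }
  }

  /** Factory whose create() may fail, mimicking the FolderManager constructor. */
  interface ManagerFactory {
    Manager create() throws DiskFullException;
  }

  /** Marker returned instead of a real node when no manager is available. */
  static final String FAKE_NODE = "fake-node";

  private final AtomicReference<Manager> ref = new AtomicReference<>();
  private final ManagerFactory factory;

  ReinitFallbackSketch(ManagerFactory factory) {
    this.factory = factory;
  }

  synchronized String createNode(String identifier) {
    Manager manager = ref.get();
    if (manager == null) {
      try {
        manager = factory.create();
        ref.set(manager);
      } catch (DiskFullException e) {
        // Re-initialization failed: return the fallback rather than dereference null.
        return FAKE_NODE;
      }
    }
    // A local variable is used here, so no second ref.get() can observe null.
    return manager.nextFolder() + "/" + identifier;
  }
}
```

Holding the manager in a local variable after the check is the key design choice here: the dereference on the last line cannot fail even if initialization was attempted and did not succeed.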

Comment on lines +62 to +66
    if (folderManager.get() == null) {
      folderManager.set(
          new FolderManager(
              Arrays.asList(commonConfig.getWalDirs()), DirectoryStrategyType.SEQUENCE_STRATEGY));
    }
Copilot AI Dec 9, 2025

Missing test coverage for the disk recovery scenario: The new lazy initialization logic (lines 62-66) that handles disk space recovery is not covered by tests. The existing tests in FirstCreateStrategyTest don't simulate a disk-full-at-startup followed by disk-space-recovery scenario. Consider adding a test that:

  1. Simulates disk full at construction time (folderManager becomes null)
  2. Verifies WALFakeNode is returned initially
  3. Frees disk space
  4. Verifies successful WAL node creation on retry (folderManager is recreated)
