Skip to content

Implement size-based file rotation for FileBasedEventLogger#8415

Open
Ironankit525 wants to merge 6 commits intoapache:masterfrom
Ironankit525:feature/eventlogger-file-rotation
Open

Implement size-based file rotation for FileBasedEventLogger#8415
Ironankit525 wants to merge 6 commits intoapache:masterfrom
Ironankit525:feature/eventlogger-file-rotation

Conversation

@Ironankit525
Copy link
Contributor

Its a feature request

What is the purpose of the change

The FileBasedEventLogger currently has no bounds on the amount of data it writes to events.log, which can lead to disk exhaustion over time for long-running topologies. There is a //TODO: file rotation comment in the code indicating this was planned.

This PR implements file rotation for the FileBasedEventLogger based on file size and retained file count.

How was this change tested

Added FileBasedEventLoggerTest.java to test that files correctly rotate after reaching the size limit and older files are deleted when exceeding the max retained files. Tests execute properly and rotation chunks are verified.

@rzo1
Copy link
Contributor

rzo1 commented Mar 3, 2026

Thanks for the PR. I added some comments (and also initiated an additional AI-based review). I would suggest removing the STORM- prefix from the title and commit messages since we are no longer on Jira.

@rzo1 rzo1 added this to the 2.8.5 milestone Mar 3, 2026
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Implements size-based rotation for FileBasedEventLogger to prevent unbounded growth of events.log in long-running topologies.

Changes:

  • Add size/retention-based log rotation to FileBasedEventLogger.
  • Introduce new topology configs for rotation size (MB) and max retained files.
  • Add a JUnit test covering rotation behavior.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

File Description
storm-client/src/jvm/org/apache/storm/metric/FileBasedEventLogger.java Adds file-size tracking + rotation/retention logic for events.log.
storm-client/src/jvm/org/apache/storm/Config.java Defines new config keys for rotation size and retained file count.
storm-client/test/jvm/org/apache/storm/metric/FileBasedEventLoggerTest.java Adds rotation test using a temp worker-artifacts directory.
Comments suppressed due to low confidence (4)

storm-client/src/jvm/org/apache/storm/metric/FileBasedEventLogger.java:162

  • rotateFiles() never cleans up pre-existing files beyond maxRetainedFiles (e.g., if the retention setting is reduced, events.log.(maxRetainedFiles+1) and higher will remain forever). Consider deleting any files > maxRetainedFiles (at least maxRetainedFiles+1) before/after the shift so the on-disk state actually matches the configured retention.
        // Shift existing rotated files
        for (int i = maxRetainedFiles - 1; i >= 1; i--) {
            Path src = Paths.get(eventLogPath.toString() + "." + i);
            Path dst = Paths.get(eventLogPath.toString() + "." + (i + 1));
            if (Files.exists(src)) {
                Files.move(src, dst, StandardCopyOption.REPLACE_EXISTING);
            }
        }

        // Rename current events.log
        if (Files.exists(eventLogPath)) {
            Path dst = Paths.get(eventLogPath.toString() + ".1");
            Files.move(eventLogPath, dst, StandardCopyOption.REPLACE_EXISTING);
        }

storm-client/test/jvm/org/apache/storm/metric/FileBasedEventLoggerTest.java:58

  • tearDown() unconditionally calls eventLogger.close(), but FileBasedEventLogger.close() will throw an NPE if prepare() never ran (e.g., if the test fails early), which can mask the real failure. Also, Files.walk(tempDir) isn't closed and deletes in walk order (can try to delete directories before contents), which can leak handles and leave temp dirs behind. Guard the close call (or initialize/prepare in @BeforeEach), and delete the temp directory using try-with-resources + reverse depth order (or Files.walkFileTree), preferably with Files.deleteIfExists / assertions on failures.
    public void tearDown() throws IOException {
        eventLogger.close();
        if (tempDir != null) {
            Files.walk(tempDir)
                    .map(Path::toFile)
                    .forEach(File::delete);
            tempDir.toFile().delete();
        }

storm-client/test/jvm/org/apache/storm/metric/FileBasedEventLoggerTest.java:35

  • assertEquals is imported but never used in this test, which will fail compilation under common static analysis/build settings. Remove the unused import.
import org.junit.jupiter.api.Test;
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertTrue;

storm-client/test/jvm/org/apache/storm/metric/FileBasedEventLoggerTest.java:118

  • The fixed Thread.sleep(100) calls make this test slower and potentially flaky while not guaranteeing durability/visibility of buffered writes. Rotation is synchronous (it closes/moves files), so the sleeps should be unnecessary; consider removing them or replacing them with a deterministic signal (e.g., eventLogger.close() before assertions, or explicitly flushing via a test hook).
        // Wait a bit for flush if any (though rotation is synchronous in write)
        Thread.sleep(100);

        Path expectedLogDir = tempDir.resolve("test-topology-1").resolve("6700");
        Path logFile = expectedLogDir.resolve("events.log");
        Path logFile1 = expectedLogDir.resolve("events.log.1");
        Path logFile2 = expectedLogDir.resolve("events.log.2");

        // The first 10 writes should be in one file, almost 1 MB.
        assertTrue(Files.exists(logFile));
        
        // Write 2 more times to push it over 1MB
        eventLogger.log(eventInfo);
        eventLogger.log(eventInfo);
        
        Thread.sleep(100);


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

@rzo1 rzo1 requested a review from reiabreu March 3, 2026 09:14
@Ironankit525 Ironankit525 force-pushed the feature/eventlogger-file-rotation branch from 841f508 to 0c33160 Compare March 3, 2026 09:36
@Ironankit525 Ironankit525 changed the title STORM-: Implement size-based file rotation for FileBasedEventLogger Implement size-based file rotation for FileBasedEventLogger Mar 3, 2026
@Ironankit525
Copy link
Contributor Author

Thanks for the review @rzo1! I have implemented all the requested changes:

  • Removed the File object instantiation and moved to pure NIO (Files.size, Files.exists, Files.createDirectories).
    • Introduced constants for the default config values.
    • Lowered the default rotation size to 100 MB.
    • Cleaned up the commit message and PR title.
      Let me know if anything else is needed!

1 similar comment
@Ironankit525
Copy link
Contributor Author

Thanks for the review @rzo1! I have implemented all the requested changes:

  • Removed the File object instantiation and moved to pure NIO (Files.size, Files.exists, Files.createDirectories).
    • Introduced constants for the default config values.
    • Lowered the default rotation size to 100 MB.
    • Cleaned up the commit message and PR title.
      Let me know if anything else is needed!

@rzo1
Copy link
Contributor

rzo1 commented Mar 3, 2026

Thanks for the update. I think, that the concern raised by copilot is valid. So maybe you should also look into that area.

@Ironankit525
Copy link
Contributor Author

Thanks for the update. I think, that the concern raised by copilot is valid. So maybe you should also look into that area.

ok let me see that also.

Signed-off-by: Ankit Kumar <ankitkumar17541@gmail.com>
@Ironankit525 Ironankit525 force-pushed the feature/eventlogger-file-rotation branch from 0c33160 to 37ef064 Compare March 3, 2026 10:51
@Ironankit525
Copy link
Contributor Author

Ironankit525 commented Mar 3, 2026

I have added a writeLock mechanism to synchronize the log(), flush(), and close() operations. This addresses Copilot's concern by ensuring that file rotations, active writes, and background flushes are now atomic and thread-safe.

thanks

@reiabreu reiabreu requested a review from rzo1 March 5, 2026 23:45
@reiabreu
Copy link
Contributor

reiabreu commented Mar 5, 2026

@Ironankit525 can you please resolve the conversations?

Copy link
Contributor

@rzo1 rzo1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry can you point me to the writeLock introduced? Didn't catch it in the diff?

Maybe something like

// In log()
@Override
public void log(EventInfo event) {
    try {
        String logMessage = buildLogMessage(event);
        byte[] logBytes = logMessage.getBytes(StandardCharsets.UTF_8);
        int writeLength = logBytes.length + System.lineSeparator().length();

        if (currentFileSize + writeLength > maxFileSize) {
            writeLock.lock();               // exclusive: swap the writer
            try {
                // re-check after acquiring lock (another thread may have already rotated)
                if (currentFileSize + writeLength > maxFileSize) {
                    rotateFiles();
                }
            } finally {
                writeLock.unlock();
            }
        }

        readLock.lock();                    // shared: safe to write
        try {
            eventLogWriter.write(logMessage);
            eventLogWriter.newLine();
            currentFileSize += writeLength;
            dirty = true;
        } finally {
            readLock.unlock();
        }
    } catch (IOException ex) {
        LOG.error("Error logging event {}", event, ex);
        throw new RuntimeException(ex);
    }
}

// In the flush Runnable (inside setUpFlushTask)
if (dirty) {
    readLock.lock();                        // shared: safe alongside other reads
    try {
        eventLogWriter.flush();
        dirty = false;
    } finally {
        readLock.unlock();
    }
}

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 8 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

You can also share your feedback on Copilot code review. Take the survey.

rzo1 and others added 2 commits March 6, 2026 11:12
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…ggerTest.java

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
@rzo1
Copy link
Contributor

rzo1 commented Mar 6, 2026

Thanks for the contribution @Ironankit525 - think it's oknow. We can still refine later, if needed.

@Ironankit525
Copy link
Contributor Author

Thanks for the feedback! I've gone ahead and addressed all the remaining Copilot review comments in my latest commits. Let me know if anything else is needed!

@rzo1
Copy link
Contributor

rzo1 commented Mar 6, 2026

Can you look at checkstyle?

@Ironankit525 Ironankit525 force-pushed the feature/eventlogger-file-rotation branch from 06335b6 to 6fc4f3c Compare March 6, 2026 11:56
@Ironankit525
Copy link
Contributor Author

Ironankit525 commented Mar 6, 2026

Yes, I've just force-pushed a fix for the checkstyle failures. A previous commit had accidentally reformatted Config.java and introduced some syntax errors, which broke the build.

@reiabreu
Copy link
Contributor

reiabreu commented Mar 6, 2026

Pipeline is green. Happy to merge it @rzo1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants