Invalidate event cache on duplicate key error#536

Open
neSpecc wants to merge 2 commits into master from fix-cache

neSpecc commented Mar 6, 2026

The OOM is caused by infinite recursion in handle(). Here's what happens:

  1. getEvent() queries MongoDB, finds no document, and caches null (since CacheController treats null as a valid value — only undefined triggers a cache miss)
  2. Since existedEvent is null, the code enters the "first occurrence" branch and calls saveEvent()
  3. If another worker instance already inserted this event, MongoDB throws a duplicate key error
  4. The catch block recursively calls this.handle(task) to re-process as a repetition
  5. But getEvent() returns the stale cached null instead of fetching the now-existing event from the database
  6. The code thinks it's still a "first occurrence", tries to insert again, gets another duplicate error, and recurses again — forever
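
The cache behavior behind steps 1 and 5 can be sketched in isolation. This is a minimal illustration, not the project's CacheController: `SketchCache`, `fetchFromDb`, and the in-memory `db` map are hypothetical stand-ins, and the code is synchronous for brevity. The only property taken from the description above is that `null` counts as a cached value while `undefined` counts as a miss.

```typescript
// Hypothetical stand-in for CacheController: only `undefined` is a cache miss.
class SketchCache<T> {
  private store = new Map<string, T | null>();

  get(key: string, resolver: () => T | null): T | null {
    const cached = this.store.get(key);

    if (cached !== undefined) {
      return cached; // a cached null is returned as a valid value
    }

    const value = resolver();

    this.store.set(key, value);

    return value;
  }
}

// In-memory map standing in for the MongoDB events collection (assumption).
const db = new Map<string, { hash: string }>();
const cache = new SketchCache<{ hash: string }>();
let dbReads = 0;

const fetchFromDb = (): { hash: string } | null => {
  dbReads++;

  return db.get("abc") ?? null;
};

// First lookup: nothing in the database, so null is resolved and cached.
const first = cache.get("project:abc", fetchFromDb);

// Another worker inserts the event in the meantime.
db.set("abc", { hash: "abc" });

// Second lookup: the cached null is served and the database is not read again.
const second = cache.get("project:abc", fetchFromDb);
```

Because the second lookup never reaches the database, every retry sees the same "no event" answer, which is what keeps the recursion going.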

The fix: when a duplicate key error from the database shows that the event already exists, clear the cached null so that the next fetch reads the newly created document. This change computes the event cache key via getEventCacheKey(projectId, uniqueEventHash) and deletes it with this.cache.del before retrying handle(task), so a stale cached null can no longer make the retry treat an existing event as a first occurrence.
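
The retry path with invalidation might look roughly like the sketch below. It is not the actual diff: `SketchGrouper`, the `EventDoc` shape, and the in-memory `db` map are hypothetical, a plain `Map` stands in for the cache (so `delete` plays the role of `this.cache.del`), and everything is synchronous for brevity. The duplicate key check uses MongoDB's error code 11000.

```typescript
type EventDoc = { groupHash: string; totalCount: number };

const DUPLICATE_KEY_CODE = 11000; // MongoDB duplicate key error code

// Hypothetical, simplified worker. A Map stands in for the real cache.
class SketchGrouper {
  private cache = new Map<string, EventDoc | null>();

  constructor(private db: Map<string, EventDoc>) {}

  private getEventCacheKey(projectId: string, hash: string): string {
    return `${projectId}:${hash}`;
  }

  getEvent(projectId: string, hash: string): EventDoc | null {
    const key = this.getEventCacheKey(projectId, hash);
    const cached = this.cache.get(key);

    if (cached !== undefined) {
      return cached; // null is treated as a valid cached value
    }

    const found = this.db.get(hash) ?? null;

    this.cache.set(key, found);

    return found;
  }

  private saveEvent(hash: string): void {
    if (this.db.has(hash)) {
      const err = new Error("E11000 duplicate key") as Error & { code?: number };

      err.code = DUPLICATE_KEY_CODE;
      throw err;
    }
    this.db.set(hash, { groupHash: hash, totalCount: 1 });
  }

  handle(projectId: string, hash: string): "first" | "repetition" {
    const existed = this.getEvent(projectId, hash);

    if (existed === null) {
      try {
        this.saveEvent(hash);

        return "first";
      } catch (e) {
        if ((e as { code?: number }).code === DUPLICATE_KEY_CODE) {
          // Drop the stale cached null so the retry re-reads the database.
          this.cache.delete(this.getEventCacheKey(projectId, hash));

          return this.handle(projectId, hash);
        }
        throw e;
      }
    }

    existed.totalCount++;

    return "repetition";
  }
}

// Simulated race: this worker caches null, then another worker inserts the event.
const events = new Map<string, EventDoc>();
const worker = new SketchGrouper(events);

worker.getEvent("p1", "abc");
events.set("abc", { groupHash: "abc", totalCount: 1 });

const outcome = worker.handle("p1", "abc");
```

In this sketch, removing the `this.cache.delete(...)` line reproduces the unbounded recursion described above, since each retry would read the same cached null.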

Copilot AI left a comment

Pull request overview

Fixes an OOM caused by infinite recursion in GrouperWorker.handle() when a MongoDB duplicate-key insert happens after getEvent() has cached a null (cache hit on retry prevented re-fetching the newly inserted document).

Changes:

  • On duplicate-key insert errors, computes the getEvent() cache key and deletes it before retrying handle(task).
  • Ensures the retry path re-reads the event from MongoDB instead of using a stale cached null.

Copilot AI commented Mar 6, 2026

@neSpecc I've opened a new pull request, #537, to work on those changes. Once the pull request is ready, I'll request review from you.
