improvement(idempotency): added atomic claims to prevent duplicate processing for long-running workflows #1366
Greptile Summary
This PR introduces atomic claiming to the idempotency service to prevent duplicate processing of long-running workflows. The changes address a race condition where Gmail webhook processing operations taking over 60 seconds were being processed multiple times by concurrent instances.
The key architectural changes include:
- Atomic claiming mechanism: Replaces the previous check-then-process pattern with atomic operations using Redis `SET NX` and database `INSERT ON CONFLICT DO NOTHING` to ensure only one instance can claim a processing key at a time
- Status-based coordination: Introduces a three-state system ('in-progress', 'completed', 'failed') allowing concurrent processes to coordinate properly
- Waiting mechanism: Implements `waitForResult()` with a 5-minute timeout and 1-second polling intervals for processes that fail to claim a key, allowing them to wait for the claiming process to complete
- Infrastructure simplification: Removes the memory cache fallback and `enableDatabaseFallback` configuration, assuming Redis and database are always available in production
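To illustrate the claim semantics described in the list above, here is a minimal sketch, not the PR's actual code. It assumes a hypothetical `InMemoryClaimStore` whose `setIfAbsent` stands in for Redis `SET NX` or `INSERT ON CONFLICT DO NOTHING`; the `ClaimRecord` shape and the return type of `atomicallyClaim` are also assumptions made for the example.

```typescript
// The three statuses named in the summary.
type Status = 'in-progress' | 'completed' | 'failed'

// Hypothetical record shape; the real service may store more fields.
interface ClaimRecord {
  status: Status
  result?: unknown
  error?: string
}

// Hypothetical in-memory stand-in for Redis `SET NX` or
// `INSERT ... ON CONFLICT DO NOTHING`. The check and the write happen in one
// synchronous call, so no other caller can interleave between them.
class InMemoryClaimStore {
  private records = new Map<string, ClaimRecord>()

  // Writes the record only if the key is absent; returns whether it was written.
  setIfAbsent(key: string, record: ClaimRecord): boolean {
    if (this.records.has(key)) return false
    this.records.set(key, record)
    return true
  }

  get(key: string): ClaimRecord | undefined {
    return this.records.get(key)
  }

  // Unconditional overwrite, used by the claim owner to publish the outcome.
  set(key: string, record: ClaimRecord): void {
    this.records.set(key, record)
  }
}

// Tries to claim a key. Exactly one caller per key gets `claimed: true`;
// every other caller receives the record that is already stored.
function atomicallyClaim(
  store: InMemoryClaimStore,
  key: string
): { claimed: boolean; existing?: ClaimRecord } {
  const claimed = store.setIfAbsent(key, { status: 'in-progress' })
  return claimed ? { claimed: true } : { claimed: false, existing: store.get(key) }
}
```

With a real Redis or database backend the same guarantee would have to come from the single atomic command rather than from JavaScript's single-threaded execution, which is why a separate "check, then write" pair of calls is not equivalent.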
The changes integrate with the existing Gmail polling service which was experiencing the duplicate processing issue. The atomicallyClaim() method ensures that when multiple instances attempt to process the same Gmail webhook simultaneously, only one succeeds in claiming the processing rights while others either wait for completion or receive cached results. This is particularly important for Gmail operations that can exceed 60 seconds due to API rate limits and large email volumes.
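The waiting side can be sketched in a similar way. This is an illustrative polling loop under stated assumptions, not the PR's implementation: the `read` callback, the `StoredResult` shape, and the `WaitOptions` parameter are hypothetical, and only the defaults (5-minute timeout, 1-second interval) come from the summary.

```typescript
type Status = 'in-progress' | 'completed' | 'failed'

// Hypothetical record shape returned by the backing store.
interface StoredResult {
  status: Status
  result?: unknown
  error?: string
}

// Hypothetical options object; defaults mirror the values in the summary.
interface WaitOptions {
  timeoutMs: number
  pollIntervalMs: number
}

// Polls `read` until the claiming process publishes a terminal status.
// Resolves with the stored result on 'completed', throws on 'failed',
// and throws a timeout error if the deadline passes first.
async function waitForResult(
  read: (key: string) => Promise<StoredResult | undefined>,
  key: string,
  options: WaitOptions = { timeoutMs: 5 * 60 * 1000, pollIntervalMs: 1000 }
): Promise<unknown> {
  const deadline = Date.now() + options.timeoutMs
  while (Date.now() < deadline) {
    const record = await read(key)
    if (record?.status === 'completed') return record.result
    if (record?.status === 'failed') {
      throw new Error(record.error ?? `Operation for ${key} failed`)
    }
    // Still in progress (or not yet visible): sleep, then poll again.
    await new Promise<void>((resolve) => setTimeout(resolve, options.pollIntervalMs))
  }
  throw new Error(`Timed out waiting for result of ${key}`)
}
```

A caller that loses the claim would invoke something like `waitForResult(readFromStore, key)` and return the value it resolves to, so the work itself runs only once.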
Confidence score: 3/5
- This PR addresses a legitimate concurrency issue but introduces complex distributed coordination logic that could have edge cases
- Score reflects the sophisticated atomic claiming implementation that should work well but may have timing-related edge cases in high-concurrency scenarios
- Pay close attention to the atomic claiming logic and potential duplicate condition checking in the executeWithIdempotency method
1 file reviewed, 1 comment
Summary
Added atomic claims to prevent duplicate processing for long-running workflows. Some Gmail processes took more than 60 seconds, and as a result the same emails were being processed twice. Removed the memory fallback since Redis and the database are always available.
Type of Change
Testing
Tested manually.
Checklist