Bug: Duplicate VID constraint violation when multiple offchain triggers fire in the same block #6335

@hudsonhrh

Submitter's note: This issue was researched and written with the assistance of Claude Code. The bug was discovered while indexing the POP (Perpetual Organization Protocol) subgraph on hoodi, and the root cause and fix were identified by tracing the graph-node source code.

Summary

When specVersion >= 1.3.0 (which enables deterministic VID generation), multiple offchain triggers (e.g. file/ipfs data sources) processed in the same block each create a fresh EntityCache with vid_seq reset to RESERVED_VIDS (100). If two or more offchain triggers in the same block write entities to the same table, they produce identical VIDs, causing a PostgreSQL unique constraint violation.

Affected Subgraph

This bug was discovered while indexing the POP subgraph (QmVDwncSF2SKkux73Gv2hDjS3jBgNF54mP7DMzBMzTmNUi) on the hoodi testnet. The subgraph uses TaskMetadata @entity(immutable: false) entities populated by file/ipfs data source handlers. When multiple tasks are created in the same transaction, multiple IPFS file data sources fire in the same block, triggering the VID collision.

The issue was identified and root-caused with the assistance of Claude Code (Anthropic's CLI tool).

Error Message

Subgraph writer failed, error: database constraint violated: duplicate key value
violates unique constraint "task_metadata_pkey": Key (vid)=(9293978516062308) already exists

Decoding the VID: 9293978516062308 = (2163923 << 32) + 100, so the block number is 2163923 and the sequence number is 100 (RESERVED_VIDS). This is what we would expect if each offchain trigger restarts the sequence at the same starting point.
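The decoding above can be checked with a few lines of Rust. This is a minimal sketch that assumes only the VID layout quoted in this issue, (block << 32) + vid_seq; decode_vid and encode_vid are hypothetical helper names, not graph-node functions.

```rust
/// Splits a deterministic VID into (block number, per-block sequence).
/// Assumes the layout quoted in this issue: vid = (block << 32) + vid_seq.
fn decode_vid(vid: i64) -> (i64, u32) {
    let block = vid >> 32; // upper 32 bits: block number
    let seq = (vid & 0xFFFF_FFFF) as u32; // lower 32 bits: sequence number
    (block, seq)
}

/// Inverse of decode_vid, using the same assumed layout.
fn encode_vid(block: i64, seq: u32) -> i64 {
    (block << 32) + seq as i64
}
```

Calling decode_vid(9293978516062308) yields block 2163923 and sequence 100, matching the value in the error message.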

Environment

  • graph-node version: v0.36.0 (master at commit 3ca739f)
  • specVersion: 1.3.0 (deterministic VID generation enabled)
  • Multiple file/ipfs data sources triggered in the same block

Root Cause Analysis

The Bug Location

File: core/src/subgraph/runner.rs, method handle_offchain_triggers

Problem: Each offchain trigger creates a fresh BlockState, and the EntityCache inside it starts with vid_seq reset to RESERVED_VIDS (100)

async fn handle_offchain_triggers(
    &mut self,
    triggers: Vec<offchain::TriggerData>,
    block: &Arc<C::Block>,
) -> Result<(Vec<EntityModification>, ...), Error> {
    let mut mods = vec![];

    for trigger in triggers {
        // BUG: Fresh BlockState resets vid_seq to RESERVED_VIDS (100) every iteration
        let schema = ReadStore::input_schema(&self.inputs.store);
        let mut block_state = BlockState::new(EmptyStore::new(schema), LfuCache::new());

        // ... process trigger, which calls entity_cache.set() ...

        mods.extend(
            block_state
                .entity_cache
                .as_modifications(block.number())  // VIDs assigned during set(), starting at 100
                .await?
                .modifications,
        );
    }
    Ok((mods, ...))
}

The VID formula (entity_cache.rs:390):

let vid = ((block as i64) << 32) + self.vid_seq as i64;
self.vid_seq += 1;

Bug Scenario

  1. Block N has two offchain triggers (e.g. two IPFS file data sources resolve)
  2. Trigger 1: Fresh EntityCache created, vid_seq = 100. Handler writes entity A → VID = (N << 32) + 100
  3. Trigger 2: Fresh EntityCache created, vid_seq = 100 again. Handler writes entity B → VID = (N << 32) + 100
  4. Both modifications are combined into a single transaction via mods.extend()
  5. Database write fails: Two rows with the same VID in the same table violate the unique constraint
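The scenario above can be reproduced outside graph-node with a toy model. The sketch below is not graph-node code: ToyCache, vids_with_fresh_cache_per_trigger, and first_duplicate are hypothetical names, and the only things taken from this issue are the RESERVED_VIDS value and the VID formula.

```rust
use std::collections::HashSet;

// Starting value of the counter, as quoted in this issue.
const RESERVED_VIDS: u32 = 100;

/// Toy stand-in for the VID counter in EntityCache (hypothetical type).
struct ToyCache {
    vid_seq: u32,
}

impl ToyCache {
    /// A fresh cache always starts at RESERVED_VIDS, as in the buggy loop.
    fn new() -> Self {
        ToyCache { vid_seq: RESERVED_VIDS }
    }

    /// Same formula as quoted from entity_cache.rs in this issue.
    fn next_vid(&mut self, block: i32) -> i64 {
        let vid = ((block as i64) << 32) + self.vid_seq as i64;
        self.vid_seq += 1;
        vid
    }
}

/// Each trigger gets its own fresh cache and writes one entity (steps 2 and 3).
fn vids_with_fresh_cache_per_trigger(block: i32, triggers: usize) -> Vec<i64> {
    (0..triggers)
        .map(|_| ToyCache::new().next_vid(block))
        .collect()
}

/// Returns the first VID seen twice, which a unique constraint would reject.
fn first_duplicate(vids: &[i64]) -> Option<i64> {
    let mut seen = HashSet::new();
    vids.iter().copied().find(|v| !seen.insert(*v))
}
```

With block 2163923 and two triggers, both generated VIDs are identical and first_duplicate reports the same key that appears in the error message.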

Why This Only Affects specVersion >= 1.3.0

For older spec versions, VIDs are generated by PostgreSQL autoincrement sequences (BIGSERIAL), not by the (block << 32) + vid_seq formula. The EntityCache.vid_seq field is only used when strict_vid_order() returns true, which requires specVersion >= 1.3.0 (see schema/input/mod.rs:1596).

Why This Is Timing-Dependent

The bug only triggers when two or more offchain triggers for the same entity table resolve in the same block. On Subgraph Studio, IPFS content is pre-pinned and resolves instantly, making collisions frequent. On a local graph-node, IPFS fetches are slower and triggers naturally spread across blocks.

Steps to Reproduce

  1. Deploy a subgraph with specVersion: 1.3.0 or higher
  2. Include a file/ipfs data source template that writes to an entity table
  3. In a single transaction, emit events that create multiple file data sources with different IPFS CIDs
  4. Ensure both IPFS files resolve and are processed in the same block
  5. Ensure both handlers write to the same entity table
  6. Observe the failure: the VID collision causes the unique constraint violation shown in the error message above

Workaround

There is no workaround in the mapping handlers themselves. The only mitigation is to set specVersion below 1.3.0 in the manifest (which falls back to PostgreSQL sequences for VIDs), but this sacrifices deterministic VID ordering.

Proposed Fix

Thread vid_seq from the onchain EntityCache through each iteration of the offchain trigger loop, so each trigger continues the sequence where the previous one left off:

async fn handle_offchain_triggers(
    &mut self,
    triggers: Vec<offchain::TriggerData>,
    block: &Arc<C::Block>,
    mut next_vid_seq: u32,  // NEW: carried from onchain EntityCache
) -> Result<(Vec<EntityModification>, ...), Error> {
    let mut mods = vec![];

    for trigger in triggers {
        let schema = ReadStore::input_schema(&self.inputs.store);
        let mut block_state = BlockState::new(EmptyStore::new(schema), LfuCache::new());

        // FIX: Continue vid sequence from previous trigger
        block_state.entity_cache.vid_seq = next_vid_seq;

        // ... process trigger ...

        // FIX: Carry forward for next iteration
        next_vid_seq = block_state.entity_cache.vid_seq;

        mods.extend(
            block_state
                .entity_cache
                .as_modifications(block.number())
                .await?
                .modifications,
        );
    }
    Ok((mods, ...))
}

The caller passes block_state.entity_cache.vid_seq (from onchain processing) as the initial value. This is a 12-line change with negligible performance overhead (one u32 read and one u32 write per trigger).
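The effect of threading the counter can be illustrated with the same kind of toy model. This is a sketch under stated assumptions, not the actual patch: ToyCache and vids_with_threaded_seq are hypothetical names, and the initial value passed in stands in for the counter left behind by onchain processing.

```rust
// Starting value of the counter, as quoted in this issue.
const RESERVED_VIDS: u32 = 100;

/// Toy stand-in for the VID counter in EntityCache (hypothetical type).
struct ToyCache {
    vid_seq: u32,
}

impl ToyCache {
    /// Starts the counter at a caller-supplied value instead of RESERVED_VIDS.
    fn with_seq(vid_seq: u32) -> Self {
        ToyCache { vid_seq }
    }

    /// Same formula as quoted from entity_cache.rs in this issue.
    fn next_vid(&mut self, block: i32) -> i64 {
        let vid = ((block as i64) << 32) + self.vid_seq as i64;
        self.vid_seq += 1;
        vid
    }
}

/// Processes `writes_per_trigger[i]` entity writes for trigger i, carrying the
/// counter from one trigger to the next. Returns the VIDs and the final counter.
fn vids_with_threaded_seq(
    block: i32,
    writes_per_trigger: &[u32],
    mut next_vid_seq: u32,
) -> (Vec<i64>, u32) {
    let mut vids = Vec::new();
    for &writes in writes_per_trigger {
        // Still a fresh cache per trigger, but seeded with the carried counter.
        let mut cache = ToyCache::with_seq(next_vid_seq);
        for _ in 0..writes {
            vids.push(cache.next_vid(block));
        }
        // Carry the counter forward for the next trigger.
        next_vid_seq = cache.vid_seq;
    }
    (vids, next_vid_seq)
}
```

For example, two triggers that write two entities and one entity respectively, starting from RESERVED_VIDS, produce sequence numbers 100, 101, and 102 with no repeats, and the returned counter is 103.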

Additional Notes

  • The EmptyStore and fresh BlockState per trigger is a "makeshift way to get causality region isolation" (per the existing code comment). This isolation is correct for entity data, but vid_seq is an internal bookkeeping counter that must be monotonic across all triggers in a block.
  • The EntityCache::seq field (used by generate_id() for auto-generated entity IDs) has the same reset-to-zero pattern and could cause similar issues if offchain handlers use auto-generated IDs. This is a separate but related concern.
  • A unit test demonstrating the collision is included in the PR.

Related Code Paths

  • core/src/subgraph/runner.rs: handle_offchain_triggers() loop, caller at line ~813
  • graph/src/components/store/entity_cache.rs: RESERVED_VIDS constant, vid_seq field, VID formula in set()
  • graph/src/schema/input/mod.rs: strict_vid_order() — gates deterministic VID generation on specVersion >= 1.3.0
  • graph/src/components/store/mod.rs: EmptyStore — used for offchain trigger isolation
