Fix COPY TO returning 0 rows during concurrent reorganize by gfphoenix78 · Pull Request #1605 · apache/cloudberry

gfphoenix78 · 2026-03-05T02:34:42Z

When ALTER TABLE ... SET WITH (reorganize=true) runs concurrently with COPY TO, COPY may return 0 rows instead of all rows. The root cause is a snapshot/lock ordering problem: PortalRunUtility() pushes the active snapshot before calling DoCopy(), so the snapshot predates any concurrent reorganize that had not yet committed. After COPY TO blocks on AccessExclusiveLock and the reorganize commits, the stale snapshot cannot see the new physical files (xmin = reorganize_xid is invisible) while the old physical files have already been removed, yielding 0 rows.

Three code paths are fixed:

Relation-based COPY TO (copy.c, DoCopy): After table_openrv() acquires AccessShareLock — which blocks until any concurrent reorganize commits — pop and re-push the active snapshot so it reflects all committed data at lock-grant time.
Query-based COPY TO, RLS COPY TO, and CTAS (copyto.c, BeginCopy): After pg_analyze_and_rewrite() -> AcquireRewriteLocks() acquires all direct relation locks, refresh the snapshot. This covers COPY (SELECT ...) TO, COPY on RLS-protected tables (internally rewritten to a query), and CREATE TABLE AS SELECT.
Partitioned table COPY TO (copy.c, DoCopy): Before entering BeginCopy, call find_all_inheritors() to eagerly acquire AccessShareLock on all child partitions. Child partition locks are normally acquired later in ExecutorStart -> ExecInitAppend, after PushCopiedSnapshot has already embedded a stale snapshot. Locking all children upfront ensures the snapshot refresh in fixes 1 and 2 covers all concurrent child-partition reorganize commits.

In REPEATABLE READ or SERIALIZABLE isolation, GetTransactionSnapshot() returns the same transaction-level snapshot, so the Pop/Push is a harmless no-op.

Tests added:

src/test/isolation2/sql/copy_to_concurrent_reorganize.sql Tests 2.1-2.5 for relation-based, query-based, partitioned, RLS, and CTAS paths across heap, AO row, and AO column storage.
contrib/pax_storage/src/test/isolation2/sql/pax/ copy_to_concurrent_reorganize.sql Same coverage for PAX columnar storage.

See: Issue#1545 #1545

Fixes #ISSUE_Number

What does this PR do?

Type of Change

Bug fix (non-breaking change)
New feature (non-breaking change)
Breaking change (fix or feature with breaking changes)
Documentation update

Breaking Changes

Test Plan

Unit tests added/updated
Integration tests added/updated
Passed make installcheck
Passed make -C src/test installcheck-cbdb-parallel

Impact

Performance:

User-facing changes:

Dependencies:

Checklist

Followed contribution guide
Added/updated documentation
Reviewed code for security implications
Requested review from cloudberry committers

Additional Context

CI Skip Instructions

When ALTER TABLE ... SET WITH (reorganize=true) runs concurrently with COPY TO, COPY may return 0 rows instead of all rows. The root cause is a snapshot/lock ordering problem: PortalRunUtility() pushes the active snapshot before calling DoCopy(), so the snapshot predates any concurrent reorganize that had not yet committed. After COPY TO blocks on AccessExclusiveLock and the reorganize commits, the stale snapshot cannot see the new physical files (xmin = reorganize_xid is invisible) while the old physical files have already been removed, yielding 0 rows. Three code paths are fixed: 1. Relation-based COPY TO (copy.c, DoCopy): After table_openrv() acquires AccessShareLock — which blocks until any concurrent reorganize commits — pop and re-push the active snapshot so it reflects all committed data at lock-grant time. 2. Query-based COPY TO, RLS COPY TO, and CTAS (copyto.c, BeginCopy): After pg_analyze_and_rewrite() -> AcquireRewriteLocks() acquires all direct relation locks, refresh the snapshot. This covers COPY (SELECT ...) TO, COPY on RLS-protected tables (internally rewritten to a query), and CREATE TABLE AS SELECT. 3. Partitioned table COPY TO (copy.c, DoCopy): Before entering BeginCopy, call find_all_inheritors() to eagerly acquire AccessShareLock on all child partitions. Child partition locks are normally acquired later in ExecutorStart -> ExecInitAppend, after PushCopiedSnapshot has already embedded a stale snapshot. Locking all children upfront ensures the snapshot refresh in fixes 1 and 2 covers all concurrent child-partition reorganize commits. In REPEATABLE READ or SERIALIZABLE isolation, GetTransactionSnapshot() returns the same transaction-level snapshot, so the Pop/Push is a harmless no-op. Tests added: - src/test/isolation2/sql/copy_to_concurrent_reorganize.sql Tests 2.1-2.5 for relation-based, query-based, partitioned, RLS, and CTAS paths across heap, AO row, and AO column storage. - contrib/pax_storage/src/test/isolation2/sql/pax/ copy_to_concurrent_reorganize.sql Same coverage for PAX columnar storage. See: Issue#1545 <apache#1545>

avamingli

LGTM.

my-ship-it approved these changes Mar 5, 2026

View reviewed changes

avamingli approved these changes Mar 5, 2026

View reviewed changes

reshke approved these changes Mar 5, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix COPY TO returning 0 rows during concurrent reorganize#1605

Fix COPY TO returning 0 rows during concurrent reorganize#1605
gfphoenix78 wants to merge 1 commit intoapache:mainfrom
gfphoenix78:fix-reorg

gfphoenix78 commented Mar 5, 2026

Uh oh!

avamingli left a comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants

Conversation

gfphoenix78 commented Mar 5, 2026

What does this PR do?

Type of Change

Breaking Changes

Test Plan

Impact

Checklist

Additional Context

CI Skip Instructions

Uh oh!

avamingli left a comment

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

4 participants