Fix COPY TO returning 0 rows during concurrent reorganize #1605
Open
gfphoenix78 wants to merge 1 commit into apache:main
When ALTER TABLE ... SET WITH (reorganize=true) runs concurrently with COPY TO, COPY may return 0 rows instead of all rows.

The root cause is a snapshot/lock ordering problem: PortalRunUtility() pushes the active snapshot before calling DoCopy(), so the snapshot predates any concurrent reorganize that had not yet committed. After COPY TO blocks behind the reorganize's AccessExclusiveLock and the reorganize commits, the stale snapshot cannot see the new physical files (xmin = reorganize_xid is invisible) while the old physical files have already been removed, yielding 0 rows.

Three code paths are fixed:

1. Relation-based COPY TO (copy.c, DoCopy): After table_openrv() acquires AccessShareLock (which blocks until any concurrent reorganize commits), pop and re-push the active snapshot so it reflects all committed data at lock-grant time.

2. Query-based COPY TO, RLS COPY TO, and CTAS (copyto.c, BeginCopy): After pg_analyze_and_rewrite() -> AcquireRewriteLocks() acquires all direct relation locks, refresh the snapshot. This covers COPY (SELECT ...) TO, COPY on RLS-protected tables (internally rewritten to a query), and CREATE TABLE AS SELECT.

3. Partitioned table COPY TO (copy.c, DoCopy): Before entering BeginCopy, call find_all_inheritors() to eagerly acquire AccessShareLock on all child partitions. Child partition locks are normally acquired later in ExecutorStart -> ExecInitAppend, after PushCopiedSnapshot has already embedded a stale snapshot. Locking all children upfront ensures the snapshot refresh in fixes 1 and 2 covers all concurrent child-partition reorganize commits.

In REPEATABLE READ or SERIALIZABLE isolation, GetTransactionSnapshot() returns the same transaction-level snapshot, so the Pop/Push is a harmless no-op.

Tests added:

- src/test/isolation2/sql/copy_to_concurrent_reorganize.sql: Tests 2.1-2.5 for relation-based, query-based, partitioned, RLS, and CTAS paths across heap, AO row, and AO column storage.
- contrib/pax_storage/src/test/isolation2/sql/pax/copy_to_concurrent_reorganize.sql: Same coverage for PAX columnar storage.

See: Issue #1545
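The ordering problem described above can be illustrated with a small standalone model. This is only a sketch, not code from this patch or from the backend: every name in it (ToyDb, ToySnapshot, toy_copy_to, and so on) is hypothetical, and visibility is reduced to a single "highest committed xid" horizon, which is far simpler than real snapshot visibility rules. It assumes the reorganize rewrites every row with xmin = reorganize_xid and that the old file is gone once it commits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NROWS 3

typedef unsigned int ToyXid;

/* Hypothetical toy database: one table whose current file holds TOY_NROWS rows. */
typedef struct {
    ToyXid last_committed;       /* highest committed transaction id */
    ToyXid row_xmin[TOY_NROWS];  /* xmin of each row in the current physical file */
} ToyDb;

/* Hypothetical toy snapshot: rows with xmin <= horizon are visible. */
typedef struct {
    ToyXid horizon;
} ToySnapshot;

void toy_init(ToyDb *db)
{
    assert(db != NULL);
    db->last_committed = 1;
    for (size_t i = 0; i < TOY_NROWS; i++)
        db->row_xmin[i] = 1;
}

ToySnapshot toy_take_snapshot(const ToyDb *db)
{
    ToySnapshot snap = { db->last_committed };
    return snap;
}

/* Model of a reorganize commit: all rows now live in a new file written by
 * reorganize_xid; the old file (rows with the old xmin) no longer exists. */
void toy_reorganize_commit(ToyDb *db)
{
    ToyXid reorganize_xid = db->last_committed + 1;
    for (size_t i = 0; i < TOY_NROWS; i++)
        db->row_xmin[i] = reorganize_xid;
    db->last_committed = reorganize_xid;
}

size_t toy_count_visible(const ToyDb *db, ToySnapshot snap)
{
    size_t visible = 0;
    for (size_t i = 0; i < TOY_NROWS; i++)
        if (db->row_xmin[i] <= snap.horizon)
            visible++;
    return visible;
}

/* Model of COPY TO: the snapshot is taken first, then the lock wait happens
 * (during which the reorganize commits), then rows are scanned. */
size_t toy_copy_to(ToyDb *db, bool refresh_after_lock)
{
    ToySnapshot snap = toy_take_snapshot(db);  /* taken before the lock wait */
    toy_reorganize_commit(db);                 /* commits while COPY is blocked */
    if (refresh_after_lock)
        snap = toy_take_snapshot(db);          /* the pop/re-push step, in toy form */
    return toy_count_visible(db, snap);
}
```

In this model, calling toy_copy_to(&db, false) on a freshly initialized ToyDb counts 0 rows, because the snapshot horizon predates reorganize_xid, while toy_copy_to(&db, true) counts all TOY_NROWS rows. The model does not cover the REPEATABLE READ or SERIALIZABLE case or partition locking.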
my-ship-it approved these changes on Mar 5, 2026
reshke approved these changes on Mar 5, 2026