[fix](regression) Make Iceberg rewrite where init script idempotent by suxiaogang223 · Pull Request #63673 · apache/doris

suxiaogang223 · 2026-05-26T08:07:54Z

What problem does this PR solve?

Issue Number: N/A

Problem Summary:

test_iceberg_rewrite_data_files_where_conditions depends on three Iceberg tables created by the Spark bootstrap script run21.sql. The script used CREATE TABLE IF NOT EXISTS and then always inserted the test rows. If a table already exists, or the bootstrap SQL is re-run after a partial execution, the INSERT statements append to the existing table, so the regression case may fail before it ever runs rewrite_data_files because COUNT(*) no longer returns the expected 30 rows.

This PR makes the init SQL for this case idempotent by dropping and recreating the three test tables before inserting the fixed test data.
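
The failure mode and the fix can be sketched outside of Spark. The following is a minimal illustration only, using Python's built-in sqlite3 module as a stand-in for the Spark/Iceberg bootstrap; the table name `rewrite_where_demo`, its columns, and the helper function names are hypothetical and are not taken from run21.sql. Only the 30-row expectation and the two SQL patterns come from the description above.

```python
import sqlite3

# Hypothetical stand-ins: the real tables live in Iceberg and are created by run21.sql.
TABLE = "rewrite_where_demo"
ROWS = [(i, f"name_{i}") for i in range(30)]  # the test expects exactly 30 rows


def bootstrap_append(conn):
    """Old pattern: CREATE TABLE IF NOT EXISTS, then an unconditional INSERT."""
    conn.execute(f"CREATE TABLE IF NOT EXISTS {TABLE} (id INTEGER, name TEXT)")
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?)", ROWS)


def bootstrap_reset(conn):
    """New pattern: DROP TABLE IF EXISTS, CREATE TABLE, then INSERT."""
    conn.execute(f"DROP TABLE IF EXISTS {TABLE}")
    conn.execute(f"CREATE TABLE {TABLE} (id INTEGER, name TEXT)")
    conn.executemany(f"INSERT INTO {TABLE} VALUES (?, ?)", ROWS)


def count_after(bootstrap, runs):
    """Run the given bootstrap `runs` times on one database and return COUNT(*)."""
    conn = sqlite3.connect(":memory:")
    for _ in range(runs):
        bootstrap(conn)
    (n,) = conn.execute(f"SELECT COUNT(*) FROM {TABLE}").fetchone()
    conn.close()
    return n


if __name__ == "__main__":
    print(count_after(bootstrap_append, 2))  # 60: the second run appended duplicates
    print(count_after(bootstrap_reset, 2))   # 30: the second run rebuilt the table
```

With the drop-and-recreate pattern the row count is the same no matter how many times the bootstrap runs, which is the property the regression case relies on.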

Release note

None

Check List (For Author)

Test
- Regression test
- Unit Test
- Manual test (add detailed scripts or steps below)
  - git diff --check -- docker/thirdparties/docker-compose/iceberg/scripts/create_preinstalled_scripts/iceberg/run21.sql
- No need to test or manual test. Explain why:
  - This is a refactor/code format and no logic has been changed.
  - Previous test can cover this change.
  - No code files have been changed.
  - Other reason
Behavior changed:
- No.
- Yes. The Iceberg bootstrap SQL now recreates the dedicated rewrite-where-condition test tables before loading fixed test data.
Does this need documentation?
- No.
- Yes.

Check List (For Reviewer who merge this PR)

Confirm the release note
Confirm test cases
Confirm document
Add branch pick label

hello-stephen · 2026-05-26T08:08:00Z

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

What problem was fixed (it's best to include specific error reporting information). How it was fixed.
Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
What features were added. Why was this function added?
Which code was refactored and why was this part of the code refactored?
Which functions were optimized and what is the difference before and after the optimization?

suxiaogang223 · 2026-05-26T08:41:53Z

run buildall

Gabriel39 · 2026-05-27T02:18:09Z

/review

github-actions

Review Summary
No blocking issues found. The change is small and focused: the dedicated Iceberg bootstrap tables are dropped before recreation so rerunning the Spark init SQL cannot append duplicate rows and break the regression test count assumptions.

Critical checkpoint conclusions:

Goal/test proof: The goal is to make run21.sql idempotent for test_iceberg_rewrite_data_files_where_conditions; the code accomplishes this by resetting the three dedicated tables before fixed inserts. Existing regression test assertions still cover the expected 30-row state, though the PR only reports git diff --check as manual validation.
Scope/focus: Minimal and consistent with neighboring Iceberg bootstrap scripts that use DROP TABLE IF EXISTS followed by CREATE TABLE.
Concurrency/lifecycle: No runtime concurrency or special lifecycle concerns; this is bootstrap SQL executed during external test environment initialization.
Configuration/compatibility/protocol: No configuration, serialization, FE-BE protocol, or storage format compatibility changes.
Parallel paths: All three tables used by the related regression case were updated consistently.
Conditional checks: No new conditional logic beyond standard DROP TABLE IF EXISTS semantics.
Test coverage/results: No test result files are modified. The existing regression test remains the functional coverage; no additional issue found in the changed script.
Observability/transactions/data writes: No Doris transaction-path or observability changes. The script writes external Iceberg test fixture data only.
Performance: No meaningful performance concern; the affected data set is small and bootstrap-only.

Focus points: No additional user-provided review focus was supplied.

github-actions · 2026-05-27T04:14:15Z

OpenCode automated review failed and did not complete.

Error: Review step was failure (possibly timeout or cancelled)
Workflow run: https://github.com/apache/doris/actions/runs/26486763238

Please inspect the workflow logs and rerun the review after the underlying issue is resolved.

zclllyybb · 2026-05-27T10:07:20Z

skip buildall

github-actions · 2026-05-27T10:17:17Z

PR approved by at least one committer and no changes requested.

github-actions · 2026-05-27T10:17:20Z

PR approved by anyone and no changes requested.

Fix Iceberg rewrite where bootstrap data

4482176

suxiaogang223 changed the title ~~[codex] Fix Iceberg rewrite where bootstrap data~~ [fix](regression) Make Iceberg rewrite where init script idempotent May 26, 2026

suxiaogang223 marked this pull request as ready for review May 26, 2026 08:37

github-actions Bot reviewed May 27, 2026

Gabriel39 approved these changes May 27, 2026

github-actions Bot added the approved Indicates a PR has been approved by one committer. label May 27, 2026

github-actions Bot added the reviewed label May 27, 2026

Gabriel39 merged commit 5fae02f into apache:master May 27, 2026
34 of 35 checks passed

Gabriel39 added dev/4.0.x dev/4.1.x labels May 27, 2026

This was referenced May 27, 2026

branch-4.0: [fix](regression) Make Iceberg rewrite where init script idempotent #63673 #63752

Merged

branch-4.1: [fix](regression) Make Iceberg rewrite where init script idempotent #63673 #63753

Merged

morningman pushed a commit that referenced this pull request May 28, 2026

branch-4.0: [fix](regression) Make Iceberg rewrite where init script …

f4d5837

…idempotent #63673 (#63752) Cherry-picked from #63673 Co-authored-by: Socrates <[email protected]>

morningman added dev/4.0.6-merged and removed dev/4.0.x labels May 28, 2026

yiguolei pushed a commit that referenced this pull request May 28, 2026

branch-4.1: [fix](regression) Make Iceberg rewrite where init script …

d4d1b5f

…idempotent #63673 (#63753) Cherry-picked from #63673 Co-authored-by: Socrates <[email protected]>

yiguolei added dev/4.1.2-merged and removed dev/4.1.x labels May 28, 2026

Gabriel39 merged 1 commit into apache:master from suxiaogang223:codex/fix-iceberg-rewrite-where-bootstrap

6 participants
