fix: The limit_pushdown physical optimization rule removes limits in some cases leading to incorrect results#20048

Merged
thinkharderdev merged 4 commits into apache:main from coralogix:fix-limit-pushdown-fetch
Jan 30, 2026

Conversation

@masonh22 (Contributor) commented Jan 28, 2026

Which issue does this PR close?

None

Rationale for this change

Bug 1: When pushing down limits, we recurse down the physical plan accumulating limits until we reach a node where we can't push the limit down further. At this point, we insert another limit executor (or push it into the current node, if that node supports it). After this, we continue recursing to try to find more limits to push down. If we do find another, we remove it, but we don't set the GlobalRequirements::satisfied field back to false, meaning we don't always re-insert this limit.
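Bug 1 can be modeled with a standalone sketch (the `Node` enum and `push_down` function below are illustrative stand-ins, not DataFusion's actual types): each removed limit must reset the `satisfied` flag so it gets re-inserted at the next node it cannot pass through.

```rust
// Hypothetical, simplified model of the pushdown loop. Plans are written
// top-to-bottom; a removed Limit becomes "pending" and must be re-inserted
// at the next Barrier.
#[derive(Debug, PartialEq)]
enum Node {
    Limit(usize), // a limit with this fetch
    Barrier,      // a node the limit cannot be pushed through
    Scan,
}

fn push_down(plan: &[Node]) -> Vec<Node> {
    let mut out = Vec::new();
    let mut pending: Option<usize> = None;
    let mut satisfied = true;
    for node in plan {
        match node {
            Node::Limit(f) => {
                pending = Some(*f);
                // The essence of the bug-1 fix: reset `satisfied` for every
                // limit we remove, even after an earlier limit was already
                // re-inserted; otherwise this limit is silently dropped.
                satisfied = false;
            }
            Node::Barrier => {
                if !satisfied {
                    if let Some(f) = pending.take() {
                        out.push(Node::Limit(f));
                    }
                    satisfied = true;
                }
                out.push(Node::Barrier);
            }
            Node::Scan => out.push(Node::Scan),
        }
    }
    out
}

fn main() {
    // Limit(10) above a barrier, Limit(5) below it: both must survive.
    let plan = [
        Node::Limit(10),
        Node::Barrier,
        Node::Limit(5),
        Node::Barrier,
        Node::Scan,
    ];
    let optimized = push_down(&plan);
    assert_eq!(
        optimized,
        vec![
            Node::Limit(10),
            Node::Barrier,
            Node::Limit(5),
            Node::Barrier,
            Node::Scan
        ]
    );
    println!("{optimized:?}");
}
```

Without the reset, the second `Limit(5)` would be removed while `satisfied` was still true from the first insertion, and never put back.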

Bug 2: When we're pushing down a limit with a skip/offset and no fetch/limit and we run into a node that supports fetch, we set GlobalRequirements::satisfied to true. This is wrong: the limit is not satisfied, because fetch doesn't support skip/offset. Instead, we should set GlobalRequirements::satisfied to true only if skip/offset is 0.
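Bug 2 can be illustrated with plain iterators (a hypothetical sketch; `apply_fetch_only` and `apply_skip_and_fetch` are illustrative helpers, not DataFusion APIs): a fetch-only node keeps the first N rows and cannot express the skip, so marking the requirement satisfied drops the offset entirely.

```rust
// Hypothetical sketch: a fetch-only node keeps the first `fetch` rows and
// cannot express "skip the first `skip` rows".
fn apply_fetch_only(rows: &[i64], fetch: usize) -> Vec<i64> {
    rows.iter().copied().take(fetch).collect()
}

// Correct LIMIT/OFFSET semantics: skip first, then fetch.
fn apply_skip_and_fetch(rows: &[i64], skip: usize, fetch: usize) -> Vec<i64> {
    rows.iter().copied().skip(skip).take(fetch).collect()
}

fn main() {
    let rows = vec![10, 20, 30, 40, 50];
    let (skip, fetch) = (3, 2); // e.g. OFFSET 3 LIMIT 2

    // If `satisfied` is wrongly set to true after pushing only the fetch,
    // the skip is never applied and the wrong rows come back:
    let wrong = apply_fetch_only(&rows, fetch);
    let correct = apply_skip_and_fetch(&rows, skip, fetch);

    assert_eq!(wrong, vec![10, 20]);
    assert_eq!(correct, vec![40, 50]);
    // Hence `satisfied` may only be set to true here when skip == 0.
    println!("wrong: {wrong:?}, correct: {correct:?}");
}
```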

What changes are included in this PR?

This includes a one-line change to the push down limit logic that fixes the issue.

Are these changes tested?

I added a test that replicates the issue and fails without this change.

Are there any user-facing changes?

No

@github-actions github-actions bot added optimizer Optimizer rules core Core DataFusion crate labels Jan 28, 2026
@masonh22 (Contributor Author) commented:

I'm looking into the CI failures. I guess I forgot to run the tests before making the PR 🙃

From what I can tell so far, it looks like my change improves/fixes things in the tests it breaks

@avantgardnerio (Contributor) left a comment:


From what I can tell, the tests are wrong and this was behaving incorrectly before.

I'd like to see copy & pasted results with and without the optimizer rule:

  1. if they don't match prior to this PR, we have an issue we may need to hotfix in prior versions
  2. if they do match after this PR is applied, I approve of this PR

@masonh22 force-pushed the fix-limit-pushdown-fetch branch from a5f6ad4 to 5ade46f on January 28, 2026 at 17:05
@masonh22 (Contributor Author) commented:

Found one more bug. I updated the PR description and added a test + fix for that bug.

@github-actions github-actions bot added the sqllogictest SQL Logic Tests (.slt) label Jan 28, 2026
Comment on lines -709 to +710
3 99 82
3 99 79
3 98 79
3 97 96
@masonh22 (Contributor Author) commented:


The previous result was wrong. Look at the query directly above this one: it is identical except that this query adds OFFSET 3 LIMIT 2. None of the rows in the previous expected results appear in the rows returned by that query, and the new expected rows I added are just rows 4-5 of that query.

What was happening here is the inner OFFSET/LIMITs were being removed by the physical optimizer rule.
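The semantics can be checked outside DataFusion with plain iterators (`skip_take` and the data below are hypothetical, not the actual test table): dropping the inner OFFSET/LIMIT makes the outer OFFSET 3 LIMIT 2 select different rows.

```rust
// Plain-iterator model: apply an OFFSET as `skip` and a LIMIT as `take`.
fn skip_take(rows: &[i64], skip: usize, take: usize) -> Vec<i64> {
    rows.iter().copied().skip(skip).take(take).collect()
}

fn main() {
    let rows: Vec<i64> = (1..=10).collect();

    // Inner query: OFFSET 2 LIMIT 5 -> [3, 4, 5, 6, 7];
    // outer query: OFFSET 3 LIMIT 2 on top of that -> [6, 7].
    let inner = skip_take(&rows, 2, 5);
    let correct = skip_take(&inner, 3, 2);

    // If the optimizer removes the inner OFFSET/LIMIT, the outer limit is
    // applied to the full input instead and returns different rows.
    let buggy = skip_take(&rows, 3, 2);

    assert_eq!(correct, vec![6, 7]);
    assert_eq!(buggy, vec![4, 5]);
    println!("correct: {correct:?}, buggy: {buggy:?}");
}
```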

@masonh22 (Contributor Author) commented:


Also, when I disable the limit_pushdown rule, I get the new results I added here

17)------BoundedWindowAggExec: wdw=[lead(b.c1,Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING: Field { "lead(b.c1,Int64(1)) ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING": nullable Int64 }, frame: ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING], mode=[Sorted]
18)--------ProjectionExec: expr=[1 as c1]
19)----------PlaceholderRowExec
04)------GlobalLimitExec: skip=0, fetch=3
@masonh22 (Contributor Author) commented:


I think this is a better plan than before: GlobalLimitExecs are pushed into the union, and before they were not pushed below CoalescePartitionsExec.

@masonh22 changed the title from "fix: The limit_pushdown physical optimization rule removes nested limits in some cases" to "fix: The limit_pushdown physical optimization rule removes limits in some cases leading to incorrect results" on Jan 28, 2026
@avantgardnerio avantgardnerio self-requested a review January 28, 2026 18:10
@avantgardnerio (Contributor) left a comment:


Nice find, I think you fixed a long standing bug. @alamb do we need to backport this to any previous releases, given that they were producing incorrect results?

@alamb (Contributor) commented Jan 29, 2026

> Nice find, I think you fixed a long standing bug. @alamb do we need to backport this to any previous releases, given that they were producing incorrect results?

What I think we should do is file a ticket with some example queries where this bug produces incorrect results. I think such queries will help us understand the impact, and depending on that we can decide whether to backport this change.

@alamb (Contributor) left a comment:


Thank you @masonh22 and @avantgardnerio -- this makes sense to me.

I also made a small follow-on PR to update the tests to use insta

let after_optimize =
LimitPushdown::new().optimize(outer_limit, &ConfigOptions::new())?;
let expected = [
"GlobalLimitExec: skip=1, fetch=3",
@alamb (Contributor) commented:


Without the code change in this PR, the actual plan output looks like

      "SortExec: TopK(fetch=4), expr=[c1@0 ASC], preserve_partitioning=[false]"
      "  EmptyExec"

Note that the offset (aka the skip) was dropped.

@thinkharderdev thinkharderdev added this pull request to the merge queue Jan 30, 2026
Merged via the queue into apache:main with commit 2860ada Jan 30, 2026
33 checks passed
masonh22 added a commit to coralogix/arrow-datafusion that referenced this pull request Jan 30, 2026
github-merge-queue bot pushed a commit that referenced this pull request Jan 30, 2026
## Which issue does this PR close?

- Follow-on to #20048



## Rationale for this change

While reviewing #20048 and verifying test coverage, it was hard for me to see the test differences (because the formatting was not great)

## What changes are included in this PR?

Port the tests to use insta rather than `assert_eq`
## Are these changes tested?
Yes, only tests

masonh22 added a commit to coralogix/arrow-datafusion that referenced this pull request Feb 2, 2026
…some cases leading to incorrect results (apache#20048) (#394)

dispanser pushed a commit to coralogix/arrow-datafusion that referenced this pull request Feb 3, 2026
…some cases leading to incorrect results (apache#20048) (#394)

madesroches added a commit to madesroches/micromegas that referenced this pull request Feb 12, 2026
The LimitPushdown physical optimizer rule removes GlobalLimitExec
without pushing the fetch into DataSourceExec, silently ignoring
LIMIT on projected MemTable queries. Disable the rule until the
upstream fix (apache/datafusion#20048) is released in 52.2.

Also unpack dictionary-encoded columns at registration time and
update LocalQuery default queries to use log_entries.
dispanser pushed a commit to coralogix/arrow-datafusion that referenced this pull request Feb 19, 2026
…some cases leading to incorrect results (apache#20048) (#394)

---
[Cherry-pick summary: v46→v47]
Source commit: 9e4cda9 (fix: limit_pushdown removes limits incorrectly (apache#20048) (#394))
Strategy: cherry-picked cleanly
Upstream PR: apache#20048 (not in v47)
Test coverage: adequate (adds 2 regression tests for the two bugs fixed)
Tests: cargo nextest run -p datafusion-physical-optimizer passed

Labels

core (Core DataFusion crate), optimizer (Optimizer rules), sqllogictest (SQL Logic Tests (.slt))


4 participants