[Enhancement] introduce a function to obtain the column size #62481
Co-authored-by: huanmingwong <huanmingwong@gmail.com>
Signed-off-by: Murphy <mofei@starrocks.com>
[Java-Extensions Incremental Coverage Report] ✅ pass : 0 / 0 (0%)
[FE Incremental Coverage Report] ✅ pass : 20 / 20 (100.00%)
[BE Incremental Coverage Report] ✅ pass : 51 / 51 (100.00%)
@Mergifyio backport branch-4.0
✅ Backports have been created
Co-authored-by: Cursor Agent <cursoragent@cursor.com> (cherry picked from commit 347b0b1)
Why I'm doing:
This PR resolves issue #60535 by adding the ability to query a column's size and compressed size via `_META_` scans. This gives users per-column estimates of data storage.

What I'm doing:
- Add the `column_size(col)` and `column_compressed_size(col)` built-in functions.
- Implement the collection logic in `SegmentMetaCollecter` and `OlapMetaReader`.
- `column_size` uses `ColumnMetaPB.total_mem_footprint()` as a proxy for the uncompressed size.
- `column_compressed_size` sums the data page sizes by iterating through the ordinal page indexes.
- Update the optimizer rules (`PushDownAggToMetaScanRule`, `RewriteSimpleAggToMetaScanRule`) to support pushing down `SUM(column_size(col))` and `SUM(column_compressed_size(col))` to meta scans.

Usage:
SELECT column_size(col) FROM t [_META_];
SELECT column_compressed_size(col) FROM t [_META_];
SELECT sum(column_size(col)) FROM t [_META_];
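
To illustrate how the two values are derived, here is a minimal Python sketch of the aggregation described above. It is not the BE implementation: `PagePointer`, `ColumnMeta`, and the field layout are hypothetical stand-ins for the segment metadata, and only the summation logic mirrors the description.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class PagePointer:
    """Hypothetical stand-in for one ordinal index entry (one data page)."""
    offset: int  # byte offset of the page within the segment file
    size: int    # on-disk size of the page in bytes


@dataclass(frozen=True)
class ColumnMeta:
    """Hypothetical, simplified per-segment metadata for a single column."""
    total_mem_footprint: int          # used as the uncompressed size proxy
    ordinal_index: List[PagePointer]  # one entry per data page


def column_size(segments: Iterable[ColumnMeta]) -> int:
    """Sum the uncompressed size proxy over all segments."""
    return sum(meta.total_mem_footprint for meta in segments)


def column_compressed_size(segments: Iterable[ColumnMeta]) -> int:
    """Sum the on-disk size of every data page listed in the ordinal indexes."""
    return sum(page.size for meta in segments for page in meta.ordinal_index)


# Example: one column stored in two segments.
segments = [
    ColumnMeta(total_mem_footprint=1000,
               ordinal_index=[PagePointer(0, 100), PagePointer(100, 150)]),
    ColumnMeta(total_mem_footprint=500,
               ordinal_index=[PagePointer(0, 80)]),
]
```

In this sketch, `column_size(segments)` adds up the footprints (1000 + 500) and `column_compressed_size(segments)` adds up the page sizes (100 + 150 + 80). Both values are estimates read from metadata, not exact on-disk accounting.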
Fixes #60535