@zygoloid zygoloid commented Jun 4, 2025

We already go to some effort to avoid moving suspended function states, but we still end up moving them twice: once when adding them to the worklist and again when reversing a chunk of the worklist.

  • To avoid a move when constructing the worklist, add an EmplaceResult utility that allows the result of a function call to be emplaced into a container.
  • To avoid moves when reversing the list, stop reversing it. Instead of reversing the list and popping tasks as we run them, we accumulate a sequence of tasks for a deferred definition region, run them in the order they were enqueued, then pop them all at the end. This will in some cases increase the high-water-mark of the size of the worklist, but not asymptotically. The same high-water-mark could be reached with the old approach by reordering the declarations in the source file.
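As a rough illustration of the first bullet, here is a minimal sketch of how a wrapper with a conversion operator can let a container emplace the result of a function call. The names `EmplaceResultSketch`, `Tracked`, and `MakeTracked` are hypothetical and are not taken from this PR; the sketch also assumes the element type has no greedy converting constructor template, and that the compiler elides the copy when direct-initializing from a conversion function (mainstream compilers do).

```cpp
#include <cassert>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the utility described above; the name and details
// are assumptions, not the code from this PR.
template <typename MakeFnT>
class EmplaceResultSketch {
 public:
  using ResultT = std::invoke_result_t<MakeFnT>;

  explicit EmplaceResultSketch(MakeFnT make_fn)
      : make_fn_(std::move(make_fn)) {}

  // The container initializes the new element from this object, which calls
  // this conversion. The returned prvalue can then initialize the element's
  // storage directly, with no intermediate temporary.
  operator ResultT() && { return std::move(make_fn_)(); }

 private:
  MakeFnT make_fn_;
};

// Illustrative element type that counts how often it is move-constructed.
struct Tracked {
  explicit Tracked(int v) : value(v) {}
  Tracked(Tracked&& other) noexcept : value(other.value) { ++moves; }
  int value;
  static inline int moves = 0;
};

Tracked MakeTracked(int v) { return Tracked(v); }

// Passing the function result directly: a temporary is created for the
// argument and then moved into the vector's storage.
int CountMovesWithPlainEmplace() {
  Tracked::moves = 0;
  std::vector<Tracked> items;
  items.reserve(1);
  items.emplace_back(MakeTracked(7));
  assert(items.back().value == 7);
  return Tracked::moves;
}

// Passing the wrapper: the function is called only once the vector is
// constructing the element in place.
int CountMovesWithEmplaceResult() {
  Tracked::moves = 0;
  std::vector<Tracked> items;
  items.reserve(1);
  items.emplace_back(EmplaceResultSketch([] { return MakeTracked(7); }));
  assert(items.back().value == 7);
  return Tracked::moves;
}
```

Because the callable runs while the container is in the middle of inserting, it should not touch the container it is being emplaced into.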

In passing, we no longer create LeaveDeferredDefinitionRegion tasks for non-nested regions. We don't need them, because we can detect that condition by reaching the end of the worklist. This means that the enter / leave region actions are now always in correspondence -- we only create them for nested regions. The tasks have been renamed to convey this.
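The run-in-order-then-truncate idea can be sketched roughly as follows. This is a simplified model, not the code in this PR: `WorklistSketch` and its members are hypothetical, tasks are plain strings, and nested regions and suspended function states are not modeled. The end of a non-nested region is detected simply by reaching the end of the worklist.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplification of a deferred definition worklist.
class WorklistSketch {
 public:
  // Starts a region and remembers where its tasks begin.
  void EnterRegion() { chunk_starts_.push_back(worklist_.size()); }

  void Enqueue(std::string task) { worklist_.push_back(std::move(task)); }

  // Runs the region's tasks in the order they were enqueued, then discards
  // them all at once. Nothing is reversed, so no task is moved here.
  void RunAndLeaveRegion(std::vector<std::string>& log) {
    std::size_t first = chunk_starts_.back();
    // Index-based loop: reaching the end of the worklist ends the region.
    for (std::size_t i = first; i != worklist_.size(); ++i) {
      log.push_back(worklist_[i]);  // "Running" a task just logs it.
    }
    worklist_.erase(worklist_.begin() + static_cast<std::ptrdiff_t>(first),
                    worklist_.end());
    chunk_starts_.pop_back();
  }

  std::size_t size() const { return worklist_.size(); }

 private:
  std::vector<std::string> worklist_;
  std::vector<std::size_t> chunk_starts_;
};

// Enqueues three tasks in one region and returns the order they ran in.
std::vector<std::string> RunExample() {
  WorklistSketch worklist;
  std::vector<std::string> log;
  worklist.EnterRegion();
  worklist.Enqueue("F");
  worklist.Enqueue("G");
  worklist.Enqueue("H");
  worklist.RunAndLeaveRegion(log);
  assert(worklist.size() == 0);
  return log;
}
```

The trade-off is the one the description names: tasks stay in the worklist until the whole region has run, so the worklist can be larger at its peak than with pop-as-you-go.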

We still move the suspended function states around if the worklist grows to over 64 entries and gets reallocated. We could potentially address that issue too by switching to a chunked allocation strategy, as is used by ValueStore, and then making the tasks noncopyable, but I'm not attempting that in this PR.
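For reference, a chunked allocation strategy might look something like the sketch below. This is only an illustration of the general technique under assumed names (`ChunkedStoreSketch`, `Pinned`); it is not how `ValueStore` is implemented. Elements live in fixed-size chunks that are never reallocated, so growing the store moves only the chunk pointers and the element type can be noncopyable and immovable.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <new>
#include <utility>
#include <vector>

// Hypothetical chunked container: elements never move once constructed.
template <typename T, std::size_t ChunkSize = 64>
class ChunkedStoreSketch {
 public:
  ChunkedStoreSketch() = default;
  ChunkedStoreSketch(const ChunkedStoreSketch&) = delete;
  ChunkedStoreSketch& operator=(const ChunkedStoreSketch&) = delete;

  ~ChunkedStoreSketch() {
    // Destroy elements in reverse order of construction.
    for (std::size_t i = size_; i-- > 0;) {
      (*this)[i].~T();
    }
  }

  template <typename... Args>
  T& Emplace(Args&&... args) {
    // Allocate a new chunk only when every existing slot is in use.
    if (size_ == chunks_.size() * ChunkSize) {
      chunks_.push_back(std::make_unique<Chunk>());
    }
    T* element = ::new (Slot(size_)) T(std::forward<Args>(args)...);
    ++size_;
    return *element;
  }

  T& operator[](std::size_t i) {
    return *std::launder(static_cast<T*>(Slot(i)));
  }

  std::size_t size() const { return size_; }

 private:
  struct Chunk {
    alignas(T) std::byte storage[sizeof(T) * ChunkSize];
  };

  void* Slot(std::size_t i) {
    return chunks_[i / ChunkSize]->storage + sizeof(T) * (i % ChunkSize);
  }

  std::vector<std::unique_ptr<Chunk>> chunks_;
  std::size_t size_ = 0;
};

// Illustrative element type that can be neither copied nor moved.
struct Pinned {
  explicit Pinned(int v) : value(v) {}
  Pinned(const Pinned&) = delete;
  Pinned& operator=(const Pinned&) = delete;
  int value;
};

// Grows the store past several chunk boundaries and checks that the first
// element is still at its original address.
bool AddressesStayStable() {
  ChunkedStoreSketch<Pinned, 2> store;
  Pinned* first = &store.Emplace(0);
  for (int i = 1; i < 5; ++i) {
    store.Emplace(i);
  }
  return first == &store[0] && store[4].value == 4 && store.size() == 5;
}
```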

namespace Carbon {

// A utility to use when calling an `emplace` function to emplace the result of
Contributor

I am thoroughly confused how this is changing the behaviour of emplace, could you explain it a bit here?

Contributor Author

Extended the comment to explain why this works.

// Worklist is empty: discard the worklist items associated with this
// chunk, and leave the scope.
worklist_.truncate(chunks_.back().first_worklist_index);
context_->decl_name_stack().PopScope();
Contributor

Where is the paired push for this pop? I don't see it in this file, so it seems sufficiently non-local that it could use a comment explaining.

Contributor Author

Added a comment. (I'm not really a fan of the division of responsibility here -- some of the scope pushes / pops / suspends / resumes are in the worklist and some are here -- but that's a pre-existing problem that I don't have a good solution to yet. There's probably a better way to factor this functionality.)

// container has made space for the new element, it should not inspect or modify
// the container that is being emplaced into.
template <typename MakeFnT>
class EmplaceResult {
Contributor

nit: I might call this EmplaceConstruct as it's what the job of the thing is, and it's not the result of the emplacing operation. Up to you.

Contributor Author

I'm not tied to this particular name, but EmplaceConstruct seems to be missing some important information -- a reference to the fact that it's calling a function and emplacing the result of that function call. I'd also be happy with things like EmplaceByCalling(callable) or EmplaceResultOf(callable) that avoid the possibility of this name being interpreted as "(the) emplace(ment) result" instead of "emplace (the) result (of)".

Contributor

Both of those would be fine with me, yeah.

Contributor Author

@zygoloid zygoloid Jun 9, 2025

OK, I'll go with EmplaceByCalling -- on further thought I think EmplaceResultOf sounds too much like a type trait analogous to std::result_of.

[Edit: now done.]

worklist_.push_back(
    LeaveDeferredDefinitionScope{.in_deferred_definition_scope = true});
CARBON_VLOG("{0}Push LeaveDeferredDefinitionScope (nested)\n", VlogPrefix);
worklist_.emplace_back(LeaveNestedDeferredDefinitionScope{});
Contributor

I think we still want to push_back unless we're using the EmplaceResult tool, don't we?

Contributor Author

When adding an element of the same type to a vector, yeah, we should be using push_back rather than emplace_back. But the element type of the worklist is a variant, not LeaveNestedDeferredDefinitionScope. We want an emplace_back not a push_back here so that we pass in an (empty) LeaveNestedDeferredDefinitionScope and the vector calls the variant converting constructor, rather than constructing a (large but almost entirely uninitialized) variant instance on the stack here and a variant copy in the vector push_back logic.
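A small sketch of the difference, using hypothetical stand-in types rather than the real task types: `LeaveTask` counts its own move constructions as a proxy for how many times an object holding it gets constructed, and `BigTask` stands in for the large alternative that makes a temporary variant expensive.

```cpp
#include <cassert>
#include <variant>
#include <vector>

// Hypothetical small task; counts move constructions for illustration.
struct LeaveTask {
  LeaveTask() = default;
  LeaveTask(LeaveTask&&) noexcept { ++moves; }
  static inline int moves = 0;
};

// Hypothetical large task, standing in for a suspended function state.
struct BigTask {
  char payload[256];
};

using Task = std::variant<BigTask, LeaveTask>;

// push_back needs a Task argument, so a temporary variant is built from the
// LeaveTask first and then moved into the vector's storage.
int MovesWithPushBack() {
  LeaveTask::moves = 0;
  std::vector<Task> worklist;
  worklist.reserve(1);
  worklist.push_back(LeaveTask{});
  return LeaveTask::moves;
}

// emplace_back forwards the LeaveTask, and the variant's converting
// constructor runs directly in the vector's storage.
int MovesWithEmplaceBack() {
  LeaveTask::moves = 0;
  std::vector<Task> worklist;
  worklist.reserve(1);
  worklist.emplace_back(LeaveTask{});
  return LeaveTask::moves;
}
```

In this sketch `MovesWithPushBack()` observes two moves of the alternative (one into the temporary variant, one when that variant is moved into the vector), while `MovesWithEmplaceBack()` observes one.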

Comment on lines 66 to 67
// If we've not found any deferred definitions in this scope, clean up the
// stack.
Contributor

Can we add a bit of explanation about the == size() vs == size()-1?

I was going to suggest something around PushEnterDeferredDefinitionScope pushing to worklist_ only in the nested case, but here the size is smaller in the nested case, so it's more complicated/different than that. Why does == size() mean non-nested and == size()-1 mean nested here?

Contributor Author

Changed the logic to explicitly use the nested flag instead of effectively recomputing it, and extended the comment to explain what's happening.

@zygoloid zygoloid force-pushed the common-emplace-result branch from e57ffe7 to ca6d18f Compare June 9, 2025 20:22
Add a comment explaining why we're popping a scope we didn't push.
@zygoloid zygoloid requested a review from danakj June 9, 2025 20:32
Contributor

@danakj danakj left a comment

Thanks, LGTM

@danakj danakj added this pull request to the merge queue Jun 10, 2025
Merged via the queue into carbon-language:trunk with commit 6753a71 Jun 10, 2025
8 checks passed
@zygoloid zygoloid deleted the common-emplace-result branch June 10, 2025 23:14
bricknerb pushed a commit to bricknerb/carbon-lang that referenced this pull request Jun 11, 2025
…efinition worklist. (carbon-language#5608)

Co-authored-by: Dana Jansens <danakj@orodu.net>
danakj added a commit to danakj/carbon-lang that referenced this pull request Jun 11, 2025
…efinition worklist. (carbon-language#5608)
