Avoid moving around large suspended function states in the deferred definition worklist. #5608

zygoloid · 2025-06-04T20:06:25Z

We already go to some effort to avoid moving these, but we end up still moving them twice: once when adding to the worklist and again when reversing a chunk of the worklist.

* To avoid a move when constructing the worklist, add an `EmplaceResult` utility that allows the result of a function call to be emplaced into a container.
* To avoid moves when reversing the list, stop reversing it. Instead of reversing the list and popping tasks as we run them, we accumulate a sequence of tasks for a deferred definition region, run them in the order they were enqueued, then pop them all at the end. This will in some cases increase the high-water-mark of the size of the worklist, but not asymptotically. The same high-water-mark could be reached with the old approach by reordering the declarations in the source file.

In passing, we no longer create LeaveDeferredDefinitionRegion tasks for non-nested regions. We don't need them, because we can detect that condition by our reaching the end of the worklist. This means that the enter / leave region actions are now always in correspondence -- we only create them for nested regions. The tasks have been renamed to convey this.

We still move the suspended function states around if the worklist grows to over 64 entries and gets reallocated. We could potentially address that issue too by switching to a chunked allocation strategy as is used by ValueStore and then make the tasks noncopyable, but I'm not attempting that in this PR.
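
For readers following along, the "run in enqueue order, then pop everything at the end" scheme described above can be sketched roughly as follows. This is only an illustration: `RunRegionInOrder`, the use of `int` task ids, and the `first_index` parameter are assumptions made for the example, not the code in this PR.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical sketch: tasks are plain ints here; the real worklist holds a
// variant of task types. `first_index` marks where the current deferred
// definition region starts in the worklist.
std::vector<int> RunRegionInOrder(std::vector<int>& worklist,
                                  std::size_t first_index) {
  std::vector<int> run_order;
  // Visit the region's tasks front to back, in the order they were enqueued.
  // Nothing is reversed, so no element changes position inside the vector.
  for (std::size_t i = first_index; i < worklist.size(); ++i) {
    run_order.push_back(worklist[i]);
  }
  // Discard the whole region in one step once every task has run.
  worklist.erase(worklist.begin() + static_cast<std::ptrdiff_t>(first_index),
                 worklist.end());
  return run_order;
}
```

The trade-off is the one the description mentions: completed tasks stay in the vector until the region ends, so the peak size can be higher than when popping each task as it runs.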


common/emplace_result.h

common/emplace_result_test.cpp

danakj · 2025-06-04T21:52:14Z

common/emplace_result.h

+
+namespace Carbon {
+
+// A utility to use when calling an `emplace` function to emplace the result of


I am thoroughly confused how this is changing the behaviour of emplace, could you explain it a bit here?

Extended the comment to explain why this works.
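
As a rough illustration of the technique being discussed (not the actual contents of `common/emplace_result.h`), a wrapper of this kind typically relies on a conversion operator: the container forwards the wrapper to the element's initializer, and the conversion calls the function at that point. The names `EmplaceByCallingSketch`, `SuspendedStateSketch`, and `EmplaceDemo` below are hypothetical, and whether the final move is elided depends on the compiler's handling of conversion functions in direct-initialization.

```cpp
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical sketch of the idea; names and details are assumptions.
template <typename MakeFnT>
class EmplaceByCallingSketch {
 public:
  explicit EmplaceByCallingSketch(MakeFnT make_fn)
      : make_fn_(std::move(make_fn)) {}

  // The container's emplace forwards this wrapper into the element's
  // initializer, which triggers this conversion. The callable runs only
  // then, after the container has made space for the new element.
  operator std::invoke_result_t<MakeFnT&>() { return make_fn_(); }

 private:
  MakeFnT make_fn_;
};

// Stand-in for a large suspended function state.
struct SuspendedStateSketch {
  explicit SuspendedStateSketch(int id) : id(id) {}
  SuspendedStateSketch(SuspendedStateSketch&& other) : id(other.id) {
    // Counts moves so a caller can check whether any happened.
    ++move_count;
  }
  int id;
  static inline int move_count = 0;
};

int EmplaceDemo() {
  std::vector<SuspendedStateSketch> worklist;
  // Pass the wrapper, not a SuspendedStateSketch, to emplace_back.
  worklist.emplace_back(
      EmplaceByCallingSketch([] { return SuspendedStateSketch(42); }));
  return worklist.back().id;
}
```

Because the callable runs in the middle of the container's emplace operation, it should not inspect or modify the container being emplaced into, which matches the caveat in the header comment.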

Co-authored-by: Dana Jansens <danakj@orodu.net>

danakj · 2025-06-05T15:03:58Z

toolchain/check/node_id_traversal.cpp

+      // Worklist is empty: discard the worklist items associated with this
+      // chunk, and leave the scope.
+      worklist_.truncate(chunks_.back().first_worklist_index);
+      context_->decl_name_stack().PopScope();


Where is the paired push for this pop? I don't see it in this file, so it seems sufficiently non-local that it could use a comment explaining.

Added a comment. (I'm not really a fan of the division of responsibility here -- some of the scope pushes / pops / suspends / resumes are in the worklist and some are here -- but that's a pre-existing problem that I don't have a good solution to yet. There's probably a better way to factor this functionality.)

common/emplace_result.h

danakj · 2025-06-05T16:46:08Z

common/emplace_result.h

+// container has made space for the new element, it should not inspect or modify
+// the container that is being emplaced into.
+template <typename MakeFnT>
+class EmplaceResult {


nit: I might call this EmplaceConstruct as it's what the job of the thing is, and it's not the result of the emplacing operation. Up to you.

I'm not tied to this particular name, but EmplaceConstruct seems to be missing some important information -- a reference to the fact that it's calling a function and emplacing the result of that function call. I'd also be happy with things like EmplaceByCalling(callable) or EmplaceResultOf(callable) that avoid the possibility of this name being interpreted as "(the) emplace(ment) result" instead of "emplace (the) result (of)".

Both of those would be fine with me, yeah.

OK, I'll go with EmplaceByCalling -- on further thought I think EmplaceResultOf sounds too much like a type trait analogous to std::result_of.

[Edit: now done.]

danakj · 2025-06-05T16:48:24Z

toolchain/check/deferred_definition_worklist.cpp

-    worklist_.push_back(
-        LeaveDeferredDefinitionScope{.in_deferred_definition_scope = true});
-    CARBON_VLOG("{0}Push LeaveDeferredDefinitionScope (nested)\n", VlogPrefix);
+    worklist_.emplace_back(LeaveNestedDeferredDefinitionScope{});


I think we still want to push_back unless we're using the EmplaceResult tool, don't we?

When adding an element of the same type to a vector, yeah, we should be using push_back rather than emplace_back. But the element type of the worklist is a variant, not LeaveNestedDeferredDefinitionScope. We want an emplace_back not a push_back here so that we pass in an (empty) LeaveNestedDeferredDefinitionScope and the vector calls the variant converting constructor, rather than constructing a (large but almost entirely uninitialized) variant instance on the stack here and a variant copy in the vector push_back logic.
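
A small sketch may help make the distinction concrete. The types below are hypothetical stand-ins (the real task types are not shown in this thread); the point is only that `emplace_back` hands the small tag object to the variant's converting constructor inside the vector's storage.

```cpp
#include <array>
#include <cstddef>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the worklist task types.
struct LeaveNestedSketch {};
struct BigTaskSketch {
  std::array<char, 1024> state;
};
using TaskSketch = std::variant<LeaveNestedSketch, BigTaskSketch>;

std::size_t PushLeaveTask(std::vector<TaskSketch>& worklist) {
  // The vector forwards the empty tag object to the variant's converting
  // constructor, so the variant is built directly in the vector's storage.
  // With push_back, the argument would first be converted to a full
  // TaskSketch temporary, which the vector would then move or copy.
  worklist.emplace_back(LeaveNestedSketch{});
  // Returns the index of the active alternative of the new element.
  return worklist.back().index();
}
```

The variant is as large as its largest alternative, so avoiding the temporary matters even when the alternative being stored is empty.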

danakj · 2025-06-05T16:58:13Z

toolchain/check/deferred_definition_worklist.cpp

  // If we've not found any deferred definitions in this scope, clean up the
  // stack.


Can we add a bit of explanation about the ==size() vs == size()-1?

I was going to suggest something around PushEnterDeferredDefinitionScope pushing to worklist_ only in the nested case, but here the size is smaller in the nested case, so it's more complicated/different than that. Why does ==size() mean non-nested and ==size-1 mean nested here?

Changed the logic to explicitly use the nested flag instead of effectively recomputing it, and extended comment to explain what's happening.

Make worklist cleanup clearer.

Add a comment explaining why we're popping a scope we didn't push.

danakj

Thanks, LGTM


zygoloid requested a review from danakj June 4, 2025 20:06

github-actions bot added the toolchain label Jun 4, 2025

zygoloid mentioned this pull request Jun 4, 2025

Track pending thunks on the deferred definition worklist. #5609

Merged

Trailing return types.

8b369b3

danakj reviewed Jun 4, 2025


zygoloid and others added 2 commits June 4, 2025 15:10

Apply suggestions from code review

7d84ffa

Co-authored-by: Dana Jansens <danakj@orodu.net>

Add some commentary explaining how EmplaceResult works and some caveats.

e255e1d

danakj reviewed Jun 5, 2025


zygoloid added 2 commits June 9, 2025 19:33

Rephrase comment

7030630

Rename EmplaceResult -> EmplaceByCalling.

ca6d18f

Make worklist cleanup clearer.

zygoloid force-pushed the common-emplace-result branch from e57ffe7 to ca6d18f Compare June 9, 2025 20:22

Fix another case where we can avoid a copy.

af1351a

Add a comment explaining why we're popping a scope we didn't push.

zygoloid requested a review from danakj June 9, 2025 20:32

danakj approved these changes Jun 10, 2025


danakj added this pull request to the merge queue Jun 10, 2025

Merged via the queue into carbon-language:trunk with commit 6753a71 Jun 10, 2025
8 checks passed

zygoloid deleted the common-emplace-result branch June 10, 2025 23:14

