Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: Fix assertion due to remorph #110516

Closed
wants to merge 70 commits into from
Closed

Conversation

hez2010
Copy link
Contributor

@hez2010 hez2010 commented Dec 9, 2024

Extracted from #104906

We always set GTF_DEBUG_NODE_MORPHED in gtFoldExprCompare, gtFoldExprSpecial, gtFoldExprConditional and gtFoldExprHWIntrinsic if the tree is morphed, but not in gtFoldExprConst as it may creates a new tree that can be morphed again.
Especially, gtFoldExprSpecial can create a new COMMA containing a subtree of side effects, which leads to the assertion later when we try to remorph it, as newTree != oldTree can yield false positives for this. The subtree of COMMA is already morphed anyway so in this case we can avoid morphing it again by using a flag instead of relying on newTree != oldTree.

Fixes #107542

cc: @AndyAyersMS

@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Dec 9, 2024
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Dec 9, 2024
@hez2010
Copy link
Contributor Author

hez2010 commented Dec 9, 2024

No diffs

@jakobbotsch
Copy link
Member

gtFoldExprConst is already conditioned on fgGlobalMorph when it folds overflowing arithmetic into a helper call. Can we just invoke morph from there in that case? It's not elegant, but it seems more elegant than this fix.

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 9, 2024

The issue here is that gtFoldExprSpecial can create a new COMMA (returning from gtWrapWithSideEffects) while its subtree is already morphed, so that the tree returned by gtFoldExpr is not equal to neither op1 nor op2. Then tree->AsOp()->gtOp1 = fgMorphTree(tree->AsOp()->gtOp1); caused the tree to be morphed again which leads to the assertion.

Regarding to the fix, in global morph, do we really need to morph op1 again? I think it should already be morphed anyway.

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 9, 2024

Let's see how CI going

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 9, 2024

Okay it becomes Assertion failed '(tree->gtDebugFlags & GTF_DEBUG_NODE_MORPHED) && "ERROR: Did not morph this node!" now.

I think it should already be morphed anyway.

Seems that this is not true.

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 12, 2024

Can we just invoke morph from there in that case? It's not elegant, but it seems more elegant than this fix.

It seems that similar patterns can also be produced by gtFoldExprHWIntrinsic and it's really not obvious where should we put the invocation. I would like to leave it as is to keep the fix simple instead.

@jakobbotsch
Copy link
Member

Can we just invoke morph from there in that case? It's not elegant, but it seems more elegant than this fix.

It seems that similar patterns can also be produced by gtFoldExprHWIntrinsic and it's really not obvious where should we put the invocation. I would like to leave it as is to keep the fix simple instead.

What problem did you hit? gtFoldExprConst introduces a helper call in some cases. Morphing of a new helper call is very much necessary. But I don't see gtFoldExprHWIntrinsic introducing helper calls, and the point where it is invoked by morph makes me think that it should be unnecessary to call morph on anything in there.

@jakobbotsch
Copy link
Member

I opened #110658 as an example of what I meant. Does this fix the assertion for the case in the issue, and for your own case?

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 12, 2024

I opened #110658 as an example of what I meant. Does this fix the assertion for the case in the issue, and for your own case?

Thanks! I will check it and see whether it fixes the issue I hit or not.

stephentoub and others added 9 commits December 14, 2024 17:59
We had indirections set up where we'd `RuntimeExport` a method under a symbolic name and then `RuntimeImport` it elsewhere (or call the export from JIT-generated code).

Most of this was motivated by the old Redhawk layering where things like casting and interface dispatch were part of Runtime.Base and not part of CoreLib. Since we decided if we ever reintroduce a Runtime.Base, we wouldn't have casting in it, this indirection will no longer be needed.

Motivated by dotnet#110267 (and also the reason why I'm only doing it for `RuntimeExport`s used from the JIT right now). Once that PR gets in, calling methods through `RuntimeExport`s would actually become a bigger deoptimization than it is now.
This syncs the gcc version in toolchian file with what's in the newest container build.

Fixes dotnet#110517
* Fixing step becomes a go

* Trying to avoid step that becomes a go

* Adding comments and more protections.

* fixing comment

* Checking if removing this comments, CI failures are gone.

* Adding part of the changes to understand the failures on CI

* Update src/coreclr/debug/ee/controller.cpp

Co-authored-by: mikelle-rogers <[email protected]>

* Fixing wrong fix.

---------

Co-authored-by: mikelle-rogers <[email protected]>
…henNotSet (dotnet#110070)

Accept "localdomain" as a valid DomainName on Android.
…efined. (dotnet#110506)

* Fix build break for DATAS Stress Mode

* Removed stress heap dynamic
…tate (dotnet#110449)

Some methods have a nil token - for example, special runtime methods like array functions. When we tried to look up their IL version state, we were throwing an exception. Methods like this will have no versioning state, so check for a nil token and skip the lookup.
MichalStrehovsky and others added 14 commits December 14, 2024 17:59
Based on measurements on the ASP.NET perf infra (BasicWebApi and TodosApi), using the GuardCF libraries in non-guard context doesn't have measurable impact on throughput. The impact on size is negligible. We can just have it enabled unconditionally on the native runtime bits.
This adds an invariant that there always exists an "init BB" throughout
the JIT's phases. The init BB has the following properties:
- It is only executed once, so it does not have any predecessors
- It is not inside a try region, hence it dominates all other blocks in
  the function

There are no further requirements on the BB. The init BB does not have
to be `BBJ_ALWAYS` (unlike the old "scratch BB" concept). This is mainly
because it philosophically does not make sense to insert IR at the end
of the init BB, since earlier phases can have inserted arbitrary IR in
them.
…assificationDataType` (dotnet#110602)

- Add the different method desc types to the data descriptor
  - We only need their size right now
- Add tests for different method desc classifications
  - Mostly fill-in for things I found we didn't cover - `GetNativeCode_StableEntryPoint_NonVtableSlot` is the one that actually hits the updated code in this change
…tnet#110322)

* Add GetGCStressCodeCopy to ICodeVersions and IRuntimeTypeSystem
If we need to spill ref type args for a GDV, try to annotate the spilled
locals with type information.
…10700)

Found this while running the diagnostics repo SOS tests with the cDAC enabled. General sequence for the repro was:
```
!sethostruntime -none
!bpmd <firstLocation>
!bpmd <secondLocation>
g
g
!clrstack
```
Printed stack shows `<unknown>` for some method(s).

Between the first and second breakpoints, more methods were jitted and the corresponding code heap list updated. When a new method in the stack for `<secondLocation>` had the same code heap list as any method from `<firstLocation>`, we'd end up with a stale end address for the heap list and determine that the method was invalid (outside the address range).

The cdac was assuming that the target would be created every time the state changes, but that is not the case (for the repro above, `!sethostruntime -none` resulted in not re-creating the target). We need to handle `IXCLRDataProcess::Flush` calls to clear out any cached data.

With this change, the SOS tests with the cDAC enabled run successfully with both a Release and Debug cdacreader (on a Release runtime).
@AndyAyersMS
Copy link
Member

Likely obsoleted by #110787

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 18, 2024

Okay, I'm closing this PR.

@hez2010 hez2010 closed this Dec 18, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet