You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
I'm getting up to snuff on compiler backend details for a personal project and the regalloc2 overview and the follow on article have been great.
I'm not going to pretend to completely understand the details here, but the MachineEnv requires a scratch register for what I gather is typically swaps, symbolically like this.
let x = 4;let y = 5;let swap = x;
x = y;
y = swap;
However there is a way to swap two values without introducing a third variable/register which is via an XOR swap:
x = y ^ x;
y = x ^ y;
x = y ^ x;
You shouldn't do this in high level code but should be fine in machine code of course. If an ISA allows integer operations to be applied to float registers then this can be used for both classes.
I'm nearly certain you know of this already, but I wanted to get it out there for the off chance, and if nothing else get details on why it wouldn't work.
The text was updated successfully, but these errors were encountered:
Yes, indeed, we're aware of this trick; it's not implemented in RA2 for a combination of reasons:
There is a layered approach to performing register moves and using a dedicated scratch is the last resort; and actually the scratch is optional:
First, if there are no cycles in a set of moves (a "parallel move"), we can shift values along linear move chains without any temporary space.
Then, if there is a move cycle but there is any register in the right register class free, we can use that register (and we opportunistically look for one).
Then, we can borrow a register that doesn't participate in a move cycle by spilling its contents to a spare spillslot (which we dynamically create on need).
Then, finally, we can use a dedicated scratch; this last case is quite rare.
The three-XOR trick obfuscates the data flow from the point of view of register renaming hardware: while it can track physical values through moves and wire up instruction dependencies appropriately in the out-of-order window, arithmetic properties of XORs are beyond the reach of the high-speed move-detection circuitry in renaming (as far as I know). This thus reduces performance (relative to a sequence of true moves) with false dependencies.
On x86, we could use xchg; this is a more promising route if we have a two-register move cycle with no spare space. We haven't pursued this.
So, tl;dr -- not much need, and dubious performance implications. You're welcome to experiment with this of course if you find a benchmark where the need appears!
I'm getting up to snuff on compiler backend details for a personal project and the
regalloc2
overview and the follow on article have been great.I'm not going to pretend to completely understand the details here, but the
MachineEnv
requires a scratch register for what I gather is typically swaps, symbolically like this.However there is a way to swap two values without introducing a third variable/register which is via an XOR swap:
You shouldn't do this in high level code but should be fine in machine code of course. If an ISA allows integer operations to be applied to float registers then this can be used for both classes.
I'm nearly certain you know of this already, but I wanted to get it out there for the off chance, and if nothing else get details on why it wouldn't work.
The text was updated successfully, but these errors were encountered: