I'm testing #1076 with clang/test/CodeGen/tbaa.cpp, and I have noticed an inconsistency in the alignment of store instructions between the LLVM IR generated by clang and the one produced by ClangIR.
Here is a simplified example that demonstrates the difference:
    // demo.cc
    typedef unsigned char uint8_t;
    typedef unsigned short uint16_t;
    typedef unsigned int uint32_t;
    typedef unsigned long long uint64_t;

    typedef struct {
      uint16_t f16;
      uint32_t f32;
      uint16_t f16_2;
      uint32_t f32_2;
    } StructA;

    uint32_t g2(uint32_t *s, StructA *A, uint64_t count) {
      *s = 1;
      A->f16 = 4;
      return *s;
    }
./bin/clang++ -c demo.cc -Xclang -emit-llvm -o demo.orig.ll
The following is the LLVM IR output when compiled with clang:
    ; demo.orig.ll
    define dso_local noundef i32 @_Z2g2PjP7StructAy(ptr noundef %s, ptr noundef %A, i64 noundef %count) #0 {
    entry:
      %s.addr = alloca ptr, align 8
      %A.addr = alloca ptr, align 8
      %count.addr = alloca i64, align 8
      store ptr %s, ptr %s.addr, align 8
      store ptr %A, ptr %A.addr, align 8
      store i64 %count, ptr %count.addr, align 8
      %0 = load ptr, ptr %s.addr, align 8
      store i32 1, ptr %0, align 4
      %1 = load ptr, ptr %A.addr, align 8
      %f16 = getelementptr inbounds nuw %struct.StructA, ptr %1, i32 0, i32 0
      ; highlight: align 4
      store i16 4, ptr %f16, align 4
      %2 = load ptr, ptr %s.addr, align 8
      %3 = load i32, ptr %2, align 4
      ret i32 %3
    }
./bin/clang++ -c demo.cc -Xclang -emit-llvm -o demo.ll -fclangir
In contrast, the LLVM IR produced by ClangIR is as follows:
    ; demo.ll
    define dso_local i32 @_Z2g2PjP7StructAy(ptr %0, ptr %1, i64 %2) #0 {
      %4 = alloca ptr, i64 1, align 8
      %5 = alloca ptr, i64 1, align 8
      %6 = alloca i64, i64 1, align 8
      %7 = alloca i32, i64 1, align 4
      store ptr %0, ptr %4, align 8
      store ptr %1, ptr %5, align 8
      store i64 %2, ptr %6, align 8
      %8 = load ptr, ptr %4, align 8
      store i32 1, ptr %8, align 4
      %9 = load ptr, ptr %5, align 8
      %10 = getelementptr %struct.StructA, ptr %9, i32 0, i32 0
      ; highlight: align 2
      store i16 4, ptr %10, align 2
      %11 = load ptr, ptr %4, align 8
      %12 = load i32, ptr %11, align 4
      store i32 %12, ptr %7, align 4
      %13 = load i32, ptr %7, align 4
      ret i32 %13
    }
The significant difference lies in the alignment of the store to f16:

    store i16 4, ptr %f16, align 4   ; clang
    store i16 4, ptr %10, align 2    ; ClangIR
Interesting. align 2 is valid here, but I guess Clang is taking advantage of the fact that the struct pointer must be 4-byte aligned?
> I guess Clang is taking advantage of the fact that the struct pointer must be 4-byte aligned?
Indeed, another interesting example to showcase this:
    struct Foo {
      short a;
      short b;
      short c;
      short d;
      int e; // Make the struct 4-byte aligned
    };

    void test(Foo *ptr) {
      ptr->a = 1; // align 4
      ptr->b = 2; // align 2
      ptr->c = 3; // align 4
      ptr->d = 4; // align 2
    }
The alignments that the original clang emits for the four stores in test are 4, 2, 4, 2.
Yup, and it's also able to do clever things like (https://godbolt.org/z/hP8nbhara):
    struct Foo {
      short a;
      short b;
      short c;
      short d;
      long e; // Make the struct 8-byte aligned
    };

    void test(Foo *ptr) {
      ptr->a = 1; // align 8
      ptr->b = 2; // align 2
      ptr->c = 3; // align 4
      ptr->d = 4; // align 2
    }