Discussion:
[PATCH] x86: fix early boot crash on gcc-10
(too old to reply)
Peter Zijlstra
2020-03-16 13:04:14 UTC
Permalink
The change fixes boot failure on physical machine where kernel
This happens because `start_secondary()` is responsible for setting
up initial stack canary value in `smpboot.c`, but nothing prevents
gcc from inserting stack canary into `start_secondary()` itself
before `boot_init_stack_canary()` call.
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9b294c13809a..da9f4ea9bf4c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -11,6 +11,12 @@ extra-y += vmlinux.lds
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
+# smpboot's init_secondary initializes stack canary.
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
What makes GCC10 insert this while GCC9 does not. Also, I would much
rather GCC10 add a function attrbute to kill this:

__attribute__((no_stack_protect))

Then we can explicitly clear this one function and keep it on for the
rest of the file.
Jakub Jelinek
2020-03-16 13:26:48 UTC
Permalink
Post by Peter Zijlstra
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9b294c13809a..da9f4ea9bf4c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -11,6 +11,12 @@ extra-y += vmlinux.lds
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
+# smpboot's init_secondary initializes stack canary.
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
What makes GCC10 insert this while GCC9 does not. Also, I would much
My bet is different inlining decisions.
If somebody hands me over the preprocessed source + gcc command line, I can
have a look in detail (which exact change and why).
Post by Peter Zijlstra
__attribute__((no_stack_protect))
There is no such attribute, only __attribute__((stack_protect)) which is
meant mainly for -fstack-protector-explicit and does the opposite, or
__attribute__((optimize ("no-stack-protector"))) (which will work only
in GCC7+, since https://gcc.gnu.org/PR71585 changes).
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.

Jakub
Peter Zijlstra
2020-03-16 13:42:34 UTC
Permalink
Post by Jakub Jelinek
Post by Peter Zijlstra
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9b294c13809a..da9f4ea9bf4c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -11,6 +11,12 @@ extra-y += vmlinux.lds
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
+# smpboot's init_secondary initializes stack canary.
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
What makes GCC10 insert this while GCC9 does not. Also, I would much
My bet is different inlining decisions.
If somebody hands me over the preprocessed source + gcc command line, I can
have a look in detail (which exact change and why).
Post by Peter Zijlstra
__attribute__((no_stack_protect))
There is no such attribute,
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
Post by Jakub Jelinek
only __attribute__((stack_protect)) which is
meant mainly for -fstack-protector-explicit and does the opposite, or
__attribute__((optimize ("no-stack-protector"))) (which will work only
in GCC7+, since https://gcc.gnu.org/PR71585 changes).
Cute..
Post by Jakub Jelinek
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
Borislav Petkov
2020-03-16 17:54:50 UTC
Permalink
Post by Peter Zijlstra
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
Yes, that would be a good solution.

I looked at what happens briefly after building gcc10 from git and IINM,
the function in question - start_secondary() - already gets the stack
canary asm glue added so it checks for a stack canary.

However, the stack canary value itself gets set later in that same
function:

/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();

so the asm glue which checks for it would need to reload the newly
computed canary value (it is 0 before we compute it and thus the
mismatch).

So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Post by Peter Zijlstra
Post by Jakub Jelinek
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
The other thing I was thinking was to carve out only that function into
a separate compilation unit and disable stack protector only for it.

All IMHO of course.

Thx.
--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Arvind Sankar
2020-03-16 18:20:00 UTC
Permalink
Post by Borislav Petkov
Post by Peter Zijlstra
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
Yes, that would be a good solution.
I looked at what happens briefly after building gcc10 from git and IINM,
the function in question - start_secondary() - already gets the stack
canary asm glue added so it checks for a stack canary.
However, the stack canary value itself gets set later in that same
/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();
so the asm glue which checks for it would need to reload the newly
computed canary value (it is 0 before we compute it and thus the
mismatch).
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Post by Peter Zijlstra
Post by Jakub Jelinek
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
The other thing I was thinking was to carve out only that function into
a separate compilation unit and disable stack protector only for it.
All IMHO of course.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
they have some patches that affect it) already adds stack canary into
start_secondary. Not sure why it doesn't panic already with gcc9?

00000000000008f0 <start_secondary>:
8f0: 53 push %rbx
8f1: 48 83 ec 10 sub $0x10,%rsp
8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
8fc: 00 00
8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
903: 31 c0 xor %eax,%eax
...
a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
a3f: 00 00
a41: 75 12 jne a55 <start_secondary+0x165>
a43: 48 83 c4 10 add $0x10,%rsp
a47: 5b pop %rbx
a48: c3 retq
a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
a4c: R_X86_64_PC32 debug_idt_descr-0x4
a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
a56: R_X86_64_PLT32 __stack_chk_fail-0x4
Arvind Sankar
2020-03-16 18:54:21 UTC
Permalink
Post by Arvind Sankar
Post by Borislav Petkov
Post by Peter Zijlstra
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
Yes, that would be a good solution.
I looked at what happens briefly after building gcc10 from git and IINM,
the function in question - start_secondary() - already gets the stack
canary asm glue added so it checks for a stack canary.
However, the stack canary value itself gets set later in that same
/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();
so the asm glue which checks for it would need to reload the newly
computed canary value (it is 0 before we compute it and thus the
mismatch).
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Post by Peter Zijlstra
Post by Jakub Jelinek
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
The other thing I was thinking was to carve out only that function into
a separate compilation unit and disable stack protector only for it.
All IMHO of course.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
they have some patches that affect it) already adds stack canary into
start_secondary. Not sure why it doesn't panic already with gcc9?
8f0: 53 push %rbx
8f1: 48 83 ec 10 sub $0x10,%rsp
8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
8fc: 00 00
8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
903: 31 c0 xor %eax,%eax
...
a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
a3f: 00 00
a41: 75 12 jne a55 <start_secondary+0x165>
a43: 48 83 c4 10 add $0x10,%rsp
a47: 5b pop %rbx
a48: c3 retq
a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
a4c: R_X86_64_PC32 debug_idt_descr-0x4
a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
a56: R_X86_64_PLT32 __stack_chk_fail-0x4
Wait a sec, this function calls cpu_startup_entry as the last thing it
does, which should never return, and hence the stack canary value should
never get checked.

How does the canary get checked with the gcc10 code?

boot_init_stack_canary depends on working if called from functions that
don't return. If that doesn't work with gcc10, we need to disable stack
protector for the other callpoints too -- start_kernel in init/main.c
and cpu_bringup_and_idle in arch/x86/xen/smp_pv.c.

/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
Arvind Sankar
2020-03-16 19:53:41 UTC
Permalink
Post by Arvind Sankar
Post by Arvind Sankar
Post by Borislav Petkov
Post by Peter Zijlstra
Right I know, I looked for it recently :/ But since this is new in 10
and 10 isn't released yet, I figured someone can add the attribute
before it does get released.
Yes, that would be a good solution.
I looked at what happens briefly after building gcc10 from git and IINM,
the function in question - start_secondary() - already gets the stack
canary asm glue added so it checks for a stack canary.
However, the stack canary value itself gets set later in that same
/* to prevent fake stack check failure in clock setup */
boot_init_stack_canary();
so the asm glue which checks for it would need to reload the newly
computed canary value (it is 0 before we compute it and thus the
mismatch).
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Post by Peter Zijlstra
Post by Jakub Jelinek
Or of course you could add noinline attribute to whatever got inlined
and contains some array or addressable variable that whatever
-fstack-protector* mode kernel uses triggers it. With -fstack-protector-all
it would never work even in the past I believe.
I don't think the kernel supports -fstack-protector-all, but I could be
mistaken.
The other thing I was thinking was to carve out only that function into
a separate compilation unit and disable stack protector only for it.
All IMHO of course.
Thx.
--
Regards/Gruss,
Boris.
https://people.kernel.org/tglx/notes-about-netiquette
With STACKPROTECTOR_STRONG, gcc9 (at least gentoo's version, not sure if
they have some patches that affect it) already adds stack canary into
start_secondary. Not sure why it doesn't panic already with gcc9?
8f0: 53 push %rbx
8f1: 48 83 ec 10 sub $0x10,%rsp
8f5: 65 48 8b 04 25 28 00 mov %gs:0x28,%rax
8fc: 00 00
8fe: 48 89 44 24 08 mov %rax,0x8(%rsp)
903: 31 c0 xor %eax,%eax
...
a2e: e8 00 00 00 00 callq a33 <start_secondary+0x143>
a2f: R_X86_64_PLT32 cpu_startup_entry-0x4
a33: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a38: 65 48 33 04 25 28 00 xor %gs:0x28,%rax
a3f: 00 00
a41: 75 12 jne a55 <start_secondary+0x165>
a43: 48 83 c4 10 add $0x10,%rsp
a47: 5b pop %rbx
a48: c3 retq
a49: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a50 <start_secondary+0x160>
a4c: R_X86_64_PC32 debug_idt_descr-0x4
a50: e9 cb fe ff ff jmpq 920 <start_secondary+0x30>
a55: e8 00 00 00 00 callq a5a <start_secondary+0x16a>
a56: R_X86_64_PLT32 __stack_chk_fail-0x4
Wait a sec, this function calls cpu_startup_entry as the last thing it
does, which should never return, and hence the stack canary value should
never get checked.
How does the canary get checked with the gcc10 code?
boot_init_stack_canary depends on working if called from functions that
don't return. If that doesn't work with gcc10, we need to disable stack
protector for the other callpoints too -- start_kernel in init/main.c
and cpu_bringup_and_idle in arch/x86/xen/smp_pv.c.
/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
Ugh, gcc10 tail-call optimizes the cpu_startup_entry call, and so checks
the canary before jumping out. The xen one will need to have stack
protector disabled too. It doesn't optimize the arch_call_rest_init call
in start_kernel for some reason, but we should probably disable it there
too.

a06: 0f ae f8 sfence
a09: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a0e: 65 48 2b 04 25 28 00 sub %gs:0x28,%rax
a15: 00 00
a17: 75 1b jne a34 <start_secondary+0x164>
a19: 48 83 c4 10 add $0x10,%rsp
a1d: bf 8d 00 00 00 mov $0x8d,%edi
a22: 5b pop %rbx
a23: e9 00 00 00 00 jmpq a28 <start_secondary+0x158>
a24: R_X86_64_PLT32 cpu_startup_entry-0x4
a28: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a2f <start_secondary+0x15f>
a2b: R_X86_64_PC32 debug_idt_descr-0x4
a2f: e9 cc fe ff ff jmpq 900 <start_secondary+0x30>
a34: e8 00 00 00 00 callq a39 <start_secondary+0x169>
a35: R_X86_64_PLT32 __stack_chk_fail-0x4
Jakub Jelinek
2020-03-16 20:08:55 UTC
Permalink
Post by Arvind Sankar
Post by Arvind Sankar
/*
* Initialize the stackprotector canary value.
*
* NOTE: this must only be called from functions that never return,
* and it must always be inlined.
*/
static __always_inline void boot_init_stack_canary(void)
Ugh, gcc10 tail-call optimizes the cpu_startup_entry call, and so checks
the canary before jumping out. The xen one will need to have stack
protector disabled too. It doesn't optimize the arch_call_rest_init call
in start_kernel for some reason, but we should probably disable it there
too.
If you mark cpu_startup_entry with __attribute__((noreturn)), then gcc won't
tail call it.
If you don't, you could add asm (""); after the call to avoid the tail call
too.
Post by Arvind Sankar
a06: 0f ae f8 sfence
a09: 48 8b 44 24 08 mov 0x8(%rsp),%rax
a0e: 65 48 2b 04 25 28 00 sub %gs:0x28,%rax
a15: 00 00
a17: 75 1b jne a34 <start_secondary+0x164>
a19: 48 83 c4 10 add $0x10,%rsp
a1d: bf 8d 00 00 00 mov $0x8d,%edi
a22: 5b pop %rbx
a23: e9 00 00 00 00 jmpq a28 <start_secondary+0x158>
a24: R_X86_64_PLT32 cpu_startup_entry-0x4
a28: 0f 01 1d 00 00 00 00 lidt 0x0(%rip) # a2f <start_secondary+0x15f>
a2b: R_X86_64_PC32 debug_idt_descr-0x4
a2f: e9 cc fe ff ff jmpq 900 <start_secondary+0x30>
a34: e8 00 00 00 00 callq a39 <start_secondary+0x169>
a35: R_X86_64_PLT32 __stack_chk_fail-0x4
Jakub
Jakub Jelinek
2020-03-16 18:03:03 UTC
Permalink
Post by Borislav Petkov
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Easy, but a waste when GCC already has the optimize attribute that can
handle also ~450 other options that are per-function rather than per-TU.

Jakub
Borislav Petkov
2020-03-17 14:36:02 UTC
Permalink
Post by Jakub Jelinek
Post by Borislav Petkov
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Easy, but a waste when GCC already has the optimize attribute that can
handle also ~450 other options that are per-function rather than per-TU.
Ok, Micha explained to me what you mean here and I did:

static void __attribute__((optimize("no-stack-protect"))) notrace start_secondary(void *unused)
{

but it said

arch/x86/kernel/smpboot.c:216:1: warning: bad option ‘-fno-stack-protect’ to attribute ‘optimize’ [-Wattributes]
216 | {
| ^

because -fno-stack-protect is not implemented yet.

Regardless, yes, that can work too, if we had the -fno-stack-protect
variant.

Thx.
--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
Jakub Jelinek
2020-03-17 14:39:14 UTC
Permalink
Post by Borislav Petkov
Post by Jakub Jelinek
Post by Borislav Petkov
So having a way to state "do not add stack canary checking to this
particular function" would be optimal. And since you already have the
"stack_protect" function attribute I figure adding a "no_stack_protect"
one should be easy...
Easy, but a waste when GCC already has the optimize attribute that can
handle also ~450 other options that are per-function rather than per-TU.
static void __attribute__((optimize("no-stack-protect"))) notrace start_secondary(void *unused)
{
but it said
That is because the option is called -fno-stack-protector, so one needs to
use __attribute__((optimize("no-stack-protector")))

Jakub
Borislav Petkov
2020-03-17 14:49:42 UTC
Permalink
Post by Jakub Jelinek
That is because the option is called -fno-stack-protector, so one needs to
use __attribute__((optimize("no-stack-protector")))
Ha, that works even! :-)

And my guest boots.

I guess we can do that then. Looks like the optimal solution to me.

Also, I'm assuming older gccs will simply ignore unknown optimize
options?

Thx.
--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette
David Laight
2020-03-17 16:35:42 UTC
Permalink
From: Borislav Petkov
Sent: 17 March 2020 14:50
Post by Jakub Jelinek
That is because the option is called -fno-stack-protector, so one needs to
use __attribute__((optimize("no-stack-protector")))
Ha, that works even! :-)
And my guest boots.
I guess we can do that then. Looks like the optimal solution to me.
Also, I'm assuming older gccs will simply ignore unknown optimize
options?
Unlikely since it rejected the misspelt one.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Regis
Sergei Trofimovich
2020-03-16 22:12:51 UTC
Permalink
On Mon, 16 Mar 2020 14:26:48 +0100
Post by Jakub Jelinek
Post by Peter Zijlstra
+# smpboot's init_secondary initializes stack canary.
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
What makes GCC10 insert this while GCC9 does not. Also, I would much
My bet is different inlining decisions.
If somebody hands me over the preprocessed source + gcc command line, I can
have a look in detail (which exact change and why).
In case you are still interested in preprocessed files and results I've collected
all the bits in a single tarball:
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14.tar.gz
Same available in separate files in:
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14/

Specifically:
- gcc-v.gcc-{9,10}: gcc-v output of both compilers. Note --enable-default-pie --enable-default-ssp.
- config.gcc-{9,10}: note, they are not identical as Kbuild does not recognize gcc-10's
plugin support. I don't use it though.
- boot-crash-gcc-10.jpg: picture of a full boot crash
- command.gcc-{9,10} called to generate .s files (it's almost the same when building .o files)
- arch-x86-kernel-smpboot.s-gcc-{9,10}: asm files, gennerated with 'make arch/x86/kernel/smpboot.s V=1'
- arch-x86-kernel-smpboot.c.c-gcc-{9,10}: preprocessed files, generated from command by changing -S to -E.

Another observation: kernel built by gcc-10 boots as-is in qemu without patches.
I wonder if the following boot line right before the crash has something to do wit it:
"random: get_random_bgtes called from start_secondary+0x105/0x1a0 with crng_init=0"
I hope it's not a race of async canary initialization and canary use.
Only one CPU is booted at that time, yes?
--
Sergei
Jakub Jelinek
2020-03-17 11:46:05 UTC
Permalink
Post by Sergei Trofimovich
In case you are still interested in preprocessed files and results I've collected
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14.tar.gz
https://dev.gentoo.org/~slyfox/bugs/linux-gcc-10-boot-2020-03-14/
Thanks, that is helpful.
So, a few comments.

One thing I've noticed in the command line is that
--param=allow-store-data-races=0
got dropped. That is fine, the parameter is gone, but it has been replaced
in https://gcc.gnu.org/PR92046 by the
-fno-allow-store-data-races
option. Like the param which defaulted to 0 and has been enabled only with
-Ofast this option is also -fno-allow-store-data-races by default unless
-Ofast, but if kernel wanted to be explicit or make sure not to introduce
them even with -Ofast, I'd say it should:
# Tell gcc to never replace conditional load with a non-conditional one
KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
+KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
in the toplevel Makefile.

The actual change that causes cpu_startup_entry to be tail called in GCC 10
was my https://gcc.gnu.org/PR59813 https://gcc.gnu.org/PR89060
https://gcc.gnu.org/r10-230-gab87ac8d53f3d903bfc9eeb0f0b5e7eed1f38cbc
optimization. Previously, GCC punted just because it saw some earlier call
which passed address of some automatic variable to some other function and
escaped it that way (and could be possibly used during the function
considered for tail call, thus making the tail call not possible as with
tail call the frame is gone then). Now, GCC tries to use information about
scopes to determine that eventhough some automatic variables can escape that
way, if they aren't in the scope anymore during the last call, they
shouldn't be problematic. There are two variables that prevented the tail
call optimization in the past it seems, int cpuid; in smp_callin function
which is inlined, then its address escapes:
__dynamic_pr_debug(&__UNIQUE_ID_ddebug114, "smpboot" ": " "Stack at about %p\n", &cpuid);
and then cpuid goes out of scope.
Similarly, the u64 canary; variable in boot_init_stack_canary which is also
inlined. The address escapes in
get_random_bytes(&canary, sizeof(canary));
and later on is used and eventually goes out of scope.
Finally, there is the cpu_startup_entry call at the end of function.

Regarding the reason why GCC doesn't tailcall noreturn functions, that is a
deliberate decision, often that results in shorter code (there is no need to
restore stack etc. before the call and because it is noreturn, anything
after the call can be thrown away as unreachable) and more importantly,
results in more useful backtraces when something calls abort etc.

Hope this helps.

Jakub
Sergei Trofimovich
2020-03-17 18:10:16 UTC
Permalink
On Tue, 17 Mar 2020 12:46:05 +0100
Post by Jakub Jelinek
So, a few comments.
One thing I've noticed in the command line is that
--param=allow-store-data-races=0
got dropped. That is fine, the parameter is gone, but it has been replaced
in https://gcc.gnu.org/PR92046 by the
-fno-allow-store-data-races
option. Like the param which defaulted to 0 and has been enabled only with
-Ofast this option is also -fno-allow-store-data-races by default unless
-Ofast, but if kernel wanted to be explicit or make sure not to introduce
# Tell gcc to never replace conditional load with a non-conditional one
KBUILD_CFLAGS += $(call cc-option,--param=allow-store-data-races=0)
+KBUILD_CFLAGS += $(call cc-option,-fno-allow-store-data-races)
in the toplevel Makefile.
Yeah, I noticed it yesterday as well and sent out exactly the same change:
https://lkml.org/lkml/2020/3/16/1012
I also checked that flag does not change code generation on -O2 for smpboot.c
--
Sergei
Arvind Sankar
2020-03-16 18:22:08 UTC
Permalink
arch/x86/kernel/Makefile | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile
index 9b294c13809a..da9f4ea9bf4c 100644
--- a/arch/x86/kernel/Makefile
+++ b/arch/x86/kernel/Makefile
@@ -11,6 +11,12 @@ extra-y += vmlinux.lds
CPPFLAGS_vmlinux.lds += -U$(UTS_MACHINE)
+# smpboot's init_secondary initializes stack canary.
^^^^ should be start_secondary?
+# Make sure we don't emit stack checks before it's
+# initialized.
+nostackp := $(call cc-option, -fno-stack-protector)
+CFLAGS_smpboot.o := $(nostackp)
+
ifdef CONFIG_FUNCTION_TRACER
# Do not profile debug and lowlevel utilities
CFLAGS_REMOVE_tsc.o = -pg
--
2.25.1
Loading...