Discussion:
[PATCH 0/4] x86/Hyper-V: Panic code path fixes
(too old to reply)
l***@gmail.com
2020-03-17 13:25:19 UTC
Permalink
From: Tianyu Lan <***@microsoft.com>

This patchset fixes some issues in the Hyper-V panic code path.
Patch 1 resolves issue that panic system still responses network
packets.
Patch 2-3 resolves crash enlightenment issues.
Patch 4 is to set crash_kexec_post_notifiers to true for Hyper-V
VM in order to report crash data or kmsg to host before running
kdump kernel.

Tianyu Lan (4):
x86/Hyper-V: Unload vmbus channel in hv panic callback
x86/Hyper-V: Free hv_panic_page when fail to register kmsg dump
x86/Hyper-V: Trigger crash enlightenment only once during system
crash.
x86/Hyper-V: Report crash register data or ksmg before running crash
kernel

arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
drivers/hv/channel_mgmt.c | 5 +++++
drivers/hv/vmbus_drv.c | 35 +++++++++++++++++++++++++----------
3 files changed, 40 insertions(+), 10 deletions(-)
--
2.14.5
l***@gmail.com
2020-03-17 13:25:21 UTC
Permalink
From: Tianyu Lan <***@microsoft.com>

If fail to register kmsg dump on Hyper-V platform, hv_panic_page
will not be used anywhere. So free and reset it.

Signed-off-by: Tianyu Lan <***@microsoft.com>
---
drivers/hv/vmbus_drv.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b56b9fb9bd90..b043efea092a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
if (hv_panic_page) {
ret = kmsg_dump_register(&hv_kmsg_dumper);
- if (ret)
+ if (ret) {
pr_err("Hyper-V: kmsg dump register "
"error 0x%x\n", ret);
+ hv_free_hyperv_page(
+ (unsigned long)hv_panic_page);
+ hv_panic_page = NULL;
+ }
} else
pr_err("Hyper-V: panic message page memory "
"allocation failed");
--
2.14.5
Wei Liu
2020-03-17 17:36:00 UTC
Permalink
Post by l***@gmail.com
If fail to register kmsg dump on Hyper-V platform, hv_panic_page
will not be used anywhere. So free and reset it.
---
drivers/hv/vmbus_drv.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b56b9fb9bd90..b043efea092a 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -1385,9 +1385,13 @@ static int vmbus_bus_init(void)
hv_panic_page = (void *)hv_alloc_hyperv_zeroed_page();
if (hv_panic_page) {
ret = kmsg_dump_register(&hv_kmsg_dumper);
- if (ret)
+ if (ret) {
pr_err("Hyper-V: kmsg dump register "
"error 0x%x\n", ret);
+ hv_free_hyperv_page(
+ (unsigned long)hv_panic_page);
+ hv_panic_page = NULL;
+ }
While this modification looks correct to me, there is a call to free
hv_panic_page in the err_alloc path. That makes the error handling a bit
confusing here.

I think you can just remove that function call in err_alloc path.

Wei.
Post by l***@gmail.com
} else
pr_err("Hyper-V: panic message page memory "
"allocation failed");
--
2.14.5
l***@gmail.com
2020-03-17 13:25:20 UTC
Permalink
From: Tianyu Lan <***@microsoft.com>

Customer reported Hyper-V VM still responded network traffic
ack packets after kernel panic with kernel parameter "panic=0”.
This becauses vmbus driver interrupt handler still works
on the panic cpu after kernel panic. Panic cpu falls into
infinite loop of panic() with interrupt enabled at that point.
Vmbus driver can still handle network traffic.

This confuses remote service that the panic system is still
alive when it gets ack packets. Unload vmbus channel in hv panic
callback and fix it.

vmbus_initiate_unload() maybe double called during panic process
(e.g, hyperv_panic_event() and hv_crash_handler()). So check
and set connection state in vmbus_initiate_unload() to resolve
reenter issue.

Signed-off-by: Tianyu Lan <***@microsoft.com>
---
drivers/hv/channel_mgmt.c | 5 +++++
drivers/hv/vmbus_drv.c | 17 +++++++++--------
2 files changed, 14 insertions(+), 8 deletions(-)

diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0370364169c4..893493f2b420 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
{
struct vmbus_channel_message_header hdr;

+ if (vmbus_connection.conn_state == DISCONNECTED)
+ return;
+
/* Pre-Win2012R2 hosts don't support reconnect */
if (vmbus_proto_version < VERSION_WIN8_1)
return;
@@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
wait_for_completion(&vmbus_connection.unload_event);
else
vmbus_wait_for_unload();
+
+ vmbus_connection.conn_state = DISCONNECTED;
}

static void check_ready_for_resume_event(void)
diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index 029378c27421..b56b9fb9bd90 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -53,9 +53,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,
{
struct pt_regs *regs;

- regs = current_pt_regs();
+ vmbus_initiate_unload(true);

- hyperv_report_panic(regs, val);
+ if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
+ regs = current_pt_regs();
+ hyperv_report_panic(regs, val);
+ }
return NOTIFY_DONE;
}

@@ -1391,10 +1394,12 @@ static int vmbus_bus_init(void)
}

register_die_notifier(&hyperv_die_block);
- atomic_notifier_chain_register(&panic_notifier_list,
- &hyperv_panic_block);
}

+ /* Vmbus channel is unloaded in panic callback when panic happens.*/
+ atomic_notifier_chain_register(&panic_notifier_list,
+ &hyperv_panic_block);
+
vmbus_request_offers();

return 0;
@@ -2204,8 +2209,6 @@ static int vmbus_bus_suspend(struct device *dev)

vmbus_initiate_unload(false);

- vmbus_connection.conn_state = DISCONNECTED;
-
/* Reset the event for the next resume. */
reinit_completion(&vmbus_connection.ready_for_resume_event);

@@ -2289,7 +2292,6 @@ static void hv_kexec_handler(void)
{
hv_stimer_global_cleanup();
vmbus_initiate_unload(false);
- vmbus_connection.conn_state = DISCONNECTED;
/* Make sure conn_state is set as hv_synic_cleanup checks for it */
mb();
cpuhp_remove_state(hyperv_cpuhp_online);
@@ -2306,7 +2308,6 @@ static void hv_crash_handler(struct pt_regs *regs)
* doing the cleanup for current CPU only. This should be sufficient
* for kdump.
*/
- vmbus_connection.conn_state = DISCONNECTED;
cpu = smp_processor_id();
hv_stimer_cleanup(cpu);
hv_synic_disable_regs(cpu);
--
2.14.5
Wei Liu
2020-03-17 17:35:53 UTC
Permalink
Post by l***@gmail.com
Customer reported Hyper-V VM still responded network traffic
ack packets after kernel panic with kernel parameter "panic=0”.
This becauses vmbus driver interrupt handler still works
on the panic cpu after kernel panic. Panic cpu falls into
infinite loop of panic() with interrupt enabled at that point.
Vmbus driver can still handle network traffic.
This confuses remote service that the panic system is still
alive when it gets ack packets. Unload vmbus channel in hv panic
callback and fix it.
vmbus_initiate_unload() maybe double called during panic process
(e.g, hyperv_panic_event() and hv_crash_handler()). So check
and set connection state in vmbus_initiate_unload() to resolve
reenter issue.
---
drivers/hv/channel_mgmt.c | 5 +++++
drivers/hv/vmbus_drv.c | 17 +++++++++--------
2 files changed, 14 insertions(+), 8 deletions(-)
diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
index 0370364169c4..893493f2b420 100644
--- a/drivers/hv/channel_mgmt.c
+++ b/drivers/hv/channel_mgmt.c
@@ -839,6 +839,9 @@ void vmbus_initiate_unload(bool crash)
{
struct vmbus_channel_message_header hdr;
+ if (vmbus_connection.conn_state == DISCONNECTED)
+ return;
+
/* Pre-Win2012R2 hosts don't support reconnect */
if (vmbus_proto_version < VERSION_WIN8_1)
return;
@@ -857,6 +860,8 @@ void vmbus_initiate_unload(bool crash)
wait_for_completion(&vmbus_connection.unload_event);
else
vmbus_wait_for_unload();
+
+ vmbus_connection.conn_state = DISCONNECTED;
This is only set at the end of the function. I don't see how this solve
the re-entrant issue with the check at the beginning. Do I miss anything
here?

Maybe this function should check and set the state to
DISCONNECTING/DISCONNECTED at the beginning of this function?

Wei.
l***@gmail.com
2020-03-17 13:25:23 UTC
Permalink
From: Tianyu Lan <***@microsoft.com>

Hyper-V expects to get crash register data or kmsg via crash
enlightenment when guest crash happens. crash_kexec_post_notifiers
is default to be false and crash kernel runs before calling hv
panic callback and dumping kmsg. In this case, Hyper-V doesn't
get crash register data or kmsg from guest. Set crash_kexec_post
_notifiers to be true for Hyper-V VM and fix it.

Signed-off-by: Tianyu Lan <***@microsoft.com>
---
arch/x86/kernel/cpu/mshyperv.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/arch/x86/kernel/cpu/mshyperv.c b/arch/x86/kernel/cpu/mshyperv.c
index caa032ce3fe3..5e296a7e6036 100644
--- a/arch/x86/kernel/cpu/mshyperv.c
+++ b/arch/x86/kernel/cpu/mshyperv.c
@@ -263,6 +263,16 @@ static void __init ms_hyperv_init_platform(void)
cpuid_eax(HYPERV_CPUID_NESTED_FEATURES);
}

+ /*
+ * Hyper-V expects to get crash register data or kmsg when
+ * crash enlightment is available and system crashes. Set
+ * crash_kexec_post_notifiers to be true to make sure that
+ * calling crash enlightment interface before running kdump
+ * kernel.
+ */
+ if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE)
+ crash_kexec_post_notifiers = true;
+
#ifdef CONFIG_X86_LOCAL_APIC
if (ms_hyperv.features & HV_X64_ACCESS_FREQUENCY_MSRS &&
ms_hyperv.misc_features & HV_FEATURE_FREQUENCY_MSRS_AVAILABLE) {
--
2.14.5
l***@gmail.com
2020-03-17 13:25:22 UTC
Permalink
From: Tianyu Lan <***@microsoft.com>

Hyper-V expects guest only triggers crash enlightenment.
The second crash notify will be ignored by Hyper-V.

Current code may trigger crash enlightenment during system
panic twice.
1) The enlightenment is triggered in hyperv_panic/die_event()
via hyperv_report_panic().
2) hv_kmsg_dump() reports kmsg to host via hyperv_report_panic_msg().

Fix it. If kmsg dump is registered successfully, just report
kmsg via hyperv_report_panic_msg() and not report register values
via hyperv_report_panic().

Signed-off-by: Tianyu Lan <***@microsoft.com>
---
drivers/hv/vmbus_drv.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/hv/vmbus_drv.c b/drivers/hv/vmbus_drv.c
index b043efea092a..1787d6246251 100644
--- a/drivers/hv/vmbus_drv.c
+++ b/drivers/hv/vmbus_drv.c
@@ -55,7 +55,12 @@ static int hyperv_panic_event(struct notifier_block *nb, unsigned long val,

vmbus_initiate_unload(true);

- if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE) {
+ /*
+ * Crash notify only can be triggered once. If crash notify
+ * message is available, just report kmsg to crash buffer.
+ */
+ if (ms_hyperv.misc_features & HV_FEATURE_GUEST_CRASH_MSR_AVAILABLE
+ && !hv_panic_page) {
regs = current_pt_regs();
hyperv_report_panic(regs, val);
}
@@ -68,7 +73,12 @@ static int hyperv_die_event(struct notifier_block *nb, unsigned long val,
struct die_args *die = (struct die_args *)args;
struct pt_regs *regs = die->regs;

- hyperv_report_panic(regs, val);
+ /*
+ * Crash notify only can be triggered once. If crash notify
+ * message is available, just report kmsg to crash buffer.
+ */
+ if (!hv_panic_page)
+ hyperv_report_panic(regs, val);
return NOTIFY_DONE;
}
--
2.14.5
Loading...