Discussion:
Hang in work_resched
(too old to reply)
Guillaume Chazarain
2008-01-30 23:00:19 UTC
Permalink
=======================
gnome-termina S 00000027 0 2201 1
f6711fb0 00200082 cb330d62 00000027 f664105c 00000b1e 00000000 cb331880
00000027 f660d780 009e3840 080ab7d8 080ab298 f6711000 c0103e7e 009e3840
000e0002 00000002 080ab7d8 080ab298 bfb41be8 080ab7d8 0000007b c010007b
[<c0103e7e>] work_resched+0x5/0x16
=======================
c0103e7e: fa cli
I bisected it, and the resulting commit is appended. Rerverting this
commit applies cleanly on today's git
(dd430ca20c40ecccd6954a7efd13d4398f507728) and makes the hang go away
-:)


commit 37bb6cb4097e29ffee970065b74499cbf10603a3
Author: Peter Zijlstra <***@chello.nl>
Date: Fri Jan 25 21:08:32 2008 +0100

hrtimer: unlock hrtimer_wakeup

hrtimer_wakeup creates a

base->lock
rq->lock

lock dependancy. Avoid this by switching to HRTIMER_CB_IRQSAFE_NO_SOFTIRQ
which doesn't hold base->lock.

This fully untangles hrtimer locks from the scheduler locks, and allows
hrtimer usage in the scheduler proper.

Signed-off-by: Peter Zijlstra <***@chello.nl>
Signed-off-by: Ingo Molnar <***@elte.hu>

diff --git a/kernel/hrtimer.c b/kernel/hrtimer.c
index 061ae28..bd5d6b5 100644
--- a/kernel/hrtimer.c
+++ b/kernel/hrtimer.c
@@ -1293,7 +1293,7 @@ void hrtimer_init_sleeper(struct hrtimer_sleeper
*sl, struct task_struct *task)
sl->timer.function = hrtimer_wakeup;
sl->task = task;
#ifdef CONFIG_HIGH_RES_TIMERS
- sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_RESTART;
+ sl->timer.cb_mode = HRTIMER_CB_IRQSAFE_NO_SOFTIRQ;
#endif
}

@@ -1304,6 +1304,8 @@ static int __sched do_nanosleep(struct
hrtimer_sleeper *t, enum hrtimer_mode mod
do {
set_current_state(TASK_INTERRUPTIBLE);
hrtimer_start(&t->timer, t->timer.expires, mode);
+ if (!hrtimer_active(&t->timer))
+ t->task = NULL;

if (likely(t->task))
schedule();
--
Guillaume
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2008-01-31 09:00:19 UTC
Permalink
Post by Guillaume Chazarain
=======================
gnome-termina S 00000027 0 2201 1
f6711fb0 00200082 cb330d62 00000027 f664105c 00000b1e 00000000 cb331880
00000027 f660d780 009e3840 080ab7d8 080ab298 f6711000 c0103e7e 009e3840
000e0002 00000002 080ab7d8 080ab298 bfb41be8 080ab7d8 0000007b c010007b
[<c0103e7e>] work_resched+0x5/0x16
=======================
c0103e7e: fa cli
I bisected it, and the resulting commit is appended. Rerverting this
commit applies cleanly on today's git
(dd430ca20c40ecccd6954a7efd13d4398f507728) and makes the hang go away
-:)
commit 37bb6cb4097e29ffee970065b74499cbf10603a3
Does this patch from thomas fix it as well?

---

From: Thomas Gleixner <***@linutronix.de>

I justed walked trough the code, which updates xtime _AND_
wall_to_monotonic w/o updating xtime_cache:

There are a couple of places missing the xtime_cache update.

Signed-off-by: Thomas Gleixner <***@linuxtronix.de>
---
include/linux/time.h | 1 +
kernel/time.c | 1 +
kernel/time/timekeeping.c | 6 ++++--
3 files changed, 6 insertions(+), 2 deletions(-)

Index: linux-2.6/include/linux/time.h
===================================================================
--- linux-2.6.orig/include/linux/time.h
+++ linux-2.6/include/linux/time.h
@@ -122,6 +122,7 @@ extern void monotonic_to_bootbased(struc
extern struct timespec timespec_trunc(struct timespec t, unsigned gran);
extern int timekeeping_is_continuous(void);
extern void update_wall_time(void);
+extern void update_xtime_cache(u64 nsec);

/**
* timespec_to_ns - Convert timespec to nanoseconds
Index: linux-2.6/kernel/time.c
===================================================================
--- linux-2.6.orig/kernel/time.c
+++ linux-2.6/kernel/time.c
@@ -129,6 +129,7 @@ static inline void warp_clock(void)
write_seqlock_irq(&xtime_lock);
wall_to_monotonic.tv_sec -= sys_tz.tz_minuteswest * 60;
xtime.tv_sec += sys_tz.tz_minuteswest * 60;
+ update_xtime_cache(0);
write_sequnlock_irq(&xtime_lock);
clock_was_set();
}
Index: linux-2.6/kernel/time/timekeeping.c
===================================================================
--- linux-2.6.orig/kernel/time/timekeeping.c
+++ linux-2.6/kernel/time/timekeeping.c
@@ -47,7 +47,7 @@ struct timespec wall_to_monotonic __attr
static unsigned long total_sleep_time; /* seconds */

static struct timespec xtime_cache __attribute__ ((aligned (16)));
-static inline void update_xtime_cache(u64 nsec)
+void update_xtime_cache(u64 nsec)
{
xtime_cache = xtime;
timespec_add_ns(&xtime_cache, nsec);
@@ -145,6 +145,7 @@ int do_settimeofday(struct timespec *tv)

set_normalized_timespec(&xtime, sec, nsec);
set_normalized_timespec(&wall_to_monotonic, wtm_sec, wtm_nsec);
+ update_xtime_cache(0);

clock->error = 0;
ntp_clear();
@@ -252,8 +253,8 @@ void __init timekeeping_init(void)
xtime.tv_nsec = 0;
set_normalized_timespec(&wall_to_monotonic,
-xtime.tv_sec, -xtime.tv_nsec);
+ update_xtime_cache(0);
total_sleep_time = 0;
-
write_sequnlock_irqrestore(&xtime_lock, flags);
}

@@ -290,6 +291,7 @@ static int timekeeping_resume(struct sys
}
/* Make sure that we have the correct xtime reference */
timespec_add_ns(&xtime, timekeeping_suspend_nsecs);
+ update_xtime_cache(0);
/* re-base the last cycle value */
clock->cycle_last = clocksource_read(clock);
clock->error = 0;


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Guillaume Chazarain
2008-01-31 11:30:24 UTC
Permalink
Post by Peter Zijlstra
Does this patch from thomas fix it as well?
Unfortunately, not.

For information, reverting just the first part of the offending commit
(sl->timer.cb_mode) fixed the problem, while reverting only the second
part (if (!hrtimer_active(&t->timer))) had no effect.

Also, I found a trivially reproductible testcase : sleep 0.

It hangs in nanosleep({0, 0}).
--
Guillaume
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2008-01-31 11:40:11 UTC
Permalink
Post by Guillaume Chazarain
Post by Peter Zijlstra
Does this patch from thomas fix it as well?
Unfortunately, not.
For information, reverting just the first part of the offending commit
(sl->timer.cb_mode) fixed the problem, while reverting only the second
part (if (!hrtimer_active(&t->timer))) had no effect.
Also, I found a trivially reproductible testcase : sleep 0.
It hangs in nanosleep({0, 0}).
Thanks, I'll go look.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Peter Zijlstra
2008-01-31 12:20:06 UTC
Permalink
Post by Guillaume Chazarain
Post by Peter Zijlstra
Does this patch from thomas fix it as well?
Unfortunately, not.
For information, reverting just the first part of the offending commit
(sl->timer.cb_mode) fixed the problem, while reverting only the second
part (if (!hrtimer_active(&t->timer))) had no effect.
Also, I found a trivially reproductible testcase : sleep 0.
It hangs in nanosleep({0, 0}).
works for me :-( (x86_64 rawhide userspace)

Could you send your .config?

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Guillaume Chazarain
2008-01-31 12:40:08 UTC
Permalink
This post might be inappropriate. Click to display it.
Continue reading on narkive:
Loading...