Discussion:
Real-Time Preemption, -RT-2.6.10-rc2-mm3-V0.7.31-19
Andrew Burgess
2004-12-02 16:00:17 UTC
Hmm, i wonder if there's a way to detect non-RT behaviour in jackd
clients. I mean AFAIK the only thing the process callback is allowed
to sleep on is the FIFO it waits on to be woken, right? Every other
sleep is to be considered a bug.
gettimeofday(1,1);
gettimeofday(1,0);
while in atomic-mode, any non-atomic activity (scheduling) will produce
a kernel message and a SIGUSR2 sent to the offending process (once,
atomic mode has to be re-enabled again for the next message). Preemption
by a higher-prio task does not trigger a message/signal.
If you run the client under gdb you should be able to catch the SIGUSR2
signal and then you can see the offending code's backtrace via 'bt'.
Might be handy to have the option to send a SIGABRT, then you don't need
to guess which app to run under gdb and the offending code is there in
the core file.

Also, I'm cc-ing jack-devel. This could fit into libjack so no client
mods would be needed I think. After the thread_init_callback is run libjack
could run 'gettimeofday(1,1);' for each client thread. Then if any client
breaks the rules you get a core showing where.

On further thought, I suppose libjack could install a SIGUSR2 handler and
have that call abort for all the rt client threads. Still no client mods
needed, only an RT-aware libjack.

A big thank you to Ingo and everyone else involved on behalf of all the
linux audio users!

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to ***@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Florian Schmidt
2004-12-02 16:10:12 UTC
On Thu, 2 Dec 2004 07:46:20 -0800
Post by Andrew Burgess
Hmm, i wonder if there's a way to detect non-RT behaviour in jackd
clients. I mean AFAIK the only thing the process callback is allowed
to sleep on is the FIFO it waits on to be woken, right? Every other
sleep is to be considered a bug.
gettimeofday(1,1);
gettimeofday(1,0);
while in atomic-mode, any non-atomic activity (scheduling) will produce
a kernel message and a SIGUSR2 sent to the offending process (once,
atomic mode has to be re-enabled again for the next message). Preemption
by a higher-prio task does not trigger a message/signal.
If you run the client under gdb you should be able to catch the SIGUSR2
signal and then you can see the offending code's backtrace via 'bt'.
Might be handy to have the option to send a SIGABRT, then you don't need
to guess which app to run under gdb and the offending code is there in
the core file.
Also, I'm cc-ing jack-devel. This could fit into libjack so no client
mods would be needed I think. After the thread_init_callback is run libjack
could run 'gettimeofday(1,1);' for each client thread. Then if any client
breaks the rules you get a core showing where.
On further thought, I suppose libjack could install a SIGUSR2 handler and
have that call abort for all the rt client threads. Still no client mods
needed, only an RT-aware libjack.
right. Or instead of aborting jackd might print a debug output (like
"client foo violated RT constraints").

But the calls to gettimeofday would need to be done right before and
after every process callback as each client's RT thread does wait on the
FIFO to get woken by jackd. This waiting would appear as RT constraints
violation if the gettimeofday would be done only once per thread
lifetime at thread startup.
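
The ordering described above (sleep on the FIFO first, then switch the check on only for the duration of the callback) can be sketched roughly as follows. This is a minimal illustration, not libjack code: struct rt_cycle, run_cycle and the demo_* stand-ins are hypothetical names, and the set_atomic_check hook is assumed to wrap the gettimeofday(1,1) / gettimeofday(1,0) calls on a kernel carrying this patch.

```c
#include <assert.h>
#include <stddef.h>

/* All names below are hypothetical; none are part of libjack. */

/* Toggles the kernel's user-atomicity check. On the -RT kernel discussed in
 * this thread it would wrap gettimeofday(1,1) / gettimeofday(1,0). */
typedef void (*atomic_check_fn)(int enable);
typedef int (*wait_fn)(void *arg);                       /* may sleep */
typedef int (*process_fn)(unsigned nframes, void *arg);  /* must not sleep */

struct rt_cycle {
    wait_fn wait_for_wakeup;
    void *wait_arg;
    process_fn process;
    void *process_arg;
    unsigned nframes;
    atomic_check_fn set_atomic_check;
};

/* Runs one cycle; returns the callback's result, or -1 if the wait failed. */
static int run_cycle(const struct rt_cycle *c)
{
    int rc;

    assert(c != NULL && c->wait_for_wakeup && c->process && c->set_atomic_check);

    /* The FIFO wait sleeps legitimately, so it stays outside the check. */
    if (c->wait_for_wakeup(c->wait_arg) != 0)
        return -1;

    c->set_atomic_check(1);   /* from here on, scheduling gets reported */
    rc = c->process(c->nframes, c->process_arg);
    c->set_atomic_check(0);   /* back to normal before the next wait */

    return rc;
}

/* Stand-ins that only record the order of events, for demonstration. */
static char trace[16];
static size_t trace_len;

static void note(char ch)
{
    if (trace_len + 1 < sizeof trace)
        trace[trace_len++] = ch;
}

static int demo_wait(void *arg) { (void)arg; note('W'); return 0; }
static int demo_process(unsigned n, void *arg) { (void)n; (void)arg; note('P'); return 0; }
static void demo_check(int enable) { note(enable ? 'E' : 'D'); }
```

Passing the toggle in as a function pointer keeps the sketch usable on kernels without the patch, where the hook can simply do nothing.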
Post by Andrew Burgess
A big thank you to Ingo and everyone else involved on behalf of all the
linux audio users!
BTW: i suppose pretty much every jack client except for very simple ones
do break the RT constraints.

Flo
Jack O'Quin
2004-12-02 16:30:21 UTC
Post by Florian Schmidt
On Thu, 2 Dec 2004 07:46:20 -0800
Post by Andrew Burgess
On further thought, I suppose libjack could install a SIGUSR2 handler and
have that call abort for all the rt client threads. Still no client mods
needed, only an RT-aware libjack.
right. Or instead of aborting jackd might print a debug output (like
"client foo violated RT constraints").
Libjack cannot assume the client has no SIGUSR2 handler of its own.
Post by Florian Schmidt
Post by Andrew Burgess
A big thank you to Ingo and everyone else involved on behalf of all the
linux audio users!
thanks++ :-)
Post by Florian Schmidt
BTW: i suppose pretty much every jack client except for very simple ones
do break the RT constraints.
True.

It would be wonderful to have a reliable mechanism for debugging them.
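
One way to reconcile the abort-on-SIGUSR2 idea with the objection that a client may already own SIGUSR2 is to query the current disposition first and only install a handler when the signal is still at its default. The following is a sketch under that assumption; install_rt_violation_handler and abort_on_rt_violation are hypothetical names, not existing libjack functions, and the check-then-install sequence cannot stop a client from installing its own handler later.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical helper, not part of libjack. */
static void abort_on_rt_violation(int signo)
{
    (void)signo;
    abort();   /* async-signal-safe; raises SIGABRT, core dump if enabled */
}

/* Installs the abort handler only if SIGUSR2 still has its default
 * disposition. Returns 1 if installed, 0 if the signal is already handled
 * or ignored by the client, -1 on error. */
static int install_rt_violation_handler(void)
{
    struct sigaction old, sa;

    /* Passing NULL as the new action only queries the current one. */
    if (sigaction(SIGUSR2, NULL, &old) != 0)
        return -1;

    /* Leave the client's own handler (or SIG_IGN) untouched. */
    if ((old.sa_flags & SA_SIGINFO) || old.sa_handler != SIG_DFL)
        return 0;

    sa.sa_handler = abort_on_rt_violation;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;

    return sigaction(SIGUSR2, &sa, NULL) == 0 ? 1 : -1;
}
```

With the handler in place, the core file left by abort() should show the offending call chain without having to guess which client to run under gdb.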
--
joq
Florian Schmidt
2004-12-02 17:00:26 UTC
On 02 Dec 2004 10:26:55 -0600
Post by Jack O'Quin
Post by Florian Schmidt
On Thu, 2 Dec 2004 07:46:20 -0800
Post by Andrew Burgess
On further thought, I suppose libjack could install a SIGUSR2 handler and
have that call abort for all the rt client threads. Still no client mods
needed, only an RT-aware libjack.
right. Or instead of aborting jackd might print a debug output (like
"client foo violated RT constraints").
Libjack cannot assume the client has no SIGUSR2 handler of its own.
i see..
Post by Jack O'Quin
It would be wonderful to have a reliable mechanism for debugging them.
I suppose instead of catching the signal the user might just monitor the
syslog. I'm not sure there are printk's triggered by this already, but i'm
sure if not, ingo might add them. So a trivial patch for jackd would
probably look like this:

--- libjack/client.c.orig 2004-12-02 17:55:04.000000000 +0100
+++ libjack/client.c 2004-12-02 17:56:23.000000000 +0100
@@ -1238,6 +1238,9 @@
if (control->sync_cb)
jack_call_sync_client (client);

+ // enable atomicity check for RP kernels
+ gettimeofday(1,1);
+
if (control->process) {
if (control->process (control->nframes,
control->process_arg)
@@ -1247,7 +1250,10 @@
} else {
control->state = Finished;
}
-
+
+ // disable atomicity check
+ gettimeofday(0,1);
+
if (control->timebase_cb)
jack_call_timebase_master (client);

flo
Jack O'Quin
2004-12-02 17:10:20 UTC
Post by Florian Schmidt
I suppose instead of catching the signal the user might just monitor the
syslog. I'm not sure there are printk's triggered by this already, but i'm
sure if not, ingo might add them. So a trivial patch for jackd would
--- libjack/client.c.orig 2004-12-02 17:55:04.000000000 +0100
+++ libjack/client.c 2004-12-02 17:56:23.000000000 +0100
@@ -1238,6 +1238,9 @@
if (control->sync_cb)
jack_call_sync_client (client);
+ // enable atomicity check for RP kernels
+ gettimeofday(1,1);
+
if (control->process) {
if (control->process (control->nframes,
control->process_arg)
@@ -1247,7 +1250,10 @@
} else {
control->state = Finished;
}
-
+
+ // disable atomicity check
+ gettimeofday(0,1);
+
if (control->timebase_cb)
jack_call_timebase_master (client);
The sync_cb and timebase_cb callbacks actually need to be RT-safe,
too. ;-)

Is printk() guaranteed not to wait inside the kernel? I am not
familiar with its internal implementation.
--
joq
Lee Revell
2004-12-02 20:10:15 UTC
Post by Jack O'Quin
Is printk() guaranteed not to wait inside the kernel? I am not
familiar with its internal implementation.
Yes. It just writes to a ring buffer and klogd dumps this to syslog.
So if you really start to spew printk's they don't all make it to the
log but you never get blocked.

The implementation probably looks a lot like a correct solution to fix
the printf-from-RT-context issue in JACK would.
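
The pattern described above (the writer drops a message into a ring buffer and never waits, a separate reader drains it later, and messages are lost on overflow) might look roughly like this for a single RT writer and a single non-RT reader. It is a sketch using C11 atomics, not printk's actual implementation and not the JAMin code; struct rt_log, rt_log_write and rt_log_read are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define LOG_SLOTS   8    /* power of two, so indices survive counter wraparound */
#define LOG_MSG_LEN 64

/* Hypothetical single-producer/single-consumer message ring. */
struct rt_log {
    char slots[LOG_SLOTS][LOG_MSG_LEN];
    atomic_size_t head;      /* next slot to write; stored only by the RT thread */
    atomic_size_t tail;      /* next slot to read; stored only by the reader */
    atomic_size_t dropped;   /* messages discarded because the ring was full */
};

/* RT side: never blocks. Returns 1 if stored, 0 if the message was dropped. */
static int rt_log_write(struct rt_log *q, const char *msg)
{
    size_t head, tail;

    assert(q != NULL && msg != NULL);
    head = atomic_load_explicit(&q->head, memory_order_relaxed);
    tail = atomic_load_explicit(&q->tail, memory_order_acquire);

    if (head - tail == LOG_SLOTS) {
        atomic_fetch_add_explicit(&q->dropped, 1, memory_order_relaxed);
        return 0;
    }

    strncpy(q->slots[head % LOG_SLOTS], msg, LOG_MSG_LEN - 1);
    q->slots[head % LOG_SLOTS][LOG_MSG_LEN - 1] = '\0';

    /* Publish the slot only after its contents are written. */
    atomic_store_explicit(&q->head, head + 1, memory_order_release);
    return 1;
}

/* Non-RT side: copies the oldest message into out. Returns 0 if empty. */
static int rt_log_read(struct rt_log *q, char *out, size_t out_len)
{
    size_t tail, head;

    assert(q != NULL && out != NULL && out_len > 0);
    tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
    head = atomic_load_explicit(&q->head, memory_order_acquire);

    if (tail == head)
        return 0;

    strncpy(out, q->slots[tail % LOG_SLOTS], out_len - 1);
    out[out_len - 1] = '\0';

    /* Hand the slot back to the writer. */
    atomic_store_explicit(&q->tail, tail + 1, memory_order_release);
    return 1;
}
```

The reader thread can call printf or syslog freely on what it pulls out, since it runs outside the RT path; the dropped counter gives it a way to report how many messages were lost.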

Lee

Jack O'Quin
2004-12-02 20:50:12 UTC
Post by Lee Revell
Post by Jack O'Quin
Is printk() guaranteed not to wait inside the kernel? I am not
familiar with its internal implementation.
Yes. It just writes to a ring buffer and klogd dumps this to syslog.
So if you really start to spew printk's they don't all make it to the
log but you never get blocked.
The implementation probably looks a lot like a correct solution to fix
the printf-from-RT-context issue in JACK would.
Right. That's exactly what I have in mind, whenever I find time to
work on it. :-)

There's some similar code I wrote for JAMin, which we could adapt.
--
joq
Florian Schmidt
2004-12-02 17:10:20 UTC
On Thu, 2 Dec 2004 17:57:56 +0100
Post by Florian Schmidt
On 02 Dec 2004 10:26:55 -0600
Post by Jack O'Quin
Post by Florian Schmidt
On Thu, 2 Dec 2004 07:46:20 -0800
Post by Andrew Burgess
On further thought, I suppose libjack could install a SIGUSR2 handler and
have that call abort for all the rt client threads. Still no client mods
needed, only an RT-aware libjack.
right. Or instead of aborting jackd might print a debug output (like
"client foo violated RT constraints").
Libjack cannot assume the client has no SIGUSR2 handler of its own.
i see..
Post by Jack O'Quin
It would be wonderful to have a reliable mechanism for debugging them.
I suppose instead of catching the signal the user might just monitor the
syslog. I'm not sure there are printk's triggered by this already, but i'm
sure if not, ingo might add them. So a trivial patch for jackd would
--- libjack/client.c.orig 2004-12-02 17:55:04.000000000 +0100
+++ libjack/client.c 2004-12-02 17:56:23.000000000 +0100
@@ -1238,6 +1238,9 @@
if (control->sync_cb)
jack_call_sync_client (client);
+ // enable atomicity check for RP kernels
+ gettimeofday(1,1);
+
if (control->process) {
if (control->process (control->nframes,
control->process_arg)
@@ -1247,7 +1250,10 @@
} else {
control->state = Finished;
}
-
+
+ // disable atomicity check
+ gettimeofday(0,1);
+
if (control->timebase_cb)
jack_call_timebase_master (client);
Well, i do get syslog output with this patch like this:

Dec 2 18:01:11 mango kernel: jack_test:22645 userspace BUG: scheduling in user-atomic context!
Dec 2 18:01:11 mango kernel: [<c02a38b6>] schedule+0x76/0x130 (8)
Dec 2 18:01:11 mango kernel: [<c02a44c5>] schedule_timeout+0x85/0xe0 (36)
Dec 2 18:01:11 mango kernel: [<c016677f>] do_pollfd+0x4f/0x90 (48)
Dec 2 18:01:11 mango kernel: [<c011ceb0>] process_timeout+0x0/0x10 (8)
Dec 2 18:01:11 mango kernel: [<c016686a>] do_poll+0xaa/0xd0 (20)
Dec 2 18:01:11 mango kernel: [<c01669e2>] sys_poll+0x152/0x230 (48)
Dec 2 18:01:11 mango kernel: [<c0165db0>] __pollwait+0x0/0xd0 (36)
Dec 2 18:01:11 mango kernel: [<c01025cb>] syscall_call+0x7/0xb (32)

even if the client's process callback is a noop (except for returning
0). Hmm, i must have missed something in jackd's source. i thought
control->process() directly calls the client's process callback..

hmm..
Jack O'Quin
2004-12-02 17:40:09 UTC
Post by Florian Schmidt
Hmm, i must have missed something in jackd's source. i thought
control->process() directly calls the client's process callback..
It does.
--
joq
Florian Schmidt
2004-12-02 20:10:11 UTC
On 02 Dec 2004 11:32:49 -0600
Post by Jack O'Quin
Post by Florian Schmidt
Hmm, i must have missed something in jackd's source. i thought
control->process() directly calls the client's process callback..
It does.
Then either i have misunderstood how it works, or the mechanism is still
buggy.. Wrote mail to Ingo. Will report to jackit-devel when i get an
answer.

Flo
Florian Schmidt
2004-12-02 21:30:20 UTC
On Thu, 2 Dec 2004 21:03:18 +0100
Post by Florian Schmidt
On 02 Dec 2004 11:32:49 -0600
Post by Jack O'Quin
Post by Florian Schmidt
Hmm, i must have missed something in jackd's source. i thought
control->process() directly calls the client's process callback..
It does.
Then either i have misunderstood how it works, or the mechanism is still
buggy.. Wrote mail to Ingo. Will report to jackit-devel when i get an
answer.
PEBCAK :) Here's the corrected patch:

--- libjack/client.c.orig 2004-12-02 17:55:04.000000000 +0100
+++ libjack/client.c 2004-12-02 22:04:06.000000000 +0100
@@ -1238,6 +1238,9 @@
if (control->sync_cb)
jack_call_sync_client (client);

+ // enable atomicity check for RP kernels
+ gettimeofday(1,1);
+
if (control->process) {
if (control->process (control->nframes,
control->process_arg)
@@ -1247,7 +1250,10 @@
} else {
control->state = Finished;
}
-
+
+ // disable atomicity check
+ gettimeofday(1,0);
+
if (control->timebase_cb)
jack_call_timebase_master (client);


seems to work well with my changed jack_test client (this one sleeps in
the 1000th call of the process callback and thus triggers this trace
and aborts as it doesn't handle SIGUSR2). test client attached.

Dec 2 22:05:10 mango kernel: jack_test:3043 userspace BUG: scheduling in user-atomic context!
Dec 2 22:05:10 mango kernel: [<c02a38b6>] schedule+0x76/0x130 (8)
Dec 2 22:05:10 mango kernel: [<c02a44c5>] schedule_timeout+0x85/0xe0 (36)
Dec 2 22:05:10 mango kernel: [<c01e2f02>] copy_from_user+0x42/0x80 (48)
Dec 2 22:05:10 mango kernel: [<c011ceb0>] process_timeout+0x0/0x10 (8)
Dec 2 22:05:10 mango kernel: [<c011cfae>] sys_nanosleep+0xde/0x170 (20)
Dec 2 22:05:10 mango kernel: [<c01025cb>] syscall_call+0x7/0xb (52)
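
The attached test client is not reproduced in this thread. A process callback that misbehaves in the way described (a sleep on the 1000th invocation, which the trace above shows entering sys_nanosleep) could be sketched as below. This is a hypothetical reconstruction: struct sleepy_state and sleepy_process are made-up names, and the signature only mirrors the shape of a JACK process callback (frame count plus user argument) without including the JACK headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Hypothetical stand-in for the attached test client. */
struct sleepy_state {
    unsigned long calls;      /* number of invocations so far */
    unsigned long sleep_on;   /* which invocation should sleep, e.g. 1000 */
    unsigned long slept;      /* how many times the callback has slept */
};

/* Same shape as a JACK process callback: frame count and user argument. */
static int sleepy_process(unsigned nframes, void *arg)
{
    struct sleepy_state *st = arg;

    (void)nframes;
    assert(st != NULL);

    if (++st->calls == st->sleep_on) {
        /* Deliberate RT violation: nanosleep() makes the thread schedule. */
        struct timespec ts = { 0, 1000000 };   /* 1 ms */
        nanosleep(&ts, NULL);
        st->slept++;
    }

    return 0;   /* zero signals success to the caller */
}
```

In a real client the function would be registered with jack_set_process_callback, and the sleep would only produce a kernel message on a kernel with the atomicity check enabled around the callback.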

flo
Bill Huey (hui)
2004-12-02 22:40:14 UTC
Post by Andrew Burgess
A big thank you to Ingo and everyone else involved on behalf of all the
linux audio users!
This is going to have a lot of ripple effects throughout the Linux
community in that game developers will quite possibly have the ability
to do low latency OpenGL, frame accurate video and other things along
those lines. The previous Linux kernels without these mods couldn't
offer this level of temporal precision to application developers. It's
going to push applications like jackd and others in ways that will
flush out bugs and general techniques that aren't typically a part of
proper RT applications.

We have the possibility of providing a first class gaming platform and
other things, that Longhorn can't begin to touch if the middleware
falls into place. :)

bill

