Salvatore Bonaccorso
2024-02-07 10:50:02 UTC
Hi Valentin, hi all
[This is about a regression reported in Debian for 6.1.76]
Package: linux-image-amd64
Version: 6.1.76+1
Source: linux
Source-Version: 6.1.76+1
Severity: important
Control: notfound -1 6.6.15-2
Dear Maintainers,
We discovered a bug affecting dlm that prevents any TCP communication by
dlm when booted with Debian kernel 6.1.76-1.
Dlm startup works (corosync-cpgtool shows the dlm:controld group with all
```
dlm: Using TCP for communications
dlm: cannot start dlm midcomms -97
```
It seems that commit "dlm: use kernel_connect() and kernel_bind()"
(e9cdebbe) was merged to 6.1.
Checking the code, it seems that the changed function dlm_tcp_listen_bind()
fails with error code 97 (EAFNOSUPPORT).
It is called from
dlm/lockspace.c: threads_start() -> dlm_midcomms_start()
dlm/midcomms.c: dlm_midcomms_start() -> dlm_lowcomms_start()
dlm/lowcomms.c: dlm_lowcomms_start() -> dlm_listen_for_all() ->
dlm_proto_ops->listen_bind() = dlm_tcp_listen_bind()
The error code is returned all the way to threads_start(), where the error
message is emitted.
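For reference, a minimal sketch (userspace Python, not part of dlm or the kernel) of how the trailing -97 in the log line above can be mapped back to a symbolic errno name. The helper name decode_kernel_error is hypothetical, and the mapping of 97 to EAFNOSUPPORT assumes Linux errno numbering; other platforms number their errno values differently.

```python
import errno
import os
import re


def decode_kernel_error(log_line):
    """Decode a trailing negative errno (e.g. "... -97") from a log line.

    Returns (code, symbolic_name, description), or None if the line does
    not end in a negative number. Names come from the local platform's
    errno table, so results are only meaningful on Linux for kernel logs.
    """
    match = re.search(r"-(\d+)\s*$", log_line)
    if match is None:
        return None
    code = int(match.group(1))
    name = errno.errorcode.get(code, "UNKNOWN")
    return code, name, os.strerror(code)


if __name__ == "__main__":
    # Log line taken from the report above.
    print(decode_kernel_error("dlm: cannot start dlm midcomms -97"))
```

Run on a Linux machine, this should report the same symbolic name as quoted above (EAFNOSUPPORT), which is a quick way to double-check an error number seen in dmesg.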
Booting with the unsigned kernel from testing (6.6.15-2), which also
contains this commit, works without issues.
I'm not sure what additional changes are required to get this working or if
rolling back this change is an option.
We'd be happy to test patches that might fix this issue.
Thanks for your report. So we have a 6.1.76 specific regression for
the backport of e9cdebbe23f1 ("dlm: use kernel_connect() and
kernel_bind()").
Let's loop in the upstream regression list for tracking, and the people
involved in the subsystem, to see if the issue can be identified. As
it works on 6.6.15, which includes the backported commit as well, it
may well be that a prerequisite is missing.
# annotate regression with 6.1.y specific commit
#regzbot ^introduced e11dea8f503341507018b60906c4a9e7332f3663
#regzbot link: https://bugs.debian.org/1063338
Any ideas?
Regards,
Salvatore