
PPPoE vlan_mon doesn't work on Debian 10 and Ubuntu 18.04
Closed, Resolved | Public | BUG

Description

Steps to reproduce:

  1. check that vlan_mon kernel module is inserted (lsmod | grep vlan_mon)
  2. create veth interface: ip link add type veth && ifconfig veth0 up
  3. run accel-pppd with the configuration below:
[modules]
pppoe

[log]
log-debug=/dev/stdout
level=5

[cli]
tcp=127.0.0.1:2001

[pppoe]
ac-name=test-accel
vlan-mon=veth0,10-20
interface=re:veth0.\d+
  4. do nothing and wait for 10-60 seconds
  5. check accel-pppd status:
$ ps aux | grep accel-pppd
root     11346  0.0  0.3   8696  3608 pts/0    S+   19:08   0:00 accel-pppd -c 123.conf
root     11347  0.0  0.0      0     0 pts/0    Zl+  19:08   0:00 [accel-pppd] <defunct>

(PID 11347 is a zombie)

  6. check dmesg:
[    0.000000] Linux version 4.19.0-21-amd64 (debian-kernel@lists.debian.org) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.249-2 (2022-06-30)
....
[ 1344.704891] BUG: unable to handle kernel paging request at 00003fa0c121d8e0
[ 1344.706450] PGD 0 P4D 0
[ 1344.707298] Oops: 0000 [#1] SMP NOPTI
[ 1344.708188] CPU: 0 PID: 11347 Comm: accel-pppd Tainted: G           OE     4.19.0-21-amd64 #1 Debian 4.19.249-2
[ 1344.710099] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[ 1344.711770] RIP: 0010:vlan_mon_nl_cmd_add_vlan_mon+0x8f/0x2d0 [vlan_mon]
[ 1344.712347] Code: 85 d5 49 89 c5 48 85 c0 0f 84 2a 02 00 00 48 c7 c7 10 80 56 c0 e8 d1 4e 36 d5 49 8b ad f0 04 00 00 48 85 ed 0f 84 4a 01 00 00 <81> 7d 003
[ 1344.713606] RSP: 0018:ffffbd8680443ad8 EFLAGS: 00000202
[ 1344.714064] RAX: 0000000000000286 RBX: ffffbd8680443b50 RCX: 0000000000008863
[ 1344.714664] RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
[ 1344.715307] RBP: 00003fa0c121d8e0 R08: ffffffffc0567280 R09: 0000000000000000
[ 1344.715893] R10: ffff9de5bb611b80 R11: 0000000000000000 R12: 0000000000000001
[ 1344.716532] R13: ffff9de5b808d000 R14: 0000000000000003 R15: ffff9de5bb611b80
[ 1344.717191] FS:  00007f83a390acc0(0000) GS:ffff9de5bea00000(0000) knlGS:0000000000000000
[ 1344.717873] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1344.718421] CR2: 00003fa0c121d8e0 CR3: 000000003b676000 CR4: 00000000000006f0
[ 1344.719218] Call Trace:
[ 1344.720516]  genl_family_rcv_msg+0x1ca/0x3a0
[ 1344.720995]  ? skb_queue_tail+0x1b/0x50
[ 1344.721355]  ? __netlink_sendskb+0x3d/0x50
[ 1344.721721]  genl_rcv_msg+0x47/0x90
[ 1344.722042]  ? __kmalloc_node_track_caller+0x1dd/0x2a0
[ 1344.722514]  ? genl_family_rcv_msg+0x3a0/0x3a0
[ 1344.722937]  netlink_rcv_skb+0x4c/0x120
[ 1344.723327]  genl_rcv+0x24/0x40
[ 1344.723601]  netlink_unicast+0x181/0x210
[ 1344.723915]  netlink_sendmsg+0x20b/0x3f0
[ 1344.724331]  sock_sendmsg+0x36/0x40
[ 1344.724624]  ___sys_sendmsg+0x295/0x2f0
[ 1344.725012]  ? dev_get_by_name_rcu+0x73/0x90
[ 1344.725409]  ? kmem_cache_alloc_trace+0x15e/0x1e0
[ 1344.725797]  ? netlink_hash+0x29/0xa0
[ 1344.726122]  ? __wake_up_common_lock+0x89/0xc0
[ 1344.726536]  ? __check_object_size+0x46/0x180
[ 1344.726899]  ? _copy_to_user+0x26/0x30
[ 1344.727268]  ? move_addr_to_user+0xae/0xd0
[ 1344.727733]  __sys_sendmsg+0x57/0xa0
[ 1344.728065]  do_syscall_64+0x53/0x110
[ 1344.732564]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 1344.736084] RIP: 0033:0x7f83a3ed8467
[ 1344.739289] Code: 44 00 00 41 54 41 89 d4 55 48 89 f5 53 89 fb 48 83 ec 10 e8 3b ed ff ff 44 89 e2 48 89 ee 89 df 41 89 c0 b8 2e 00 00 00 0f 05 <48> 3d 008
[ 1344.746670] RSP: 002b:00007ffcf9ba40b0 EFLAGS: 00000293 ORIG_RAX: 000000000000002e
[ 1344.750207] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007f83a3ed8467
[ 1344.753758] RDX: 0000000000000000 RSI: 00007ffcf9ba8110 RDI: 0000000000000013
[ 1344.757297] RBP: 00007ffcf9ba8110 R08: 0000000000000000 R09: 0000000000000000
[ 1344.760836] R10: 000055d80e2f3540 R11: 0000000000000293 R12: 0000000000000000
[ 1344.764176] R13: 00007ffcf9ba8dd0 R14: 0000000000000000 R15: 0000000000000000
[ 1344.767585] Modules linked in: veth pppoe pppox ppp_generic slhc vlan_mon(OE) nls_ascii nls_cp437 vfat fat kvm_amd ppdev ccp rng_core bochs_drm kvm ttm dry
[ 1344.783379] CR2: 00003fa0c121d8e0
[ 1344.787353] ---[ end trace 278354700b21babc ]---
[ 1344.791197] RIP: 0010:vlan_mon_nl_cmd_add_vlan_mon+0x8f/0x2d0 [vlan_mon]
[ 1344.795315] Code: 85 d5 49 89 c5 48 85 c0 0f 84 2a 02 00 00 48 c7 c7 10 80 56 c0 e8 d1 4e 36 d5 49 8b ad f0 04 00 00 48 85 ed 0f 84 4a 01 00 00 <81> 7d 003
[ 1344.802960] RSP: 0018:ffffbd8680443ad8 EFLAGS: 00000202
[ 1344.806489] RAX: 0000000000000286 RBX: ffffbd8680443b50 RCX: 0000000000008863
[ 1344.810492] RDX: 0000000000000000 RSI: 0000000000000286 RDI: 0000000000000286
[ 1344.814375] RBP: 00003fa0c121d8e0 R08: ffffffffc0567280 R09: 0000000000000000
[ 1344.818214] R10: ffff9de5bb611b80 R11: 0000000000000000 R12: 0000000000000001
[ 1344.821895] R13: ffff9de5b808d000 R14: 0000000000000003 R15: ffff9de5bb611b80
[ 1344.825588] FS:  00007f83a390acc0(0000) GS:ffff9de5bea00000(0000) knlGS:0000000000000000
[ 1344.829451] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1344.833111] CR2: 00003fa0c121d8e0 CR3: 000000003b676000 CR4: 00000000000006f0
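
The last two checks above (process status and dmesg) could be automated along the following lines. This is only a minimal sketch in Python, not part of accel-ppp: the function names `find_zombies` and `parse_oops` are hypothetical, and the code assumes the `ps aux` column layout and the x86-64 oops format shown in this report.

```python
import re

# Kernel-mode RIP line of an x86-64 oops (segment 0010), for example:
# "RIP: 0010:vlan_mon_nl_cmd_add_vlan_mon+0x8f/0x2d0 [vlan_mon]"
RIP_RE = re.compile(
    r"RIP: 0010:(?P<symbol>[\w.]+)\+0x(?P<offset>[0-9a-f]+)/0x(?P<size>[0-9a-f]+)"
    r"(?: \[(?P<module>\w+)\])?"
)
# CR2 holds the address whose access caused the page fault.
CR2_RE = re.compile(r"CR2: (?P<addr>[0-9a-f]{16})")


def find_zombies(ps_output, name="accel-pppd"):
    """Return PIDs of processes matching `name` whose STAT starts with 'Z'.

    Assumes the 11-column `ps aux` layout:
    USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
    """
    pids = []
    for line in ps_output.splitlines():
        fields = line.split(None, 10)
        if len(fields) < 11 or not fields[1].isdigit():
            continue  # header line or something unexpected
        if fields[7].startswith("Z") and name in fields[10]:
            pids.append(int(fields[1]))
    return pids


def parse_oops(dmesg_text):
    """Extract the faulting symbol, offset, module and CR2 from an oops.

    Returns None when no kernel-mode RIP line is found.
    """
    rip = RIP_RE.search(dmesg_text)
    if rip is None:
        return None
    cr2 = CR2_RE.search(dmesg_text)
    return {
        "symbol": rip.group("symbol"),
        "offset": int(rip.group("offset"), 16),
        "size": int(rip.group("size"), 16),
        "module": rip.group("module"),
        "fault_addr": int(cr2.group("addr"), 16) if cr2 else None,
    }


if __name__ == "__main__":
    # Sample lines copied from the report above.
    ps_sample = (
        "root     11346  0.0  0.3   8696  3608 pts/0    S+   19:08   0:00 accel-pppd -c 123.conf\n"
        "root     11347  0.0  0.0      0     0 pts/0    Zl+  19:08   0:00 [accel-pppd] <defunct>\n"
    )
    dmesg_sample = (
        "[ 1344.711770] RIP: 0010:vlan_mon_nl_cmd_add_vlan_mon+0x8f/0x2d0 [vlan_mon]\n"
        "[ 1344.718421] CR2: 00003fa0c121d8e0 CR3: 000000003b676000 CR4: 00000000000006f0\n"
    )
    print(find_zombies(ps_sample))
    print(parse_oops(dmesg_sample))
```

For the log in this report, the sketch identifies `vlan_mon_nl_cmd_add_vlan_mon` in module `vlan_mon` as the faulting function. The CR2 value it extracts is the same as the RBP value in the register dump.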

This bug does not occur on Ubuntu 20.04, Ubuntu 22.04, Debian 11 or Debian 12 (default generic kernels).

This looks like a kernel-related bug, but it can most probably be fixed in the accel-ppp vlan_mon code. Moreover, the Debian 10 EOL date is June 30th, 2024.

Details

Protocol: General
Version: 38d96b8e20608fb743d543fe3f08ad4b9d1dcd66

Event Timeline

Attaching the full log from GitHub CI (the same issue appears at the end of the log).

I can confirm that kernel 5.10.0-0.deb10.24-amd64 fixes the issue on Debian 10.

Dimka88 claimed this task.

Summary:
The issue appears with veth and vlan_mon only on kernel 4.19. With a bridge instead of veth, everything works properly. Since Debian 10 support expired about a year ago, we decided to remove the Debian 10 tests from the GitHub tests.