Hello,
for years now I have used openvswitch as the base technology for virtualization (qemu, lxc, ...).
I love being able to set up internal network cards with various attributes such as VLAN, bonding, ...
With newer openvswitch versions a serious problem has come up with this approach: the newer the openvswitch version, the more often (statistically) things fail when a predefined OVS internal network card (i.e. one set up after boot) is used as an lxc "phys" network card.
If I use real physical network cards (e.g. enp3s0 + enp4s0) instead of the OVS internal NICs, everything works as expected.
To reproduce the problem easily, do the following:
Boot the lubuntu-noble-24.04.2 ISO image with network access (beta versions of Ubuntu questing behave the same way) and log in.
Install openvswitch + lxc
apt update && apt upgrade -y
apt install lxc1 lxcfs openvswitch-switch
Create two bridges (lxc-br0, lxc-br1), each with one internal device (lxc-lan0, lxc-lan1)
ovs-vsctl add-br lxc-br0
ovs-vsctl add-br lxc-br1
ovs-vsctl add-port lxc-br0 lxc-lan0 -- set interface lxc-lan0 type=internal
ovs-vsctl add-port lxc-br1 lxc-lan1 -- set interface lxc-lan1 type=internal
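Before going on, it can help to confirm that both ports really were created as OVS internal ports. The following is only a minimal sketch: it assumes that "ovs-vsctl get interface <port> type" prints the bare word internal for such a port. The helper name port_is_internal and the OVS_VSCTL override variable are made up for illustration and are not part of openvswitch.

```shell
# Hypothetical helper: succeeds if the given OVS interface has type "internal".
# OVS_VSCTL is an illustrative override; by default the real ovs-vsctl is used.
port_is_internal() {
    port_type=$("${OVS_VSCTL:-ovs-vsctl}" get interface "$1" type) || return 1
    [ "$port_type" = "internal" ]
}

# Example usage (run as root on the test host):
#   for p in lxc-lan0 lxc-lan1; do
#       port_is_internal "$p" && echo "$p: internal" || echo "$p: NOT internal"
#   done
```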
Create a new container under /lxc/noble using the download template
mkdir /lxc
cd /lxc
lxc-create -n noble -P /lxc -t download
cd /var/lib/lxc
ln -s /lxc/noble noble
Edit /lxc/noble/config and replace the network setup as follows
lxc.net.0.type = phys
lxc.net.0.link = lxc-lan0
lxc.net.0.flags = up
lxc.net.1.type = phys
lxc.net.1.link = lxc-lan1
lxc.net.1.flags = up
Then start the container with lxc-start noble to trigger the error. For example:
rm -f /tmp/lxc_trace; lxc-start noble -l trace -o /tmp/lxc_trace
=> What happens is that lxc renames lxc-lan0 and lxc-lan1, moves the renamed interfaces into the lxc network namespace, and then informs the client (running in the lxc namespace) about these new devices.
The client is often unable to find the renamed interface!
The error is intermittent: when starting the container several times, it sometimes just works.
If you "grep network /tmp/lxc_trace" you can easily see the problem ...
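To put a number on "sometimes", the start can be repeated in a loop and the failures counted. This is a rough sketch, not a tested procedure: it assumes lxc-start returns a non-zero exit status when the container fails to start, and that lxc-stop -k hands the phys devices back to the host between runs. The function names count_failures and start_once are hypothetical.

```shell
# count_failures RUNS CMD [ARGS...]
# Runs CMD the given number of times and prints how many runs exited non-zero.
count_failures() {
    runs=$1
    shift
    failures=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}

# One start/stop cycle of the test container (assumption: a failed start
# is reported through the exit status of lxc-start).
start_once() {
    lxc-start -n noble -l trace -o /tmp/lxc_trace || return 1
    lxc-stop -n noble -k
}

# Example usage (as root):
#   echo "failed starts out of 20: $(count_failures 20 start_once)"
```

Since the log file is reused, /tmp/lxc_trace should be saved or inspected after a failing run if the trace is needed.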
Since it works with real Ethernet cards, I think there may be a timing / detection problem while searching
for the renamed card in the lxc network namespace?
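When repeating the test, one thing worth ruling out is that a device has simply not yet returned to the host namespace from the previous run. The helper below is only a sketch of such a check: it polls /sys/class/net for a named interface until a timeout (in whole seconds) expires. The name wait_for_iface and the SYSFS_NET variable are hypothetical and exist only for this example.

```shell
# Directory that lists the network interfaces of the current namespace.
# SYSFS_NET is an illustrative override, mainly useful for testing.
SYSFS_NET=${SYSFS_NET:-/sys/class/net}

# wait_for_iface NAME TIMEOUT_SECONDS
# Returns 0 as soon as the interface exists, 1 if the timeout expires first.
wait_for_iface() {
    name=$1
    timeout=$2
    elapsed=0
    while [ ! -e "$SYSFS_NET/$name" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    return 0
}

# Example usage on the host, before the next lxc-start:
#   wait_for_iface lxc-lan0 5 || echo "lxc-lan0 did not come back within 5s"
```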
With older releases (2.x) the problem rarely happens.
In real life I use a clustered hierarchical layout (physical server, QEMU VMs forming an HA cluster, lxc as an HA resource),
but the test shown above is a simpler way to reproduce the problem.