PolarSPARC
Building a Linux Container using Namespaces :: Part - 2
Bhaskar S | 03/15/2020
Overview
In Part - 1 of this series, we demonstrated isolation of the host name, user/group IDs, process IDs, and mount points using the UTS, User, PID, and Mount namespaces.
In this article, we continue the journey with Mount and Network namespaces. We will not explore the IPC namespace.
Hands-on with Namespaces
Mount Namespace
Next, we will mimic the above UTS, User, PID, and Mount namespace isolation using the following Go program:
Create and change to the directory $GOPATH/mount by executing the following commands in TB:
$ mkdir -p $GOPATH/mount
$ cd $GOPATH/mount
Copy the above code into the program file main.go in the current directory.
To compile the program file main.go, execute the following command in TB:
$ go build main.go
To run program main, execute the following command in TB:
$ sudo ./main
The following would be a typical output:
2020/03/14 22:05:46 Starting process ./main with args: [./main]
2020/03/14 22:05:46 Ready to run command ...
2020/03/14 22:05:46 Starting process ./main with args: [./main CLONE]
2020/03/14 22:05:46 Ready to exec container shell ...
2020/03/14 22:05:46 Chaning to /tmp directory ...
2020/03/14 22:05:46 Mounting / as private ...
2020/03/14 22:05:46 Binding rootfs/ to rootfs/ ...
2020/03/14 22:05:46 Pivot new root at rootfs/ ...
2020/03/14 22:05:46 Changing to / directory ...
2020/03/14 22:05:46 Mounting /tmp as tmpfs ...
2020/03/14 22:05:46 Mounting /proc filesystem ...
2020/03/14 22:05:46 Mounting /.old_root as private ...
2020/03/14 22:05:46 Unmount parent rootfs from /.old_root ...
->
The command prompt will change to a ->.
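The log above shows the program starting twice, the second time with a CLONE argument. A minimal sketch of how such a re-exec might be set up with Go's standard syscall package (Linux only) follows. The use of /proc/self/exe and the helper names containerAttr and reexecCommand are assumptions for illustration; this is not the listing used in this article.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"syscall"
)

// containerAttr returns process attributes that ask the kernel for new
// UTS, User, PID, and Mount namespaces, and map the given host IDs to
// root (ID 0) inside the new User namespace.
func containerAttr(uid, gid int) *syscall.SysProcAttr {
	return &syscall.SysProcAttr{
		Cloneflags: syscall.CLONE_NEWUTS | syscall.CLONE_NEWUSER |
			syscall.CLONE_NEWPID | syscall.CLONE_NEWNS,
		UidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: uid, Size: 1}},
		GidMappings: []syscall.SysProcIDMap{{ContainerID: 0, HostID: gid, Size: 1}},
	}
}

// reexecCommand builds (but does not start) a command that re-runs the
// current binary with the extra CLONE argument in the new namespaces.
// Using /proc/self/exe is an assumption made for this sketch.
func reexecCommand() *exec.Cmd {
	cmd := exec.Command("/proc/self/exe", "CLONE")
	cmd.Stdin, cmd.Stdout, cmd.Stderr = os.Stdin, os.Stdout, os.Stderr
	cmd.SysProcAttr = containerAttr(os.Getuid(), os.Getgid())
	return cmd
}

func main() {
	// Only describe the command; calling cmd.Run() would start the child.
	cmd := reexecCommand()
	fmt.Printf("would run %v with clone flags %#x\n", cmd.Args, cmd.SysProcAttr.Cloneflags)
}
```

Adding syscall.CLONE_NEWNET to the flags would also request a new Network namespace, which is the subject of the next section.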
To display the host name of the simple container, execute the following command in TB:
-> hostname
The following would be a typical output:
leopard
To display the user ID and group ID in the new namespace, execute the following command in TB:
-> id
The following would be a typical output:
uid=0(root) gid=0(root) groups=0(root)
To display all the processes in the simple container, execute the following command in TB:
-> ps -fu
The following would be a typical output:
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 0.0 0.0 4628 824 pts/1 S 22:05 0:00
root 8 0.0 0.0 37368 3368 pts/1 R+ 22:05 0:00 ps -fu
To list all the mount points in the new namespace, execute the following command in TB:
-> cat /proc/mounts | sort
The following would be a typical output:
/dev/sda1 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0
proc /proc proc rw,relatime 0 0
tmpfs /tmp tmpfs rw,relatime 0 0
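The three mount points listed above are what remains after the mount sequence the program logged at startup. A minimal sketch of that sequence using Go's syscall package (Linux only) follows. It assumes the working directory already contains the rootfs directory; the step type, the pivotSteps helper, and the explicit creation of .old_root are assumptions for illustration. The sketch only prints the plan, since running it requires root privileges inside a new Mount namespace.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"syscall"
)

// step pairs a log message with the action it describes.
type step struct {
	name string
	run  func() error
}

// pivotSteps returns the mount sequence in the order reported by the log.
// rootfs is the directory that holds the container's root filesystem.
func pivotSteps(rootfs string) []step {
	oldRoot := filepath.Join(rootfs, ".old_root")
	private := uintptr(syscall.MS_PRIVATE | syscall.MS_REC)
	return []step{
		// Stop mount events from propagating back to the parent namespace.
		{"Mounting / as private", func() error { return syscall.Mount("", "/", "", private, "") }},
		// pivot_root needs the new root to be a mount point, hence the self bind.
		{"Binding rootfs to itself", func() error {
			return syscall.Mount(rootfs, rootfs, "", syscall.MS_BIND|syscall.MS_REC, "")
		}},
		// Assumption: create the directory that will receive the old root.
		{"Creating .old_root", func() error { return os.MkdirAll(oldRoot, 0700) }},
		{"Pivot new root", func() error { return syscall.PivotRoot(rootfs, oldRoot) }},
		{"Changing to /", func() error { return os.Chdir("/") }},
		{"Mounting /tmp as tmpfs", func() error { return syscall.Mount("tmpfs", "/tmp", "tmpfs", 0, "") }},
		{"Mounting /proc", func() error { return syscall.Mount("proc", "/proc", "proc", 0, "") }},
		{"Mounting /.old_root as private", func() error {
			return syscall.Mount("", "/.old_root", "", private, "")
		}},
		// Detach the parent's root so it is no longer visible in the container.
		{"Unmounting /.old_root", func() error { return syscall.Unmount("/.old_root", syscall.MNT_DETACH) }},
	}
}

func main() {
	for i, s := range pivotSteps("rootfs") {
		fmt.Printf("%d. %s\n", i+1, s.name)
	}
}
```

To actually apply the plan, one would call each step's run function in order and stop at the first error.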
To list all the file(s) under / in the new namespace, execute the following command in TB:
-> ls -l /
The following would be a typical output:
total 68
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:24 bin
drwxr-xr-x 2 nobody nogroup 4096 Apr 24 2018 boot
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:24 dev
drwxr-xr-x 29 nobody nogroup 4096 Feb 3 20:24 etc
drwxr-xr-x 2 nobody nogroup 4096 Apr 24 2018 home
drwxr-xr-x 8 nobody nogroup 4096 May 23 2017 lib
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:23 lib64
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:23 media
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:23 mnt
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:23 opt
dr-xr-xr-x 329 root root 0 Mar 21 17:32 proc
drwx------ 2 nobody nogroup 4096 Feb 3 20:24 root
drwxr-xr-x 4 nobody nogroup 4096 Feb 3 20:23 run
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:24 sbin
drwxr-xr-x 2 nobody nogroup 4096 Feb 3 20:23 srv
drwxr-xr-x 2 nobody nogroup 4096 Apr 24 2018 sys
drwxrwxrwt 2 root root 60 Mar 21 17:32 tmp
drwxr-xr-x 10 nobody nogroup 4096 Feb 3 20:23 usr
drwxr-xr-x 11 nobody nogroup 4096 Feb 3 20:24 var
To list the properties of the file /tmp/leopard.txt in the simple container, execute the following command in TB:
-> ls -l /tmp/leopard.txt
The following would be a typical output:
-rw-r--r-- 1 root root 7 Mar 14 22:05 /tmp/leopard.txt
To list all the namespaces associated with the simple container, execute the following command in TB:
-> ls -l /proc/$$/ns
The following would be a typical output:
total 0
lrwxrwxrwx 1 root root 0 Mar 14 22:07 cgroup -> 'cgroup:[4026531835]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 ipc -> 'ipc:[4026531839]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 mnt -> 'mnt:[4026532609]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 net -> 'net:[4026531993]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 pid -> 'pid:[4026532611]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 pid_for_children -> 'pid:[4026532611]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 user -> 'user:[4026532608]'
lrwxrwxrwx 1 root root 0 Mar 14 22:07 uts -> 'uts:[4026532610]'
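Each entry above is a symbolic link whose target names the namespace type and its inode number; two processes are in the same namespace when those numbers match. A small sketch that reads the same links for the current process and splits each target into its parts follows. The helper name parseNsLink is an assumption for illustration.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"strconv"
	"strings"
)

// parseNsLink splits a namespace link target such as "mnt:[4026532609]"
// into the namespace type and its inode number.
func parseNsLink(target string) (string, uint64, error) {
	parts := strings.SplitN(target, ":[", 2)
	if len(parts) != 2 || !strings.HasSuffix(parts[1], "]") {
		return "", 0, fmt.Errorf("unexpected namespace link %q", target)
	}
	inode, err := strconv.ParseUint(strings.TrimSuffix(parts[1], "]"), 10, 64)
	if err != nil {
		return "", 0, err
	}
	return parts[0], inode, nil
}

func main() {
	links, err := filepath.Glob("/proc/self/ns/*")
	if err != nil {
		fmt.Println("glob failed:", err)
		return
	}
	for _, link := range links {
		target, err := os.Readlink(link)
		if err != nil {
			continue // skip entries that cannot be read
		}
		kind, inode, err := parseNsLink(target)
		if err != nil {
			continue // skip targets in an unexpected format
		}
		fmt.Printf("%-18s %-6s %d\n", filepath.Base(link), kind, inode)
	}
}
```

Running it once in TA and once inside the container, then comparing the numbers, shows which namespaces the container shares with the parent.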
To exit the simple container, execute the following command in TB:
-> exit
SUCCESS !!! We have demonstrated the combined UTS, User, PID, and Mount namespaces using both the unshare command and a simple Go program.
Network Namespace
Finally, let us now layer the Network namespace on top of the UTS, the User, the PID, and the Mount namespaces.
To launch a simple container whose networking as well as the mount points, the process IDs, the user/group IDs, and the host name are isolated from the parent namespace, execute the following command in TB:
$ sudo unshare -uUrpfmn --mount-proc /bin/sh
The -n option enables the Network namespace.
The command prompt will change to a #.
To list all the network interfaces in the new namespace, execute the following command in TB:
# ip link
The following would be a typical output:
1: lo: <LOOPBACK> mtu 65536 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
From the output above, we see only the loopback interface (127.0.0.1) and it is DOWN.
To bring up the loopback interface in the new namespace, execute the following command in TB:
# ip link set dev lo up
To test the loopback interface in the new namespace, execute the following command in TB:
# ping 127.0.0.1 -c3
The following would be a typical output:
PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.
64 bytes from 127.0.0.1: icmp_seq=1 ttl=64 time=0.022 ms
64 bytes from 127.0.0.1: icmp_seq=2 ttl=64 time=0.024 ms
64 bytes from 127.0.0.1: icmp_seq=3 ttl=64 time=0.020 ms
--- 127.0.0.1 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2040ms
rtt min/avg/max/mdev = 0.020/0.022/0.024/0.001 ms
We need to create a bridge network interface in the parent namespace. A bridge is a virtual network switch used to connect two or more network devices.
To create a bridge interface called br0 in the parent namespace, execute the following command in TA:
$ sudo brctl addbr br0
To list all the bridge interfaces in the parent namespace, execute the following command in TA:
$ sudo brctl show
The following would be a typical output:
bridge name     bridge id               STP enabled     interfaces
br0             8000.000000000000       no
Let us assign br0 the address 172.20.1.2. To assign an IP address to the bridge interface br0 in the parent namespace, execute the following command in TA:
$ sudo ip addr add 172.20.1.2/24 dev br0
To bring up the bridge interface br0 in the parent namespace, execute the following command in TA:
$ sudo ip link set br0 up
To list all the network interfaces in the parent namespace, execute the following command in TA:
$ ip link
The following would be a typical output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether 18:18:18:05:05:05 brd ff:ff:ff:ff:ff:ff
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 0a:ae:d0:65:21:bb brd ff:ff:ff:ff:ff:ff
One can add virtual ethernet (veth) devices to a Network namespace. They act as a tunnel between Network namespaces and are always created in pairs. Packets transmitted on one device in the pair are immediately received on the other device. One end of the pair would be in the parent namespace and the other end of the pair would be in the new namespace.
The following diagram illustrates the bridge network with the virtual ethernet pairs:
To create a veth interface pair called veth0 and veth1 in the parent namespace, execute the following command in TA:
$ sudo ip link add veth0 type veth peer name veth1
To list all the network interfaces in the parent namespace, execute the following command in TA:
$ ip link
The following would be a typical output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether 18:18:18:05:05:05 brd ff:ff:ff:ff:ff:ff
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 0a:ae:d0:65:21:bb brd ff:ff:ff:ff:ff:ff
4: veth1@veth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c6:46:7c:18:1c:ef brd ff:ff:ff:ff:ff:ff
5: veth0@veth1: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 76:3e:78:4e:9d:28 brd ff:ff:ff:ff:ff:ff
The end veth0 should be in the parent namespace, while the end veth1 should be in the new namespace.
To place the end veth1 in the new namespace, we need to identify the process ID of the command unshare.
To find and store the pid of unshare in an environment variable UPID, execute the following command in TA:
$ export UPID=$(pidof unshare)
To place the end veth1 in the new namespace, execute the following command in TA:
$ sudo ip link set veth1 netns $UPID
To list all the network interfaces in the parent namespace, execute the following command in TA:
$ ip link
The following would be a typical output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: enp5s0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc fq_codel state DOWN mode DEFAULT group default qlen 1000
    link/ether 18:18:18:05:05:05 brd ff:ff:ff:ff:ff:ff
3: br0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/ether 0a:ae:d0:65:21:bb brd ff:ff:ff:ff:ff:ff
5: veth0@if6: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 76:3e:78:4e:9d:28 brd ff:ff:ff:ff:ff:ff
To list all the network interfaces in the new namespace, execute the following command in TB:
# ip link
The following would be a typical output:
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
2: veth1@if3: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether c6:46:7c:18:1c:ef brd ff:ff:ff:ff:ff:ff
Comparing the two listings above, we see that veth1 no longer appears in the parent namespace and that the new namespace contains only lo and veth1.
To connect the end veth0 to the bridge br0 in the parent namespace, execute the following command in TA:
$ sudo ip link set veth0 master br0 up
Let us assign veth0 the address 172.20.1.3. To assign an IP address to the network interface veth0 in the parent namespace, execute the following command in TA:
$ sudo ip addr add 172.20.1.3/24 dev veth0
To bring up the network interface veth0 in the parent namespace, execute the following command in TA:
$ sudo ip link set veth0 up
Let us assign veth1 the address 172.20.1.4. To assign an IP address to the network interface veth1 in the new namespace, execute the following command in TB:
# ip addr add 172.20.1.4/24 dev veth1
To bring up the network interface veth1 in the new namespace, execute the following command in TB:
# ip link set veth1 up
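The bridge and both veth ends were given addresses with the same /24 prefix, so each side gets a directly connected route to the others. A small sketch that checks this with Go's net package follows. The helper name sameSubnet is an assumption for illustration.

```go
package main

import (
	"fmt"
	"net"
)

// sameSubnet reports whether every address falls inside the network
// described by cidr (for example "172.20.1.2/24").
func sameSubnet(cidr string, addrs ...string) (bool, error) {
	_, network, err := net.ParseCIDR(cidr)
	if err != nil {
		return false, err
	}
	for _, a := range addrs {
		ip := net.ParseIP(a)
		if ip == nil {
			return false, fmt.Errorf("invalid IP address %q", a)
		}
		if !network.Contains(ip) {
			return false, nil
		}
	}
	return true, nil
}

func main() {
	// br0, veth0, and veth1 as assigned in the steps above.
	ok, err := sameSubnet("172.20.1.2/24", "172.20.1.2", "172.20.1.3", "172.20.1.4")
	fmt.Println(ok, err)
}
```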
To test connectivity from the parent namespace to the IP address 172.20.1.4 (of the container), execute the following command in TA:
$ ping 172.20.1.4 -c3
The following would be a typical output:
PING 172.20.1.4 (172.20.1.4) 56(84) bytes of data.
64 bytes from 172.20.1.4: icmp_seq=1 ttl=64 time=0.079 ms
64 bytes from 172.20.1.4: icmp_seq=2 ttl=64 time=0.038 ms
64 bytes from 172.20.1.4: icmp_seq=3 ttl=64 time=0.040 ms
--- 172.20.1.4 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2036ms
rtt min/avg/max/mdev = 0.038/0.052/0.079/0.019 ms
Similarly, to test connectivity from the new namespace to the IP address 172.20.1.3 (of the host), execute the following command in TB:
# ping 172.20.1.3 -c3
The following would be a typical output:
PING 172.20.1.3 (172.20.1.3) 56(84) bytes of data.
64 bytes from 172.20.1.3: icmp_seq=1 ttl=64 time=0.072 ms
64 bytes from 172.20.1.3: icmp_seq=2 ttl=64 time=0.039 ms
64 bytes from 172.20.1.3: icmp_seq=3 ttl=64 time=0.044 ms
--- 172.20.1.3 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2044ms
rtt min/avg/max/mdev = 0.039/0.051/0.072/0.016 ms
YAY !!! We have successfully demonstrated a simple container by combining UTS, User, PID, Mount, and Network namespaces using the unshare command.
To clean up the bridge interface we created earlier, we need to first bring it down and then delete it.
To bring down the bridge interface br0 in the parent namespace, execute the following command in TA:
$ sudo ip link set br0 down
To delete the bridge interface br0 in the parent namespace, execute the following command in TA:
$ sudo brctl delbr br0
Next, we will mimic the above UTS, User, PID, Mount, and Network namespace isolation using the following Go program:
Create and change to the directory $GOPATH/network by executing the following commands in TB:
$ mkdir -p $GOPATH/network
$ cd $GOPATH/network
Copy the above code into the program file main.go in the current directory.
To compile the program file main.go, execute the following command in TB:
$ go build main.go
To run program main, execute the following command in TB:
$ sudo ./main
The following would be a typical output:
2020/03/14 22:17:52 Starting process ./main with args: [./main]
2020/03/14 22:17:52 Bridge br0 does not exists ...
2020/03/14 22:17:52 Creating the Bridge br0 ...
2020/03/14 22:17:52 Attaching address 172.20.1.2/24 to the Bridge br0 ...
2020/03/14 22:17:52 Activating the Bridge br0 ...
2020/03/14 22:17:52 Creating the pairs veth0 and veth1 ...
2020/03/14 22:17:52 Link br0 as master of veth0 ...
2020/03/14 22:17:52 Activating the pairs veth0 & veth1 ...
2020/03/14 22:17:52 Getting the link for pair veth0 ...
2020/03/14 22:17:52 Attaching address 172.20.1.3/24 to the pair veth0 ...
2020/03/14 22:17:52 Activating the pair veth0 ...
2020/03/14 22:17:52 Ready to run command ...
2020/03/14 22:17:52 Getting the link for pair veth1 ...
2020/03/14 22:17:52 Namespacing the pair veth1 with pid 20367 ...
2020/03/14 22:17:52 Starting process ./main with args: [./main CLONE]
2020/03/14 22:17:52 Ready to exec container shell ...
2020/03/14 22:17:52 Chaning to /tmp directory ...
2020/03/14 22:17:52 Mounting / as private ...
2020/03/14 22:17:52 Binding rootfs/ to rootfs/ ...
2020/03/14 22:17:52 Pivot new root at rootfs/ ...
2020/03/14 22:17:52 Changing to / directory ...
2020/03/14 22:17:52 Mounting /tmp as tmpfs ...
2020/03/14 22:17:52 Mounting /proc filesystem ...
2020/03/14 22:17:52 Mounting /.old_root as private ...
2020/03/14 22:17:52 Unmount parent rootfs from /.old_root ...
2020/03/14 22:17:52 Getting the link for pair lo ...
2020/03/14 22:17:52 Activating lo ...
2020/03/14 22:17:52 Getting the link for pair veth1 ...
2020/03/14 22:17:52 Attaching address 172.20.1.4/24 to the pair veth1 ...
2020/03/14 22:17:52 Activating the pair veth1 ...
->
The command prompt will change to a ->.
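The first half of the log above covers the host-side network setup: create and activate the bridge, create the veth pair, attach veth0 to the bridge, and move veth1 into the child's namespace by PID. A minimal sketch that lists the equivalent ip commands follows. It assumes the steps are driven through the ip command rather than a netlink library, and the helper name hostNetCommands is hypothetical; how the article's program performs these steps is not shown here.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// hostNetCommands lists the ip(8) invocations that mirror the host-side
// steps in the log. pid is the process whose Network namespace receives
// the peer end of the veth pair.
func hostNetCommands(bridge, bridgeCIDR, hostEnd, hostCIDR, peerEnd string, pid int) [][]string {
	return [][]string{
		{"ip", "link", "add", "name", bridge, "type", "bridge"},
		{"ip", "addr", "add", bridgeCIDR, "dev", bridge},
		{"ip", "link", "set", bridge, "up"},
		{"ip", "link", "add", hostEnd, "type", "veth", "peer", "name", peerEnd},
		{"ip", "link", "set", hostEnd, "master", bridge},
		{"ip", "addr", "add", hostCIDR, "dev", hostEnd},
		{"ip", "link", "set", hostEnd, "up"},
		{"ip", "link", "set", peerEnd, "netns", strconv.Itoa(pid)},
	}
}

func main() {
	// 20367 is the child PID from the sample log; it will differ on each run.
	cmds := hostNetCommands("br0", "172.20.1.2/24", "veth0", "172.20.1.3/24", "veth1", 20367)
	for _, c := range cmds {
		fmt.Println(strings.Join(c, " "))
	}
}
```

Each entry could be handed to exec.Command(c[0], c[1:]...) and run as root; the sketch only prints the commands so that it is safe to run anywhere.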
To list all the network interfaces in the parent namespace, execute the following command in TA:
$ cat /proc/self/net/dev
The following would be a typical output:
Inter-| Receive | Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
enp5s0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
docker0: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
lo: 471708 4702 0 0 0 0 0 0 471708 4702 0 0 0 0 0 0
veth0: 936 12 0 0 0 0 0 0 27370 162 0 0 0 0 0 0
br0: 768 12 0 0 0 0 0 0 17220 106 0 0 0 0 0 0
To list all the network interfaces in the new namespace, execute the following command in TB:
-> cat /proc/self/net/dev
The following would be a typical output:
Inter-| Receive | Transmit
 face |bytes packets errs drop fifo frame compressed multicast|bytes packets errs drop fifo colls carrier compressed
lo: 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
veth1: 20994 126 0 0 0 0 0 0 796 10 0 0 0 0 0 0
Comparing the two listings above, we see that the parent namespace shows enp5s0, docker0, lo, veth0, and br0, while the new namespace shows only lo and veth1.
To test connectivity from the parent namespace to the IP address 172.20.1.4 (of the container), execute the following command in TA:
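The file /proc/self/net/dev reflects the Network namespace of the process that reads it, which is why the same command gives different listings in TA and TB. A small sketch that extracts just the interface names from that file follows. The helper name interfaceNames is an assumption for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// interfaceNames extracts interface names from /proc/net/dev style text.
// Header lines contain a '|' separator; every other line has the form
// "name: counters...".
func interfaceNames(content string) []string {
	var names []string
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		text := scanner.Text()
		if strings.Contains(text, "|") {
			continue // skip the two header lines
		}
		if i := strings.Index(text, ":"); i > 0 {
			names = append(names, strings.TrimSpace(text[:i]))
		}
	}
	return names
}

func main() {
	data, err := os.ReadFile("/proc/self/net/dev")
	if err != nil {
		fmt.Println("read failed:", err)
		return
	}
	fmt.Println(interfaceNames(string(data)))
}
```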
$ ping 172.20.1.4 -c3
The following would be a typical output:
PING 172.20.1.4 (172.20.1.4) 56(84) bytes of data.
64 bytes from 172.20.1.4: icmp_seq=1 ttl=64 time=0.101 ms
64 bytes from 172.20.1.4: icmp_seq=2 ttl=64 time=0.044 ms
64 bytes from 172.20.1.4: icmp_seq=3 ttl=64 time=0.052 ms
--- 172.20.1.4 ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2041ms
rtt min/avg/max/mdev = 0.044/0.065/0.101/0.026 ms
Note that the new namespace is running a minimalistic Ubuntu Base image that has no ping command, so we cannot check the connectivity back to the parent namespace from inside the container.
To exit the simple container, execute the following command in TB:
-> exit
VOILA !!! We have successfully demonstrated a simple container by combining UTS, User, PID, Mount, and Network namespaces using a simple Go program.
References