Understand Docker and Iptables rules

25 Jul 2017

I spent some time trying to understand how iptables and Docker work together. Here I started two containers:

docker run -it -d -p 1000:1000 sshd
docker run -it -d -p 1002:1000 sshd
[root@maddog maddog]# docker ps
CONTAINER ID        IMAGE               COMMAND               CREATED             STATUS              PORTS                            NAMES
2b7715682ad1        sshd                "/usr/sbin/sshd -D"   6 hours ago         Up 6 hours          22/tcp, 0.0.0.0:1002->1000/tcp   angry_mcclintock
a1133084c72d        sshd                "/usr/sbin/sshd -D"   6 hours ago         Up 6 hours          22/tcp, 0.0.0.0:1000->1000/tcp   grave_mayer  

Then I can use iptables-save to view all the iptables rules

[root@maddog maddog]# iptables-save
# Generated by iptables-save v1.4.7 on Fri Jul 21 18:17:36 2017
*filter
:INPUT ACCEPT [73189:3711800]
:FORWARD ACCEPT [0:0]
:OUTPUT ACCEPT [72620:4043791]
:DOCKER - [0:0]
-A FORWARD -o docker0 -j DOCKER
-A FORWARD -o docker0 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -i docker0 ! -o docker0 -j ACCEPT
-A FORWARD -i docker0 -o docker0 -j ACCEPT
-A DOCKER -d 172.17.0.1/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 1000 -j ACCEPT
-A DOCKER -d 172.17.0.3/32 ! -i docker0 -o docker0 -p tcp -m tcp --dport 1000 -j ACCEPT
COMMIT
# Completed on Fri Jul 21 18:17:36 2017
# Generated by iptables-save v1.4.7 on Fri Jul 21 18:17:36 2017
*nat
:PREROUTING ACCEPT [0:0]
:POSTROUTING ACCEPT [35350:2121772]
:OUTPUT ACCEPT [35350:2121772]
:DOCKER - [0:0]
-A PREROUTING -m addrtype --dst-type LOCAL -j DOCKER
-A POSTROUTING -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE
-A POSTROUTING -s 172.17.0.1/32 -d 172.17.0.1/32 -p tcp -m tcp --dport 1000 -j MASQUERADE
-A POSTROUTING -s 172.17.0.3/32 -d 172.17.0.3/32 -p tcp -m tcp --dport 1000 -j MASQUERADE
-A OUTPUT ! -d 127.0.0.0/8 -m addrtype --dst-type LOCAL -j DOCKER
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 1000 -j DNAT --to-destination 172.17.0.1:1000
-A DOCKER ! -i docker0 -p tcp -m tcp --dport 1002 -j DNAT --to-destination 172.17.0.3:1000
COMMIT
# Completed on Fri Jul 21 18:17:36 2017

This is the routing table on the server

[root@maddog ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
10.0.2.0        0.0.0.0         255.255.255.0   U     0      0        0 eth0
172.17.0.0      0.0.0.0         255.255.0.0     U     0      0        0 docker0
0.0.0.0         10.0.2.2        0.0.0.0         UG    0      0        0 eth0

We can refer to the image below to understand how packets traverse the iptables tables.

Let’s see how packets go through the iptables rules that Docker created.

outbound

When a packet from inside the container wants to reach the outside Internet, say 8.8.8.8:53, you might think it is just an outgoing packet, so rule matching starts at the OUTPUT chain. In fact it does not. From the host's point of view, the IP address 172.17.0.1 is not configured locally, so the packet is considered to have originated from outside, and rule matching follows PREROUTING -> FORWARD -> POSTROUTING.
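
The distinction above can be sketched as a small Python model. This is a simplified illustration, not how the kernel is implemented: the function name chain_path and the set HOST_ADDRESSES are hypothetical, and 10.0.2.15 is assumed to be the host's eth0 address (the example address used later in this post).

```python
from typing import List, Optional

# Assumption: addresses configured on the host's own interfaces.
HOST_ADDRESSES = {"127.0.0.1", "10.0.2.15"}


def chain_path(in_iface: Optional[str], dst: str) -> List[str]:
    """Return the chains a packet visits in this simplified model.

    in_iface is None for a packet generated by a process on the host itself;
    otherwise it is the interface the packet arrived on (e.g. "docker0").
    """
    if in_iface is None:
        # Locally generated traffic starts at OUTPUT.
        return ["OUTPUT", "POSTROUTING"]
    if dst in HOST_ADDRESSES:
        # Arrived on an interface and addressed to the host itself.
        return ["PREROUTING", "INPUT"]
    # Arrived on an interface and addressed elsewhere: it is forwarded.
    return ["PREROUTING", "FORWARD", "POSTROUTING"]


# A container packet arrives on docker0, so it is forwarded, not "output".
print(chain_path("docker0", "8.8.8.8"))
```

The point of the model is that container traffic enters the host through docker0, so it takes the forwarding path even though the container runs on the same machine.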

We can insert a LOG rule into iptables to confirm that those packets pass through the FORWARD chain

[root@maddog woosley.github.io]# iptables -I FORWARD -m limit --limit 2/min -j LOG
[root@maddog woosley.github.io]# iptables -L
Chain INPUT (policy ACCEPT)
target     prot opt source               destination

Chain FORWARD (policy ACCEPT)
target     prot opt source               destination
LOG        all  --  anywhere             anywhere            limit: avg 2/min burst 5 LOG level warning
DOCKER     all  --  anywhere             anywhere
.....

If you attach to a container and then ping an external host from inside it, you can see similar logs in /var/log/messages

Jul 25 03:54:35 maddog kernel: IN=docker0 OUT=eth0 PHYSIN=veth57a730b SRC=172.17.0.2 DST=10.0.2.3 LEN=58 TOS=0x00 PREC=0x00 TTL=63 ID=28069 DF PROTO=UDP SPT=60692 DPT=53 LEN=38
Jul 25 03:54:35 maddog kernel: IN=eth0 OUT=docker0 SRC=10.0.2.3 DST=172.17.0.2 LEN=74 TOS=0x10 PREC=0x00 TTL=63 ID=3655 PROTO=UDP SPT=53 DPT=60692 LEN=54 

These logs confirm the theory above. So when we ping an external host from inside the container:

  • The packet comes in on the docker0 interface.
  • In PREROUTING, its destination is not a LOCAL address, so it continues to the routing decision.
  • The routing decision determines the source address and outgoing interface to use. Here the source address is 172.17.0.1:33272 (33272 is a random port chosen by the OS), and the outgoing interface is eth0 because the packet targets an external address, which matches the last entry in the routing table.
  • In the FORWARD chain, it matches the rule -i docker0 ! -o docker0 -j ACCEPT, so the packet passes the FORWARD chain.
  • Next it reaches the first POSTROUTING rule: -s 172.17.0.0/16 ! -o docker0 -j MASQUERADE. MASQUERADE is a dynamic source NAT that rewrites the packet's source IP address and port to those of the outgoing interface, say 10.0.2.15:33273. The packet is then sent to the Internet through eth0.
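
The routing decision and the MASQUERADE step above can be sketched in Python. This is a minimal model under stated assumptions: the routes are copied from the route -n output in this post, the function names route and masquerade are hypothetical, and port rewriting is left out to keep the sketch short.

```python
import ipaddress

# Routes copied from the `route -n` output in this post: (network, interface).
ROUTES = [
    (ipaddress.ip_network("10.0.2.0/24"), "eth0"),
    (ipaddress.ip_network("172.17.0.0/16"), "docker0"),
    (ipaddress.ip_network("0.0.0.0/0"), "eth0"),  # default route via 10.0.2.2
]

DOCKER_NET = ipaddress.ip_network("172.17.0.0/16")


def route(dst: str) -> str:
    """Pick the outgoing interface by longest-prefix match."""
    addr = ipaddress.ip_address(dst)
    matches = [(net, iface) for net, iface in ROUTES if addr in net]
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def masquerade(src: str, out_iface: str, iface_ip: str) -> str:
    """Model `-s 172.17.0.0/16 ! -o docker0 -j MASQUERADE` (addresses only).

    iface_ip is the address of the outgoing interface (an assumption here).
    """
    if ipaddress.ip_address(src) in DOCKER_NET and out_iface != "docker0":
        return iface_ip  # source is rewritten to the outgoing interface's IP
    return src  # rule does not match, source is left unchanged


out_iface = route("8.8.8.8")
print(out_iface, masquerade("172.17.0.1", out_iface, "10.0.2.15"))
```

Traffic between two containers leaves through docker0, so the rule does not match and the source address stays as it is.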

When the return packet from outside reaches eth0, this is what happens:

  • Because the outgoing packet was masqueraded, the translation is reversed automatically (a DNAT) when the reply arrives on eth0, so the source/destination is now, say, 8.8.8.8:53 / 172.17.0.1:33272.
  • Based on the routing table, the packet's outgoing interface is docker0. Even though the container is running on the local server, the destination is not considered local because the IP address is not on any local interface, so the packet goes to the FORWARD chain.
  • It matches the first FORWARD rule and is sent to the DOCKER chain.
  • Nothing in the DOCKER chain matches, so the packet returns to FORWARD, where the next rule (--ctstate RELATED,ESTABLISHED) matches because the packet belongs to an already established connection. The packet is accepted.
  • It then goes through routing again and the POSTROUTING rules, but nothing matches. The server sends the packet out on the docker0 interface, which is an Ethernet bridge to the container's interface.
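
The automatic reverse translation for the reply can be illustrated with a toy connection-tracking table in Python. This is only a sketch of the idea, not the kernel's conntrack implementation: the names conntrack, masquerade_outbound and translate_reply are hypothetical, and the addresses and ports are the example values used above.

```python
from typing import Dict, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# Toy table: (remote endpoint, translated source) -> original source.
conntrack: Dict[Tuple[Endpoint, Endpoint], Endpoint] = {}


def masquerade_outbound(src: Endpoint, dst: Endpoint, public: Endpoint):
    """Rewrite the source of an outbound packet and remember the mapping."""
    conntrack[(dst, public)] = src
    return public, dst


def translate_reply(src: Endpoint, dst: Endpoint):
    """Restore the original destination of a reply, if the flow is known."""
    original = conntrack.get((src, dst))
    if original is None:
        return src, dst  # unknown flow: nothing to reverse
    return src, original


# Outbound: container -> 8.8.8.8:53, masqueraded as 10.0.2.15:33273.
masquerade_outbound(("172.17.0.1", 33272), ("8.8.8.8", 53), ("10.0.2.15", 33273))
# Reply: its destination is rewritten back to the container's address.
print(translate_reply(("8.8.8.8", 53), ("10.0.2.15", 33273)))
```

The mapping is recorded once, when the first outbound packet is translated, and replies are matched against it rather than against the nat rules.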

inbound

Now consider an inbound packet whose destination is 10.0.2.15:1002, which means it is meant to reach 172.17.0.3:1000.

  • It hits the first PREROUTING rule. Its destination is a local address, so it is sent to the DOCKER chain.
  • In the DOCKER chain, DNAT is applied, and its destination address becomes 172.17.0.3:1000.
  • In the routing decision, the outgoing interface is now docker0, so the packet goes to the FORWARD chain.
  • The first rule in the FORWARD chain sends the packet to the DOCKER chain.
  • The DOCKER chain accepts the packet.
  • The packet then goes through POSTROUTING unchanged and is sent to the Docker container.
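
The inbound steps above can be sketched in Python as two lookups. The rule data is copied from the iptables-save output in this post; the names dnat and docker_filter_accepts are hypothetical, and the model covers only the DOCKER chains, not the rest of the rule set.

```python
from typing import Optional, Tuple

Endpoint = Tuple[str, int]  # (IP address, port)

# nat DOCKER chain: published host port -> container endpoint.
DNAT_RULES = {
    1000: ("172.17.0.1", 1000),
    1002: ("172.17.0.3", 1000),
}

# filter DOCKER chain: container endpoints that are accepted.
FILTER_ACCEPT = {("172.17.0.1", 1000), ("172.17.0.3", 1000)}


def dnat(in_iface: str, dst_port: int) -> Optional[Endpoint]:
    """Model `! -i docker0 -p tcp --dport N -j DNAT --to-destination ...`.

    Returns the new destination, or None when no DNAT rule applies.
    """
    if in_iface == "docker0":
        return None  # the rules exclude packets arriving on docker0
    return DNAT_RULES.get(dst_port)


def docker_filter_accepts(in_iface: str, out_iface: str, dst: Endpoint) -> bool:
    """Model `-d IP ! -i docker0 -o docker0 -p tcp --dport N -j ACCEPT`."""
    return in_iface != "docker0" and out_iface == "docker0" and dst in FILTER_ACCEPT


new_dst = dnat("eth0", 1002)
print(new_dst, docker_filter_accepts("eth0", "docker0", new_dst))
```

Note that the filter rule matches on the destination after DNAT, which is why it refers to container port 1000 rather than the published port 1002.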

This is basically how traffic goes through the iptables rules created by Docker on CentOS 6.