Make containers on routed-mode networks accessible from other bridge networks #48596
(Force-pushed from 2d5ccca to 05b29b5.)
ipset would most likely provide the best performance and be the most straightforward to implement. My concern is that this would cause issues for people who have to run Docker in an environment where they can't get ipset and its kernel module. Perhaps we can take a bit more time to think.
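For illustration, the ipset approach being discussed could look something like this on a host. The set name, subnets, and rule placement here are invented for the sketch, not what Docker actually creates:

```bash
# Sketch only - "docker-ext-bridges-v4" and the subnets are invented.
# Build a set holding the subnets of externally-accessible bridge networks...
ipset create docker-ext-bridges-v4 hash:net
ipset add docker-ext-bridges-v4 172.17.0.0/16
ipset add docker-ext-bridges-v4 10.123.1.0/24

# ...then a single conntrack rule in the filter-FORWARD chain covers return
# traffic for every network in the set, instead of a rule per bridge.
iptables -I FORWARD -m set --match-set docker-ext-bridges-v4 dst \
  -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
```

This is also why availability matters: the iptables `set` match needs the `ip_set` kernel modules as well as the userspace tool.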
For a "routed" network, we're already assuming the system administrator has set up appropriate routes, NAT, forwarding, etc. as necessary for that bridge to work, right? And we're just adding a new veth to that pre-existing / pre-configured bridge? (Trying to make sure my mental model of the feature is accurate.) If that's accurate, then the only piece we should be managing from the … (Perhaps I've misunderstood the feature, in which case I might be using it incorrectly in the place I just started using it 😂 😇 ❤️)
Yes, exactly ... in 27.x, routed mode means no NAT mapping from host addresses, just port/protocol filtering to allow traffic for the container-end of `-p` mappings (the host-port in a port mapping has no effect, host-ip only determines address family). It's up to the user to sort out routing to the container network. In 28.x, …
Ah, thanks for confirming my mental model 🙇 ❤️ So I guess what I'm proposing is that the … (If routing on …) I'm actually a little confused looking at that example because I'm not seeing where the 8080 mapping gets applied -- is that userland proxy or something? (The only hit for 8080 on that whole page is the …)
To put that another way, if we didn't add so many …
Thank you for taking a look and giving it some thought! Much appreciated.
Routing in the network to get traffic to the docker host is the user's responsibility; we have no control over that. But direct routing doesn't mean "no firewall". For that, … So, we don't want to leave all ports open on containers on … (Mode …)
There's no NAT from host addresses, so port 8080 in the … In 27.x, it's currently an error to over-specify the port mapping - if routed mode can't use it, don't specify the host port. But that error will be downgraded to a warning in the next 27.x release, for the reasons described in #48575.

I'm not sure if that makes it any clearer - but it's what I meant by "Yes, exactly ... in 27.x, routed mode means no NAT mapping from host addresses, just port/protocol filtering to allow traffic for the container-end of -p mappings (the host-port in a port mapping has no effect, host-ip only determines address family). It's up to the user to sort out routing to the container network. https://docs.docker.com/engine/network/packet-filtering-firewalls/#direct-routing".
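As a concrete sketch of that behaviour (the network name and subnet here are invented for the example), publishing a port on a routed-mode network only opens the container-side port and protocol; there is no host-side mapping:

```bash
# Sketch - "routed-net" and 192.0.2.0/24 are made up for this example.
docker network create \
  --opt com.docker.network.bridge.gateway_mode_ipv4=routed \
  --subnet 192.0.2.0/24 routed-net

# Publish only the container port/protocol; a host port would not be used
# (over-specifying it is an error in 27.x, later downgraded to a warning,
# as described above).
docker run -d --network routed-net -p 80/tcp nginx
```

A remote host that has a route to 192.0.2.0/24 via the Docker host can then reach the container's port 80 directly.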
I don't think that's an option. It would be if we didn't want port filtering, and the routed-mode network didn't need to interact with any other networks - but I'll try to explain why each of the …

Port 80 from the …

The next …

Those … The final …

There are two …

The first has nothing to do with what's allowed in to …

The … The remaining two …

With ICC enabled, these two rules can be combined, and they will be by #48641.
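Since the specific rules under discussion vary by host and daemon version, they can be inspected directly on a running system. These are standard, read-only commands:

```bash
# List the Docker chains and any ipsets on a live host.
iptables -S DOCKER
iptables -S DOCKER-ISOLATION-STAGE-1
iptables -S DOCKER-ISOLATION-STAGE-2
ipset list
```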
(Force-pushed from 05b29b5 to 79a9fa7.)
Ahhhhh, that helps a lot, yes, thank you ❤️

My mental model is wrong, it turns out, because I understood "routed" to be effectively "this bridge already exists, and is already configured with routing (likely for other purposes like Incus/LXD, nspawn, QEMU, etc), and I just want Docker to attach the veths it creates to it", which is actually how I'm using it in practice today, but you envision it to be something still mostly Docker-managed/protected. 🤔

FWIW, #48526 is the only issue I've personally observed with my mental model/setup, and it's "fixed" by removing more rules Docker's putting in place. What you're describing (locking it down to only explicitly published ports) sounds like it's probably going to break my setup actually (I don't publish any ports on my containers on the routed bridge, but their listening ports are still all routed correctly and answer properly to remote hosts on the connected network), so that's a bridge (heh) I'll probably have to cross separately. 🙈
Edit 2: @robmry asked whether I was indeed on Docker v23 (not a typo for v27.3 or something) and I am, and thus my …

In the hopes that it helps, here's an attempt at describing my usecase/setup in more detail:

I've got a shared network (a /16) on which a /24 is delegated to me. My access to this shared network is over IPsec, via an "xfrm" device (this is relevant because ipvlan, macvlan, and even bridges are picky about interfaces, and cannot use this interface directly -- the same applies to WireGuard interfaces, which is probably a more common setup than IPsec+xfrm).

On my machine, I need to run a set of services available on different IP addresses within that /24. If this were a physical interface on the box, I'd probably create a bridge, add the physical interface to it, then create TAP interfaces for VMs with the IP addresses I wanted to expose. Alternatively, before …

What I managed to get working (thanks to …):
```console
docker network create \
  --gateway 10.123.1.1 \
  --subnet 10.123.1.0/24 \
  --ip-range 10.123.1.64/26 \
  --opt com.docker.network.bridge.name=br-ipsec \
  --opt com.docker.network.bridge.inhibit_ipv4=true \
  --opt com.docker.network.driver.mtu=1400 \
  --opt com.docker.network.container_iface_prefix=ipsec \
  --opt com.docker.network.bridge.gateway_mode_ipv4=routed \
  --opt com.docker.network.bridge.enable_ip_masquerade=false \
  ipsec
```

(I don't know whether …)

Overall, this has worked really well -- I can choose IPs for containers directly or let Docker assign them, and other systems on the remote end of the IPsec can access them directly without issue. The only real issue I have is that other containers (which can access other hosts on the IPsec network just fine) can't access the same-host IPs (which is what I've understood #48526 to be), and that doing this adds extra firewall rules on my …

Edit: I guess I should also note that I'm on Docker v23 and this is all working; no idea if newer versions have already broken this based on bad assumptions I've made 🙈
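For completeness, the direct-routing requirement on the peer side of a setup like this can be as simple as a static route. The gateway address below is invented for the sketch:

```bash
# On another host in the shared /16: route the container --ip-range via the
# Docker host. 10.123.0.5 is a made-up address standing in for the Docker
# host's IPsec/xfrm endpoint.
ip route add 10.123.1.64/26 via 10.123.0.5
```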
(Force-pushed from 3a523c9 to 51cfe90.)
Signed-off-by: Rob Murray <[email protected]>
After an error, there's no need for it to roll back the rules it has created; the caller already does that. Signed-off-by: Rob Murray <[email protected]>
Signed-off-by: Rob Murray <[email protected]>
IPv4 before IPv6, with consistent error paths. Signed-off-by: Rob Murray <[email protected]>
The default for a user-defined chain is RETURN anyway. This opens up the possibility of sorting rules into two groups by using insert or append, without having to deal with appending after the unconditional RETURN. Signed-off-by: Rob Murray <[email protected]>
Create ipsets containing the subnet of each non-internal bridge network. Signed-off-by: Rob Murray <[email protected]>
Add an integration test to check that a container on a network with gateway-mode=nat can access a container on a network with gateway-mode=routed, but not vice-versa. Signed-off-by: Rob Murray <[email protected]>
(Force-pushed from 51cfe90 to 223929a.)
Rebased as it was a bit behind - I'll let the tests run, then merge.
## Description

Updates for moby 28.0 networking.

## Related issues or tickets

Series of commits ...

- Fix description of 'inhibit_ipv4' - not changed in moby 28.0, updated to clarify difference from (new) IPv6-only networks.
- Updates to default bridge address config - moby/moby#48319
- Describe IPv6-only network config - moby/moby#48271, docker/cli#5599
- Update description of gateway modes - moby/moby#48594, moby/moby#48596, moby/moby#48597
- Describe gateway selection in the networking overview - docker/cli#5664
- Describe gateway mode `isolated` - moby/moby#49262

## Reviews

- [ ] Technical review
- [ ] Editorial review
- [ ] Product review

Signed-off-by: Rob Murray <[email protected]>
- What I did

Fixes: Containers on a network with option `gateway_mode_ipv[46]=routed` are inaccessible from other containers #48526

Containers on a network with option `com.docker.network.bridge.gateway_mode_ipv[46]=routed` do not have NAT set up for port-mappings from the host - but mapped ports are opened in the container's iptables/ip6tables rules, and they can be accessed from a remote host that has routing to the container network (via the host). However, those ports were not accessible from containers on other networks on the same host.

- How I did it

Introduce the use of `ipset`, with sets containing the subnets of each of the externally-accessible docker networks. `ipset` needs to be available in the kernel.

Use those sets in rules matching packets routed to those docker networks so that:

- only one `RELATED,ESTABLISHED` rule is needed in the filter-`FORWARD` chain (ahead of `DOCKER-USER`), so that related packets don't need to be checked against any other rules.
- … `DOCKER` chain, rather than a rule per-bridge.
- in `DOCKER-ISOLATION-STAGE-1`, accept `RELATED,ESTABLISHED` packets coming from the routed network, so that responses make it back to the network that made the request.
- … `DOCKER` chain.

Also:

- remove `RETURN` rules from the end of the `DOCKER-ISOLATION` chains.

- How to verify it

New tests.

- Description for the changelog
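Putting the pieces above together, the resulting filter-table shape might look roughly like the following iptables-save fragment. The set name and exact rule order are illustrative assumptions, not the precise rules the daemon emits:

```text
*filter
# One conntrack rule near the top of FORWARD covers return traffic for all
# externally-accessible bridge networks ("docker-ext-bridges-v4" is an
# assumed set name).
-A FORWARD -m set --match-set docker-ext-bridges-v4 dst -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -j DOCKER-USER
# A single jump into the per-port filtering chain, instead of one per bridge.
-A FORWARD -m set --match-set docker-ext-bridges-v4 dst -j DOCKER
COMMIT
```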