


Cannot create nested network namespace

















It seems that one cannot create a network namespace from inside another network namespace. The attempt results in "Error: Peer netns reference is invalid."



Is this a bug or is there some kind of limitation that I am not aware of?



Below is my cmd trace of the error.



# ip netns add foo1
# ip netns exec foo1 ip netns add foo2
# ip netns
Error: Peer netns reference is invalid.
Error: Peer netns reference is invalid.
foo2
foo1
# ip netns exec foo2 /bin/bash
setting the network namespace "foo2" failed: Invalid argument

















      linux ip linux-networking namespaces network-namespace
















      asked Apr 4 at 13:34









      user98651




















          1 Answer














          TL;DR: As weird as it seems, this is actually not a network namespace issue, but a mount namespace issue and is to be expected.



          You should create all new "ip netns namespaces" (the term is explained below) from the initial (host) "ip netns namespace": run every ip netns add ... command there, not from inside an "ip netns namespace" entered with ip netns exec .... As long as you don't create them from inside another one, you're free to switch between them at will, including nesting commands from one into another with ip netns exec ....
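A minimal sketch of that rule as a guard, assuming `ip netns identify` reports the named namespace of the calling shell and prints nothing on the initial host. The helper names `current_netns` and `safe_netns_add` are hypothetical and not part of iproute2; the check is only a heuristic, since a process can sit in a network namespace that has no name.

```shell
#!/bin/sh
# Hypothetical helpers (not part of iproute2): only run `ip netns add`
# when the calling shell is not inside a named network namespace.

# Print the name of the named netns this shell is in (empty on the host).
current_netns() {
    ip netns identify "$$" 2>/dev/null
}

# safe_netns_add NAME [CURRENT]
# CURRENT is optional and overrides the detection (useful for dry tests).
safe_netns_add() {
    name=$1
    if [ "$#" -ge 2 ]; then
        inside=$2
    else
        inside=$(current_netns)
    fi
    if [ -n "$inside" ]; then
        echo "refusing to create '$name' from inside '$inside'" >&2
        return 1
    fi
    ip netns add "$name"
}
```

Run as root from the initial host, `safe_netns_add foo2` would simply call `ip netns add foo2`; from a shell started with `ip netns exec foo1 bash` it should refuse instead of leaving a broken entry behind.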



          Detailed explanation with step-by-step examples following...




          ip netns specializes in network namespaces, but to provide all of its features it also has to deal with mount namespaces, for at least two reasons that I know of:




          • bind mounting /etc/netns/FOO/SOMESERVICE to /etc/SOMESERVICE to manage alternate service/daemon configurations



            This feature is handy for running some (network-related) daemons in another network namespace while they otherwise remain part of the "host". See my answer on UL to a question about it: Namespace management with ip netns (iproute2). It requires the same treatment as the following feature, so I won't discuss it further.




          • remounting /sys to expose new network namespace's network devices in its hierarchy



            This one is a mandatory feature. Example exposing the problem:



            From "initial host":



            # ip link add dev dummy9 type dummy
            # ip -br link show dummy9
            dummy9 DOWN f6:f6:48:9c:12:b9 <BROADCAST,NOARP>
            # ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:09 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            Using a lower-level tool to switch to another (ephemeral) network namespace:



            # unshare --net ip -br link show dummy9 
            Device "dummy9" does not exist.
            # unshare --net ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:13 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            And that's the issue: /sys still exposes the initial host's interfaces instead of the new network namespace's interfaces. This is where network namespaces interact with the mounting of /sys: if /sys is mounted from within the new network namespace, it exposes that namespace's interfaces in select directory hierarchies (e.g. /sys/class/net and /sys/devices/virtual/net). This is decided at mount time only, not dynamically. Some advanced network settings are easily reachable just by reading or writing there, so they have to be provided; conversely, isolated processes running in the new network environment shouldn't be able to see or alter the initial host's interfaces.



          So ip netns exec FOO ... (but not ip netns add FOO) solves this by also unsharing the mount namespace and remounting /sys inside it, so as not to disrupt the initial host's view. What matters is that this mount namespace is itself ephemeral: two separate ip netns exec FOO ... commands don't end up in the same mount namespace. Each gets its own, with /sys remounted there and describing the same network namespace.
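Roughly, and as far as I understand iproute2's behaviour, ip netns exec FOO CMD can be approximated with lower-level tools as sketched below. The function name `emulate_netns_exec` is hypothetical, the sketch needs root, and it skips the /etc/netns/FOO bind mounts mentioned earlier.

```shell
#!/bin/sh
# emulate_netns_exec NAME CMD [ARGS...]   (hypothetical helper, needs root)
emulate_netns_exec() {
    if [ "$#" -lt 2 ]; then
        echo "usage: emulate_netns_exec NAME CMD [ARGS...]" >&2
        return 2
    fi
    name=$1
    shift
    # 1. nsenter joins the named network namespace.
    # 2. unshare creates a new, ephemeral mount namespace whose mounts do
    #    not propagate back to the parent (slave propagation).
    # 3. /sys is detached and a fresh sysfs is mounted from inside the
    #    network namespace, so it describes that namespace's interfaces.
    nsenter --net="/var/run/netns/$name" \
        unshare --mount --propagation slave \
        sh -c 'umount -l /sys 2>/dev/null; mount -t sysfs sysfs /sys && exec "$@"' sh "$@"
}
```

Because step 2 happens on every call, two invocations such as `emulate_netns_exec FOO bash` share the network namespace but never the mount namespace, which is the behaviour the rest of this answer depends on.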



          So far, no problem. From here on I'll call this combination an "ip netns namespace", since two types of namespaces are now involved. This is what we have so far:



          term1:



          # ip netns add FOO
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/mnt -> mnt:[4026532618]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/net -> net:[4026532520]


          term2:



          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/mnt -> mnt:[4026532821]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/net -> net:[4026532520]


          Note how, after entering the ip netns namespace, the network namespace is the same for term1 and term2, while their mount namespaces differ from each other (and from the initial host's).
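The comparison above can be scripted. This is a small sketch, assuming a Linux /proc and enough privileges to read the other process's ns links; the helper name `same_ns` is hypothetical.

```shell
#!/bin/sh
# same_ns TYPE PID1 PID2   (hypothetical helper)
# Succeeds if both PIDs are in the same namespace of the given TYPE
# (mnt, net, ...). Returns 2 if a namespace link cannot be read.
same_ns() {
    a=$(readlink "/proc/$2/ns/$1") || return 2
    b=$(readlink "/proc/$3/ns/$1") || return 2
    [ "$a" = "$b" ]
}
```

With the PIDs from the transcripts above, `same_ns net 1864 1866` would be expected to succeed and `same_ns mnt 1864 1866` to fail.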



          Now what happens when in term1 you create a new ip netns namespace? Let's see:



          term1:



          # ip netns add BAR
          # ip netns ls
          BAR
          FOO


          term2:



          # ip netns ls
          Error: Peer netns reference is invalid.
          Error: Peer netns reference is invalid.
          BAR
          FOO


          That's because the newer namespace BAR, to keep existing without a process, is, like the others, mounted on the (newly created, empty) file /var/run/netns/BAR (again, see the link above for examples). Although the mount namespaces are different, they share the same root directory: the initial host's root. So this newly created empty file /var/run/netns/BAR became visible everywhere (initial host, term1's mount namespace, term2's mount namespace) as soon as it was created.



          Alas, the mount over it was done in the mount namespace of term1's FOO, so it can only be seen in term1, not in term2 or anywhere else, because those are different mount namespaces. So in term1 (in its FOO ip netns namespace), /var/run/netns/BAR is a pseudo-file belonging to the nsfs pseudo-filesystem:



          term1:



          # stat -f -c %T /var/run/netns/BAR
          nsfs


          Anywhere else, it's an empty file on tmpfs (from the actual /run mount):



          term2:



          # stat -f -c %T /var/run/netns/BAR
          tmpfs


          Any other terminal:



          $ stat -f -c %T /var/run/netns/BAR
          tmpfs


          The mount can still be seen in term1 as long as one doesn't exit the current "ip netns namespace". If one switches to yet another ip netns namespace from term1, it will still be fine, because the newly unshared ephemeral mount namespace is a copy of the previous one, including all its mounts.



          Once that shell exits, the mount point is lost (which means that if no process or file descriptor is using it anymore, BAR's network namespace disappears, because it was held only by this mount point). After this, any ip netns ls command will complain, anywhere. You can simply remove the stale and now useless file /run/netns/BAR to fix it.
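To spot such leftovers, here is a sketch that relies on the same `stat -f -c %T` check used above. The function name `list_stale_netns` is hypothetical. It only prints candidates and deletes nothing, and it should be run from the initial host: an entry mounted only inside someone else's ephemeral mount namespace also shows up here while that namespace is still alive.

```shell
#!/bin/sh
# list_stale_netns [DIR]   (hypothetical helper)
# Print entries of DIR (default: /run/netns) that are NOT nsfs mounts in
# the current mount namespace, i.e. plain files left on the tmpfs.
list_stale_netns() {
    dir=${1:-/run/netns}
    for f in "$dir"/*; do
        [ -e "$f" ] || continue          # empty directory: glob did not match
        [ "$(stat -f -c %T "$f")" = nsfs ] || printf '%s\n' "$f"
    done
}
```

Once you are sure nothing still uses the namespace, the listed files can be removed by hand, for example with `rm /run/netns/BAR`.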



          The takeaway from this step-by-step explanation: don't create new namespaces with ip netns add from inside a namespace entered with ip netns exec. Create them all from the initial (host) namespace; you can then switch between them at will from any ip netns namespace.



          Of course, if /var/run/netns/ (i.e. the mount point /run) is distinct between (staying fuzzy) namespaces, there is no interaction: each ip netns invocation is isolated from the others, neither seeing nor interacting with them. Where does this usually happen? In full containers, where both the mount and the network namespaces are separate and point to distinct resources from the start.




          UPDATE: as asked in the comments, I looked into how to "repair" this problem, but couldn't find any easy solution.



          First, there's a prerequisite: as explained above, once the new "ip netns" namespace BAR has been created inside FOO and FOO is left, the only reference to BAR disappears, and BAR disappears with it. Something more is needed.



          Actually there are three ways to keep a reference to a namespace:



          • process: that's the main method, and most of the time that's how the namespace is used at all

          • mount point (that's the method used by ip netns): allows keeping a namespace without any process, which is fine for a namespace that only holds network settings (interfaces, bridges, tc rules, firewall rules, ...)

          • open file descriptor: rare, used when creating the namespaces, but seldom kept, except for applications dealing with multiple namespaces at the same time and switching some of their threads using the file descriptor for easy reference.

          We can use the 1st or 3rd method. Here are various failed attempts before finding something that works...



          As explained before, this won't work:



          # ip netns add FOO
          # ip netns exec FOO ip netns add BAR


          The idea is to leave a process running temporarily in the first "ip netns" namespace, so that its ephemeral mount namespace keeps the needed reference to the new "ip netns" namespace's network namespace, and then to reuse that reference from outside (from the initial namespace).



          This won't work either:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; sleep 999 < /var/run/netns/BAR & echo $!'
          28344
          # strace -e trace=readlink,mount mount --bind /proc/6295/fd/0 /var/run/netns/BAR
          readlink("/proc/6295/fd/0", "/run/netns/BAR", 4095) = 14
          readlink("/var/run", "/run", 4095) = 4
          mount("/run/netns/BAR", "/run/netns/BAR", 0x55c88c9cccb0, MS_BIND, NULL) = 0
          +++ exited with 0 +++
          # stat -f -c %T /run/netns/BAR
          tmpfs


          As seen with strace, the mount command followed the symlink when it shouldn't have for this use case (note: the mount is still somehow tied to the sleep process, which has to be killed before it can be unmounted).



          This (entering sleep's mount namespace to reach BAR's mounted network namespace hidden there) works, but relies on sleep, or some other process, staying alive for continued use:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; sleep 999 & echo $!'
          12916
          # nsenter --target=12916 --mount ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 8e:ce:b3:d1:9c:bb <BROADCAST,NOARP>


          Strangely, this (using the mount namespace shortcut /proc/pid/root/) doesn't work (I don't really know why):



          # stat -f -c %T /proc/12916/root/var/run/netns/BAR 
          tmpfs


          Finally, what does work:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; ip netns exec BAR sh -c '\''sleep 999 & echo $!'\'
          14124
          # mount --bind /proc/14124/ns/net /var/run/netns/BAR
          # ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 3a:48:65:20:68:c1 <BROADCAST,NOARP>


          So something like this could be used in the end. There might be race conditions if you attempt to delete the namespaces right afterwards, before the sleep command ends.



          # ip netns add FOO
          # mount --bind /proc/$(ip netns exec FOO sh -c 'ip netns add BAR; ip netns exec BAR bash -c '\''sleep 5 </dev/null >/dev/null 2>&1 & echo $!; disown'\')/ns/net /var/run/netns/BAR
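The same one-liner, unrolled into a function for readability. This is only a sketch of the workaround above, not a recommended tool: the name `nested_netns_add` is hypothetical, it needs root, it must be called from the initial host, and it has the same short race window as the original (the bind mount has to happen while sleep is still alive).

```shell
#!/bin/sh
# nested_netns_add OUTER INNER   (hypothetical helper, needs root)
# Creates INNER from inside OUTER, then re-attaches INNER's network
# namespace to /var/run/netns/INNER in the caller's mount namespace.
nested_netns_add() {
    if [ "$#" -ne 2 ]; then
        echo "usage: nested_netns_add OUTER INNER" >&2
        return 2
    fi
    outer=$1
    inner=$2
    # Inside OUTER: create INNER and leave a short-lived process in it.
    # sleep is detached from the pipe so that $(...) returns immediately.
    pid=$(ip netns exec "$outer" sh -c '
        ip netns add "$1" &&
        ip netns exec "$1" sh -c "sleep 5 </dev/null >/dev/null 2>&1 & echo \$!"
    ' sh "$inner") || return 1
    # From the caller's mount namespace: pin the namespace with a bind
    # mount over the (so far empty) file that `ip netns add` left behind.
    mount --bind "/proc/$pid/ns/net" "/var/run/netns/$inner"
}
```

After `ip netns add FOO`, calling `nested_netns_add FOO BAR` from the host is meant to leave BAR usable with `ip -n BAR ...` everywhere, as in the transcript above.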


          How could such a construct be used? I have no idea, because the original problem that led to the nested "ip netns" problem was not given. Easier solutions may be available that never involve creating "a nested network namespace".































          • Great answer, thanks. Is there a way to create a new netns safely while inside a netns, i.e. ip netns exec foo1 /bin/bash.... ip netns exec <something> ip netns add foo2?

            – user98651
            2 days ago












          • It appears much more difficult than it seemed, and I don't see how to use the result in an actual use case. Perhaps you should ask another question, about the original problem which forced you to try creating "nested network namespaces". Anyway, I'm updating the answer.

            – A.B
            2 days ago


























          TL;DR: As weird as it seems, this is actually not a network namespace issue, but a mount namespace issue and is to be expected.



          You should create all new "ip netns namespaces" (see later for the meaning), i.e. run all ip netns add ... commands from the initial (host) "ip netns namespace", not from inside an "ip netns namespace" having been entered with ip netns exec .... As long as you don't create them you're then free to switch between them at will including nesting commands from one to an other, with ip netns exec ....



          Detailed explanation with step-by-step examples following...




          ip netns is specialized on network namespaces, but to handle all features, has also to mingle with mount namespaces for two reasons (at least, that I know of):




          • bind mounting /etc/netns/FOO/SOMESERVICE to /etc/SOMESERVICE to manage alternate service/daemon configurations



            A feature which can be handy to easily run some (network related) daemons in an other network namespace but beside this being still part of the "host". You can check my answer at UL on a question about it there: Namespace management with ip netns (iproute2). Its use requires the same treatment as the following feature, so I won't talk about it anymore.




          • remounting /sys to expose new network namespace's network devices in its hierarchy



            This one is a mandatory feature. Example exposing the problem:



            From "initial host":



            # ip link add dev dummy9 type dummy
            # ip -br link show dummy9
            dummy9 DOWN f6:f6:48:9c:12:b9 <BROADCAST,NOARP>
            # ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:09 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            Using a lower level tool to change to an other (ephemeral) network namespace:



            # unshare --net ip -br link show dummy9 
            Device "dummy9" does not exist.
            # unshare --net ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:13 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            And that's the issue: /sys still exposes initial host's interfaces instead of the new network namespace's interface. That's where there is an interaction between network namespace and with mounting /sys: if /sys is mounted from the new network namespace, it will switch to exposing the new network interfaces in select directory hierarchies (eg /sys/class/net and /sys/devices/virtual/net). This is done at mount time only, not dynamically. Some advanced network settings are easily available by just reading or writing there, so they have to be provided, and the reverse is true: the isolated processes running in the new network environment shouldn't be able to see or alter the initial host's interfaces.



          So ip netns exec FOO ... (but not ip netns add FOO) solves this by also unsharing the mount namespace and remounting /sys/ inside it, to not disrupt initial host's network namespace. But what is important is that this mount namespace is itself ephemeral: when you run separately two ip netns exec FOO ... commands, they don't end up in the same mount namespace. They each have their own, with /sys remounted there pointing to the same network namespace.



          Until now, no problem. I'll call this an "ip netns namespace" when this happened since there are now two types of namespaces involved. We have so far:



          term1:



          # ip netns add FOO
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/mnt -> mnt:[4026532618]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/net -> net:[4026532520]


          term2:



          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/mnt -> mnt:[4026532821]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/net -> net:[4026532520]


          Note how after changing ip netns namespaces, while the new network namespace is the same for term1 and term2, the new mount namespaces are different from each others (and from initial host).



          Now what happens when in term1 you create a new ip netns namespace? Let's see:



          term1:



          # ip netns add BAR
          # ip netns ls
          BAR
          FOO


          term2:



          # ip netns ls
          Error: Peer netns reference is invalid.
          Error: Peer netns reference is invalid.
          BAR
          FOO


          That's because the newer namespace BAR, to be kept existing without a process, is, as others, mounted on (the newly created empty file) /var/run/netns/BAR (again, see previous link for examples). While the mount namespaces are different, they have the same root directory: initial host's root. So of course this newly created empty file /var/run/netns/BAR could be seen everywhere (initial, term1's mount ns, term2's mount ns) when it was created.



          Alas, the mount over it, being done on term1's FOO's mount namespace, can only be seen on term1, not on term2 nor anywhere else, because it's a different mount namespace. So while in term1 ('s FOO ip netns namespace) /var/run/netns/BAR is a pseudo-file belonging to the nsfs pseudo-filesystem:



          term1:



          # stat -f -c %T /var/run/netns/BAR
          nsfs


          It's an empty file on tmpfs (from the actual /run mount) anywhere else:



          term2:



          # stat -f -c %T /var/run/netns/BAR
          tmpfs


          Any other terminal:



          $ stat -f -c %T /var/run/netns/BAR
          tmpfs


          It can still be seen in term1 as long as one doesn't exit the current "ip netns namespace". If from term1 one still switches ip netns namespaces , it will still be fine, because the new unshared ephemeral mount namespace is a copy of the previous, including all the mounts.



          If exited, that mount point is lost (and that means if there are no processes or file descriptors using it anymore, BAR's corresponding network namespace will disappear because it was held only by this mount point). After this any ip netns ls command will complain, anywhere. You can just remove the stale and now useless file /run/netns/BAR to fix it.



          After this step-by-step explanation, what to remember is that you shouldn't create new namespaces with ip netns add inside a namespace currently entered with ip netns exec. You should create them all from the initial (host) namespace, then you can switch at will between them from any ip netns namespace.



          Of course, if /var/run/netns/ (i.e. the mount point /run) is distinct between (staying fuzzy) namespaces, then there is no interaction, and each ip netns invocation will be isolated from others, not seing nor interacting with others. Where does this usually happen? In full containers, where both the mount and the network namespaces are separated and point to distinct resources from the start.




          UPDATE: as asked in comments, I checked how to "repair" this problem, but couldn't find any easy solution.



          First there's a prerequisite: as told above, once the new "ip netns" namespace BAR is created inside FOO, and FOO is left, the only reference to BAR will disappear, thus making BAR also disappear. Something more is needed.



          Actually there are three ways to keep a reference to a namespace:



          • process: that's the main method, and most of the time that's how the namespace is used at all

          • mount point (that's the method used by ip netns): allows to keep a namespace without any process, fine to have a namespace with only network settings inside (interfaces, bridges, tc rules, firewall rules, ...)

          • open file descriptor: rare, used when creating the namespaces, but seldom kept, except for applications dealing with multiple namespaces at the same time and switching some of their threads using the file descriptor for easy reference.

          We can use the 1st or 3rd method. Here are various failed attempts before finding something that works...



          As told before, won't work:



          # ip netns add FOO
          # ip netns exec FOO ip netns add BAR


          Just leave a process running temporarily in the first "ip netns" namespace, for its ephemeral mount namespace part, to keep the needed reference to the new "ip netns" namespace's network namespace and reuse it later from outside (from the initial namespace).



          Won't work either:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; sleep 999 < /var/run/netns/BAR & echo $!'
          28344
          # strace -e trace=readlink,mount mount --bind /proc/6295/fd/0 /var/run/netns/BAR
          readlink("/proc/6295/fd/0", "/run/netns/BAR", 4095) = 14
          readlink("/var/run", "/run", 4095) = 4
          mount("/run/netns/BAR", "/run/netns/BAR", 0x55c88c9cccb0, MS_BIND, NULL) = 0
          +++ exited with 0 +++
          # stat -f -c %T /run/netns/BAR
          tmpfs


          TL;DR: As weird as it seems, this is actually not a network namespace issue, but a mount namespace issue and is to be expected.



          You should create all new "ip netns namespaces" (see below for the meaning) from the initial (host) "ip netns namespace", i.e. run every ip netns add ... command there and not from inside an "ip netns namespace" entered with ip netns exec .... As long as you don't create namespaces from inside one, you're free to switch between them at will with ip netns exec ..., including nesting commands from one to another.



          Detailed explanation with step-by-step examples following...




          ip netns specializes in network namespaces, but to support all its features it also has to deal with mount namespaces, for (at least) two reasons that I know of:




          • bind mounting /etc/netns/FOO/SOMESERVICE to /etc/SOMESERVICE to manage alternate service/daemon configurations



            A feature which can be handy to easily run some (network-related) daemons in another network namespace while otherwise remaining part of the "host". You can check my answer on UL to a question about it: Namespace management with ip netns (iproute2). Its use requires the same treatment as the following feature, so I won't discuss it any further.




          • remounting /sys to expose new network namespace's network devices in its hierarchy



            This one is a mandatory feature. Example exposing the problem:



            From "initial host":



            # ip link add dev dummy9 type dummy
            # ip -br link show dummy9
            dummy9 DOWN f6:f6:48:9c:12:b9 <BROADCAST,NOARP>
            # ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:09 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            Using a lower-level tool to change to another (ephemeral) network namespace:



            # unshare --net ip -br link show dummy9 
            Device "dummy9" does not exist.
            # unshare --net ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:13 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            And that's the issue: /sys still exposes the initial host's interfaces instead of the new network namespace's interfaces. This is where network namespaces interact with the mounting of /sys: when /sys is mounted from inside the new network namespace, it exposes that namespace's interfaces in select directory hierarchies (e.g. /sys/class/net and /sys/devices/virtual/net). This is decided at mount time only, not dynamically. Some advanced network settings are available simply by reading or writing there, so they have to be provided. The reverse is also true: isolated processes running in the new network environment shouldn't be able to see or alter the initial host's interfaces.



          So ip netns exec FOO ... (but not ip netns add FOO) solves this by also unsharing the mount namespace and remounting /sys inside it, so as not to disrupt the initial host. What matters here is that this mount namespace is itself ephemeral: when you run two separate ip netns exec FOO ... commands, they don't end up in the same mount namespace. Each gets its own, with /sys remounted there and pointing to the same network namespace.
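
          As a rough illustration of the two steps just described, the effect of ip netns exec can be approximated with nsenter and unshare from util-linux. This is only a sketch and not what iproute2 literally runs: the function name netns_exec_sketch is made up for this example, the /etc/netns/NAME bind mounts are skipped, and it assumes root privileges plus a namespace already created under /var/run/netns.

```shell
#!/bin/sh
# Hypothetical helper (not part of iproute2): approximate "ip netns exec NAME CMD...".
# Assumes util-linux nsenter/unshare, root privileges, and that NAME was created
# with "ip netns add NAME" from the initial namespace.
netns_exec_sketch() {
    name=$1
    shift
    # 1. join the network namespace pinned at /var/run/netns/NAME
    # 2. unshare a new (ephemeral) mount namespace for this command only
    # 3. replace /sys there so it reflects the joined network namespace
    nsenter --net="/var/run/netns/$name" \
        unshare --mount --propagation slave \
        sh -c 'umount -l /sys 2>/dev/null; mount -t sysfs sysfs /sys && exec "$@"' sh "$@"
}

# Example (as root): netns_exec_sketch FOO ls /sys/class/net
```

          Each call gets a fresh mount namespace, which is why mounts made inside one call are not visible from another.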



          So far, no problem. From here on I'll call the result an "ip netns namespace", since two types of namespaces are now involved. Here is what we have so far:



          term1:



          # ip netns add FOO
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/mnt -> mnt:[4026532618]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/net -> net:[4026532520]


          term2:



          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/{mnt,net}
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/mnt -> mnt:[4026532821]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/net -> net:[4026532520]


          Note how, after changing ip netns namespaces, the new network namespace is the same for term1 and term2, while the new mount namespaces differ from each other (and from the initial host's).
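
          The comparison done by eye above can be scripted. A minimal sketch, assuming a Linux /proc; the helper name same_ns is invented for this example, and reading another process's namespace links generally requires root or ownership of that process:

```shell
#!/bin/sh
# Hypothetical helper: succeed if two processes share a namespace of the given type.
# Usage: same_ns PID1 PID2 TYPE   (TYPE is e.g. net or mnt)
# Returns 2 when one of the links cannot be read.
same_ns() {
    a=$(readlink "/proc/$1/ns/$3") || return 2
    b=$(readlink "/proc/$2/ns/$3") || return 2
    [ "$a" = "$b" ]
}

# A process trivially shares every namespace with itself:
same_ns "$$" "$$" net && echo "same net namespace"
```

          With the two shells from the transcript, same_ns 1864 1866 net would succeed while same_ns 1864 1866 mnt would fail.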



          Now what happens when in term1 you create a new ip netns namespace? Let's see:



          term1:



          # ip netns add BAR
          # ip netns ls
          BAR
          FOO


          term2:



          # ip netns ls
          Error: Peer netns reference is invalid.
          Error: Peer netns reference is invalid.
          BAR
          FOO


          That's because the newer namespace BAR, like the others, is kept alive without any process by being mounted on the newly created empty file /var/run/netns/BAR (again, see the linked answer for examples). While the mount namespaces are different, they share the same root directory: the initial host's root. So of course the newly created empty file /var/run/netns/BAR is visible everywhere (initial host, term1's mount namespace, term2's mount namespace) as soon as it is created.



          Alas, the mount over it was done in term1's FOO mount namespace, so it can only be seen in term1, not in term2 nor anywhere else, because those are different mount namespaces. So in term1 (in its FOO ip netns namespace), /var/run/netns/BAR is a pseudo-file belonging to the nsfs pseudo-filesystem:



          term1:



          # stat -f -c %T /var/run/netns/BAR
          nsfs


          It's an empty file on tmpfs (from the actual /run mount) anywhere else:



          term2:



          # stat -f -c %T /var/run/netns/BAR
          tmpfs


          Any other terminal:



          $ stat -f -c %T /var/run/netns/BAR
          tmpfs


          It can still be seen in term1 as long as one doesn't exit the current "ip netns namespace". If one switches to yet another ip netns namespace from term1, it will still be fine, because the newly unshared ephemeral mount namespace is a copy of the previous one, including all its mounts.



          Once it is exited, that mount point is lost (and if no process or file descriptor is using it anymore, BAR's corresponding network namespace disappears, because it was held only by this mount point). After this, any ip netns ls command will complain, anywhere. You can just remove the stale and now useless file /run/netns/BAR to fix it.
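
          To spot such leftovers, the stat check used earlier can be looped over the whole directory. A sketch only: stale_netns is a made-up name, GNU coreutils stat is assumed, and the result depends on the mount namespace it runs in (seen from the initial host, BAR looks stale even while term1 still holds its mount).

```shell
#!/bin/sh
# Hypothetical helper: list entries of a netns directory that are NOT backed by nsfs,
# i.e. plain placeholder files with no namespace mounted on them in this mount namespace.
# The directory defaults to /var/run/netns.
stale_netns() {
    dir=${1:-/var/run/netns}
    for f in "$dir"/*; do
        [ -e "$f" ] || continue                  # empty directory: glob did not match
        fstype=$(stat -f -c %T "$f") || continue
        [ "$fstype" = nsfs ] || basename "$f"
    done
}

# Review the list before deleting anything, for example:
#   for n in $(stale_netns); do rm -- "/var/run/netns/$n"; done
```

          Run it from the initial (host) namespace, and only delete entries once no shell is still inside the namespace that created them.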



          After this step-by-step explanation, what to remember is that you shouldn't create new namespaces with ip netns add inside a namespace currently entered with ip netns exec. You should create them all from the initial (host) namespace, then you can switch at will between them from any ip netns namespace.



          Of course, if /var/run/netns/ (i.e. the mount point /run) is distinct between (staying fuzzy) namespaces, then there is no interaction, and each ip netns invocation is isolated from the others, neither seeing nor interacting with them. Where does this usually happen? In full containers, where both the mount and the network namespaces are separate and point to distinct resources from the start.




          UPDATE: as asked in comments, I checked how to "repair" this problem, but couldn't find any easy solution.



          First there's a prerequisite: as explained above, once the new "ip netns" namespace BAR is created inside FOO and FOO is left, the only reference to BAR disappears, making BAR disappear as well. Something more is needed.



          Actually there are three ways to keep a reference to a namespace:



          • process: that's the main method, and most of the time that's how the namespace is used at all

          • mount point (that's the method used by ip netns): allows keeping a namespace without any process, which is fine for a namespace with only network settings inside (interfaces, bridges, tc rules, firewall rules, ...)

          • open file descriptor: rare, used when creating the namespaces, but seldom kept, except for applications dealing with multiple namespaces at the same time and switching some of their threads using the file descriptor for easy reference.
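
          The third method can be tried without root on the shell's own network namespace. A small sketch, assuming a Linux /proc; the descriptor number 9 and the variable name ns_id are arbitrary choices for this example:

```shell
#!/bin/sh
# Keep a reference to the current network namespace through an open file descriptor.
exec 9< /proc/self/ns/net

# The descriptor resolves to the namespace identifier (of the form net:[inode]).
ns_id=$(readlink "/proc/$$/fd/9")
echo "$ns_id"

# As root, a command could later join the namespace through this descriptor:
#   nsenter --net="/proc/$$/fd/9" ip -br link show

# Closing the descriptor drops this particular reference.
exec 9<&-
```

          As long as the descriptor stays open, the namespace it points to cannot disappear, even if no process lives in it and nothing is mounted for it.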

          We can use the 1st or 3rd method. Here are various failed attempts before finding something that works...



          As explained before, this won't work:



          # ip netns add FOO
          # ip netns exec FOO ip netns add BAR


          The idea is to leave a process running temporarily in the first "ip netns" namespace, for the sake of its ephemeral mount namespace, to keep the needed reference to the new "ip netns" namespace's network namespace and reuse it later from outside (from the initial namespace).



          Won't work either:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; sleep 999 < /var/run/netns/BAR & echo $!'
          28344
          # strace -e trace=readlink,mount mount --bind /proc/6295/fd/0 /var/run/netns/BAR
          readlink("/proc/6295/fd/0", "/run/netns/BAR", 4095) = 14
          readlink("/var/run", "/run", 4095) = 4
          mount("/run/netns/BAR", "/run/netns/BAR", 0x55c88c9cccb0, MS_BIND, NULL) = 0
          +++ exited with 0 +++
          # stat -f -c %T /run/netns/BAR
          tmpfs


          As seen with strace, the mount command followed the symlink, which it shouldn't have for this use case (note: the mount is somehow still linked to the sleep process, which has to be killed before it can be unmounted).



          This (entering sleep's mount namespace to access BAR's mounted network namespace hidden there) works, but it relies on the continued existence of sleep, or of any other process, for continued use:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; sleep 999 & echo $!'
          12916
          # nsenter --target=12916 --mount ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 8e:ce:b3:d1:9c:bb <BROADCAST,NOARP>


          Strangely, this (using the mount namespace shortcut /proc/pid/root/) doesn't work (I don't really know why):



          # stat -f -c %T /proc/12916/root/var/run/netns/BAR 
          tmpfs


          Finally what will work:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; ip netns exec BAR sh -c '\''sleep 999 & echo $!'\'
          14124
          # mount --bind /proc/14124/ns/net /var/run/netns/BAR
          # ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 3a:48:65:20:68:c1 <BROADCAST,NOARP>


          So something like this could be used in the end. There might be race conditions if you attempt to delete the namespaces right afterwards, before the sleep command ends.



          # ip netns add FOO
          # mount --bind /proc/$(ip netns exec FOO sh -c 'ip netns add BAR; ip netns exec BAR bash -c '\''sleep 5 </dev/null >/dev/null 2>&1 & echo $!; disown'\')/ns/net /var/run/netns/BAR


          How could such a construct be used? I have no idea because the original problem before encountering the nested "ip netns" problem was not given. Maybe easier solutions are available without ever trying to create "a nested network namespace".






          • Great answer, thanks. Is there a way to safely create a new netns while inside a netns, i.e. ip netns exec foo1 /bin/bash ... ip netns exec <something> ip netns add foo2?

            – user98651
            2 days ago












          • It appears much more difficult than it seemed, and I don't see how to use the result in an actual use case. Perhaps you should ask another question, about the original problem which led you to try creating "nested network namespaces". Anyway, I'm updating the answer.

            – A.B
            2 days ago














          TL;DR: As weird as it seems, this is actually not a network namespace issue, but a mount namespace issue and is to be expected.



          You should create all new "ip netns namespaces" (see later for the meaning), i.e. run all ip netns add ... commands from the initial (host) "ip netns namespace", not from inside an "ip netns namespace" having been entered with ip netns exec .... As long as you don't create them you're then free to switch between them at will including nesting commands from one to an other, with ip netns exec ....



          Detailed explanation with step-by-step examples following...




          ip netns is specialized on network namespaces, but to handle all features, has also to mingle with mount namespaces for two reasons (at least, that I know of):




          • bind mounting /etc/netns/FOO/SOMESERVICE to /etc/SOMESERVICE to manage alternate service/daemon configurations



            A feature which can be handy to easily run some (network related) daemons in an other network namespace but beside this being still part of the "host". You can check my answer at UL on a question about it there: Namespace management with ip netns (iproute2). Its use requires the same treatment as the following feature, so I won't talk about it anymore.




          • remounting /sys to expose new network namespace's network devices in its hierarchy



            This one is a mandatory feature. Example exposing the problem:



            From "initial host":



            # ip link add dev dummy9 type dummy
            # ip -br link show dummy9
            dummy9 DOWN f6:f6:48:9c:12:b9 <BROADCAST,NOARP>
            # ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:09 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            Using a lower level tool to change to an other (ephemeral) network namespace:



            # unshare --net ip -br link show dummy9 
            Device "dummy9" does not exist.
            # unshare --net ls -l /sys/class/net/dummy9
            lrwxrwxrwx. 1 root root 0 Apr 4 22:13 /sys/class/net/dummy9 -> ../../devices/virtual/net/dummy9


            And that's the issue: /sys still exposes initial host's interfaces instead of the new network namespace's interface. That's where there is an interaction between network namespace and with mounting /sys: if /sys is mounted from the new network namespace, it will switch to exposing the new network interfaces in select directory hierarchies (eg /sys/class/net and /sys/devices/virtual/net). This is done at mount time only, not dynamically. Some advanced network settings are easily available by just reading or writing there, so they have to be provided, and the reverse is true: the isolated processes running in the new network environment shouldn't be able to see or alter the initial host's interfaces.



          So ip netns exec FOO ... (but not ip netns add FOO) solves this by also unsharing the mount namespace and remounting /sys/ inside it, to not disrupt initial host's network namespace. But what is important is that this mount namespace is itself ephemeral: when you run separately two ip netns exec FOO ... commands, they don't end up in the same mount namespace. They each have their own, with /sys remounted there pointing to the same network namespace.



          Until now, no problem. I'll call this an "ip netns namespace" when this happened since there are now two types of namespaces involved. We have so far:



          term1:



          # ip netns add FOO
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:28 /proc/1712/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/mnt -> mnt:[4026532618]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1864/ns/net -> net:[4026532520]


          term2:



          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/mnt -> mnt:[4026531840]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:32 /proc/1761/ns/net -> net:[4026531992]
          # ip netns exec FOO bash
          # ls -l /proc/$$/ns/mnt,net
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/mnt -> mnt:[4026532821]
          lrwxrwxrwx. 1 root root 0 Apr 4 22:33 /proc/1866/ns/net -> net:[4026532520]


          Note how after changing ip netns namespaces, while the new network namespace is the same for term1 and term2, the new mount namespaces are different from each others (and from initial host).



          Now what happens when in term1 you create a new ip netns namespace? Let's see:



          term1:



          # ip netns add BAR
          # ip netns ls
          BAR
          FOO


          term2:



          # ip netns ls
          Error: Peer netns reference is invalid.
          Error: Peer netns reference is invalid.
          BAR
          FOO


          That's because the newer namespace BAR, to be kept existing without a process, is, as others, mounted on (the newly created empty file) /var/run/netns/BAR (again, see previous link for examples). While the mount namespaces are different, they have the same root directory: initial host's root. So of course this newly created empty file /var/run/netns/BAR could be seen everywhere (initial, term1's mount ns, term2's mount ns) when it was created.



          Alas, the mount over it, being done on term1's FOO's mount namespace, can only be seen on term1, not on term2 nor anywhere else, because it's a different mount namespace. So while in term1 ('s FOO ip netns namespace) /var/run/netns/BAR is a pseudo-file belonging to the nsfs pseudo-filesystem:



          term1:



          # stat -f -c %T /var/run/netns/BAR
          nsfs


          It's an empty file on tmpfs (from the actual /run mount) anywhere else:



          term2:



          # stat -f -c %T /var/run/netns/BAR
          tmpfs


          Any other terminal:



          $ stat -f -c %T /var/run/netns/BAR
          tmpfs


          It can still be seen in term1 as long as one doesn't exit the current "ip netns namespace". If from term1 one still switches ip netns namespaces , it will still be fine, because the new unshared ephemeral mount namespace is a copy of the previous, including all the mounts.



          If exited, that mount point is lost (and that means if there are no processes or file descriptors using it anymore, BAR's corresponding network namespace will disappear because it was held only by this mount point). After this any ip netns ls command will complain, anywhere. You can just remove the stale and now useless file /run/netns/BAR to fix it.



          After this step-by-step explanation, what to remember is that you shouldn't create new namespaces with ip netns add inside a namespace currently entered with ip netns exec. You should create them all from the initial (host) namespace, then you can switch at will between them from any ip netns namespace.



          Of course, if /var/run/netns/ (i.e. the mount point /run) is distinct between (staying fuzzy) namespaces, then there is no interaction, and each ip netns invocation will be isolated from others, not seing nor interacting with others. Where does this usually happen? In full containers, where both the mount and the network namespaces are separated and point to distinct resources from the start.




          UPDATE: as asked in comments, I checked how to "repair" this problem, but couldn't find any easy solution.



          First there's a prerequisite: as explained above, once the new "ip netns" namespace BAR is created inside FOO and FOO is left, the only reference to BAR disappears, making BAR disappear too. Something more is needed.



          Actually there are three ways to keep a reference to a namespace:



          • process: that's the main method, and most of the time that's how the namespace is used at all

          • mount point (that's the method used by ip netns): allows keeping a namespace without any process, which is fine for a namespace with only network settings inside (interfaces, bridges, tc rules, firewall rules, ...)

          • open file descriptor: rare, used when creating the namespaces, but seldom kept, except for applications dealing with multiple namespaces at the same time and switching some of their threads using the file descriptor for easy reference.
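The file descriptor kind of reference can be observed without root on the shell's own network namespace. This is a minimal sketch assuming a Linux /proc; the descriptor number 3 and the variable names are arbitrary choices for the illustration.

```shell
#!/bin/sh
# Open file descriptor 3 on the shell's own network namespace.
# As long as the descriptor stays open, it is a reference to that namespace.
exec 3< /proc/self/ns/net

# The descriptor resolves to the namespace identifier, of the form
# net:[<inode>] (the inode number differs from system to system).
held=$(readlink "/proc/$$/fd/3")
current=$(readlink "/proc/$$/ns/net")
echo "fd 3 holds: $held"
echo "current:    $current"

# Close the descriptor to drop the reference.
exec 3<&-
```

Both lines print the same identifier here, because the descriptor was opened on the namespace the shell already lives in; a descriptor opened on another process's /proc/PID/ns/net would keep that other namespace alive instead.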

          We can use the 1st or 3rd method. Here are various failed attempts before finding something that works...



          As explained above, this won't work:



          # ip netns add FOO
          # ip netns exec FOO ip netns add BAR


          The idea is then to leave a process running temporarily in the first "ip netns" namespace, so that its ephemeral mount namespace keeps the needed reference to the new "ip netns" namespace's network namespace, which can then be reused from outside (from the initial namespace).



          This won't work either:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; sleep 999 < /var/run/netns/BAR & echo $!'
          6295
          # strace -e trace=readlink,mount mount --bind /proc/6295/fd/0 /var/run/netns/BAR
          readlink("/proc/6295/fd/0", "/run/netns/BAR", 4095) = 14
          readlink("/var/run", "/run", 4095) = 4
          mount("/run/netns/BAR", "/run/netns/BAR", 0x55c88c9cccb0, MS_BIND, NULL) = 0
          +++ exited with 0 +++
          # stat -f -c %T /run/netns/BAR
          tmpfs


          As seen with strace, the mount command resolved the symlink to the path /run/netns/BAR instead of using the open file behind it, which is not what this use case needs (note: the mount is still somehow linked to the sleep process, which has to be killed to unmount it).



          Entering sleep's mount namespace to access BAR's network namespace, which is mounted and hidden there, works, but it relies on sleep (or some other process in that mount namespace) staying alive:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; sleep 999 & echo $!'
          12916
          # nsenter --target=12916 --mount ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 8e:ce:b3:d1:9c:bb <BROADCAST,NOARP>


          Strangely, using the mount namespace shortcut /proc/pid/root/ doesn't work (I don't really know why):



          # stat -f -c %T /proc/12916/root/var/run/netns/BAR 
          tmpfs


          Finally what will work:



          # ip netns add FOO
          # ip netns exec FOO sh -c 'ip netns add BAR; ip -n BAR link add dummy8 type dummy; ip netns exec BAR sh -c '\''sleep 999 & echo $!'\'''
          14124
          # mount --bind /proc/14124/ns/net /var/run/netns/BAR
          # ip -n BAR -brief link show
          lo DOWN 00:00:00:00:00:00 <LOOPBACK>
          dummy8 DOWN 3a:48:65:20:68:c1 <BROADCAST,NOARP>


          So something like the following could be used in the end. There might be race conditions if you try to delete the namespaces right afterwards, before the sleep command ends.



          # ip netns add FOO
          # mount --bind /proc/$(ip netns exec FOO sh -c 'ip netns add BAR; ip netns exec BAR bash -c '\''sleep 5 </dev/null >/dev/null 2>&1 & echo $!; disown'\''')/ns/net /var/run/netns/BAR


          How could such a construct be used? I have no idea, because the original problem that led to the nested "ip netns" problem was not given. Maybe easier solutions exist that never involve creating "a nested network namespace".















          edited 2 days ago

























          answered Apr 4 at 21:43









          A.B












          • Great answer, thanks. Is there a way to create a new netns safely while inside a netns, i.e. ip netns exec foo1 /bin/bash ... ip netns exec <something> ip netns add foo2?

            – user98651
            2 days ago

          • It appears much more difficult than it seemed, and I don't see how to use the result in an actual use case. Perhaps you should ask another question, about the original problem which forced you to try creating "nested network namespaces". Anyway, I'm updating the answer.

            – A.B
            2 days ago



































