Virtual Switching Sanity Check - NFS, BGP & Kubernetes



I have a home Kubernetes cluster that runs in 4 VMs on top of Proxmox. Proxmox is tagged to VLAN 20, and the Kubernetes VMs are tagged to VLAN 40.



The Kubernetes VMs are BGP neighbors of my router so that I can tag pods to run on one of two other VLANs that are designated as DMZ spaces, 50 and 60. In short, the network looks like this:



- VLAN1: Networking Hardware
- VLAN20: Physical Machines
- VLAN40: Kubernetes VMs
- VLAN50: Internal Kubernetes Deployments
- VLAN60: External Kubernetes Deployments


This works great: everything is able to communicate with everything else and with the internet just fine, with one exception: performance.



My Proxmox server also acts as my storage server by exporting a ZFS pool over NFS. This works great and is capable of some pretty fast reads and writes for a home storage server: upwards of 6Gb/s reads, for example.



When I used to run Docker containers directly on my Proxmox server, virtual switching allowed the containers to interact with the NFS server hosted by Proxmox by hostname at nearly that speed.



Furthermore, before I set up VLANs, the Kubernetes VMs used to run on the same VLAN (1) as Proxmox itself, and any pods deployed on Kubernetes were also able to interact with the NFS server hosted by Proxmox by hostname at nearly that speed.



However, now that I have configured VLANs and use BGP to provision my Kubernetes pods on separate VLANs from the hosts, networking has been capped at 1Gb/s, if not worse than that.
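To make the suspected bottleneck concrete, here is a minimal Python sketch of how I understand the traffic path. The link speeds come from the numbers in this post; the helper names (`path_links`, `bottleneck_gbps`, `transfer_seconds`) and the assumption that traffic between different VLANs leaves the host and hairpins through the physical switch and router are illustrative only, not something I have measured.

```python
# Illustrative model only. Speeds are taken from the post; the path logic
# (inter-VLAN traffic hairpins through the router) is an assumption.

IN_HOST_GBPS = 6.0  # roughly the NFS read speed seen over virtual switching
SWITCH_GBPS = 1.0   # Unifi Switch 8
ROUTER_GBPS = 1.0   # Edgerouter Lite

VLANS = {
    1: "Networking Hardware",
    20: "Physical Machines",
    40: "Kubernetes VMs",
    50: "Internal Kubernetes Deployments",
    60: "External Kubernetes Deployments",
}


def path_links(src_vlan, dst_vlan):
    """Return the link speeds (Gb/s) a flow is assumed to cross."""
    for vlan in (src_vlan, dst_vlan):
        if vlan not in VLANS:
            raise ValueError(f"unknown VLAN {vlan}")
    if src_vlan == dst_vlan:
        # Same VLAN on the same host: assumed to stay on the virtual switch.
        return [IN_HOST_GBPS]
    # Different VLANs: assumed to go out to the switch, get routed, and return.
    return [SWITCH_GBPS, ROUTER_GBPS, SWITCH_GBPS]


def bottleneck_gbps(src_vlan, dst_vlan):
    """The slowest link on the assumed path limits throughput."""
    return min(path_links(src_vlan, dst_vlan))


def transfer_seconds(size_gigabytes, gbps):
    """Ideal transfer time, ignoring protocol overhead and latency."""
    return size_gigabytes * 8 / gbps


if __name__ == "__main__":
    for src, dst in [(20, 20), (50, 20)]:
        rate = bottleneck_gbps(src, dst)
        secs = transfer_seconds(10, rate)
        print(f"VLAN {src} -> VLAN {dst}: {rate:g} Gb/s, 10 GB in {secs:.1f} s")
```

Under these assumptions, a pod on VLAN 50 reading from the NFS server on VLAN 20 is limited by the 1Gb devices, while traffic that stays within one VLAN on the host is not.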



My Ubiquiti Edgerouter Lite and Unifi Switch 8 are both 1Gb devices, so the cap makes sense. However, it is starting to feel very painful in my lab. For example, cover art in Plex Media Server takes upwards of 10 seconds to load when I scroll through my library, because the Kubernetes volume that holds the database is mounted from the NFS server.

Similarly, Deluge is performing very poorly. The web interface crashes frequently, and actions such as opening the Preferences panel or viewing the Details section of a new torrent can take several minutes. Deluge's cache is set to use 4GB of memory, but I'm unsure whether these performance issues are caused by my network or by Deluge simply not scaling well to 1100 torrents.

Lastly, my Kubernetes deployments that interact heavily with a database (Plex, Jira, etc.) sometimes end up with a corrupted database after a few weeks of running. I presume this is because of network latency, but I'm not sure.



I'm hoping this post can answer a few questions:



  1. I know my network is complex, especially for a homelab. However, I use the homelab almost entirely to learn for my job, and the hobby is fun for me, especially when I indulge in obscene levels of complexity. Given that I am okay with the complexity, does everything look like it is configured correctly?


  2. Would purchasing a 10Gb switch resolve this issue, or would it also be necessary to purchase a 10Gb router, since the Edgerouter is a BGP neighbor of the Kubernetes nodes?


  3. If it would be necessary to purchase both a switch and a router, would it instead be possible to purchase a 10Gb switch with BGP capabilities?


  4. What hardware would you recommend I purchase to resolve this issue? Ideally I would like to keep the total cost under $500-1,000, but that doesn't look possible given the incredibly high cost of 10Gb routers.


  5. Would it be possible to use a different Kubernetes StorageClass to store the data directly on the nodes? What would this look like?


  6. Would you recommend a different solution to my problem?










  • This is a really complex setup. I don’t know that anyone will be able to help with specifics, but good luck!!

    – ewwhite
    Apr 14 at 7:57


















networking router kubernetes proxmox bgp






asked Apr 14 at 2:03 by TJ Zimmerman











