SQL Server Always On File Share Witness (Quorum vote) on different subnet to other nodes
I am currently having issues with some of my availability groups where node 1 and node 2 lose connectivity to each other, with the error "A connection timeout has occurred on a previously established connection to availability replica".
The errors in Failover Cluster Manager say "File share witness resource 'File Share Witness' failed to arbitrate for the file share". The server the file share sits on hasn't restarted or had any issues, and all permissions are working.
The only thing I can see is that the file share server is on a different subnet from the other 2 SQL Server nodes in the cluster.
Could someone confirm whether having the file share server on a different subnet is a big no-no in an Always On environment? All the firewall rules are in place, as it can talk to the other nodes, but (usually) out of hours it loses connectivity.
The other weird thing is that there are 3 votes in the quorum including the file share, so even if the file share loses connectivity to the failover cluster, node1 and node2 shouldn't lose connectivity to each other, as there are enough votes for quorum (2).
sql-server availability-groups clustering
asked May 1 at 10:15
Daniel Nash
2 Answers
> Could someone confirm is having the file share server on a different subnet a big no no in an AlwaysOn environment?
It's completely fine to have the FSW on a different subnet; there is nothing wrong with that. There is no need to have it on the same subnet. In fact, there is an Azure witness which definitely won't be on the same subnet, and it works without issue.
> "A connection timeout has occurred on a previously established connection to availability replica"
This seems to point either to a network issue or, if this is on a virtual machine, to something happening to the guest/host that is giving you this trouble. Given there is a plethora of in-depth configuration settings at the host, guest, and OS level that can contribute to this, I won't go into any further depth, as it'd be out of scope for this site.
> The errors in the failover cluster manager say "File share witness resource 'File Share Witness' failed to arbitrate for the file share" The server the file share sits on hasn't restarted or had any issues and all permissions are working.
This means that whoever attempted to arbitrate for the witness was only one vote short of having quorum for the cluster. Since it's a two-node cluster, if the nodes couldn't talk to each other they'd be in this exact situation.
If the nodes can't talk to each other (obviously an issue) and neither node can talk to the FSW (another issue), this makes me wonder what's broken in the infrastructure: again, either at the virtual layer or the physical (network) layer. It's clear something is going on to cause this, and it is specific to your environment, not SQL Server.
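To make the "one vote short" point concrete, here is a minimal sketch of the vote arithmetic for the three voters described in the question (two nodes plus the witness). The voter names and helper functions are made up for illustration; this is not how the cluster service itself is implemented.

```python
from itertools import combinations

def majority(total_votes: int) -> int:
    """Smallest number of votes that is more than half of the total."""
    return total_votes // 2 + 1

def has_quorum(reachable_votes: int, total_votes: int) -> bool:
    """True when the reachable votes form a majority of all votes."""
    return reachable_votes >= majority(total_votes)

# Hypothetical names for the voters in the question: two nodes and the witness.
VOTERS = ["node1", "node2", "witness"]

def partition_report(voters):
    """For every group of voters that can still see each other,
    report whether that group alone would hold quorum."""
    total = len(voters)
    report = {}
    for size in range(1, total + 1):
        for group in combinations(voters, size):
            report[group] = has_quorum(len(group), total)
    return report

if __name__ == "__main__":
    for group, ok in partition_report(VOTERS).items():
        print(f"{' + '.join(group):<26} quorum={'yes' if ok else 'no'}")
```

A node that has lost its partner holds 1 of 3 votes, so it has to win the witness to reach the majority of 2. That is the moment the arbitration error would show up if the witness is unreachable as well.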
> The other weird thing is there are 3 votes in the quorum including the file share, so even if the file share loses connectivity to the failover cluster, node1 & node2 shouldn't lose connectivity between each other as there's enough votes for quorum (2)
Yes, however I'm betting the nodes lost connectivity to each other. There are probably some entries about missed heartbeats, connectivity to port 3343, regroups, etc., in the cluster log.
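If you have the cluster log exported to a text file (for example with the `Get-ClusterLog` PowerShell cmdlet), a rough way to pull out the lines worth reading is a plain keyword filter. This is only a sketch: the search terms are the ones mentioned above, the file path is whatever you exported to, and the actual wording of the log entries in your environment may differ.

```python
import re
from typing import Iterable, Iterator, List, Tuple

# Search terms taken from the answer above; extend them for your environment.
PATTERN = re.compile(r"heartbeat|3343|regroup", re.IGNORECASE)

def interesting_lines(lines: Iterable[str]) -> Iterator[Tuple[int, str]]:
    """Yield (line number, text) for every line that mentions a search term."""
    for number, line in enumerate(lines, start=1):
        if PATTERN.search(line):
            yield number, line.rstrip("\n")

def scan_file(path: str, encoding: str = "utf-8") -> List[Tuple[int, str]]:
    """Scan an exported log file; undecodable bytes are replaced, not fatal."""
    with open(path, encoding=encoding, errors="replace") as handle:
        return list(interesting_lines(handle))
```

Calling `scan_file` on the exported file returns the matching lines with their line numbers, which you can then line up against the time the availability group reported the connection timeout.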
Connectivity doesn't mean votes; connectivity means health checks. Once health checks fail, nodes become partitioned, and that's when these events occur. You'll need to find out what occurred in your environment around the time this happened. If it happens quite often and on a schedule, then it's some task or software in the environment; if it happens randomly, or when under load, then it's most likely an infrastructure issue such as networking or host/guest/OS settings.
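One quick way to tell "on a schedule" from "random" is to bucket the incident times by hour of day. The sketch below assumes you have already collected the timestamps of the disconnects yourself; the function names and the 80% cut-off are arbitrary choices for illustration, not an established rule.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable

def incidents_per_hour(timestamps: Iterable[datetime]) -> Dict[int, int]:
    """Count incidents per hour of day (0-23), sorted by hour."""
    counts = Counter(ts.hour for ts in timestamps)
    return dict(sorted(counts.items()))

def looks_scheduled(timestamps: Iterable[datetime], min_share: float = 0.8) -> bool:
    """True when at least min_share of the incidents fall in the same hour of day.

    The default of 0.8 is an arbitrary assumption; tune it to taste.
    """
    stamps = list(timestamps)
    if not stamps:
        return False
    per_hour = incidents_per_hour(stamps)
    return max(per_hour.values()) / len(stamps) >= min_share
```

If most incidents land in the same hour, look for a backup, snapshot, antivirus scan, or similar job that runs at that time. If they are spread across the day, the infrastructure is the more likely suspect.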
answered May 1 at 10:43
Sean Gallardy
I've only seen similar behavior when running availability groups on virtual machines in VMware. If the secondary replica gets stunned while holding a file lock on the witness share, things get weird (see a slightly obsolete article about VM stun: https://cormachogan.com/2015/04/28/when-and-why-do-we-stun-a-virtual-machine/ ). For example, when you extend a disk, the VM might be stunned (paused) for anywhere from a few seconds to 20-30 seconds for a large disk.
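As a rough illustration of why a pause of that length matters: a cluster node is treated as down once it misses a configured number of heartbeats in a row, so the tolerated silence is roughly the heartbeat interval multiplied by that count. The sketch below assumes that simple model and takes both values as inputs; read the real delay and threshold from your own cluster configuration rather than trusting the example numbers.

```python
def heartbeat_tolerance_seconds(delay_ms: int, threshold: int) -> float:
    """Approximate silence tolerated before a node is considered down.

    delay_ms:  interval between heartbeats, in milliseconds (assumed input)
    threshold: number of consecutive missed heartbeats allowed (assumed input)
    """
    return delay_ms * threshold / 1000.0

def pause_exceeds_tolerance(pause_seconds: float, delay_ms: int, threshold: int) -> bool:
    """True when a VM pause lasts longer than the heartbeat tolerance."""
    return pause_seconds > heartbeat_tolerance_seconds(delay_ms, threshold)

if __name__ == "__main__":
    # Illustrative values only: a 1000 ms heartbeat with 10 allowed misses.
    for pause in (3, 25):
        verdict = pause_exceeds_tolerance(pause, delay_ms=1000, threshold=10)
        print(f"pause of {pause}s exceeds tolerance: {verdict}")
```

With settings like the illustrative ones, a stun of a few seconds would go unnoticed, while a 20-30 second stun would be long enough for the other node to decide its partner is gone.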
answered May 3 at 11:41
Andreas Bergdal