BSOD 0x09c on 50 SuperMicro machines

















For a project we have 50 servers, all equipped with (generally) the same hardware. The issue is very serious and happens on all machines. Despite a lot of effort and contact with the manufacturers and the software developers, everyone points at everyone else and nobody will even give me a clue about what is going on.

First let me describe the setup. This is 'server-grade' hardware. For a first experience, server-grade has been the biggest disappointment of my life.



  • SuperMicro X10SDV-8C+-LN2F

  • Intel Xeon D-1540 (embedded on the motherboard)

  • Custom designed 1U case or SuperMicro original case

  • 480 watt server PSU or 200 watt SuperMicro original PSU

  • Samsung Evo 850 500 GB SSD

  • 32 GB DDR4-2133 ECC or NON-ECC (but not mixed in the same server)

  • Asus GT730 4GB DDR3 GPU

  • GPU is mounted with a PCIe riser card (not ribbon), nameless from China or SuperMicro original

Running on the system:
- Windows Server 2012 R2 Enterprise
- VMware Workstation 12
- VMs run GPU-intensive tasks
- The system is stock; there is no over- or underclocking at all

Symptoms:
- Random BSOD 0x09c (aka MACHINE_CHECK_EXCEPTION): sometimes the system runs for a week with no problems, sometimes it crashes after just 10 minutes, but most of the time it runs for a few hours.



Already tried/checked:



  • BIOS updated to the latest version (I think this lengthened the time the system stays stable, but that could have been random).

  • Windows updated to the latest version.

  • VMware updated to the latest version.

  • Swapped all components and tried every different option, even a desktop ATX PSU and an M.2 SSD.

  • Installed all systems from scratch with Ubuntu. I'm not familiar with Linux and have never seen a Linux equivalent of a BSOD; I still haven't, since the servers are headless and I tried this in the DC. RESULT: the system would hang, and after a reboot Linux reported an Xorg crash (GPU related).

  • Changed the GPU setting in the BIOS to 'Above 4G'; the rest of the BIOS is factory default.

Also informative:



  • Systems are located in a datacenter. Temperature, air, power and network are optimal.

  • Temperatures are well below the factory maximum

  • We have the exact same software setup running on desktop computers (with desktop hardware). These systems run fine, with about 1 out of 100 PCs crashing every month.

  • I have contacted VMware; they say this is a hardware issue.

  • I have contacted SuperMicro; they had little to offer beyond things I had already tried, and said this could still be a software issue.

We are desperate here. Luckily, the application we run is somewhat redundant. If a server and the VMs on it drop, it's not such an issue, since other servers will take over the load within 5 minutes, but at this rate I have to be online all day to restart servers.

I have a lot of hardware knowledge, but this goes beyond it. I've been searching on this all day for over a month, trying all sorts of different things.
The fact that these motherboards are used by hosting providers on a large scale makes me suspect that the board in itself is OK. This is definitely not a hardware defect suited for RMA, as all 50 boards show the same symptoms. The only thing different in our setup is the GPU. This, combined with the Linux experiment, makes me suspect that the problem is somewhere on the PCIe side. The GPU itself is stable on desktop motherboards. Despite its large memory capacity, this is a small GPU that does not draw much power. I would suspect the Chinese riser cards, but we also use SuperMicro-certified risers and they show no improvement at all.

I am very desperate to find a solution here. That starts with determining the exact cause.
We are willing to pay a nice bounty to an expert who can analyse some dumps and give us more details (or, better yet, a solution).



Kind regards,



Simon


































  • I'm a bit familiar with this board, having one myself... There are too many moving parts here and too little explanation of what they are. What's the use of VMware Workstation? What application is being run in them? How is the GPU being passed to the VM(s)?

    – Michael Hampton
    Apr 16 '16 at 18:32











  • The VMs run a Windows application that requires some GPU load. I cannot elaborate much further. This is VMware Workstation; the GPU is virtualised. This also shouldn't really matter, as it works exactly the same on desktop hardware without problems.

    – user349749
    Apr 16 '16 at 19:39











  • It matters because you are not running it on desktop hardware!

    – Michael Hampton
    Apr 16 '16 at 19:41






  • I would suspect an incompatibility between your motherboards and your GPUs. With luck, it might be something that can be corrected in the BIOS, but I wouldn't bet much on it. Since this is reproducible with a stock Linux kernel, I would try to get more information on the kernel panic that probably happens.

    – Law29
    Apr 16 '16 at 20:39











  • What runs inside the VM does not matter. It could be rendering porn or maybe it's an algorithm to find a cure for AIDS. All that matters is that it is a standard GPU load. @Law29: That's exactly how I feel too. Linux didn't really give me any kernel panic, I think. The server was not crashing, just the GUI.

    – user349749
    Apr 17 '16 at 7:42
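
Following up on Law29's suggestion to get more information out of Linux: below is a minimal Python sketch that filters a saved kernel log for lines that often accompany machine checks, PCIe link errors, or GPU driver faults. The patterns and the function name `find_hardware_events` are illustrative assumptions, not an exhaustive or authoritative list; the exact wording of kernel messages varies by kernel and driver version.

```python
import re
import sys
from typing import Iterable, List

# Assumed patterns: typical fragments seen in kernel logs for machine checks,
# PCIe Advanced Error Reporting, and the proprietary NVIDIA driver.
# Adjust them to whatever your own logs actually contain.
PATTERNS = [
    re.compile(r"mce:|machine check", re.IGNORECASE),
    re.compile(r"hardware error", re.IGNORECASE),
    re.compile(r"AER:|PCIe Bus Error", re.IGNORECASE),
    re.compile(r"NVRM: Xid"),
]


def find_hardware_events(lines: Iterable[str]) -> List[str]:
    """Return the log lines that match any of the assumed patterns."""
    return [ln.rstrip("\n") for ln in lines if any(p.search(ln) for p in PATTERNS)]


if __name__ == "__main__":
    # Usage: python scan_log.py kernel.log
    if len(sys.argv) == 2:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as fh:
            for hit in find_hardware_events(fh):
                print(hit)
```

The input could be, for example, the output of `journalctl -k -b -1` (kernel messages from the previous boot, provided the journal is persistent) saved to a file. If the hang leaves nothing in the log at all, that in itself points toward a hard hardware fault rather than a driver crash.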

















supermicro bsod














edited May 18 '18 at 22:28 by José Castillo Lema

asked Apr 16 '16 at 17:25 by user349749






















1 Answer














Well, this is super late; I imagine the issue is resolved by this point? Either way, 0x9C usually means an MCE (machine check exception) hardware fault. Our GPU systems ran Linux as the host OS, which reports these errors a bit more verbosely than Windows.
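
On the Windows side, the raw machine-check status can narrow things down. As far as I recall from Microsoft's documentation for bug check 0x9C, on CPUs with the MCA architecture parameters 3 and 4 hold the high and low 32 bits of the failing bank's MCi_STATUS register; treat that as an assumption to verify. The sketch below decodes the architectural flag bits of such a value, following the layout in the Intel SDM. The names `combine` and `decode_mci_status` are my own, and the example value is made up, not taken from the dumps in the question.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MciStatus:
    valid: bool                # bit 63 (VAL): register holds a valid error
    overflow: bool             # bit 62 (OVER): an earlier error was overwritten
    uncorrected: bool          # bit 61 (UC): hardware could not correct the error
    enabled: bool              # bit 60 (EN): reporting was enabled for this error
    misc_valid: bool           # bit 59 (MISCV): MCi_MISC holds extra information
    addr_valid: bool           # bit 58 (ADDRV): MCi_ADDR holds the faulting address
    context_corrupt: bool      # bit 57 (PCC): processor context may be corrupt
    model_specific_code: int   # bits 31:16
    mca_error_code: int        # bits 15:0


def combine(high32: int, low32: int) -> int:
    """Join two 32-bit halves (e.g. bug check parameters 3 and 4) into 64 bits."""
    return ((high32 & 0xFFFFFFFF) << 32) | (low32 & 0xFFFFFFFF)


def decode_mci_status(raw: int) -> MciStatus:
    """Split a 64-bit MCi_STATUS value into its architectural fields."""
    def bit(n: int) -> bool:
        return bool((raw >> n) & 1)

    return MciStatus(
        valid=bit(63),
        overflow=bit(62),
        uncorrected=bit(61),
        enabled=bit(60),
        misc_valid=bit(59),
        addr_valid=bit(58),
        context_corrupt=bit(57),
        model_specific_code=(raw >> 16) & 0xFFFF,
        mca_error_code=raw & 0xFFFF,
    )


if __name__ == "__main__":
    # Hypothetical parameter values, for illustration only.
    print(decode_mci_status(combine(0xBE000000, 0x00800400)))
```

The MCA error code in the low 16 bits is the part worth looking up in the SDM tables, since it distinguishes bus/interconnect errors from cache or memory errors. If the same bank and code show up on all 50 machines, that is a strong hint about which subsystem is involved.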



Anyway, these were randomly popping up for us on similar hardware made by HP a while back. It ended up being insufficient power delivery to the GPU, specifically the 75W that's supposed to be supplied by the PCIe slot itself.



We confirmed it with a multimeter on a PCIe breakout board. Voltage dropped when both the GPU and the 10GbE network cards were being hit hard at the same time. While the motherboard was capable of delivering 75W to the x16 slot, the power delivery section struggled a bit when the other cards were all consuming power.



The riser may be suspect here, dropping voltage under high current loads.
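
The multimeter check described above can be turned into a simple pass/fail test. The sketch below assumes the commonly quoted limits for a 75W x16 slot (+12V ±8% at up to 5.5A, +3.3V ±9% at up to 3A); I believe these match the PCIe CEM specification, but verify them against the spec for your slot type. The `Rail` type and function names are hypothetical, as are the readings in the usage.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rail:
    name: str
    nominal_v: float
    tolerance: float      # allowed deviation as a fraction of nominal
    max_current_a: float  # assumed slot limit for a 75W x16 slot


# Assumed values; check them against the PCIe CEM specification.
RAILS = {
    "12V": Rail("12V", 12.0, 0.08, 5.5),
    "3.3V": Rail("3.3V", 3.3, 0.09, 3.0),
}


def allowed_range(rail: Rail) -> tuple:
    """Return the (low, high) voltage window for a rail."""
    return (rail.nominal_v * (1 - rail.tolerance),
            rail.nominal_v * (1 + rail.tolerance))


def within_tolerance(rail: Rail, measured_v: float) -> bool:
    """True if a measured voltage is inside the rail's allowed window."""
    low, high = allowed_range(rail)
    return low <= measured_v <= high


def slot_power_budget_w() -> float:
    """Nominal power available from the slot across both rails."""
    return sum(r.nominal_v * r.max_current_a for r in RAILS.values())


if __name__ == "__main__":
    # Hypothetical readings taken at the riser under idle and full load.
    for label, volts in [("idle", 12.05), ("GPU + NIC load", 10.9)]:
        ok = within_tolerance(RAILS["12V"], volts)
        print(f"{label}: {volts:.2f} V -> {'OK' if ok else 'OUT OF RANGE'}")
    print(f"slot budget: {slot_power_budget_w():.1f} W")
```

A multimeter only shows the average, so a reading that stays in range does not rule out short dips; measuring at the card side of the riser, under the same combined load that triggers the crashes, gives the most useful comparison between the no-name and the SuperMicro risers.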






share|improve this answer























    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "2"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f770862%2fbsod-0x09c-on-50-supermicro-machines%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    2














    Well this is super late, i imagine the issue is resolved by this point? Either way 0x9C usually means a MCE hardware fault, Our GPU systems ran linux as a host os which reports these errors a bit more verbose than windows.



    Anyways, these were randomly popping up for us on similar hardware made by HP a while back, It ended up being insufficient power delivery to the GPU. Specifically the 75W that's supposed to be supplied by the PCIe port itself.



    We confirmed it with a multimeter on a PCIe breakout board. Voltage dropped when both GPU and 10Gbe network cards were being hit hard at the same time. While the motherboard was capable of delivering 75W to the x16 slot, the power delivery section struggled a bit when the other cards were all consuming power.



    The riser may be suspect here and dropping voltage on high current loads.
















        answered Apr 4 at 13:56









        TriadicTech


























