Troubleshooting NIC Logs by NETDEV WATCHDOGBroadcom NIC Teaming with Hyper-V R2SATA drive problems with two SIL RAID cardsDual Port Server Nic QuestionMissing Jumbo and MTU for Broadcom NIC?What network loads require NIC polling vs interrupts?linux hibernate working but slowNIC Teaming with Broadcom on HyperV 2008 R2Ubuntu server - NIC suddenly loses connectivityKVM NIC passthrough “device already in use”How do I troubleshoot slow NIC on R610 Dell?

Reaction of borax with NaOH

Why can't RGB or bicolour LEDs produce a decent yellow?

Create a list of all possible Boolean configurations of three constraints

Can I make ravioli dough with only all-purpose flour or do I NEED semolina flour?

Speculative Biology of a Haplodiploid Humanoid Species

Extracting sublists that contain similar elements

What does "Ich wusste, dass aus dir mal was wird" mean?

Does Lawful Interception of 4G / the proposed 5G provide a back door for hackers as well?

Why do Thanos's punches not kill Captain America or at least cause some mortal injuries?

Does kinetic energy warp spacetime?

Proof that the inverse image of a single element is a discrete space

Exception propagation: When should I catch exceptions?

What is the significance of 4200 BCE in context of farming replacing foraging in Europe?

Why doesn't Rocket Lab use a solid stage?

Who was this character from the Tomb of Annihilation adventure before they became a monster?

What are the implications of the new alleged key recovery attack preprint on SIMON?

Anatomically Correct Carnivorous Tree

Setting the major mode of a new buffer interactively

What is the best way for a skeleton to impersonate human without using magic?

List software from restricted, multiverse separately

How to compact two the parabol commands in the following example?

How can a Lich look like a human without magic?

Light Switch Terminals

Does the 500 feet falling cap apply per fall, or per turn?



Troubleshooting NIC Logs by NETDEV WATCHDOG


Broadcom NIC Teaming with Hyper-V R2SATA drive problems with two SIL RAID cardsDual Port Server Nic QuestionMissing Jumbo and MTU for Broadcom NIC?What network loads require NIC polling vs interrupts?linux hibernate working but slowNIC Teaming with Broadcom on HyperV 2008 R2Ubuntu server - NIC suddenly loses connectivityKVM NIC passthrough “device already in use”How do I troubleshoot slow NIC on R610 Dell?






.everyoneloves__top-leaderboard:empty,.everyoneloves__mid-leaderboard:empty,.everyoneloves__bot-mid-leaderboard:empty height:90px;width:728px;box-sizing:border-box;








1















I have a CentOS 6.6 server (Linux 2.6.32-504) running on a brand new Dell PowerEdge R320. Just the other day, I began noticing problems in /var/log/messages related to one of my NICs.



Note: this is a server cluster that requires eth0 to be used in a crossover network to a redundant backup server, so we use eth1 to talk to the customer's network and the world; so it would be challenging to switch NICs for troubleshooting purposes. I'll also note I'm not seeing any issues on eth0, though.



I'd like to troubleshoot this as much as possible ~ is it a hardware failure, Linux kernel doesn't like the Dell hardware, etc.? I am not a career Linux administrator, so I'm kind of green in this regard.



Google searches seem to turn up more questions for me, than answers (and only seem to apply to other distros; but point out possibilities such as kernel/firmware issues. I humbly beg for this community's expertise and ideas!



The one thing I noticed was that the interface was coming back up as 100Mb/half (when it's really on a 1Gb/full).



I'll provide as much as I can think of, that's hopefully relevant.



Here is the snippet of /var/log/messages where I noticed the issue... does this mean anything to anyone?



Mar 22 16:10:07 ind1un043 kernel: ------------[ cut here ]------------
Mar 22 16:10:07 ind1un043 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
Mar 22 16:10:07 ind1un043 kernel: Hardware name: PowerEdge R320
Mar 22 16:10:07 ind1un043 kernel: NETDEV WATCHDOG: eth1 (tg3): transmit queue 0 timed out
Mar 22 16:10:07 ind1un043 kernel: Modules linked in: nls_utf8 hfsplus(U) mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase dell_rbu coretemp 8021q garp stp llc wctc4xxp(U) dahdi_transcode(U) wcb4xxp(U) wctdm(U) wcfxo(U) wctdm24xxp(U) wcte11xp(U) wct1xxp(U) wcte12xp(U) dahdi_voicebus(U) wct4xxp(U) oct612x(U) dahdi(U) crc_ccitt ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 uinput iTCO_wdt iTCO_vendor_support microcode dcdbas sb_edac edac_core ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp tg3 ptp pps_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci wmi megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Mar 22 16:10:07 ind1un043 kernel: Pid: 9, comm: ksoftirqd/1 Not tainted 2.6.32-504.el6.x86_64 #1
Mar 22 16:10:07 ind1un043 kernel: Call Trace:
Mar 22 16:10:07 ind1un043 kernel: <IRQ> [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81074ee6>] ? warn_slowpath_fmt+0x46/0x50
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147df7b>] ? dev_watchdog+0x26b/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81060d7c>] ? scheduler_tick+0xcc/0x260
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147dd10>] ? dev_watchdog+0x0/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81087db7>] ? run_timer_softirq+0x197/0x340
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b0275>] ? tick_dev_program_event+0x65/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b034a>] ? tick_program_event+0x2a/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81533c0a>] ? smp_apic_timer_interrupt+0x4a/0x60
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
Mar 22 16:10:07 ind1un043 kernel: <EOI> [<ffffffff8107d4c5>] ? ksoftirqd+0xd5/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d3f0>] ? ksoftirqd+0x0/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Mar 22 16:10:07 ind1un043 kernel: ---[ end trace c154004f7af06fb3 ]---
Mar 22 16:10:07 ind1un043 kernel: tg3 0000:02:00.1: eth1: transmit timed out, resetting
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000000: 0x165f14e4, 0x00100406, 0x02000000, 0x00800010
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000010: 0xd90d000c, 0x00000000, 0xd90e000c, 0x00000000
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000020: 0xd90f000c, 0x00000000, 0x00000000, 0x04f71028
... (TOO MANY CHARACTERS TO POST ON SERVER FAULT, SO STRIPPING) ...
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007020: 0x00000000, 0x00000000, 0x00000406, 0x10004000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007030: 0x000e0000, 0x00004af8, 0x00170030, 0x00000000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: Host status block [00000001:00000063:(0000:0695:0000):(0000:01b8)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: NAPI info [00000063:00000063:(01a5:01b8:01ff):0000:(0095:0000:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: Host status block [00000001:000000ac:(0000:0000:0000):(083a:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: NAPI info [000000ac:000000ac:(0000:0000:01ff):083a:(003a:003a:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: Host status block [00000001:0000004f:(089e:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: NAPI info [00000046:00000046:(0000:0000:01ff):0895:(0095:0095:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: Host status block [00000001:00000012:(0000:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: NAPI info [00000012:00000012:(0000:0000:01ff):0fbc:(07bc:07bc:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: Host status block [00000001:000000d4:(0000:0000:0742):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: NAPI info [000000d4:000000d4:(0000:0000:01ff):0742:(0742:0742:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=1400 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=c00 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is down
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled


But before that, here is the snippet of /var/log/messages where the NIC initializes seemingly normally during a boot...



Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:3f
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: dma_rwctrl[00000001] dma_mask[64-bit]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:40
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: dma_rwctrl[00000001] dma_mask[64-bit]


Here is the snippet of /var/log/messages where the NIC comes up seemingly normally, but not sure in what context (it's just after the init immediately above this snippet)...



Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Link is up at 1000 Mbps, full duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Flow control is on for TX and on for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


And, here is the snippet of /var/log/messages where the NIC seems to soon-after have other issues, renegotiating or something. This happens randomly several times over the next few days...



Mar 21 16:01:42 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:44 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


Here is my lspci | grep -i net...



02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe









share|improve this question
























  • Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

    – Zypher
    Mar 25 '16 at 15:41


















1















I have a CentOS 6.6 server (Linux 2.6.32-504) running on a brand new Dell PowerEdge R320. Just the other day, I began noticing problems in /var/log/messages related to one of my NICs.



Note: this is a server cluster that requires eth0 to be used in a crossover network to a redundant backup server, so we use eth1 to talk to the customer's network and the world; so it would be challenging to switch NICs for troubleshooting purposes. I'll also note I'm not seeing any issues on eth0, though.



I'd like to troubleshoot this as much as possible ~ is it a hardware failure, Linux kernel doesn't like the Dell hardware, etc.? I am not a career Linux administrator, so I'm kind of green in this regard.



Google searches seem to turn up more questions for me, than answers (and only seem to apply to other distros; but point out possibilities such as kernel/firmware issues. I humbly beg for this community's expertise and ideas!



The one thing I noticed was that the interface was coming back up as 100Mb/half (when it's really on a 1Gb/full).



I'll provide as much as I can think of, that's hopefully relevant.



Here is the snippet of /var/log/messages where I noticed the issue... does this mean anything to anyone?



Mar 22 16:10:07 ind1un043 kernel: ------------[ cut here ]------------
Mar 22 16:10:07 ind1un043 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
Mar 22 16:10:07 ind1un043 kernel: Hardware name: PowerEdge R320
Mar 22 16:10:07 ind1un043 kernel: NETDEV WATCHDOG: eth1 (tg3): transmit queue 0 timed out
Mar 22 16:10:07 ind1un043 kernel: Modules linked in: nls_utf8 hfsplus(U) mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase dell_rbu coretemp 8021q garp stp llc wctc4xxp(U) dahdi_transcode(U) wcb4xxp(U) wctdm(U) wcfxo(U) wctdm24xxp(U) wcte11xp(U) wct1xxp(U) wcte12xp(U) dahdi_voicebus(U) wct4xxp(U) oct612x(U) dahdi(U) crc_ccitt ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 uinput iTCO_wdt iTCO_vendor_support microcode dcdbas sb_edac edac_core ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp tg3 ptp pps_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci wmi megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Mar 22 16:10:07 ind1un043 kernel: Pid: 9, comm: ksoftirqd/1 Not tainted 2.6.32-504.el6.x86_64 #1
Mar 22 16:10:07 ind1un043 kernel: Call Trace:
Mar 22 16:10:07 ind1un043 kernel: <IRQ> [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81074ee6>] ? warn_slowpath_fmt+0x46/0x50
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147df7b>] ? dev_watchdog+0x26b/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81060d7c>] ? scheduler_tick+0xcc/0x260
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147dd10>] ? dev_watchdog+0x0/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81087db7>] ? run_timer_softirq+0x197/0x340
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b0275>] ? tick_dev_program_event+0x65/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b034a>] ? tick_program_event+0x2a/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81533c0a>] ? smp_apic_timer_interrupt+0x4a/0x60
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
Mar 22 16:10:07 ind1un043 kernel: <EOI> [<ffffffff8107d4c5>] ? ksoftirqd+0xd5/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d3f0>] ? ksoftirqd+0x0/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Mar 22 16:10:07 ind1un043 kernel: ---[ end trace c154004f7af06fb3 ]---
Mar 22 16:10:07 ind1un043 kernel: tg3 0000:02:00.1: eth1: transmit timed out, resetting
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000000: 0x165f14e4, 0x00100406, 0x02000000, 0x00800010
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000010: 0xd90d000c, 0x00000000, 0xd90e000c, 0x00000000
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000020: 0xd90f000c, 0x00000000, 0x00000000, 0x04f71028
... (TOO MANY CHARACTERS TO POST ON SERVER FAULT, SO STRIPPING) ...
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007020: 0x00000000, 0x00000000, 0x00000406, 0x10004000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007030: 0x000e0000, 0x00004af8, 0x00170030, 0x00000000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: Host status block [00000001:00000063:(0000:0695:0000):(0000:01b8)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: NAPI info [00000063:00000063:(01a5:01b8:01ff):0000:(0095:0000:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: Host status block [00000001:000000ac:(0000:0000:0000):(083a:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: NAPI info [000000ac:000000ac:(0000:0000:01ff):083a:(003a:003a:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: Host status block [00000001:0000004f:(089e:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: NAPI info [00000046:00000046:(0000:0000:01ff):0895:(0095:0095:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: Host status block [00000001:00000012:(0000:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: NAPI info [00000012:00000012:(0000:0000:01ff):0fbc:(07bc:07bc:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: Host status block [00000001:000000d4:(0000:0000:0742):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: NAPI info [000000d4:000000d4:(0000:0000:01ff):0742:(0742:0742:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=1400 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=c00 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is down
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled


But before that, here is the snippet of /var/log/messages where the NIC initializes seemingly normally during a boot...



Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:3f
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: dma_rwctrl[00000001] dma_mask[64-bit]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:40
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: dma_rwctrl[00000001] dma_mask[64-bit]


Here is the snippet of /var/log/messages where the NIC comes up seemingly normally, but not sure in what context (it's just after the init immediately above this snippet)...



Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Link is up at 1000 Mbps, full duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Flow control is on for TX and on for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


And, here is the snippet of /var/log/messages where the NIC seems to soon-after have other issues, renegotiating or something. This happens randomly several times over the next few days...



Mar 21 16:01:42 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:44 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


Here is my lspci | grep -i net...



02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe









share|improve this question
























  • Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

    – Zypher
    Mar 25 '16 at 15:41














1












1








1








I have a CentOS 6.6 server (Linux 2.6.32-504) running on a brand new Dell PowerEdge R320. Just the other day, I began noticing problems in /var/log/messages related to one of my NICs.



Note: this is a server cluster that requires eth0 to be used in a crossover network to a redundant backup server, so we use eth1 to talk to the customer's network and the world; so it would be challenging to switch NICs for troubleshooting purposes. I'll also note I'm not seeing any issues on eth0, though.



I'd like to troubleshoot this as much as possible ~ is it a hardware failure, Linux kernel doesn't like the Dell hardware, etc.? I am not a career Linux administrator, so I'm kind of green in this regard.



Google searches seem to turn up more questions for me, than answers (and only seem to apply to other distros; but point out possibilities such as kernel/firmware issues. I humbly beg for this community's expertise and ideas!



The one thing I noticed was that the interface was coming back up as 100Mb/half (when it's really on a 1Gb/full).



I'll provide as much as I can think of, that's hopefully relevant.



Here is the snippet of /var/log/messages where I noticed the issue... does this mean anything to anyone?



Mar 22 16:10:07 ind1un043 kernel: ------------[ cut here ]------------
Mar 22 16:10:07 ind1un043 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
Mar 22 16:10:07 ind1un043 kernel: Hardware name: PowerEdge R320
Mar 22 16:10:07 ind1un043 kernel: NETDEV WATCHDOG: eth1 (tg3): transmit queue 0 timed out
Mar 22 16:10:07 ind1un043 kernel: Modules linked in: nls_utf8 hfsplus(U) mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase dell_rbu coretemp 8021q garp stp llc wctc4xxp(U) dahdi_transcode(U) wcb4xxp(U) wctdm(U) wcfxo(U) wctdm24xxp(U) wcte11xp(U) wct1xxp(U) wcte12xp(U) dahdi_voicebus(U) wct4xxp(U) oct612x(U) dahdi(U) crc_ccitt ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 uinput iTCO_wdt iTCO_vendor_support microcode dcdbas sb_edac edac_core ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp tg3 ptp pps_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci wmi megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Mar 22 16:10:07 ind1un043 kernel: Pid: 9, comm: ksoftirqd/1 Not tainted 2.6.32-504.el6.x86_64 #1
Mar 22 16:10:07 ind1un043 kernel: Call Trace:
Mar 22 16:10:07 ind1un043 kernel: <IRQ> [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81074ee6>] ? warn_slowpath_fmt+0x46/0x50
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147df7b>] ? dev_watchdog+0x26b/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81060d7c>] ? scheduler_tick+0xcc/0x260
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147dd10>] ? dev_watchdog+0x0/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81087db7>] ? run_timer_softirq+0x197/0x340
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b0275>] ? tick_dev_program_event+0x65/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b034a>] ? tick_program_event+0x2a/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81533c0a>] ? smp_apic_timer_interrupt+0x4a/0x60
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
Mar 22 16:10:07 ind1un043 kernel: <EOI> [<ffffffff8107d4c5>] ? ksoftirqd+0xd5/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d3f0>] ? ksoftirqd+0x0/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Mar 22 16:10:07 ind1un043 kernel: ---[ end trace c154004f7af06fb3 ]---
Mar 22 16:10:07 ind1un043 kernel: tg3 0000:02:00.1: eth1: transmit timed out, resetting
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000000: 0x165f14e4, 0x00100406, 0x02000000, 0x00800010
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000010: 0xd90d000c, 0x00000000, 0xd90e000c, 0x00000000
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000020: 0xd90f000c, 0x00000000, 0x00000000, 0x04f71028
... (TOO MANY CHARACTERS TO POST ON SERVER FAULT, SO STRIPPING) ...
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007020: 0x00000000, 0x00000000, 0x00000406, 0x10004000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007030: 0x000e0000, 0x00004af8, 0x00170030, 0x00000000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: Host status block [00000001:00000063:(0000:0695:0000):(0000:01b8)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: NAPI info [00000063:00000063:(01a5:01b8:01ff):0000:(0095:0000:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: Host status block [00000001:000000ac:(0000:0000:0000):(083a:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: NAPI info [000000ac:000000ac:(0000:0000:01ff):083a:(003a:003a:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: Host status block [00000001:0000004f:(089e:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: NAPI info [00000046:00000046:(0000:0000:01ff):0895:(0095:0095:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: Host status block [00000001:00000012:(0000:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: NAPI info [00000012:00000012:(0000:0000:01ff):0fbc:(07bc:07bc:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: Host status block [00000001:000000d4:(0000:0000:0742):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: NAPI info [000000d4:000000d4:(0000:0000:01ff):0742:(0742:0742:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=1400 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=c00 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is down
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled


But before that, here is the snippet of /var/log/messages where the NIC initializes seemingly normally during a boot...



Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:3f
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: dma_rwctrl[00000001] dma_mask[64-bit]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:40
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: dma_rwctrl[00000001] dma_mask[64-bit]


Here is the snippet of /var/log/messages where the NIC comes up seemingly normally, but not sure in what context (it's just after the init immediately above this snippet)...



Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Link is up at 1000 Mbps, full duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Flow control is on for TX and on for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


And, here is the snippet of /var/log/messages where the NIC seems to soon-after have other issues, renegotiating or something. This happens randomly several times over the next few days...



Mar 21 16:01:42 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:44 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


Here is my lspci | grep -i net...



02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe









share|improve this question
















I have a CentOS 6.6 server (Linux 2.6.32-504) running on a brand new Dell PowerEdge R320. Just the other day, I began noticing problems in /var/log/messages related to one of my NICs.



Note: this is a server cluster that requires eth0 to be used in a crossover network to a redundant backup server, so we use eth1 to talk to the customer's network and the world; so it would be challenging to switch NICs for troubleshooting purposes. I'll also note I'm not seeing any issues on eth0, though.



I'd like to troubleshoot this as much as possible ~ is it a hardware failure, Linux kernel doesn't like the Dell hardware, etc.? I am not a career Linux administrator, so I'm kind of green in this regard.



Google searches seem to turn up more questions for me, than answers (and only seem to apply to other distros; but point out possibilities such as kernel/firmware issues. I humbly beg for this community's expertise and ideas!



The one thing I noticed was that the interface was coming back up as 100Mb/half (when it's really on a 1Gb/full).



I'll provide as much as I can think of, that's hopefully relevant.



Here is the snippet of /var/log/messages where I noticed the issue... does this mean anything to anyone?



Mar 22 16:10:07 ind1un043 kernel: ------------[ cut here ]------------
Mar 22 16:10:07 ind1un043 kernel: WARNING: at net/sched/sch_generic.c:261 dev_watchdog+0x26b/0x280() (Not tainted)
Mar 22 16:10:07 ind1un043 kernel: Hardware name: PowerEdge R320
Mar 22 16:10:07 ind1un043 kernel: NETDEV WATCHDOG: eth1 (tg3): transmit queue 0 timed out
Mar 22 16:10:07 ind1un043 kernel: Modules linked in: nls_utf8 hfsplus(U) mpt3sas mpt2sas scsi_transport_sas raid_class mptctl mptbase dell_rbu coretemp 8021q garp stp llc wctc4xxp(U) dahdi_transcode(U) wcb4xxp(U) wctdm(U) wcfxo(U) wctdm24xxp(U) wcte11xp(U) wct1xxp(U) wcte12xp(U) dahdi_voicebus(U) wct4xxp(U) oct612x(U) dahdi(U) crc_ccitt ipt_REJECT nf_conntrack_ipv4 nf_defrag_ipv4 iptable_filter ip_tables ip6t_REJECT nf_conntrack_ipv6 nf_defrag_ipv6 xt_state nf_conntrack ip6table_filter ip6_tables ipv6 uinput iTCO_wdt iTCO_vendor_support microcode dcdbas sb_edac edac_core ipmi_devintf power_meter acpi_ipmi ipmi_si ipmi_msghandler sg shpchp tg3 ptp pps_core lpc_ich mfd_core ext4 jbd2 mbcache sr_mod cdrom sd_mod crc_t10dif ahci wmi megaraid_sas dm_mirror dm_region_hash dm_log dm_mod [last unloaded: speedstep_lib]
Mar 22 16:10:07 ind1un043 kernel: Pid: 9, comm: ksoftirqd/1 Not tainted 2.6.32-504.el6.x86_64 #1
Mar 22 16:10:07 ind1un043 kernel: Call Trace:
Mar 22 16:10:07 ind1un043 kernel: <IRQ> [<ffffffff81074df7>] ? warn_slowpath_common+0x87/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81074ee6>] ? warn_slowpath_fmt+0x46/0x50
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147df7b>] ? dev_watchdog+0x26b/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81060d7c>] ? scheduler_tick+0xcc/0x260
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8147dd10>] ? dev_watchdog+0x0/0x280
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81087db7>] ? run_timer_softirq+0x197/0x340
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b0275>] ? tick_dev_program_event+0x65/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d8b1>] ? __do_softirq+0xc1/0x1e0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff810b034a>] ? tick_program_event+0x2a/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c30c>] ? call_softirq+0x1c/0x30
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100fc15>] ? do_softirq+0x65/0xa0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d765>] ? irq_exit+0x85/0x90
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff81533c0a>] ? smp_apic_timer_interrupt+0x4a/0x60
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100bb93>] ? apic_timer_interrupt+0x13/0x20
Mar 22 16:10:07 ind1un043 kernel: <EOI> [<ffffffff8107d4c5>] ? ksoftirqd+0xd5/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8107d3f0>] ? ksoftirqd+0x0/0x110
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e66e>] ? kthread+0x9e/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c20a>] ? child_rip+0xa/0x20
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8109e5d0>] ? kthread+0x0/0xc0
Mar 22 16:10:07 ind1un043 kernel: [<ffffffff8100c200>] ? child_rip+0x0/0x20
Mar 22 16:10:07 ind1un043 kernel: ---[ end trace c154004f7af06fb3 ]---
Mar 22 16:10:07 ind1un043 kernel: tg3 0000:02:00.1: eth1: transmit timed out, resetting
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000000: 0x165f14e4, 0x00100406, 0x02000000, 0x00800010
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000010: 0xd90d000c, 0x00000000, 0xd90e000c, 0x00000000
Mar 22 16:10:08 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00000020: 0xd90f000c, 0x00000000, 0x00000000, 0x04f71028
... (TOO MANY CHARACTERS TO POST ON SERVER FAULT, SO STRIPPING) ...
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007020: 0x00000000, 0x00000000, 0x00000406, 0x10004000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0x00007030: 0x000e0000, 0x00004af8, 0x00170030, 0x00000000
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: Host status block [00000001:00000063:(0000:0695:0000):(0000:01b8)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 0: NAPI info [00000063:00000063:(01a5:01b8:01ff):0000:(0095:0000:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: Host status block [00000001:000000ac:(0000:0000:0000):(083a:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 1: NAPI info [000000ac:000000ac:(0000:0000:01ff):083a:(003a:003a:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: Host status block [00000001:0000004f:(089e:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 2: NAPI info [00000046:00000046:(0000:0000:01ff):0895:(0095:0095:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: Host status block [00000001:00000012:(0000:0000:0000):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 3: NAPI info [00000012:00000012:(0000:0000:01ff):0fbc:(07bc:07bc:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: Host status block [00000001:000000d4:(0000:0000:0742):(0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: 4: NAPI info [000000d4:000000d4:(0000:0000:01ff):0742:(0742:0742:0000:0000)]
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=1400 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: tg3_stop_block timed out, ofs=c00 enable_bit=2
Mar 22 16:10:09 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is down
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 22 16:10:11 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled


But before that, here is the snippet of /var/log/messages where the NIC initializes seemingly normally during a boot...



Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:3f
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: dma_rwctrl[00000001] dma_mask[64-bit]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: PCI INT B -> GSI 17 (level, low) -> IRQ 17
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Tigon3 [partno(BCM95720) rev 5720000] (PCI Express) MAC address 14:18:77:32:77:40
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: attached PHY is 5720C (10/100/1000Base-T Ethernet) (WireSpeed[1], EEE[1])
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[1] TSOcap[1]
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: dma_rwctrl[00000001] dma_mask[64-bit]


Here is the snippet of /var/log/messages where the NIC comes up seemingly normally, but not sure in what context (it's just after the init immediately above this snippet)...



Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth0: link is not ready
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Link is up at 1000 Mbps, full duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: Flow control is on for TX and on for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.0: eth0: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:13 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:13 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


And, here is the snippet of /var/log/messages where the NIC seems to soon-after have other issues, renegotiating or something. This happens randomly several times over the next few days...



Mar 21 16:01:42 ind1un043 kernel: ADDRCONF(NETDEV_UP): eth1: link is not ready
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Link is up at 100 Mbps, half duplex
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: Flow control is off for TX and off for RX
Mar 21 16:01:44 ind1un043 kernel: tg3 0000:02:00.1: eth1: EEE is disabled
Mar 21 16:01:44 ind1un043 kernel: ADDRCONF(NETDEV_CHANGE): eth1: link becomes ready


Here is my lspci | grep -i net...



02:00.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe
02:00.1 Ethernet controller: Broadcom Corporation NetXtreme BCM5720 Gigabit Ethernet PCIe






linux centos6 dell-poweredge nic broadcom






share|improve this question















share|improve this question













share|improve this question




share|improve this question








edited Mar 25 '16 at 15:32







csr19us

















asked Mar 25 '16 at 15:14









csr19uscsr19us

63




63












  • Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

    – Zypher
    Mar 25 '16 at 15:41


















  • Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

    – Zypher
    Mar 25 '16 at 15:41

















Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

– Zypher
Mar 25 '16 at 15:41






Couple of things I would try first. 1: Check for errors in the iDRAC 2: grab the latest SUU and make sure all your drivers/firmware are up to date 3: check the switch it is connected to for errors 4: check that the switch it's connected to isn't hard coded to 100/half for some reason.

– Zypher
Mar 25 '16 at 15:41











1 Answer
1






active

oldest

votes


















0














Not sure if this is exactly an "answer" or not, but it did solve my issue, as far as this particular case is concerned. This seems to be related to ACPI.



I edited my /etc/grub.conf file to add the following to the end of the kernel line:



noapic acpi=off pci=noacpi



I'm now no longer seeing any errors in the log.



Could it be that power saving features were taking the interface down and bringing it back up in a slower mode? Or could it be some bug with the hardware/firmware and this kernel? Not sure ~ but this at least fixes my issue (it's a production server, whose network we never want to be down).






share|improve this answer























    Your Answer








    StackExchange.ready(function()
    var channelOptions =
    tags: "".split(" "),
    id: "2"
    ;
    initTagRenderer("".split(" "), "".split(" "), channelOptions);

    StackExchange.using("externalEditor", function()
    // Have to fire editor after snippets, if snippets enabled
    if (StackExchange.settings.snippets.snippetsEnabled)
    StackExchange.using("snippets", function()
    createEditor();
    );

    else
    createEditor();

    );

    function createEditor()
    StackExchange.prepareEditor(
    heartbeatType: 'answer',
    autoActivateHeartbeat: false,
    convertImagesToLinks: true,
    noModals: true,
    showLowRepImageUploadWarning: true,
    reputationToPostImages: 10,
    bindNavPrevention: true,
    postfix: "",
    imageUploader:
    brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
    contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
    allowUrls: true
    ,
    onDemand: true,
    discardSelector: ".discard-answer"
    ,immediatelyShowMarkdownHelp:true
    );



    );













    draft saved

    draft discarded


















    StackExchange.ready(
    function ()
    StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f766109%2ftroubleshooting-nic-logs-by-netdev-watchdog%23new-answer', 'question_page');

    );

    Post as a guest















    Required, but never shown

























    1 Answer
    1






    active

    oldest

    votes








    1 Answer
    1






    active

    oldest

    votes









    active

    oldest

    votes






    active

    oldest

    votes









    0














    Not sure if this is exactly an "answer" or not, but it did solve my issue, as far as this particular case is concerned. This seems to be related to ACPI.



    I edited my /etc/grub.conf file to add the following to the end of the kernel line:



    noapic acpi=off pci=noacpi



    I'm now no longer seeing any errors in the log.



    Could it be that power saving features were taking the interface down and bringing it back up in a slower mode? Or could it be some bug with the hardware/firmware and this kernel? Not sure ~ but this at least fixes my issue (it's a production server, whose network we never want to be down).






    share|improve this answer



























      0














      Not sure if this is exactly an "answer" or not, but it did solve my issue, as far as this particular case is concerned. This seems to be related to ACPI.



      I edited my /etc/grub.conf file to add the following to the end of the kernel line:



      noapic acpi=off pci=noacpi



      I'm now no longer seeing any errors in the log.



      Could it be that power saving features were taking the interface down and bringing it back up in a slower mode? Or could it be some bug with the hardware/firmware and this kernel? Not sure ~ but this at least fixes my issue (it's a production server, whose network we never want to be down).






      share|improve this answer

























        0












        0








        0







        Not sure if this is exactly an "answer" or not, but it did solve my issue, as far as this particular case is concerned. This seems to be related to ACPI.



        I edited my /etc/grub.conf file to add the following to the end of the kernel line:



        noapic acpi=off pci=noacpi



        I'm now no longer seeing any errors in the log.



        Could it be that power saving features were taking the interface down and bringing it back up in a slower mode? Or could it be some bug with the hardware/firmware and this kernel? Not sure ~ but this at least fixes my issue (it's a production server, whose network we never want to be down).






        share|improve this answer













        Not sure if this is exactly an "answer" or not, but it did solve my issue, as far as this particular case is concerned. This seems to be related to ACPI.



        I edited my /etc/grub.conf file to add the following to the end of the kernel line:



        noapic acpi=off pci=noacpi



        I'm now no longer seeing any errors in the log.



        Could it be that power saving features were taking the interface down and bringing it back up in a slower mode? Or could it be some bug with the hardware/firmware and this kernel? Not sure ~ but this at least fixes my issue (it's a production server, whose network we never want to be down).







        share|improve this answer












        share|improve this answer



        share|improve this answer










        answered Mar 29 '16 at 13:05









        csr19uscsr19us

        63




        63



























            draft saved

            draft discarded
















































            Thanks for contributing an answer to Server Fault!


            • Please be sure to answer the question. Provide details and share your research!

            But avoid


            • Asking for help, clarification, or responding to other answers.

            • Making statements based on opinion; back them up with references or personal experience.

            To learn more, see our tips on writing great answers.




            draft saved


            draft discarded














            StackExchange.ready(
            function ()
            StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fserverfault.com%2fquestions%2f766109%2ftroubleshooting-nic-logs-by-netdev-watchdog%23new-answer', 'question_page');

            );

            Post as a guest















            Required, but never shown





















































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown

































            Required, but never shown














            Required, but never shown












            Required, but never shown







            Required, but never shown







            Popular posts from this blog

            Club Baloncesto Breogán Índice Historia | Pavillón | Nome | O Breogán na cultura popular | Xogadores | Adestradores | Presidentes | Palmarés | Historial | Líderes | Notas | Véxase tamén | Menú de navegacióncbbreogan.galCadroGuía oficial da ACB 2009-10, páxina 201Guía oficial ACB 1992, páxina 183. Editorial DB.É de 6.500 espectadores sentados axeitándose á última normativa"Estudiantes Junior, entre as mellores canteiras"o orixinalHemeroteca El Mundo Deportivo, 16 setembro de 1970, páxina 12Historia do BreogánAlfredo Pérez, o último canoneiroHistoria C.B. BreogánHemeroteca de El Mundo DeportivoJimmy Wright, norteamericano do Breogán deixará Lugo por ameazas de morteResultados de Breogán en 1986-87Resultados de Breogán en 1990-91Ficha de Velimir Perasović en acb.comResultados de Breogán en 1994-95Breogán arrasa al Barça. "El Mundo Deportivo", 27 de setembro de 1999, páxina 58CB Breogán - FC BarcelonaA FEB invita a participar nunha nova Liga EuropeaCharlie Bell na prensa estatalMáximos anotadores 2005Tempada 2005-06 : Tódolos Xogadores da Xornada""Non quero pensar nunha man negra, mais pregúntome que está a pasar""o orixinalRaúl López, orgulloso dos xogadores, presume da boa saúde económica do BreogánJulio González confirma que cesa como presidente del BreogánHomenaxe a Lisardo GómezA tempada do rexurdimento celesteEntrevista a Lisardo GómezEl COB dinamita el Pazo para forzar el quinto (69-73)Cafés Candelas, patrocinador del CB Breogán"Suso Lázare, novo presidente do Breogán"o orixinalCafés Candelas Breogán firma el mayor triunfo de la historiaEl Breogán realizará 17 homenajes por su cincuenta aniversario"O Breogán honra ao seu fundador e primeiro presidente"o orixinalMiguel Giao recibiu a homenaxe do PazoHomenaxe aos primeiros gladiadores celestesO home que nos amosa como ver o Breo co corazónTita Franco será homenaxeada polos #50anosdeBreoJulio Vila recibirá unha homenaxe in memoriam polos #50anosdeBreo"O Breogán homenaxeará aos seus aboados máis veteráns"Pechada ovación a «Capi» Sanmartín e Ricardo «Corazón de González»Homenaxe por décadas de informaciónPaco García volve ao Pazo con motivo do 50 aniversario"Resultados y clasificaciones""O Cafés Candelas Breogán, campión da Copa Princesa""O Cafés Candelas Breogán, equipo ACB"C.B. Breogán"Proxecto social"o orixinal"Centros asociados"o orixinalFicha en imdb.comMario Camus trata la recuperación del amor en 'La vieja música', su última película"Páxina web oficial""Club Baloncesto Breogán""C. B. Breogán S.A.D."eehttp://www.fegaba.com

            Vilaño, A Laracha Índice Patrimonio | Lugares e parroquias | Véxase tamén | Menú de navegación43°14′52″N 8°36′03″O / 43.24775, -8.60070

            Cegueira Índice Epidemioloxía | Deficiencia visual | Tipos de cegueira | Principais causas de cegueira | Tratamento | Técnicas de adaptación e axudas | Vida dos cegos | Primeiros auxilios | Crenzas respecto das persoas cegas | Crenzas das persoas cegas | O neno deficiente visual | Aspectos psicolóxicos da cegueira | Notas | Véxase tamén | Menú de navegación54.054.154.436928256blindnessDicionario da Real Academia GalegaPortal das Palabras"International Standards: Visual Standards — Aspects and Ranges of Vision Loss with Emphasis on Population Surveys.""Visual impairment and blindness""Presentan un plan para previr a cegueira"o orixinalACCDV Associació Catalana de Cecs i Disminuïts Visuals - PMFTrachoma"Effect of gene therapy on visual function in Leber's congenital amaurosis"1844137110.1056/NEJMoa0802268Cans guía - os mellores amigos dos cegosArquivadoEscola de cans guía para cegos en Mortágua, PortugalArquivado"Tecnología para ciegos y deficientes visuales. Recopilación de recursos gratuitos en la Red""Colorino""‘COL.diesis’, escuchar los sonidos del color""COL.diesis: Transforming Colour into Melody and Implementing the Result in a Colour Sensor Device"o orixinal"Sistema de desarrollo de sinestesia color-sonido para invidentes utilizando un protocolo de audio""Enseñanza táctil - geometría y color. Juegos didácticos para niños ciegos y videntes""Sistema Constanz"L'ocupació laboral dels cecs a l'Estat espanyol està pràcticament equiparada a la de les persones amb visió, entrevista amb Pedro ZuritaONCE (Organización Nacional de Cegos de España)Prevención da cegueiraDescrición de deficiencias visuais (Disc@pnet)Braillín, un boneco atractivo para calquera neno, con ou sen discapacidade, que permite familiarizarse co sistema de escritura e lectura brailleAxudas Técnicas36838ID00897494007150-90057129528256DOID:1432HP:0000618D001766C10.597.751.941.162C97109C0155020