Hung kernel tasks after unclean shutdown of ceph cluster

I am running Ceph (created by the rook-ceph operator v0.9.3) on Kubernetes v1.13. After an unclean shutdown of our cluster, some processes randomly go into uninterruptible sleep. After some time, the Kubernetes cluster fails to schedule new Pods. Looking through dmesg, I found this:

[ 3021.890423] INFO: task tp_fstore_op:22689 blocked for more than 120 seconds.
[ 3021.890456] Tainted: G O 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[ 3021.890480] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3021.890504] tp_fstore_op D 0 22689 20967 0x00000000
[ 3021.890508] ffff93c0a5dc0080 0000000000000000 ffff93d137954540 ffff93c1fe8d8980
[ 3021.890510] ffff93bf42e823c0 ffffb9ae3834b7b0 ffffffff9e0144b9 0000000000008000
[ 3021.890512] 0000000000000040 ffff93c1fe8d8980 ffff93c0a9156300 ffff93d137954540
[ 3021.890515] Call Trace:
[ 3021.890524] [<ffffffff9e0144b9>] ? __schedule+0x239/0x6f0
[ 3021.890571] [<ffffffffc0b69321>] ? xfs_reclaim_inode+0x131/0x340 [xfs]
[ 3021.890574] [<ffffffff9e0149a2>] ? schedule+0x32/0x80
[ 3021.890576] [<ffffffff9e017d4d>] ? schedule_timeout+0x1dd/0x380
[ 3021.890602] [<ffffffffc0b8556d>] ? _xfs_log_force_lsn+0x22d/0x320 [xfs]
[ 3021.890613] [<ffffffff9daf107e>] ? ktime_get+0x3e/0xb0
[ 3021.890635] [<ffffffffc0b69321>] ? xfs_reclaim_inode+0x131/0x340 [xfs]
[ 3021.890638] [<ffffffff9e01421d>] ? io_schedule_timeout+0x9d/0x100
[ 3021.890659] [<ffffffffc0b71e24>] ? __xfs_iunpin_wait+0xd4/0x160 [xfs]
[ 3021.890662] [<ffffffff9dabd3f0>] ? wake_atomic_t_function+0x60/0x60
[ 3021.890681] [<ffffffffc0b69321>] ? xfs_reclaim_inode+0x131/0x340 [xfs]
[ 3021.890699] [<ffffffffc0b6970e>] ? xfs_reclaim_inodes_ag+0x1de/0x300 [xfs]
[ 3021.890702] [<ffffffff9db91885>] ? node_dirty_ok+0x125/0x170
[ 3021.890704] [<ffffffff9dd53419>] ? list_del+0x9/0x30
[ 3021.890707] [<ffffffff9dbe599a>] ? page_is_poisoned+0xa/0x20
[ 3021.890709] [<ffffffff9db8ba0e>] ? get_page_from_freelist+0x88e/0xb20
[ 3021.890712] [<ffffffff9daae1ff>] ? select_task_rq_fair+0x51f/0x7e0
[ 3021.890714] [<ffffffff9daad9d5>] ? select_idle_sibling+0x25/0x330
[ 3021.890716] [<ffffffff9daa5674>] ? try_to_wake_up+0x54/0x3c0
[ 3021.890734] [<ffffffffc0b6a771>] ? xfs_reclaim_inodes_nr+0x31/0x40 [xfs]
[ 3021.890736] [<ffffffff9dc0eed8>] ? super_cache_scan+0x188/0x190
[ 3021.890738] [<ffffffff9db97a0a>] ? shrink_slab.part.38+0x21a/0x440
[ 3021.890740] [<ffffffff9db9c3ca>] ? shrink_node+0x10a/0x340
[ 3021.890742] [<ffffffff9db9c6f1>] ? do_try_to_free_pages+0xf1/0x310
[ 3021.890744] [<ffffffff9dd38b6a>] ? __next_node_in+0x3a/0x50
[ 3021.890745] [<ffffffff9db9cb73>] ? try_to_free_mem_cgroup_pages+0xc3/0x1a0
[ 3021.890748] [<ffffffff9dbfd147>] ? try_charge+0x147/0x6f0
[ 3021.890750] [<ffffffff9dc01237>] ? mem_cgroup_try_charge+0x67/0x1b0
[ 3021.890752] [<ffffffff9dbbb1d2>] ? handle_mm_fault+0x10e2/0x1310
[ 3021.890755] [<ffffffff9dc0ac30>] ? new_sync_write+0xe0/0x130
[ 3021.890758] [<ffffffff9da622f5>] ? __do_page_fault+0x255/0x4f0
[ 3021.890760] [<ffffffff9e01a618>] ? page_fault+0x28/0x30

Immediately after that, accesses to the RBDs produce similar errors:

[ 3021.890820] INFO: task xfsaild/rbd2:23307 blocked for more than 120 seconds.
[ 3021.890845] Tainted: G O 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[ 3021.890867] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3021.890896] xfsaild/rbd2 D 0 23307 2 0x00000000
[ 3021.890898] ffff93c182e46480 0000000000000000 ffff93d0d3a4ca00 ffff93d1fdb58980
[ 3021.890900] ffff93d1f6a4a180 ffffb9ae24e07d80 ffffffff9e0144b9 0000000000000246
[ 3021.890903] 00ffffff9dae787d ffff93d1fdb58980 e182622c538e97d5 ffff93d0d3a4ca00
[ 3021.890905] Call Trace:
[ 3021.890909] [<ffffffff9e0144b9>] ? __schedule+0x239/0x6f0
[ 3021.890911] [<ffffffff9e0149a2>] ? schedule+0x32/0x80
[ 3021.890948] [<ffffffffc0b8508c>] ? _xfs_log_force+0x15c/0x2b0 [xfs]
[ 3021.890949] [<ffffffff9daa5a70>] ? wake_up_q+0x70/0x70
[ 3021.890973] [<ffffffffc0b92895>] ? xfsaild+0x1a5/0x7a0 [xfs]
[ 3021.890994] [<ffffffffc0b926f0>] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[ 3021.890996] [<ffffffff9da9a5d9>] ? kthread+0xd9/0xf0
[ 3021.890998] [<ffffffff9e019364>] ? __switch_to_asm+0x34/0x70
[ 3021.891000] [<ffffffff9da9a500>] ? kthread_park+0x60/0x60
[ 3021.891002] [<ffffffff9e0193f7>] ? ret_from_fork+0x57/0x70
[ 3021.891004] INFO: task xfsaild/rbd3:23438 blocked for more than 120 seconds.
[ 3021.891027] Tainted: G O 4.9.0-8-amd64 #1 Debian 4.9.144-3.1
[ 3021.891050] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 3021.891074] xfsaild/rbd3 D 0 23438 2 0x00000000
[ 3021.891075] ffff93c0fb0464c0 0000000000000000 ffff93d0a88f61c0 ffff93d1fdd18980
[ 3021.891077] ffff93d1f6a80340 ffffb9ae24e37d80 ffffffff9e0144b9 0000000000000246
[ 3021.891080] 00ffffff9dae787d ffff93d1fdd18980 10168cfc448e06f4 ffff93d0a88f61c0
[ 3021.891081] Call Trace:
[ 3021.891084] [<ffffffff9e0144b9>] ? __schedule+0x239/0x6f0
[ 3021.891086] [<ffffffff9e0149a2>] ? schedule+0x32/0x80
[ 3021.891108] [<ffffffffc0b8508c>] ? _xfs_log_force+0x15c/0x2b0 [xfs]
[ 3021.891109] [<ffffffff9daa5a70>] ? wake_up_q+0x70/0x70
[ 3021.891130] [<ffffffffc0b92895>] ? xfsaild+0x1a5/0x7a0 [xfs]
[ 3021.891151] [<ffffffffc0b926f0>] ? xfs_trans_ail_cursor_first+0x80/0x80 [xfs]
[ 3021.891153] [<ffffffff9da9a5d9>] ? kthread+0xd9/0xf0
[ 3021.891154] [<ffffffff9e019364>] ? __switch_to_asm+0x34/0x70
[ 3021.891156] [<ffffffff9da9a500>] ? kthread_park+0x60/0x60
[ 3021.891158] [<ffffffff9e0193f7>] ? ret_from_fork+0x57/0x70

There are more errors in dmesg, but they all follow the same pattern: a process tries to perform an operation on XFS, the kernel task gets stuck, and the process remains in uninterruptible sleep.
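
To check that every report really follows this pattern, the dmesg output can be summarized per hung task. The following is a minimal sketch, assuming the 4.9-style report layout shown above; the function name `summarize_hung_tasks` and both regular expressions are illustrative, not part of any kernel or Ceph tooling.

```python
import re
from collections import Counter

# Header printed by the kernel's hung-task detector, e.g.
# "[ 3021.890423] INFO: task tp_fstore_op:22689 blocked for more than 120 seconds."
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)

# One stack frame, e.g.
# "[<ffffffffc0b69321>] ? xfs_reclaim_inode+0x131/0x340 [xfs]"
FRAME_RE = re.compile(
    r"\[<[0-9a-f]+>\] \??\s*(?P<func>[\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<mod>\w+)\])?"
)


def summarize_hung_tasks(dmesg_text):
    """Return one dict per hung-task report with comm, pid, secs and frame counts per module."""
    reports = []
    current = None
    for line in dmesg_text.splitlines():
        header = HUNG_RE.search(line)
        if header:
            current = {
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "secs": int(header["secs"]),
                "modules": Counter(),
            }
            reports.append(current)
            continue
        frame = FRAME_RE.search(line)
        if frame and current is not None:
            # Frames without a "[module]" suffix belong to the core kernel image.
            current["modules"][frame["mod"] or "kernel"] += 1
    return reports
```

Feeding it the captured text of `dmesg` gives one entry per blocked task; a report whose `modules` counter has no `xfs` frames would be an exception to the pattern described above.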

Shortly after, libceph reports that the OSDs are down:

[ 4218.521314] libceph: osd0 down

Journalctl does not report any additional errors.
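
Since the logs give nothing further, the tasks stuck in uninterruptible sleep can also be listed directly from `/proc`. This is a minimal sketch, assuming a Linux host with procfs mounted at `/proc`; the helper names `parse_stat` and `uninterruptible_tasks` are made up for illustration. It walks the per-thread `task` directories because names such as `tp_fstore_op` in the traces above are threads of a larger process, not processes of their own.

```python
import os


def parse_stat(stat_text):
    """Split one /proc/<pid>/stat line into (id, comm, state)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so use the first '(' and the last ')' as delimiters.
    lpar = stat_text.index("(")
    rpar = stat_text.rindex(")")
    task_id = int(stat_text[:lpar])
    comm = stat_text[lpar + 1:rpar]
    state = stat_text[rpar + 1:].split()[0]
    return task_id, comm, state


def uninterruptible_tasks(proc_root="/proc"):
    """Return (tid, comm) for every thread currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        task_dir = os.path.join(proc_root, entry, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as fh:
                    task_id, comm, state = parse_stat(fh.read())
            except (OSError, ValueError, IndexError):
                continue  # thread exited or line was unreadable
            if state == "D":
                found.append((task_id, comm))
    return found
```

Calling `uninterruptible_tasks()` a few times in a row shows whether the same thread IDs stay in state `D`, which distinguishes a genuinely hung task from one that is briefly waiting on I/O.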

The unclean shutdown became necessary because similar problems occurred when a Kubernetes Pod tried to write a file that was too large for the attached Volume. The Volume was provided by rook-ceph. This is the config I am using:

Cluster config:

apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: "ceph/ceph:v13.2.5-20190319"
  dataDirHostPath: "/var/rook/data"
  dashboard:
    enabled: True
    port: 80
    ssl: False
  network:
    hostNetwork: False # use SDN (Canal) as network
  mon:
    count: 3
    allowMultiplePerNode: True
  resources: # http://docs.ceph.com/docs/mimic/start/hardware-recommendations/
    mgr:
      requests:
        cpu: 4
        memory: "2Gi"
      limits:
        cpu: 4
        memory: "2Gi"
    mon:
      requests:
        cpu: 0.5
        memory: "2Gi"
      limits:
        cpu: 0.5
        memory: "2Gi"
    osd:
      requests:
        cpu: 2
        memory: "5Gi"
      limits:
        cpu: 2
        memory: "5Gi"
  storage:
    useAllNodes: False
    nodes:
    - name: "kubernetes-master" # matches node label: kubernetes.io/hostname
      useAllDevices: False
      directories:
      - path: "/var/rook/filestore"
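
For reference, the resource requests in this spec can be added up to see what the Ceph daemons reserve in total. This is only a rough sketch: the per-daemon values are copied from the `resources` section above and the mon count from `mon.count`, but the mgr and OSD counts of one each are assumptions (one mgr, and one OSD for the single directory on the single storage node).

```python
# Per-daemon requests copied from the CephCluster spec above.
# Counts marked "assumed" are not stated in the spec.
DAEMONS = {
    "mgr": {"count": 1, "cpu": 4.0, "memory_gi": 2},  # count assumed
    "mon": {"count": 3, "cpu": 0.5, "memory_gi": 2},  # spec: mon.count = 3
    "osd": {"count": 1, "cpu": 2.0, "memory_gi": 5},  # count assumed
}


def total_requests(daemons):
    """Return (total CPU cores, total memory in Gi) requested by all daemons."""
    cpu = sum(d["count"] * d["cpu"] for d in daemons.values())
    memory_gi = sum(d["count"] * d["memory_gi"] for d in daemons.values())
    return cpu, memory_gi


print(total_requests(DAEMONS))  # (7.5, 13)
```

Because requests and limits are identical in the spec, the same totals apply to the limits under these assumptions.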

BlockPool config:

apiVersion: ceph.rook.io/v1
kind: CephBlockPool
metadata:
  name: volatile-replicapool
  namespace: rook-ceph
spec:
  failureDomain: osd
  replicated:
    size: 1

And the StorageClasses:

apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-block-development
provisioner: ceph.rook.io/block
parameters:
  blockPool: volatile-replicapool
  clusterNamespace: rook-ceph
  fstype: xfs
reclaimPolicy: Delete
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: ceph-block-production
provisioner: ceph.rook.io/block
parameters:
  blockPool: volatile-replicapool
  clusterNamespace: rook-ceph
  fstype: xfs
reclaimPolicy: Retain

I am running Linux 4.9.0-8-amd64 #1 SMP Debian 4.9.144-3.1 (2019-02-19) x86_64.

Any pointers as to how to debug this issue would be greatly appreciated.

Thanks in advance.

asked Apr 4 at 9:55

strangedev

kubernetes xfs ceph
