HDFS NFS gateway read Input/output error


I have enabled the HDFS NFSv3 gateway on our HDFS cluster by following the official documentation. Everything works well except on one Ubuntu 16.04 server machine. Below are that machine's kernel version, mount options, and sysctl -a output.



root@Linux:~$ uname -a
Linux xxx-server-001 4.15.0-46-generic #49~16.04.1-Ubuntu SMP Tue Feb 12 17:45:24 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux

root@Linux:~$ mount | grep hdfs
10.30.200.100:/ on /hdfs type nfs (rw,relatime,sync,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,noacl,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.30.200.100,mountvers=3,mountport=4242,mountproto=tcp,local_lock=all,addr=10.30.200.100)

root@Linux:~$ sysctl -a | grep nfs
fs.nfs.idmap_cache_timeout = 2
fs.nfs.nfs_callback_tcpport = 0
fs.nfs.nfs_congestion_kb = 259136
fs.nfs.nfs_mountpoint_timeout = 500
fs.nfs.nlm_grace_period = 0
fs.nfs.nlm_tcpport = 0
fs.nfs.nlm_timeout = 10
fs.nfs.nlm_udpport = 0
fs.nfs.nsm_local_state = 3
fs.nfs.nsm_use_hostnames = 0
sunrpc.nfs_debug = 0xffff
sunrpc.nfsd_debug = 0x0000
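
One way to compare this machine against a client that works might be to diff the negotiated mount options. The following is only a minimal sketch: the helper names `parse_mount_options` and `diff_mount_options` are made up for illustration, and the only real data in it is the `mount` line shown above.

```python
def parse_mount_options(mount_line):
    """Return the options inside the trailing parentheses of a `mount` line as a dict."""
    start = mount_line.rindex("(")
    end = mount_line.rindex(")")
    options = {}
    for item in mount_line[start + 1:end].split(","):
        key, sep, value = item.partition("=")
        # Flags such as "hard" or "nolock" carry no value; store True for them.
        options[key] = value if sep else True
    return options


def diff_mount_options(left, right):
    """Return {option: (left_value, right_value)} for every option that differs."""
    keys = sorted(set(left) | set(right))
    return {k: (left.get(k), right.get(k)) for k in keys if left.get(k) != right.get(k)}


# The mount line from the failing machine, copied from the output above.
failing = parse_mount_options(
    "10.30.200.100:/ on /hdfs type nfs (rw,relatime,sync,vers=3,rsize=1048576,"
    "wsize=1048576,namlen=255,hard,nolock,noacl,proto=tcp,timeo=600,retrans=2,"
    "sec=sys,mountaddr=10.30.200.100,mountvers=3,mountport=4242,mountproto=tcp,"
    "local_lock=all,addr=10.30.200.100)"
)
print(failing["rsize"], failing["wsize"], failing["vers"])
```

Passing the parsed line from a working client as the second argument of `diff_mount_options` would list only the options that differ between the two machines.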


The symptoms are as follows:



  1. ls works without error on /hdfs folders that contain only a few files, but it fails with Input/output error when the folder contains many files (more than 100 or so).


  2. After enabling NFS debugging on the machine with sudo rpcdebug -m nfs -s all, I see the following messages in dmesg whenever ls hits the Input/output error. I have checked the source code here, and it looks like some kind of buffer overflow issue. Does that mean it is a kernel bug in NFS?


[2538707.003904] NFS: dentry_delete(1232344325/sss.123.txt, 4808cc)
[2538707.003907] NFS: decode_fattr3 prematurely hit the end of our receive buffer. Remaining buffer length is 0 words.
[2538707.003914] NFS: readdir(b200/095900) returns -5


  3. Mounting the HDFS NFS gateway from other laptops or servers with sudo mount -t nfs -o vers=3,proto=tcp,nolock,noacl,sync 10.30.200.100:/ /hdfs works without any issue, which suggests the problem is probably not on the NFS gateway server itself. However, I installed the same 4.15.0-46-generic kernel on my own laptop and could not reproduce the issue there.


  4. The issue is not consistently reproducible: sometimes the listing succeeds on the second or third retry right after the gateway is mounted. However, the failure rate is above 90%, so the mount is still not usable.
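
To put a number on the intermittent failures described in the last point, a small script along these lines could be used. It is a sketch under assumptions rather than a tested diagnostic: `count_listing_failures` is a hypothetical helper, and it treats only EIO (errno 5, the `-5` in the readdir log line) as the symptom of interest.

```python
import errno
import os


def count_listing_failures(path, attempts, list_dir=os.listdir):
    """List `path` `attempts` times and return (eio_failures, successes)."""
    failures = 0
    successes = 0
    for _ in range(attempts):
        try:
            list_dir(path)
            successes += 1
        except OSError as exc:
            if exc.errno != errno.EIO:
                raise  # Anything other than EIO is a different problem; surface it.
            failures += 1
    return failures, successes
```

Calling it on one of the large directories under /hdfs, for example with `attempts=20`, would return the two counts. The NFS client may serve repeated listings from its cache, so remounting between runs is likely to give a more representative figure.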


Please let me know if there is any direction in which I could debug this strange situation. Thanks in advance!










      nfs ubuntu-16.04 hdfs






      edited Jun 9 at 8:17







      lordofire

















      asked Jun 8 at 21:27









      lordofire

      63 bronze badges



















