VMs running Kubernetes clusters go down periodically











We are running several Kubernetes clusters on a few hundred VMs. A few VMs go down every week, and we bring them back up. Our metrics show that CPU and memory usage is low to moderate on these VMs when they go down. Other VM metrics (such as network traffic) don't show any unusual patterns either. There are no specific messages in /var/log/messages when the VMs go down.



Kubernetes version: 1.9
Linux kernel version: 4.1.12-124.19.5.el7uek.x86_64



Are there other logs or diagnostic information we can check to get to the root cause of the VM outages?










      kubernetes virtual-machine






Asked 2 days ago by sengs
























          1 Answer













We usually also check the host journal, especially if you are running kubelet as a systemd service.

There is a good tutorial on DigitalOcean explaining journald:

https://www.digitalocean.com/community/tutorials/how-to-use-journalctl-to-view-and-manipulate-systemd-logs

It might give you some clue as to why your kube nodes are crashing.
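
As a rough illustration of what checking the journal could look like, here is a minimal Python sketch that reads the journal of the previous boot (the one that ended in the outage) and keeps only lines that look crash-related. It assumes the journal is stored persistently (otherwise `journalctl -b -1` has nothing to show after a reboot) and that kubelet runs as a systemd unit named `kubelet`. The marker list and the helper names `journalctl_args`, `find_crash_hints` and `scan_previous_boot` are made up for this example.

```python
import subprocess

# Substrings that often show up in kernel/systemd logs around a crash.
# Illustrative assumption only, not an exhaustive list.
CRASH_MARKERS = (
    "kernel panic",
    "out of memory",
    "oom-killer",
    "soft lockup",
    "hung_task",
    "blocked for more than",
    "call trace",
)


def journalctl_args(boot_offset=-1, unit=None, kernel_only=False):
    """Build a journalctl command line for a single boot.

    boot_offset=-1 selects the previous boot, 0 the current one.
    """
    args = ["journalctl", "--no-pager", "-b", str(boot_offset)]
    if kernel_only:
        args.append("-k")      # kernel messages only
    if unit:
        args += ["-u", unit]   # restrict to one systemd unit
    return args


def find_crash_hints(lines):
    """Return the lines containing any crash marker (case-insensitive)."""
    hits = []
    for line in lines:
        lowered = line.lower()
        if any(marker in lowered for marker in CRASH_MARKERS):
            hits.append(line.rstrip("\n"))
    return hits


def scan_previous_boot(unit=None, kernel_only=False):
    """Run journalctl for the previous boot and return suspicious lines."""
    result = subprocess.run(
        journalctl_args(-1, unit=unit, kernel_only=kernel_only),
        capture_output=True,
        text=True,
        check=False,
    )
    return find_crash_hints(result.stdout.splitlines())
```

Calling `scan_previous_boot(kernel_only=True)` on a node after it comes back up would list kernel-level hints, and `scan_previous_boot(unit="kubelet")` would do the same for the kubelet unit. If both come back empty, the last lines of the previous boot's journal are still worth reading by hand, since an abrupt power-off or hypervisor-side kill may leave no message at all.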






Answered yesterday by Bal Chua






























                     
