Kubernetes Pod OOMKilled Solution












I have a service running on Kubernetes that processes files passed from another resource. A single file can be anywhere from 10 MB to 1 GB.



Recently I've been seeing the pod die with an OOMKilled error:



    State:          Running
      Started:      Sun, 11 Nov 2018 07:28:46 +0000
    Last State:     Terminated
      Reason:       OOMKilled
      Exit Code:    137
      Started:      Fri, 09 Nov 2018 18:49:46 +0000
      Finished:     Sun, 11 Nov 2018 07:28:45 +0000


I mitigated the issue by raising the memory limit on the pod, but I am concerned that whenever there is a spike in traffic or file size, we will run into the OOMKilled issue again. On the other hand, if I set the memory limit too high, I am concerned it will cause trouble on the node hosting this pod.



I read through the best practices given by Kubernetes: https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/#best-practices. But I am not sure whether adding --eviction-hard and --system-reserved=memory would resolve the issue.



Has anyone had experience with a similar issue before?



Any help would be appreciated.










  • Paste your application log and start from there. If there is no room for optimization at the application level, then allocate more memory.

    – Ijaz Ahmad Khan
    Nov 13 '18 at 9:59


















memory kubernetes






edited Nov 12 '18 at 23:58









Rico

asked Nov 12 '18 at 23:06









Edward













1 Answer
This is less a Kubernetes or container runtime issue than a question of memory management in your application, and it will vary depending on the language runtime, for example whether something like the JVM is running your application.



You generally want to set an upper limit on the memory usage in the application, for example, a maximum heap space in your JVM, then leave a little headroom for garbage collection and overruns.
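As a rough illustration of the headroom idea, here is a minimal Java sketch that derives a heap cap from a container memory limit. The class name `HeapBudget`, the 4 GiB limit, and the 25% headroom fraction are all assumptions made up for this example, not values from the question; the right fraction depends on how much non-heap memory (thread stacks, metaspace, direct buffers) the application actually uses. On recent JVMs the `-XX:MaxRAMPercentage` flag can express a similar ratio without hard-coding a size.

```java
// Hypothetical helper (not part of any library): derives a JVM heap cap
// from a container memory limit, leaving a fraction free for non-heap use.
final class HeapBudget {
    private HeapBudget() {}

    /** Returns the heap size in bytes after reserving headroomFraction of the limit. */
    static long maxHeapBytes(long containerLimitBytes, double headroomFraction) {
        if (containerLimitBytes <= 0) {
            throw new IllegalArgumentException("container limit must be positive");
        }
        if (headroomFraction < 0.0 || headroomFraction >= 1.0) {
            throw new IllegalArgumentException("headroom fraction must be in [0, 1)");
        }
        return (long) (containerLimitBytes * (1.0 - headroomFraction));
    }

    /** Formats a heap size as a -Xmx flag in whole mebibytes (rounded down). */
    static String asXmxFlag(long heapBytes) {
        return "-Xmx" + (heapBytes / (1024L * 1024L)) + "m";
    }

    public static void main(String[] args) {
        long limit = 4L * 1024 * 1024 * 1024;  // assumed 4 GiB pod memory limit
        long heap = maxHeapBytes(limit, 0.25); // assumed 25% headroom
        System.out.println("Suggested flag: " + asXmxFlag(heap));

        // What the running JVM actually allows, for comparison with the pod limit.
        long actualMiB = Runtime.getRuntime().maxMemory() / (1024L * 1024L);
        System.out.println("Heap cap seen by this JVM: " + actualMiB + " MiB");
    }
}
```

Comparing the value reported by `Runtime.getRuntime().maxMemory()` inside the pod against the pod's memory limit is a quick way to check whether the heap cap took effect.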



Another example is the Go runtime: it looks like its maintainers have discussed memory management, but with no solution as of this writing. For these cases, it might be good to manually set a ulimit on virtual memory for your application's specific process (if you have a leak, you will see other types of errors), or to use timeout.



There's also manual cgroup management, but then again that's exactly what Docker and Kubernetes are supposed to do.



This is a good article with some insights about managing a JVM in containers.






  • Thanks Rico! My service is in Java, so the JVM is definitely a factor in how memory is handled. I set the heap limit 1 GB lower than the Kubernetes pod's memory limit. I'm not sure whether that is causing any issue for GC.

    – Edward
    Nov 13 '18 at 21:16











  • Edward, if your Java code is spawning a large number of threads, setting the JVM maximum heap isn't going to help. Java threads use memory outside the JVM heap, which can cause a container OOM kill rather than a Java OutOfMemoryError. Bottom line: it really depends on what the Java application is doing when it receives big files.

    – Bal Chua
    Nov 16 '18 at 1:03
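To make the "what is the application doing with big files" point concrete, here is a minimal sketch of processing a file in fixed-size chunks, so that heap usage stays roughly constant whether the input is 10 MB or 1 GB. The class name `StreamingDigest`, the 64 KiB chunk size, and the choice of a SHA-256 digest as the "processing" step are assumptions for illustration only; the question does not say what the service does with each file.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical example: reads a file through one reusable buffer instead of
// loading it whole, so memory use does not grow with file size.
final class StreamingDigest {
    private static final int CHUNK_BYTES = 64 * 1024; // assumed chunk size

    private StreamingDigest() {}

    /** Computes the SHA-256 digest of a file while holding only one chunk in memory. */
    static byte[] sha256(Path file) throws IOException, NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance("SHA-256");
        byte[] buffer = new byte[CHUNK_BYTES];
        try (InputStream in = Files.newInputStream(file)) {
            int read;
            // read() returns -1 at end of stream; otherwise the number of bytes filled.
            while ((read = in.read(buffer)) != -1) {
                digest.update(buffer, 0, read);
            }
        }
        return digest.digest();
    }

    public static void main(String[] args) throws Exception {
        Path file = Files.createTempFile("sample", ".bin");
        try {
            Files.write(file, new byte[200_000]); // stand-in for an incoming file
            byte[] hash = sha256(file);
            System.out.println("Digest length in bytes: " + hash.length);
        } finally {
            Files.deleteIfExists(file);
        }
    }
}
```

If the service instead reads each file fully into a byte array (for example with `Files.readAllBytes`), several concurrent 1 GB files could exhaust the heap regardless of how the pod limit is tuned.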











answered Nov 13 '18 at 0:14









Rico







