<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[padz.dev: System Design]]></title><description><![CDATA[System Design the way it should be taught: with the hard decisions, the real tradeoffs, and the mistakes that cost dearly in production.]]></description><link>https://blog.padz.dev/s/system-design</link><image><url>https://substackcdn.com/image/fetch/$s_!d0XQ!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28f3aae4-c6f9-4a2b-9671-b0ff6175501a_512x512.png</url><title>padz.dev: System Design</title><link>https://blog.padz.dev/s/system-design</link></image><generator>Substack</generator><lastBuildDate>Tue, 14 Apr 2026 07:52:14 GMT</lastBuildDate><atom:link href="https://blog.padz.dev/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Bruno Padilha]]></copyright><language><![CDATA[pt-br]]></language><webMaster><![CDATA[brunopadz@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[brunopadz@substack.com]]></itunes:email><itunes:name><![CDATA[Bruno Padilha]]></itunes:name></itunes:owner><itunes:author><![CDATA[Bruno Padilha]]></itunes:author><googleplay:owner><![CDATA[brunopadz@substack.com]]></googleplay:owner><googleplay:email><![CDATA[brunopadz@substack.com]]></googleplay:email><googleplay:author><![CDATA[Bruno Padilha]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[Graceful Shutdown - How Not to Lose Thousands of Reais in 5 Minutes of Deploy]]></title><description><![CDATA[A complete guide to shutting down applications in a controlled way]]></description><link>https://blog.padz.dev/p/graceful-shutdown-como-nao-perder</link><guid isPermaLink="false">https://blog.padz.dev/p/graceful-shutdown-como-nao-perder</guid><dc:creator><![CDATA[Bruno Padilha]]></dc:creator><pubDate>Tue, 13 Jan 2026 13:03:07 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!d7B8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d7B8!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d7B8!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 424w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 848w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!d7B8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg" width="1456" height="911" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/d2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:911,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:691631,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d7B8!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 424w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 848w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!d7B8!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd2be8574-b8ec-4d10-9785-75ebbd2f637a_2832x1772.jpeg 1456w" 
sizes="100vw" fetchpriority="high"></picture></div></a><figcaption class="image-caption">Photo by <a href="https://unsplash.com/@sydsujuaan?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Syd Sujuaan</a> on <a href="https://unsplash.com/photos/closeup-photo-of-rippling-sea-water-08I-aPLola0?utm_source=unsplash&amp;utm_medium=referral&amp;utm_content=creditCopyText">Unsplash</a></figcaption></figure></div><p>Picture this: you work at a high-traffic e-commerce company and, right at the start of the year, sales leadership and marketing decide to run a huge clearance sale for products that did not sell on Black Friday, with aggressive discounts and a short deadline.</p><p>Everything is live, campaigns are running on social media, traffic is climbing, everyone is excited. Then someone notices that a wrong price calculation is being applied to part of the catalog, directly affecting some of the products on sale.</p><p>There is not much to discuss: it has to be fixed. The only way out is a new deploy.</p><p>Your team ships the fix, CI passes, the deploy starts. Minutes later, the support team pings Slack saying that several customers could not complete purchases in the last 5 minutes, exactly the time the rollout of the new version took. Abandoned carts, failing payments, public complaints. Awkward.</p><p>You start investigating and realize that many requests simply never completed during the deploy. Some died halfway through. Others never got a response.</p><p>What happened? Your application died in the middle of active transactions. Orders being processed were aborted. Payments were left in inconsistent states. Workers processing confirmation emails died, leaving orphaned events in Kafka.</p><p>Now what? How do you keep this from happening again?</p><p>This is where Graceful Shutdown comes in: a way of stopping an application in a controlled manner, ensuring that in-flight requests finish before the process dies.</p><h2>What is Graceful Shutdown, really?</h2><p>Graceful Shutdown is an operational contract between your application and the environment it runs in. 
To satisfy that contract, you need to guarantee three things:</p><ol><li><p>Stop accepting new requests and workloads: stop receiving new HTTP requests; stop consuming messages from queues (Kafka, RabbitMQ, SQS); do all of this without dropping connections already established with the database or cache.</p></li><li><p>Finish all in-flight processing: wait for every active request to complete; process messages that have already been pulled from the queue; if that takes longer than an acceptable time, respond with a predictable error rather than a generic timeout.</p></li><li><p>Release resources: close database connections; commit Kafka offsets; release file locks; end any active communication with external systems.</p></li></ol><p>In this article I will show how to implement this in Go, but the concept applies to any language, runtime, or orchestration platform.</p><h1>How do processes know they should die?</h1><p>Before going deeper into Graceful Shutdown, it is important to understand how processes are &#8220;told&#8221; that they need to stop. Let's go back to the e-commerce scenario: when you deployed, what exactly happened from the point of view of the process running in the container?</p><h2>Signals</h2><p>On Unix-like systems, processes are notified through signals. 
Signals are software interrupts sent by the operating system (or by another process) to tell a process that something relevant happened and an action needs to be taken.</p><p>When a signal is delivered, the normal flow of execution is interrupted so the process can react to the event.</p><p>However, not every process reacts to a signal the same way, and not every signal allows a reaction.</p><p>There are three possible behaviors:</p><h3>Signal handler</h3><p>A signal handler is a function registered by the application to deal with a specific signal. When the signal is received, the process does not die automatically. Instead, the handler runs, allowing the application to:</p><ul><li><p>Stop accepting new requests;</p></li><li><p>Wait for in-flight requests;</p></li><li><p>Release resources;</p></li><li><p>Exit in a controlled way.</p></li></ul><h3>Default action</h3><p>If the application does not define a handler, the operating system runs the default action associated with the signal. Depending on the signal, that action can be:</p><ul><li><p>Terminate the process immediately;</p></li><li><p>Ignore the signal;</p></li><li><p>Stop (suspend) the process.</p></li></ul><h3>Unblockable signals (signals that cannot be handled)</h3><p>Some signals cannot be caught or ignored, no matter what the application does. The best known are <code>SIGKILL</code> and <code>SIGSTOP</code>. When <code>SIGKILL</code> is sent, the process dies immediately, with no chance of cleanup, no handler, and no time to react; <code>SIGSTOP</code> suspends the process just as unconditionally.</p><h3>What about containerized environments?</h3><p>In Kubernetes, the process normally receives a <code>SIGTERM</code> before anything more aggressive. 
Think of it as a last polite warning to the application.</p><p>If the application is still running when the configured grace period ends, either because it ignored <code>SIGTERM</code> or because cleanup took too long, the orchestrator sends a <code>SIGKILL</code> and your application goes down.</p><h2>Back to the deploy with the pricing bug</h2><pre><code>15:29:58 - Deploy started
15:30:00 - Kubernetes sends SIGTERM to the old Pod
15:30:00 - Your application does NOT handle the signal (no handler registered)
15:30:00 - Go runtime terminates the process immediately
15:30:00 - 47 active HTTP requests aborted
15:30:00 - 12 Kafka messages left half processed
15:30:00 - 3 report-generating workers killed
15:30:01 - Support team starts receiving complaints</code></pre><h3>The common mistake</h3><p>Many teams think that &#8220;Graceful Shutdown is a Kubernetes thing&#8221;, or a load balancer thing. It is not.</p><p>Graceful Shutdown starts inside the application, at the moment it decides what to do when it receives a signal.</p><h1>Handling signals in Go</h1><p>When a Go application starts, before the <code>main</code> function even runs, the runtime registers handlers for several operating system signals. These handlers exist to make sure the process terminates predictably.</p><p>For Graceful Shutdown, though, almost all of that is noise. In practice, three signals cover nearly every real-world scenario:</p><ul><li><p><code>SIGTERM</code>: Indicates that the process should terminate. It is the signal sent by orchestrators such as Kubernetes during deploys, scale downs, or evictions. It can and should be handled by the application;</p></li><li><p><code>SIGINT</code>: Usually sent when someone presses <code>Ctrl+C</code> in the terminal. The expected behavior is a clean exit, just like with <code>SIGTERM</code>;</p></li><li><p><code>SIGHUP</code>: Historically, a signal used to indicate that the terminal was closed. Nowadays it is often used to reload configuration.</p></li></ul><p>By default, when it receives any of these signals, the Go runtime terminates the process immediately. 
This means that:</p><ul><li><p>In-flight requests are aborted;</p></li><li><p>Connections are closed without warning;</p></li><li><p>The external system that called your application will not receive a response.</p></li></ul><p>To change this behavior, you need to intercept the signals using the <code>os/signal</code> package.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!xsRp!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!xsRp!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 424w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 848w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 1272w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!xsRp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png" width="590" 
height="712.758064516129" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1498,&quot;width&quot;:1240,&quot;resizeWidth&quot;:590,&quot;bytes&quot;:215407,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!xsRp!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 424w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 848w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 1272w, https://substackcdn.com/image/fetch/$s_!xsRp!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcca73810-42c1-4305-bcbe-7618d203aa5f_1240x1498.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Basically, <code>signal.NotifyContext</code> (available since Go 1.16) returns a context that is cancelled when one of the listed signals arrives, instead of letting the runtime apply the default behavior. That gives you the chance to decide how to keep the application from terminating abruptly.</p><h2>Timeout is not a detail</h2><p>Graceful Shutdown does not happen on your schedule; it happens within the time the environment allows.</p><p>When your application receives a termination signal, there is an invisible clock running. 
If it reaches zero before your cleanup finishes, the system simply kills the process.</p><p>In Kubernetes, for example, the default behavior is as follows:</p><ul><li><p>When a deploy, scale down, or eviction starts, the Pod receives a <code>SIGTERM</code>;</p></li><li><p>From that moment on, the countdown set in <code>terminationGracePeriodSeconds</code> starts, which defaults to 30 seconds - <a href="https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/">doc</a>;</p></li><li><p>If the process is still alive at the end of that period, Kubernetes sends a <code>SIGKILL</code>.</p></li></ul><p>And there is no negotiating here: <code>SIGKILL</code> cannot be caught, handled, or ignored. The process dies immediately, in whatever state it is in.</p><p>This means all of your shutdown logic has to fit inside that window:</p><ul><li><p>Completing in-flight requests;</p></li><li><p>Responding with controlled errors if necessary;</p></li><li><p>Releasing connections;</p></li><li><p>Finishing goroutines and workers.</p></li></ul><p>In the e-commerce scenario, if you have:</p><ul><li><p>Checkout HTTP requests that take 5-8 seconds (stock validation, payment, invoice);</p></li><li><p>Workers issuing invoices (Notas Fiscais) that take 10-15 seconds;</p></li><li><p>A Kafka consumer committing offsets in batches every 30 seconds.</p></li></ul><p>You have a serious problem. The default 30-second shutdown window will not be enough.</p><h2>Stopping new requests: HTTP Server</h2><p>Let's start with the basics: making the HTTP server stop accepting new requests.</p><p>The <code>net/http</code> package has the <code>http.Server.Shutdown(ctx)</code> method, which implements Graceful Shutdown natively. 
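</p><p>A minimal sketch of that method wired to <code>signal.NotifyContext</code> could look like the following. The helper name <code>serveUntilDone</code>, the <code>:8080</code> address, and the 20-second budget are assumptions made for illustration; they are not taken from the linked examples.</p><pre><code>package main

import (
	"context"
	"errors"
	"log"
	"net/http"
	"os"
	"os/signal"
	"syscall"
	"time"
)

// serveUntilDone runs srv until ctx is cancelled, then calls Shutdown with
// its own deadline. The name and the timeout parameter are illustrative,
// not part of net/http.
func serveUntilDone(ctx context.Context, srv *http.Server, timeout time.Duration) error {
	errCh := make(chan error, 1)
	go func() { errCh &lt;- srv.ListenAndServe() }()

	select {
	case err := &lt;-errCh:
		return err // the server failed before any signal arrived
	case &lt;-ctx.Done():
	}

	// Use a fresh context: ctx is already cancelled at this point.
	shutdownCtx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()
	if err := srv.Shutdown(shutdownCtx); err != nil {
		return err // typically the deadline expired with requests still active
	}
	// After Shutdown, ListenAndServe returns http.ErrServerClosed.
	if err := &lt;-errCh; !errors.Is(err, http.ErrServerClosed) {
		return err
	}
	return nil
}

func main() {
	// Cancel ctx when SIGTERM or SIGINT arrives instead of exiting on the spot.
	ctx, stop := signal.NotifyContext(context.Background(), syscall.SIGTERM, os.Interrupt)
	defer stop()

	srv := &amp;http.Server{Addr: ":8080", Handler: http.NewServeMux()}
	if err := serveUntilDone(ctx, srv, 20*time.Second); err != nil {
		log.Fatalf("server stopped with error: %v", err)
	}
	log.Println("server stopped cleanly")
}
</code></pre><p>The shutdown context is created from <code>context.Background()</code> on purpose: the signal context is already cancelled by then, so reusing it would make <code>Shutdown</code> give up right away if any request were still in flight.</p><p>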
<code>Shutdown</code> closes the listeners so no new connections are accepted, closes idle connections, and waits for in-flight requests to complete (until the context you pass in expires). See an example <a href="https://github.com/brunopadz/signals-go/blob/main/http.go">here</a>.</p><p>If you use <a href="https://gin-gonic.com/en/">Gin</a>, it has no shutdown method of its own; you use <code>http.Server.Shutdown</code> directly, since Gin is built on top of <code>net/http</code>. Example <a href="https://github.com/brunopadz/signals-go/blob/main/gin.go">here</a>.</p><p>If you use <a href="https://gofiber.io/">Fiber</a> (my personal choice), it has its own method, <code>app.ShutdownWithContext(ctx)</code>, because it uses <a href="https://github.com/valyala/fasthttp">fasthttp</a> under the hood rather than <code>net/http</code>. <a href="https://github.com/brunopadz/signals-go/blob/main/fiber.go">Example</a>.</p><h3>Behavior common to all three</h3><p>Regardless of the implementation or framework, the behavior is the same:</p><ul><li><p>They stop accepting new connections immediately;</p></li><li><p>They wait for active requests to finish (bounded by the context timeout);</p></li><li><p>They close idle connections;</p></li><li><p>However, they do not guarantee that requests will complete if the context timeout expires.</p></li></ul><h3>The Kubernetes problem</h3><p>In Kubernetes, even after the Pod is marked for termination, kube-proxy can take a few seconds (~5-10s) to update the iptables rules and remove the Pod from the Service endpoints.</p><p>During that window, you can still receive traffic. Here is what happens:</p><pre><code>15:30:00 - SIGTERM sent
15:30:00 - app.Shutdown() called immediately
15:30:00 - Server stops accepting connections
15:30:01 - Client tries to connect &#8594; Connection refused
15:30:02 - Client tries to connect &#8594; Connection refused
15:30:05 - kube-proxy finally updates iptables
15:30:05 - Only now does traffic stop arriving</code></pre><p>The result is clients seeing connection errors or 5xx responses.</p><p>A good solution is to add a delay before calling Shutdown:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!d8B7!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!d8B7!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 424w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 848w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 1272w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!d8B7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png" width="1240" height="864" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/ef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:864,&quot;width&quot;:1240,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:159251,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!d8B7!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 424w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 848w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 1272w, https://substackcdn.com/image/fetch/$s_!d8B7!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef82586c-e20b-45a6-97ab-8dec159ae9c0_1240x864.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>And now the flow would look something like this:</p><pre><code>15:30:00 - SIGTERM received
15:30:00 - Application waits 10s (intentional delay)
15:30:05 - kube-proxy updates iptables
15:30:10 - app.Shutdown() called
15:30:10 - Server stops accepting connections (but no traffic is arriving anymore)
15:30:12 - Active requests completed
15:30:12 - Server shut down</code></pre><h2>What you forgot to shut down</h2><p>You implemented <code>http.Server.Shutdown</code>, tested it locally, deployed to production. Everything works. Then, on a Tuesday at 3 a.m., you get an alert: duplicated messages in Kafka, incomplete jobs in the database, corrupted reports.</p><p>What happened? Your HTTP server shut down correctly with Graceful Shutdown, but the rest of the application died halfway through its work.</p><h3>The real problem</h3><p>Applications are not just HTTP servers. They have:</p><ul><li><p>Queue consumers processing asynchronous events;</p></li><li><p>Background workers generating reports, sending emails, processing uploads;</p></li><li><p>Scheduled jobs doing cleanup, synchronization, periodic calculations;</p></li><li><p>Persistent connections to databases, caches, message brokers.</p></li></ul><p>When you gracefully shut down only the HTTP server, those components keep running until SIGKILL arrives (~30s later in Kubernetes). The result:</p><pre><code><code>15:30:00 - SIGTERM received
15:30:00 - HTTP server stops accepting requests
15:30:02 - HTTP server finishes the last requests
15:30:02 - HTTP server shut down
15:30:02 - Kafka consumer still processing order #8472
           &#8627; Validating stock, reserving products...
15:30:15 - Worker still generating the invoice for order #8480
           &#8627; PDF 70% complete, writing to S3...
15:30:20 - ERP synchronization job still running
           &#8627; Updated 4,823 of 10,000 products...
15:30:30 - SIGKILL &#8594; everything dies abruptly
           &#8627; Order #8472: stock reserved but offset not committed in Kafka
           &#8627; Invoice #8480: corrupted PDF in S3
           &#8627; ERP synchronization: inconsistent state</code></code></pre><p>The shutdown sequence matters too. You need to respect the dependencies between components:</p><h3>Phase 1</h3><p>First, you stop new work from entering the system:</p><ol><li><p>The HTTP server stops accepting requests;</p></li><li><p>The Kafka consumer stops consuming new events;</p></li><li><p>Scheduled jobs are cancelled so that no new runs start.</p></li></ol><p>The reason for this order is that the HTTP server can enqueue events in Kafka, and Kafka events can trigger jobs. If you stop Kafka before HTTP, requests will fail when they try to publish events.</p><h3>Phase 2</h3><p>Here the goal is to finish all processing that is already in flight:</p><ol><li><p>Active HTTP requests, for example checkout or payment;</p></li><li><p>Kafka consumers: finish processing and commit offsets;</p></li><li><p>Workers: finish generating invoices and sending emails;</p></li><li><p>Jobs: finish the ERP synchronization.</p></li></ol><h3>Phase 3</h3><p>Finally, the application releases its connections to external resources:</p><ol><li><p>Close the Kafka connections, both producer and consumer;</p></li><li><p>Close the database connection;</p></li><li><p>Close the connection to Redis, memcached, or any other cache system;</p></li><li><p>Release file locks.</p></li></ol><p>This order matters because if you close the database connection while workers are still saving invoices, you will see a connection closed error out of nowhere. 
If you close the Kafka connection before committing offsets, messages will have to be reprocessed.</p><p>Example:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!SMeF!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!SMeF!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 424w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 848w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 1272w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!SMeF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png" width="1240" height="790" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1240,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:159950,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!SMeF!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 424w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 848w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 1272w, https://substackcdn.com/image/fetch/$s_!SMeF!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F389b5c11-4215-4975-8fea-81de0dbb208b_1240x790.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h2>Goroutines also need to know when to die</h2><p>Another frequently overlooked point: goroutines do not die on their own.</p><p>When the process receives a <code>SIGTERM</code> and your code handles it:</p><ul><li><p>The runtime does not cancel goroutines automatically;</p></li><li><p>Infinite loops keep running;</p></li><li><p>Blocked workers stay blocked.</p></li></ul><p>If you don't have a clear cancellation mechanism (usually via <code>context.Context</code>), you are betting that either the process will die fast enough or the orchestrator will send <code>SIGKILL</code>.</p><p>Neither is a good strategy.</p><p>A good approach here is to use context-aware goroutines, <a href="https://github.com/brunopadz/signals-go/blob/main/goroutines_1.go">example</a>. 
If you have several goroutines, use <code>sync.WaitGroup</code> to wait for all of them to finish, <a href="https://github.com/brunopadz/signals-go/blob/main/goroutines_2.go">example</a>.</p><p>Every long-running goroutine should respond to cancellation. If it doesn't know when to stop, it is probably a leak.</p><h2>Shutdown needs to be predictable, not best effort</h2><p>Many people implement shutdown as &#8220;best effort&#8221;: try to close everything and hope there is enough time. That is fragile.</p><p>Shutdown needs:</p><ul><li><p>A clear order: what shuts down before what;</p></li><li><p>Explicit time limits;</p><ul><li><p>Remember the phases I described above?</p></li></ul></li><li><p>A well-defined fallback. What happens if something does not shut down as expected?</p></li></ul><p>These times need to be measured and tested. One tip is to add some slack. Going back to our scenario:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!o48I!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!o48I!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 424w, https://substackcdn.com/image/fetch/$s_!o48I!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 848w, 
https://substackcdn.com/image/fetch/$s_!o48I!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 1272w, https://substackcdn.com/image/fetch/$s_!o48I!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!o48I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png" width="673" height="498.6513859275053" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e0ea64a3-6060-4851-b254-d8879b79f796_938x695.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:695,&quot;width&quot;:938,&quot;resizeWidth&quot;:673,&quot;bytes&quot;:158978,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!o48I!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 424w, https://substackcdn.com/image/fetch/$s_!o48I!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 
848w, https://substackcdn.com/image/fetch/$s_!o48I!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 1272w, https://substackcdn.com/image/fetch/$s_!o48I!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe0ea64a3-6060-4851-b254-d8879b79f796_938x695.png 1456w" sizes="100vw" loading="lazy"></picture></div></a></figure></div><p>Remember that <code>terminationGracePeriodSeconds</code> defaults to 30 seconds on Kubernetes, while our phases add up to ~35s. That is intentional: you want every phase to finish before <code>SIGKILL</code>, so the grace period has to be at least as long as that total.</p><p>You can configure this in the lifecycle of your Deployment/Pod through the <code>terminationGracePeriodSeconds</code> parameter, together with the <code>preStop</code> hook.</p><h2>Treat shutdown as a critical event to observe</h2><p>Another recurring mistake: shutdown is not observed.</p><p>In production, you should be able to answer:</p><ul><li><p>How long did the shutdown take?</p></li><li><p>What took the longest?</p></li><li><p>How many requests were aborted?</p></li><li><p>Was there a timeout?</p></li></ul><p>Without clear logs and metrics, shutdown becomes a black hole. When something goes wrong, nobody knows the real cause.</p><p>Perhaps this is the main takeaway of this post: Graceful Shutdown is an important operational event, not an implementation detail.</p><h3>Tips for measuring</h3><ul><li><p>Use metrics to measure the shutdown duration;</p></li><li><p>Use logs that include the duration, how many requests were active, and how many jobs were pending;</p></li><li><p>Build a dashboard to track these times: shutdown duration, how many shutdowns succeeded, duration per phase, and how many requests were aborted;</p></li><li><p>Alert only in critical cases. 
For example, when the shutdown duration gets close to 30s, or to whatever value you set for <code>terminationGracePeriodSeconds</code>.</p></li></ul><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!4Lda!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!4Lda!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 424w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 848w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 1272w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!4Lda!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png" width="630" height="401.3709677419355" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:790,&quot;width&quot;:1240,&quot;resizeWidth&quot;:630,&quot;bytes&quot;:125370,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.padz.dev/i/184042490?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!4Lda!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 424w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 848w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 1272w, https://substackcdn.com/image/fetch/$s_!4Lda!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F93f0fe0f-31e1-441d-ae5c-80a33dc2935f_1240x790.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" 
viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">Example metrics</figcaption></figure></div><h3>Don't test only when something breaks</h3><p>Shutdown needs to be tested:</p><ul><li><p>Locally during development;</p></li><li><p>In staging before going to production;</p></li><li><p>Under load, with real traffic;</p></li><li><p>With added latency to simulate slow dependencies;</p></li><li><p>With dependencies such as the database and Kafka failing.</p></li></ul><p>The earlier you fail, the cheaper it is to fix.</p><h1>Conclusion</h1><p>Graceful Shutdown is neither an optional feature nor an infrastructure detail.<br>It is part of the contract between the application and the environment it runs in.</p><p>If you ignore:</p><ul><li><p>Operating system signals;</p></li><li><p>Time limits imposed by the orchestrator;</p></li><li><p>In-flight requests;</p></li><li><p>Goroutines and workers;</p></li></ul><p>The system will keep working, until the day a deploy costs money, reputation, or both.</p><h2>Next steps</h2><p>If you don't have Graceful Shutdown yet:</p><ol><li><p>Start with the basics by intercepting <code>SIGTERM</code> and <code>SIGINT</code>;</p></li><li><p>Implement shutdown for the HTTP server;</p></li><li><p>Identify all concurrent components (Kafka, workers, jobs);</p></li><li><p>Implement a coordinated shutdown that respects dependencies;</p></li><li><p>Add logs and metrics;</p></li><li><p>Test locally with <code>Ctrl+C</code>;</p></li><li><p>Test in staging under load;</p></li><li><p>Roll out gradually in production;</p></li><li><p>Monitor and adjust timeouts based on real data.</p></li></ol><p>If you already have Graceful Shutdown:</p><ol><li><p>Review the order: does it respect dependencies?</p></li><li><p>Measure the timeouts;</p></li><li><p>Add observability;</p></li><li><p>Test under adverse conditions;</p></li><li><p>Configure alerts;</p></li><li><p>Document it.</p></li></ol><div><hr></div><p>Feel free to comment with your solutions, how you implement this, or any questions you have on the topic.</p><p>See you!</p>]]></content:encoded></item></channel></rss>