{"id":183912,"date":"2024-04-01T04:40:00","date_gmt":"2024-04-01T03:40:00","guid":{"rendered":"https:\/\/liora.io\/en\/?p=183912"},"modified":"2026-02-06T08:13:21","modified_gmt":"2026-02-06T07:13:21","slug":"chaos-engineering-what-is-it","status":"publish","type":"post","link":"https:\/\/liora.io\/en\/chaos-engineering-what-is-it","title":{"rendered":"Chaos Engineering: What is it?"},"content":{"rendered":"<style>\n.elementor-heading-title{padding:0;margin:0;line-height:1}.elementor-widget-heading .elementor-heading-title[class*=elementor-size-]>a{color:inherit;font-size:inherit;line-height:inherit}.elementor-widget-heading .elementor-heading-title.elementor-size-small{font-size:15px}.elementor-widget-heading .elementor-heading-title.elementor-size-medium{font-size:19px}.elementor-widget-heading .elementor-heading-title.elementor-size-large{font-size:29px}.elementor-widget-heading .elementor-heading-title.elementor-size-xl{font-size:39px}.elementor-widget-heading .elementor-heading-title.elementor-size-xxl{font-size:59px}<\/style><p><strong>Chaos Engineering is an innovative discipline in the world of software engineering, which focuses on improving the resilience and reliability of computer systems. This approach, often considered counter-intuitive, involves the deliberate introduction of disturbances or errors into a computer system in order to test its ability to cope with them.<\/strong><\/p>\t\t\n\t\t<p>This principle emerged in a context where <a href=\"https:\/\/liora.io\/en\/data-architecture-definition-and-importance-in-data-science\">IT system architectures<\/a> were becoming increasingly complex and distributed. 
Leading companies such as Netflix, a pioneer in this field, recognised that traditional testing and quality management methods were insufficient to guarantee the reliability of large-scale systems.<\/p>\t\t\n\t\t\t<h3>Principles of Chaos Engineering<\/h3>\t\t\n\t\t<p>This innovative approach is based on a number of key principles that govern its implementation and effectiveness.<\/p>\t\t\n\t\t\t<style type=\"text\/css\">\n.tg  {border-collapse:collapse;border-spacing:0;}\n.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;\n  overflow:hidden;padding:10px 5px;word-break:normal;}\n.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;\n  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}\n.tg .tg-64tu{background-color:#efefef;font-family:Arial, Helvetica, sans-serif !important;font-size:15px;text-align:left;\n  vertical-align:middle}\n.tg .tg-nkdd{background-color:#c0c0c0;text-align:center;vertical-align:middle}\n.tg .tg-bwzf{background-color:#c0c0c0;font-family:Arial, Helvetica, sans-serif !important;font-size:15px;font-weight:bold;\n  text-align:left;vertical-align:middle}\n<\/style>\n<table style=\"undefined;table-layout: fixed; width: 800px\">\n<colgroup>\n<col style=\"width: 100px\">\n<col style=\"width: 300px\">\n<col style=\"width: 400px\">\n<\/colgroup>\n<thead>\n  <tr>\n    <th><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image13.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/th>\n    <th>Development of a Stable Resilience Hypothesis<\/th>\n    <th>It starts with formulating hypotheses about the resilience of the system. These hypotheses are based on understanding how the system should theoretically behave in the presence of various disruptions. 
<\/th>\n  <\/tr>\n<\/thead>\n<tbody>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image6-3.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Production of Controlled Disturbances<\/td>\n    <td>Chaos Engineering involves the deliberate and controlled introduction of disturbances into the production environment. These disturbances, known as &#8220;attacks&#8221;, can include unexpected server shutdowns, simulated network outages, or system resource overload. <\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image8-2.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Observation and Measurement<\/td>\n    <td>Once the disturbances are introduced, observing and measuring the system&#8217;s responses is crucial. This involves monitoring metrics and performance indicators to evaluate the impact of the disturbances. <\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image12-1.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Improvement<\/td>\n    <td>Learning from these experiments to continuously improve the system&#8217;s resilience is paramount. After each test, teams analyse the results, identify resilience gaps, and implement improvements.<\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image7-3.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Automation and Continuous Integration<\/td>\n    <td>To maximise its effectiveness, Chaos Engineering needs to be integrated into the development lifecycle. 
This means automating chaos tests as far as possible and integrating them into continuous deployment pipelines.<\/td>\n  <\/tr>\n<\/tbody>\n<\/table>\t\t\n\t\t\t\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex is-content-justification-center\"><div class=\"wp-block-button \"><a class=\"wp-block-button__link wp-element-button \" href=\"https:\/\/liora.io\/formation-administrateur-systeme-reseaux-et-cloud\">Master the principles of Chaos Engineering<\/a><\/div><\/div>\n\n\t\t\t<h3>Implementing Chaos Engineering<\/h3>\t\t\n\t\t<p>Implementing Chaos Engineering is a structured process that requires careful planning, appropriate tools and a clear understanding of the objectives pursued.<\/p>\t\t\n\t\t\t<style type=\"text\/css\">\n.tg  {border-collapse:collapse;border-spacing:0;}\n.tg td{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;\n  overflow:hidden;padding:10px 5px;word-break:normal;}\n.tg th{border-color:black;border-style:solid;border-width:1px;font-family:Arial, sans-serif;font-size:14px;\n  font-weight:normal;overflow:hidden;padding:10px 5px;word-break:normal;}\n.tg .tg-dhwc{background-color:#c0c0c0;font-family:Arial, Helvetica, sans-serif !important;font-size:15px;text-align:center;\n  vertical-align:middle}\n.tg .tg-fn4z{background-color:#D9D9D9;font-family:Arial, Helvetica, sans-serif !important;font-size:15px;font-weight:bold;\n  text-align:left;vertical-align:middle}\n.tg .tg-kxla{background-color:#EFEFEF;font-family:Arial, Helvetica, sans-serif !important;font-size:15px;text-align:left;\n  vertical-align:middle}\n<\/style>\n<table style=\"undefined;table-layout: fixed; width: 800px\">\n<colgroup>\n<col style=\"width: 100px\">\n<col style=\"width: 300px\">\n<col style=\"width: 400px\">\n<\/colgroup>\n<thead>\n  <tr>\n    <th><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image15.png\" alt=\"Image\" width=\"100\" 
height=\"100\"><\/th>\n    <th>Preparation and Planning<\/th>\n    <th>This involves clearly defining objectives, selecting relevant metrics to monitor, and establishing effective communication protocols for the team. <\/th>\n  <\/tr>\n<\/thead>\n<tbody>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image14.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Selection of Appropriate Tools and Technologies<\/td>\n    <td>There are a variety of tools and platforms dedicated to Chaos Engineering, for example:<ul><li><strong>Chaos Monkey<\/strong><\/li><li><strong>Gremlin<\/strong><\/li><li><strong>Chaos Toolkit<\/strong><\/li><\/ul><\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image5-4.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Design of Chaos Experiments<\/td>\n    <td>This involves creating specific scenarios in which disturbances will be introduced into the system. These experiments should be designed to test the hypotheses established during the preparation phase.<\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image2-3.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Execution in a Controlled Environment<\/td>\n    <td>Tests should be executed in a controlled environment to minimise risks. This often means starting in a testing environment before moving to production. <\/td>\n    <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image3-2.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Analysis of Results<\/td>\n    <td>After each experiment, it is essential to analyse the results and draw lessons. 
On the basis of these findings, corrective measures must be taken to strengthen the resilience of the system.<\/td>\n  <\/tr>\n  <tr>\n    <td><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image16.png\" alt=\"Image\" width=\"100\" height=\"100\"><\/td>\n    <td>Integration into the company culture<\/td>\n    <td>Experiments must be repeated regularly and the lessons learned integrated into the team&#8217;s day-to-day practices. For Chaos Engineering to be truly effective, it must become an integral part of the company&#8217;s culture.<\/td>\n  <\/tr>\n<\/tbody>\n<\/table>\t\t\n\t\t\t\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex is-content-justification-center\"><div class=\"wp-block-button \"><a class=\"wp-block-button__link wp-element-button \" href=\"\/en\/courses\/data-ai\/\">Learn how to implement Chaos Engineering<\/a><\/div><\/div>\n\n\t\t\t<h3>Case studies and real-life examples<\/h3>\t\t\n\t\t\t<style>\n.elementor-widget-image{text-align:center}.elementor-widget-image a{display:inline-block}.elementor-widget-image a img[src$=\".svg\"]{width:48px}.elementor-widget-image img{vertical-align:middle;display:inline-block}<\/style>\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image11-1.png\" title=\"\" alt=\"\" loading=\"lazy\">\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t<h4>Netflix with Chaos Monkey<\/h4>\t\t\n\t\t<p>Netflix is one of the pioneers of <strong>Chaos Engineering.<\/strong> They have developed a tool called Chaos Monkey, designed to test the resilience of their cloud infrastructure. <strong>Chaos Monkey<\/strong> works by randomly disabling servers in Netflix&#8217;s production environment. 
This bold approach has enabled Netflix to ensure that their streaming service remains reliable even in the event of an unexpected <a href=\"https:\/\/liora.io\/en\/monitoring-definition-principles-importance\">server failure<\/a>.<\/p>\t\t\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image9-1.png\" title=\"\" alt=\"\" loading=\"lazy\">\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t<h4>Amazon with large-scale resilience testing<\/h4>\t\t\n\t\t<p>Amazon has regularly run <strong>chaos tests to assess the robustness of its vast<\/strong> <a href=\"https:\/\/liora.io\/amazon-web-services-tout-savoir\">AWS<\/a> infrastructure. By simulating network failures and service interruptions in specific regions, Amazon has been able to identify and fix vulnerabilities, ensuring high availability of its cloud services.<\/p>\t\t\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image4-3.png\" title=\"\" alt=\"\" loading=\"lazy\">\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t<h4>LinkedIn with peak traffic management<\/h4>\t\t\n\t\t<p>LinkedIn used Chaos Engineering to better manage traffic peaks on its platform. By introducing controlled disruptions that simulated sudden increases in load, LinkedIn was able to assess the elasticity of its infrastructure and optimise its automatic scaling capabilities.<\/p>\t\t\n\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t<img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/2024\/01\/image1-3.png\" title=\"\" alt=\"\" loading=\"lazy\">\t\t\t\t\t\t\t\t\t\t\t\t\t\t\t\n\t\t\t<h4>NASA and the safety of space missions<\/h4>\t\t\n\t\t<p>Even organisations like NASA have applied Chaos Engineering principles to ensure the safety and success of their space missions. 
By testing their systems against extreme and unforeseen scenarios, NASA has been able to strengthen the resilience of its critical missions, where failure can have monumental consequences.<\/p>\t\t\n\t\t\t<h3>In conclusion<\/h3>\t\t\n\t\t<p><strong>Chaos Engineering<\/strong> represents a significant advance in the field of software engineering, offering a proactive and innovative way to improve the resilience and <strong>reliability of systems.<\/strong><\/p><p>As these <strong><em>IT systems<\/em><\/strong> become increasingly complex and integrated into all aspects of daily life, the importance of such a methodology can only increase.<\/p>\t\t\n\t\t\t\n<div class=\"wp-block-buttons is-layout-flex wp-block-buttons-is-layout-flex is-content-justification-center\"><div class=\"wp-block-button \"><a class=\"wp-block-button__link wp-element-button \" href=\"\/en\/courses\/data-ai\/\">Training in Chaos Engineering<\/a><\/div><\/div>\n","protected":false},"excerpt":{"rendered":"<p>Chaos Engineering is an innovative discipline in the world of software engineering, which focuses on improving the resilience and reliability of computer systems. This approach, often considered counter-intuitive, involves the deliberate introduction of disturbances or errors into a computer system in order to test its ability to cope with them. 
This principle emerged in a [&hellip;]<\/p>\n","protected":false},"author":76,"featured_media":183913,"comment_status":"open","ping_status":"open","sticky":false,"template":"elementor_theme","format":"standard","meta":{"_acf_changed":false,"editor_notices":[],"footnotes":""},"categories":[2434],"class_list":["post-183912","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-cloud-dev"],"acf":[],"_links":{"self":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/183912","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/users\/76"}],"replies":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/comments?post=183912"}],"version-history":[{"count":1,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/183912\/revisions"}],"predecessor-version":[{"id":205868,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/posts\/183912\/revisions\/205868"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media\/183913"}],"wp:attachment":[{"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/media?parent=183912"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/liora.io\/en\/wp-json\/wp\/v2\/categories?post=183912"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}