{"id":171102,"date":"2026-02-18T07:04:02","date_gmt":"2026-02-18T06:04:02","guid":{"rendered":"https:\/\/liora.io\/en\/?p=171102"},"modified":"2026-02-18T16:25:03","modified_gmt":"2026-02-18T15:25:03","slug":"confusion-matrix-what-is-it-and-how-do-i-use-it","status":"publish","type":"post","link":"https:\/\/liora.io\/en\/confusion-matrix-what-is-it-and-how-do-i-use-it","title":{"rendered":"Confusion Matrix: what is it and how do I use it?"},"content":{"rendered":"<p><strong>The performance of a Machine Learning algorithm is directly related to its ability to predict an outcome. When comparing the results of an algorithm to reality, a confusion matrix is used. In this article, you will learn how to read this matrix to interpret the results of a classification algorithm.<\/strong><\/p>\n<h2>What is a confusion matrix?<\/h2>\n<p><a href=\"https:\/\/liora.io\/en\/unlock-your-future-dive-into-machine-learning-engineer-training\">Machine Learning<\/a> involves feeding an algorithm with data so that it can learn to perform a specific task on its own. In classification problems, it predicts outcomes that need to be compared to the ground truth to measure its performance. The <strong>confusion matrix,<\/strong> also known as a contingency table, is commonly used for this purpose.<\/p>\n<p>It not only highlights correct and incorrect predictions but also provides insights into the types of errors made. To <strong>calculate a confusion matrix,<\/strong> you need a test dataset and a validation dataset containing the actual result values.<\/p>\n<p>Each column of the table represents a class predicted by the algorithm, and the rows represent the actual classes.<\/p>\n<p>Results are classified into four categories:<\/p>\n<ol>\n<li><strong>True Positive (TP): T<\/strong>he prediction and the actual value are both positive. Example: A sick person predicted as sick.<\/li>\n<li><strong>True Negative (TN):<\/strong> The prediction and the actual value are both negative. Example: A healthy person predicted as healthy.<\/li>\n<li><strong>False Positive (FP)<\/strong>: The prediction is positive, but the actual value is negative. Example: A healthy person predicted as sick.<\/li>\n<li><strong>False Negative (FN):<\/strong> The prediction is negative, but the actual value is positive. Example: A sick person predicted as healthy.<\/li>\n<\/ol>\n<figure><img decoding=\"async\" src=\"https:\/\/liora.io\/app\/uploads\/sites\/9\/2023\/09\/Sans-titre-3_Plan-de-travail-1-1024x594.png\" alt=\"\" width=\"800\" height=\"464\" \/><figcaption><\/figcaption><\/figure>\n<p>Of course, in more complex scenarios, you can add rows and columns to this matrix.<\/p>\n<p>Example: After applying a predictive model, we obtain the following results:<\/p>\n<p>In general, you always find the correct <strong>predictions on the diagonal. So, here we have:<\/strong><\/p>\n<p>&#8211; 600 individuals classified as belonging to class A out of a total of 2000 individuals, which is quite low.<br \/>\n&#8211; For individuals in class B, 1200 out of 2000 were correctly identified as belonging to this class.<br \/>\n&#8211; For individuals in class C, 1600 out of 2000 were correctly identified.<\/p>\n<p>The number of True Positives (TP) is therefore 3400.<\/p>\n<p>To calculate the<strong> number of False Positives (FP), True Negatives (TN), and False Negatives (FN),<\/strong> it&#8217;s not possible to calculate them directly from this table. You would need to break it down into three cases:<\/p>\n<p>1. 
<h2>Which metrics should we use to evaluate our predictions?</h2>
<p>Here, we chose to name the different classes based on the values of <code>y_true</code>. We can then observe various metrics that let us assess the quality of our predictions, primarily:</p>
<ul>
<li>Precision</li>
<li>Recall</li>
<li>F1-score</li>
</ul>
<p>Their formulas are, respectively:</p>
<ul>
<li>Precision = True Positives / (True Positives + False Positives)</li>
<li>Recall = True Positives / (True Positives + False Negatives)</li>
<li>F1-score = (2 × Precision × Recall) / (Precision + Recall)</li>
</ul>
<p>These metrics provide valuable insights into the <a href="https://liora.io/en/management-of-unbalanced-classification-problems-ii">performance of a classification model</a>. Precision measures the accuracy of positive predictions, recall measures the ability to correctly identify positive cases, and the F1-score combines the two into a single balanced value.</p>
<p>In our case, we observe a precision of 0 for class 1: not a single individual predicted as belonging to class 1 actually belongs to it. In contrast, 2 of the 3 individuals assigned to class 2 do belong to it, giving a precision of 0.67.</p>
<p>Precision measures how many of the <strong>predicted positive cases</strong> were correct, so when none of the positive predictions for a class are right, the precision for that class drops to 0. It is an important metric for understanding the model's ability to avoid false positives for a specific class.</p>
<h3>The differences between micro-averaging and macro-averaging</h3>
<p>The calculation and interpretation of these two averages differ slightly.</p>
<p>Whatever the metric, a <strong>macro-average</strong> first computes the metric independently for each class and then averages the results, so every class carries the same weight. A <strong>micro-average</strong>, in contrast, pools the contributions of all classes (their true positives, false positives, and false negatives) before computing the metric.</p>
<p>In multi-class classification, the micro-average is often favored when an imbalance between the classes is suspected (in terms of sample size or importance). For example:</p>
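<p>Here is a sketch under the same assumptions, reusing the hypothetical <code>y_true</code> and <code>y_pred</code> from the report above with scikit-learn's <code>precision_score</code>:</p>
<pre><code># Compare macro- and micro-averaged precision on the same labels.
from sklearn.metrics import precision_score

y_true = [0, 1, 2, 0, 1, 2, 2]
y_pred = [0, 2, 2, 0, 0, 2, 1]

# Macro: average the per-class precisions (0.67, 0.00, 0.67),
# giving every class the same weight regardless of its size.
print(precision_score(y_true, y_pred, average="macro"))  # ~0.44

# Micro: pool the true and false positives of all classes before
# dividing; in single-label classification this equals accuracy.
print(precision_score(y_true, y_pred, average="micro"))  # ~0.57
</code></pre>
<p>The macro-averaged precision (about 0.44) is dragged down by class 1, on which the model fails completely, while the micro-average (about 0.57) is not: this is exactly why the choice of average matters when classes are imbalanced.</p>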
<p>You now know how to read and interpret the results of a <a href="https://liora.io/en/management-of-unbalanced-classification-problems-i">classification algorithm</a>. If you want to learn more about this topic, check out our Data Analyst, Data Scientist, and <a href="/en/courses/data-ai/data-management">Data Manager training programs</a>, which cover these concepts in more practical cases.</p>
<p><a href="/en/courses/data-ai/">Discover our courses</a></p>