Decoded©

  • NoSQL: All about non-relational databases

    NoSQL refers to a family of databases defined by being non-relational. These systems allow the storage and analysis of Big Data. Find out everything you need to know: definition, history, how it works, use cases, advantages, training… In the age of Big Data, relational databases are not always adequate. To handle the immense volumes of data, […]

    Read more
  • Demystifying SQL Index: Understanding its Purpose and Functionality

    An SQL index enables you to quickly locate the data you’re looking for in a relational database. Find out all you need to know about this valuable tool, and why it’s so useful in Data Science! Efficient access to information is a priority in Data Science. That’s why professionals use databases to manage, store and […]

    Read more
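As a minimal sketch of the idea in the SQL index teaser above, the following uses Python's built-in `sqlite3` module to show how the query plan can change once an index exists. The `customers` table, its columns, and the index name `idx_customers_email` are hypothetical and chosen only for illustration; other database engines report their plans differently.

```python
import sqlite3

# In-memory database with a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, email TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers (email, city) VALUES (?, ?)",
    [(f"user{i}@example.com", f"city{i % 50}") for i in range(1000)],
)

query = "SELECT id FROM customers WHERE email = ?"
params = ("user500@example.com",)


def plan(sql, args):
    """Return SQLite's textual query plan (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, args).fetchall()
    return " ".join(row[-1] for row in rows)


# Without an index, SQLite has to scan the whole table.
plan_before = plan(query, params)

# With an index on email, SQLite can search the index instead.
conn.execute("CREATE INDEX idx_customers_email ON customers (email)")
plan_after = plan(query, params)

print("before:", plan_before)
print("after: ", plan_after)
```

Comparing the two printed plans shows the lookup moving from a full table scan to a search that uses the index.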
  • Data Engineer: All about the job, required skills, and salary

    The Data Engineer’s role is to prepare the data for the Data Scientist to analyze. Big Data and Data Science are growing, and more and more jobs are emerging in this field. Today, we’re going to take a closer look at one of the three main data science jobs, alongside the roles of Data Scientist and […]

    Read more
  • KNN: What is the KNN Algorithm?

    The K-Nearest Neighbors (KNN) algorithm is a simple, easy-to-implement supervised machine learning algorithm. It can be used to solve classification and regression problems. In this article, we will delve into the definition of this algorithm, how it works, and provide a practical programming application. KNN: Definition […]

    Read more
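To make the KNN teaser above concrete, here is a minimal classification sketch in plain Python. It is not the linked article's implementation: the function name `knn_predict`, the toy points, and the labels are all hypothetical, and it assumes Euclidean distance with a simple majority vote.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    )
    # Keep the labels of the k closest points and return the most common one.
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Toy data: two well-separated groups in 2D.
points = [(1.0, 1.0), (1.5, 2.0), (2.0, 1.0), (6.0, 6.0), (7.0, 5.5), (6.5, 7.0)]
labels = ["a", "a", "a", "b", "b", "b"]

prediction_low = knn_predict(points, labels, (1.2, 1.4))
prediction_high = knn_predict(points, labels, (6.4, 6.1))
print(prediction_low, prediction_high)
```

For regression, the same neighbour search could be reused with an average of the neighbours' values in place of the vote.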
  • Alteryx: What is it? How does it work?

    In the digital age, businesses collect thousands of pieces of data every day, which are essential to their development. As technology develops, more and more data is created, to the point where analysing it becomes laborious. To make this task easier, many tools, such as Alteryx, offer to centralise and analyse the newly created data. […]

    Read more
  • train_test_split: Tutorial on how to use this function

    A Machine Learning model is capable of learning autonomously from one dataset, with the aim of predicting behavior on another dataset. To do this, it finds underlying relationships between independent explanatory variables and a target variable in the initial dataset. It then uses these patterns to predict or classify new data. How do I define […]

    Read more
  • Recommendation algorithm: What is it? How does it work?

    When YouTube recommends videos that match our current interests, or when Amazon suggests products we might find intriguing, what mechanisms are at play? Recommendation algorithms. These are highly intricate systems developed to further personalize the user experience, though they risk producing undesirable polarization effects. They also spark debates regarding the […]

    Read more
  • Open SQL file: Complete tutorial

    SQL (Structured Query Language) files hold code that defines the structure and contents of a database. Opening a SQL file lets you run that code to modify the database’s contents. The SQL file also includes commands for creating content, inserting, deleting, splitting, or updating data. Accessing a SQL file is swift when using MySQL or […]

    Read more
  • How do I merge tables in Power Query?

    Confronted with numerous sources of data, data experts need to identify the relationships between different pieces of information. To do this, they can merge two tables with Power Query. This query editor simplifies the task of data modeling, for more efficient and reliable analysis. What is the merge function in Power Query? Merging tables in […]

    Read more
  • Standard deviation in Excel: What is it for? How do I calculate it?

    To analyze a set of numerical data, the mean is often used. However, the mean has shortcomings and does not always reflect the reality of the data. Fortunately, other statistical tools allow for a deeper analysis. This is notably the case with the standard deviation. So, what is it? What is its purpose? And […]

    Read more
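As a small worked example of the standard deviation teaser above, the sketch below computes the mean and both forms of the standard deviation with Python's `statistics` module. The sample values are made up for illustration. In Excel, the population form corresponds to `STDEV.P` and the sample form to `STDEV.S`.

```python
import statistics

# Hypothetical data set, chosen so the arithmetic is easy to follow.
values = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(values)

# Population standard deviation: divides the squared deviations by n.
population_sd = statistics.pstdev(values)

# Sample standard deviation: divides the squared deviations by n - 1.
sample_sd = statistics.stdev(values)

print(mean, population_sd, sample_sd)
```

Here the squared deviations from the mean of 5 sum to 32, so the population variance is 32 / 8 = 4 and the population standard deviation is 2, while the sample form divides by 7 and is therefore slightly larger.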
  • K-Means Clustering in Machine Learning: A Deep Dive

    Clustering is a specialized discipline within Machine Learning aimed at separating your data into homogeneous groups with common characteristics. It’s a highly valued field, especially in marketing, where there is often a need to segment customer databases to identify specific behaviors. The K-means algorithm is a well-known unsupervised algorithm in the realm of Clustering. In […]

    Read more
  • Dust: What is it? How can it be used for prompt engineering?

    With the proliferation of generative AI tools, mastering the art of writing prompts is proving essential. But at the speed at which artificial intelligence solutions are evolving, many users are feeling lost. That is where Dust comes in. This prompt engineering tool helps users not only to write relevant guided messages, but also to […]

    Read more