Machine learning is fundamentally about reducing entropy: finding the compact set of parameters that describes a set of observations, so we can either predict new observations or generate similar ones.

Reducing entropy can be seen as building a model that explains the data: by capturing the underlying patterns and structure, we lower our uncertainty about the observed phenomena and can better explain them.
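
To make the entropy framing concrete, here is a minimal sketch (the three-symbol toy distribution is my own assumption, not taken from any particular dataset): a model that matches the true frequencies of the data needs fewer bits per observation than a "know-nothing" uniform model, and that gap is exactly the entropy it has removed.

```python
import numpy as np

# Toy empirical distribution over three symbols (assumed for illustration).
counts = np.array([70.0, 20.0, 10.0])
p = counts / counts.sum()

# Shannon entropy in bits: the cost per symbol for a model that has
# learned the true frequencies (the best any model can do).
entropy = -np.sum(p * np.log2(p))

# A uniform model that has learned nothing pays log2(3) bits per symbol.
uniform_cost = np.log2(len(p))

print(f"with model: {entropy:.2f} bits/symbol, without: {uniform_cost:.2f} bits/symbol")
```

The gap of about 0.43 bits per symbol is the structure the model has captured.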

In machine learning, models are constructed to represent the relationships and dependencies present in the data. The goal is to find the model parameters that minimize the discrepancy between the model's predictions and the actual observations, a discrepancy typically measured by a loss function such as mean squared error or cross-entropy.
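
As a sketch of what "minimizing the discrepancy" looks like in practice (a synthetic linear dataset and NumPy's least-squares solver, both my choices for illustration), the snippet below finds the two parameters of a line that best explain 200 noisy observations:

```python
import numpy as np

# Synthetic observations from a noisy linear process (assumed for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=200)
y = 2.5 * x + 0.7 + rng.normal(scale=0.3, size=200)

# Model: y_hat = w * x + b. Least squares finds the (w, b) that
# minimize the squared discrepancy between predictions and data.
A = np.stack([x, np.ones_like(x)], axis=1)
(w, b), *_ = np.linalg.lstsq(A, y, rcond=None)

residuals = y - (w * x + b)
print(f"w={w:.2f}, b={b:.2f}")
print(f"spread of raw y: {y.std():.2f}, spread of residuals: {residuals.std():.2f}")
```

The residual spread falls from roughly 1.5 (the raw standard deviation of y) to about 0.3, the level of the injected noise; since the entropy of a Gaussian grows with its standard deviation, fitting those two parameters has quite literally squeezed entropy out of the observations.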

In this way, the model acts as a distilled representation of the data, capturing the essential features and relationships (the signal) while discarding what is irrelevant (the noise).

These models can take various forms, such as mathematical equations, decision trees, neural networks, or probabilistic models.

By constructing a model that effectively explains the data, we can make predictions, generate new samples, or gain insight into the underlying processes. But it is always, fundamentally, about reducing entropy.
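
And to make the "generate new samples" part concrete (assuming, purely for illustration, data well described by a single Gaussian): once we have reduced a thousand observations to two parameters, those two numbers are all we need to produce new, plausible observations.

```python
import numpy as np

rng = np.random.default_rng(1)

# Pretend these are real-world measurements (toy data for illustration).
observations = rng.normal(loc=10.0, scale=2.0, size=1000)

# "Reduce" 1000 numbers to the two parameters that explain them.
mu, sigma = observations.mean(), observations.std()

# Those two parameters are enough to generate new, similar observations.
new_samples = rng.normal(loc=mu, scale=sigma, size=5)
print(f"mu={mu:.2f}, sigma={sigma:.2f}")
print("new samples:", np.round(new_samples, 2))
```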

And the interesting thing is that this works only in rather specific and rare situations in real life, because the universe is mostly computationally irreducible: there is no shortcut to predicting its behavior that is substantially simpler than running the process itself, so it can only be "explained" by itself. Not good news for those who believe we live in a simulation!