As we mentioned in the first part of the course, a machine learning model—also known as a “machine”—is essentially a black box.
This machine works in a simple way: it takes the input feature variables and outputs a prediction of the target variable.
Typically, this black box is complex and difficult to interpret, but let's assume it is a simple linear model, like the ones we introduced in the first part, and open it up.
All the linear models we have studied share a common characteristic: they first compute a weighted sum of the feature variables and then add a constant term.
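In symbols, if we write $x_1, \dots, x_n$ for the feature variables, $w_1, \dots, w_n$ for the weights, and $b$ for the constant term (the letters themselves are just a convention), the machine first computes

$$
z = w_1 x_1 + w_2 x_2 + \cdots + w_n x_n + b = \sum_{j=1}^{n} w_j x_j + b.
$$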
Then, depending on the nature of the problem and the model being used, we apply an appropriate function, called the activation function, to this weighted sum to obtain the final prediction.
For example, in a regression problem we can choose the identity function. If our machine is a logistic regression model, the activation function is the logistic (sigmoid) function.
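As a minimal sketch of this recipe in code (the feature values, weights, and function names below are made up for illustration, not taken from the course):

```python
import numpy as np

def linear_model(x, w, b, activation):
    """Weighted sum of the features plus a constant, passed through an activation."""
    z = np.dot(w, x) + b      # weighted sum of the features plus the constant term b
    return activation(z)      # the activation function yields the final prediction

def identity(z):
    # Identity activation: a natural choice for a plain regression problem.
    return z

def sigmoid(z):
    # Logistic (sigmoid) activation: squashes z into (0, 1), as in logistic regression.
    return 1.0 / (1.0 + np.exp(-z))

# Made-up feature values and parameters, purely for illustration.
x = np.array([1.5, -2.0, 0.5])
w = np.array([0.4, 0.1, -0.3])
b = 0.2

print(linear_model(x, w, b, identity))  # regression-style prediction
print(linear_model(x, w, b, sigmoid))   # logistic-regression-style probability
```

Swapping the `activation` argument is all it takes to move between the two models; the weighted-sum step stays exactly the same.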
By the way, do you remember any other activation functions?
Mathematicians might find such diagrams too cumbersome, so they strip away all unnecessary details and abstract only the most essential parts. As a result, they express a machine learning model like this:
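$$
\hat{y} = f\!\left(\sum_{j=1}^{n} w_j x_j + b\right)
$$

Here $\hat{y}$ is the predicted target and $f$ is the activation function; the expression is exactly the recipe above, a weighted sum of the features plus a constant term, passed through $f$.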