History
The perceptron classifier is viewed as the foundation and building block of artificial neural networks. For a historical introduction, see the figures below.
Suppose we are solving a binary classification problem with \(p\) feature variables. As discussed before, it can be represented as
\[ \hat{y} = \text{Sign}( \textbf{w}^{\top} \textbf{x} + w_0 ) \]
where \(\textbf{w} = (w_1, w_2, \dots, w_p)^{\top}\) and \(\textbf{x} = (x_1, x_2, \dots, x_p)^{\top}\). Unlike before, here we write the weighted sum of the \(p\) feature variables, \(w_1x_1+ \dots + w_px_p\), as the inner (dot) product of two vectors, i.e., \(\textbf{w} \cdot \textbf{x} = \textbf{w}^{\top} \textbf{x}\).
- Note: To understand the perceptron algorithm, we need some basic knowledge of vectors and their operations. If you are not familiar with them or need a refresher, read about vectors, vector operations, and the inner product before continuing.
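To make the prediction rule above concrete, here is a minimal sketch in Python; the function name, the array names, and the example values are invented for illustration, and the convention of mapping a score of exactly zero to \(+1\) is just one common choice.

```python
import numpy as np

def predict(w, w0, x):
    """Perceptron prediction: the sign of w^T x + w_0."""
    score = np.dot(w, x) + w0        # inner product plus the intercept
    return 1 if score >= 0 else -1   # Sign(.), mapping to the labels +1 / -1

# Hypothetical example with p = 2 feature variables
w = np.array([0.5, -1.0])   # weights w_1, w_2
w0 = 0.2                    # intercept w_0
x = np.array([1.0, 0.3])
print(predict(w, w0, x))    # -> 1, since 0.5*1.0 - 1.0*0.3 + 0.2 = 0.4 >= 0
```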
The perceptron algorithm is about finding a set of reasonable weights. The key term here, “reasonable,” is easy to understand—it refers to a set of weights that can deliver good predictive performance. The core issue is how to find them.
- Brute-force idea: try all possible weight values and record the corresponding model performance, such as accuracy, and then choose the weights that yield the highest accuracy as the final model parameters.
However, this idea is clearly impractical: the weights are continuous, so even with only two feature variables there are infinitely many combinations to try. A smarter approach is to start with an initial guess for the weight values and then gradually approach the most reasonable weights through some iterative mechanism. This mechanism is called the perceptron algorithm. Next, let's dive into learning this magical mechanism, the perceptron algorithm.
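As a preview of that mechanism, the sketch below implements the classic perceptron update rule (start from an initial guess, then nudge the weights whenever an example is misclassified). The function name, the learning-rate value, and the tiny dataset are assumptions made for illustration; the geometric reasoning behind the rule is developed in the following sections.

```python
import numpy as np

def perceptron_train(X, y, n_epochs=10, lr=1.0):
    """Classic perceptron updates: on a mistake, w <- w + lr*y*x and w0 <- w0 + lr*y."""
    n, p = X.shape
    w = np.zeros(p)      # initial guess for the weights
    w0 = 0.0             # initial guess for the intercept
    for _ in range(n_epochs):
        for xi, yi in zip(X, y):                 # loop over training examples
            if yi * (np.dot(w, xi) + w0) <= 0:   # misclassified (or on the boundary)
                w += lr * yi * xi                # move the weights toward this example
                w0 += lr * yi
    return w, w0

# Tiny invented dataset with labels in {-1, +1}
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, -1.0]])
y = np.array([1, 1, -1, -1])
w, w0 = perceptron_train(X, y)
```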
Next, the logic goes like this:
- Further explore the geometric properties of linear classifiers.
- Use geometry to understand how this algorithm works.