History

The Perceptron classifier is viewed as the foundation and a basic building block of artificial neural networks. For a historical introduction, see the figures below.

The Perceptron classifier was developed by Frank Rosenblatt in 1957 (LHS). Rosenblatt’s goal was to create a machine that could classify visual patterns, such as distinguishing between different shapes. At the time, he envisioned using large computers to simulate neural networks, inspired by how the brain processes information. His early experiments involved a large machine called the Mark I Perceptron (RHS), which attempted to recognize different shapes by adjusting weights based on input data. This work laid the foundation for modern neural networks and machine learning, despite the perceptron’s initial limitations in handling complex, non-linear problems.

Suppose we are solving a binary classification problem with \(p\) feature variables. As discussed before, the classifier can be represented as

\[ \hat{y} = \text{Sign}( \textbf{w}^{\top} \textbf{x} + w_0 ) \]

where \(\textbf{w} = (w_1, w_2, \dots, w_p)^{\top}\) and \(\textbf{x} = (x_1, x_2, \dots, x_p)^{\top}\). Unlike before, here we write the weighted sum of the \(p\) feature variables, \(w_1x_1+ \dots + w_px_p\), as the inner (dot) product of the two vectors, i.e. \(\textbf{w} \cdot \textbf{x} = \textbf{w}^{\top} \textbf{x}\).
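As a quick illustration, here is a minimal sketch of this prediction rule in Python with NumPy. The function name `predict` and the convention of mapping a score of exactly zero to \(+1\) are our own choices for the sketch, not part of the notation above.

```python
import numpy as np

def predict(w, w0, x):
    """Sign classifier: y_hat = Sign(w'x + w0).

    w, x : length-p NumPy arrays (weights and features); w0 : scalar intercept.
    Returns +1 or -1; a score of exactly zero is mapped to +1 by convention.
    """
    score = np.dot(w, x) + w0   # inner product w'x plus the intercept
    return 1 if score >= 0 else -1

# Example with p = 2 features
w = np.array([0.5, -1.0])
print(predict(w, w0=0.2, x=np.array([2.0, 1.0])))   # 0.5*2 - 1*1 + 0.2 = 0.2, so +1
```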

The perceptron algorithm is about finding a set of reasonable weights. The key term here, “reasonable,” simply means a set of weights that delivers good predictive performance. The core issue is how to find them.

One naive idea is to try out many candidate weight combinations and keep the one that performs best. However, this idea is clearly not practical: even with only two feature variables, exhaustively searching the weight space would be no simple task. A smarter approach is to start with an initial guess for the weight values and then gradually approach the most reasonable weights through some iterative mechanism. This mechanism is called the perceptron algorithm. Next, let’s dive into learning this magical mechanism, the perceptron algorithm.

Next, the logic goes like this: start from an initial guess of the weights and scan through the training examples. Whenever an example \((\textbf{x}_i, y_i)\), with label \(y_i \in \{-1, +1\}\), is misclassified, nudge the weights toward it,

\[ \textbf{w} \leftarrow \textbf{w} + \eta\, y_i \textbf{x}_i, \qquad w_0 \leftarrow w_0 + \eta\, y_i, \]

where \(\eta > 0\) is a learning rate, and repeat until no training example is misclassified or a maximum number of passes over the data is reached.
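To make this iterative mechanism concrete, below is a minimal Python/NumPy sketch of the update loop just described. The function name `perceptron_fit`, the zero initialization, and the defaults `eta=1.0` and `max_epochs=100` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def perceptron_fit(X, y, eta=1.0, max_epochs=100):
    """Sketch of the perceptron algorithm described above.

    X : (n, p) array of feature vectors; y : (n,) array of labels in {-1, +1}.
    eta (learning rate) and max_epochs are illustrative defaults.
    Returns the learned weights w and intercept w0.
    """
    n, p = X.shape
    w = np.zeros(p)   # initial guess: all weights zero
    w0 = 0.0          # initial guess for the intercept
    for _ in range(max_epochs):
        mistakes = 0
        for xi, yi in zip(X, y):
            # Misclassified (or on the boundary): signed score disagrees with the label
            if yi * (np.dot(w, xi) + w0) <= 0:
                w = w + eta * yi * xi   # nudge the weights toward the misclassified point
                w0 = w0 + eta * yi
                mistakes += 1
        if mistakes == 0:   # all training points classified correctly
            break
    return w, w0

# Toy example: two well-separated clusters in two dimensions
X = np.array([[2.0, 1.0], [1.5, 2.0], [-1.0, -2.0], [-2.0, -1.5]])
y = np.array([1, 1, -1, -1])
w, w0 = perceptron_fit(X, y)
print(w, w0)
```

On linearly separable data, this loop is guaranteed to terminate with zero mistakes (the classical perceptron convergence theorem); `max_epochs` simply keeps it from cycling forever when the classes overlap.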

