Feature extraction refers to a general method of obtaining new feature variables by recalculating the original variables. \[
Z = g(X_1,X_2,\dots,X_p; W)
\]
The main purpose of feature extraction is to create a feature space that is more conducive to modeling by extracting information from the original data. Often, we also aim to achieve dimensionality reduction through feature extraction.
What is manual feature extraction? Do you have any good examples?
Manual feature extraction refers to the methods based on domain knowledge.
Feature Mapping V.S. Feature extraction
Feature mapping \(\approx\) feature extraction
About Principle Component Analysis
PCA: new variable; PC weights; variance (information). PC (extracted feature variable): \[
Z_{\textbf{w}} = g_{\textbf{w}}(X_1,X_2,\dots,X_p) = w_1X_1+w_2X_2+\dots+w_pX_p
\]
Theoretically, how many principal components (PCs) can we obtain from a dataset with \(p\) variables?
Theoretically, we can get \(p\) new variables at most.
What are the limitations of PCA?
Linear; the choice of \(\textbf{w}\) doesn’t depend on the target varaible.
Image Reconstruction IDEA
Flavor \(=\) weighted sum of ingredients.
For image reconstruction: What are the weights? What are the “ingredients”?
Weights \(=\) PCs (Z); “Ingredients” \(=\) the weights for calculating PCs
For the original data matrix, what are the “ingredients”? What does it look like?