dislib.decomposition.PCA
class dislib.decomposition.pca.base.PCA(n_components=None, arity=50)
Bases: object
Principal component analysis (PCA) using the covariance method.
Performs a full eigendecomposition of the covariance matrix.
Parameters:
- n_components (int or None, optional (default=None)) – Number of components to keep. If None, all components are kept.
- arity (int, optional (default=50)) – Arity of the reductions.
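The reference does not say how the reductions are carried out. As a generic illustration only, a reduction with a given arity can be pictured as combining partial results in groups of at most arity elements until a single value remains. The helper tree_reduce below is hypothetical and is not part of dislib.

```python
from functools import reduce
import operator


def tree_reduce(items, combine, arity=50):
    """Combine items in groups of at most `arity` until one value is left.

    Hypothetical helper for illustration; not dislib's implementation.
    """
    if arity < 2:
        raise ValueError("arity must be at least 2")
    items = list(items)
    if not items:
        raise ValueError("cannot reduce an empty sequence")
    while len(items) > 1:
        # Each pass shrinks the list by a factor of roughly `arity`.
        items = [
            reduce(combine, items[i:i + arity])
            for i in range(0, len(items), arity)
        ]
    return items[0]


# Ten partial results combined three at a time: 10 -> 4 -> 2 -> 1 values.
partial_sums = list(range(1, 11))
total = tree_reduce(partial_sums, operator.add, arity=3)
```

Under this reading, a larger arity means fewer reduction passes but more inputs per combining step.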
Variables:
- components_ (array, shape (n_components, n_features)) – Principal axes in feature space, representing the directions of maximum variance in the data. The components are sorted by explained_variance_. Equal to the n_components eigenvectors of the covariance matrix with the largest eigenvalues.
- explained_variance_ (array, shape (n_components,)) – The amount of variance explained by each of the selected components. Equal to the n_components largest eigenvalues of the covariance matrix.
- mean_ (array, shape (n_features,)) – Per-feature empirical mean, estimated from the training set.
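To make the covariance method concrete, the following is a minimal single-machine NumPy sketch of the computation described above, not dislib's distributed implementation. The function name pca_covariance is hypothetical, and dividing by n_samples - 1 (the unbiased covariance estimate) is an assumption.

```python
import numpy as np


def pca_covariance(x, n_components=None):
    """PCA via full eigendecomposition of the covariance matrix (sketch)."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=0)                    # per-feature empirical mean
    centered = x - mean
    # Assumption: unbiased covariance estimate (divide by n_samples - 1).
    cov = centered.T @ centered / (x.shape[0] - 1)
    # eigh handles symmetric matrices and returns ascending eigenvalues.
    eigvals, eigvecs = np.linalg.eigh(cov)
    order = np.argsort(eigvals)[::-1]        # largest eigenvalue first
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    if n_components is not None:
        eigvals = eigvals[:n_components]
        eigvecs = eigvecs[:, :n_components]
    components = eigvecs.T                   # shape (n_components, n_features)
    return components, eigvals, mean


x = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])
components, explained_variance, mean = pca_covariance(x)
```

Each row of components is one principal axis, and explained_variance holds the matching eigenvalues. The sign of each axis is arbitrary, so another implementation may return a flipped eigenvector.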
Examples
>>> from dislib.decomposition import PCA
>>> import numpy as np
>>> x = np.array([[1, 2], [1, 4], [1, 0], [4, 2], [4, 4], [4, 0]])
>>> from dislib.data import load_data
>>> data = load_data(x=x, subset_size=2)
>>> pca = PCA()
>>> transformed_data = pca.fit_transform(data)
>>> print(transformed_data)
>>> print(pca.components_)
>>> print(pca.explained_variance_)
fit(dataset)
Fit the model with the dataset.
Parameters: dataset (Dataset, shape (n_samples, n_features)) – Training dataset.
Returns: self – Returns the instance itself.
Return type: PCA
fit_transform(dataset)
Fit the model with the dataset and apply the dimensionality reduction to it.
Parameters: dataset (Dataset, shape (n_samples, n_features)) – Training dataset.
Returns: transformed_dataset
Return type: Dataset, shape (n_samples, n_components)
transform(dataset)
Apply dimensionality reduction to dataset.
The given dataset is projected onto the principal components previously extracted from the training dataset.
Parameters: dataset (Dataset, shape (n_samples, n_features)) – New dataset, with the same n_features as the training dataset.
Returns: transformed_dataset
Return type: Dataset, shape (n_samples, n_components)
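The projection step can be sketched with plain NumPy arrays. This assumes the usual PCA convention of centering new samples with the training mean before projecting. The function project and the fitted values below are hypothetical stand-ins, not dislib calls or real output.

```python
import numpy as np


def project(x, components, mean):
    """Center x with the training mean, then project onto the principal axes."""
    x = np.asarray(x, dtype=float)
    # (n_samples, n_features) @ (n_features, n_components)
    return (x - mean) @ components.T


# Hypothetical fitted values standing in for a trained model's attributes.
mean = np.array([2.5, 2.0])
components = np.array([[0.0, 1.0],
                       [1.0, 0.0]])

new_x = np.array([[2.5, 2.0], [4.0, 0.0]])
projected = project(new_x, components, mean)
```

The result has one row per input sample and one column per retained component, matching the (n_samples, n_components) shape given above.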