API Reference
dislib.array: Distributed array
Classes
data.Array
- 2-dimensional array divided into blocks that can be operated on in a distributed way.
Array creation routines
dislib.array
- Build a distributed array (ds-array) from an array-like structure, such as a NumPy array, a list, or a SciPy sparse matrix.
dislib.random_array
- Build a ds-array with random contents.
dislib.zeros
- Build a ds-array filled with zeros.
dislib.full
- Build a ds-array filled with a value.
dislib.eye
- Build an eye ds-array.
dislib.identity
- Build an identity ds-array.
dislib.load_svmlight_file
- Build a ds-array from a file in SVMlight format.
dislib.load_txt_file
- Build a ds-array from a text file.
dislib.load_npy_file
- Build a ds-array from a binary NumPy file.
dislib.load_mdcrd_file
- Build a ds-array from an mdcrd trajectory file.
dislib.data.load_hstack_npy_files
- Build a ds-array from .npy files, concatenating them side-by-side.
dislib.save_txt
- Save a ds-array by blocks to a directory in txt format.
Utility functions
data.util.compute_bottom_right_shape
- Computes the shape of the bottom-right block.
data.util.pad
- Pad array blocks with the desired value.
data.util.pad_last_blocks_with_zeros
- Pad array blocks with zeros.
data.util.remove_last_columns
- Removes last columns from the right-most blocks of the ds-array.
data.util.remove_last_rows
- Removes last rows from the bottom blocks of the ds-array.
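The block arithmetic behind these utilities can be sketched in plain Python. This is an illustration only, not dislib's implementation: the helper names `bottom_right_shape` and `padded_shape` are hypothetical, and the sketch assumes a regular block grid in which only the last row and last column of blocks may be smaller than the block size.

```python
def bottom_right_shape(shape, block_size):
    """Shape of the last (bottom-right) block of a regularly blocked 2-D array."""
    n_rows, n_cols = shape
    b_rows, b_cols = block_size
    # A remainder of 0 means the last block is full-sized.
    last_rows = n_rows % b_rows or b_rows
    last_cols = n_cols % b_cols or b_cols
    return last_rows, last_cols


def padded_shape(shape, block_size):
    """Smallest shape whose dimensions are multiples of the block size."""
    n_rows, n_cols = shape
    b_rows, b_cols = block_size
    # Ceiling division gives the number of blocks per dimension.
    rows = -(-n_rows // b_rows) * b_rows
    cols = -(-n_cols // b_cols) * b_cols
    return rows, cols


shape, block_size = (10, 7), (4, 3)
print(bottom_right_shape(shape, block_size))  # (2, 1)
print(padded_shape(shape, block_size))        # (12, 9)
```

Padding the last blocks up to the padded shape makes every block the same size; removing the last rows and columns afterwards undoes that padding.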
Other functions
dislib.apply_along_axis
- Applies a function to a ds-array along a given axis.
dislib.classification: Classification
classification.CascadeSVM
- Distributed support vector classification using a cascade of classifiers.
classification.KNeighborsClassifier
- Distributed K neighbors classification using partial classifiers.
dislib.cluster: Clustering
cluster.DBSCAN
- Perform DBSCAN clustering.
cluster.KMeans
- Perform K-Means clustering.
cluster.GaussianMixture
- Fit a Gaussian mixture model.
cluster.Daura
- Perform Daura clustering.
dislib.decomposition: Matrix Decomposition
decomposition.qr
- QR decomposition.
decomposition.tsqr
- Tall-Skinny QR decomposition.
decomposition.PCA
- Principal Component Analysis (PCA).
decomposition.lanczos_svd
- Lanczos SVD decomposition.
decomposition.random_svd
- Random SVD decomposition.
dislib.math: Mathematical functions
dislib.kron
- Computes the Kronecker product of two ds-arrays.
dislib.svd
- Singular value decomposition of a ds-array.
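As a reminder of what a Kronecker product computes, here is a minimal pure-Python sketch on small nested lists. It is not how dislib evaluates the product on ds-arrays, and the local function `kron` is a hypothetical stand-in written for illustration: each element of the first matrix scales a full copy of the second.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of rows."""
    result = []
    for a_row in a:
        for b_row in b:
            # Each output row pairs one row of `a` with one row of `b`.
            result.append([x * y for x in a_row for y in b_row])
    return result


a = [[1, 2], [3, 4]]
b = [[0, 1], [1, 0]]
for row in kron(a, b):
    print(row)
# [0, 1, 0, 2]
# [1, 0, 2, 0]
# [0, 3, 0, 4]
# [3, 0, 4, 0]
```

The result of multiplying an m x n matrix by a p x q matrix this way has shape (m*p) x (n*q).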
dislib.model_selection: Model selection
model_selection.GridSearchCV
- Exhaustive search over specified parameter values for an estimator.
model_selection.RandomizedSearchCV
- Randomized search over estimator parameters sampled from given distributions.
model_selection.SimulationGridSearch
- Exhaustive search over estimator parameters sampled from given distributions.
model_selection.KFold
- K-fold splitter for cross-validation.
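The idea behind K-fold splitting can be sketched without any library. The generator below is a hypothetical helper, not dislib's KFold: it assumes contiguous, unshuffled folds and works on plain index lists rather than ds-arrays.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train, test) index lists for each of `n_splits` contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most one.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        stop = start + base + (1 if fold < extra else 0)
        test = list(range(start, stop))
        train = list(range(0, start)) + list(range(stop, n_samples))
        yield train, test
        start = stop


for train, test in k_fold_indices(7, 3):
    print(train, test)
# [3, 4, 5, 6] [0, 1, 2]
# [0, 1, 2, 5, 6] [3, 4]
# [0, 1, 2, 3, 4] [5, 6]
```

Every sample lands in exactly one test fold, which is what lets a search procedure average a score over all folds for each parameter combination.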
dislib.neighbors: Neighbor queries
neighbors.NearestNeighbors
- Perform k-nearest neighbors queries.
dislib.preprocessing: Data pre-processing
preprocessing.MinMaxScaler
- Scale a ds-array to the given range.
preprocessing.StandardScaler
- Scale a ds-array to zero mean and unit variance.
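The two scaling schemes differ in what they normalize: min-max scaling maps values linearly onto a target range, while standardization shifts them to zero mean and divides by the standard deviation. The sketch below shows both formulas on a single feature column in plain Python. The function names are hypothetical, the population standard deviation is an assumption, and dislib's scalers operate on ds-arrays rather than lists.

```python
from statistics import fmean, pstdev


def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Linearly map `values` onto `feature_range`."""
    lo, hi = feature_range
    v_min, v_max = min(values), max(values)
    span = v_max - v_min
    if span == 0:
        # Constant column: map everything to the lower bound.
        return [lo for _ in values]
    return [lo + (v - v_min) / span * (hi - lo) for v in values]


def standard_scale(values):
    """Shift `values` to zero mean and scale to unit (population) variance."""
    mean = fmean(values)
    std = pstdev(values)
    if std == 0:
        # Constant column: only centering is possible.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


column = [2.0, 4.0, 6.0, 8.0]
print(min_max_scale(column))   # endpoints map to 0.0 and 1.0
print(standard_scale(column))  # mean 0, standard deviation 1
```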
dislib.recommendation: Recommendation
recommendation.ALS
- Distributed alternating least squares for collaborative filtering.
dislib.regression: Regression
regression.LinearRegression
- Multivariate linear regression using ordinary least squares.
regression.Lasso
- Linear model trained with an L1 prior as regularizer.
dislib.sorting: Sorting
sorting.TeraSort
- Sorts the ds-array using the TeraSort algorithm.
dislib.trees: Trees
trees.DecisionTreeClassifier
- Build a decision tree.
trees.DecisionTreeRegressor
- Build a regression tree.
trees.RandomForestClassifier
- Build a random forest for classification.
trees.RandomForestRegressor
- Build a random forest for regression.
trees.mmap.DecisionTreeClassifier
- Build a decision tree using memory mapping.
trees.mmap.DecisionTreeRegressor
- Build a regression tree using memory mapping.
trees.mmap.RandomForestClassifier
- Build a random forest for classification using memory mapping.
trees.mmap.RandomForestRegressor
- Build a random forest for regression using memory mapping.
trees.distributed.DecisionTreeClassifier
- Build a decision tree using the distributed approach.
trees.distributed.DecisionTreeRegressor
- Build a regression tree using the distributed approach.
trees.distributed.RandomForestClassifier
- Build a random forest for classification using the distributed approach.
trees.distributed.RandomForestRegressor
- Build a random forest for regression using the distributed approach.
trees.nested.DecisionTreeClassifier
- Build a decision tree using the nested approach.
trees.nested.DecisionTreeRegressor
- Build a regression tree using the nested approach.
trees.nested.RandomForestClassifier
- Build a random forest for classification using the nested approach.
trees.nested.RandomForestRegressor
- Build a random forest for regression using the nested approach.
dislib.utils: Utility functions
utils.shuffle
- Randomly shuffles the rows of a ds-array.