API Reference#
dislib.array: Distributed array#
Classes#
data.Array - 2-dimensional array divided into
blocks that can be operated on in a distributed way.
data.Tensor - n-dimensional tensor divided into
blocks that can be operated on in a distributed way.
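The block layout behind these classes can be pictured with a small pure-Python sketch. It is only an illustration of splitting a 2-dimensional structure into a grid of blocks; the helper name `split_into_blocks` is hypothetical and is not part of dislib, which stores and schedules its blocks in its own way.

```python
def split_into_blocks(matrix, block_shape):
    """Split a 2-D list of lists into a grid of blocks.

    result[i][j] is the block in block-row i and block-column j.
    Blocks on the right and bottom edges may be smaller than block_shape.
    """
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if matrix else 0
    block_rows, block_cols = block_shape
    grid = []
    for r in range(0, n_rows, block_rows):
        grid_row = []
        for c in range(0, n_cols, block_cols):
            # Slice the rows first, then the columns of each row.
            block = [row[c:c + block_cols] for row in matrix[r:r + block_rows]]
            grid_row.append(block)
        grid.append(grid_row)
    return grid


# A 3x5 matrix split into blocks of at most 2x2 elements.
matrix = [[10 * i + j for j in range(5)] for i in range(3)]
blocks = split_into_blocks(matrix, (2, 2))
```

Here `blocks` is a 2x3 grid, and the edge blocks are smaller than 2x2 because neither dimension is a multiple of the block size.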
Array creation routines#
dislib.array - Build a distributed array
(ds-array) from an array-like structure, such as a NumPy array, a list, or a SciPy sparse matrix.
dislib.random_array - Build a ds-array with
random contents.
dislib.zeros - Build a ds-array filled with zeros.
dislib.full - Build a ds-array filled with a value.
dislib.eye - Build a ds-array with ones on the diagonal and zeros elsewhere.
dislib.identity - Build an identity ds-array.
dislib.load_svmlight_file - Build a
ds-array from a file in SVMlight format.
dislib.load_txt_file - Build a
ds-array from a text file.
dislib.load_npy_file - Build a ds-array from
a binary NumPy file.
dislib.load_mdcrd_file - Build a ds-array
from an mdcrd trajectory file.
dislib.data.load_hstack_npy_files - Build a ds-array
from .npy files, concatenating them side-by-side.
dislib.save_txt - Save a ds-array by blocks to a
directory in txt format.
Utility functions#
data.util.compute_bottom_right_shape -
Computes the shape of the bottom-right block.
data.util.pad - Pad array blocks with
the desired value.
data.util.pad_last_blocks_with_zeros -
Pad array blocks with zeros.
data.util.remove_last_columns -
Removes last columns from the right-most blocks of the ds-array.
data.util.remove_last_rows -
Removes last rows from the bottom blocks of the ds-array.
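As a rough illustration of what these utilities deal with, the sketch below computes the shape of the bottom-right block of a blocked array and pads an undersized block with a fill value. The functions `bottom_right_shape` and `pad_block` are hypothetical stand-ins written for this example; they operate on plain Python lists and do not reproduce the dislib signatures.

```python
def bottom_right_shape(shape, block_shape):
    """Return the (rows, cols) of the last block in the block grid."""
    n_rows, n_cols = shape
    block_rows, block_cols = block_shape
    # A remainder of 0 means the last block is full-sized.
    last_rows = n_rows % block_rows or block_rows
    last_cols = n_cols % block_cols or block_cols
    return last_rows, last_cols


def pad_block(block, block_shape, value=0):
    """Return a copy of block padded on the right and bottom with value."""
    block_rows, block_cols = block_shape
    # Extend every existing row to the full block width.
    padded = [row + [value] * (block_cols - len(row)) for row in block]
    # Append full rows of the fill value until the block height is reached.
    padded += [[value] * block_cols for _ in range(block_rows - len(padded))]
    return padded


last_shape = bottom_right_shape((5, 7), (2, 3))
full_shape = bottom_right_shape((4, 6), (2, 3))
padded = pad_block([[1]], (2, 3))
```

Removing the last rows or columns is the inverse step: after a computation on padded blocks, the padding is sliced off again to recover the original shape.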
Tensor creation routines#
dislib.from_array - Build a ds-tensor from a NumPy array.
dislib.from_pt_tensor - Build a ds-tensor from a PyTorch tensor.
dislib.from_ds_array - Build a ds-tensor from a ds-array.
dislib.create_ds_tensor - Build a ds-tensor from a list of tensors.
dislib.random_tensors - Build a ds-tensor with random contents.
dislib.data.tensor.load_dataset - Build a ds-tensor from a dataset of files.
Tensor utility functions#
dislib.data.tensor.cat - Concatenate ds-tensors along a dimension.
dislib.data.tensor.change_shape - Change the shape of a ds-tensor.
dislib.data.tensor.rechunk_tensor - Rechunk a ds-tensor along a dimension.
dislib.data.tensor.shuffle - Randomly shuffle a ds-tensor.
Other functions#
dislib.apply_along_axis - Applies a
function to a ds-array along a given axis.
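The meaning of "along a given axis" can be sketched on a plain list of lists, following the NumPy convention where axis 0 passes each column to the function and axis 1 passes each row. The helper `apply_along_axis_2d` is a hypothetical name used only for this illustration and is not the dislib function.

```python
def apply_along_axis_2d(func, axis, matrix):
    """Apply func to every 1-D slice of a 2-D list along the given axis."""
    if axis == 0:
        # Slices along axis 0 are the columns.
        return [func(list(column)) for column in zip(*matrix)]
    if axis == 1:
        # Slices along axis 1 are the rows.
        return [func(row) for row in matrix]
    raise ValueError("axis must be 0 or 1 for a 2-D structure")


data = [[1, 2, 3],
        [4, 5, 6]]
column_sums = apply_along_axis_2d(sum, 0, data)
row_maxima = apply_along_axis_2d(max, 1, data)
```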
dislib.classification: Classification#
classification.CascadeSVM
- Distributed support vector classification using a cascade of classifiers.
classification.KNeighborsClassifier
- Distributed K neighbors classification using partial classifiers.
dislib.cluster: Clustering#
cluster.DBSCAN - Perform DBSCAN
clustering.
cluster.KMeans - Perform K-Means
clustering.
cluster.GaussianMixture -
Fit a Gaussian mixture model.
cluster.Daura - Perform Daura
clustering.
dislib.decomposition: Matrix Decomposition#
decomposition.qr -
QR decomposition.
decomposition.tsqr -
Tall-Skinny QR decomposition.
decomposition.PCA -
Principal Component Analysis (PCA).
decomposition.lanczos_svd -
Lanczos SVD decomposition.
decomposition.random_svd -
Random SVD decomposition.
dislib.math: Mathematical functions#
dislib.kron - Computes the Kronecker product of two
ds-arrays.
dislib.svd - Singular value decomposition of a ds-array.
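For readers unfamiliar with the Kronecker product, the following pure-Python sketch shows the operation on small dense matrices: every element of the first matrix is multiplied by the whole second matrix, and the results are laid out as a block grid. The function `kron_lists` is a hypothetical helper for this example and says nothing about how dislib distributes the computation.

```python
def kron_lists(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    rows_b, cols_b = len(b), len(b[0])
    return [
        # One output row: for each element of a_row, a scaled copy of b[i].
        [a_val * b[i][j] for a_val in a_row for j in range(cols_b)]
        for a_row in a
        for i in range(rows_b)
    ]


a = [[1, 2],
     [3, 4]]
b = [[0, 1],
     [1, 0]]
product = kron_lists(a, b)
```

The result has shape (2 * 2) x (2 * 2); in general, the product of an m x n matrix and a p x q matrix has shape (m * p) x (n * q).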
dislib.pytorch and dislib.eddl: Distributed neural network training#
pytorch.EncapsulatedFunctionsDistributedPytorch - Distributed training of neural networks using the PyTorch backend.
eddl.EncapsulatedFunctionsDistributedEddl - Distributed training of neural networks using the EDDL backend.
dislib.model_selection: Model selection#
model_selection.GridSearchCV -
Exhaustive search over specified parameter values for an estimator.
model_selection.RandomizedSearchCV -
Randomized search over estimator parameters sampled from given distributions.
model_selection.SimulationGridSearch -
Exhaustive search over estimator parameters sampled from given distributions.
model_selection.KFold -
K-fold splitter for cross-validation.
model_selection.train_test_split -
Split arrays or matrices into random train and test subsets.
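The idea behind K-fold splitting can be sketched with plain index lists: the samples are divided into consecutive folds, and each fold serves once as the test set while the remaining samples form the training set. The generator `k_fold_indices` below is a hypothetical, unshuffled illustration; the dislib splitter works on ds-arrays and may differ in its options and output.

```python
def k_fold_indices(n_samples, n_splits):
    """Yield (train_indices, test_indices) for each of n_splits folds."""
    # The first n_samples % n_splits folds receive one extra sample.
    base, extra = divmod(n_samples, n_splits)
    start = 0
    for fold in range(n_splits):
        size = base + (1 if fold < extra else 0)
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(7, 3))
```

Every sample index appears in exactly one test fold, which is what allows cross-validation scores to be averaged over the folds.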
dislib.neighbors: Neighbor queries#
neighbors.NearestNeighbors -
Perform k-nearest neighbors queries.
dislib.preprocessing: Data pre-processing#
preprocessing.MinMaxScaler -
Scale a ds-array to the given range.
preprocessing.StandardScaler -
Scale a ds-array to zero mean and unit variance.
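The two transformations named in this section, rescaling to a given range (min-max scaling) and standardization to zero mean and unit variance, can be compared on a single feature column. The functions `min_max_scale` and `standardize` are hypothetical helpers written for this sketch; they use the population variance and plain Python lists, and they are not the dislib implementation.

```python
import math


def min_max_scale(values, feature_range=(0.0, 1.0)):
    """Linearly map values so that min -> range low and max -> range high."""
    lo, hi = min(values), max(values)
    low, high = feature_range
    span = hi - lo
    if span == 0:
        # A constant column carries no spread; map it to the lower bound.
        return [low for _ in values]
    return [low + (v - lo) * (high - low) / span for v in values]


def standardize(values):
    """Shift and scale values to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant column is centered but cannot be rescaled.
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


column = [1.0, 2.0, 3.0, 4.0]
scaled = min_max_scale(column)
standardized = standardize(column)
```

Min-max scaling preserves the relative spacing within fixed bounds, while standardization leaves the output unbounded but comparable across features with different units.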
dislib.recommendation: Recommendation#
recommendation.ALS
- Distributed alternating least squares for collaborative filtering.
dislib.regression: Regression#
regression.LinearRegression
- Multivariate linear regression using ordinary least squares.
regression.Lasso
- Linear model trained with an L1 prior as regularizer.
dislib.sorting: Sorting#
sorting.TeraSort
- Sorts the ds-array using the TeraSort algorithm.
dislib.trees: Trees#
trees.DecisionTreeClassifier -
Build a decision tree.
trees.DecisionTreeRegressor -
Build a regression tree.
trees.RandomForestClassifier -
Build a random forest for classification.
trees.RandomForestRegressor -
Build a random forest for regression.
trees.mmap.DecisionTreeClassifier -
Build a decision tree using memory mapping.
trees.mmap.DecisionTreeRegressor -
Build a regression tree using memory mapping.
trees.mmap.RandomForestClassifier -
Build a random forest for classification using memory mapping.
trees.mmap.RandomForestRegressor -
Build a random forest for regression using memory mapping.
trees.distributed.DecisionTreeClassifier -
Build a decision tree using the distributed approach.
trees.distributed.DecisionTreeRegressor -
Build a regression tree using the distributed approach.
trees.distributed.RandomForestClassifier -
Build a random forest for classification using the distributed approach.
trees.distributed.RandomForestRegressor -
Build a random forest for regression using the distributed approach.
trees.nested.DecisionTreeClassifier -
Build a decision tree using the nested approach.
trees.nested.DecisionTreeRegressor -
Build a regression tree using the nested approach.
trees.nested.RandomForestClassifier -
Build a random forest for classification using the nested approach.
trees.nested.RandomForestRegressor -
Build a random forest for regression using the nested approach.
dislib.utils: Utility functions#
utils.shuffle - Randomly shuffles the
rows of a ds-array.