dislib.classification.CascadeSVM
class dislib.classification.csvm.base.CascadeSVM(cascade_arity=2, max_iter=5, tol=0.001, kernel='rbf', c=1, gamma='auto', check_convergence=True, random_state=None, verbose=False)
Bases: sklearn.base.BaseEstimator
Cascade Support Vector classification.
Implements distributed support vector classification based on Graf et al. [1]. The optimization process is carried out using scikit-learn’s SVC.
Parameters: - cascade_arity (int, optional (default=2)) – Arity of the reduction process.
- max_iter (int, optional (default=5)) – Maximum number of iterations to perform.
- tol (float, optional (default=1e-3)) – Tolerance for the stopping criterion.
- kernel (string, optional (default='rbf')) – Kernel type to be used in the algorithm. Supported kernels are 'linear' and 'rbf'.
- c (float, optional (default=1.0)) – Penalty parameter C of the error term.
- gamma (float, optional (default='auto')) – Kernel coefficient for 'rbf'. The default, 'auto', uses 1 / n_features.
- check_convergence (boolean, optional (default=True)) – Whether to test for convergence. If False, the algorithm runs for max_iter iterations. Checking for convergence adds a synchronization point after each iteration. If check_convergence=False, synchronization does not happen until a call to predict or decision_function, which can be useful for fitting multiple models in parallel.
- random_state (int, RandomState instance or None, optional (default=None)) – The seed of the pseudo random number generator used when shuffling the data for probability estimates. If int, random_state is the seed used by the random number generator; if RandomState instance, random_state is the random number generator; if None, the random number generator is the RandomState instance used by np.random.
- verbose (boolean, optional (default=False)) – Whether to print progress information.
Variables: - iterations (int) – Number of iterations performed.
- converged (boolean) – Whether the model has converged.
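The way cascade_arity and tol shape a run can be sketched in plain Python. This is an illustration only, not dislib's implementation: the names cascade_layers and has_converged are hypothetical, and the relative-change stopping rule is an assumption about how a tolerance might be applied. Following the cascade scheme of Graf et al. [1], each layer merges groups of up to cascade_arity nodes until a single node remains.

```python
from typing import List, Optional


def cascade_layers(n_partitions: int, arity: int = 2) -> List[List[List[int]]]:
    """Return, for each layer, the groups of node ids merged in that layer.

    Hypothetical helper: it only models the shape of the reduction tree,
    not the SVM training done at each node.
    """
    if n_partitions < 1:
        raise ValueError("n_partitions must be at least 1")
    if arity < 2:
        raise ValueError("arity must be at least 2")

    layers = []
    current = list(range(n_partitions))  # node ids in the current layer
    while len(current) > 1:
        # Merge up to `arity` neighbouring nodes into one node of the next layer.
        groups = [current[i:i + arity] for i in range(0, len(current), arity)]
        layers.append(groups)
        current = list(range(len(groups)))
    return layers


def has_converged(previous: Optional[float], current: float,
                  tol: float = 1e-3) -> bool:
    """Assumed stopping rule: relative change of an objective value below tol."""
    if previous is None or previous == 0:
        return False  # nothing to compare against yet
    return abs((current - previous) / previous) < tol


# With 8 partitions, a larger arity gives a shallower reduction tree.
for arity in (2, 4):
    print(arity, len(cascade_layers(8, arity)))
```

A larger arity means fewer layers per iteration but more data merged at each node, so the choice trades reduction depth against the size of each intermediate training problem.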
References
[1] Graf, H. P., Cosatto, E., Bottou, L., Durdanovic, I., & Vapnik, V. (2005). Parallel support vector machines: The cascade SVM. In Advances in Neural Information Processing Systems (pp. 521-528).
Examples
>>> import dislib as ds
>>> from dislib.classification import CascadeSVM
>>> import numpy as np
>>>
>>>
>>> if __name__ == '__main__':
>>>     x = np.array([[-1, -1], [-2, -1], [1, 1], [2, 1]])
>>>     # Labels are a column vector, matching the (n_samples, 1) shape fit expects.
>>>     y = np.array([1, 1, 2, 2]).reshape(-1, 1)
>>>     train_data = ds.array(x, block_size=(4, 2))
>>>     train_labels = ds.array(y, block_size=(4, 1))
>>>     svm = CascadeSVM()
>>>     svm.fit(train_data, train_labels)
>>>     test_data = ds.array(np.array([[-0.8, -1]]), block_size=(1, 2))
>>>     y_pred = svm.predict(test_data)
>>>     print(y_pred)
decision_function(x)
Evaluates the decision function for the samples in x.
Parameters: x (ds-array, shape=(n_samples, n_features)) – Input samples.
Returns: df – The decision function of the samples for each class in the model.
Return type: ds-array, shape=(n_samples, 2)
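For intuition, the value an RBF-kernel SVM assigns to a single sample can be written out by hand. The sketch below is plain Python, not dislib code: the support vectors, dual coefficients and intercept are made-up values, and the helper names are hypothetical. It uses the gamma='auto' convention from the parameter list, 1 / n_features, and does not reproduce the per-class layout of the ds-array that decision_function returns.

```python
import math
from typing import Optional, Sequence


def rbf_kernel(a: Sequence[float], b: Sequence[float], gamma: float) -> float:
    """RBF kernel: exp(-gamma * ||a - b||^2)."""
    squared_distance = sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return math.exp(-gamma * squared_distance)


def decision_value(x: Sequence[float],
                   support_vectors: Sequence[Sequence[float]],
                   dual_coefs: Sequence[float],
                   intercept: float,
                   gamma: Optional[float] = None) -> float:
    """Signed score of a binary kernel SVM: sum_i coef_i * K(sv_i, x) + b."""
    if gamma is None:
        gamma = 1.0 / len(x)  # the 'auto' convention: 1 / n_features
    kernel_sum = sum(coef * rbf_kernel(sv, x, gamma)
                     for sv, coef in zip(support_vectors, dual_coefs))
    return kernel_sum + intercept


# Made-up model: one support vector per class, signed dual coefficients.
support_vectors = [[-1.0, -1.0], [1.0, 1.0]]
dual_coefs = [-1.0, 1.0]
intercept = 0.0

value = decision_value([-0.8, -1.0], support_vectors, dual_coefs, intercept)
print(value)  # negative: the sample is much closer to the first support vector
```

The sign of the value indicates which side of the separating surface the sample falls on, and its magnitude grows as the sample moves away from that surface.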
fit(x, y)
Fits a model using training data.
Parameters: - x (ds-array, shape=(n_samples, n_features)) – Training samples.
- y (ds-array, shape=(n_samples, 1)) – Class labels of x.
Returns: self
Return type: CascadeSVM
predict(x)
Performs classification on the samples in x.
Parameters: x (ds-array, shape=(n_samples, n_features)) – Input samples.
Returns: y – Class labels of x.
Return type: ds-array, shape=(n_samples, 1)
score(x, y, collect=False)
Returns the mean accuracy on the given test data and labels.
Parameters: - x (ds-array, shape=(n_samples, n_features)) – Test samples.
- y (ds-array, shape=(n_samples, 1)) – True labels for x.
- collect (bool, optional (default=False)) – When True, a synchronized result is returned.
Returns: score – Mean accuracy of self.predict(x) with respect to y.
Return type: float (returned as a future object unless collect=True)
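Mean accuracy is the fraction of samples whose predicted label equals the true label. A minimal plain-Python sketch of that computation follows; mean_accuracy is a hypothetical helper that works on ordinary lists, whereas score operates on ds-arrays.

```python
from typing import Sequence


def mean_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of positions where the predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("at least one sample is required")
    matches = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return matches / len(y_true)


# Three of the four predictions match the true labels.
print(mean_accuracy([1, 1, 2, 2], [1, 2, 2, 2]))
```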