dislib.utils

Functions

dislib.utils.base.apply_splits_to_blocks(x, indexes_train, indexes_test)
dislib.utils.base.shuffle(x, y=None, random_state=None)

Randomly shuffles the rows of data.

Parameters:
  • x (ds-array) – Data to be shuffled.
  • y (ds-array, optional (default=None)) – Additional array to shuffle using the same permutation, usually for labels or values. It is required that y.shape[0] == x.shape[0].
  • random_state (int or RandomState, optional (default=None)) – Seed or numpy.random.RandomState instance to use in the generation of random numbers.
Returns:

  • x_shuffled (ds-array) – A new ds-array containing the rows of x shuffled.
  • y_shuffled (ds-array, optional) – A new ds-array containing the rows of y shuffled using the same permutation. Only provided if y is not None.
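
The contract above (one permutation applied to both x and y, so rows stay paired) can be illustrated with plain NumPy arrays. This is only a sketch of the documented behaviour, not dislib's implementation: the helper name shuffle_rows is hypothetical, it works on in-memory numpy.ndarray objects rather than ds-arrays, and returning a single array when y is None is an assumption based on the Returns description.

```python
import numpy as np


def shuffle_rows(x, y=None, random_state=None):
    """NumPy analogy of the documented contract (hypothetical helper)."""
    if y is not None and y.shape[0] != x.shape[0]:
        raise ValueError("y must have the same number of rows as x")
    # Accept either an int seed or an existing RandomState, as documented.
    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)
    # One permutation of the row indices, reused for x and y.
    perm = rng.permutation(x.shape[0])
    if y is None:
        return x[perm]
    return x[perm], y[perm]


# Row i of x is [2*i, 2*i + 1] and its label in y is i.
x = np.arange(12).reshape(6, 2)
y = np.arange(6).reshape(6, 1)
x_shuffled, y_shuffled = shuffle_rows(x, y, random_state=0)
```

Because the same index permutation is used for both arrays, every shuffled row of x is still aligned with its original label in y.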

dislib.utils.base.train_test_split(x, y=None, test_size=None, train_size=None, random_state=None)

Randomly splits the rows of data into train and test subsets.

Parameters:
  • x (ds-array) – Data to be split.
  • y (ds-array, optional (default=None)) – Additional array to split using the same permutations, usually for labels or values. It is required that y.shape[0] == x.shape[0].
  • test_size (float, optional (default=None)) – Number between 0 and 1 that defines the fraction of rows used as test data.
  • train_size (float, optional (default=None)) – Number between 0 and 1 that defines the fraction of rows used as train data.
  • random_state (int or RandomState, optional (default=None)) – Seed or numpy.random.RandomState instance to use in the generation of splits in the blocks.
Returns:

  • train (ds-array) – A new ds-array containing the rows of x that correspond to train data.
  • test (ds-array) – A new ds-array containing the rows of x that correspond to test data.
  • train_y (ds-array, optional) – A new ds-array containing the rows of y that correspond to the rows in train.
  • test_y (ds-array, optional) – A new ds-array containing the rows of y that correspond to the rows in test.
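
The split semantics can be sketched with plain NumPy arrays: shuffle the row indices once, then slice them into test and train groups and index x and y with the same groups. This is an illustration under stated assumptions, not dislib's implementation. The helper name split_rows is hypothetical, it operates on the whole array at once (the parameter description above suggests dislib generates splits per block), and it requires at least one of test_size or train_size because the behaviour when both are None is not documented here.

```python
import numpy as np


def split_rows(x, y=None, test_size=None, train_size=None, random_state=None):
    """NumPy analogy of the documented contract (hypothetical helper)."""
    if test_size is None and train_size is None:
        # Assumption of this sketch: no default fraction is invented.
        raise ValueError("this sketch needs test_size or train_size")
    # If only one fraction is given, use the complement for the other.
    if test_size is None:
        test_size = 1.0 - train_size
    if train_size is None:
        train_size = 1.0 - test_size

    n_rows = x.shape[0]
    n_test = int(round(test_size * n_rows))
    n_train = min(int(round(train_size * n_rows)), n_rows - n_test)

    if isinstance(random_state, np.random.RandomState):
        rng = random_state
    else:
        rng = np.random.RandomState(random_state)
    # Shuffle the row indices once, then cut them into two disjoint groups.
    perm = rng.permutation(n_rows)
    test_idx = perm[:n_test]
    train_idx = perm[n_test:n_test + n_train]

    if y is None:
        return x[train_idx], x[test_idx]
    # Same return order as documented: train, test, train_y, test_y.
    return x[train_idx], x[test_idx], y[train_idx], y[test_idx]


# Row i of x is [2*i, 2*i + 1] and its label in y is i.
x = np.arange(20).reshape(10, 2)
y = np.arange(10).reshape(10, 1)
train, test, train_y, test_y = split_rows(x, y, test_size=0.3, random_state=0)
```

With 10 rows and test_size=0.3, the sketch assigns 3 rows to test and the remaining 7 to train, and each row of train_y and test_y still matches its row in train and test.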