dislib.tensor

class dislib.data.tensor.Tensor(tensors, tensor_shape, dtype, number_samples, shape=None, delete=True)

Bases: object

A distributed n-dimensional tensor divided in blocks.

Normally, this class should not be instantiated directly, but created using one of the array creation routines provided.

In addition to the methods provided, this class supports the same indexing as a numpy array or pytorch tensor.

Parameters:
  • tensors (list) – List of lists of numpy nd-arrays or pytorch tensors.

  • tensor_shape (tuple) – A single tuple indicating the shape of the distributed tensors.

  • shape (tuple (int, int)) – Number of tensors in the tensors attribute of the Tensor object along each of its two dimensions.

  • shape (int) – Total number of tensors in the Tensor.

  • dtype (np or torch object) – Numerical type elements inside the tensor have.

  • number_samples (int) – Total number of samples contained in the different distributed tensors.

  • delete (boolean, optional (default=True)) – Whether to call compss_delete_object on the blocks when the garbage collector deletes this Tensor.

Variables:

shape (tuple (int, int)) – Total number of elements in the array.

apply_to_tensors(func)

Applies the specified function to all the tensors.
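As a rough illustration of the idea, the sketch below applies a function to every block of a plain nested list. It does not call dislib: `apply_to_blocks` is a hypothetical helper, the blocks are ordinary Python lists, and whether the real method works in place or returns a new object is not stated here (the sketch returns a new grid).

```python
def apply_to_blocks(blocks, func):
    """Return a new grid with func applied to every block (hypothetical helper)."""
    return [[func(block) for block in row] for row in blocks]


# A 2x2 grid whose "tensors" are plain lists of numbers.
grid = [[[1, 2], [3, 4]],
        [[5, 6], [7, 8]]]

# Double every value inside every block.
doubled = apply_to_blocks(grid, lambda block: [2 * v for v in block])
print(doubled)
```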

collect()

Tensor creation routines

dislib.create_ds_tensor(tensors, tensors_shape, shape, dtype=<class 'numpy.float64'>)

Creates a ds-tensor from a list of lists of pytorch tensors or numpy arrays. If the shape is specified, it should match the number of elements in the outer and inner lists, respectively.

Parameters:
  • tensors (List of lists containing numpy arrays or pytorch tensors) – Should contain the tensors of the object.

  • tensors_shape (tuple of int) – Shape of the regular tensors in the list.

  • shape (tuple of int) – Shape of the object; it represents the number of tensors in both dimensions of the list.

  • dtype (String) – Type of the data inside the tensors.

Returns:

x

Return type:

ds-tensor
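The relationship between the list of lists and the shape argument can be sketched without dislib. In the example below, `grid_shape` is a hypothetical helper (not part of the library) that derives the shape a rectangular list of lists would need, and the strings are placeholders standing in for numpy arrays or pytorch tensors.

```python
def grid_shape(tensors):
    """Return (rows, columns) of a list of lists, checking that it is rectangular."""
    n_rows = len(tensors)
    n_cols = len(tensors[0]) if n_rows else 0
    if any(len(row) != n_cols for row in tensors):
        raise ValueError("every inner list must have the same length")
    return (n_rows, n_cols)


# Placeholder strings stand in for numpy arrays or pytorch tensors.
tensors = [["t00", "t01", "t02"],
           ["t10", "t11", "t12"]]

# The shape passed alongside this list would be expected to be (2, 3).
shape = grid_shape(tensors)
print(shape)  # (2, 3)
```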

dislib.random_tensors(tensors_type, shape, dtype=None)

Function that generates a ds-tensor with random data inside it.

Parameters:
  • tensors_type (String) – Type of the tensors used, either “np” for numpy or “torch” for pytorch.

  • shape (tuple of int) – Shape of the object. The first two numbers will be used as the number of column tensors and row tensors, respectively.

  • dtype (String) – Type of the data inside the tensors.

Returns:

x

Return type:

ds-tensor

dislib.data.tensor.load_dataset(number_tensors_per_file, path)

Loads data from files, which can be numpy files (“.npy”) or pytorch files (“.pt”). Depending on the extension of the files, the generated ds-tensor will contain either numpy arrays or pytorch tensors.

Parameters:
  • number_tensors_per_file (int) – Number of tensors to load from each file

  • path (String) – Path to the directory where the files with the data are located

Returns:

x

Return type:

ds-tensor
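The extension-based choice described above can be sketched as follows. This is only an illustration: `backend_for_file` and `BACKENDS` are hypothetical names, the “np” and “torch” labels are used here simply as tags, and the sketch does not read any file or call dislib.

```python
import os

# Extension-to-backend mapping, following the description above.
BACKENDS = {".npy": "np", ".pt": "torch"}


def backend_for_file(filename):
    """Return "np" or "torch" for a data file, based on its extension."""
    extension = os.path.splitext(filename)[1]
    if extension not in BACKENDS:
        raise ValueError(f"unsupported file extension: {extension!r}")
    return BACKENDS[extension]


print(backend_for_file("samples_0.npy"))  # np
print(backend_for_file("samples_0.pt"))   # torch
```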

dislib.from_array(np_array, shape=None)

Creates a Tensor object from a numpy array. The array should have at least 3 dimensions: the first two correspond to the lists of tensors, and the remaining ones (at least one) hold the data inside each tensor.

Parameters:
  • np_array (np.array) – Numpy array that contains the data.

  • shape (tuple of two ints) – Shape of the output ds-tensor.

Returns:

x

Return type:

ds-tensor
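One way to read the dimension rule above is that the first two entries of the array shape give the grid of tensors and the rest give the shape of each tensor. The sketch below works on shape tuples only; `split_array_shape` is a hypothetical helper, and it assumes the optional shape argument is left unset.

```python
def split_array_shape(array_shape):
    """Split a full array shape into (grid shape, per-tensor shape)."""
    if len(array_shape) < 3:
        raise ValueError("the array needs at least 3 dimensions")
    return tuple(array_shape[:2]), tuple(array_shape[2:])


# A (2, 3, 28, 28) array would hold a 2x3 grid of 28x28 tensors.
grid, tensor_shape = split_array_shape((2, 3, 28, 28))
print(grid)          # (2, 3)
print(tensor_shape)  # (28, 28)
```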

dislib.from_pt_tensor(tensor, shape=None)

Creates a Tensor object from a pytorch tensor. The pytorch tensor should have at least 3 dimensions: the first two correspond to the lists of tensors, and the remaining ones (at least one) hold the data inside each tensor.

Parameters:
  • tensor (torch.Tensor) – Tensor that contains the data.

  • shape (tuple of two ints) – Shape of the output ds-tensor.

Returns:

x

Return type:

ds-tensor

dislib.from_ds_array(ds_array, shape=None)

Creates a Tensor object from a ds-array. This method cannot generate Tensors with more than two dimensions, so its output cannot be used in a Convolutional Neural Network or in any other Neural Network that requires input data with more than two dimensions.

Parameters:
  • ds_array (ds-array) – The ds-array to transform into ds-tensor.

  • shape (tuple of two ints) – How the tensors are organized: how many will be on axis=0 and how many on axis=1. The total number of tensors should equal the number of blocks in the input ds-array.

Returns:

tensor

Return type:

ds-tensor
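The constraint on shape (the product of its two entries must equal the number of blocks) can be checked with plain arithmetic. The helper below, `valid_grid_shapes`, is hypothetical and does not touch dislib; it lists every pair that would satisfy the constraint for a given block count.

```python
def valid_grid_shapes(n_blocks):
    """List every (axis-0, axis-1) pair whose product equals n_blocks."""
    return [(rows, n_blocks // rows)
            for rows in range(1, n_blocks + 1)
            if n_blocks % rows == 0]


# A ds-array with 6 blocks could be arranged in any of these ways.
print(valid_grid_shapes(6))  # [(1, 6), (2, 3), (3, 2), (6, 1)]
```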

Utility functions

dislib.data.tensor.shuffle(x, y=None, random_state=None)

Randomly shuffles the data contained inside the tensors.

Parameters:
  • x (ds-tensor) – ds-tensor to be shuffled

  • y (ds-tensor) – ds-tensor to be shuffled in the same way as x

  • random_state (int) – Seed that will be used in the random state functions and np.random.

Returns:

  • x (ds-tensor)

  • y (ds-tensor or None)
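Shuffling x and y together matters because samples and their labels must stay aligned. The sketch below shows that idea on plain Python lists with one shared permutation. It is not the dislib implementation: `shuffle_together` is a hypothetical helper, and it uses the standard library `random` module rather than np.random.

```python
import random


def shuffle_together(x, y=None, random_state=None):
    """Shuffle x, and y if given, with one shared permutation."""
    if y is not None and len(x) != len(y):
        raise ValueError("x and y must contain the same number of samples")
    rng = random.Random(random_state)  # seeded generator for reproducibility
    order = list(range(len(x)))
    rng.shuffle(order)
    shuffled_x = [x[i] for i in order]
    shuffled_y = None if y is None else [y[i] for i in order]
    return shuffled_x, shuffled_y


samples = ["a", "b", "c", "d"]
labels = [0, 1, 2, 3]
new_samples, new_labels = shuffle_together(samples, labels, random_state=0)

# Each sample keeps its original label after the shuffle.
print(list(zip(new_samples, new_labels)))
```

Passing the same random_state twice gives the same ordering in this sketch, which is the usual reason for exposing a seed.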

dislib.data.tensor.rechunk_tensor(tensor, new_tensors_shape, dimension=0)

Changes the shape of the tensors inside the ds-tensor. The number of tensors, and with it the shape of the ds-tensor, is adjusted so that the total number of elements fits the new shape of each tensor.

Parameters:
  • tensor (ds-tensor) – ds-tensor whose tensors will be modified

  • new_tensors_shape (int) – Shape that each of the tensors will have in the specified dimension after the rechunk

  • dimension (int) – Dimension of the tensors where the change of the shape is going to be applied

Returns:

x

Return type:

ds-tensor
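The trade-off between the size of each tensor and the number of tensors can be seen on a one-dimensional toy case. The sketch below regroups a list of chunks into chunks of a new size; `rechunk` is a hypothetical helper on plain lists, not the dislib function. When the total is not divisible by the new size, this sketch leaves a shorter last chunk, which is an assumption and not documented behaviour of the library.

```python
def rechunk(chunks, new_size):
    """Regroup a list of chunks so that each chunk holds new_size items."""
    if new_size < 1:
        raise ValueError("new_size must be positive")
    flat = [item for chunk in chunks for item in chunk]
    return [flat[i:i + new_size] for i in range(0, len(flat), new_size)]


# 4 chunks of 3 samples each: 12 samples in total.
chunks = [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]

print(len(rechunk(chunks, 6)))  # 2
print(len(rechunk(chunks, 2)))  # 6
```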

dislib.data.tensor.change_shape(tensor, new_shape)

Changes the distribution of the tensors in the ds-tensor, modifying its shape. For example, a ds-tensor with shape (2, 2) may be changed to (4, 1) or to (1, 4).

Parameters:
  • tensor (ds-tensor) – ds-tensor whose shape is going to be modified

  • new_shape (tuple of two ints) – Shape that the tensor will have after the modification

Returns:

x

Return type:

ds-tensor
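The (2, 2) to (4, 1) or (1, 4) example can be reproduced on a plain list of lists. In the sketch below, `regroup` is a hypothetical helper and the strings are placeholders for tensors. Reading the blocks row by row is an assumption made for the illustration; the ordering dislib uses is not stated here.

```python
def regroup(grid, new_shape):
    """Rearrange a grid of blocks into new_shape, reading blocks row by row."""
    flat = [block for row in grid for block in row]
    n_rows, n_cols = new_shape
    if n_rows * n_cols != len(flat):
        raise ValueError("new_shape must hold the same number of blocks")
    return [flat[r * n_cols:(r + 1) * n_cols] for r in range(n_rows)]


grid = [["t00", "t01"],
        ["t10", "t11"]]

print(regroup(grid, (4, 1)))  # [['t00'], ['t01'], ['t10'], ['t11']]
print(regroup(grid, (1, 4)))  # [['t00', 't01', 't10', 't11']]
```

The blocks themselves are untouched; only their arrangement in the grid changes.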

dislib.data.tensor.cat(tensors, dimension)

Concatenates the tensors in the tensors list along the specified dimension.

Parameters:
  • tensors (List) – List containing the tensors to concatenate.

  • dimension (int) – Dimension to use in the concatenation.

Returns:

x

Return type:

ds-tensor
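As a loose illustration of concatenating along a dimension, the sketch below joins grids of blocks represented as lists of lists. It rests on assumptions: `cat_grids` is a hypothetical helper, it only handles dimensions 0 and 1, and it treats concatenation as acting on the grid of blocks, which may differ from what dislib does internally.

```python
def cat_grids(grids, dimension):
    """Concatenate grids of blocks along dimension 0 (rows) or 1 (columns)."""
    if dimension == 0:
        # Stack the rows of every grid one after another.
        return [list(row) for grid in grids for row in grid]
    if dimension == 1:
        n_rows = len(grids[0])
        if any(len(grid) != n_rows for grid in grids):
            raise ValueError("all grids need the same number of rows")
        # Extend each row with the matching row of the other grids.
        return [[block for grid in grids for block in grid[r]]
                for r in range(n_rows)]
    raise ValueError("dimension must be 0 or 1")


a = [["a00", "a01"]]
b = [["b00", "b01"]]

print(cat_grids([a, b], 0))  # [['a00', 'a01'], ['b00', 'b01']]
print(cat_grids([a, b], 1))  # [['a00', 'a01', 'b00', 'b01']]
```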