Dataset Handlers

Researcher-supplied Python modules that are passed to sacroml.attacks.target.Target must contain a dataset class (to handle processing, splitting, etc.) that implements one of these abstract classes.

Dataset classes for scikit-learn models that use numpy arrays should implement SklearnDataHandler.

Dataset classes for PyTorch models that use DataLoaders should implement PyTorchDataHandler.

API Reference

Abstract data handler supporting both PyTorch and scikit-learn.

class sacroml.attacks.data.BaseDataHandler

Base data handling interface.

abstractmethod __init__() -> None

Instantiate a data handler.

class sacroml.attacks.data.PyTorchDataHandler

PyTorch dataset handling interface.

Methods

get_dataloader(dataset, indices[, ...])

Return a data loader with a requested subset of samples.

get_dataset()

Return a processed dataset.

get_raw_dataset()

Return a raw unprocessed dataset.

abstractmethod __init__() -> None

Instantiate a data handler.

abstractmethod get_dataloader(dataset: Dataset, indices: Sequence[int], batch_size: int = 32, shuffle: bool = False) -> DataLoader

Return a data loader with a requested subset of samples.

Parameters:
dataset : Dataset

A (processed) PyTorch dataset.

indices : Sequence[int]

The indices to load from the dataset.

batch_size : int, default 32

The number of samples per batch.

shuffle : bool, default False

Whether to shuffle the data.

Returns:
DataLoader

A PyTorch DataLoader.

abstractmethod get_dataset() -> Dataset

Return a processed dataset.

Returns:
Dataset

A (processed) PyTorch dataset.

abstractmethod get_raw_dataset() -> Dataset | None

Return a raw unprocessed dataset.

Returns:
Dataset | None

An unprocessed PyTorch dataset, or None if the handler does not provide one.
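The interface above can be illustrated with a minimal sketch of a concrete handler. The class name ToyTorchHandler, the synthetic tensors, and the standardisation step are assumptions made for illustration and are not part of sacroml. The fallback base class exists only so the sketch runs when sacroml is not installed; in real use the handler would subclass sacroml.attacks.data.PyTorchDataHandler directly.

```python
from collections.abc import Sequence
from typing import Optional

import torch
from torch.utils.data import DataLoader, Dataset, Subset, TensorDataset

try:
    from sacroml.attacks.data import PyTorchDataHandler
except ImportError:
    from abc import ABC

    class PyTorchDataHandler(ABC):
        """Stand-in base class, used only when sacroml is unavailable."""


class ToyTorchHandler(PyTorchDataHandler):
    """Hypothetical handler wrapping a small synthetic dataset."""

    def __init__(self) -> None:
        # Synthetic data (an assumption): 64 samples with 3 features each.
        generator = torch.Generator().manual_seed(0)
        features = torch.randn(64, 3, generator=generator)
        labels = (features[:, 0] > 0).long()
        self.raw = TensorDataset(features, labels)
        # "Processing" here is plain per-feature standardisation.
        scaled = (features - features.mean(dim=0)) / features.std(dim=0)
        self.processed = TensorDataset(scaled, labels)

    def get_dataset(self) -> Dataset:
        return self.processed

    def get_raw_dataset(self) -> Optional[Dataset]:
        return self.raw

    def get_dataloader(
        self,
        dataset: Dataset,
        indices: Sequence[int],
        batch_size: int = 32,
        shuffle: bool = False,
    ) -> DataLoader:
        # Subset restricts the dataset to the requested sample indices.
        subset = Subset(dataset, list(indices))
        return DataLoader(subset, batch_size=batch_size, shuffle=shuffle)


handler = ToyTorchHandler()
loader = handler.get_dataloader(handler.get_dataset(), range(10), batch_size=4)
for batch_features, batch_labels in loader:
    print(batch_features.shape, batch_labels.shape)
```

Wrapping the dataset in torch.utils.data.Subset is one straightforward way to honour the indices argument; a sampler-based approach would be an equally valid design choice.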

class sacroml.attacks.data.SklearnDataHandler

Scikit-learn data handling interface.

Methods

get_data()

Return the processed data arrays.

get_raw_data()

Return the original unprocessed data arrays.

get_subset(X, y, indices)

Return a subset of the data.

abstractmethod __init__() -> None

Instantiate a data handler.

abstractmethod get_data() -> tuple[ndarray, ndarray]

Return the processed data arrays.

Returns:
tuple[np.ndarray, np.ndarray]

Features (X) and targets (y) as numpy arrays.

abstractmethod get_raw_data() -> tuple[ndarray, ndarray] | None

Return the original unprocessed data arrays.

Returns:
tuple[np.ndarray, np.ndarray] | None

Unprocessed features (X) and targets (y) as numpy arrays, or None if the handler does not provide them.

abstractmethod get_subset(X: ndarray, y: ndarray, indices: Sequence[int]) -> tuple[ndarray, ndarray]

Return a subset of the data.

Parameters:
X : np.ndarray

Feature array.

y : np.ndarray

Target array.

indices : Sequence[int]

The indices to extract.

Returns:
tuple[np.ndarray, np.ndarray]

Subset of features and targets.
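As a rough illustration of the three methods above, the following sketch implements a handler over synthetic numpy arrays. The class name ToySklearnHandler, the generated data, and the standardisation step are hypothetical and chosen only for the example. The fallback base class exists only so the sketch runs when sacroml is not installed; in real use the handler would subclass sacroml.attacks.data.SklearnDataHandler directly.

```python
from collections.abc import Sequence
from typing import Optional

import numpy as np

try:
    from sacroml.attacks.data import SklearnDataHandler
except ImportError:
    from abc import ABC

    class SklearnDataHandler(ABC):
        """Stand-in base class, used only when sacroml is unavailable."""


class ToySklearnHandler(SklearnDataHandler):
    """Hypothetical handler wrapping a small synthetic dataset."""

    def __init__(self) -> None:
        # Synthetic data (an assumption): 100 samples with 4 features each.
        rng = np.random.default_rng(0)
        self.X_raw = rng.normal(loc=5.0, scale=2.0, size=(100, 4))
        self.y = (self.X_raw[:, 0] > 5.0).astype(int)
        # "Processing" here is plain per-feature standardisation.
        self.X = (self.X_raw - self.X_raw.mean(axis=0)) / self.X_raw.std(axis=0)

    def get_data(self) -> tuple[np.ndarray, np.ndarray]:
        return self.X, self.y

    def get_raw_data(self) -> Optional[tuple[np.ndarray, np.ndarray]]:
        return self.X_raw, self.y

    def get_subset(
        self, X: np.ndarray, y: np.ndarray, indices: Sequence[int]
    ) -> tuple[np.ndarray, np.ndarray]:
        # Integer-array indexing selects the requested rows from both arrays.
        idx = np.asarray(indices, dtype=int)
        return X[idx], y[idx]


handler = ToySklearnHandler()
X, y = handler.get_data()
X_subset, y_subset = handler.get_subset(X, y, [0, 5, 9])
print(X_subset.shape, y_subset.shape)
```

Converting indices with np.asarray lets get_subset accept any Sequence[int] (a list, tuple, or range), matching the documented parameter type.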