Attacks

Examples showing how to run the code can be found in the examples folder.

Calculate metrics.

aisdc.metrics.auc_p_val(auc: float, n_pos: int, n_neg: int) → tuple[float, float][source]

Compute the p-value for a given AUC.

Parameters:
auc : float

Observed AUC value

n_pos : int

Number of positive examples

n_neg : int

Number of negative examples

Returns:
auc_p : float

p-value of observing an AUC > auc by chance

auc_std : float

standard deviation of the NULL AUC distribution (mean = 0.5)
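
A short usage sketch (the AUC value and class counts below are arbitrary):

>>> from aisdc.metrics import auc_p_val
>>> auc_p, auc_std = auc_p_val(0.65, n_pos=500, n_neg=500)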

aisdc.metrics.get_metrics(y_pred_proba: ndarray, y_test: ndarray)[source]

Calculate metrics, including attacker advantage for MIA binary.

Implemented as Definition 4 on https://arxiv.org/pdf/1709.01604.pdf which is also implemented in tensorFlow-privacy https://github.com/tensorflow/privacy.

Parameters:
y_test : np.ndarray

test data labels

y_pred_proba : np.ndarray of shape [x, 2] and type float

predicted probabilities

Returns:
metrics : dict

dictionary of metric values

Notes

Includes the following metrics:

  • True positive rate or recall (TPR).

  • False positive rate (FPR), proportion of negative examples incorrectly classified as positives.

  • False alarm rate (FAR), proportion of objects classified as positives that are incorrect, also known as false discovery rate.

  • True negative rate (TNR).

  • Positive predictive value or precision (PPV).

  • Negative predictive value (NPV).

  • False negative rate (FNR).

  • Accuracy (ACC).

  • F1 Score - harmonic mean of precision and recall.

  • Advantage.
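
A hedged usage sketch for get_metrics (assuming a fitted binary sklearn classifier clf and held-out arrays X_test and y_test; these names are illustrative):

>>> from aisdc.metrics import get_metrics
>>> y_pred_proba = clf.predict_proba(X_test)  # shape [n_samples, 2]
>>> metrics = get_metrics(y_pred_proba, y_test)
>>> print(metrics)  # dictionary of the metric values listed in the Notes above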

aisdc.metrics.get_probabilities(clf, X_test: ndarray, y_test: ndarray = array([], dtype=float64), permute_rows: bool = False)[source]

Given a prediction model and a dataset, calculate the predictions of the model for each data sample in probability format.

Parameters:
clf : sklearn.Model

trained model

X_test : np.ndarray

test data matrix

y_test : np.ndarray

test data labels

permute_rows : boolean

a flag to indicate whether rows should be permuted

Returns:
y_pred_proba : a list of probabilities for each sample in the dataset

Notes

If permute_rows is set to True, y_test must also be supplied. The function will then return both the predicted probabilities and the corresponding y_test.
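
A hedged sketch of both calling modes (assuming a fitted sklearn classifier clf; variable names are illustrative):

>>> from aisdc.metrics import get_probabilities
>>> y_pred_proba = get_probabilities(clf, X_test)
>>> y_pred_proba, y_test_shuffled = get_probabilities(clf, X_test, y_test, permute_rows=True)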

aisdc.metrics.min_max_disc(y_true: ndarray, pred_probs: ndarray, x_prop: float = 0.1, log_p: bool = True) → tuple[float, float, float, float][source]

Non-average-case methods for MIA attacks. Considers actual frequency of membership amongst samples with highest- and lowest- assessed probability of membership. If an MIA method confidently asserts that 5% of samples are members and 5% of samples are not, but cannot tell for the remaining 90% of samples, then these metrics will flag this behaviour, but AUC/advantage may not. Since the difference may be noisy, a p-value against a null of independence of true membership and assessed membership probability (that is, membership probabilities are essentially random) is also used as a metric (using a usual Gaussian approximation to binomial). If the p-value is low and the frequency difference is high (>0.5) then the MIA attack is successful for some samples.

Parameters:
y_true : np.ndarray

true labels

pred_probs : np.ndarray

probabilities of labels, or monotonic transformation of probabilities

x_prop : float

proportion of samples with highest- and lowest- probability of membership to be considered

log_p : bool

convert p-values to log(p).

Returns:
maxd : float

frequency of y_true=1 amongst the proportion x_prop of individuals with the highest assessed membership probability

mind : float

frequency of y_true=1 amongst the proportion x_prop of individuals with the lowest assessed membership probability

mmd : float

difference between maxd and mind

pval : float

p-value or log-p value corresponding to mmd against the null hypothesis that the random variables corresponding to y_true and pred_probs are independent.

Examples

>>> import numpy as np
>>> from aisdc.metrics import min_max_disc
>>> y = np.random.choice(2, 100)
>>> yp = np.random.rand(100)
>>> maxd, mind, mmd, pval = min_max_disc(y, yp, x_prop=0.2, log_p=True)

Worst_case_attack.py.

Runs a worst case attack based upon predictive probabilities stored in two .csv files

class aisdc.attacks.worst_case_attack.WorstCaseAttack(n_reps: int = 10, reproduce_split: int | Iterable[int] | None = 5, p_thresh: float = 0.05, n_dummy_reps: int = 1, train_beta: int = 1, test_beta: int = 1, test_prop: float = 0.2, n_rows_in: int = 1000, n_rows_out: int = 1000, training_preds_filename: str | None = None, test_preds_filename: str | None = None, output_dir: str = 'output_worstcase', report_name: str = 'report_worstcase', include_model_correct_feature: bool = False, sort_probs: bool = True, mia_attack_model: Any = <class 'sklearn.ensemble._forest.RandomForestClassifier'>, mia_attack_model_hyp: dict | None = None, attack_metric_success_name: str = 'P_HIGHER_AUC', attack_metric_success_thresh: float = 0.05, attack_metric_success_comp_type: str = 'lte', attack_metric_success_count_thresh: int = 5, attack_fail_fast: bool = False, attack_config_json_file_name: str | None = None, target_path: str | None = None)[source]

Class to wrap the worst case attack code.

Methods

attack(target)

Programmatic attack entry point.

attack_from_prediction_files()

Start an attack from saved prediction files.

attack_from_preds(train_preds, test_preds[, ...])

Runs the attack based upon the predictions in train_preds and test_preds, and the params stored in self.args.

generate_arrays(n_rows_in, n_rows_out[, ...])

Generate train and test prediction arrays, used when computing baseline.

get_params()

Get parameters for this attack.

make_dummy_data()

Makes dummy data for testing functionality.

make_report()

Creates output dictionary structure and generates pdf and json outputs if filenames are given.

run_attack_reps(train_preds, test_preds[, ...])

Run actual attack reps from train and test predictions.

attack(target: Target) → None[source]

Programmatic attack entry point.

To be used when code has access to Target class and trained target model

Parameters:
target : attacks.target.Target

target as a Target class object
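
A minimal sketch of the programmatic entry point (assuming target is an already-constructed attacks.target.Target wrapping the trained model and its train/test data; construction of the Target is not shown here):

>>> from aisdc.attacks.worst_case_attack import WorstCaseAttack
>>> wca = WorstCaseAttack(n_reps=10, p_thresh=0.05, output_dir="output_worstcase")
>>> wca.attack(target)
>>> output = wca.make_report()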

attack_from_prediction_files()[source]

Start an attack from saved prediction files.

To be used when only saved predictions are available.

Filenames for the saved prediction files are specified in the arguments provided to the constructor.

attack_from_preds(train_preds: ndarray, test_preds: ndarray, train_correct: ndarray | None = None, test_correct: ndarray | None = None) → None[source]

Runs the attack based upon the predictions in train_preds and test_preds, and the params stored in self.args.

Parameters:
train_preds : np.ndarray

Array of train predictions. One row per example, one column per class (i.e. 2)

test_preds : np.ndarray

Array of test predictions. One row per example, one column per class (i.e. 2)

generate_arrays(n_rows_in: int, n_rows_out: int, train_beta: float = 2, test_beta: float = 2) → tuple[ndarray, ndarray][source]

Generate train and test prediction arrays, used when computing baseline.

Parameters:
n_rows_in : int

number of rows of in-sample (training) probabilities

n_rows_out : int

number of rows of out-of-sample (testing) probabilities

train_beta : float

beta value for generating train probabilities

test_beta : float

beta value for generating test probabilities

Returns:
train_preds : np.ndarray

Array of train predictions (n_rows x 2 columns)

test_preds : np.ndarray

Array of test predictions (n_rows x 2 columns)
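
A short sketch combining generate_arrays with attack_from_preds to run the attack on synthetic baseline probabilities (parameter values are arbitrary):

>>> from aisdc.attacks.worst_case_attack import WorstCaseAttack
>>> wca = WorstCaseAttack(n_reps=5)
>>> train_preds, test_preds = wca.generate_arrays(1000, 1000, train_beta=5, test_beta=2)
>>> wca.attack_from_preds(train_preds, test_preds)
>>> output = wca.make_report()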

make_dummy_data() → None[source]

Makes dummy data for testing functionality.

Parameters:
args : dict

Command line arguments

Returns:

Notes

Returns nothing but saves two .csv files

make_report() → dict[source]

Creates output dictionary structure and generates pdf and json outputs if filenames are given.

run_attack_reps(train_preds: ndarray, test_preds: ndarray, train_correct: ndarray | None = None, test_correct: ndarray | None = None) → dict[source]

Run actual attack reps from train and test predictions.

Parameters:
train_preds : np.ndarray

predictions from the model on training (in-sample) data

test_preds : np.ndarray

predictions from the model on testing (out-of-sample) data

Returns:
mia_metrics_dict : dict

a dictionary with two items: mia_metrics (a list of metric dictionaries, one per repetition) and failfast_metric_summary (an object of the FailFast class maintaining a summary of attack success/failure for the metric used by the fail-fast option)

aisdc.attacks.worst_case_attack.main()[source]

Main method to parse arguments and invoke relevant method.

Code for automatic report generation.

class aisdc.attacks.report.NumpyArrayEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Json encoder that can cope with numpy arrays.

Methods

default(o)

If an object is an np.ndarray, convert to list.

encode(o)

Return a JSON string representation of a Python data structure.

iterencode(o[, _one_shot])

Encode the given object and yield each string representation as available.

default(o)[source]

If an object is an np.ndarray, convert to list.
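
A small illustration of using the encoder with the standard json module (the payload is arbitrary):

>>> import json
>>> import numpy as np
>>> from aisdc.attacks.report import NumpyArrayEncoder
>>> report_json = json.dumps({"scores": np.array([0.1, 0.9])}, cls=NumpyArrayEncoder)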

aisdc.attacks.report.add_output_to_pdf(report_dest: str, pdf_report: FPDF, attack_type: str) → None[source]

Creates a pdf, appending the contents if the file already exists.

aisdc.attacks.report.create_json_report(output)[source]

Create a report in json format for ingestion by other tools.

aisdc.attacks.report.create_lr_report(output: dict) → FPDF[source]

Make a lira membership inference report.

Parameters:
output : dict

dictionary with the following items

metadata: dict

dictionary of metadata

attack_experiment_logger: dict

list of metrics as dictionary items for an experiment. In the case of the LIRA attack scenario, this will have dictionary items of attack_instance_logger, which will contain a single metrics dictionary

Returns:
pdf : fpdf.FPDF

fpdf document object

aisdc.attacks.report.create_mia_report(attack_output: dict) → FPDF[source]

Make a worst case membership inference report.

Parameters:
attack_output : dict

dictionary with the following items

metadata: dict

dictionary of metadata

attack_experiment_logger: dict

list of metrics as dictionary items for an experiment

dummy_attack_experiment_logger: dict

list of metrics as dictionary items across dummy experiments

Returns:
pdf : fpdf.FPDF

fpdf document object
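
A hedged sketch of producing and saving the pdf (assuming output is a dictionary with the items listed above, for example the dictionary returned by WorstCaseAttack.make_report):

>>> from aisdc.attacks.report import create_mia_report
>>> pdf = create_mia_report(output)
>>> pdf.output("worst_case_report.pdf")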

aisdc.attacks.report.line(pdf, text, indent=0, border=0, font_size=11, font_style='', font='arial')[source]

Write a standard block.

aisdc.attacks.report.subtitle(pdf, text, indent=10, border=0, font_size=12, font_style='B')[source]

Write a subtitle block.

aisdc.attacks.report.title(pdf, text, border=0, font_size=24, font_style='B')[source]

Write a title block.

Likelihood testing scenario from https://arxiv.org/pdf/2112.03570.pdf.

class aisdc.attacks.likelihood_attack.DummyClassifier[source]

A Dummy Classifier to allow this code to work with get_metrics.

Methods

predict(test_X)

Return an array of 1/0 depending on value in second column.

predict_proba(test_X)

Simply return the test_X.

predict(test_X)[source]

Return an array of 1/0 depending on value in second column.

predict_proba(test_X)[source]

Simply return the test_X.

class aisdc.attacks.likelihood_attack.LIRAAttack(n_shadow_models: int = 100, p_thresh: float = 0.05, output_dir: str = 'outputs_lira', report_name: str = 'report_lira', training_data_filename: str | None = None, test_data_filename: str | None = None, training_preds_filename: str | None = None, test_preds_filename: str | None = None, target_model: list | None = None, target_model_hyp: dict | None = None, attack_config_json_file_name: str | None = None, n_shadow_rows_confidences_min: int = 10, shadow_models_fail_fast: bool = False, target_path: str | None = None)[source]

The main LIRA Attack class.

Methods

attack(target)

Programmatic attack running. Runs a LIRA attack from a Target object and a target model.

attack_from_config()

Runs an attack based on the args parsed from the command line.

example()

Runs an example attack using data from sklearn.

get_params()

Get parameters for this attack.

make_report()

Create the report.

run_scenario_from_preds(shadow_clf, ...)

Implements the likelihood test, using the "offline" version. See p.6 (top of second column) of the paper for details.

setup_example_data()

Method to create example data and save (including config).

attack(target: Target) → None[source]

Programmatic attack running. Runs a LIRA attack from a Target object and a target model.

Parameters:
target : attacks.target.Target

target as an instance of the Target class. Needs to have x_train, x_test, y_train and y_test set.
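
A minimal sketch of the programmatic entry point (assuming target is an attacks.target.Target that wraps the trained target model and has x_train, x_test, y_train and y_test set, as required above):

>>> from aisdc.attacks.likelihood_attack import LIRAAttack
>>> lira = LIRAAttack(n_shadow_models=100, output_dir="outputs_lira")
>>> lira.attack(target)
>>> output = lira.make_report()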

attack_from_config() → None[source]

Runs an attack based on the args parsed from the command line.

example() → None[source]

Runs an example attack using data from sklearn.

Generates example data, trains a classifier and runs the attack.

make_report() → dict[source]

Create the report.

Creates the output report. If self.args.report_name is not None, it will also save the information in json and pdf formats

Returns:
output : Dict

Dictionary containing all attack output

run_scenario_from_preds(shadow_clf: BaseEstimator, X_target_train: Iterable[float], y_target_train: Iterable[float], target_train_preds: Iterable[float], X_shadow_train: Iterable[float], y_shadow_train: Iterable[float], shadow_train_preds: Iterable[float]) → tuple[ndarray, ndarray, BaseEstimator][source]

Implements the likelihood test, using the “offline” version. See p.6 (top of second column) of the paper for details.

Parameters:
shadow_clf : sklearn.Model

An sklearn classifier that will be trained to form the shadow model. All hyper-parameters should have been set.

X_target_train : np.ndarray

Data that was used to train the target model

y_target_train : np.ndarray

Labels that were used to train the target model

target_train_preds : np.ndarray

Array of predictions produced by the target model on the training data

X_shadow_train : np.ndarray

Data that will be used to train the shadow models

y_shadow_train : np.ndarray

Labels that will be used to train the shadow model

shadow_train_preds : np.ndarray

Array of predictions produced by the target model on the shadow data

Returns:
mia_scores : np.ndarray

Attack probabilities of belonging to the training set or not

mia_labels : np.ndarray

True labels of belonging to the training set or not

mia_cls : DummyClassifier

A DummyClassifier that directly returns the scores for compatibility with code in metrics.py

Examples

>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.model_selection import train_test_split
>>> X, y = load_breast_cancer(return_X_y=True, as_frame=False)
>>> train_X, test_X, train_y, test_y = train_test_split(
>>>   X, y, test_size=0.5, stratify=y
>>> )
>>> rf = RandomForestClassifier(min_samples_leaf=1, min_samples_split=2)
>>> rf.fit(train_X, train_y)
>>> lira = LIRAAttack(n_shadow_models=100)
>>> mia_test_probs, mia_test_labels, mia_clf = lira.run_scenario_from_preds(
>>>     RandomForestClassifier(min_samples_leaf=1, min_samples_split=2, max_depth=10),
>>>     train_X,
>>>     train_y,
>>>     rf.predict_proba(train_X),
>>>     test_X,
>>>     test_y,
>>>     rf.predict_proba(test_X),
>>> )

setup_example_data() → None[source]

Method to create example data and save (including config). Intended to allow users to see how they would need to set up their own data.

Generates train and test data .csv files, train and test predictions .csv files and a config.json file that can be used to run the attack from the command line.

aisdc.attacks.likelihood_attack.main()[source]

Main method to parse args and invoke relevant code.