Attacks

Examples showing how to run the code can be found in the examples folder.

Calculate metrics.

aisdc.metrics.auc_p_val(auc: float, n_pos: int, n_neg: int) → tuple[float, float][source]

Compute the p-value for a given AUC.

Parameters:
auc : float

Observed AUC value

n_pos : int

Number of positive examples

n_neg : int

Number of negative examples

Returns:
auc_p : float

p-value of observing an AUC > auc by chance

auc_std : float

standard deviation of the NULL AUC distribution (mean = 0.5)
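
A short usage sketch (the AUC value and class counts below are arbitrary):

>>> from aisdc.metrics import auc_p_val
>>> auc_p, auc_std = auc_p_val(0.65, n_pos=500, n_neg=500)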

aisdc.metrics.get_metrics(y_pred_proba: ndarray, y_test: ndarray)[source]

Calculate metrics, including attacker advantage for MIA binary.

Implemented as Definition 4 on https://arxiv.org/pdf/1709.01604.pdf which is also implemented in tensorFlow-privacy https://github.com/tensorflow/privacy.

Parameters:
y_test : np.ndarray

test data labels

y_pred_proba : np.ndarray of shape [x, 2] and type float

predicted probabilities

Returns:
metrics : dict

dictionary of metric values

Notes

Includes the following metrics:

  • True positive rate or recall (TPR).

  • False positive rate (FPR), proportion of negative examples incorrectly classified as positives.

  • False alarm rate (FAR), proportion of objects classified as positives that are incorrect, also known as false discovery rate.

  • True negative rate (TNR).

  • Positive predictive value or precision (PPV).

  • Negative predictive value (NPV).

  • False negative rate (FNR).

  • Accuracy (ACC).

  • F1 Score - harmonic mean of precision and recall.

  • Advantage.
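
A hedged usage sketch for get_metrics (assuming a fitted binary sklearn classifier clf and held-out arrays X_test and y_test; these names are illustrative):

>>> from aisdc.metrics import get_metrics
>>> y_pred_proba = clf.predict_proba(X_test)  # shape [n_samples, 2]
>>> metrics = get_metrics(y_pred_proba, y_test)
>>> print(metrics)  # dictionary of the metric values listed in the Notes above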

aisdc.metrics.get_probabilities(clf, X_test: ndarray, y_test: ndarray = array([], dtype=float64), permute_rows: bool = False)[source]

Given a prediction model and a dataset, calculate the predictions of the model for each data sample in probability format.

Parameters:
clf : sklearn.Model

trained model

X_test : np.ndarray

test data matrix

y_test : np.ndarray

test data labels

permute_rows : boolean

a flag to indicate whether rows should be permuted

Returns:
y_pred_proba : a list of probabilities for each sample in the dataset

Notes

If permute_rows is set to True, y_test must also be supplied. The function will then return both the predicted probabilities and the corresponding y_test.
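
A hedged sketch of both calling modes (assuming a fitted sklearn classifier clf; variable names are illustrative):

>>> from aisdc.metrics import get_probabilities
>>> y_pred_proba = get_probabilities(clf, X_test)
>>> y_pred_proba, y_test_shuffled = get_probabilities(clf, X_test, y_test, permute_rows=True)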

aisdc.metrics.min_max_disc(y_true: ndarray, pred_probs: ndarray, x_prop: float = 0.1, log_p: bool = True) → tuple[float, float, float, float][source]

Non-average-case methods for MIA attacks. Considers actual frequency of membership amongst samples with highest- and lowest- assessed probability of membership. If an MIA method confidently asserts that 5% of samples are members and 5% of samples are not, but cannot tell for the remaining 90% of samples, then these metrics will flag this behaviour, but AUC/advantage may not. Since the difference may be noisy, a p-value against a null of independence of true membership and assessed membership probability (that is, membership probabilities are essentially random) is also used as a metric (using a usual Gaussian approximation to binomial). If the p-value is low and the frequency difference is high (>0.5) then the MIA attack is successful for some samples.

Parameters:
y_true : np.ndarray

true labels

pred_probs : np.ndarray

probabilities of labels, or monotonic transformation of probabilities

x_prop : float

proportion of samples with highest- and lowest- probability of membership to be considered

log_p : bool

convert p-values to log(p).

Returns:
maxd : float

frequency of y_true=1 amongst the proportion x_prop of individuals with the highest assessed membership probability

mind : float

frequency of y_true=1 amongst the proportion x_prop of individuals with the lowest assessed membership probability

mmd : float

difference between maxd and mind

pval : float

p-value or log-p value corresponding to mmd against the null hypothesis that the random variables corresponding to y_true and pred_probs are independent.

Examples

>>> import numpy as np
>>> from aisdc.metrics import min_max_disc
>>> y = np.random.choice(2, 100)
>>> yp = np.random.rand(100)
>>> maxd, mind, mmd, pval = min_max_disc(y, yp, x_prop=0.2, log_p=True)

Worst_case_attack.py.

Runs a worst case attack based upon predictive probabilities stored in two .csv files

class aisdc.attacks.worst_case_attack.WorstCaseAttack(n_reps: int = 10, reproduce_split: int | Iterable[int] | None = 5, p_thresh: float = 0.05, n_dummy_reps: int = 1, train_beta: int = 1, test_beta: int = 1, test_prop: float = 0.2, n_rows_in: int = 1000, n_rows_out: int = 1000, training_preds_filename: str | None = None, test_preds_filename: str | None = None, output_dir: str = 'output_worstcase', report_name: str = 'report_worstcase', include_model_correct_feature: bool = False, sort_probs: bool = True, mia_attack_model: Any = <class 'sklearn.ensemble._forest.RandomForestClassifier'>, mia_attack_model_hyp: dict | None = None, attack_metric_success_name: str = 'P_HIGHER_AUC', attack_metric_success_thresh: float = 0.05, attack_metric_success_comp_type: str = 'lte', attack_metric_success_count_thresh: int = 5, attack_fail_fast: bool = False, attack_config_json_file_name: str | None = None, target_path: str | None = None)[source]

Class to wrap the worst case attack code.

Methods

attack(target)

Programmatic attack entry point.

attack_from_prediction_files()

Start an attack from saved prediction files.

attack_from_preds(train_preds, test_preds[, ...])

Runs the attack based upon the predictions in train_preds and test_preds, and the params stored in self.args.

generate_arrays(n_rows_in, n_rows_out[, ...])

Generate train and test prediction arrays, used when computing baseline.

get_params()

Get parameters for this attack.

make_dummy_data()

Makes dummy data for testing functionality.

make_report()

Creates output dictionary structure and generates pdf and json outputs if filenames are given.

run_attack_reps(train_preds, test_preds[, ...])

Run actual attack reps from train and test predictions.

attack(target: Target) → None[source]

Programmatic attack entry point.

To be used when code has access to Target class and trained target model

Parameters:
target : attacks.target.Target

target as a Target class object
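
A minimal sketch of the programmatic entry point (assuming target is an already-constructed attacks.target.Target wrapping the trained model and its train/test data; construction of the Target is not shown here):

>>> from aisdc.attacks.worst_case_attack import WorstCaseAttack
>>> wca = WorstCaseAttack(n_reps=10, p_thresh=0.05, output_dir="output_worstcase")
>>> wca.attack(target)
>>> output = wca.make_report()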

attack_from_prediction_files()[source]

Start an attack from saved prediction files.

To be used when only saved predictions are available.

Filenames for the saved prediction files are specified in the arguments provided to the constructor.

attack_from_preds(train_preds: ndarray, test_preds: ndarray, train_correct: ndarray | None = None, test_correct: ndarray | None = None) → None[source]

Runs the attack based upon the predictions in train_preds and test_preds, and the params stored in self.args.

Parameters:
train_preds : np.ndarray

Array of train predictions. One row per example, one column per class (i.e. 2)

test_preds : np.ndarray

Array of test predictions. One row per example, one column per class (i.e. 2)

generate_arrays(n_rows_in: int, n_rows_out: int, train_beta: float = 2, test_beta: float = 2) → tuple[ndarray, ndarray][source]

Generate train and test prediction arrays, used when computing baseline.

Parameters:
n_rows_in : int

number of rows of in-sample (training) probabilities

n_rows_out : int

number of rows of out-of-sample (testing) probabilities

train_beta : float

beta value for generating train probabilities

test_beta : float

beta value for generating test probabilities

Returns:
train_preds : np.ndarray

Array of train predictions (n_rows x 2 columns)

test_preds : np.ndarray

Array of test predictions (n_rows x 2 columns)
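
A short sketch combining generate_arrays with attack_from_preds to run the attack on synthetic baseline probabilities (parameter values are arbitrary):

>>> from aisdc.attacks.worst_case_attack import WorstCaseAttack
>>> wca = WorstCaseAttack(n_reps=5)
>>> train_preds, test_preds = wca.generate_arrays(1000, 1000, train_beta=5, test_beta=2)
>>> wca.attack_from_preds(train_preds, test_preds)
>>> output = wca.make_report()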

make_dummy_data() → None[source]

Makes dummy data for testing functionality.

Parameters:
args : dict

Command line arguments

Returns:

Notes

Returns nothing but saves two .csv files

make_report() → dict[source]

Creates output dictionary structure and generates pdf and json outputs if filenames are given.

run_attack_reps(train_preds: ndarray, test_preds: ndarray, train_correct: ndarray | None = None, test_correct: ndarray | None = None) → dict[source]

Run actual attack reps from train and test predictions.

Parameters:
train_preds : np.ndarray

predictions from the model on training (in-sample) data

test_preds : np.ndarray

predictions from the model on testing (out-of-sample) data

Returns:
mia_metrics_dict : dict

a dictionary with two items: mia_metrics (a list of metric dictionaries, one per repetition) and failfast_metric_summary (an object of the FailFast class maintaining a summary of attack success/failure for the metric used by the fail-fast option)

aisdc.attacks.worst_case_attack.main()[source]

Main method to parse arguments and invoke relevant method.

Code for automatic report generation.

class aisdc.attacks.report.NumpyArrayEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)[source]

Json encoder that can cope with numpy arrays.

Methods

default(o)

If an object is an np.ndarray, convert to list.

encode(o)

Return a JSON string representation of a Python data structure.

iterencode(o[, _one_shot])

Encode the given object and yield each string representation as available.

default(o)[source]

If an object is an np.ndarray, convert to list.
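
A small illustration of using the encoder with the standard json module (the payload is arbitrary):

>>> import json
>>> import numpy as np
>>> from aisdc.attacks.report import NumpyArrayEncoder
>>> report_json = json.dumps({"scores": np.array([0.1, 0.9])}, cls=NumpyArrayEncoder)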

aisdc.attacks.report.add_output_to_pdf(report_dest: str, pdf_report: FPDF, attack_type: str) → None[source]

Creates a pdf, appending the contents if the file already exists.

aisdc.attacks.report.create_json_report(output)[source]

Create a report in json format for ingestion by other tools.

aisdc.attacks.report.create_lr_report(output: dict) → FPDF[source]

Make a lira membership inference report.

Parameters:
output : dict

dictionary with the following items

metadata: dict

dictionary of metadata

attack_experiment_logger: dict

list of metrics as dictionary items for an experiment. In the case of the LIRA attack scenario, this will have dictionary items of attack_instance_logger, which will contain a single metrics dictionary

Returns:
pdf : fpdf.FPDF

fpdf document object

aisdc.attacks.report.create_mia_report(attack_output: dict) → FPDF[source]

Make a worst case membership inference report.

Parameters:
attack_output : dict

dictionary with the following items

metadata: dict

dictionary of metadata

attack_experiment_logger: dict

list of metrics as dictionary items for an experiment

dummy_attack_experiment_logger: dict

list of metrics as dictionary items across dummy experiments

Returns:
pdf : fpdf.FPDF

fpdf document object
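
A hedged sketch of producing and saving the pdf (assuming output is a dictionary with the items listed above, for example the dictionary returned by WorstCaseAttack.make_report):

>>> from aisdc.attacks.report import create_mia_report
>>> pdf = create_mia_report(output)
>>> pdf.output("worst_case_report.pdf")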

aisdc.attacks.report.line(pdf, text, indent=0, border=0, font_size=11, font_style='', font='arial')[source]

Write a standard block.

aisdc.attacks.report.subtitle(pdf, text, indent=10, border=0, font_size=12, font_style='B')[source]

Write a subtitle block.

aisdc.attacks.report.title(pdf, text, border=0, font_size=24, font_style='B')[source]

Write a title block.

Likelihood testing scenario from https://arxiv.org/pdf/2112.03570.pdf.

class aisdc.attacks.likelihood_attack.DummyClassifier[source]

A Dummy Classifier to allow this code to work with get_metrics.

Methods

predict(test_X)

Return an array of 1/0 depending on value in second column.

predict_proba(test_X)

Simply return the test_X.

predict(test_X)[source]

Return an array of 1/0 depending on value in second column.

predict_proba(test_X)[source]

Simply return the test_X.

class aisdc.attacks.likelihood_attack.LIRAAttack(n_shadow_models: int = 100, p_thresh: float = 0.05, output_dir: str = 'outputs_lira', report_name: str = 'report_lira', training_data_filename: str | None = None, test_data_filename: str | None = None, training_preds_filename: str | None = None, test_preds_filename: str | None = None, target_model: list | None = None, target_model_hyp: dict | None = None, attack_config_json_file_name: str | None = None, n_shadow_rows_confidences_min: int = 10, shadow_models_fail_fast: bool = False, target_path: str | None = None)[source]

The main LIRA Attack class.

Methods

attack(target)

Programmatic attack running. Runs a LIRA attack from a Target object and a target model.

attack_from_config()

Runs an attack based on the args parsed from the command line.

example()

Runs an example attack using data from sklearn.

get_params()

Get parameters for this attack.

make_report()

Create the report.

run_scenario_from_preds(shadow_clf, ...)

Implements the likelihood test, using the "offline" version. See p.6 (top of second column) of the paper for details.

setup_example_data()

Method to create example data and save (including config).

attack(target: Target) → None[source]

Programmatic attack running. Runs a LIRA attack from a Target object and a target model.

Parameters:
target : attacks.target.Target

target as an instance of the Target class. Needs to have x_train, x_test, y_train and y_test set.
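
A minimal sketch of the programmatic entry point (assuming target is an attacks.target.Target that wraps the trained target model and has x_train, x_test, y_train and y_test set, as required above):

>>> from aisdc.attacks.likelihood_attack import LIRAAttack
>>> lira = LIRAAttack(n_shadow_models=100, output_dir="outputs_lira")
>>> lira.attack(target)
>>> output = lira.make_report()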

attack_from_config() → None[source]

Runs an attack based on the args parsed from the command line.

example() → None[source]

Runs an example attack using data from sklearn.

Generates example data, trains a classifier and runs the attack.

make_report() → dict[source]

Create the report.

Creates the output report. If self.args.report_name is not None, it will also save the information in json and pdf formats

Returns:
output : Dict

Dictionary containing all attack output

run_scenario_from_preds(shadow_clf: BaseEstimator, X_target_train: Iterable[float], y_target_train: Iterable[float], target_train_preds: Iterable[float], X_shadow_train: Iterable[float], y_shadow_train: Iterable[float], shadow_train_preds: Iterable[float]) → tuple[ndarray, ndarray, BaseEstimator][source]

Implements the likelihood test, using the “offline” version. See p.6 (top of second column) of the paper for details.

Parameters:
shadow_clf : sklearn.Model

An sklearn classifier that will be trained to form the shadow model. All hyper-parameters should have been set.

X_target_train : np.ndarray

Data that was used to train the target model

y_target_train : np.ndarray

Labels that were used to train the target model

target_train_preds : np.ndarray

Array of predictions produced by the target model on the training data

X_shadow_train : np.ndarray

Data that will be used to train the shadow models

y_shadow_train : np.ndarray

Labels that will be used to train the shadow model

shadow_train_preds : np.ndarray

Array of predictions produced by the target model on the shadow data

Returns:
mia_scores : np.ndarray

Attack probabilities of belonging to the training set or not

mia_labels : np.ndarray

True labels of belonging to the training set or not

mia_cls : DummyClassifier

A DummyClassifier that directly returns the scores for compatibility with code in metrics.py

Examples

>>> from sklearn.datasets import load_breast_cancer
>>> from sklearn.ensemble import RandomForestClassifier
>>> from sklearn.model_selection import train_test_split
>>> X, y = load_breast_cancer(return_X_y=True, as_frame=False)
>>> train_X, test_X, train_y, test_y = train_test_split(
>>>   X, y, test_size=0.5, stratify=y
>>> )
>>> rf = RandomForestClassifier(min_samples_leaf=1, min_samples_split=2)
>>> rf.fit(train_X, train_y)
>>> lira = LIRAAttack(n_shadow_models=100)
>>> mia_test_probs, mia_test_labels, mia_clf = lira.run_scenario_from_preds(
>>>     RandomForestClassifier(min_samples_leaf=1, min_samples_split=2, max_depth=10),
>>>     train_X,
>>>     train_y,
>>>     rf.predict_proba(train_X),
>>>     test_X,
>>>     test_y,
>>>     rf.predict_proba(test_X),
>>> )

setup_example_data() → None[source]

Method to create example data and save (including config). Intended to allow users to see how they would need to set up their own data.

Generates train and test data .csv files, train and test predictions .csv files and a config.json file that can be used to run the attack from the command line.

aisdc.attacks.likelihood_attack.main()[source]

Main method to parse args and invoke relevant code.