Attribute Attack

The attribute inference attack assumes that the attacker has access to a record that is missing the value of one attribute. It measures whether the trained model allows more (and more accurate) predicted completions for records in the training set than for records in the test set. For each record, the attack exhaustively searches the target model’s predictive confidence over all possible values that complete the record (continuous attributes are discretised). The attack model then makes a prediction if a single missing value (categorical) or a single unbroken range of values (continuous) yields the highest target confidence and that confidence is above a user-defined threshold; otherwise it reports “don’t know”.
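The search described above can be sketched for a categorical attribute as follows. This is a minimal illustration, not the sacroml implementation: the function `infer_categorical`, the `confidence` callable standing in for the target model, and the toy confidence values are all assumptions made for the example.

```python
from typing import Callable, Hashable, Optional, Sequence


def infer_categorical(
    record: dict,
    attribute: str,
    candidates: Sequence[Hashable],
    confidence: Callable[[dict], float],
    threshold: float,
) -> Optional[Hashable]:
    """Return the inferred value of `attribute`, or None for "don't know"."""
    # Exhaustive search: score every candidate completion of the record.
    scores = {value: confidence({**record, attribute: value}) for value in candidates}
    best = max(scores.values())
    winners = [value for value, score in scores.items() if score == best]
    # Predict only when a unique maximum clears the threshold.
    if len(winners) == 1 and best > threshold:
        return winners[0]
    return None


# Toy stand-in for the target model's confidence (hypothetical values).
def toy_confidence(completed: dict) -> float:
    return {"red": 0.55, "green": 0.92, "blue": 0.60}[completed["colour"]]


partial = {"age": 41}  # record with the "colour" attribute missing
guess = infer_categorical(
    partial, "colour", ["red", "green", "blue"], toy_confidence, threshold=0.8
)
```

Raising the threshold above the best confidence, or introducing a tie for the maximum, makes the sketch return `None`, which corresponds to the “don’t know” outcome.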

The attack computes an upper bound on the fraction of records that are vulnerable, i.e., where the attack makes a correct prediction, and reports the Attribute Risk Ratio, ARR(a): the ratio of the training set proportion to the test set proportion for each attribute a. The attack is considered accurate if the target model’s predicted label \(l^*\) for the record with a single missing value is the same as for the actual record value \(l\) (categorical) or the range of values yielding the same target confidence lies within \(l\pm10\%\) (continuous). This latter condition mirrors the protection limits commonly used in cell suppression algorithms; see, for example, Smith et al. [1]. The ARR metric recognises that any useful trained model contains some generalisable information, and so it only considers the model to be leaking privacy if ARR(a) > 1. It also recognises that not all attributes will be considered equally disclosive, and so enables a discussion between TRE staff and researchers.
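As a rough sketch of the two checks above, assuming the vulnerable records have already been counted for each split: the function names, the handling of a zero test proportion, and the example counts are hypothetical and are not taken from sacroml.

```python
import math


def attribute_risk_ratio(
    train_correct: int, n_train: int, test_correct: int, n_test: int
) -> float:
    """Ratio of the correctly inferred proportion in training to that in test."""
    train_fraction = train_correct / n_train
    test_fraction = test_correct / n_test
    if test_fraction == 0:
        # Hypothetical convention: infinite ratio if only training records leak.
        return math.inf if train_fraction > 0 else math.nan
    return train_fraction / test_fraction


def range_within_tolerance(
    lower: float, upper: float, actual: float, tolerance: float = 0.10
) -> bool:
    """Check that the inferred range [lower, upper] lies within actual +/- 10%."""
    margin = abs(actual) * tolerance
    return actual - margin <= lower and upper <= actual + margin


# Made-up counts: 30 of 200 training records and 10 of 100 test records inferred.
arr = attribute_risk_ratio(train_correct=30, n_train=200, test_correct=10, n_test=100)
leaks = arr > 1  # the model is only considered to leak when ARR(a) > 1
```

With these counts the training proportion is 0.15 and the test proportion is 0.10, so the ratio is roughly 1.5 and the attribute would be flagged for discussion.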

Usage

To run the attribute attack, the sacroml.attacks.target.Target that is passed to the attribute attack object must include the feature encoding and the original unprocessed data, in addition to the usual processed data splits.

See the examples:

Training a model and including all required information.
Running an attribute inference attack programmatically.
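A minimal sketch of the call sequence, using only the constructor arguments and methods listed in the API reference on this page. It assumes `target` is an already populated Target; how that object is built is not shown here. The wrapper `run_attribute_attack`, its `attack_cls` parameter (present only so the sketch can be exercised with a stand-in class), and the choice to return `None` for an unsuitable target are all hypothetical.

```python
def run_attribute_attack(target, attack_cls=None, output_dir="outputs"):
    """Run the attack on a prepared Target; return its result dict or None."""
    if attack_cls is None:
        # Imported lazily so the sketch can be read and tested without sacroml.
        from sacroml.attacks.attribute_attack import AttributeAttack

        attack_cls = AttributeAttack
    # attackable() is a classmethod, so it can be called before construction.
    if not attack_cls.attackable(target):
        return None
    attack = attack_cls(output_dir=output_dir, write_report=False, n_cpu=1)
    return attack.attack(target)
```

Calling `run_attribute_attack(target)` with a suitable Target would be expected to return the dictionary produced by `attack`; setting `write_report=True` instead would also request the JSON and PDF report.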

References

[1] J. E. Smith, A. R. Clark, A. T. Staggemeier, and M. C. Serpell. A genetic approach to statistical disclosure control. IEEE Transactions on Evolutionary Computation, 16(3):431–441, June 2012. doi:10.1109/TEVC.2011.2159271.

API Reference

Attribute inference attacks.

class sacroml.attacks.attribute_attack.AttributeAttack(output_dir: str = 'outputs', write_report: bool = True, n_cpu: int = 3)

Attribute inference attack.

Methods

`attack`(target)	Check whether an attack can be performed and run the attack.
`attackable`(target)	Return whether a target can be assessed with AttributeAttack.
`get_params`()	Get parameters for this attack.

classmethod attackable(target: Target) → bool: Return whether a target can be assessed with AttributeAttack.

__init__(output_dir: str = 'outputs', write_report: bool = True, n_cpu: int = 3) → None

Construct an object to execute an attribute inference attack.

Parameters:

n_cpu (int): Number of CPUs used to run the attack.
output_dir (str): Name of the directory where outputs are stored.
write_report (bool): Whether to generate a JSON and PDF report.

attack(target: Target) → dict: Check whether an attack can be performed and run the attack.

get_params() → dict

Get parameters for this attack.

Returns:

params (dict): Parameter names mapped to their values.

sacroml.attacks.attribute_attack.plot_categorical_fraction(res: dict, path: str = '') → None

Generate a bar chart showing fraction of dataset inferred.

Parameters:

res (dict): Dictionary containing attribute inference attack results.
path (str): Directory to write plots.

sacroml.attacks.attribute_attack.plot_categorical_risk(res: dict, path: str = '') → None

Generate a bar chart showing categorical risk scores.

Parameters:

res (dict): Dictionary containing attribute inference attack results.
path (str): Directory to write plots.

sacroml.attacks.attribute_attack.plot_quantitative_risk(res: dict, path: str = '') → None

Generate a bar chart showing quantitative value risk scores.

Parameters:

res (dict): Dictionary containing attribute inference attack results.
path (str): Directory to write plots.

sacroml.attacks.attribute_attack.report_categorical(results: dict) → str: Return a string report of the categorical results.

sacroml.attacks.attribute_attack.report_quantitative(results: dict) → str: Return a string report of the quantitative results.