Attribute Attack
The attribute inference attack assumes that the attacker has access to a record that is missing the value of one attribute. It measures whether the trained model allows more (and more accurate) predicted completions for records in the training set than for records in the test set. For each such record, the attack exhaustively queries the target model’s predictive confidence for every possible value that completes the record (continuous attributes are discretised). The attack model makes a prediction only if a single value (categorical) or a single unbroken range of values (continuous) yields the highest target confidence and that confidence is above a user-defined threshold; otherwise it reports “don’t know”.
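The search described above can be sketched for a categorical attribute as follows. This is a minimal illustration under stated assumptions, not the library's implementation: `infer_categorical`, `toy_confidence`, and the record layout are hypothetical names introduced here, and the target model is reduced to a function that returns a single confidence for a completed record.

```python
from typing import Callable, Hashable, Optional, Sequence

# Hypothetical stand-in for the target model: completed record -> confidence.
ConfidenceFn = Callable[[dict], float]


def infer_categorical(
    record: dict,
    attribute: str,
    candidates: Sequence[Hashable],
    confidence: ConfidenceFn,
    threshold: float,
) -> Optional[Hashable]:
    """Return the inferred value of `attribute`, or None for "don't know".

    A value is returned only if exactly one candidate attains the highest
    confidence and that confidence is above `threshold`.
    """
    best_conf = float("-inf")
    best_values = []
    for value in candidates:  # exhaustive search over all completions
        completed = {**record, attribute: value}
        conf = confidence(completed)
        if conf > best_conf:
            best_conf, best_values = conf, [value]
        elif conf == best_conf:
            best_values.append(value)  # tie: the maximum is not unique
    if len(best_values) == 1 and best_conf > threshold:
        return best_values[0]
    return None


def toy_confidence(rec: dict) -> float:
    """Toy model that is over-confident on one memorised training record."""
    memorised = {"age_band": "40-49", "smoker": "yes"}
    return 0.95 if rec == memorised else 0.6


# The memorised record is completed; the unseen one yields a tie ("don't know").
guess_train = infer_categorical(
    {"age_band": "40-49"}, "smoker", ["yes", "no"], toy_confidence, threshold=0.9
)
guess_test = infer_categorical(
    {"age_band": "20-29"}, "smoker", ["yes", "no"], toy_confidence, threshold=0.9
)
```

In this toy setting only the memorised training record is completed, which is the asymmetry between training and test data that the attack is designed to detect.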
The attack computes an upper bound on the fraction of records that are vulnerable, i.e., where the attack makes a correct prediction, and reports the Attribute Risk Ratio, ARR(a): for each attribute a, the ratio of the proportion of vulnerable records in the training set to the proportion in the test set. The attack is considered accurate if the target model’s predicted label \(l^*\) for the record with a single missing value is the same as for the actual record value \(l\) (categorical) or the range of values yielding the same target confidence lies within \(l\pm10\%\) (continuous). This latter condition mirrors the protection limits commonly used in cell suppression algorithms; see, for example, Smith et al. [1]. The ARR metric recognises that any useful trained model contains some generalisable information, and so only considers the model to be leaking private information if ARR(a)>1. It also recognises that not all attributes will be considered equally disclosive; reporting the risk per attribute enables a discussion between TRE staff and researchers.
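As a rough illustration of the two quantities above, the sketch below checks the continuous accuracy condition and computes ARR(a) from per-record outcomes. The function names are hypothetical, the ±10% is assumed to be relative to the true value, and the handling of a zero test-set proportion is a choice made here rather than documented behaviour.

```python
import math
from typing import Sequence


def within_protection_limit(
    lower: float, upper: float, true_value: float, tolerance: float = 0.10
) -> bool:
    """True if the inferred range [lower, upper] lies within true_value ± tolerance.

    Assumption: the tolerance is a fraction of the true value's magnitude.
    """
    margin = abs(true_value) * tolerance
    return true_value - margin <= lower and upper <= true_value + margin


def attribute_risk_ratio(
    train_correct: Sequence[bool], test_correct: Sequence[bool]
) -> float:
    """ARR(a): proportion of correctly inferred training records divided by
    the proportion of correctly inferred test records."""
    train_frac = sum(train_correct) / len(train_correct)
    test_frac = sum(test_correct) / len(test_correct)
    if test_frac == 0:
        # Assumption: undefined when neither set is vulnerable, else unbounded.
        return math.inf if train_frac > 0 else math.nan
    return train_frac / test_frac


# 3 of 4 training records inferred correctly against 1 of 4 test records.
arr = attribute_risk_ratio([True, True, True, False], [True, False, False, False])
leaking = arr > 1  # the model is only considered to be leaking if ARR(a) > 1
```

With these toy outcomes the training proportion is 0.75 and the test proportion is 0.25, so ARR(a) is 3.0 and the attribute would be flagged for discussion.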
Usage
To run the attribute attack, the sacroml.attacks.target.Target that is passed to the attribute attack object must include the feature encoding and the original unprocessed data, in addition to the usual processed data splits.
See the examples:
References
[1] J. E. Smith, A. R. Clark, A. T. Staggemeier, and M. C. Serpell. A genetic approach to statistical disclosure control. IEEE Transactions on Evolutionary Computation, 16(3):431–441, June 2012. doi:10.1109/TEVC.2011.2159271.
API Reference
Attribute inference attacks.
- class sacroml.attacks.attribute_attack.AttributeAttack(output_dir: str = 'outputs', write_report: bool = True, n_cpu: int = 3)
Attribute inference attack.
Methods
- attack(target): Check whether an attack can be performed and run the attack.
- attackable(target): Return whether a target can be assessed with AttributeAttack.
- get_params(): Get parameters for this attack.
- classmethod attackable(target: Target) -> bool
Return whether a target can be assessed with AttributeAttack.
- __init__(output_dir: str = 'outputs', write_report: bool = True, n_cpu: int = 3) -> None
Construct an object to execute an attribute inference attack.
- Parameters:
- n_cpu : int
Number of CPUs used to run the attack.
- output_dir : str
Name of the directory where outputs are stored.
- write_report : bool
Whether to generate a JSON and PDF report.
- get_params() -> dict
Get parameters for this attack.
- Returns:
- params : dict
Parameter names mapped to their values.
- sacroml.attacks.attribute_attack.plot_categorical_fraction(res: dict, path: str = '') -> None
Generate a bar chart showing fraction of dataset inferred.
- Parameters:
- res : dict
Dictionary containing attribute inference attack results.
- path : str
Directory to write plots.
- sacroml.attacks.attribute_attack.plot_categorical_risk(res: dict, path: str = '') -> None
Generate a bar chart showing categorical risk scores.
- Parameters:
- res : dict
Dictionary containing attribute inference attack results.
- path : str
Directory to write plots.
- sacroml.attacks.attribute_attack.plot_quantitative_risk(res: dict, path: str = '') -> None
Generate a bar chart showing quantitative value risk scores.
- Parameters:
- res : dict
Dictionary containing attribute inference attack results.
- path : str
Directory to write plots.