LiRA Attack
Likelihood testing scenario from https://arxiv.org/pdf/2112.03570.pdf.
See p.6 (top of second column) for details.
With mode “offline”, we measure the probability of observing a confidence as high as the target model’s under the null hypothesis that the target point is a non-member. That is, we use the normal CDF.
With mode “offline-carlini”, we measure the probability that a target point did not come from the non-member distribution. That is, we use Carlini’s implementation with a single normal (log) PDF.
With mode “online-carlini”, we use Carlini’s implementation of the standard likelihood ratio test, measuring the ratio of the probabilities that the sample came from each of the two distributions. That is, the score is pr_in minus pr_out, each computed from a normal (log) PDF.
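The three modes can be sketched with the standard library alone. This is an illustration of the statistics described above, not the sacroml implementation: the function names, the use of population standard deviations, the small guard against zero variance, and the sign conventions (a larger Carlini-style score meaning "more member-like") are all assumptions made for this example.

```python
import math
from statistics import fmean, pstdev


def normal_logpdf(x, mu, sigma):
    """Log density of a normal N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of N(mu, sigma^2) at x."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def lira_scores(target_conf, out_confs, in_confs):
    """Illustrative scores for one record (hypothetical helper).

    out_confs: confidences from shadow models trained WITHOUT the record.
    in_confs:  confidences from shadow models trained WITH the record.
    """
    eps = 1e-30  # assumed guard so a zero spread does not divide by zero
    mu_out, sd_out = fmean(out_confs), pstdev(out_confs) + eps
    mu_in, sd_in = fmean(in_confs), pstdev(in_confs) + eps

    # "offline": upper-tail probability of a confidence at least this high
    # under the non-member (null) distribution.
    p_value = 1.0 - normal_cdf(target_conf, mu_out, sd_out)

    # "offline-carlini": a single normal log PDF under the non-member fit;
    # negated so that an unlikely non-member gets a larger score.
    offline_carlini = -normal_logpdf(target_conf, mu_out, sd_out)

    # "online-carlini": log likelihood ratio of member vs non-member fits.
    online_carlini = (
        normal_logpdf(target_conf, mu_in, sd_in)
        - normal_logpdf(target_conf, mu_out, sd_out)
    )
    return {
        "offline": p_value,
        "offline-carlini": offline_carlini,
        "online-carlini": online_carlini,
    }
```

For example, `lira_scores(4.0, [0.0, 2.0], [3.0, 5.0])` fits a non-member distribution with mean 1 and a member distribution with mean 4 (both with unit spread), so a confidence of 4.0 gets a small offline p-value and a positive log likelihood ratio.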
- class sacroml.attacks.likelihood_attack.LIRAAttack(output_dir: str = 'outputs', write_report: bool = True, n_shadow_models: int = 100, p_thresh: float = 0.05, mode: str = 'offline', fix_variance: bool = False, report_individual: bool = False)
The main LiRA Attack class.
Methods
attack(target): Check whether an attack can be performed and run the attack.
attackable(target): Return whether a target can be assessed with LIRAAttack.
get_params(): Get parameters for this attack.
- classmethod attackable(target: Target) -> bool
Return whether a target can be assessed with LIRAAttack.
- __init__(output_dir: str = 'outputs', write_report: bool = True, n_shadow_models: int = 100, p_thresh: float = 0.05, mode: str = 'offline', fix_variance: bool = False, report_individual: bool = False) -> None
Construct an object to execute a LiRA attack.
- Parameters:
- output_dir : str
Name of the directory where outputs are stored.
- write_report : bool
Whether to generate a JSON and PDF report.
- n_shadow_models : int
Number of shadow models to be trained.
- p_thresh : float
Significance threshold applied to reported statistics, for instance auc_p_value and pdif_vals.
- mode : str
Attack mode, one of {“offline”, “offline-carlini”, “online-carlini”}.
- fix_variance : bool
Whether to use a single global standard deviation (True) or a separate standard deviation per record (False).
- report_individual : bool
Whether to report metrics for each individual record.
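The difference that fix_variance makes can be illustrated with a short sketch. It assumes the non-member shadow confidences are held as one list per record and uses the population standard deviation. The helper name and data layout are hypothetical and are not part of the sacroml API.

```python
from statistics import pstdev


def out_std_per_record(shadow_confs, fix_variance):
    """Return one standard deviation per record (hypothetical helper).

    shadow_confs: list of lists; shadow_confs[i] holds the non-member
    shadow-model confidences observed for record i.
    """
    if fix_variance:
        # Global: pool every confidence across all records and reuse
        # the single resulting standard deviation for each record.
        pooled = [c for record in shadow_confs for c in record]
        global_std = pstdev(pooled)
        return [global_std] * len(shadow_confs)
    # Per record: each record gets its own estimate, which can be noisy
    # (or zero) when only a few shadow models are available.
    return [pstdev(record) for record in shadow_confs]
```

A global estimate is generally more stable when n_shadow_models is small, while per-record estimates can capture records whose confidence varies unusually much across shadow models.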
- get_params() -> dict
Get parameters for this attack.
- Returns:
- params : dict
Parameter names mapped to their values.