Metrics

Calculate metrics.

sacroml.metrics.auc_p_val(auc: float, n_pos: int, n_neg: int) → tuple[float, float]

Compute the p-value for a given AUC.

Parameters:
auc : float

Observed AUC value

n_pos : int

Number of positive examples

n_neg : int

Number of negative examples

Returns:
auc_p : float

p-value of observing an AUC > auc by chance

auc_std : float

standard deviation of the null AUC distribution (mean = 0.5)
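
Examples

A usage sketch. As an illustration of the underlying idea (an assumption about the exact implementation, which may differ in detail), the p-value can be obtained from a Gaussian approximation to the null (AUC = 0.5) distribution of the Mann-Whitney statistic, whose standard deviation is sqrt((n_pos + n_neg + 1) / (12 * n_pos * n_neg)):

>>> from sacroml.metrics import auc_p_val
>>> auc_p, auc_std = auc_p_val(auc=0.65, n_pos=100, n_neg=100)
>>> # Hand computation under the assumed Gaussian null (may differ from the packaged result):
>>> import numpy as np
>>> from scipy.stats import norm
>>> null_std = np.sqrt((100 + 100 + 1) / (12 * 100 * 100))
>>> p_val = 1 - norm.cdf(0.65, loc=0.5, scale=null_std)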

sacroml.metrics.get_metrics(y_pred_proba: ndarray, y_test: ndarray, permute_rows: bool = True) → dict

Calculate metrics, including attacker advantage, for a binary membership inference attack (MIA).

Advantage is implemented as Definition 4 of https://arxiv.org/pdf/1709.01604.pdf, which is also implemented in TensorFlow Privacy (https://github.com/tensorflow/privacy).

Parameters:
y_test : np.ndarray

Test data labels.

y_pred_proba : np.ndarray of shape [x, 2] and type float

Predicted probabilities.

permute_rows : bool, default True

Whether to permute arrays, see: https://github.com/AI-SDC/SACRO-ML/issues/106

Returns:
metrics : dict

dictionary of metric values

Notes

Includes the following metrics:

  • True positive rate or recall (TPR).

  • False positive rate (FPR), proportion of negative examples incorrectly classified as positives.

  • False alarm rate (FAR), proportion of objects classified as positives that are incorrect, also known as false discovery rate.

  • True negative rate (TNR).

  • Positive predictive value or precision (PPV).

  • Negative predictive value (NPV).

  • False negative rate (FNR).

  • Accuracy (ACC).

  • F1 Score - harmonic mean of precision and recall.

  • Advantage.
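
Examples

A usage sketch with randomly generated data, assuming the returned dictionary is keyed by the abbreviations listed above (e.g. "TPR", "FPR", "ACC"); check the keys returned by your installed version:

>>> import numpy as np
>>> from sacroml.metrics import get_metrics
>>> rng = np.random.default_rng(0)
>>> y_test = rng.integers(0, 2, size=200)                    # true membership labels
>>> p_member = rng.random(200)                               # predicted P(member)
>>> y_pred_proba = np.column_stack((1 - p_member, p_member)) # shape [n, 2], as required
>>> metrics = get_metrics(y_pred_proba, y_test)
>>> sorted(metrics)                                          # inspect the available keys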

sacroml.metrics.min_max_disc(y_true: ndarray, pred_probs: ndarray, x_prop: float = 0.1, log_p: bool = True) → tuple[float, float, float, float]

Return non-average-case metrics for membership inference attacks (MIA).

Considers the actual frequency of membership amongst the samples with the highest- and lowest-assessed probability of membership. If an MIA method confidently asserts that 5% of samples are members and 5% of samples are not, but cannot tell for the remaining 90% of samples, then these metrics will flag this behaviour, but AUC/advantage may not. Since the difference may be noisy, a p-value against a null of independence of true membership and assessed membership probability (that is, membership probabilities are essentially random) is also used as a metric (using the usual Gaussian approximation to the binomial). If the p-value is low and the frequency difference is high (>0.5) then the MIA attack is successful for some samples. A worked sketch of this computation follows the example below.

Parameters:
y_true : np.ndarray

true labels

pred_probs : np.ndarray

probabilities of labels, or monotonic transformation of probabilities

x_prop : float

proportion of samples with the highest and lowest probability of membership to be considered

log_p : bool

convert p-values to log(p).

Returns:
maxd : float

frequency of y_true = 1 amongst the proportion x_prop of individuals with the highest assessed membership probability

mind : float

frequency of y_true = 1 amongst the proportion x_prop of individuals with the lowest assessed membership probability

mmd : float

difference between maxd and mind

pval : float

p-value or log-p value corresponding to mmd against the null hypothesis that the random variables corresponding to y_true and pred_probs are independent.

Examples

>>> import numpy as np
>>> from sacroml.metrics import min_max_disc
>>> y_true = np.random.choice(2, 100)
>>> pred_probs = np.random.rand(100)
>>> maxd, mind, mmd, pval = min_max_disc(y_true, pred_probs, x_prop=0.2, log_p=True)
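
Continuing the example above, and as referenced in the description, a worked sketch of the computation these metrics describe, under two stated assumptions: ties in pred_probs are broken arbitrarily, and the p-value uses a Gaussian approximation to the binomial under the null that membership is independent of the assessed probabilities (the packaged implementation may differ in detail):

>>> from scipy.stats import norm
>>> n = int(0.2 * len(y_true))                         # size of each tail (x_prop = 0.2)
>>> order = np.argsort(pred_probs)                     # ascending assessed probability
>>> mind = y_true[order[:n]].mean()                    # membership frequency, lowest tail
>>> maxd = y_true[order[-n:]].mean()                   # membership frequency, highest tail
>>> mmd = maxd - mind
>>> base = y_true.mean()                               # overall membership frequency
>>> null_std = np.sqrt(2 * base * (1 - base) / n)      # assumed null std of the difference
>>> log_pval = norm.logsf(mmd, loc=0, scale=null_std)  # log p-value, as with log_p=True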