{ "cells": [ { "cell_type": "markdown", "id": "b0feddf2-7437-4548-8add-f07822e2792d", "metadata": { "tags": [] }, "source": [ "# Drug response prediction for cancer patients\n", "\n", "Scientists have created a model to predict whether or not cancer patients will respond to a drug.\n", "\n", "The same scientists published the details of their research: how the model was built, a detailed description of the data (e.g., the health conditions investigated), and the NHS board where the data was collected. The data was de-identified and was not released, as it is confidential patient information and any leak might breach existing legislation.\n", "\n", "The researchers weighed the benefits against the potential risks of releasing the model and decided that, overall, making the model public offers a clear benefit to the population.\n", "\n", "What they didn\u2019t realise is that the NHS board in question is home to a famous Member of Parliament (MP), who is a former Prime Minister. There had been some speculation that the MP had cancer, but this is not in the public domain.\n", "\n", "## Membership Inference\n", "\n", "We will use this example to demonstrate a _membership inference_ attack. In such an attack, an attacker has access to information about a particular individual (maybe they are famous) and attempts to find out whether that individual's data was used to train the model. In this case, knowing that they were in the training set would be disclosive, as it would reveal that they had indeed suffered from cancer (all people in the training set had cancer)." ] }, { "cell_type": "markdown", "id": "33ffc813-c9e5-43c7-8a32-2bb2aed3b5f5", "metadata": {}, "source": [ "## Let's get hands-on with this example.\n", "\n", "The following code imports some standard libraries that we will need." 
] }, { "cell_type": "code", "execution_count": 1, "id": "63720e84-26f9-4763-9f20-18bad396ba4b", "metadata": {}, "outputs": [], "source": [ "import random\n", "\n", "import numpy as np\n", "\n", "np.random.seed(1234)\n", "random.seed(12345)\n", "\n", "import pandas as pd\n", "from scipy.stats import poisson\n", "from sklearn.model_selection import train_test_split\n", "from sklearn.svm import SVC" ] }, { "cell_type": "markdown", "id": "9c7acbc2-1970-4259-90b9-dda459bf85a4", "metadata": { "tags": [] }, "source": [ "## Create the original model\n", "\n", "We are assuming that a model is trained within a TRE on real data. However, we do not have access to real data, so we will randomly generate some realistic looking data.\n", "\n", "In particular, we will generate data for 200 people: all are cancer patients, 100 responded well to the drug, and 100 did not. Our MP will be one of the patients in the good responders set.\n", "\n", "For each patient, we generate six values that in reality would be extracted from their electronic health records:\n", "1. `diabetes` -- whether or not the patient suffers from diabetes (1 = yes, 0 = no)\n", "1. `asthma` -- whether or not the patient suffers from asthma (1 = yes, 0 = no)\n", "1. `bmi_group` -- the BMI group in which the patient falls (1, 2, 3, or 4)\n", "1. `blood_pressure` -- the blood pressure group in which the patient falls (0, 1, 2, 3, 4, or 5)\n", "1. `smoker` -- whether or not the patient is a smoker (1 = yes, 0 = no)\n", "1. 
`age` -- the patient's age\n", "\n", "Each patient is also associated with a value to indicate whether they responded well to the drug (1) or not (0).\n" ] }, { "cell_type": "code", "execution_count": 2, "id": "2f94ec62-eefa-4859-acba-6ccb29f4ebb2", "metadata": {}, "outputs": [], "source": [ "# 1 is responded well, 0 is did not respond; this is our label and what we want to predict.\n", "# Only 99 responders are generated here: the MP is added at the end as the 100th.\n", "response = [1] * 99 + [0] * 100\n", "\n", "df = pd.DataFrame()\n", "\n", "# diabetes 0 no, 1 yes\n", "df[\"diabetes\"] = [[1, 0][random.random() > 0.7] for n in range(99)] + [\n", " [1, 0][random.random() > 0.2] for n in range(100)\n", "]\n", "\n", "# asthma 0 no, 1 yes\n", "df[\"asthma\"] = [[1, 0][random.random() > 0.7] for n in range(99)] + [\n", " [1, 0][random.random() > 0.5] for n in range(100)\n", "]\n", "\n", "# bmi group 1 under, 2 normal, 3 overweight, 4 obese\n", "df[\"bmi_group\"] = [\n", " random.choices([1, 2, 3, 4], weights=[0.5, 5, 7, 5], k=1)[0] for n in range(99)\n", "] + [random.choices([1, 2, 3, 4], weights=[1, 7, 4, 1], k=1)[0] for n in range(100)]\n", "\n", "# blood pressure 0 is low, 1 is normal, 5 is extremely high\n", "df[\"blood_pressure\"] = [\n", " random.choices([0, 1, 2, 3, 4, 5], weights=[0.5, 1, 5, 6, 1, 0.5], k=1)[0]\n", " for n in range(99)\n", "] + [\n", " random.choices([0, 1, 2, 3, 4, 5], weights=[0.5, 5, 5, 1, 1, 0.5], k=1)[0]\n", " for n in range(100)\n", "]\n", "\n", "# smoker 0 is non smoker, 1 is smoker\n", "df[\"smoker\"] = [[1, 0][random.random() > 0.8] for n in range(99)] + [\n", " [1, 0][random.random() > 0.2] for n in range(100)\n", "]\n", "\n", "# age: responders are drawn from an older distribution than non-responders\n", "x = np.arange(20, 90)\n", "pmf = poisson.pmf(x, 72)\n", "age = [random.choices(x, weights=pmf, k=1)[0] for n in range(99)]\n", "x = np.arange(20, 90)\n", "pmf = poisson.pmf(x, 55)\n", "age2 = [random.choices(x, weights=pmf, k=1)[0] for n in range(100)]\n", "df[\"age\"] = age + age2\n", "\n", "# Add the MP's data (the MP responded well to the drug)\n", "response = response + [1]\n", "\n", "# add new row to end of DataFrame\n", 
"# the values are, in order: diabetes, asthma, bmi_group, blood_pressure, smoker, age\n", "df.loc[len(df.index)] = [1, 1, 3, 2, 1, 62]" ] }, { "cell_type": "markdown", "id": "011f7237", "metadata": {}, "source": [ "This looks like the kind of data that might exist within a TRE. Here are the first few rows:" ] }, { "cell_type": "code", "execution_count": 3, "id": "978b6754", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ " diabetes asthma bmi_group blood_pressure smoker age\n", "0 1 0 4 3 1 72\n", "1 1 1 2 3 0 83\n", "2 0 0 4 3 1 63\n", "3 1 1 4 3 0 77\n", "4 1 1 4 2 1 87\n" ] } ], "source": [ "print(df.head())" ] }, { "cell_type": "markdown", "id": "65a2663c", "metadata": {}, "source": [ "Our MP is the final row of the data. Here are their values:" ] }, { "cell_type": "code", "execution_count": 4, "id": "34992ca0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "diabetes 1\n", "asthma 1\n", "bmi_group 3\n", "blood_pressure 2\n", "smoker 1\n", "age 62\n", "Name: 199, dtype: int64\n" ] } ], "source": [ "print(df.iloc[199, :])" ] }, { "cell_type": "markdown", "id": "dace3d79-78bc-4fb8-a1ae-2d8587c663b2", "metadata": {}, "source": [ "## Model training\n", "\n", "The researcher trained a particular machine learning model called a Support Vector Machine (SVM). This is a very popular model for tasks in which we want to assign things (in this case patients) to groups (in this case responders v non-responders). The membership inference attack we will perform is not unique to SVMs; we just use them as a popular example.\n", "\n", "Training the model is very straightforward -- just a couple of lines of code (the details are not important)." 
] }, { "cell_type": "code", "execution_count": 5, "id": "d5500d7b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SVC(C=1, gamma=3, probability=True,\n", " random_state=RandomState(MT19937) at 0x159E5341140)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Train the SVM on the full dataset (including the MP's row)\n", "prng = np.random.RandomState(12)\n", "svc = SVC(C=1, gamma=3, probability=True, random_state=prng)\n", "svc.fit(df, response)" ] }, { "cell_type": "markdown", "id": "418f7fa2", "metadata": {}, "source": [ "The trained model can be used to make predictions about new individuals. Given data for an individual, it will produce two scores (probabilities). The first is how likely they are to belong to the non-responders group (higher = more likely) and the second how likely they are to belong to the responders group. The scores always lie between 0 and 1, and sum to 1.\n", "\n", "For example, if we have an individual who has diabetes, has asthma, has a bmi group of 1, blood pressure of 5, is a smoker and is 72 years old, we can use the model to predict whether they belong in the responders or non-responders group:" ] }, { "cell_type": "code", "execution_count": 6, "id": "e21f2890", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "non-responders score = 0.49\n", "responders score = 0.51\n" ] } ], "source": [ "test_example = pd.DataFrame(\n", " {\n", " \"diabetes\": 1,\n", " \"asthma\": 1,\n", " \"bmi_group\": 1,\n", " \"blood_pressure\": 5,\n", " \"smoker\": 1,\n", " \"age\": 72,\n", " },\n", " index=[1],\n", ")\n", "predictions = svc.predict_proba(test_example)\n", "print(f\"non-responders score = {predictions[0][0]:.2f}\")\n", "print(f\"responders score = {predictions[0][1]:.2f}\")" ] }, { "cell_type": "markdown", "id": "01608373", "metadata": {}, "source": [ "## The attack\n", "\n", "We now assume the role of the attacker. 
The attacker knows some general properties about the data -- for example, they know the range of values each variable can take. They also know the configuration of the classifier that was trained (that it was an SVM, along with the parameters used to define it; more on this later). Finally, they know (or can make a good guess at) the input data for the MP (they are famous and this information is perhaps in the public domain). The attacker is going to try to determine, from this information and with access to the trained model, whether or not the MP was in the dataset and hence whether or not they had cancer.\n", "\n", "## How does the attack work?\n", "\n", "Recall that when we used the model to make predictions, the model provided two scores -- the non-responders and responders scores. The more extreme these scores become (e.g. one is close to 1 and the other close to 0; recall that they have to add up to 1), the more _confident_ the model is in assigning that example. It is not uncommon for models to have higher confidence for examples that they were trained on than for examples that they haven't seen before. It is this property that the attacker will make use of.\n", "\n", "In particular, the attacker will generate their own dataset (known as _shadow_ data) that has similar properties to the original. They can do this randomly -- it doesn't matter that it won't be quite right -- all they need to know is the rough ranges of the variables. They will then use some of this data to train their own model (a _shadow_ model). This allows them to see roughly what kind of confidence their model gives to examples it was trained on, and to examples it wasn't trained on. This gives them an idea of how confident the original model is likely to be on data it was trained on, and on data it wasn't trained on. 
Comparing this to the actual confidence obtained when the MP's data is given to the original model will allow them to infer whether the MP was in the training data or not.\n", "\n", "Let's look at that step-by-step...\n", "\n", "Firstly, the attacker presents the MP's data to the original model to see what the model's predictions are...\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "ddbad0bf", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0.06161495 0.93838505]]\n" ] } ], "source": [ "mp_data = pd.DataFrame(\n", " {\n", " \"diabetes\": 1,\n", " \"asthma\": 1,\n", " \"bmi_group\": 3,\n", " \"blood_pressure\": 2,\n", " \"smoker\": 1,\n", " \"age\": 62,\n", " },\n", " index=[1],\n", ")\n", "mp_preds = svc.predict_proba(mp_data)\n", "print(mp_preds)" ] }, { "cell_type": "markdown", "id": "710bbec4", "metadata": {}, "source": [ "The model strongly predicts that the MP is in the responder class. This in itself doesn't tell the attacker that the MP was in the training set. What the attacker needs is an estimate of how confident the model is when presented with examples from the training set, and when presented with examples that are not. This is where the _shadow_ model comes in -- they hope that their shadow model is similar enough to the original that the confidences it gives can be used as a proxy against which to compare these values for the MP." ] }, { "cell_type": "markdown", "id": "3879fa3a", "metadata": {}, "source": [ "The attacker generates their _shadow_ data. There are lots of ways they could do this; in this case they use the same process we used above." 
] }, { "cell_type": "code", "execution_count": 8, "id": "71f854d0", "metadata": {}, "outputs": [], "source": [ "# 1 is responded well, 0 is did not respond; this is our label and what we want to predict.\n", "shadow_response = [1] * 100 + [0] * 100\n", "\n", "shadow_df = pd.DataFrame()\n", "\n", "# diabetes 0 no, 1 yes\n", "shadow_df[\"diabetes\"] = [[1, 0][random.random() > 0.7] for n in range(100)] + [\n", " [1, 0][random.random() > 0.2] for n in range(100)\n", "]\n", "\n", "# asthma 0 no, 1 yes\n", "shadow_df[\"asthma\"] = [[1, 0][random.random() > 0.7] for n in range(100)] + [\n", " [1, 0][random.random() > 0.5] for n in range(100)\n", "]\n", "\n", "# bmi group 1 under, 2 normal, 3 overweight, 4 obese\n", "shadow_df[\"bmi_group\"] = [\n", " random.choices([1, 2, 3, 4], weights=[0.5, 5, 7, 5], k=1)[0] for n in range(100)\n", "] + [random.choices([1, 2, 3, 4], weights=[1, 7, 4, 1], k=1)[0] for n in range(100)]\n", "\n", "# blood pressure 0 is low, 1 is normal, 5 is extremely high\n", "shadow_df[\"blood_pressure\"] = [\n", " random.choices([0, 1, 2, 3, 4, 5], weights=[0.5, 1, 5, 6, 1, 0.5], k=1)[0]\n", " for n in range(100)\n", "] + [\n", " random.choices([0, 1, 2, 3, 4, 5], weights=[0.5, 5, 5, 1, 1, 0.5], k=1)[0]\n", " for n in range(100)\n", "]\n", "\n", "# smoker 0 is non smoker, 1 is smoker\n", "shadow_df[\"smoker\"] = [[1, 0][random.random() > 0.8] for n in range(100)] + [\n", " [1, 0][random.random() > 0.2] for n in range(100)\n", "]\n", "\n", "# age: responders are drawn from an older distribution than non-responders\n", "x = np.arange(20, 90)\n", "pmf = poisson.pmf(x, 72)\n", "age = [random.choices(x, weights=pmf, k=1)[0] for n in range(100)]\n", "x = np.arange(20, 90)\n", "pmf = poisson.pmf(x, 55)\n", "age2 = [random.choices(x, weights=pmf, k=1)[0] for n in range(100)]\n", "shadow_df[\"age\"] = age + age2" ] }, { "cell_type": "markdown", "id": "1d376b8d", "metadata": {}, "source": [ "Now, we split the shadow data into two. 
We will use one set to train the shadow model and hold the other back as data the shadow model has not seen." ] }, { "cell_type": "code", "execution_count": 9, "id": "38d7126d", "metadata": {}, "outputs": [], "source": [ "shadow_train_x, shadow_test_x, shadow_train_y, shadow_test_y = train_test_split(\n", " shadow_df, shadow_response, test_size=0.5\n", ")" ] }, { "cell_type": "markdown", "id": "e770da87", "metadata": {}, "source": [ "And train the shadow model..." ] }, { "cell_type": "code", "execution_count": 10, "id": "22a4e547", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "SVC(C=1, gamma=3, probability=True,\n", " random_state=RandomState(MT19937) at 0x159E5341840)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Train the shadow model with the same configuration as the original\n", "prng = np.random.RandomState(12)\n", "shadow_svc = SVC(C=1, gamma=3, probability=True, random_state=prng)\n", "shadow_svc.fit(shadow_train_x, shadow_train_y)" ] }, { "cell_type": "markdown", "id": "afca4893", "metadata": {}, "source": [ "The attacker now passes the portion of shadow data used for training, and the portion used for testing, through the trained shadow model to extract the model's confidence. For each example, the attacker just needs the highest of the two values (response confidence or non-response confidence, whichever is larger). 
A quick look at the average of these values for the two sets tells us that the shadow model assigns much higher confidence to training examples than to non-training examples." ] }, { "cell_type": "code", "execution_count": 11, "id": "5c4e82be", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Mean of confidence for training examples = 0.92\n", "Mean of confidence for non-training examples = 0.54\n" ] } ], "source": [ "train_preds = shadow_svc.predict_proba(shadow_train_x).max(axis=1)\n", "test_preds = shadow_svc.predict_proba(shadow_test_x).max(axis=1)\n", "print(f\"Mean of confidence for training examples = {train_preds.mean():.2f}\")\n", "print(f\"Mean of confidence for non-training examples = {test_preds.mean():.2f}\")" ] }, { "cell_type": "markdown", "id": "c79b9928", "metadata": {}, "source": [ "The attacker now knows the kind of confidence values that their shadow model gives to training and non-training examples. They're confident that the data they generated is similar enough to the original data, and that their model is configured similarly enough to the original model (remember that the researcher released this information), to assume that these confidence values are similar to those that the original model would give on training and non-training data. They can therefore use them as a baseline against which to compare the value they got when they presented the MP's data to the original model." ] }, { "cell_type": "code", "execution_count": 16, "id": "45dfa8e0", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Maximum confidence for MP in original model: 0.84\n" ] } ], "source": [ "print(f\"Maximum confidence for MP in original model: {mp_preds.max():.2f}\")" ] }, { "cell_type": "markdown", "id": "a34463fb", "metadata": {}, "source": [ "The attacker can do this comparison in a number of ways. 
Here we will assume that the attacker trains another ML model (the _attack_ model) to distinguish between these two sets of confidences. We use a LogisticRegression model (a very simple classifier), but the attacker could use anything." ] }, { "cell_type": "code", "execution_count": 13, "id": "7b45f764", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "LogisticRegression()" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "from sklearn.linear_model import LogisticRegression\n", "\n", "# Attack model: input is a single confidence value, label is 1 for training examples and 0 otherwise\n", "lr = LogisticRegression()\n", "train_x = np.vstack((train_preds[:, None], test_preds[:, None]))\n", "train_y = np.hstack((np.ones(len(train_preds)), np.zeros(len(test_preds))))\n", "lr.fit(train_x, train_y)" ] }, { "cell_type": "markdown", "id": "d87b27c9", "metadata": {}, "source": [ "The attacker now passes the maximum confidence value they got from the original model with the MP's data to this new model to obtain a prediction as to whether or not it was in the training data. Note that the prediction takes the same form as previous predictions: two confidence values. One is the confidence that it was not in the training set; the other is the confidence that it was:" ] }, { "cell_type": "code", "execution_count": 14, "id": "309dd568", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Confidence of non-membership = 0.20\n", "Confidence of membership = 0.80\n" ] } ], "source": [ "input_array = np.array([[mp_preds.max()]])\n", "prob_membership = lr.predict_proba(input_array)\n", "print(f\"Confidence of non-membership = {prob_membership[0][0]:.2f}\")\n", "print(f\"Confidence of membership = {prob_membership[0][1]:.2f}\")" ] }, { "cell_type": "markdown", "id": "1d98a425", "metadata": {}, "source": [ "The model is making a strong prediction that the MP _was_ in the training data, and therefore the attacker concludes that they _did_ have cancer. This prediction is correct." 
] }, { "cell_type": "markdown", "id": "f6d2860b", "metadata": {}, "source": [ "## Mitigation\n", "\n", "Mitigating this kind of attack involves configuring the classifier so that it does not give noticeably different confidences to examples it was trained on and examples it was not. In this case, decreasing the SVM's `gamma` parameter will have a strong effect. For example, here is what happens in the attack if `gamma` is reduced from 3 to 0.1 (the attacker, who knows the published configuration, uses the same value for the shadow model):" ] }, { "cell_type": "code", "execution_count": 17, "id": "921d437f", "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[0.15803859 0.84196141]]\n", "Mean of confidence for training examples = 0.93\n", "Mean of confidence for non-training examples = 0.90\n", "Confidence of non-membership = 0.52\n", "Confidence of membership = 0.48\n" ] } ], "source": [ "# Retrain the original model with a smaller gamma\n", "prng = np.random.RandomState(12)\n", "svc = SVC(C=1, gamma=0.1, probability=True, random_state=prng)\n", "svc.fit(df, response)\n", "\n", "mp_preds = svc.predict_proba(mp_data)\n", "print(mp_preds)\n", "\n", "# Retrain the shadow model with the same configuration\n", "shadow_svc = SVC(C=1, gamma=0.1, probability=True, random_state=prng)\n", "shadow_svc.fit(shadow_train_x, shadow_train_y)\n", "\n", "train_preds = shadow_svc.predict_proba(shadow_train_x).max(axis=1)\n", "test_preds = shadow_svc.predict_proba(shadow_test_x).max(axis=1)\n", "print(f\"Mean of confidence for training examples = {train_preds.mean():.2f}\")\n", "print(f\"Mean of confidence for non-training examples = {test_preds.mean():.2f}\")\n", "\n", "# Refit the attack model on the new shadow confidences\n", "lr = LogisticRegression()\n", "train_x = np.vstack((train_preds[:, None], test_preds[:, None]))\n", "train_y = np.hstack((np.ones(len(train_preds)), np.zeros(len(test_preds))))\n", "lr.fit(train_x, train_y)\n", "\n", "\n", "input_array = np.array([[mp_preds.max()]])\n", "prob_membership = lr.predict_proba(input_array)\n", "print(f\"Confidence of non-membership = {prob_membership[0][0]:.2f}\")\n", "print(f\"Confidence of membership = {prob_membership[0][1]:.2f}\")" ] }, { "cell_type": "markdown", "id": 
"94d3f196", "metadata": {}, "source": [ "The attack is now almost completely ambiguous, providing the attacker with no information as to whether or not the MP was in the training set.\n", "\n", "The technical effect `gamma` has on the SVM is unimportant -- the important point is that changes to the model's configuration can play a significant role in its vulnerability." ] }, { "cell_type": "markdown", "id": "bf4518cd", "metadata": {}, "source": [ "## Conclusions\n", "\n", "This example has shown how an attacker can perform a membership inference attack to determine that a well-known individual was in a model's training data.\n", "\n", "It is hopefully clear that this is non-trivial -- the attacker has to put in quite a lot of effort. Their success is also contingent on them knowing certain things about the problem. In particular:\n", "1. Enough information about the original data that they can generate a shadow dataset. This will be things like: types of variables, ranges of variables, distributions of variables. Such information is often available at population levels (e.g. average age, proportion of population with diabetes etc).\n", "1. Configuration information about the original model. In this case, it was the parameters that define the model and, in particular a parameter called `gamma` that is used in the SVM. It is quite common for researchers to publish this information.\n", "1. The input values for the individual in question. 
This is harder to come by, but for famous individuals, it's conceivable that a lot of this information might be in the public domain.\n", "\n" ] }, { "cell_type": "markdown", "id": "f8b9da84", "metadata": {}, "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.13" }, "vscode": { "interpreter": { "hash": "fcca1ce0a591990538c4a1a2cbe16853d718e2332b5914ea18ddb1937a418955" } } }, "nbformat": 4, "nbformat_minor": 5 }