Documents detail DeepMind's plan to apply AI to NHS data in 2015

More details have emerged about a controversial 2015 patient data-sharing arrangement between Google DeepMind and a UK National Health Service Trust, and they paint a picture at odds with the pair’s public narrative about their intended use of 1.6 million citizens’ medical records.

DeepMind and the Royal Free NHS Trust signed their initial information sharing agreement (ISA) in September 2015 — ostensibly to co-develop a clinical task management app, called Streams, for early detection of an acute kidney condition using an NHS algorithm.

Patients whose fully identifiable medical records were being shared with the Google-owned company were neither asked for their consent nor informed their data was being handed to the commercial entity.

Indeed, the arrangement was only announced to the public five months after it was inked — and months after patient data had already started to flow.

And it was only fleshed out in any real detail after a New Scientist journalist obtained and published the ISA between the pair, in April 2016 — revealing for the first time, via a Freedom of Information request, quite how much medical data was being shared for an app that targets a single condition.

This led to an investigation being opened by the UK’s data protection watchdog into the legality of the arrangement. And as public pressure mounted over the scope and intentions behind the medical records collaboration, the pair stuck to their line that patient data was not being used for training artificial intelligence.

They also claimed they did not need to seek patients’ consent for medical records to be shared because the resulting app would be used for direct patient care — a claimed legal basis that has since been demolished by the ICO, which concluded a more than year-long investigation in July.

However, a series of newly released documents shows that applying AI to the patient data was in fact a goal for DeepMind right from the earliest months of its partnership with the Royal Free — with its intention being to utilize the wide-ranging access to and control of publicly-funded medical data it was being granted by the Trust to simultaneously develop its own AI models.

In a FAQ note on its website when it publicly announced the collaboration, in February 2016, DeepMind wrote: “No, artificial intelligence is not part of the early-stage pilots we’re announcing today. It’s too early to determine where AI could be applied here, but it’s certainly something we are excited about for the future.”

Omitted from that description of its plans was the fact it had already received a favorable ethical opinion from an NHS Health Research Authority research ethics committee to run a two-year AI research study on the same underlying NHS patient data.

DeepMind’s intent was always to apply AI

The newly released documents, obtained via an FOI filed by health data privacy advocacy organization medConfidential, show DeepMind made an ethics application for an AI research project using Royal Free patient data in October 2015 — with the stated aim of “using machine learning to improve prediction of acute kidney injury and general patient deterioration”.

Earlier still, in May 2015, the company gained confirmation from an insurer to cover its potential liability for the research project — which it subsequently notes having in place in its project application.

And the NHS ethics board granted DeepMind’s AI research project application in November 2015 — with the two-year AI research project scheduled to start in December 2015 and run until December 2017.

A brief outline of the approved research project was previously published on the Health Research Authority’s website, per its standard protocol, but the FOI reveals more details about the scope of the study — which is summarized in DeepMind’s application as follows:

By combining classical statistical methodology and cutting-edge machine learning algorithms (e.g. ‘unsupervised and semisupervised learning’), this research project will create improved techniques of data analysis and prediction of who may get AKI [acute kidney injury], more accurately identify cases when they occur, and better alert doctors to their presence.

DeepMind’s application claimed that the existing NHS algorithm, which it was deploying via the Streams app, “appears” to be missing and misclassifying some cases of AKI, and generating false positives — and goes on to suggest: “The problem is not with the tool which DeepMind have made, but with the algorithm itself. We think we can overcome these problems, and create a system which works better.”

Although at the time it wrote this application, in October 2015, user tests of the Streams app had not yet begun — so it’s unclear how DeepMind could so confidently assert there was no “problem” with a tool it hadn’t yet tested. But presumably it was attempting to convey information about (what it claimed were) “major limitations” with the working of the NHS’ national AKI algorithm passed on to it by the Royal Free.

(For the record: In an FOI response that TechCrunch received back from the Royal Free in August 2016, the Trust told us that the first Streams user tests were carried out on 12-14 December 2015. It further confirmed: “The application has not been implemented outside of the controlled user tests.”)

Most interestingly, DeepMind’s AI research application shows it told the NHS ethics board that it could process NHS data for the study under “existing information sharing agreements” with the Royal Free.

“DeepMind acting as a data processor, under existing information sharing agreements with the responsible care organisations (in this case the Royal Free Hospitals NHS Trust), and providing existing services on identifiable patient data, will identify and anonymize the relevant records,” the Google division wrote in the research application.

The fact that DeepMind had taken active steps to gain approval for AI research on the Royal Free patient data as far back as fall 2015 flies in the face of all the subsequent assertions made by the pair to the press and public — when they claimed the Royal Free data was not being used to train AI models.

For instance, here’s what this publication was told in May last year, after the scope of the data being shared by the Trust with DeepMind had just emerged (emphasis mine):

DeepMind confirmed it is not, at this point, performing any machine learning/AI processing on the data it is receiving, although the company has clearly indicated it would like to do so in future. A note on its website pertaining to this ambition reads: “[A]rtificial intelligence is not part of the early-stage pilots we’re announcing today. It’s too early to determine where AI could be applied here, but it’s certainly something we are excited about for the future.”

The Royal Free spokesman said it is not possible, under the current data-sharing agreement between the trust and DeepMind, for the company to apply AI technology to these data-sets and data streams.

That type of processing of the data would require another agreement, he confirmed.

“The only thing this data is for is direct patient care,” he added. “It is not being used for research, or anything like that.”

As the FOI makes clear, and contrary to the Royal Free spokesman’s claim, DeepMind had in fact been granted ethical approval by the NHS Health Research Authority in November 2015 to conduct AI research on the Royal Free patient data-set — with DeepMind in control of selecting and anonymizing the PID (patient identifiable data) intended for this purpose.

Conducting research on medical data would clearly not constitute an act of direct patient care — which was the legal basis DeepMind and the Royal Free were at the time claiming for their reliance on implied consent of NHS patients to their data being shared. So, in seeking to paper over the erupting controversy about how many patients’ medical records had been shared without their knowledge or consent, it appears the pair felt the need to publicly de-emphasize their parallel AI research intentions for the data.

“If you have been given data, and then anonymise it to do research on, it’s disingenuous to claim you’re not using the data for research,” said Dr Eerke Boiten, a cyber security professor at De Montfort University whose research interests encompass data privacy and ethics, when asked for his view on the pair’s modus operandi here.

“And [DeepMind] as computer scientists, some of them with a Ross Anderson pedigree, they should know better than to believe in ‘anonymised medical data’,” he added — a reference to how trivially easy it has been shown to be for sensitive medical data to be re-identified once it’s handed over to third parties who can triangulate identities using all sorts of other data holdings.

Also commenting on what the documents reveal, Phil Booth, coordinator of medConfidential, told us: “What this shows is that Google ignored the rules. The people involved have repeatedly claimed ignorance, as if they couldn’t use a search engine. Now it appears they were very clear indeed about all the rules and contractual arrangements; they just deliberately chose not to follow them.”

Asked to respond to criticism that it has deliberately ignored NHS’ information governance rules, a DeepMind spokeswoman said the AI research being referred to “has not taken place”.

“To be clear, no research project has taken place and no AI has been applied to that dataset. We have always said that we would like to undertake research in future, but the work we are delivering for the Royal Free is solely what has been said all along — delivering Streams,” she added.

She also pointed to a blog post the company published this summer after the ICO ruled that the 2015 ISA with the Royal Free had broken UK data protection laws — in which DeepMind admits it “underestimated the complexity of NHS rules around patient data” and failed to adequately listen and “be accountable to and [be] shaped by patients, the public and the NHS as a whole”.

“We made a mistake in not publicising our work when it first began in 2015, so we’ve proactively announced and published the contracts for our subsequent NHS partnerships,” it wrote in July.

“We do not foresee any major ethical… issues”

In one of the sections of DeepMind’s November 2015 AI research study application form, which asks for “a summary of the main ethical, legal or management issues arising from the research project”, the company writes: “We do not foresee any major ethical, legal or management issues.”

Clearly, with hindsight, the data-sharing partnership would quickly run into major ethical and legal problems. So that’s a pretty major failure of foresight by the world’s most famous AI-building entity. (Albeit, it’s worth noting that the rest of a fuller response in this section has been entirely redacted — but presumably DeepMind is discussing what it considers lesser issues here.)

The application also reveals that the company intended not to register the AI research in a public database — bizarrely claiming that “no appropriate database exists for work such as this”.

In this section the application form includes the following guidance note for applicants: “Registration of research studies is encouraged wherever possible”, and goes on to suggest various possible options for registering a study — such as via a partner NHS organisation; in a register run by a medical research charity; or via publishing through an open access publisher.

DeepMind makes no additional comment on any of these suggestions.

When we asked the company why it had not intended to register the AI research the spokeswoman reiterated that “no research project has taken place”, and added: “A description of the initial HRA [Health Research Authority] application is publicly available on the HRA website.”

Evidently the company — whose parent entity Google’s corporate mission statement claims it wants to ‘organize the world’s information’ — was in no rush to more widely distribute its plans for applying AI to NHS data at this stage.

Details of the size of the study have also been redacted in the FOI response so it’s not possible to ascertain how many of the 1.6M medical records DeepMind intended to use for the AI research, although the document does confirm that children’s medical records would be included in the study.

The application confirms that Royal Free NHS patients who have previously opted out of their data being used for any medical research would be excluded from the AI study (as would be required by UK law).

As noted above, DeepMind’s application also specifies that the company would be both handling fully identifiable patient data from the Royal Free, for the purposes of developing the clinical task management app Streams, and also identifying and anonymizing a sub-set of this data to run its AI research.

This could well raise additional questions over whether the level of control DeepMind was being afforded by the Trust over patients’ data is appropriate for an entity that is described as occupying the secondary role of data processor — vs the Royal Free claiming it remains the data controller.

“A data processor does not determine the purpose of processing — a data controller does,” said Boiten, commenting on this point. “‘Doing AI research’ is too aspecific as a purpose, so I find it impossible to view DeepMind as only a data processor in this scenario,” he added.

One thing is clear: When the DeepMind-Royal Free collaboration was publicly revealed with much fanfare, the fact they had already applied for and been granted ethical approval to perform AI research on the same patient data-set was not — in their view — a consideration they deemed merited detailed public discussion. Which is a huge miscalculation when you’re trying to win the public’s trust for the sharing of their most sensitive personal data.

Asked why it had not informed the press or the public about the existence and status of the research project at the time, a DeepMind spokeswoman failed to directly respond to the question — instead she reiterated that: “No research is underway.”

DeepMind and the Royal Free both claim that, despite receiving a favorable ethical opinion on the AI research application in November 2015 from the NHS ethics committee, additional approvals would have been required before the AI research could have gone ahead.

“A favourable opinion from a research ethics committee does not constitute full approval. This work could not take place without further approvals,” the DeepMind spokeswoman told us.

“The AKI research application has initial ethical approval from the national research ethics service within the Health Research Authority (HRA), as noted on the HRA website. However, DeepMind does not have the next step of approval required to proceed with the study — namely full HRA approval (previously called local R&D approval).

“In addition, before any research could be done, DeepMind and the Royal Free would also need a research collaboration agreement,” she added.

The HRA’s letter to DeepMind confirming its favorable opinion on the study does indeed note:

Management permission or approval must be obtained from each host organisation prior to the start of the study at the site concerned.

Management permission (“R&D approval”) should be sought from all NHS organisations involved in the study in accordance with NHS research governance arrangements

However, since the proposed study was to be conducted purely on a database of patient data, rather than at any NHS locations, and given that the Royal Free already had an information-sharing arrangement in place with DeepMind, it’s not clear exactly what additional external approvals they were awaiting.

The original (now defunct and ICO sanctioned) ISA between the pair does include a paragraph granting DeepMind the ability to anonymize the Royal Free patient data-set “for research” purposes. And although this clause lists several bodies, one of which it says would also need to approve any projects under “formal research ethics”, the aforementioned HRA (“the National Research Ethics Service”) is included in this list.

So again, it’s not clear whose rubberstamp they would still have required.

The value of transparency

At the same time, it’s clear that transparency is a preferred principle of medical research ethics — hence the NHS encouraging those filling in research applications to publicly register their studies.

A UK government-commissioned life science strategy review, published this week, also emphasizes the importance of transparency in engendering and sustaining public trust in health research projects — arguing it’s an essential component for furthering the march of digital innovation.

The same review also recommends that the UK government and the NHS take ownership of training health AIs off of taxpayer-funded health data-sets — exactly to avoid corporate entities coming in and asset-stripping potential future medical insights.

(“Most of the value is the data,” asserts review author, Sir John Bell, an Oxford University professor of medicine. Data that, in DeepMind’s case, has been so far freely handed over by multiple NHS organizations — in June, for example, it emerged that another NHS Trust which has inked a five-year data-sharing deal with DeepMind, Taunton & Somerset, is not paying the company for the duration of the contract, unless (in the unlikely eventuality) the service support exceeds £15,000 a month. So essentially DeepMind is being ‘paid’ with access to NHS patients’ data.)

Even before the ICO’s damning verdict, the original ISA between DeepMind and the Royal Free had been extensively criticized for lacking robust legal and ethical safeguards on how patient data could be used. (Even as DeepMind’s co-founder Mustafa Suleyman tried to brush off criticism, saying negative headlines were the result of “a group with a particular view to peddle”.)

But after the original controversy flared the pair subsequently scrapped the agreement and replaced it, in November 2016, with a second data-sharing contract which included some additional information governance concessions — while also continuing to share largely the same quantity and types of identifiable Royal Free patient data as before.

Then this July, as noted earlier, the ICO ruled that the original ISA had indeed breached UK privacy law. “Patients would not have reasonably expected their information to have been used in this way, and the Trust could and should have been far more transparent with patients as to what was happening,” it stated in its decision.

The ICO also said it had asked the Trust to commit to making changes to address the shortcomings that the regulator had identified.

In a statement on its website the Trust said it accepted the findings and claimed to have “already made good progress to address the areas where they have concerns”, and to be “doing much more to keep our patients informed about how their data is used”.

“We would like to reassure patients that their information has been in our control at all times and has never been used for anything other than delivering patient care or ensuring their safety,” the Royal Free’s July statement added.

Responding to questions put to it for this report, the Royal Free Hospitals NHS Trust confirmed it was aware of and involved with the 2015 DeepMind AI research study application.

“To be clear, the application was for research on de-personalised data and not the personally identifiable data used in providing Stream,” said a spokeswoman.

“No research project has begun, and it could not begin without further approvals. It is worth noting that fully approved research projects involving de-personalised data generally do not require patient consent,” she added.

At the time of writing the spokeswoman had not responded to follow-up questions asking why, in 2016, it had made such explicit public denials about its patient data being used for AI research, and why it chose not to make public the existing application to conduct AI research at that time — or indeed, at an earlier time.

Another curious facet to this saga involves the group of “independent reviewers” that Suleyman announced the company had signed up in July 2016 to — as he put it — “examine our work and publish their findings”.

His intent was clearly to try to reset public perceptions of the DeepMind Health initiative after a bumpy start for transparency, consent, information governance and regulatory best practice — with the wider hope of boosting public trust in what an ad giant wanted with people’s medical data by allowing some external eyeballs to roll in and poke around.

What’s curious is that the reviewers make no reference to DeepMind’s AI research study intentions for the Royal Free data-set in their first report — also published this July.

We reached out to the chair of the group, former MP Julian Huppert, to ask whether DeepMind informed the group it was intending to undertake AI research on the same data-set.

Huppert confirmed to us that the group had been aware there was “consideration” of an AI research project using the Royal Free data at the time it was working on its report, but claimed he does not “recall exactly” when the project was first mentioned or by whom.

“Both the application and the decision not to go ahead happened before the panel was formed,” he said, by way of explanation for the memory lapse.

Asked why the panel did not think the project worth mentioning in its first annual report, he told TechCrunch: “We were more concerned with looking at work that DMH had done and were planning to do, than things that they had decided not to go ahead with.”

“I understand that no work was ever done on it. If this project were to be taken forward, there would be many more regulatory steps, which we would want to look at,” he added.

In their report the independent reviewers do flag up some issues of concern regarding DeepMind Health’s operations — including potential security vulnerabilities around the company’s handling of health data.

For example, a datacenter server build review report, conducted by an external auditor looking at part of DeepMind Health’s critical infrastructure on behalf of the external reviewers, identified what it judged a “medium risk vulnerability” — noting that: “A large number of files are present which can be overwritten by any user on the reviewed servers.”

“This could allow a malicious user to modify or replace existing files to insert malicious content, which would allow attacks to be conducted against the servers storing the files,” the auditor added.

Asked how DeepMind Health will work to regain NHS patients’ trust in light of such a string of transparency and regulatory failures to-date, the spokeswoman provided the following statement: “Over the past eighteen months we’ve done a lot to try to set a higher standard of transparency, appointing a panel of Independent Reviewers who scrutinise our work, embarking on a patient involvement program, proactively publishing NHS contracts, and building tools to enable better audits of how data is used to support care. In our recently signed partnership with Taunton and Somerset NHS Trust, for example, we committed to supporting public engagement activity before any patient data is transferred for processing. And at our recent consultation events in London and Manchester, patients provided feedback on DeepMind Health’s work.”

Asked whether it had informed the independent reviewers about the existence of the AI research application, the spokeswoman declined to respond directly. Instead she repeated the prior line that: “No research project is underway.”