Building A Data-Driven White House

Editor’s note: David Richards is CEO, president and co-founder of WANdisco, a public software company specializing in distributed computing. WANdisco is a corporate contributor to Hadoop, Subversion and other open-source projects.

Working in government IT doesn’t mean what it used to. In the past, computers were used only to carry out simple office work or as a substitute for pen and paper. Now, organizations are fast realizing they require new leadership to get the most out of their data. This is more than just another fickle trend; a quick search on LinkedIn reveals that “data scientist” now appears in roughly 36,000 help-wanted posts.

In a move that reflects the growing prominence of data science in the U.S., the White House has named DJ Patil as the nation’s first chief data scientist. The former PayPal and eBay executive is tasked both with helping the U.S. government maximize its investments in big data and with advising on policy issues around the use of data.

Patil is committed to accelerating smarter data-driven government, which could deliver considerable benefits for the taxpayer and cement the U.S.’s status as the global leader in data science. But to do this he’ll need to work closely with the private sector. The Commerce Department recently reported that the field contributes between $24 billion and $221 billion annually to the private sector. Companies like Hortonworks, Pivotal and WANdisco are helping organizations ranging from banks and utility providers to hospitals and government agencies deploy big data strategies.

When the government’s latest big data privacy plan was being drawn up, the White House engaged hundreds of stakeholders, but most of the representatives from the private sector were end users rather than vendors – the companies benefiting from the technology rather than those designing or operating it. To put this in perspective, it is akin to investigating national spending habits without consulting a single bank.

At the top of Patil’s to-do list has to be a full audit of the initiatives already in play across the government: one that sets out the look and feel of the government’s data-driven strategy, identifies standards of best practice and promotes cross-agency learning.

Despite an increased appreciation of data’s importance, analytics policies vary wildly among government departments. While some agencies have been enthusiastic adopters of large-scale data-crunching platforms, others are struggling to integrate data science into their workflows, whether due to budgetary constraints or cultural resistance.

To this end, Patil could look to emulate Australia, whose federal government’s big-data strategy began with the Australian Public Service ICT Strategy 2012-2015, and was further outlined in the 2013 report Big Data Strategy – Issues Paper.

Australia’s strategy increased the deployment of data-driven solutions in the Australian public sector, leading to initiatives such as the Australian Taxation Office trawling through records to find evidence of the use of tax havens, and data matching to identify small online retailers that are not meeting their compliance obligations.

In a memo addressed to the American people, Patil described his role as “responsibly sourcing, processing, and leveraging data in a timely fashion to enable transparency, provide security, and foster innovation for the benefit of the American public.”

Patil emphasized his belief that public data should be used to benefit the nation and, as such, should play a key role in the National Institutes of Health’s Precision Medicine Initiative, which is developing genomics-based therapies for individual patients.

Much has been made of Obama’s commitment to precision medicine. Freeing up access to public data has the potential to speed up the field’s development and give the government and private sector a much deeper insight into the trends underpinning the nation’s health, allowing for more efficient management of diagnosis, treatment and recovery.

But to remain at the cutting edge of data science in the long run, the U.S. also needs legislation that rewards responsible behavior while simultaneously strengthening the government’s ability to innovate and develop new services using data.

Many in Silicon Valley are concerned that heavy-handed legislation could do more harm than good in the long run and clip the wings of innovative thinking. Engaging with big data vendors would provide lawmakers with a richer understanding of how and where the technology is already being used. It would also shed light on where technology already has the ability to deal with public concerns.

Patil’s appointment reflects an increasing appreciation of the opportunities and challenges we face in a world driven by data, where successful strategies revolve around experimentation and exploration. Let’s hope he is able to establish a data-driven U.S. that will benefit everyone.