Amazon released its previously announced Public Data Sets web service this evening. The project encourages developers, researchers, universities and businesses to upload large (non-confidential) data sets to Amazon – things like census data, genomes, etc. – and then let others integrate that data into their own AWS applications.
Previously, Amazon says, large data sets like the Human Genome or U.S. Census data required “many hours to locate, download and customize,” but developers can now access this data and start computing on it within minutes. Data is hosted for free.
Data sets available today include an Annotated Human Genome, a public database of chemical structures, various census data and labor statistics.
Public Data Sets on AWS provides a centralized repository of public data sets that can be seamlessly integrated into AWS cloud-based applications. AWS is hosting the public data sets at no charge for the community, and like all AWS services, users pay only for the compute and storage they use for their own applications. An initial list of data sets is already available, and more will be added soon.
Previously, large data sets such as the mapping of the Human Genome and the US Census data required hours or days to locate, download, customize, and analyze. Now, anyone can access these data sets from their Amazon Elastic Compute Cloud (Amazon EC2) instances and start computing on the data within minutes. Users can also leverage the entire AWS ecosystem and easily collaborate with other AWS users. For example, users can produce or use prebuilt server images with tools and applications to analyze the data sets. By hosting this important and useful data with cost-efficient services such as Amazon EC2, AWS hopes to provide researchers across a variety of disciplines and industries with tools to enable more innovation, more quickly.
Perhaps someone can now upload all those now-public iFund applications to Amazon.