The internet has opened new potential in many areas of the research and scholarly process: open access journals, institutional repositories, open source software – and more recently, open data. Do you ever stop and think about the datasets that are the foundation of every scholarly journal article? What happens to all that data? Is it ever seen or re-used by other researchers? Is it preserved somewhere?
Programs like the National Science Foundation’s (NSF) DataNet are beginning to address the enormous challenges of managing and sharing research data in order to drive networked, collaborative and innovative science. Those challenges include building infrastructure and setting standards for storing data, encouraging researchers to deposit their data for others to see and use for new purposes, and easing the license, patent and copyright restrictions that currently prevent re-use of data.
In the past, raw data was not considered for publication, primarily because it was not affordable or feasible to include in print journals. The predominance of online journals has changed that model. Large federal funding agencies such as the NSF and the National Institutes of Health (NIH) are also having a tremendous impact on the publication of data. The NSF expects that all data will be made available after a reasonable length of time, and each proposal must now include a Data Management Plan. Since 2003, the NIH has required applicants seeking $500,000 or more in direct costs to comply with its Data Sharing policy by submitting, with their application, a plan for sharing their final research data.
Data can be as big as the Human Genome Project or as small as the thousands of datasets created each year by individual researchers and small groups at universities and research institutes. No matter the size, the potential for sharing and reusing this knowledge is huge.
This is our fourth open access post for the week. Hope you're enjoying the series.