
Who shares? Who doesn't? Factors associated with openly archiving raw research data

Piwowar, HA (2011) Who shares? Who doesn't? Factors associated with openly archiving raw research data. PLoS ONE, 6 (7).

Full text: PDF, Published Version (447kB). Available under License: See the attached license file.
Abstract

Many initiatives encourage investigators to share their raw datasets in hopes of increasing research efficiency and quality. Despite these investments of time and money, we do not have a firm grasp of who openly shares raw research data, who doesn't, and which initiatives are correlated with high rates of data sharing. In this analysis I use bibliometric methods to identify patterns in the frequency with which investigators openly archive their raw gene expression microarray datasets after study publication. Automated methods identified 11,603 articles published between 2000 and 2009 that describe the creation of gene expression microarray data. Associated datasets in best-practice repositories were found for 25% of these articles, increasing from less than 5% in 2001 to 30%-35% in 2007-2009. Accounting for sensitivity of the automated methods, approximately 45% of recent gene expression studies made their data publicly available. First-order factor analysis on 124 diverse bibliometric attributes of the data creation articles revealed 15 factors describing authorship, funding, institution, publication, and domain environments. In multivariate regression, authors were most likely to share data if they had prior experience sharing or reusing data, if their study was published in an open access journal or a journal with a relatively strong data sharing policy, or if the study was funded by a large number of NIH grants. Authors of studies on cancer and human subjects were least likely to make their datasets available. These results suggest research data sharing levels are still low and increasing only slowly, and data is least available in areas where it could make the biggest impact. Let's learn from those with high rates of sharing to embrace the full potential of our research output. © 2011 Heather A. Piwowar.


Details

Item Type: Article
Status: Published
Creators/Authors: Piwowar, HA
Contributors: Neylon, Cameron (Editor)
Centers: Other Centers, Institutes, Offices, or Units > Center for Biomedical Informatics
Date: 18 July 2011
Date Type: Publication
Journal or Publication Title: PLoS ONE
Volume: 6
Number: 7
DOI or Unique Handle: 10.1371/journal.pone.0018657
Schools and Programs: School of Medicine > Biomedical Informatics
Refereed: Yes
MeSH Headings: Archives; Cooperative Behavior; Databases, Genetic; Humans; Information Dissemination; Multivariate Analysis; Odds Ratio; Periodicals as Topic; Research--statistics & numerical data
Other ID: NLM PMC3135593
PubMed Central ID: PMC3135593
PubMed ID: 21765886
Date Deposited: 03 Aug 2012 15:50
Last Modified: 20 Dec 2018 00:55
URI: http://d-scholarship.pitt.edu/id/eprint/13209
