Protecting Privacy through Controlled Access: The Successful Five-Year Experience of the ICGC Data Access Compliance Office

Emilie de Vries-Séguin1, Derek So1, Bartha M. Knoppers1, Yann Joly1

1. Centre of Genomics and Policy, McGill University

Context: The broad sharing of pre-competitive data from large-scale genomics initiatives has contributed greatly to recent advances in human genetic research. However, questions have been raised about the privacy risks that such data-sharing practices may pose for research participants. The controlled approach to data sharing, in which access is restricted to users authorized by a data access committee (DAC), has been proposed as a potential solution. Many in the scientific community, however, perceive this approach as overly bureaucratic and time-consuming.

Objective: This communication uses empirical data to refute the negative perception of controlled access and to show that this strategy can be used efficiently to protect the genomic and clinical data of a large-scale research consortium in the field of cancer genomics.

Methods: Our data come from a statistical analysis of over 270 access requests made during the last five years of operation of the Data Access Compliance Office (DACO) of the International Cancer Genome Consortium (ICGC), complemented by operational data associated with the administration of DACO.

Discussion: The data obtained show that DACO has been highly cost-efficient and has successfully fulfilled its mission of adding a layer of protection and oversight to the more sensitive data of the ICGC. One potential area for improvement is the development of a more efficient compliance framework to ensure that post-approval data usage conforms to the terms of access.

Conclusion: It is possible to provide additional protection to sensitive genomic data at little cost to the scientific community by using a controlled approach to data access. The required infrastructure can be designed to run on a limited budget while addressing access requests and queries from users in a very short timeframe. This approach to the protection of genomic data is also highly flexible and can be combined with a variety of consent, privacy, and security strategies, including more recent methods that rely substantially on information technology.