
When you hear the phrase "Big Data," do you think of large-scale analysis carried out on a huge body of data? Or do you think of ethical conflicts, like Big Pharma? Computational medicine has been making news at a steady pace over the last several years with advances in medical imaging and genomics — perhaps most recently in IBM's joint venture with three top cancer centers, leveraging Watson, the Jeopardy-playing cognitive computing platform, in a pitched battle against cancer. At the end of Moore's Law, with security breaches and the exposure of personally identifying data on the rise, what can computing do for medicine that it hasn't already done?

This, among other things: MIT's cheap fluorescence imaging with Kinect. (Image courtesy of the researchers)

Big Data offers unprecedented perspective, far beyond the scope of what any researcher has done before. It allows bench scientists and citizen scientists alike to contribute to a massively multivariate body of data. Research like Sergey Brin's quest to figure out Parkinson's disease via Big Genomics demands the cooperation of a lot of people and sophisticated information handling. And with every scientific step we advance, we open up new problems and new patent applications. For better or worse, the nature of scientific inquiry is to increase the amount of data available to us.

But, for example, the information contained in a DNA sequence of millions to billions of base pairs has to take up storage space somewhere — and DNA carries greater bit depth than its four nucleobases alone would suggest. And that's just research being carried out on one biomolecule. Rational drug design ascended partly on the strength of the belief that computational muscle could solve disease as a whole by creating custom-tailored biomolecules. Scientists dreamed of a computer-designed drug for every disease. As a result, there's a combinatorial explosion of possible proteomics data. And handling this kind of medical data could well be the bottleneck, according to a study published this summer in PLoS Biology.

Growth of DNA sequencing data – from PLoS Biology
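For a sense of scale, here's a back-of-envelope sketch in Python. The genome length, coverage, and FASTQ sizing below are illustrative assumptions of ours, not figures from the PLoS study, but they show why a genome's practical footprint runs far beyond the two bits per base its four nucleobases imply:

    # Back-of-envelope storage estimate for one human genome.
    # All constants are illustrative assumptions, not data from the PLoS study.

    GENOME_BASES = 3.2e9   # rough human genome length, in base pairs
    BITS_PER_BASE = 2      # four nucleobases (A, C, G, T) fit in 2 bits

    raw_gb = GENOME_BASES * BITS_PER_BASE / 8 / 1e9
    print(f"Bare sequence at 2 bits/base: {raw_gb:.1f} GB")

    # Real sequencers emit FASTQ: one ASCII byte per base plus one byte
    # for its quality score, repeated across many overlapping reads.
    COVERAGE = 30          # assumed 30x whole-genome coverage
    fastq_gb = GENOME_BASES * COVERAGE * 2 / 1e9
    print(f"FASTQ at {COVERAGE}x coverage: {fastq_gb:.0f} GB")

Under these assumptions, the bare sequence is under a gigabyte, but a single sequenced genome lands in the hundreds of gigabytes before any analysis even begins.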

When doing genomics research, scientists must acquire, format, store, retrieve, and analyze that data, often transferring it between software or hardware platforms. These individual tasks are harder than the sum of their parts once you account for the difficulties of transferring such massive volumes of information. Combine that with the exponential expansion of "omics" science, and the data rapidly becomes hard to manage without a long chain of custody. And that's a problem.
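One link in that chain of custody can be made concrete: recording a cryptographic checksum before a transfer and re-computing it afterward, so corruption or tampering in transit is detectable. A minimal sketch in Python, with a hypothetical file name:

    import hashlib

    def sha256_of(path: str, chunk_size: int = 1 << 20) -> str:
        """Stream a file through SHA-256 without loading it all into memory."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                digest.update(chunk)
        return digest.hexdigest()

    # Record this before sending; re-compute after receiving and compare.
    print(sha256_of("sample_reads.fastq"))

Streaming the file in chunks matters here: genomics files can run to hundreds of gigabytes, far too large to hash in one read.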

Big Data is unlike Big Pharma in that it doesn't necessarily belong to any one field. Entities like YouTube, Folding@Home, and the stock market are generating data that has to be stored somewhere, in volumes measured in petabytes or exabytes. And scientific inquiry has a decentralized component that defies the corruption endemic to monopolistic control of any field. You just can't buy out literally everybody when each person works on a different project for a different interest.

In the end, we need a new conversation about the elephant in the n-dimensional analytical space: the massive conflict of interest centered around who owns personally identifying information. What happens if a pharmaceutical study that aggregates proteomics data in trying to create a new drug instead creates a new, litigious Henrietta Lacks? Centralized data is salable data, and the temptation to monetize such intimately targeted data will be strong. When science and money intersect, who decides where to draw the line in the sand?