As I mentioned in my last post, the amount of DNA in your body is astounding, as is the number of cells, which has been estimated at 100 trillion. When I started in genomics, searching for cancer-causing mutations was a huge and expensive endeavor, and could not be done comprehensively. The tools to understand the structure and function of cancer cell genomes simply did not exist.
All of that has changed drastically thanks to advances in technology. So-called “next-generation” DNA sequencing has transformed the speed and cost of sequencing DNA to detect cancer mutations.
At the Genome Sciences Centre (GSC), we currently operate 25 of the latest ultra-high-throughput next-generation DNA sequencing instruments. In one week, each of these can produce as much data as 360 of the previous sequencers could produce working together for a full year. At our facility, these instruments are capable of generating hundreds of cancer genome sequences every year, and technology advances now in sight will allow us to sequence thousands of genomes each year. As long as the costs of sequencing continue to fall, I envision a time when we will have the opportunity to learn about every cancer mutation in every cancer sample. This will provide an unprecedented level of knowledge about all tumors, and from this knowledge will emerge superior treatment strategies.
Because these machines produce astonishing quantities of information, the next-generation sequencers at the GSC are accompanied by a very powerful supercomputer cluster, recently ranked the fourth most powerful computer cluster in the country. This cluster takes up an entire room, is made up of hundreds of processors, and is attached to more than seven petabytes of disk storage capacity (the prefix “peta” refers to the number 1 followed by 15 zeros; for those of you with home computers, one petabyte is equivalent to one million gigabytes). Although this sounds like a lot, these computers are barely able to keep up with the information we generate using the next-generation sequencers. And with sequencing rates advancing constantly, we are always aware of the need for more powerful, faster computers.
All of these technology advances are to the benefit of those who, like us, desire to understand the basis of cancers at their genetic roots.
But all this technology is just one piece of the puzzle. Even more important are the people who analyze the data: the biologists, mathematicians, computer scientists and doctors who come together to identify the most important elements of the sequencing results, searching for the spelling mistakes that may cause cancer, and who then design the subsequent experiments to test whether those spelling mistakes really matter. Finally, most important of all are the cancer patients, who consent to participate in this research. Many, many thanks to you all. Without your generous contributions of tissue samples to sequence, we would be unable to take advantage of these remarkable technology developments in our quest to discover the genetic changes that cause cancers.
Marco