The duplication of DNA before a cell divides is essential to life. But that process doesn’t always go as planned. Sometimes the copies are flawed and the genes within them are mutated – the source of the variation that drives evolution. Sometimes extra copies of genes are produced. Those extra copies, or gene duplications, work like other gene mutations: sometimes the effect is good; other times it is bad or has no effect at all.
Denise Clark is a geneticist and a professor of biology at the University of New Brunswick in Fredericton. She studies those gene duplications as part of her research and she’s using the ACENET computer network to do it. “When two copies of a gene are both functioning within a genome they can take on different roles,” says Clark. “We’re interested in how the duplicated genes function and also in the mechanism that causes the duplications to arise.”
Fruit flies are the primary organism Clark studies. The tiny insects are the gold standard for this kind of genetic research because they reproduce quickly and because geneticists have already studied them extensively for many years. “There has been a determination of the entire DNA sequence, or ‘genome’, for hundreds of individual fruit flies by several fruit fly research groups to look at genetic variation in populations,” she says. “That next-generation DNA sequencing data is available for anyone to look at.”
“A couple of years ago I started using ACENET to look for duplicates. If those duplicates are localized – if they’re not present around the world – we can assume they are new.”
ACENET provides a crucial piece of the puzzle. The file for one fruit fly genome is gigabytes in size, says Clark. The sheer size of the data that needs to be analyzed would quickly overtax a standard computer system. “I’ve looked at 500 genomes using ACENET. That’s something I couldn’t do on my laptop.”
The job is made easier because ACENET has installed a number of genome tools that Clark can use off the shelf. That lets her look at hundreds of genomes in parallel or string together the tools she wants to use in a pipeline. “I’m not building any new tools, except for the specific pipeline that lists what I need. I start with a genome file that’s gigabytes in size and at the end I’m left with a file that’s hundreds of kilobytes,” she says.
Clark says that as the technology to sequence genes has become much cheaper and faster, there has been an explosion of new genetic knowledge and ideas. “A lot of geneticists’ ideas about genomes and genetic variation have changed since the development of next-generation sequencing technology.”