“This award is going at the very top of my CV,” says Rosemary McCloskey, the first recipient of the Department of Statistics Award in Data Science. The award, which has been made possible by an anonymous donor, honours an undergraduate or graduate student making outstanding contributions to data science. This is a relatively new discipline that combines skills in computer science, mathematics and statistics to draw meaning from data.
Data science is relevant to many fields, including finance, marketing, medicine and biology. Rosemary, a master’s student in bioinformatics, argues that biology offers the most interesting—and important—scope for data science. For example, Rosemary is currently studying viral disease outbreaks by looking at the “family trees” of the viruses in infected individuals.
“The idea is that genetically similar viruses tend to be related by recent transmissions,” Rosemary explains. “It’s not possible to determine who transmitted to whom, but by looking at a large number of HIV sequences, we can get an idea of where in the population HIV is spreading fastest.”
Rosemary is examining a viral “family tree” built from a 1,500-nucleotide segment of the HIV genome, sampled from each of almost 8,000 HIV-positive people. Her goal is to identify parts of the population where HIV is spreading quickly.
To deal with this much data, Rosemary has written her own computer program, using a statistical model that has previously been used to find patterns in internet traffic and earthquake readings. Rosemary has also generated thousands of simulations using the data.
This research is in its early stages, but has the potential to yield insights into transmission rates, contact structures, and whether an epidemic is growing. It complements the traditional way of studying outbreaks, which relies on going into the field and asking people whom they have been in contact with.
Rosemary, who originally planned to study computer science, became fascinated with bioinformatics after her second year at Simon Fraser University. Through a co-op program placement, she started working with Dr. Art Poon at the British Columbia Centre for Excellence in HIV/AIDS at St. Paul’s Hospital, and has worked with him ever since.
Data scientists like Rosemary need to have skills from several scientific disciplines. That’s why the Department of Statistics Award in Data Science is open to students in all UBC disciplines. Rosemary is grateful that this is possible. “I was really happy to see that the award recognized that there are applications of data science everywhere,” she says.
Nancy Heckman, Head of the Department of Statistics, says that opening the award to aspiring data scientists in all departments was a deliberate choice by the award donor. “Data science exists in many disciplines at UBC,” she explains. “However, the donor felt that the Statistics Department is the natural place to define and assess data science skills.”
Heckman is extremely pleased to see this award come to UBC. She says, “We welcome this award to not only honour young UBC data scientists, but also to let people know the types of problems that a data scientist can solve. We are at an exciting time for the discipline of statistics. Media attention to big data and data science shows the importance of statistics. The private, public and non-profit sectors recognize their need for individuals—data scientists—to help them make sense of the quantitative information they collect.”
Rosemary agrees with Heckman about the future of data science. “I’m really excited by this award,” she says. “Data science is growing so quickly. Getting more people involved is important.”
Thanks to the generosity of an anonymous donor, the Department of Statistics Award in Data Science has funding for three years.