June 2006

### Why is it important to develop computer savvy students of biology?


**Gross:** A good fraction of modern biology is heavily quantitative, and one of the main complaints that I have heard for a very long time from biologists is that students do not have the capability to carry out the basic quantitative research that is critical to all modern biology across the spectrum.

One of the most distressing comments has been about the simplest things: reading, interpreting, and producing graphs, simple graphs that students have seen again and again since they were in elementary and middle school. Things like bar charts and histograms are common, but the kinds of graphs that come up again and again in science include those that have logarithmic and exponential scalings. These are typically called log-log or semi-log graphs. They are used throughout biology and the rest of science to describe systems, particularly those in which there is an exponential component. An exponential response is something students learn in high school and, from my experience, they never really get an understanding of it. What educators need are good data sets and good ways to explain why these kinds of graphs arise. That includes what are called allometric relationships, such as the relationship between body size and weight in mammals, or that between diameter at breast height and tree height. Such scaling relationships occur in biological systems from the level of individuals to that of entire global systems.
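One way to see why log-log graphs arise is to note that a power-law relationship of the form y = a·x^b becomes a straight line with slope b once both axes are logarithmic, because log y = log a + b·log x. The sketch below is a minimal illustration under stated assumptions: the function name `fit_power_law` is hypothetical, and the data are synthetic values generated from an arbitrary power law (a = 3, b = 0.75), not measurements from any organism.

```python
import math


def fit_power_law(xs, ys):
    """Least-squares line through (log10 x, log10 y); returns (a, b) for y = a * x**b."""
    log_x = [math.log10(x) for x in xs]
    log_y = [math.log10(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    # The slope of the fitted line is the exponent b of the power law.
    sxx = sum((x - mean_x) ** 2 for x in log_x)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_x, log_y))
    b = sxy / sxx
    # The intercept of the fitted line is log10(a).
    a = 10 ** (mean_y - b * mean_x)
    return a, b


# Synthetic, illustrative values only (an assumption, not real data).
sizes = [0.01, 0.1, 1.0, 10.0, 100.0, 1000.0]
responses = [3.0 * s ** 0.75 for s in sizes]

a, b = fit_power_law(sizes, responses)
print(f"scale a = {a:.3f}, exponent b = {b:.3f}")
```

With real measurements the points would scatter around the line, and the fitted slope would be an estimate of the scaling exponent rather than an exact recovery.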

Perhaps the simplest illustration of this point is the basic way we communicate science, which requires a quantitative framework for understanding. I gave the example of graphs, but we also communicate science through symbolic means, from simple equations to more complicated ones.

### How do teachers go about helping students become more computer savvy?

**Gross:** The students that are coming into the upper high school and college levels now are computer savvy in a variety of ways. Certainly they have the capability to use programs such as Word, PowerPoint, and even Excel to a certain extent to carry out analysis because they have been using them for other tasks. Perhaps the easiest way to help is to build on students’ own experience. That means having them carry out some variation of observations or experiments themselves, then having them enter their data into a program like Excel or one of a wide variety of others and carry out some level of analysis. That level of analysis could be very simple, for example, creating scatterplots that relate two variables like length and width or heart rate before and after exercise. One thing that you can do here is to make sure that students are invested in some sense in the data sets that they have collected. There are now lots of tools that are useful in providing sensor input into data loggers, and calculators that directly connect to them, allowing students to collect real-time data on a variety of things.

### What key skills should a teacher help students develop so that they can become good computational biologists?

**Gross:** First I should say that being a good computational biologist is not going to come from just a few exposures. It is going to require considerable exposure to lots of different areas of what we generally call computational science. Simply learning what an algorithm is will not be enough. Working with algorithms is key. It helps you gain an understanding of how any computer program operates, that is, by one step following another in a logical sequence of operations. It is possible to teach computational biology from a wide variety of perspectives, including the cellular level. There are great websites and collections of materials, such as the ones produced by BioQUEST, that provide ways to introduce students to a logical sequence of operations. Even in Excel there is a logical construct of sorts for what goes first and what goes next. It is somewhat hidden, in that people are not running programs in the same way they would if they were using a basic computer language, but these are good things to start with.

Beyond that there are a good variety of resources available on what computational biologists need to know. I even have something on my website about this, but it is not a one shot kind of thing. It requires a long-term, sustained exposure of students to a variety of computational science issues. My preference is to focus on the general biology student and not to target students that are going on to do computational biology. I focus on the key concepts that underlie quantitative approaches in the life sciences and across the board. The *Math & Bio 2010* report has a listing of these concepts, if an educator wants to know how to train what we could call “fearless biologists.” Fearless biologists can go off and do a wide variety of research related to the basic biological questions that they are interested in and not be constrained by their lack of understanding in some discipline.

What you are aiming for is a conceptual foundation for undergraduate education. For example, most science undergraduates take some kind of course in which they learn about calculus and what an instantaneous rate of change is in a very formalized way. But there are other kinds of rates of change, rates of change over a time period, say, and these are not instantaneous at all. A very different underlying mathematical framework is required for analyzing things like the change in population size from one generation to the next, the change in insect populations as the population changes through time, or the change in the number of individuals in a population affected by a disease such as HIV or rabies. That is not something students pick up in a calculus course. What we can do in our biology courses is reinforce these basic quantitative concepts that occur again and again throughout the life sciences. Rates of change are just one example, but there are many others. The whole notion of homeostasis and what equilibrium is in a biological system are concepts that you would hope every undergraduate in life sciences has some conceptual foundation for understanding. Even if they can’t do the underlying mathematics, they should at least understand the notion of equilibrium in a process for which there is no change through time. We hope they have some understanding of what we call dynamic equilibrium, such as a distinct increase in heart rate. Just those basic concepts are important: equilibria, dynamic equilibria, and associated issues of homeostasis. Again, the *Math & Bio 2010* report has a listing of the key conceptual foundations for biology that have quantitative components, and I think that serves as a good guide to what one could encourage students to grasp throughout undergraduate studies or in a general biology program.
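The difference between an instantaneous rate and a change from one generation to the next can be made concrete with a difference equation. The sketch below is one possible illustration, not a model taken from the interview: it assumes a discrete logistic rule, N(t+1) = N(t) + r·N(t)·(1 − N(t)/K), with hypothetical values for the growth rate `r`, the carrying capacity `k`, and the starting population. It tracks the per-generation change and shows that the change shrinks toward zero as the population approaches an equilibrium, the state in which there is no change through time.

```python
def logistic_step(n, r, k):
    """Population in the next generation under discrete logistic growth."""
    return n + r * n * (1.0 - n / k)


def simulate(n0, r, k, generations):
    """Return the list of population sizes for generations 0..generations."""
    sizes = [n0]
    for _ in range(generations):
        sizes.append(logistic_step(sizes[-1], r, k))
    return sizes


# Hypothetical parameters chosen only for illustration.
sizes = simulate(n0=10.0, r=0.5, k=1000.0, generations=40)

# Change over one generation: a finite difference, not a derivative.
changes = [later - earlier for earlier, later in zip(sizes, sizes[1:])]

for t in (0, 5, 10, 20, 39):
    print(f"generation {t:2d}: N = {sizes[t]:8.2f}, change to next = {changes[t]:8.3f}")
```

With these parameter choices the population rises smoothly toward `k`. Other values of `r` can produce oscillations or more complicated behavior in this rule, which is itself a useful point of discussion when contrasting discrete and continuous models.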

### What advantages does a biology student with some computational skills have?

**Gross:** One is, of course, just an expansion of the skill set, in that anytime you learn a new skill you have the potential to utilize that skill. Many of the job opportunities in modern biology require quantitative training, so graduate students should be encouraged to use quantitative skills. In fact, many of my colleagues would argue that it is much easier for students to learn biology if they have quantitative training. It is certainly easier if you develop the skill along the way. There is a movement toward developing programs that train students at the undergraduate level in both quantitative skills and biological skills.

Pick up any issue of *Science* or *Nature* and you will see that the job ads for people with PhDs are looking for people with quantitative skills. In addition to ads for computational biologists, there are technical jobs that require an understanding of what is going on inside machines, if nothing else. Admittedly, there are many technicians who don’t understand what is going on in the machine that they use, but I think that those who are most in demand would be those who have an understanding beyond just pushing a certain button to produce a particular result. Technicians need to know why a result arises, and often that involves sitting down and analyzing the techniques that the piece of equipment utilizes. A piece of equipment can be simple or complicated; for example, in ecology we use a wide variety of photosynthetic measuring devices. The underlying mathematics is not outrageously complicated, but it requires some thought to understand what you are measuring when you measure photosynthesis.

### Any suggestions for high school teachers about adding computational biology skills to the curriculum?

**Gross:** There are wonderful resources, including computer programs and free programs, which are designed to assist students at the high school level [see “learn more links” below]. The vast majority of students entering college programs in biology already have had calculus. They often have already taken a course viewed as a college-level course, and students realize quite well that they need this kind of background. Good biology students today, whether they are interested in the health sciences or in veterinary medicine, ecology, or other areas, realize that quantitative skills will enhance their ability to work in the future.

Therefore, I would encourage students at the high school level to think about data. Having students collect and analyze data gives them quantitative skills that will serve them well in the long term no matter what they do. In addition, many of the students entering college now have had a statistics course in high school. Coupling that statistics course with their biology courses or any lab experience would also be beneficial. Much of what we call bioinformatics relates to statistical methods, and a good grounding in statistics is critical in life science research as well.

I have an entire website devoted to modules aimed at incorporating basic high-school-level mathematics into a general biology framework (these modules are now part of the BIO2010 report, http://www.nap.edu/openbook.php?isbn=0309085357). The set of modules is applicable to both college-level and high-school-level biology because the mathematics that underlies the activities is at the high-school level. The objective in that set of modules is to have cases in which the mathematics adds something that was not possible to see from biological data sets by themselves, that is, to devise a new way of thinking about a biological question.

© 2006, American Institute of Biological Sciences. Educators have permission to reprint articles for classroom use; other users, please contact editor@actionbioscience.org for reprint permission. See reprint policy.