
Web Exclusives: Diseases

Virtual Populations Get More Real
Emily Carlson
Posted May 18, 2011

Image of randomly placed houses within census blocks (top) and more realistic house placement (bottom)
Instead of randomly placing houses and getting a relatively even distribution within census blocks (top), the improved synthetic population uses additional data sources and a statistical method to place houses more realistically (bottom). Credit: MIDAS Information Technology Resource, RTI International

Last April, we reported that researchers at RTI International in North Carolina had created a synthetic population mirroring U.S. demographics to help computational modelers study the spread of infectious diseases. Now, they've added another layer of realism: where the virtual citizens live.

While they may sound more like a tool for gaming, synthetic populations are very useful for disease modeling. By incorporating agents who represent U.S. citizens, modelers can better simulate the spread of an infectious outbreak through a community and identify the best ways to intervene. They also can use synthetic populations to study how certain behaviors may speed up or slow down the spread of an outbreak.

Until now, the RTI synthetic population was based primarily on 2000 census data, such as household sizes, family incomes and residents' ages. Houses weren't placed in the middle of lakes or airports, but they were randomly distributed across census blocks, geographical areas that contain about 100 people and are used to help tabulate the census.

With the availability of geospatial data from satellite imaging, remote sensing and other technologies, the researchers have more realistically plotted where the virtual residents likely reside. They used LandScan USA, a collection of data about road locations, land cover, slope and nighttime lights that also approximates where people live.

This improvement takes advantage of a statistical method called dasymetric mapping, which employs additional data sources to refine spatial resolution. In this case, it helped the RTI researchers use probability to more accurately estimate where people do and don't live within the census blocks. Houses, for instance, are typically built near roads.
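
To make the idea concrete, here is a minimal sketch of weighted placement inside a single census block. It is not RTI's method: the grid layout, the `cell_weight` formula and the distances are all hypothetical, chosen only to show how ancillary data (water, distance to a road) can turn uniform random placement into probability-weighted placement.

```python
import random


def cell_weight(is_water, road_distance_m):
    """Hypothetical weighting rule (an assumption, not from LandScan USA):
    water cells get zero weight; other cells lose weight with road distance."""
    if is_water:
        return 0.0
    return 1.0 / (1.0 + road_distance_m / 100.0)


def place_households(cells, n_households, seed=0):
    """Pick one grid cell per household, with probability proportional
    to each cell's weight. Returns a list of (row, col) tuples."""
    weights = [c["weight"] for c in cells]
    if sum(weights) <= 0:
        raise ValueError("no habitable cells in this block")
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    chosen = rng.choices(cells, weights=weights, k=n_households)
    return [(c["row"], c["col"]) for c in chosen]


def build_example_block(size=4, cell_size_m=100.0):
    """Toy block: column 0 is a lake, and a road runs along the last column."""
    cells = []
    for row in range(size):
        for col in range(size):
            is_water = col == 0
            road_distance_m = (size - 1 - col) * cell_size_m
            cells.append({
                "row": row,
                "col": col,
                "weight": cell_weight(is_water, road_distance_m),
            })
    return cells


block = build_example_block()
homes = place_households(block, n_households=2000)
```

In this toy block, no household lands in the lake column, and cells next to the road receive more households than cells farther away. A real pipeline would derive the weights from actual land cover, slope and road layers rather than a hand-written formula.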

"Knowing where lakes and mountains are located reduces the amount of livable space," explains John Boos, a geospatial research analyst at RTI who incorporated the LandScan USA data. "If you know more specifically where houses are located, you can much more accurately bring spatial processes into modeling activities."

For disease modelers, this means they can more realistically simulate human populations and factors that influence disease spread, such as proximity to neighbors, health care facilities and even wooded areas where disease-carrying mosquitoes may emerge.


This page last reviewed on May 18, 2011