A PROTOTYPE FOR EARLY WARNING SYSTEM TO PREDICT THE LOCAL AREA RISKS OF GLOBAL INFECTIOUS DISEASES
A computer implemented method for predicting the local area impact of the spread of global infectious diseases including the steps of providing on a computer readable medium a global pathogen risk factors database having data stored therein related to local area vulnerability of a group of human pathogens across a plurality of areas, providing on a computer readable medium a global pathogen activity database having data stored therein related to the activity of the group of human pathogens in said plurality of geographies, providing on a computer readable medium a global transport database having data stored therein related to travel patterns in and/or between the plurality of areas, processing by a computer system data on each of the global pathogen risk factors database, the global pathogen activity database and the global transport database to generate a pathogen vulnerability index, a pathogen activity index and a transportability index, and processing by the computer system each of the pathogen vulnerability index, the pathogen activity index and the transportability index to generate a risk indicator indicative of the local area impact of individual global infectious diseases.
New, previously unknown or unrecognized human pathogens are emerging faster than ever before. Furthermore, many existing human pathogens are evolving into new and potentially dangerous forms. At the same time, the world is becoming increasingly interconnected by air travel. Today more than 2.5 billion travelers board commercial flights every year, creating unprecedented opportunities for locally occurring infectious disease events to rapidly transform into international epidemics or global pandemics. Events such as the worldwide SARS outbreak in 2003 and the H1N1 pandemic in 2009 have clearly demonstrated the ease with which pathogens can spread across international borders and threaten human health, security, and economic activity.
Recent technological innovations to confront emerging global infectious diseases have focused on the early detection of potential threats through real-time analysis of massive volumes of Internet data. These innovations include software systems that analyze mass media content (e.g. online news), social media content (e.g. Twitter™), search engine activity (e.g. Google™ Flu trends), and other online communication channels for signs of potentially dangerous infectious diseases around the world. Recently, some of these systems have been coupled with information on global air traffic patterns to predict how a known human pathogen in a specific geography might spread around the world.
These systems, which predict how individual infectious disease threats disseminate globally from a single geography at a defined moment in time, are incapable of forecasting or anticipating how a global array of infectious disease threats present risks to every geography in the world on a continuous real-time basis. There is therefore a need in the art for an improved method and system that is anticipatory in nature, which can forecast the local risks and consequences of continuously evolving global infectious disease activity.
According to one embodiment of the proposal, there is disclosed a computer implemented method for predicting local area risks of global infectious diseases including providing on a computer readable medium a global pathogen risk factors database having data stored therein related to local area vulnerability of individual human pathogens across a plurality of areas, providing on a computer readable medium a global pathogen activity database having data stored therein related to local area activity of the individual human pathogens in said plurality of areas, providing on a computer readable medium a global transport database having data stored therein related to human travel patterns in and/or between a plurality of the local areas, processing by a computer system data on each of the global pathogen risk factors database, the global pathogen activity database and the global transport database to generate a pathogen vulnerability index, a pathogen activity index and a transportability index, and processing by the computer system each of the pathogen vulnerability index, the pathogen activity index and the transportability index to generate a local area risk indicator for individual global infectious diseases.
There is further provided a step of modeling by the computer system each of the plurality of areas as a spatial unit.
The plurality of areas comprises all cities in the world having at least one airport, such that there is stored a unique spatial unit for each city with at least one airport.
Each spatial unit is a function of each city's proximity to neighboring cities with at least one airport and of the corresponding magnitude of air traffic at each airport.
The spatial unit is a Voronoi polygon, other polygon, or any other geographic unit.
The method further includes normalizing by the computer system values in the global pathogen risk factors database, the global pathogen activity database and the global transport database.
The pathogen vulnerability index, the pathogen activity index and the transportability index are also aggregated and normalized into a single unweighted or weighted cumulative risk index.
The normalizing comprises scaling to a value between 0 and 1.
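As an illustration only, the aggregation of the three indices into a cumulative risk index might be sketched as follows. The text does not fix the aggregation formula, so the weighted arithmetic mean used here is an assumption, as are the function name, the dictionary keys and the example values.

```python
from typing import Dict, Optional


def cumulative_risk(indices: Dict[str, float],
                    weights: Optional[Dict[str, float]] = None) -> float:
    """Combine indices already scaled to [0, 1] into one value in [0, 1].

    Assumption: a weighted arithmetic mean; with no weights supplied,
    every index counts equally (the unweighted case).
    """
    if weights is None:
        weights = {name: 1.0 for name in indices}
    total_weight = sum(weights[name] for name in indices)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total weight keeps the result between 0 and 1.
    return sum(indices[name] * weights[name] for name in indices) / total_weight


# Hypothetical index values for one city and one pathogen.
city = {"vulnerability": 0.8, "activity": 0.5, "transportability": 0.2}
unweighted = cumulative_risk(city)
weighted = cumulative_risk(
    city, {"vulnerability": 1.0, "activity": 1.0, "transportability": 2.0}
)
```

Dividing by the sum of the weights is one simple way to keep the combined value on the same 0 to 1 scale as its inputs.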
The data stored on the global pathogen risk factors database related to the local area vulnerability of individual human pathogens includes a list of individual risks and risk factors applied to each of the individual risks; and wherein the generating of a cumulative pathogen vulnerability index comprises assigning and scaling an index to each of the one or more risk factors for each pathogen and then calculating an average or weighted average.
The list of individual risks and risk factors are further provided for each pathogen in the group of human pathogens.
The data stored in the global pathogen activity database includes worldwide data on the activity of pathogens derived from one or more different sources.
The one or more different sources are selected from the group comprising official government reporting, reporting from medical and public health professional networks, mass media news sources, portable diagnostic devices, mobile applications, Internet search activity and social media.
The generating of a cumulative pathogen activity index comprises assigning and scaling an index to each of the one or more different information sources for each pathogen and then calculating an average or a weighted average.
The transportability index is calculated by assigning a geography index between each and every city based on the number of inbound travelers expected to arrive.
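A minimal sketch of such a calculation is given below, assuming the expected inbound traveler counts are already available for each origin and destination pair. Scaling by the largest observed count is an assumption, not a formula stated in the text, and the city names and counts are hypothetical.

```python
from typing import Dict, Tuple

CityPair = Tuple[str, str]  # (origin city, destination city)


def transportability_index(
    expected_travelers: Dict[CityPair, float]
) -> Dict[CityPair, float]:
    """Scale expected inbound traveler counts per city pair to [0, 1].

    Assumption: each count is divided by the largest count over all pairs.
    """
    peak = max(expected_travelers.values(), default=0.0)
    if peak <= 0:
        return {pair: 0.0 for pair in expected_travelers}
    return {pair: count / peak for pair, count in expected_travelers.items()}


# Hypothetical expected travelers arriving at the destination from the origin.
flows = {("A", "B"): 1200.0, ("A", "C"): 300.0, ("B", "C"): 0.0}
index = transportability_index(flows)
```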
There is provided a computer system for predicting the local area risks of global infectious diseases including a computer readable medium having a global pathogen risk factors database with data stored therein related to local area vulnerability of individual human pathogens across a plurality of areas, a computer readable medium having a global pathogen activity database having data stored therein related to the local area activity of the individual human pathogens in said plurality of areas, a computer readable medium having a global transport database including data stored therein related to human travel patterns in and/or between the plurality of local areas, computer executable instructions executed by the computer system for processing by a computer system data on each of the global pathogen risk factors database, the global pathogen activity database and the global transport database to generate a pathogen vulnerability index, a pathogen activity index and a transportability index, and computer executable instructions executed by the computer system for processing by the computer system each of the pathogen vulnerability index, the pathogen activity index and the transportability index to generate a local area risk indicator for individual global infectious diseases.
The system includes computer executable instructions for carrying out the method as herein described.
A novel system and method to create a real-time global early warning system for infectious diseases in accordance with an embodiment of the proposal will now be described. The system continuously identifies risks from each major human pathogen to each city in the world, at any moment in time, by finding the spatiotemporal convergence of, inter alia, global, local, and pathogen-specific risk factors that could impact the health of a population.
Broadly, the hereinafter described proposal provides for real-time systems that can continuously integrate and proactively synthesize knowledge of:
i) Worldwide vulnerability to individual pathogens, ii) worldwide activity of individual pathogens, and iii) worldwide connectivity through travel that could spread pathogens between global geographies. Synthesized intelligence from the proposal could then be used by cities around the world in real-time to anticipate, and consequently prevent or mitigate, the local risks and consequences of infectious disease threats before they occur.
The system of the proposal dynamically produces three indices or scores, which in combination offer all cities with at least one airport the ability to understand their local vulnerability to individual human pathogens, and their associated risk of importation of those pathogens from other parts of the world. While the description refers to cities in the world, it will be understood by a person skilled in the art that the invention may be implemented in a subset of cities, or in a predefined geographical area.
First, the global pathogen risk factors database is preferably populated with human, pathogen, environmental and medical diagnostic and therapeutic features of important human pathogens. The global pathogen risk factors database includes environmental data that may be used, including the presence of an essential animal in the pathogen life cycle (such as poultry in the case of avian flu) and environmental humidity (such as in the case of influenza). The global pathogen activity database is preferably populated with one or more of the following pertaining to each pathogen: official government notifiable disease surveillance data, online real-time news (e.g. GPHTN, HealthMap, MediSys), communications from medical and public health professional networks (e.g. Pro-MED mail), real-time social media content, test results from point of care diagnostic devices, self-reported syndromes inputted via mobile health web-applications, and Internet search engine activity (e.g. Google™ Flu Trends). For definition purposes, variants of the same microorganism (e.g. a drug resistant or highly pathogenic form) are considered unique pathogens. Finally, the global transport database is preferably populated with data on worldwide flight schedules (e.g. Official Airline Guide), worldwide airline passenger ticket sales and flight itineraries (e.g. International Air Transport Association), and real-time aircraft-level flight data (e.g. FlightStats).
Standardized indices are created from the pathogen risk factors, pathogen activity and transport databases by rescaling or normalizing all dataset values associated with each pathogen and with human travel to a defined city. For example, in the case of vulnerability to cholera, the percentage of the population with access to clean water in each city in the world is transformed into a relative value for that city, scaled between 0 and 1. These indices are produced by accessing stored data from memory in the above databases, using computer processors to analyze datasets by applying predefined statistical rescaling algorithms, and then storing calculated results to computer memory. This process is repeated at frequent scheduled intervals. For any pathogen at any point in time, the above three indices are combined to offer every city in the world intelligence about its local vulnerability to that pathogen and its associated risk of importation from other parts of the world.
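By way of a hedged example, the cholera rescaling could be sketched with min-max scaling. The text speaks only of "predefined statistical rescaling algorithms", so min-max scaling is an assumption here, as is the inversion step (higher clean-water access is taken to mean lower vulnerability). The city names and percentages are hypothetical.

```python
from typing import Dict


def rescale(values: Dict[str, float], invert: bool = False) -> Dict[str, float]:
    """Min-max rescale per-city values to the range [0, 1].

    With invert=True the scale is flipped, so the city with the highest
    raw value receives 0 and the city with the lowest raw value receives 1.
    """
    lo, hi = min(values.values()), max(values.values())
    if hi == lo:
        # All cities share the same value; no relative difference to express.
        return {city: 0.0 for city in values}
    scaled = {city: (v - lo) / (hi - lo) for city, v in values.items()}
    if invert:
        scaled = {city: 1.0 - s for city, s in scaled.items()}
    return scaled


# Hypothetical percentages of the population with access to clean water.
clean_water_access_pct = {"City X": 100.0, "City Y": 80.0, "City Z": 60.0}
cholera_vulnerability = rescale(clean_water_access_pct, invert=True)
```

In this sketch City Z, with the lowest access to clean water, ends up with the highest relative vulnerability value.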
The invention then produces and publishes secure reports on these risks to share with end users using a combination of visualization methods including data tables and static or animated charts, graphs, and maps.
Creating a Spatial Unit for Analysis:
As a precursor to the generating of the indices mentioned above, creating a spatial unit for analysis is required, which is then used to define local areas within the overall geography being analyzed, which in the preferred embodiment is the world as a whole. One way in which a spatial unit can be created is by way of the use of Voronoi polygons. According to the preferred embodiment, Voronoi polygons are created to deconstruct the world's land geography into distinct areas around cities with airports, which then serve as spatial units for all numerical calculations. In the invention, Voronois approximate "airport catchment geographies", i.e. the maximum distances individuals would be expected to travel by ground to fly out of an airport or travel by ground from an airport to their final destination. A global view of Voronois weighted by air traffic volume. Each Voronoi is created around the geographic coordinates of a city with at least one airport and no two Voronois overlap. The size and shape of each Voronoi is a function of each city's proximity to neighboring cities with at least one airport and its magnitude of air traffic. Given the approximately 4000 cities in the world with airports, a corresponding number of Voronois is used to cover the world's land geography. Although traffic weighted Voronoi polygons are described to separate the world's land geography into distinct spatial units in this application, other established techniques to generate spatial units may also be used.
Voronoi polygons are known in mathematics, but their application to the field of the invention is thought to be inventive. In particular, their use in combination with other aspects of the invention is especially advantageous in generating a transportability index, as will be discussed in more detail below. The world geography is modeled using Voronoi polygons that apply a weighting factor for airport traffic volume in each city. This way, irrespective of the population size of the city, the effect of travel between cities is determined based on air traffic.
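One possible way to picture a traffic-weighted Voronoi partition is to assign every location to the city whose distance, divided by a traffic weight, is smallest (a multiplicatively weighted Voronoi rule). The description does not specify the weighting scheme, so this rule, the haversine distance, and the city names, coordinates and weights below are all assumptions made for illustration.

```python
import math
from typing import Dict, Tuple

EARTH_RADIUS_KM = 6371.0
LatLon = Tuple[float, float]


def haversine_km(a: LatLon, b: LatLon) -> float:
    """Great-circle distance in kilometers between two (lat, lon) points."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def assign_catchment(point: LatLon,
                     cities: Dict[str, Tuple[LatLon, float]]) -> str:
    """Return the city whose traffic-weighted distance to `point` is smallest.

    `cities` maps a city name to ((lat, lon), traffic_weight), weight > 0.
    A larger weight enlarges that city's catchment area.
    """
    return min(
        cities,
        key=lambda name: haversine_km(point, cities[name][0]) / cities[name][1],
    )


# Two hypothetical airport cities on the equator, 10 degrees apart.
weighted_cities = {"Hub": ((0.0, 0.0), 3.0), "Small": ((0.0, 10.0), 1.0)}
equal_cities = {"Hub": ((0.0, 0.0), 1.0), "Small": ((0.0, 10.0), 1.0)}

with_traffic = assign_catchment((0.0, 6.0), weighted_cities)
without_traffic = assign_catchment((0.0, 6.0), equal_cities)
```

Because every location is assigned to exactly one city, the resulting units do not overlap. In the example, the point lies closer to "Small" but falls within the catchment of "Hub" once traffic weighting is applied.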
Quantifying the Vulnerability of Cities to Human Pathogens.
Connecting awareness of global infectious disease activity with global population mobility via air travel generates insights into the risks of infectious disease importation for any defined area. However, this does not offer insights into the potential local area impact of a pathogen that is introduced from another area of the world. Defining and quantifying the potential local impact to human health, biosecurity, and/or economic activity requires information synthesis across the following four domains, data relating to which may be used to populate the global pathogen risk factors database 25.
The global pathogen risk factors database may be populated with data pertaining to:
- Pertinent characteristics of individual pathogens;
- Available medical countermeasures against individual pathogens;
- Pertinent characteristics of the host population; and
- Pertinent characteristics of the environment.
The population health impact, that is, the vulnerability of a particular city to imported pathogens, is a function of the above risk factors.
Pertinent characteristics of individual pathogens may include their communicability, mode of transmission, reproduction number, attack rate, virulence and antimicrobial resistance. Pertinent characteristics of the host population may include the population size, population density, age structure/demographics, population behavior, disease specific immunity and general immune competence. Pertinent characteristics of the environment may include aspects of the healthcare system, public health capacity, physical infrastructure, animal/insect populations and climatic conditions.
Finally, pertinent characteristics of available medical countermeasures may include the availability/effectiveness of diagnostic tools, the availability/effectiveness of antimicrobials for pre or post exposure prophylaxis or treatment, and the availability/effectiveness of vaccines.
In this database, each risk factor is matched to a high quality data source that is a surrogate marker for that risk factor. For example, cholera, which spreads through fecal contamination of food or water, may be matched to the World Bank indicator "Population Access to Improved Sanitation Facilities", which represents the percentage of a population with at least adequate access to excreta disposal facilities that can effectively prevent human, animal, and insect contact with excreta across rural and urban geographies worldwide. Data values pertaining to the pathogen and corresponding medical countermeasures are derived from the expert opinion of clinical infectious disease specialists and databases pertaining to national healthcare resources and systems (e.g. World Bank), whereas data values pertaining to the host population and environment are derived from high quality third-party data sources with global coverage (e.g. the World Health Organization, World Bank, Food and Agriculture Organization of the United Nations, the National Aeronautics and Space Administration, the National Oceanic and Atmospheric Administration, etc.). Data from these sources are pre-processed, stored in computer memory, and updated with the greatest available frequency. In certain instances, data will be updated and saved to the database in real-time (e.g. Climate data from satellites).
To produce a standardized vulnerability index for each pathogen across each city worldwide, each set of risk factor values for a pathogen is rescaled between 0 and 1 (lowest to highest risk). This process is achieved by applying a statistical rescaling algorithm to each set of risk factor values using a computer processor, with results saved to a computer readable medium. Where there are multiple sets of risk factors for a given pathogen, each set of risk factor values is independently rescaled, the rescaled sets are aggregated, and then the sum is rescaled again (i.e. to create a single vulnerability index for each pathogen across each city worldwide). The default rescaling process (where multiple risk factors are involved) is unweighted, with optional weighting by users if they deem certain risk factors to be of greater significance than others. Since risk factor values for pathogens continuously change over time, vulnerability indices for each city also change over time. The rescaling process is repeated for each pathogen and its corresponding risk factor values until all pathogens in the invention are processed.
The entire process is also repeated at frequent scheduled intervals (e.g. daily) with results saved to memory.
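The rescale, aggregate and rescale-again sequence described above may be sketched as follows. Min-max scaling is assumed as the statistical rescaling algorithm, every risk factor is assumed to be oriented so that higher values mean higher risk, and the factor names and values are hypothetical.

```python
from typing import Dict, List, Optional


def minmax(values: List[float]) -> List[float]:
    """Rescale a list of values to [0, 1]; constant lists map to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def vulnerability_index(risk_factors: Dict[str, List[float]],
                        weights: Optional[Dict[str, float]] = None) -> List[float]:
    """Per-city vulnerability index for one pathogen.

    `risk_factors` maps a factor name to per-city values in a shared city
    order. Each factor is rescaled on its own, the rescaled values are
    summed per city (optionally weighted), and the sums are rescaled again.
    """
    names = list(risk_factors)
    if weights is None:
        weights = {name: 1.0 for name in names}  # default: unweighted
    scaled = {name: minmax(risk_factors[name]) for name in names}
    n_cities = len(risk_factors[names[0]])
    sums = [sum(weights[name] * scaled[name][i] for name in names)
            for i in range(n_cities)]
    return minmax(sums)


# Hypothetical risk factor values for three cities (same order in each list).
factors = {
    "population_density": [100.0, 500.0, 900.0],
    "lack_of_sanitation_pct": [10.0, 50.0, 20.0],
}
index = vulnerability_index(factors)
density_weighted = vulnerability_index(
    factors, {"population_density": 2.0, "lack_of_sanitation_pct": 1.0}
)
```

With equal weights the second city ranks highest. Doubling the weight on population density moves the third city to the top, which illustrates how user-supplied weighting can change the ordering.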
Alternatively, to evaluate local conditions across the above domains within a potential geographic area (e.g., Voronoi Cairo Region or other regional geography such as a state or province), a second, more spatially precise geographic unit of analysis may be used (i.e. hexagons that are 50 km across parallel sides and approximately 6500 km² in area). At this shape and size, there are approximately 163,000 hexagons that cover the world's land area. While the hexagon shape and stated size is currently used, other polygon shapes and/or sizes may be used in the future. This is because the spatial resolution of data sources may change over time, and smaller spatial units would offer more precise information.
In addition to evaluating the immediate impact of pathogen importation, the invention determines if the unique life cycles of individual pathogens can be completed within a local area, by analyzing data from the global pathogen risk factors database that pertain to each component of the pathogen life cycle. This allows for a determination of whether local conditions are sufficiently present for a pathogen to find a "hospitable" environment, where it can then propagate (e.g., West Nile Virus completed its life cycle and established itself in New York City in 1999 for the first time, after which it spread widely across North America). Schematics of how the risks of infectious disease importation and subsequent local transmission would be evaluated by the invention are shown for pathogens that can spread directly from person-to-person, through contaminated water sources, from animal populations (zoonoses), and insects as examples.
For data that is more spatially precise than a single Voronoi polygon or hexagon, information is aggregated (e.g. precipitation data at the 1 km level is aggregated up to the Voronoi or hexagon unit level). Conversely, for data that is less spatially precise than a single Voronoi or hexagon, the data is attributed to all Voronois or hexagons within the larger spatial unit (e.g. a report of an epidemic at the state level would be attributed to all Voronois or hexagons within that particular state). The temporal resolution of all data is retained in its most precise form.
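The two directions of spatial harmonization described above can be sketched as follows. The use of a mean for aggregation is an assumption (a sum or another statistic may suit some datasets better), and the cell, unit and region identifiers are hypothetical.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Hashable, TypeVar

V = TypeVar("V")


def aggregate_up(cell_values: Dict[Hashable, float],
                 cell_to_unit: Dict[Hashable, Hashable]) -> Dict[Hashable, float]:
    """Average fine-grained cell values within each Voronoi or hexagon unit."""
    grouped = defaultdict(list)
    for cell, value in cell_values.items():
        grouped[cell_to_unit[cell]].append(value)
    return {unit: mean(values) for unit, values in grouped.items()}


def attribute_down(region_values: Dict[Hashable, V],
                   unit_to_region: Dict[Hashable, Hashable]) -> Dict[Hashable, V]:
    """Copy each region-level value to every spatial unit inside that region."""
    return {unit: region_values[region]
            for unit, region in unit_to_region.items()
            if region in region_values}


# Fine-grained precipitation cells aggregated up to two spatial units.
precipitation_mm = {"cell1": 10.0, "cell2": 20.0, "cell3": 5.0}
cell_to_unit = {"cell1": "U1", "cell2": "U1", "cell3": "U2"}
unit_precipitation = aggregate_up(precipitation_mm, cell_to_unit)

# A state-level epidemic report attributed to every unit within that state.
epidemic_reported = {"State S": True}
unit_to_region = {"U1": "State S", "U2": "State S", "U3": "State T"}
unit_epidemic = attribute_down(epidemic_reported, unit_to_region)
```

Units in regions with no report are simply left out of the result here rather than being given a default value.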
A. Pathogen Characteristics.
In the system, a catalogue of infectious pathogens is created and used to define features for each pathogen such as the following:
- Does the pathogen cause a communicable or a non-communicable disease?
- What is its primary (and, if applicable, secondary) mode(s) of transmission?
- What living systems and/or environmental factors are necessary for the pathogen to complete its life cycle?
- What is its basic reproduction number (i.e. how easily can it spread)?
- How virulent is the pathogen in terms of its morbidity and mortality?
- Is antimicrobial resistance an issue with this pathogen?
Answers to these and related questions are derived from expert domain knowledge of the life cycles, epidemiological patterns, and medical aspects of diagnosing, treating, and preventi