Infection Modeling — Part 2Optimizing a Vaccination Strategy with Genetic AlgorithmsMark DitsworthBlockedUnblockFollowFollowingJan 7In part 1, we modeled the spread of an infectious pathogen through a social network.
Monte Carlo simulations allowed for the calculation of a probability distribution of R0, the basic reproduction number.
In practice, public health officials would then try to develop a plan to lower the impact through vaccination or quarantine.
By definition, R0 is an intrinsic property of an outbreak and does not change once calculated, as it is calculated assuming a fully susceptible population.
The effective reproduction number, R, represents the average number of secondary infections in a population that may include immune members.
This is the metric that is desired to be minimized through vaccination campaigns.
PDF of calculated R0 value from the Monte Carlo simulations of the epidemic from part 1.
Part 1 showed that R0 is expected to be between 2 and 3 for our example network and pathogen, as shown in the distribution of the Monte Carlo simulation results in Figure 1.
In this article and the next, we will model the effects of vaccination in the network.
Specifically, we will look at the scenario of a shortage in the vaccine supply and attempt to answer one question: how can we best protect the population given that only 25% of the members can be vaccinated?Of the 62-node network, only 15 will receive a vaccine (thereby being removed from the network) before modeling the spread of the pathogen with the same method as in part 1.
The resulting (averaged) R value will then be calculated to ascertain the effectiveness of that vaccination policy.
The goal is to lower R below 1, indicating that the pathogen is likely to spread slower than the rate the infected recover.
Genetic Algorithm OptimizationIdentifying the best nodes to vaccinate could be accomplished through brute-force combinatorial search, although this would take quite a while (there are 93 x 10¹² combinations to check in this case).
A genetic algorithm (GA) is a bio-inspired search algorithm that narrows the search space by mimicking natural selection.
A number of candidate solutions (also referred to as chromosomes) join a mating pool with a probability proportional to their performance on the objective function.
The mating pool produces offspring with a specified chance of mutation which are then evaluated on the objective function, restarting the evolutionary process.
In this case, the GA was programmed to use a population of 20 candidate solutions to minimize the R value of the pathogen in the network.
Figure 2 shows the evolution of the candidate solutions over 20 iterations.
Evolution of R for the GA population.
The mating pool is formed with the tournament method.
Instead of assigning a fixed probability of mutation, mutation occurs when crossover results in duplication.
After the completion of the GA routine, the best policy resulted in an R of about 0.
98 on average by vaccinating nodes 0, 7, 8, 9, 10, 13, 17, 19, 21, 25, 29, 30, 38, 43, and 49.
The Monte Carlo simulation results from implementing this policy are shown below in Figure 3.
Monte Carlo simulation results for the optimal policy found via GA.
Averaged SIR response (upper).
Distribution of total infected nodes (lower).
The average number of infected nodes is about 27% of the population (13 nodes), though the number is roughly uniformly distributed between 3–15, at which point the likelihood begins to decrease.
The probability distribution of R is shown below in Figure 4, and roughly fits to a normal distribution.
The value of R achieved with this vaccination policy is most likely to be between 0.
68 and 1.
PDF of calculated R values fit to normal distribution.
ConclusionsImplementing GA to reduce the impact of the pathogen on the network was successful.
After 20 iterations, the best vaccination policy resulted in a 32% to 77% decrease in expected infectiousness.
However, the upper range of the expected R remains above 1, which is undesirable considering our goal is to move R as far below 1 as possible.
Additionally, performing genetic algorithm optimization on Monte Carlo simulations is expensive.
With 20 chromosomes, 20 GA iterations, and 1,000 simulations per Monte Carlo simulation, a total of 400,000 simulations were computed.
It took my machine just over 48 hours to finish.
Keep in mind that this example uses a small 62-node social network.
At a larger scale, a decently sized cluster would be needed to attain results in a timely manner.
Part 3 will illustrate how network science tools perform this optimization much faster (and better) than GA optimization.