Structure of a Monte Carlo simulation for HRH
PACE-HRH calculates the amount of time that would be required to deliver planned health services for a baseline population. It has two components: an Excel workbook that organizes data inputs, and an R package that performs the modeling computations. The model is built up from a comprehensive set of input sheets, typically set up to be representative of a single geography (e.g., a province). The inputs revolve around demographics, disease profiles (including seasonality), tasks to be completed, and responsibilities by cadre.
A list of tasks based on the service packages is the core of the model. There are five basic types of tasks that a user may need to include in their model, including both clinical tasks, such as diagnostics and treatment for infectious diseases, and non-clinical tasks, such as travel time and report completion.
Examples of these five types of tasks and how they are implemented are shown in Table 1. The incidence rate is the proportion of the relevant population to which a task applies in the starting year. The annual change rate is the rate at which the incidence changes from one year to the next (i.e., the ratio of the year N value to the year N-1 value). Weekly workload applies to tasks that take the same amount of time regardless of the catchment population or disease prevalence. While the baseline assumption of PACE-HRH is 100% delivery of planned health services, the model also allows a service utilization rate or service coverage rate to be incorporated.
Table 1
Implementation of different types of tasks. PACE-HRH has a flexible interface that can accommodate many types of tasks that fall into a healthcare worker’s workload. The same input structure can be used for the first four categories listed in the table. Variable values are specific to the task item. 100% incidence is used for tasks that are applicable to all patients of a given age group. A 1.0 annual change rate indicates that the incidence rate applied to the population is fixed for the modeled period.
Task type | Incidence rate | Annual change rate | Weekly workload | Examples |
Clinical, incidence-based | variable | variable | - | Malaria testing; FP counseling; HIV treatment |
Clinical, all individuals | 100% | 1.0 (fixed) | - | Immunization; Cancer screening; Nutrition checks |
Clinical, age-banded | variable, broken out by age group | 1.0 (fixed for a given age group) | - | Diabetes; Hypertension; Cancer |
Non-clinical, population-dependent | variable | 1.0 (fixed per population) | - | Travel time; Education programs; AEFI reporting |
Non-clinical, fixed per week | - | - | variable | DHIS2 reports; Taking inventory; Training courses |
The annual workload time is calculated for each task based on its input parameters, and then aggregated. The model uses a statistical sampling technique, Monte Carlo simulation, to incorporate uncertainty into the forward projections of populations and disease incidences (21). PACE-HRH samples from the prescribed distributions for parameters including fertility and mortality rates, disease incidence rates, and time per contact (Fig. 1). Stochasticity is incorporated into the projections by sampling the annual change rate in each year, which creates a random walk effect for each incidence, fertility, and mortality rate. The default distributions are: uniform for the year-1 incidence rates, truncated normal for the annual change rate, and lognormal for time per contact (Additional file 1, Table 1). Users can modify the distributions by editing their local code base.
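The random walk described above can be illustrated with a minimal Python sketch. It is not the PACE-HRH implementation (which is an R package); the function names, the ±10% spread on the year-1 value, and the standard deviation of the annual change rate are all hypothetical values chosen for illustration. Only the choice of distributions (uniform for the year-1 rate, truncated normal for the annual change rate) follows the text.

```python
import random


def truncated_normal(rng, mean, sd, low, high):
    """Draw from a normal distribution, rejecting values outside [low, high]."""
    while True:
        x = rng.gauss(mean, sd)
        if low <= x <= high:
            return x


def simulate_incidence(rng, base_rate, change_rate, years,
                       base_spread=0.1, change_sd=0.01):
    """One stochastic trajectory of an incidence rate.

    base_spread and change_sd are illustrative assumptions, not PACE-HRH defaults.
    """
    # Year-1 value: uniform around the configured rate.
    rate = rng.uniform(base_rate * (1 - base_spread), base_rate * (1 + base_spread))
    trajectory = [rate]
    for _ in range(years - 1):
        # A freshly sampled change rate each year gives a multiplicative random walk.
        step = truncated_normal(rng, change_rate, change_sd, 0.0, 2.0)
        rate = min(rate * step, 1.0)  # incidence is a proportion, so cap at 1.0
        trajectory.append(rate)
    return trajectory


def run_trials(n_trials, seed=0, **kwargs):
    """Run independent trajectories from one seeded generator for reproducibility."""
    rng = random.Random(seed)
    return [simulate_incidence(rng, **kwargs) for _ in range(n_trials)]


trials = run_trials(200, base_rate=0.15, change_rate=0.98, years=10)
final_year = sorted(t[-1] for t in trials)
median_final = final_year[len(final_year) // 2]
```

Summarizing the final-year values across trials (here, the median) is one way to turn the ensemble of trajectories into a reportable range.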
Figure 1. The PACE-HRH model structure. The demographics model projects forward the population age pyramid on an annual basis, based on fertility and mortality by age and sex. The Task Time model utilizes the demographics to estimate the workload by task and then reports out the totals. The tasks can optionally have a seasonality curve applied to them, to reflect within-year variation in demand for services. In addition, the workload can optionally be allocated to different cadres of workers, based on the skills required for a given task.
In addition to running baseline simulations, PACE-HRH supports more complex analyses, such as input sensitivity analysis and task shifting of responsibilities from one cadre to another.
A user conducts seasonality analysis by applying condition-specific seasonality curves to relevant services. Some services require multiple contacts with the healthcare system, including contacts that are due before or after the month of condition onset. For example, a child diagnosed with malnutrition may require multiple visits to ensure recovery. To apply seasonality curves properly to the relevant tasks, the user specifies an offset value for each service contact. For example, a birth seasonality curve can be applied to all pregnancy-related tasks, including antenatal care (ANC) visits. The first ANC visit is recommended 7 months prior to the expected birth (22), giving this task a seasonality offset value of -7.
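A short Python sketch may help make the offset mechanism concrete. It assumes that a seasonality curve is a list of 12 monthly shares summing to 1, and that contacts shifted past a year boundary wrap around to the other end of the year. Both are simplifying assumptions of this example, and the function name and birth curve are hypothetical.

```python
def apply_seasonality(annual_contacts, curve, offset_months=0):
    """Spread an annual contact count over 12 months.

    curve: 12 monthly shares of the underlying event (e.g. births), summing to 1.
    offset_months: months between the event and the contact; a negative value
    means the contact happens before the event. Wrapping across the year
    boundary is a simplification made in this sketch.
    """
    if len(curve) != 12 or abs(sum(curve) - 1.0) > 1e-9:
        raise ValueError("curve must have 12 values that sum to 1")
    monthly = [0.0] * 12
    for event_month, share in enumerate(curve):
        contact_month = (event_month + offset_months) % 12
        monthly[contact_month] += annual_contacts * share
    return monthly


# Hypothetical birth curve with a peak in September (index 8).
birth_curve = [0.07, 0.07, 0.07, 0.08, 0.08, 0.08,
               0.09, 0.10, 0.12, 0.09, 0.08, 0.07]

# First ANC visits fall seven months before the birth they relate to.
anc1 = apply_seasonality(1200, birth_curve, offset_months=-7)
peak_month = anc1.index(max(anc1))  # index 1, i.e. February
```

With an offset of -7, the September birth peak produces a February peak in first ANC visits, while the annual total is unchanged.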
Task-shifting analysis is set up by providing alternative sets of inputs that specify the task-to-cadre allocations. The flexible set-up of tasks in PACE-HRH makes it easy to assign a task to a different cadre and to test different combinations of task assignments.
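As an illustration only, the comparison of two task-to-cadre allocations can be sketched in Python as follows. The data layout (dictionaries keyed by task identifier), the workload figures, and the cadre names are hypothetical and do not reflect the structure of the PACE-HRH input sheets.

```python
def workload_by_cadre(task_minutes, allocation):
    """Split task workloads across cadres.

    task_minutes: {task_id: annual minutes}
    allocation:   {task_id: {cadre: share}}, with shares per task summing to 1.
    """
    totals = {}
    for task, minutes in task_minutes.items():
        shares = allocation[task]
        if abs(sum(shares.values()) - 1.0) > 1e-9:
            raise ValueError(f"allocation for {task} does not sum to 1")
        for cadre, share in shares.items():
            totals[cadre] = totals.get(cadre, 0.0) + minutes * share
    return totals


# Hypothetical annual workloads in minutes.
task_minutes = {"ANC.1": 60000.0, "FP.consult": 40000.0}

baseline = {
    "ANC.1": {"Nurse": 0.75, "Midwife": 0.25},
    "FP.consult": {"Nurse": 1.0},
}
# Alternative scenario: family planning consults move to a community health worker.
shifted = {**baseline, "FP.consult": {"CHW": 1.0}}

before = workload_by_cadre(task_minutes, baseline)
after = workload_by_cadre(task_minutes, shifted)
```

Comparing `before` and `after` shows how much time the shift frees up for the original cadre and how much it adds to the receiving one.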
Parameterization And Recommended Assumptions
The Task Time model configuration starts from a list of tasks based on the planned health services, with input parameters specified for each task as described in Table 2.
Table 2
Input parameters for the PACE-HRH task list. The model calculates service time for each task based on these parameters, and every task in the model must be completely specified before the model can run. Cadre allocation is optional. The Demographics Model is built into PACE-HRH and it requires input data on population composition by age and gender, current fertility and mortality rates, and the annualized future rate of change for fertility and mortality rates (Table 3).
Task list parameters | Descriptions | Example |
Task identifier | A unique identifier for each task | ANC.1 |
Relevant population | The population to which this task applies | children under 5, pregnant women |
Incidence/prevalence rate | Proportion of the relevant population in need of care (always ≤ 1.0) | 0.15 (i.e., 15% of population) |
Annual change in incidence rate | The year-on-year rate of change in incidence/prevalence rate (typically between 0.95 and 1.05) | 0.98 (i.e., 2% decline per year) |
Number of contacts | The required number of contacts with the health system, per person served, to fulfill the service | 1 test per suspected malaria case |
Adjustment factor | Optional multiplier applied to the total number of services delivered | HIV testing = 4x the incidence rate to identify a case (i.e., 25% positivity) |
Time per contact | The amount of time (in minutes) required to complete the task | 15 minutes per family planning consult |
Cadre allocation | The cadre of health workers the task is allocated to; these can be specified to change over time | Nurse 75%, Midwife 25% |
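To show how the parameters in Table 2 could combine into a service time, the following Python sketch multiplies them together for a single task and year. The formula is an assumption made for illustration (population × incidence × contacts × adjustment factor × time per contact, with the incidence compounded by the annual change rate); the paper does not state the exact expression, and the class and function names are hypothetical. The sketch is deterministic and holds the population fixed.

```python
from dataclasses import dataclass


@dataclass
class Task:
    task_id: str
    incidence: float            # proportion of the relevant population, <= 1.0
    annual_change: float        # year-on-year ratio, e.g. 0.98
    contacts: float             # contacts per person served
    minutes_per_contact: float
    adjustment: float = 1.0     # optional multiplier on services delivered


def annual_minutes(task, population, year):
    """Workload in minutes for `task` in model year `year` (year 1 = baseline).

    Assumed formula for illustration; not taken from the PACE-HRH source code.
    """
    incidence = min(task.incidence * task.annual_change ** (year - 1), 1.0)
    services = population * incidence * task.contacts * task.adjustment
    return services * task.minutes_per_contact


fp = Task("FP.consult", incidence=0.15, annual_change=0.98,
          contacts=1, minutes_per_contact=15)
year1 = annual_minutes(fp, population=2000, year=1)  # 2000 * 0.15 * 15 minutes
year3 = annual_minutes(fp, population=2000, year=3)  # incidence reduced by 0.98 twice
```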
Table 3
Demographic input parameters for PACE-HRH. PACE-HRH projects forward populations using the demographic inputs listed here.
Data item | Data description | Potential data source |
Population composition | Number of people in each one-year age bin, from 0 to 100, by gender | National Census, National Statistics Office, UN Population (23) |
Age-specific fertility rate | Annualized birth rate per 100 women in year 1, broken out across age 15–49 years in five-year age groups | National Census, National Statistics Office, the DHS program (24) |
Age-specific mortality rate | Annualized mortality rate per 100 individuals in year 1, broken out by gender and relevant age group | National Statistics Office, WHO Mortality Database (25), UN Population (23) |
Annual change in fertility rate | Annualized ratio of fertility in year n + 1, divided by fertility in year n, by appropriate age group | Calculated from recent historical data |
Annual change in mortality rate | Annualized ratio of mortality in year n + 1, divided by mortality in year n, by appropriate age group | Calculated from recent historical data |
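One way to picture how these inputs drive a projection is a single cohort-component step, sketched below in Python. This is a generic textbook-style step under stated assumptions, not the PACE-HRH demographics model: rates are expressed per person rather than per 100, half of births are assumed to be female, newborns are not exposed to mortality in their birth year, and the oldest age bin is open-ended. All names are hypothetical.

```python
def project_one_year(female, male, fertility, mortality_f, mortality_m,
                     share_female_births=0.5):
    """Advance one-year age bins by one year.

    female, male: counts indexed by age (0..max_age).
    fertility: births per woman per year, indexed by age (0 outside 15-49).
    mortality_f, mortality_m: annual probability of death, indexed by age.
    share_female_births is an assumption of this sketch.
    """
    births = sum(w * f for w, f in zip(female, fertility))

    def age_forward(pop, mortality, newborns):
        survivors = [n * (1 - m) for n, m in zip(pop, mortality)]
        # Everyone moves up one bin; the oldest bin also keeps its own survivors.
        aged = [newborns] + survivors[:-1]
        aged[-1] += survivors[-1]
        return aged

    new_female = age_forward(female, mortality_f, births * share_female_births)
    new_male = age_forward(male, mortality_m, births * (1 - share_female_births))
    return new_female, new_male


# Toy inputs: 100 people per age bin, flat rates.
ages = range(101)
female0 = [100.0] * 101
male0 = [100.0] * 101
fertility0 = [0.1 if 15 <= a <= 49 else 0.0 for a in ages]
mortality0 = [0.01] * 101

female1, male1 = project_one_year(female0, male0, fertility0, mortality0, mortality0)

# The annual change inputs would then scale the rates before the next step.
fertility1 = [f * 0.98 for f in fertility0]
```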
The user can modify assumptions about trends, based on their expectations about what the future may look like. The variables that change over time in the model include: fertility rates by age, mortality rates by age, disease incidence and prevalence, and cadre allocations.
We recommend that PACE-HRH users consider the following factors when choosing data inputs for their key assumptions.
Low- and middle-income countries’ (LMIC) demographics tend to exhibit two characteristics: 1) a younger population than higher-income countries, a majority of whom are in their fertile age window (26); and 2) ongoing shifts in fertility (27, 28). Additionally, fertility rates can vary widely on a sub-national scale (29). In combination, this means that fertility assumptions can have a dramatic effect on the demographic projections and the resulting future workload estimates. For example, in geographic areas that have recently experienced rapid declines in fertility, historical patterns may not continue. As family planning becomes more common, first pregnancies are occurring at older ages; thus, overall fertility rates are declining in younger age groups while increasing in older groups, a trend that cannot continue indefinitely. Capturing this non-equilibrium dynamic in population structure is critical to accurately projecting future population health needs. Users need to carefully consider the implications of historical trends before using them to populate the model.
Parameterizing rates of change in disease incidence reflects an expectation about the future, and historical trends may not continue apace. Periodic population-based surveys (e.g., the Demographic and Health Surveys (24)) are potential sources for historical disease incidence trends because their data are collected consistently over time. In these data, most countries have reported a decline in infectious diseases such as HIV, TB, and pneumonia over time; however, depending on the ability of a healthcare system to reach the ‘hardest to reach’ subgroups, this progress is not guaranteed to continue. In addition to seeking data that span years and can be used to calculate trends, the user should also consider factors that influence expectations of the future, including health priorities set at the national and local levels, global disease eradication efforts, and changes in access to preventative healthcare.
PACE-HRH centers on the enumerated list of healthcare services, ultimately to estimate total workload, so the time per task is a key assumption. The time it takes to complete a task is variable and poorly documented, and very few studies report it. Intrinsic variation depends on the proficiency of the health worker, the location of service delivery, and accepted standard practices (30), and is addressed in the model through the application of distributions to incidence rates and time durations. There is also extrinsic uncertainty due to the lack of robust data, which is captured in the model by sampling a random starting value for these rates. A PACE-HRH user should consult existing time and motion studies for time estimates (31), provided that the context of the cited studies reasonably resembles the geographic area being modeled. If resources are available, it may make sense for the user to conduct their own time and motion studies to gather additional robust data.
PACE-HRH can perform analyses at any level of geographical granularity that the user deems appropriate, but it is important to capture local context as much as possible. It is common that high quality data do not exist for the desired granularity and substitution must be made by using data either from a more aggregated level or from an alternative but similar geographic area. When such substitutions are made, it is important to ensure that the data used still reasonably captures the characteristics of the intended geographic area. Additionally, sensitivity analysis can be used to check that any given assumption does not overly bias the results.
PACE-HRH provides users with the option to validate data inputs before running the model. The validation tool checks for input irregularities that fall into two categories: 1) values that the model cannot accept (e.g., non-numeric values where numeric values are expected); and 2) values that violate a defined range or constraint (e.g., the sum across all months of a seasonality curve must total 100%). This reduces the occurrence of missing values and typographical errors and ensures reasonable values for key inputs and proper data dependencies. Examples of validation rules can be found in Additional File 1, Table 2. The validation rules are customizable by the user. The validation tool produces a report that flags input irregularities, color-coded according to the validation rules.
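The two categories of checks can be sketched in Python as follows. This is a hypothetical illustration of the kind of rule described above, not the validation tool shipped with PACE-HRH; the function names, the message wording, and the representation of a seasonality curve as 12 shares summing to 1.0 (rather than percentages) are assumptions.

```python
def validate_seasonality(name, curve, tolerance=1e-6):
    """Return a list of human-readable problems found in one seasonality curve."""
    problems = []
    if len(curve) != 12:
        problems.append(f"{name}: expected 12 monthly values, got {len(curve)}")
    numeric = []
    for month, value in enumerate(curve, start=1):
        # Category 1: values the model cannot accept (non-numeric input).
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            problems.append(f"{name}: month {month} is not numeric ({value!r})")
        else:
            numeric.append(value)
    # Category 2: values that violate a constraint (shares must total 100%).
    if len(numeric) == len(curve) and abs(sum(numeric) - 1.0) > tolerance:
        problems.append(
            f"{name}: monthly shares sum to {sum(numeric):.4f}, expected 1.0")
    return problems


def validate_rate(task_id, value):
    """Check that an incidence/prevalence rate is a proportion in [0, 1]."""
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return [f"{task_id}: rate is not numeric ({value!r})"]
    if not 0.0 <= value <= 1.0:
        return [f"{task_id}: rate {value} is outside [0, 1]"]
    return []


ok = validate_seasonality("births", [1 / 12] * 12)
bad_type = validate_seasonality("malaria", [0.1] * 11 + ["n/a"])
bad_sum = validate_seasonality("diarrhea", [0.075] * 12)
bad_rate = validate_rate("ANC.1", 1.2)
```

Returning a list of messages rather than raising on the first error lets all irregularities be collected into a single report.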
Structure For Scenario Analysis
The scenario manager (built in the Excel workbook) allows the user to easily provide multiple model configurations that are run in succession. Each configuration takes in global numerical parameters, sheet names where input values are located, and optional scenario analysis toggles. Numerical values include: working weeks per year, hours worked per week, and the catchment population. Named worksheets include inputs for task values, fertility and mortality rates, seasonality curves, and cadre allocations. There are three toggles that enable easy scenario comparison and sensitivity analysis.
The first option is to turn off fertility trends and instead to fix them at the first year’s value. By comparing the model’s results with and without changes in fertility, the user can understand how much impact those assumptions have on the results. Given the difficulty in predicting fertility trends (32), this is a key sensitivity for a workload analysis.
Second, the user can turn off population growth, allowing shifts in demography but not in the total size of the population. By doing so, the user can understand how much impact shifts in demography have on the results if a healthcare system maintains a fixed ratio of HCW to population.
Third, the user can turn off trends in different types of disease areas. Continuing to reduce incidence for infectious diseases is not guaranteed – the COVID-19 pandemic reversed progress in many areas (33–35) – so understanding how much these assumptions matter to the model results is valuable for decision makers who need to mitigate downside risks of under-estimating future workload.
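The three toggles can be pictured with a small Python sketch of a scenario record. The field names are hypothetical, the disease-area toggle is collapsed into a single flag for brevity, and rescaling the projected age structure to the baseline total is only one plausible way to hold population size fixed; none of this is taken from the scenario manager itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Scenario:
    name: str
    weeks_per_year: int
    hours_per_week: float
    catchment_population: int
    fertility_trends: bool = True    # toggle 1: False fixes fertility at year 1
    population_growth: bool = True   # toggle 2: False holds total population fixed
    incidence_trends: bool = True    # toggle 3: False freezes disease incidence


def effective_change_rate(configured_rate, trend_enabled):
    """A disabled trend behaves like a change rate of 1.0 (value fixed at year 1)."""
    return configured_rate if trend_enabled else 1.0


def rescale_to_baseline(projected_by_age, baseline_total):
    """Keep the projected age structure but hold the total at the baseline size."""
    total = sum(projected_by_age)
    return [n * baseline_total / total for n in projected_by_age]


base = Scenario("baseline", weeks_per_year=48, hours_per_week=40,
                catchment_population=5000)
no_fertility_trend = Scenario("fixed fertility", 48, 40, 5000,
                              fertility_trends=False)

rate_with_trend = effective_change_rate(0.98, base.fertility_trends)
rate_without = effective_change_rate(0.98, no_fertility_trend.fertility_trends)
fixed_size = rescale_to_baseline([60.0, 50.0, 10.0], baseline_total=100.0)
```

Running the same inputs under `base` and `no_fertility_trend` and comparing the outputs is the kind of paired comparison the toggles are meant to support.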
Software Development Process
PACE-HRH was developed and tested by a team of researchers, software development engineers, software testing engineers, and technical documentation specialists. The researchers defined the model’s objectives and functionality, designed the user interface, and proposed features to the software developers through detailed descriptions of the model’s intended functionality. The software team implemented the model according to these specifications and ensured that it behaves as expected, including automated validation and testing features. The technical documentation specialists produced a getting started guide and documentation that are publicly available on the GitHub repository (36).
The code base was developed to be fast, flexible, and easy to modify, using a functional programming approach. PACE-HRH pre-calculates matrices of prevalence rates and other parameters before each modeling run, so that during the run it can exploit R's high-speed linear algebra capabilities. Intermediate calculations are available for inspection but are not explicitly saved. Model outputs, including detailed workload by task, month, and applicable age group, are exported as CSV files for the user to employ in their analysis. The package can handle several hundred simulation runs on a local instance of R, allowing users to work from a laptop without needing additional computational infrastructure.
Automated Testing
To ensure a high-quality package release, we implemented automated testing. Faults in scientific software can lead to incorrect model results when the implemented functionality diverges from the software's design (37). Furthermore, inadequate testing creates a risk of unintended consequences, model errors, and the introduction of bugs. We used GitHub Actions (38) to automate the testing process so that every code commit is checked and reported before being accepted, using devtools (39) and testthat (40) for validation. Our testing strategy covers three areas.
First was package validation and documentation building. For every commit, we automated the build process, checked the source package for issues, and verified that vignettes and documents built correctly. This helped to avoid potential installation issues for users and enabled rapid integration.
Second was unit tests, which validated that functions behaved as expected at the most basic level where validation was possible. A code coverage analysis ensured that important model logic was covered by testing.
Third was scenario and regression testing. Regression tests checked that the functionality of previously developed and tested software still performed as expected after code updates (41), which was helpful for tracking model result changes. We took snapshots of model results after verifying them initially, which were used for comparing with future changes. All differences were reviewed, and a new snapshot was created if the change was deemed acceptable.
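The snapshot workflow can be illustrated with a small Python analogue. The project itself uses testthat in R, so the code below is not the PACE-HRH test suite; the function name, the JSON snapshot format, the result keys, and the tolerance are hypothetical choices made for this sketch.

```python
import json
import math
import os
import tempfile


def compare_to_snapshot(results, snapshot_path, rel_tol=1e-9):
    """Compare {name: value} results with a stored snapshot.

    Returns the sorted names whose values differ or are missing on either side.
    If no snapshot exists yet, one is created and an empty list is returned.
    """
    if not os.path.exists(snapshot_path):
        with open(snapshot_path, "w") as fh:
            json.dump(results, fh, indent=2, sort_keys=True)
        return []
    with open(snapshot_path) as fh:
        expected = json.load(fh)
    changed = []
    for key in sorted(set(results) | set(expected)):
        if key not in results or key not in expected:
            changed.append(key)
        elif not math.isclose(results[key], expected[key], rel_tol=rel_tol):
            changed.append(key)
    return changed


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "baseline_snapshot.json")
    # First run records the verified results as the snapshot.
    first = compare_to_snapshot({"total_hours": 1250.0, "nurse_hours": 900.0}, path)
    # An unchanged model reproduces the snapshot.
    same = compare_to_snapshot({"total_hours": 1250.0, "nurse_hours": 900.0}, path)
    # A code change that alters a result is flagged for review.
    drift = compare_to_snapshot({"total_hours": 1300.0, "nurse_hours": 900.0}, path)
```

In this sketch, accepting a reviewed change would amount to deleting the old snapshot file so that the next run records a new one.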
Technical Details
PACE-HRH was built under R Statistical Software (42) version 4.2.0 (2022-04-22 ucrt). Complete documentation, including R code reference, details about the dependent packages, and a getting started guide, is available at https://institutefordiseasemodeling.github.io/PACE-HRH/articles/pacehrh.html. The latest released version of the Windows binary zip file can be downloaded from https://github.com/InstituteforDiseaseModeling/PACE-HRH/releases and installed as a package in R. If a user intends to modify the code, they can install directly from unreleased source code in GitHub and create their own fork.