Summary

Summary of main results

We found that a minority (n=47; 8.3%) of the 564 papers reviewed shared a simulation model, either as code or as a visual interactive modelling (VIM) file. Sharing increased over the study period, from ~4% in 2019 to 8-10% between 2020 and 2022. Studies published as journal articles were also more likely to share their computer model (9.7%) than full conference papers (4%). This might reflect that conference papers report work in progress and authors were not yet ready to share their work; although we note that one model shared via a Winter Simulation Conference paper rated very highly in our best practice audit []. Regardless of how the split is calculated, sharing of health DES models does not match the findings from agent-based simulation (ABS), where 18% of studies shared models in 2018 [Janssen et al., 2020]. It appears that, four years later, DES models in health are shared far less often than ABS models in general.

Of the 47 models shared, the majority were implemented using a free and open source software (FOSS) simulation tool such as R Simmer. This is perhaps not surprising given the freedom that FOSS licenses grant to authors and to the other researchers or healthcare services that may opt to reuse the work.

We also found that only a minority (~25%) of models aiming to support health services caring for patients during the Covid-19 pandemic were shared. It would appear that healthcare DES fell short of the sharing standard that other fields had called for during the pandemic [Sills et al., 2020]. One positive is that Covid-19 models were shared more often than computer models in general healthcare settings.

Over 65% (31 out of 47) of the DES models shared via a publication were developed using a code-based simulation package. Sharing was most often done via GitHub. Open science researchers have stated elsewhere [Heil et al., 2021, Janssen et al., 2020, Krafczyk et al., 2021] that a gold-standard open science archive can mint a DOI and provide long-term storage guarantees. The disadvantages of GitHub (and other cloud-based version control tools) are that there is no guarantee of how long code will remain stored, and it is unclear which version of the code was used in a publication. We only found two instances of code being managed on GitHub/GitLab and also deposited in an open science archive. Models built in commercial VIM software were shared less often (16 out of 47 models). The majority of these were attached as journal supplementary material or shared via AnyLogic Cloud. The latter approach will likely lead to broken links in the future due to changes in commercially provided infrastructure and licensing.
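One lightweight way to address both problems is to mint an archival DOI for a tagged release (for example via an archive such as Zenodo) and to record the archived version in the repository itself. As an illustrative sketch, not drawn from any reviewed study, a `CITATION.cff` file can state the exact version and DOI cited in a paper; all field values below are hypothetical placeholders:

```yaml
# Hypothetical CITATION.cff for a shared DES model repository.
# All names, versions, and identifiers are illustrative placeholders.
cff-version: 1.2.0
message: "If you use this model, please cite it using this metadata."
title: "Example ward occupancy DES model"
authors:
  - family-names: "Smith"
    given-names: "Jane"
    orcid: "https://orcid.org/0000-0000-0000-0000"
version: "1.0.0"              # the exact release used in the publication
doi: "10.5281/zenodo.0000000" # placeholder DOI minted by the archive
repository-code: "https://github.com/example/ward-des-model"
license: "MIT"
```

A file like this also addresses the ORCID and citation issues discussed below, since the metadata ties the artifact to a specific researcher and version.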

Although FOSS simulation packages made up the majority of the shared/archived models we identified, it is still commercial off-the-shelf simulation packages that dominate the academic literature overall. We confirmed a prior result [Vázquez-Serrano et al., 2021] that Arena was the most used package in healthcare DES, with AnyLogic and Simul8 also in regular use. Definitively explaining why model sharing is most associated with FOSS packages is not possible at this time. We can confirm it is not related to the effort needed to perform basic archiving: with a digital open science repository such as Zenodo, archiving a model file built in a commercial package requires no additional work compared to one coded in Python. Our speculations are that (i) it is related to the general philosophy of freedom built into FOSS tools; that is, FOSS tools attract certain types of users, and these users are more likely to share their model artifacts; and (ii) FOSS tools are more likely (although not exclusively) to be code based, which lends itself to managing code via repositories such as GitHub.

Keeping to the topic of simulation software, we found that 11% of studies did not report which simulation software was used at all. This result was also observed by [Vázquez-Serrano et al., 2021] and by [Brailsford et al., 2019] in their exhaustive review of hybrid simulation models. Our result is somewhat more striking than these previous reviews because we focused on studies published after the Strengthening the Reporting of Empirical Simulation Studies (STRESS) guidelines, which recommend that authors report the software used. An explanation is that our review found low uptake of reporting guidelines within healthcare DES. Cost-effectiveness studies modelling individual patient trajectories using DES were the most likely to mention or cite a reporting guideline, typically one of the ISPOR publications.

Summary of best practice audit results

The quality audit revealed that the tools and methods of sharing DES models could be greatly improved within health and medicine.

Open scholarship

We found that very few models were deposited in an open science repository such as Zenodo or the OSF, or even in an academic institutional repository. This meant that these model artifacts had no guarantee of long-term storage and could not be easily cited. In the case of FOSS or code-based models, authors primarily opted to link to a version control repository such as GitHub or GitLab. While this allowed authors to share their code, it is possible that these peer-reviewed links will become invalid in future years. We also found instances where binary model files used within commercial simulation software such as AnyLogic were inappropriately committed to GitHub instead of being deposited in an open science repository.

Similarly, the review found limited use of ORCIDs in the repositories, archives, or platforms that shared DES models. This meant there was no robust way for the model artifacts to be linked to a researcher and their portfolio.

The majority of DES models were shared without an open license. When we split the models by code-based versus VIM-based simulation software, we found that VIM models were the least likely to include a clear open license. The VIM models that did have an open license were typically journal supplementary material and by default adopted the license applied to the journal article as a whole. Without a license, authors retain exclusive copyright over their research artifacts. This means that other researchers and potential users can view the code/model, but cannot reuse or adapt it.

Tools to facilitate reuse of models

In general, we found that only a minority of DES models were stored with any form of clear instructions to run the model. Instructions were less common for VIM models (20%) than for code-based models (40%). As we did not have licensed copies of all the commercial software used, it is possible that instructions were contained within the models themselves; if so, the authors did not describe this within the papers or repositories.
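A few lines in a README are often enough. The fragment below is a hypothetical example of minimal run instructions for a code-based model; the file names and commands are ours, not taken from any reviewed study:

```markdown
## Running the model (illustrative example)

1. Install the dependencies: `pip install -r requirements.txt`
2. Run the simulation: `python run_model.py`
3. Results are written to `output/results.csv`
```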

Dependency management was in general of poor quality or not used at all. Only 23% of code-based simulation models had any formal dependency management, while VIM-based models fared slightly better informally by stating the version of the software used. This latter result perhaps reflects the simplicity of stating which version of a commercial off-the-shelf DES package was used versus specifying the complex software environment of a code-based DES model. We note, though, that in many cases authors did not even state the version of R or Python they were using.
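For code-based models, formal dependency management can be as simple as a pinned requirements file. The snippet below is an illustrative example for a hypothetical Python model; the packages and version numbers are ours, chosen only to show the format:

```text
# requirements.txt for a hypothetical Python DES model.
# Versions are pinned so the environment can be recreated exactly.
simpy==4.0.1
numpy==1.24.2
pandas==1.5.3
```

Equivalent tools exist in other ecosystems, for example renv lockfiles for R or conda environment files, the latter of which can also record the interpreter version itself.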

Surprisingly, we found that several of the shared VIM-based DES models were not downloadable, even when the authors stated they could be downloaded. These were all hosted on cloud-based services where the model was interactive and executable to some extent. We found that some models shared through the same platforms were downloadable, so a plausible explanation is that this was simply a setting missed by the study authors.

Although testing and model verification are standard practice and covered by many simulation textbooks, we found very little evidence of model testing, both for models written in a coding language and for those built in commercial software. Given the other findings, it is not surprising that there was limited evidence of testing among the shared models, although this does not mean that testing activities did not take place. An explanation could be that model testing had been completed informally and incrementally as the models were coded.
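Even without a formal framework, lightweight tests against hand-calculated results can be shared alongside a model. The sketch below is a hypothetical example in plain Python, not taken from any reviewed study: a FIFO single-server queue checked against waiting times worked out by hand.

```python
# Illustrative sketch of model verification for a minimal DES component.
# The queue logic and the test values are hypothetical examples.

def single_server_waits(arrivals, service_times):
    """Waiting time of each customer in a FIFO single-server queue.

    arrivals and service_times are parallel lists; arrivals must be sorted.
    """
    waits = []
    server_free_at = 0.0
    for arrive, service in zip(arrivals, service_times):
        start = max(arrive, server_free_at)  # wait if the server is busy
        waits.append(start - arrive)
        server_free_at = start + service
    return waits


def test_known_waits():
    # Arrivals at t=0, 1, 2 with 2 time units of service each:
    # by hand, the waits are 0, 1 and 2.
    assert single_server_waits([0, 1, 2], [2, 2, 2]) == [0, 1, 2]
    # A server that is idle on every arrival implies zero waits.
    assert single_server_waits([0, 10], [1, 1]) == [0, 0]


test_known_waits()
```

Such tests could sit in a `tests/` folder and run under a framework like pytest, giving reviewers and reusers quick evidence that core model logic behaves as documented.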

References

BEK+19

Sally C. Brailsford, Tillal Eldabi, Martin Kunc, Navonil Mustafee, and Andres F. Osorio. Hybrid simulation modelling in operational research: A state-of-the-art review. European Journal of Operational Research, 278(3):721–737, November 2019. URL: https://www.sciencedirect.com/science/article/pii/S0377221718308786 (visited on 2023-04-03), doi:10.1016/j.ejor.2018.10.025.

HHM+21

Benjamin J. Heil, Michael M. Hoffman, Florian Markowetz, Su-In Lee, Casey S. Greene, and Stephanie C. Hicks. Reproducibility standards for machine learning in the life sciences. Nature Methods, 18(10):1132–1135, October 2021. URL: https://www.nature.com/articles/s41592-021-01256-7 (visited on 2023-04-25), doi:10.1038/s41592-021-01256-7.

JPL20

Marco A Janssen, Calvin Pritchard, and Allen Lee. On code sharing and model documentation of published individual and agent-based models. Environmental Modelling & Software, 134:104873, 2020.

KSB+21

M. S. Krafczyk, A. Shi, A. Bhaskar, D. Marinov, and V. Stodden. Learning from reproducing computational results: introducing three principles and the Reproduction Package. Philosophical transactions. Series A, Mathematical, physical, and engineering sciences, 379(2197):20200069, 2021. URL: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC8059663/ (visited on 2023-04-03), doi:10.1098/rsta.2020.0069.

SBA+20

Jennifer Sills, C. Michael Barton, Marina Alberti, Daniel Ames, Jo-An Atkinson, Jerad Bales, Edmund Burke, Min Chen, Saikou Y Diallo, David J. D. Earn, Brian Fath, Zhilan Feng, Christopher Gibbons, Ross Hammond, Jane Heffernan, Heather Houser, Peter S. Hovmand, Birgit Kopainsky, Patricia L. Mabry, Christina Mair, Petra Meier, Rebecca Niles, Brian Nosek, Nathaniel Osgood, Suzanne Pierce, J. Gareth Polhill, Lisa Prosser, Erin Robinson, Cynthia Rosenzweig, Shankar Sankaran, Kurt Stange, and Gregory Tucker. Call for transparency of covid-19 models. Science, 368(6490):482–483, 2020. URL: https://www.science.org/doi/abs/10.1126/science.abb8637, doi:10.1126/science.abb8637.

VSPGCB21

Jesús Isaac Vázquez-Serrano, Rodrigo E. Peimbert-García, and Leopoldo Eduardo Cárdenas-Barrón. Discrete-Event Simulation Modeling in Healthcare: A Comprehensive Review. International Journal of Environmental Research and Public Health, 18(22):12262, January 2021. URL: https://www.mdpi.com/1660-4601/18/22/12262 (visited on 2023-04-03), doi:10.3390/ijerph182212262.