Simulation software
The results in this notebook do not directly answer any of our primary research questions, but they support RQ2:
What methods, tools, and resources did authors use to share their computer models and code?
The results also show that roughly 10.5% of the literature does not report the simulation software used (the 'Unknown' row in the summary table).
1. Imports#
1.1. Standard Imports#
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import MaxNLocator
# set up plot style as ggplot
plt.style.use('ggplot')
1.2. Imports from preprocessing module#
# function for loading full dataset
from preprocessing import load_clean_dataset
2. Constants#
FILE_NAME = 'https://raw.githubusercontent.com/TomMonks/' \
+ 'des_sharing_lit_review/main/data/share_sim_data_extract.zip'
RG_LABEL = 'reporting_guidelines_mention'
NONE = 'None'
WIDTH = 0.5
3. Functions#
3.1. Functions to create summary statistics#
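Two functions are used together to generate the high-level results by year:
- high_level_metrics: takes a subgroup of the dataset and generates summary statistics and counts.
- analysis_by_year: loops through the years, passing each to high_level_metrics, and concatenates the datasets at the end.
The software_count function defined in this section counts how often each simulation software is reported and groups rarely reported software as 'Other'.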
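def software_count(column, threshold=2):
    """
    Return a count of simulation software.

    If the count of a software is less than or equal to `threshold`
    then it is labelled as 'Other'.

    Params:
    -------
    column: pandas Series
        Software labels to count.
    threshold: int, optional (default=2)
        Software with a count at or below this value is grouped as 'Other'.

    Returns:
    -------
    pd.DataFrame
    """
    # count each software label and tidy the column names
    counts = column.value_counts().to_frame().reset_index()
    counts.columns = ['software', 'count']
    # rows at or below the threshold, and their combined count
    rare = counts['count'] <= threshold
    summarised = counts.loc[rare, 'count'].sum()
    # relabel the rare rows so that groupby collapses them into one row
    counts.loc[rare, 'software'] = 'Other'
    counts = counts.groupby('software').sum()
    counts.loc['Other'] = summarised
    return counts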
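To illustrate how the threshold behaves, the sketch below applies a condensed version of `software_count` to a small made-up Series. The labels and counts in `toy` are invented for illustration and are not taken from the review dataset.

```python
import pandas as pd


def software_count(column, threshold=2):
    """Count labels, pooling those that occur `threshold` times or fewer."""
    counts = column.value_counts().to_frame().reset_index()
    counts.columns = ['software', 'count']
    # relabel rarely used software, then sum so 'Other' becomes a single row
    counts.loc[counts['count'] <= threshold, 'software'] = 'Other'
    return counts.groupby('software').sum()


# hypothetical data: A appears 4 times, B 3 times, C twice, D once
toy = pd.Series(['A'] * 4 + ['B'] * 3 + ['C'] * 2 + ['D'])
toy_counts = software_count(toy, threshold=2)
print(toy_counts)
```

Because the comparison is `<=`, label C (which occurs exactly twice) is pooled into 'Other' together with D, while B (three occurrences) keeps its own row.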
4. Read in data#
clean = load_clean_dataset(FILE_NAME)
5. Results#
5.1. Overall summary table#
software_counts = software_count(clean['sim_software'], threshold=2)
# percentage of all studies; raw strings avoid an invalid '\%' escape warning
software_counts[r'n(\%)'] = \
    software_counts['count'] / software_counts['count'].sum() * 100
software_counts = software_counts.sort_values('count', ascending=False)
software_counts[r'n(\%)'] = software_counts[r'n(\%)'].round(1)
software_counts
software | count | n(\%) |
---|---|---|
Arena | 124 | 21.6 |
AnyLogic | 78 | 13.6 |
Unknown | 60 | 10.5 |
Simul8 | 51 | 8.9 |
Other | 41 | 7.1 |
R | 35 | 6.1 |
FlexSim | 23 | 4.0 |
Excel | 21 | 3.7 |
Simio | 21 | 3.7 |
MATLAB | 19 | 3.3 |
SimPy | 18 | 3.1 |
R Simmer | 15 | 2.6 |
TreeAge | 14 | 2.4 |
Python | 10 | 1.7 |
ExtendSim | 7 | 1.2 |
C++ | 6 | 1.0 |
Salabim | 5 | 0.9 |
MedModel | 5 | 0.9 |
Flexsim | 5 | 0.9 |
ProModel | 4 | 0.7 |
Plant Simulation | 3 | 0.5 |
WITNESS | 3 | 0.5 |
anyLogistix | 3 | 0.5 |
iGrafx | 3 | 0.5 |
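The table lists `FlexSim` and `Flexsim` as separate rows, which may be two spellings of the same package. If that is the case, one possible way to merge such variants before counting is sketched below. The helper `merge_case_variants` is hypothetical and not part of the preprocessing module, and the `toy` data is invented for illustration.

```python
import pandas as pd


def merge_case_variants(column):
    """Replace each label with the most frequent spelling of its case variants."""
    canonical = {}
    # value_counts sorts by frequency, so the first spelling seen for each
    # lowercase key is the most common one
    for label in column.value_counts().index:
        canonical.setdefault(label.lower(), label)
    return column.str.lower().map(canonical)


# hypothetical data with two spellings of the same package
toy = pd.Series(['FlexSim'] * 3 + ['Flexsim'] + ['Arena'] * 2)
merged = merge_case_variants(toy)
print(merged.value_counts())
```

This only handles differences in capitalisation; other variants (for example a product and its library wrapper) would still need a manual mapping.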
6. Output to LaTeX#
print(software_counts.style.to_latex(hrules=True,
                                     label="DES Software",
                                     caption="Software used in DES healthcare studies"))
\begin{table}
\caption{Software used in DES healthcare studies}
\label{DES Software}
\begin{tabular}{lrr}
\toprule
& count & n(\%) \\
software & & \\
\midrule
Arena & 124 & 21.600000 \\
AnyLogic & 78 & 13.600000 \\
Unknown & 60 & 10.500000 \\
Simul8 & 51 & 8.900000 \\
Other & 41 & 7.100000 \\
R & 35 & 6.100000 \\
FlexSim & 23 & 4.000000 \\
Excel & 21 & 3.700000 \\
Simio & 21 & 3.700000 \\
MATLAB & 19 & 3.300000 \\
SimPy & 18 & 3.100000 \\
R Simmer & 15 & 2.600000 \\
TreeAge & 14 & 2.400000 \\
Python & 10 & 1.700000 \\
ExtendSim & 7 & 1.200000 \\
C++ & 6 & 1.000000 \\
Salabim & 5 & 0.900000 \\
MedModel & 5 & 0.900000 \\
Flexsim & 5 & 0.900000 \\
ProModel & 4 & 0.700000 \\
Plant Simulation & 3 & 0.500000 \\
WITNESS & 3 & 0.500000 \\
anyLogistix & 3 & 0.500000 \\
iGrafx & 3 & 0.500000 \\
\bottomrule
\end{tabular}
\end{table}
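The LaTeX output above prints the percentages with six decimal places (for example `21.600000`), even though the column was rounded to one decimal place. One possible way to control this, sketched here under the assumption that fixed one-decimal strings are acceptable in the final table, is to convert the column to strings before styling. The helper name `format_percent_column` is hypothetical; the two example rows are copied from the summary table.

```python
import pandas as pd


def format_percent_column(table, column, decimals=1):
    """Return a copy of `table` with `column` rendered as fixed-decimal strings."""
    formatted = table.copy()
    formatted[column] = formatted[column].map(lambda v: f"{v:.{decimals}f}")
    return formatted


# two rows taken from the summary table
example = pd.DataFrame(
    {'count': [124, 78], r'n(\%)': [21.6, 13.6]},
    index=pd.Index(['Arena', 'AnyLogic'], name='software'),
)
formatted = format_percent_column(example, r'n(\%)')
print(formatted)
```

Passing the result to `formatted.style.to_latex(hrules=True)` should then emit the strings unchanged. An alternative that leaves the underlying numbers untouched is to call `Styler.format(precision=1)` before `to_latex`.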