Use of reporting guidelines#
Primary research questions:#
The results presented in this notebook address the following question:
What proportion of studies make use of a reporting guideline?
1. Imports#
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
## Imports from preprocessing module
from preprocessing import load_clean_dataset
2. Constants#
FILE_NAME = 'https://raw.githubusercontent.com/TomMonks/' \
    + 'des_sharing_lit_review/main/data/share_sim_data_extract.zip'
RG_LABEL = 'reporting_guidelines_mention'
NONE = 'None'
3. Functions#
def reporting_guideline_summary(df_clean, exclude_none=True):
    '''
    For studies included, summarise reporting guidelines.
    Returned as a table of name; n; % of included.

    Params:
    ------
    df_clean: pd.DataFrame
        All papers
    exclude_none: bool, optional (default=True)
        Excludes the row for "None" i.e. no reporting guideline mention

    Returns:
    -------
    pd.DataFrame
    '''
    # restrict to included studies only
    included = df_clean[df_clean['study_included'] == 1]

    # exclude or include 'None'
    if exclude_none:
        report_guidelines = included[included[RG_LABEL] != NONE]
    else:
        # same variable name as the branch above (was a NameError)
        report_guidelines = included

    # frequency + percentage (denominator is all included studies)
    counts = report_guidelines.groupby([RG_LABEL])['key'] \
        .count().sort_values(ascending=False)
    percentages = counts / len(included)

    # summary table
    summary = pd.concat([counts, (percentages * 100).round(1)], axis=1)
    summary.columns = ['n', '% of included']
    if exclude_none:
        # drop any remaining 'None' row (e.g. a zero count); ignore if absent
        summary = summary.drop(NONE, axis=0, errors='ignore')
    return summary.sort_values(by=['n'], ascending=False)
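def guidelines_by_subset(df_clean, field, column_label):
    '''
    Summarise reporting guidelines for the subset of papers where
    `field` == 1. The count column is renamed to `column_label`.
    '''
    subset = df_clean[df_clean[field] == 1]
    summary = reporting_guideline_summary(subset)
    summary.columns = [column_label, '% of included']
    return summary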
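The counting logic in the functions above can be illustrated on a small toy table. This is only a sketch: the column names mirror those used in the notebook, but the rows are invented for illustration and do not come from the review dataset.

```python
import pandas as pd

# Hypothetical toy data: column names follow the notebook, values are invented
toy = pd.DataFrame({
    'key': ['a', 'b', 'c', 'd', 'e', 'f'],
    'study_included': [1, 1, 1, 1, 1, 0],
    'reporting_guidelines_mention': [
        'STRESS', 'None', 'ISPOR', 'STRESS', 'None', 'ISPOR'
    ],
})

# keep included studies, then drop those with no guideline mention
included = toy[toy['study_included'] == 1]
mentions = included[included['reporting_guidelines_mention'] != 'None']

# count papers per guideline; percentages use ALL included studies
counts = (mentions.groupby('reporting_guidelines_mention')['key']
          .count()
          .sort_values(ascending=False))
percentages = (counts / len(included) * 100).round(1)

toy_summary = pd.concat([counts, percentages], axis=1)
toy_summary.columns = ['n', '% of included']
print(toy_summary)
```

With five included toy studies, STRESS has n = 2 (40.0%) and ISPOR has n = 1 (20.0%). The excluded paper `f` counts towards neither the numerator nor the denominator.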
4. Read in data#
clean = load_clean_dataset(FILE_NAME)
5. Results#
5.1 Create a high level summary of the reporting guidelines used.#
# overall
overall_summary = reporting_guideline_summary(clean)
overall_summary
reporting_guidelines_mention | n | % of included |
---|---|---|
ISPOR | 37 | 6.6 |
STRESS | 22 | 3.9 |
CHEERS | 8 | 1.4 |
SQUIRE | 2 | 0.4 |
ODD | 1 | 0.2 |
Sanders et al. | 1 | 0.2 |
Zhang et al. | 1 | 0.2 |
The most frequently mentioned guidelines were those from ISPOR (37 studies, 6.6% of included), typically in papers publishing DES models used in a cost-effectiveness study.
# covid only?
guidelines_by_subset(clean, 'covid', 'Covid')
reporting_guidelines_mention | Covid | % of included |
---|---|---|
STRESS | 9 | 13.0 |
CHEERS | 1 | 1.4 |
ISPOR | 0 | 0.0 |
ODD | 0 | 0.0 |
SQUIRE | 0 | 0.0 |
Sanders et al. | 0 | 0.0 |
Zhang et al. | 0 | 0.0 |
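The Covid table lists guidelines with a count of zero. One plausible explanation, and it is an assumption because the preprocessing module is not shown here, is that the guideline column is stored as a pandas categorical. Grouping by a categorical column with `observed=False` keeps every category, including those with no rows. A minimal sketch on invented data:

```python
import pandas as pd

# Hypothetical data: 'ISPOR' is a known category but has no rows
guidelines = pd.Categorical(
    ['STRESS', 'STRESS', 'CHEERS'],
    categories=['CHEERS', 'ISPOR', 'STRESS'],
)
toy = pd.DataFrame({'key': [1, 2, 3], 'guideline': guidelines})

# observed=False keeps unused categories with a count of 0
all_levels = toy.groupby('guideline', observed=False)['key'].count()

# observed=True reports only categories that actually occur
seen_only = toy.groupby('guideline', observed=True)['key'].count()

print(all_levels)
print(seen_only)
```

Passing `observed` explicitly keeps the behaviour the same across pandas versions, since the default has changed between releases.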
5.2 What proportion overall made use of any reporting guideline?#
n_reporting = overall_summary['n'].sum()
total_included = len(clean[clean['study_included'] == 1])
per_reporting = (n_reporting / total_included) * 100
# '\\%' prints a literal '\%'; a bare '\%' is an invalid escape sequence
txt = f'A total of {n_reporting} ({per_reporting:.1f}\\%) studies used models' \
    + ' published in articles that mentioned a known simulation' \
    + ' reporting guideline or checklist.'
print(txt)
A total of 72 (12.8\%) studies used models published in articles that mentioned a known simulation reporting guideline or checklist.
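As a consistency check, the counts in the section 5.1 table sum to the 72 studies reported here. Summing the already rounded percentage column gives a slightly different figure from the 12.8 computed directly from the counts, which illustrates why the percentage is better derived from `n` than from the rounded column. The values below are copied from that table; the dictionary itself is only an illustrative container.

```python
# (n, % of included) pairs copied from the overall summary table
table = {
    'ISPOR': (37, 6.6),
    'STRESS': (22, 3.9),
    'CHEERS': (8, 1.4),
    'SQUIRE': (2, 0.4),
    'ODD': (1, 0.2),
    'Sanders et al.': (1, 0.2),
    'Zhang et al.': (1, 0.2),
}

# total number of studies mentioning any guideline
total_n = sum(n for n, _ in table.values())

# sum of the per-row percentages, each already rounded to 1 d.p.
summed_pct = round(sum(pct for _, pct in table.values()), 1)

print(total_n, summed_pct)
```

The counts total 72, while the rounded percentages sum to 12.9 rather than 12.8; the 0.1 difference is accumulated rounding error.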