Dataset for the DIssemination of REgistered COVID-19 Clinical Trials (DIRECCT) Study

Description: The DIRECCT study is a multi-phase, living examination of clinical trial results dissemination throughout the COVID-19 pandemic. This dataset contains `trials`, `registrations`, and `results` from Phase 1 of the project, examining trials completed during the first six months of the pandemic (i.e., through 30 June 2020). This dataset is provided as a relational database of three CSVs which can be joined on the `id` column. Data was collected using a combination of automated and manual strategies; automated searches were performed on 30 June 2020, and manual searches were performed between 21 October 2020 and 18 January 2021. Data sources for `trials` and `registrations` include the World Health Organization (WHO) International Clinical Trials Registry Platform (ICTRP) list of registered COVID-19 studies, individual clinical trial registries, and the COVID-19 TrialsTracker (https://covid19.trialstracker.net/). Data sources for `results` include the COVID-19 Open Research Dataset Challenge (CORD-19), PubMed, EuropePMC, Google Scholar, and Google. Additional information on the project is available at the project's OSF page: http://doi.org/10.17605/osf.io/5f8j2
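As a rough illustration of joining the tables on `id`, the sketch below left-joins `trials` to `results` using only the Python standard library. The CSV text, the id values, and the helper names (`read_rows`, `left_join_on_id`) are made-up placeholders for illustration, not content from the dataset; in practice the rows would be read from the downloaded CSV files.

```python
import csv
import io
from collections import defaultdict

# Made-up placeholder rows standing in for trials.csv and results.csv.
# The ids and values are illustrative only, not taken from the dataset.
TRIALS_CSV = """id,trialid
tri00001,NCT00000001
tri00002,NCT00000002
"""
RESULTS_CSV = """id,pub_type
tri00001,full_results_preprint
tri00001,full_results_journal_article
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def left_join_on_id(left_rows, right_rows):
    """Left-join two tables on `id`; left rows without a match are kept once."""
    by_id = defaultdict(list)
    for row in right_rows:
        by_id[row["id"]].append(row)
    joined = []
    for left in left_rows:
        # .get() avoids creating empty entries for unmatched ids
        for right in by_id.get(left["id"], [None]):
            merged = dict(left)
            if right is not None:
                merged.update(right)  # right-hand columns win on name clashes
            joined.append(merged)
    return joined


trials = read_rows(TRIALS_CSV)
results = read_rows(RESULTS_CSV)
trials_with_results = left_join_on_id(trials, results)
```

A trial can have several results, so the joined table may contain more than one row per `id`. Note that `trn` and `comments` appear in more than one table, so a real join would likely need to rename those columns first rather than let one side overwrite the other.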

Bibliographic

Published
Keywords
  • clinical trials
  • results reporting
  • COVID-19
Funder German Bundesministerium für Bildung und Forschung (BMBF)
License

Coverage

Temporal

Begin 2020-01-01
End 2020-06-30

Spatial

Attributes

Name Description Unit
id {trials, registrations, results} Unique trial identifier character
n_trn {registrations} Count of trial registration numbers for a trial numeric
source {registrations} Source of trial registration number: `ictrp` (i.e., WHO ICTRP), `auto` (i.e., found via automated searches on registry or in Trials Tracker), `reg` (i.e., found on registry via manual search) categorical
trn {registrations} Trial registration number character
registry {registrations} Clinical trial registry categorical
trn {results} Trial registration number if result associated with trial NOT in dataset. `no TRN found` if no trn reported in result. `see trials` if result associated with trial in dataset. character
search_type {results} Type of search used to find result: `results_check` (automated results search in CORD-19 and PubMed), `registry` (summary result or publication citation on registry), `trn` (search engine query using TRN), `keywords` (search engine query using trial keywords) categorical
pub_type {results} Publication type. Full results report complete results of the primary outcome(s) for all participants. All other results are interim: `full_results_preprint`, `full_results_journal_article`, `interim_results_journal_article`, `other` (includes protocols, conference abstracts, posters, and presentations, grey literature), `summary_results` (i.e., registry results), `interim_results_preprint`, `interim_result` categorical
search_engine {results} Search engine. If `search_type` is `trn` or `keywords`, which search engine was used to find the result: `pubmed`, `europe_pmc`, `google_scholar`, `google` categorical
doi {results} Publication DOI character
pmid {results} Publication PubMed id numeric
cord_id {results} Publication CORD-19 id. Sparsely used. May be removed in future dataset versions. character
url {results} Publication URL character
date_publication {results} Trial publication date. Earliest available, including online-first. date
date_completion {results} Trial completion date (last patient, last visit), if reported in the publication. date
comments {results} Comments on results from manual data extraction. character
trialid {trials} Preferred trial registration number from input ICTRP dataset after TrialsTracker pre-processing. character
source_register {trials} Preferred trial registry from input ICTRP dataset after TrialsTracker pre-processing. categorical
date_registration {trials} Trial registration date, per ICTRP. date
date_enrollement {trials} Trial first enrollment date, per ICTRP. date
retrospective_registration {trials} Whether trial was registered retrospectively, per ICTRP after TrialsTracker pre-processing. Not necessarily the same as "Retrospective flag" on the ICTRP. logical
normed_spon_names {trials} Normed sponsor names, derived from the ICTRP. character
recruitment_status {trials} Recruitment status, per ICTRP: `Authorised`, `Not Recruiting`, `Recruiting`, `No Status Given` categorical
phase {trials} Phase, per ICTRP. Normalized in TrialsTracker pre-processing: `Phase 1`, `Phase 1/Phase 2`, `Phase 2`, `Phase 2/Phase 3`, `Phase 3`, `Phase 3/Phase 4`, `Phase 4`, `Not Applicable`. categorical
study_type {trials} Study type, per ICTRP. Normalized in TrialsTracker pre-processing: `Interventional`, `Prevention`, `Observational`, `Diagnostic test`, `Prognosis`, `Basic Science`, `Epidemiological research`, `Health services research`, `Screening`, `Expanded Access`, `Other`, `Unknown`. Only `Interventional` and `Prevention` considered clinical trials. categorical
countries {trials} Countries, per ICTRP. categorical
public_title {trials} Trial title, per ICTRP. character
study_category {trials} Category of intervention or topic of study, based on TrialsTracker schema. Not systematically extracted to date: 39 categories. categorical
intervention {trials} Trial interventions, extracted from ICTRP data for TrialsTracker. Not systematically extracted to date. character
intervention_list {trials} List of all unique interventions in trial, separated by a ";", derived from ICTRP. character
target_enrollment {trials} Trial target enrollment, extracted and cleaned automatically from ICTRP. numeric
web_address {trials} Trial registration URL, per ICTRP. character
trial_status {trials} Trial status, per registry during automated screening, if available. Takes the form of a list for the EUCTR to account for multiple protocols. categorical
registry_scraped {trials} Whether registry scraped. NA for non-trials. TRUE for trials. No FALSE in first phase. logical
pcd_auto {trials} Primary completion date, per registry during automated screening, if available. date
scd_auto {trials} Study completion date, per registry during automated screening, if available. date
rcd_auto {trials} Relevant completion date, based on automated screening. Used for our trial automated screening criteria. `scd_auto` if available, otherwise `pcd_auto`, otherwise NA. date
tabular_results {trials} Whether summary results posted for ClinicalTrials.gov or EUCTR trials. NA if `source_register` is NOT ClinicalTrials.gov or EUCTR. 0 indicates FALSE. No summary results detected in first phase. logical
potential_other_results {trials} Whether publication citations posted for ClinicalTrials.gov trials. NA if `source_register` is NOT ClinicalTrials.gov. 0 indicates FALSE. 1 indicates TRUE. logical
has_auto_reg_results_cutoff_1 {trials} Whether `tabular_results` or `potential_other_results` detected on registry in automated screening. logical
is_intervention {trials} Whether ICTRP `study_type` is `Interventional` or `Prevention`. Used for our trial automated screening criteria. logical
is_reg_2020 {trials} Whether ICTRP `date_registration` is on or after 2020-01-01. Used for our trial automated screening criteria. logical
is_not_withdrawn {trials} Whether not indicated as withdrawn in either ICTRP or registry during automated screening. Used for our trial automated screening criteria. logical
is_eligible_study {trials} Whether eligible for manual screening for our study. Based on `is_intervention`, `is_reg_2020`, and `is_not_withdrawn`. logical
is_rcd_cutoff_1 {trials} Whether `rcd_auto` on or before phase 1 cutoff of 2020-06-30. logical
is_searched_cutoff_1 {trials} Whether included in phase 1 manual search. Based on `is_eligible_study` and either `is_rcd_cutoff_1` or `has_auto_reg_results_cutoff_1`. logical
is_clinical_trial_manual {trials} Whether a clinical trial, per manual screening. logical
is_covid_manual {trials} Whether the primary outcome of the clinical trial is the treatment or prevention of COVID-19/SARS-CoV-2, per manual screening. logical
is_not_withdrawn_manual {trials} Whether not indicated as withdrawn on registry during manual screening. logical
pcd_manual {trials} Primary completion date, per registry during manual screening, if available. date
scd_manual {trials} Study completion date, per registry during manual screening, if available. date
rcd_manual {trials} Relevant completion date, based on manual screening. date
is_rcd_cutoff_1_manual {trials} Whether `rcd_manual` on or before phase 1 cutoff of 2020-06-30. logical
comments {trials} Comments on trials from manual data extraction. character
is_analysis_pop_cutoff_1 {trials} Whether included in phase 1 analysis population. Based on `is_eligible_study`, `is_rcd_cutoff_1`, and manual screening (`is_clinical_trial_manual`, `is_covid_manual`, and `is_not_withdrawn_manual`). logical
is_dupe {trials} Whether trial is a duplicate in our dataset. Duplicates determined based on cross-registrations found in manual search. 18 duplicates found in phase 1 and excluded from analysis. logical
dupe_primary {trials} If trial `is_dupe`, `id` of primary trial. character
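Several of the automated-screening columns above are defined in terms of other columns. The sketch below restates those definitions in Python as one possible reading, not the study's actual code. It assumes that "based on" means the listed flags are combined with a logical AND, and that a missing date does not meet the cutoff; the function names mirror the column names, and the example values are hypothetical.

```python
from datetime import date

CUTOFF_1 = date(2020, 6, 30)  # phase 1 cutoff stated in the attribute table


def rcd_auto(scd_auto, pcd_auto):
    """`scd_auto` if available, otherwise `pcd_auto`, otherwise None (NA)."""
    return scd_auto if scd_auto is not None else pcd_auto


def is_rcd_cutoff_1(rcd):
    """True if the relevant completion date is on or before the cutoff."""
    # Assumption: a missing date is treated as not meeting the cutoff.
    return rcd is not None and rcd <= CUTOFF_1


def is_eligible_study(is_intervention, is_reg_2020, is_not_withdrawn):
    """Eligibility for manual screening."""
    # Assumption: the three criteria are combined with AND.
    return is_intervention and is_reg_2020 and is_not_withdrawn


def is_searched_cutoff_1(eligible, rcd_cutoff, has_auto_reg_results):
    """Eligible, and either completed by the cutoff or results seen on registry."""
    return eligible and (rcd_cutoff or has_auto_reg_results)


# Hypothetical trial: no study completion date, primary completion in May 2020.
example_rcd = rcd_auto(None, date(2020, 5, 15))
example_searched = is_searched_cutoff_1(
    is_eligible_study(True, True, True),
    is_rcd_cutoff_1(example_rcd),
    False,
)
```

Under these assumptions the hypothetical trial falls back to its primary completion date, which is before the cutoff, so it would be included in the phase 1 manual search.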

Distribution

File Format
registrations csv
results csv
trials csv