Primary and Secondary Data in Emergency Medicine H
Abstract
Background This analysis addresses the characteristics of two emergency department (ED) patient populations defined by three model diseases (hip fractures, respiratory, and cardiac symptoms) making use of survey (primary) and routine (secondary) data from hospital information systems (HIS). Our aims were to identify potential systematic inconsistencies between both data samples and implications of their use for future ED-based health services research.
Methods The research network EMANET prospectively collected primary data (n=1442) from 2017-2019 and routine data from 2016 (n=9329) of eight EDs in a major German city. Patient populations were characterized using socio-structural (age, gender) and health- and care-related variables (triage, transport to ED, case and discharge type, multi-morbidity). Statistical comparisons between descriptive results of primary and secondary data samples for each variable were conducted using binomial test, chi-square goodness-of-fit test, or one-sample t-test according to scale level.
Results Differences in distributions of patient characteristics were found in nearly all variables in all three disease populations, especially with regard to transport to ED, discharge type and prevalence of multi-morbidity. Recruitment conditions (e.g., patient non-response), project-specific inclusion criteria (e.g., age and case type restrictions) as well as documentation routines and practices of data production (e.g., coding of diagnoses) affected the composition of primary patient samples. Time restrictions of recruitment procedures did not generate meaningful differences regarding the distribution of characteristics in primary and secondary data samples.
Conclusions Primary and secondary data types maintain their advantages and shortcomings in the context of emergency medicine health services research. However, differences in the distribution of selected variables are rather small. The identification and classification of these effects for data interpretation as well as the establishment of monitoring systems in the data collection process are pivotal.
*Correspondence: Anna Schneider; Andreas Wagenknecht
Full list of author information is available at the end of the article
© The Author(s) 2023. Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which
permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the
original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or
other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line
to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory
regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this
licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
with one of three model diseases: (a) EMAAGE aimed at the emergency and follow-up health care of patients with hip fractures; (b) EMACROSS focused on cross-sectoral care provision for patients with respiratory diseases (with an initial focus on outpatients, which was expanded throughout the recruitment process to also include inpatients) [21, 22] and (c) EMASPOT targeted comorbid mental health conditions, such as depression and anxiety, in elderly patients with cardiac symptoms and diseases [23].

Selection of participants
Inclusion into the primary data sample and extraction of secondary data was performed for patients in participating study EDs with at least one of the respective leading symptoms and diagnoses (see Additional Table 2) [24]. Model diseases and respective diagnoses were initially chosen from a publication on ambulatory care sensitive conditions (ACSC) in the German healthcare setting and adapted to the need of patient recruitment in the ED so that symptom diagnose codes were included [25]. ACSC are a group of common chronic and acute illnesses considered not to require inpatient treatment if appropriate ambulatory care is received [26]. The basis of our analysis consisted of three data types: (a) primary data from the three sub-studies with baseline surveys, as well as data from an electronic case report form (eCRF) for the period from 2017 to 2019, (b) a screening log that monitored and documented the recruitment process and reasons for non-participation for the period of the primary data collection, and (c) secondary data from the HIS of participating EDs for the year 2016. The HIS dataset included all ED visits of patients with respective study-related diagnoses and posed a complete representation of the relevant ED population in the specified time frame of one year.

Data acquisition
For primary data collection, trained study personnel recruited patients between June 1, 2017 and June 28, 2019 during fixed time periods, generally on weekdays between 8 am and 5 pm, in all of the eight participating EDs. Occasionally, recruitment was extended to weekends or weekday evenings. Patients were interviewed by study nurses in the acute ED situation (EMASPOT, EMACROSS) or postoperatively on hospital wards (EMAAGE). Potential study participants were identified by study nurses via patient screening in participating EDs using data from the HIS. Inclusion criteria were project-specific leading symptoms and age (50+ years for EMASPOT and 18+ years for EMAAGE and EMACROSS). If necessary, medical, nursing, or administrative ED staff was consulted for the clarification of inclusion- and interview-relevant questions (e.g., regarding suspected diagnoses and leading symptoms or patients’ ability to be interviewed). Since the inclusion of patients in the primary data sample was based on leading symptoms and not on final and confirmed diagnoses, there was a possibility that a patient no longer possessed one of the study-relevant diagnoses at the end of his or her ED treatment. In these cases, respective patients were subsequently excluded from the study population. The screening process was documented in printed structured questionnaires, i.e., screening logs [4]. After inclusion and informed consent, study nurses interviewed patients for approximately 30 to 60 minutes with handheld tablets containing electronic versions of the study-specific questionnaires. Printed study materials were additionally available in German, English, Arabic, and Turkish language. All participants gave written permission for review of their individual electronic health records for study-specific aims.
For secondary data collection, HIS data of the eight EDs were retrieved for all patients that were treated in one of these EDs during the year 2016 and met the inclusion criteria of at least one of the EMANET sub-studies with respect to age and coded diagnoses according to the International Classification of Diseases (ICD), 10th Revision [27] (see Additional Table 2). Since data were anonymized before extraction, patients’ consent was waived. All EDs received a list of predefined variables for extraction including patients’ sociodemographic information, diagnosis ICD-codes of ED and inpatient treatment, and parameters of ED care.

Data management
Primary data collection in patient interviews was tablet-based and data was automatically transferred after entry to a Research Electronic Data Capture (REDCap) tool hosted at Charité – Universitätsmedizin Berlin. Each study participant received a unique pseudonym. Screening log data and administrative participant data was saved separately in a Microsoft Access database. Further information from patients’ electronic hospital files was manually entered by study nurses into a study-specific eCRF in another REDCap database using the respective participant pseudonym. The central data management of EMANET collated data sets by using participant pseudonyms.
Anonymized secondary data were prepared and delivered to the central data management of the coordinating unit of EMANET by the participating EDs, the information technology (IT) departments of the respective hospitals or the vendors of the respective HIS in CSV (comma-separated values) or Microsoft Excel files. Data delivery followed established data protection procedures
on password-protected devices as described in the project’s data protection concept. The central data management checked all data for completeness and plausibility and linked all files of the participating EDs to yield one final data set with secondary data from all EDs. Due to varying documentation standards between the participating EDs, it was necessary to harmonize the data to establish comparability. HIS data were adapted according to data harmonization rules consented by a working group of EMANET researchers. These rules basically followed the data harmonization recommendations of the INDEED project [28]. All primary and secondary data was stored on servers of the Charité – Universitätsmedizin Berlin.
For comparative analyses between primary and secondary data it was necessary to create two differently tailored datasets of secondary data. One secondary data set was adapted to the recruitment conditions and inclusion criteria of the primary data sample (see Fig. 1 and Additional Figs. 1, 2, 3 for each sub-study), i.e., secondary data for EMACROSS included only patients registered in EDs between 8 am and 5 pm analogous to patient recruitment times of the primary data sample and secondary data in EMASPOT only included patients with an age of 50 years or older and registered in EDs between 8 am and 5 pm. The second data set included all patients presenting with relevant diagnoses (for EMASPOT: age ≥ 50 years) independently from the time of ED admission.

Measurements
In this analysis, the following variables were selected as suitable for comparison between data types: age, gender, triage category, transportation type, case type, discharge type, and multi-morbidity. The choice of variables is justified by their general availability within all eight HIS in the participating hospitals, the comparatively small number of missing data within variables, and their relevance to clinical routine in EDs. Except for multi-morbidity, all variables were generated by directly interviewing patients or by data extraction from each ED’s documentation system. Triage categories were structured according to the Manchester Triage System (MTS). Transportation to the ED was coded in walk-in (including patients accompanied by relatives), by ambulance services (transportation of non-urgent and mobility-impaired patients), by emergency medical services (EMS), and by EMS accompanied by an emergency physician. Case type was coded into in- or outpatient. Discharge type for inpatients was differentiated in discharge to home or an existing care arrangement, transfer to another hospital or other health care facility, death, or other. Multi-morbidity was defined according to van den Bussche et al. as three or more chronic diseases of a predefined list of ICD-codes [29]. For our comparative analyses, ICD-10 diagnoses documented in HIS and diagnoses reported by participants were used to determine multi-morbidity.

Statistical analysis
The aim of the statistical analysis was to compare primary data and secondary data regarding their similarity in the distribution of various ED-related parameters. For descriptive statistics, the continuous variable (age) was characterized by its mean and standard deviation (SD). Categorical variables (gender,
Fig. 1 Illustration of primary and secondary data samples across the three study populations in EMANET. Legend: Ellipses depict the primary
data sample and rectangles secondary data samples used for analyses. Arrows between shapes illustrate data samples which were compared
numerically in this study. Numbers summarize patients with relevant study diagnoses for all three research projects (EMAAGE, EMACROSS,
EMASPOT). ¹ The sample of the general ED population registered between 8am and 5pm excludes ED patients with study-related diagnoses from
EMAAGE since patients were recruited into this study on wards following ED treatment without time restrictions. Abbreviations: ED: emergency
department, HIS: hospital information system
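
The construction of the two secondary data sets (one matched to the recruitment conditions of the primary sample, one covering all presentations) can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual procedure: the record layout (`study`, `age`, `registered_at`), the function names, and the reading of "between 8 am and 5 pm" as including 08:00 and excluding 17:00 are assumptions made for this sketch.

```python
from dataclasses import dataclass
from datetime import datetime, time

# Hypothetical record layout; the real HIS export format is not given in the text.
@dataclass(frozen=True)
class EdVisit:
    study: str               # "EMAAGE", "EMACROSS" or "EMASPOT"
    age: int                 # age in years at presentation
    registered_at: datetime  # time of registration in the ED

# Minimum age per sub-study, taken from the stated inclusion criteria.
MIN_AGE = {"EMAAGE": 18, "EMACROSS": 18, "EMASPOT": 50}
# Sub-studies whose primary recruitment was limited to daytime hours.
TIME_RESTRICTED = {"EMACROSS", "EMASPOT"}
# Assumption: the window includes 08:00 and excludes 17:00.
WINDOW_START, WINDOW_END = time(8, 0), time(17, 0)

def meets_age(visit):
    return visit.age >= MIN_AGE[visit.study]

def in_recruitment_window(visit):
    return WINDOW_START <= visit.registered_at.time() < WINDOW_END

def full_dataset(visits):
    """All visits meeting the age criterion, regardless of admission time."""
    return [v for v in visits if meets_age(v)]

def tailored_dataset(visits):
    """Visits matched to the recruitment conditions of the primary sample."""
    return [
        v for v in visits
        if meets_age(v)
        and (v.study not in TIME_RESTRICTED or in_recruitment_window(v))
    ]

visits = [
    EdVisit("EMASPOT", 72, datetime(2016, 3, 1, 9, 30)),
    EdVisit("EMASPOT", 45, datetime(2016, 3, 1, 10, 0)),    # below the age threshold
    EdVisit("EMACROSS", 30, datetime(2016, 3, 2, 22, 15)),  # outside the window
    EdVisit("EMAAGE", 81, datetime(2016, 3, 3, 2, 0)),      # no time restriction
]
print(len(full_dataset(visits)), len(tailored_dataset(visits)))  # 3 2
```

Keeping the age rule and the time rule as separate predicates mirrors the description in the text, where EMAAGE is exempt from the time restriction because patients were recruited on wards after ED treatment.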
triage category, type of transportation to the ED, multi-morbidity, discharge type, case type) were summarized as numbers (percentages) of subjects. We computed 95% Wald confidence intervals for the distribution of percentage estimates of categorical variables for population proportions of n≥5. In order to estimate the distribution of variables in primary data in comparison to the ED population with respective diagnoses in secondary data, we calculated binomial tests, chi-square goodness-of-fit tests, or one-sample t-tests according to scale level. Available case analysis (pairwise deletion) in SPSS version 27 (IBM Inc.) was used for all analyses. In order to discuss differences between variable distributions in the study-specific primary and secondary data samples, primarily differences in point estimates and confidence intervals of each variable were consulted. Graphical distributions in the form of bar charts for multi-categorical variables (MTS category, transportation to ED, discharge type) are available in Additional Figs. 4, 5 and 6. We included all relevant items of the Strengthening the Reporting of Observational studies in Epidemiology (STROBE) Statement in this manuscript [30].

Results
Patients with hip fractures (EMAAGE): primary versus secondary data samples
In the population consisting of patients with hip fractures (EMAAGE; n=326 in primary data sample; n=439 in secondary data sample), the distributions of age and gender were similar between both samples (see Table 1). Although statistically significant, only minor differences were found in the triage category, i.e., slightly more patients with triage category 2 (“very urgent treatment”) were included in the primary cohort compared to the secondary data population (16.4% vs. 12.6%; see Additional Fig. 4). Concerning the other triage categories, we observed only small deviations between 1 percentage point (pp) and 2pp, rendering the distribution of patients in both populations between triage categories
Table 1 Description and statistical comparison of patient characteristics in primary and secondary data samples (EMAAGE)
Variable                                                 Primary data sample (n=326)   Secondary data sample (n=439)   p value
                                                         n (%, [95% CI])               n (%)
Gender                                                                                                                 .380
  Male                                                   107 (32.8, [27.7; 37.9])      140 (31.9)
  Female                                                 219 (67.2, [62.1; 72.3])      299 (68.1)
Age in years (mean (SD))                                 75.8 (12.1, [74.5; 77.1])     76.8 (13.6)                     .132
Multi-morbidity: Yes                                     213 (65.3, [60.1; 70.5])      252 (57.4)                      .002
MTS category^a                                                                                                         .043
  1 (immediate treatment)                                1 (0.4)                       6 (2.4)
  2 (very urgent treatment)                              44 (16.4, [12.0; 20.8])       32 (12.6)
  3 (urgent treatment)                                   187 (69.8, [64.3; 75.3])      181 (71.5)
  4 (normal)                                             32 (11.9, [8.0; 15.8])        34 (13.4)
  5 (not urgent)                                         4 (1.5)                       0 (0)
Transportation to ED                                                                                                   <.001
  Walk-in                                                9 (2.9, [1.0; 4.8])           26 (7.1)
  Non-urgent medically accompanied transport             49 (15.9, [11.8; 20.0])       76 (20.7)
  Emergency medical services                             211 (68.3, [63.1; 73.5])      239 (65.1)
  EMS with emergency physician                           40 (12.9, [9.2; 16.6])        26 (7.1)
Case type                                                                                                              <.001
  Outpatient                                             0 (0)                         14 (3.2)
  Inpatient                                              326 (100)                     425 (96.8)
Discharge type                                                                                                         <.001
  Home or existing care arrangement                      100 (31.6, [26.5; 36.7])      191 (44.9)
  Transfer (to another hospital or health care facility) 206 (65.2, [59.9; 70.5])      206 (48.5)
  Deceased                                               8 (2.5, [0.8; 4.2])           24 (5.6)
  Other                                                  2 (0.6)                       4 (0.9)
Note: 95% Wald confidence intervals were computed for population proportions of n≥5. CI Confidence interval, ED Emergency department, EMS Emergency medical services, MTS Manchester Triage System, SD Standard deviation
^a The chi-square goodness-of-fit test was conducted with an adjusted variable for MTS category containing categories 1 to 4 due to missing values in the secondary data sample
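
The interval and test columns of Table 1 can be approximated with standard formulas. The Python sketch below uses only the standard library to compute a 95% Wald confidence interval for a proportion and a chi-square goodness-of-fit statistic. It is an illustration rather than the SPSS procedure used for the published analysis: the function names are hypothetical, and deriving the expected distribution from the secondary data counts is an assumption about how the reference distribution was specified.

```python
from math import sqrt
from statistics import NormalDist

def wald_ci(count, n, level=0.95):
    """Wald interval for a proportion, returned in percent."""
    p = count / n
    z = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for a 95% interval
    half_width = z * sqrt(p * (1 - p) / n)
    return 100 * (p - half_width), 100 * (p + half_width)

def chi_square_gof(observed, reference):
    """Goodness-of-fit statistic of observed counts against reference counts.

    Assumption: expected counts are the reference proportions scaled to the
    size of the observed sample.
    """
    n_obs, n_ref = sum(observed), sum(reference)
    expected = [n_obs * r / n_ref for r in reference]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Gender, male: 107 of 326 patients in the primary data sample (Table 1).
low, high = wald_ci(107, 326)
print(f"{low:.1f}; {high:.1f}")  # 27.7; 37.9

# Transportation to ED: primary and secondary counts from Table 1, in the
# order walk-in, non-urgent transport, EMS, EMS with emergency physician.
primary = [9, 49, 211, 40]
secondary = [26, 76, 239, 26]
stat = chi_square_gof(primary, secondary)
# With four categories there are 3 degrees of freedom; the critical value at
# alpha = 0.001 is about 16.27, so a larger statistic corresponds to p < .001.
print(stat > 16.27)
```

The computed interval for male gender matches the tabulated [27.7; 37.9], and the transport statistic exceeds the 0.001 critical value, which is consistent with the reported p < .001.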
similar. Concerning case type, while all patients in the primary sample were hospitalized after ED treatment, 3.2% of patients from the secondary sample were coded as released after their ED stay in HIS data. The variables transport type, discharge type and multi-morbidity showed more profound differences. In transport categories, differences between 3pp and 5pp were found (see Additional Fig. 5). However, the proportion of patients transported by EMS was similar, i.e., the confidence interval included the estimate of the secondary data sample with respective diagnoses. The discharge type differed between cohorts, e.g., 16pp more patients in the primary data sample were transferred to another health care facility, while 13pp fewer patients were discharged home or to a pre-existing care arrangement (see Additional Fig. 6). Finally, the primary data sample included 8pp more multi-morbid patients compared to HIS data (65.3% vs. 57.4%).

Patients with respiratory diseases (EMACROSS): primary versus secondary data samples
In patients with respiratory diseases (EMACROSS; n=472 in primary data sample; n=1,563 in secondary data sample (presentation between 8am and 5pm), n=3,410 in secondary data sample (without time restriction)), characteristics between primary and secondary data samples showed pronounced differences in the comparison of deviations regarding pp and confidence intervals (see Table 2). However, the distribution of gender and multi-morbidity did not show any relevant differences in the three samples. Study participants (53.6 years) were on average six years younger than patients in the secondary data samples (59.9
Table 2 Description and statistical comparison of patient characteristics in primary and secondary data samples (EMACROSS)
Variable; Primary data sample (n=472), n (%, [95% CI]); Secondary data sample, presentation between 8am and 5pm (n=1563), n (%); p value; Secondary data sample, presentation throughout the day (n=3410), n (%); p value
years and 59.1 years, respectively). More patients were discharged home or to existing care arrangements (87.0%) than in the secondary data samples (80.3% and 79.7%, respectively; see Additional Fig. 6). Mortality was lower in the primary data sample (0.2%) than in the secondary data samples (8.3% and 7.7%, respectively). Accordingly, fewer patients in the triage categories 1-3, and correspondingly more patients in categories 4 and 5 were recruited (see Additional Fig. 4). The type of transport to the ED differed: primary data contained fewer patients who were transported by ambulance and EMS, but slightly more patients who were accompanied by an emergency physician, and far more walk-in patients (see Additional Fig. 5). Finally, EMACROSS recruited substantially more outpatient cases compared to the proportion of outpatients in the secondary data samples.

Patients with cardiac symptoms and diseases (EMASPOT): primary versus secondary data samples
In the third project EMASPOT (n=644 in primary data sample; n=2,777 in secondary data sample (presentation between 8am and 5pm), and n=5,480 in secondary data sample (without time restriction)), patients with cardiac symptoms and diseases of comorbid mental health conditions (MHCs) were screened and recruited. Differences in percentage points in the distribution of patient characteristics in primary and secondary data samples were rather small, although statistically significant (see Table 3). Data types did not differ concerning case type. Study participants were slightly younger (68.4 years) than the secondary data samples (69.8 years and 69.5 years, respectively). Slightly fewer patients from triage category 1 were recruited into the primary
Table 3 Description and statistical comparison of patient characteristics in primary and secondary data samples (EMASPOT)
Variable; Primary data sample (n=644), n (%, [95% CI]); Secondary data sample, presentation between 8am and 5pm (n=2777), n (%); p value; Secondary data sample, presentation throughout the day (n=5480), n (%); p value
sample (0.5%) in comparison to the secondary data sample (1.4% and 2.4%, respectively; see Additional Fig. 4). Small differences of 3pp to 4pp were found for discharge types; however, no specific direction or recognizable structure was detectable (see Additional Fig. 6). The EMASPOT sample showed a clear difference in gender distribution, where 7pp more men were included in the primary data sample. As in the other sub-studies, moderate differences between primary and secondary data were found in the distribution of transport type (see Additional Fig. 5). The largest discrepancy of 14pp between populations was found in the prevalence of multi-morbidity with higher frequency for study participants (68.3%) in comparison to secondary data samples (54.1% and 52.6%, respectively).

Non-responder analysis in the primary data sample and missing data
Of all eligible patients, 56.3% in EMAAGE, 45.7% in EMACROSS and 45.4% in EMASPOT gave consent to participate in the respective studies. Reasons for non-participation of eligible patients were described elsewhere as a summary for all three study populations [24]. Gender and age – the only available categories for comparison in non-responders and participants – showed slight differences. In EMAAGE, proportionally more women participated. In EMASPOT and EMACROSS, participants were younger than non-responders (see Additional Table 1).
Missing values in primary data samples and in the complete secondary data sample (with presentations throughout the day) applied to MTS category, transportation to ED, and discharge type (see Additional Table 3). Furthermore, data on case type was missing in EMACROSS and EMASPOT. Analyses of missing data in the secondary data sample revealed that information was partly not available (in the required format) from all eight ED HIS for the above listed variables: data on case type was not available in one ED; transportation to the ED and discharge type in two EDs; and MTS category in four EDs. Concerning the amount of missing values at the ED level, all of the above listed variables were supplied by three EDs, one missing variable each was observed in two ED HIS datasets, two missing variables each in a further two ED HIS datasets and three missing variables were observed in one ED HIS dataset.

Discussion
This contribution’s novelty lies in the comparison of primary and secondary data in the emergency medicine health services research context and its inclusion of three different patient populations and respective indications from eight EDs, which is unique so far. Mostly minor, although statistically significant, differences in distributions of patient characteristics between primary and secondary data samples were found in most variables and for all three sub-studies. Age and gender distributions in study participants mostly reflected the secondary data sample which was also reported in similar trial studies [31].
Differences in patient and case characteristics can be attributed to recruitment conditions, study-specific inclusion criteria and the modus operandi of documentation in hospitals. One of the central reasons for the observed differences in data samples are the specifics of the recruitment situation and process of the three sub-studies, which has also been argued by Roos et al. in the context of longitudinal studies [11]. The effects of recruitment practices for the composition of study populations in health care research are discussed broadly. Some trial studies investigated barriers to patient recruitment, such as migration background, language barriers, cognitive characteristics that make informed consent difficult or the perceived lack of benefits for the patients [32–37]. The issue of recruitment barriers and their effects on study population composition are certainly important when studies operate with the goal of achieving certain case numbers and response rates. Identified further factors influencing recruitment were, e.g., certain communication channels (telephone vs. mail) [38], specific time points of recruitment [39], and the recruitment experience of study nurses [40].
Our analysis focused on patient characteristics and specific features of the recruitment situation. Generally, time restrictions in EMACROSS and EMASPOT did not generate meaningful differences with regard to the distribution of characteristics in primary and secondary data samples. However, in EMAAGE retrospective patient inclusion was practiced, so that in this sample the time of presentation to the ED was irrelevant. In cardiac patients, the prolonged stay on the Chest Pain Unit (CPU) of the ED may have helped to include patients during working hours that initially presented during night hours. Given the almost similar distributions of patient characteristics between samples, we conclude that restricting study recruitment to specific times of the day does not hamper the inclusion of a patient population similar to the target population. This finding appears to be a special feature and novelty of our contribution. Whether this feasibility of comparisons as well as the seemingly negligible impact of time restrictions to recruitment can be generalized to other clinical settings is beyond the scope of this article. As available literature suggests, recruiting and data collection heavily depend on the properties of certain settings [41–43].
The effect of the recruitment process is particularly noticeable for participants with respiratory diseases in EMACROSS whose characteristics differed more
profoundly from the secondary data sample with respiratory diseases. Concerning the distribution of patients’ triage categories, severely ill and acute patients in category 1 were less often included in the primary study sample. Recruitment of patients for interviews of 30 to 60 minutes who are in need of immediate treatment is not feasible for medical and ethical reasons. Differences between populations found in triage categories therefore can be regarded as unavoidable. Concerning EMACROSS, recruitment might have been additionally hampered by patients’ physical inability to conduct an interview due to shortness of breath or respiratory therapy in the ED. We observed that older patients with respiratory complaints were more likely to be non-responders, which might have influenced the age distribution in the recruited population. As studies on hospice patients [31, 44, 45] or patients in stressful situations [46] argued in a similar way, primary data collection might be inappropriate or at least comes with a higher share of nonresponse, if patients suffer from certain illnesses. The same reasoning generally applies to studying diseases that affect patients’ communication skills. Thus, if the importance of particularly severely ill patients is relevant to the research question, recourse to HIS data may be more appropriate.
Inclusion criteria and respective changes during the recruitment process are of particular relevance for the total composition of a population. Participants in EMAAGE and EMASPOT reproduced the distribution of ambulatory and inpatient stays in the secondary data sample with respective diagnoses. The overrepresentation of ambulatory participants (and thus associated surplus of younger and healthier patients) in EMACROSS is explained by the project’s initial inclusion criteria focusing on outpatient ED patients. Participants of all sub-studies slightly differed with regard to discharge types from the secondary data samples. The difference in the number of deceased patients in EMACROSS and respiratory ED patients in general might be explained by the fact that this sub-study recruited mostly younger patients with ambulatory health care needs and rather average to low acuity measured by triage categories [47].
In EMAAGE, we observed an overrepresentation of patients who were transferred to other health care facilities in the primary data sample. This might be due to the focus of the study personnel on patients’ final care arrangements documented in electronic patient files while HIS data only captures the most immediate discharge type after hospital treatment, e.g., discharge home. This indicates the relevance of documentation routines and data production practices in patient surveys and routine data. This was even more obvious in the case of multi-morbidity, which is a generally complex variable [48]. Multi-morbidity was the variable with the most pronounced differences between the data sets in EMASPOT and EMAAGE with higher rates of multi-morbid patients in the primary data sample. This might be explained by two aspects: The primary data set was tailored to detect certain comorbidities that are not systematically documented in ED diagnoses. Especially for ambulatory ED patients in the secondary data sample, only diagnoses relevant for ED treatment are documented in the HIS. Thus, comorbid conditions might not have been systematically documented by healthcare personnel, as other studies also pointed out [29]. The definition of multi-morbidity applied in this data sample is dependent on thorough ICD-coding [29, 48, 49]. Thus, using ED diagnoses from HIS for determining patient multi-morbidity is potentially less suitable, since relevant diagnoses might be lacking and comorbidities are also often recorded in form of free text. Therefore, prevalence of multi-morbidity in ambulatory ED patients might be underestimated. In primary data collection for research purposes, study personnel can not only inquire about relevant diagnoses from patients themselves, but also search through the electronic patient file in HIS on past hospital stays, physician’s letters, and other sources of information. This argument is in line with reviews that have examined the construction of the variable multi-morbidity [50]. We complement the point made by Stirland et al. [50] by saying that the choice of data source is relevant and needs to be critically reflected when doing research on complex variables like multi-morbidity. However, the failure to diagnose and document specific conditions in the ED, e.g., mental health disorders, is another relevant point when considering the reliability of both primary and secondary data concerning completeness of diagnoses [23, 49].
Finally, with regard to patients’ transport to the ED, inconclusive differences were observed in all transport categories between primary and secondary data populations. No content-related explanation for these differences was found, thus indicating that recruitment in our study failed to reproduce the actual pattern of patients’ transport to the ED.

Limitations
Our research combines comprehensive data of two types on three ED patient populations with common model diseases. Nevertheless, our analyses are subject to limitations. Firstly, samples of primary and secondary data were drawn from two different time periods due to research-practical reasons and availability of secondary data. This might have influenced absolute numbers in sample composition. However, no changes in the relative distribution of patient- and care-related characteristics in participating EDs should have occurred in the
rather short period between 2016 and 2019, as there were no major changes in the prevalence of studied (mostly chronic) diseases, medical guidelines for ED treatment of these diseases, and in the structure or processes of ED care in Germany. Secondly, some variable categories in primary and secondary data sets originally differed and were thus harmonized retrospectively for comparative analyses, which might have introduced minor distortions of results. Thirdly, three variables (triage category, transportation to ED, and discharge type) showed high proportions of missing values, especially in secondary data, which might have influenced results. High proportions of missing values in secondary data samples were mainly due to the fact that some EDs did not transmit data on certain variables, e.g., because this information was not collected in the respective ED at all (no mandatory field in documentation forms) or information was not collected systematically for every ED patient. If the amount of missing secondary data from specific study EDs were systematically associated with the above-mentioned ED factors, the distribution of the respective variables in our analysis of secondary data might be biased. However, from descriptive analysis of ED features and the pattern of the amount of missing values per ED, no systematic bias in this direction became evident so that we can reasonably assume that missing values in our datasets occurred completely at random. Lower proportions of missing values in primary data in the same variables might point to the advantage of primary data collection by trained study nurses, who closely observed the care process of study participants during recruitment and manually retrieved not readily available information from all electronic documentation in the patient file. Fourthly, identification of cases in primary and secondary data was different (leading symptoms in primary recruitment and diagnoses in secondary data), which might have affected the comparability of populations. Lastly, the secondary data sample consists of cases from eight EDs. However, the number of patients treated in each ED differs vastly. As documentation of variables was not harmonized prior to data extraction, systematic differences may occur.
small. Observed differences in patient characteristics of the primary data sample might have been influenced by recruitment practices (e.g., non-response, length and type of survey administration), project-specific inclusion criteria (e.g., language and cognitive requirements for study participation, focus on specific case types) and differing documentation rationales. Nevertheless, primary data allow a comprehensive and detailed collection of information on specific patient groups. The higher workload from patient recruitment and resulting lower sample sizes in primary datasets may be disadvantages. In contrast, the secondary data sample depicts the full population of ED patients with respective diagnoses in a specific time frame, although this data type bears the risk of incomplete information due to missing values or non-usable data formats in HIS documentation.
The aim of health services research studies is to depict real-life conditions of health care provision in certain patient groups or settings. Future research studies with primary data collection should thus additionally establish close concomitant monitoring practices during patient recruitment, in order to detect potential deviations from targeted sample characteristics in a timely manner. Additionally, our analysis has shown the need for systematic, harmonized and complete secondary data documentation in hospital information systems for health services research purposes.
Abbreviations
ED Emergency department
HIS Hospital information system
ACSC Ambulatory care sensitive conditions
eCRF Electronic case report form
ICD International Classification of Diseases
REDCap Research Electronic Data Capture
MTS Manchester Triage System
EMS Emergency medical services
SD Standard deviation
MHC Mental health condition
CPU Chest Pain Unit
Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1186/s12874-023-01855-2.
Additional file 6: Figure 6. Graphical distribution of discharge type in primary and secondary data samples in EMAAGE, EMACROSS, and EMASPOT.
Additional file 7: Table 1. Description of non-responders and study participants regarding gender and age from screening logs.
Additional file 8: Table 2. Inclusion criteria for the three EMANET sub-studies EMAAGE, EMACROSS, and EMASPOT.
Additional file 9: Table 3. Description of missing data in primary and secondary data samples in MTS category, transportation to ED, discharge type, and case type.
Acknowledgements
The authors would like to thank the participating hospitals in the EMANET research network, namely Charité – Universitätsmedizin Berlin (Campus Charité Mitte and Campus Virchow-Klinikum), St. Hedwig Hospital (Alexianer Berlin St. Hedwig-Krankenhaus), Elisabeth Hospital (Evangelische Elisabeth Klinik der Paul-Gerhardt Diakonie), Franziskus Hospital (Franziskus-Krankenhaus Berlin), German Armed Forces Hospital Berlin (Bundeswehr Krankenhaus Berlin), German Red Cross Hospital Berlin-Mitte (DRK Kliniken Berlin-Mitte), and Jewish Hospital (Jüdisches Krankenhaus Berlin). The authors would like to thank all study participants as well as all EMANET researchers and study personnel responsible for study planning, patient recruitment, and data management.
Competing interests
The authors declare that no conflicts of interest exist.
Author details
1 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Medical Sociology and Rehabilitation Science, Berlin, Germany.
2 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Division of Emergency Medicine, Campus Virchow-Klinikum and Campus Charité Mitte, Berlin, Germany.
3 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Social Medicine, Epidemiology and Health Economics, Berlin, Germany.
4 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of General Practice, Berlin, Germany.
5 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Department of Psychosomatic Medicine, Berlin, Germany.
6 Charité – Universitätsmedizin Berlin, corporate member of Freie Universität Berlin and Humboldt-Universität zu Berlin, Institute of Biometry and Clinical Epidemiology, Berlin, Germany.
Received: 14 April 2022 Accepted: 30 January 2023
13. Sakshaug JW, Couper MP, Ofstedal MB, Weir DR. Linking survey and administrative records: Mechanisms of consent. Sociol Methods Res. 2012;41(4):535–69.
14. Cornesse C, Bosnjak M. Is there an association between survey characteristics and representativeness? A meta-analysis. Survey Research Methods. 2018;12(1):1–13.
15. Nederhof E, Jörg F, Raven D, Veenstra R, Verhulst FC, Ormel J, et al. Benefits of extensive recruitment effort persist during follow-ups and are consistent across age group and survey method. The TRAILS study. BMC Med Res Methodol. 2012;12:93.
16. Lee KK, Fitts MS, Conigrave JH, Zheng C, Perry J, Wilson S, et al. Recruiting a representative sample of urban South Australian Aboriginal adults for a survey on alcohol consumption. BMC Med Res Methodol. 2020;20:183.
17. Smith MG, Witte M, Rocha S, Basner M. Effectiveness of incentives and follow-up on increasing survey response rates and participation in field studies. BMC Med Res Methodol. 2019;19:230.
18. Prada-Ramallal G, Roque F, Herdeiro MT, Takkouche B, Figueiras A. Primary versus secondary source of data in observational studies and heterogeneity in meta-analyses of drug effects: A survey of major medical journals. BMC Med Res Methodol. 2018;18:97.
19. Lee DS, Donovan L, Austin PC, Gong Y, Liu PP, Rouleau JL, et al. Comparison of coding of heart failure and comorbidities in administrative and clinical data for use in outcomes research. Med Care. 2005;43(2):182–8.
20. Patel A, Rendu A, Moran P, Leese M, Mann A, Knapp M. A comparison of two methods of collecting economic data in primary care. Fam Pract. 2005;22(3):323–7.
21. Schmiedhofer M, Inhoff T, Krobisch V, Schenk L, Rose M, Holzinger F, et al. EMANET - Regionales Netzwerk für Versorgungsforschung in der Notfall- und Akutmedizin [EMANet: A regional network for health services research in emergency and acute medicine]. Z Evid Fortbild Qual Gesundhwes. 2018;135–136:81–8.
22. Holzinger F, Oslislo S, Resendiz Cantu R, Möckel M, Heintze C. Diverting less urgent utilizers of emergency medical services to primary care: Is it feasible? Patient and morbidity characteristics from a cross-sectional multicenter study of self-referring respiratory emergency department consulters. BMC Res Notes. 2021;14(1):113.
23. Figura A, Kuhlmann SL, Rose M, Slagman A, Schenk L, Möckel M. Mental health conditions in older multimorbid patients presenting to the emergency department for acute cardiac symptoms: Cross-sectional findings from the EMASPOT study. Acad Emerg Med. 2021;28(11):1262–76. https://doi.org/10.1111/acem.14349. (Epub 2021/07/27).
24. Schneider A, Riedlinger D, Pigorsch M, Holzinger F, Deutschbein J, Keil T, et al. Self-reported health and life satisfaction in older emergency department patients: Sociodemographic, disease-related and care-specific associated factors. BMC Public Health. 2021;21(1):1440.
25. Freund T, Campbell SM, Geissler S, Kunz CU, Mahler C, Peters-Klimm F, et al. Strategies for reducing potentially avoidable hospitalizations for ambulatory care-sensitive conditions. Ann Fam Med. 2013;11(4):363–70.
26. Purdy S, Griffin T, Salisbury C, Sharp D. Ambulatory care sensitive conditions: Terminology and disease coding need to be more specific to aid policy makers and clinicians. Public Health. 2009;123(2):169–73.
27. World Health Organization. ICD-10: International statistical classification of diseases and related health problems: Tenth revision. 2nd ed. 2004.
28. Fischer-Rosinský A, Slagman A, King R, Reinhold T, Schenk L, Greiner F, et al. INDEED – Utilization and cross-sectoral patterns of care for patients admitted to emergency departments in Germany: Rationale and study design. Front Public Health. 2021;9:616857.
29. van den Bussche H, Koller D, Kolonko T, Hansen H, Wegscheider K, Glaeske G, et al. Which chronic diseases and disease combinations are specific to multimorbidity in the elderly? Results of a claims data based cross-sectional study in Germany. BMC Public Health. 2011;11(1):101.
30. von Elm E, Altman DG, Egger M, Pocock SJ, Gøtzsche PC, Vandenbroucke JP, STROBE initiative. The strengthening the reporting of observational studies in epidemiology (STROBE) statement: Guidelines for reporting observational studies. Lancet. 2007;370(9596):1453–7.
31. Stewart RR, Dimmock AEF, Green MJ, Van Scoy LJ, Schubart JR, Yang C, et al. An analysis of recruitment efficiency for an end-of-life advance care planning randomized controlled trial. Am J Hosp Palliat Care. 2019;36(1):50–4.
32. McCann SK, Campbell MK, Entwistle VA. Reasons for participating in randomised controlled trials: Conditional altruism and considerations for self. Trials. 2010;22(11):31.
33. Shepherd V, Wood F, Gillies K, O’Connell A, Martin A, Hood K. Recruitment interventions for trials involving adults lacking capacity to consent: Methodological and ethical considerations for designing Studies Within a Trial (SWATs). Trials. 2022;23(1):756.
34. Fête M, Aho J, Benoit M, Cloos P, Ridde V. Barriers and recruitment strategies for precarious status migrants in Montreal, Canada. BMC Med Res Methodol. 2019;19:41.
35. Nielsen AL, Jervelund SS, Villadsen SF, Vitus K, Ditlevsen K, Tørslev MK, et al. Recruitment of ethnic minorities for public health research: An interpretive synthesis of experiences from six interlinked Danish studies. Scand J Public Health. 2017;45(2):140–52.
36. Dancy BL, Wilbur J, Talashek M, Bonner G, Barnes-Boyd C. Community-based research: Barriers to recruitment of African Americans. Nurs Outlook. 2004;52(5):234–40.
37. Khidir A, Asad H, Abdelrahim H, Elnashar M, Killawi A, Hammoud M, et al. Patient responses to research recruitment and follow-up surveys: Findings from a diverse multicultural health care setting in Qatar. BMC Med Res Methodol. 2016;16:10.
38. Treweek S, Pitkethly M, Cook J, Fraser C, Mitchell E, Sullivan F, et al. Strategies to improve recruitment to randomised trials. Cochrane Database Syst Rev. 2018;2(2):MR000013.
39. Haidich AB, Ioannidis JP. Patterns of patient enrollment in randomized controlled trials. J Clin Epidemiol. 2001;54(9):877–83.
40. Vluggen S, Hoving C, Vonken L, Schaper NC, de Vries H. Exploring factors influencing recruitment results of nurses recruiting diabetes patients for a randomized controlled trial. Clin Trials. 2020;17(4):448–58.
41. Visanji E, Oldham J. Patient recruitment in clinical trials: A review of literature. Phys Ther Rev. 2013;6:141–50.
42. Whitelaw S, Baxendale A, Bryce C, MacHardy L, Young I, Witney E. ‘Settings’ based health promotion: A review. Health Promot Int. 2001;16:339–53.
43. Vadeboncoeur C, Foster C, Townsend N. Challenges of research recruitment in a university setting in England. Health Promot Int. 2018;33(5):878–86.
44. Williams CJ, Shuster JL, Clay OJ, Burgio KL. Interest in research participation among hospice patients, caregivers, and ambulatory senior citizens: Practical barriers or ethical constraints? J Palliat Med. 2006;9(4):968–74.
45. Phipps E, Harris D, Braitman LE, Tester W, Madison-Thompson N, True G. Who enrolls in observational end of life research? Report from the cultural variations in approaches to end of life study. J Palliat Med. 2005;8(1):115–20.
46. Voss R, Gravenstein S, Baier R. Recruiting hospitalized patients for research: How do participants differ from eligible nonparticipants? J Hosp Med. 2013;8(4):208–14.
47. Holzinger F, Oslislo S, Möckel M, Schenk L, Pigorsch M, Heintze C. Self-referred walk-in patients in the emergency department - who and why? Consultation determinants in a multicenter study of respiratory patients in Berlin, Germany. BMC Health Serv Res. 2020;20(1):848.
48. Johnston MC, Crilly M, Black C, Prescott GJ, Mercer SW. Defining and measuring multimorbidity: A systematic review of systematic reviews. Eur J Public Health. 2019;29(1):182–9.
49. Figura A, Rose M. Ambulatory care-sensitive conditions and mental health disorders: A short overview of the current state of research. Intern J Emerg Ment Health. 2016;18(4):1.
50. Stirland LE, González-Saavedra L, Mullin DS, Ritchie CW, Muniz-Terrera G, Russ TC. Measuring multimorbidity beyond counting diseases: systematic review of community and population studies and guide to index choice. BMJ. 2020;368:m160.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.