One of the good things about research that has its own issues is that there is lots of scope to learn from the things about it that are good, as well as those that aren’t so great. The nice thing about ongoing comment is that it gives even more chances to explain why a researcher might make certain choices along the way. Every research question has more than one way of approaching an answer. Dr Alan Garner returns to provide even more background on this particular study, which has already generated some interesting conversation and a follow-up post.
It’s an excellent thing to be able to keep having discussion around the challenges related to both conducting and interpreting a trial. These things always bring up so many valuable questions, which deserve a response. So this is not going to be quick, but I hope you’ll have a read.
Lots of things changed between the time this trial was designed and now. Standards of care change. Systems, processes and governance models change. Indeed, in this trial standard care changed underneath us. We completed the protocol and gained ethical and scientific committee approval for this study during 2003.
The world was a different place then: at the start of 2003 George W Bush was US President and Saddam Hussein was still running Iraq. There is no keener instrument in medicine than the retrospectoscope, particularly when focused 12 years back. Would I have done things differently if I knew then what I know now? Absolutely. Does the trial have hairs? Looks like a yak to me and I don’t think we are pretending otherwise.
Did we ask the right question? The question was pragmatic. Add a doctor and with them comes a package of critical care interventions that were not routinely otherwise available in our standard EMS system. A number of cohort studies had previously looked at exactly this question and more studies have asked it since. Even papers published this month have examined this question although the issue often overlaps with HEMS as that is how the doctors are frequently deployed.
I might segue slightly to address dp’s question as well, which overlaps here. Is it the procedures that the team performs or the person performing the procedures that matters? Dp suggests that a better study design would be to have them all use the same protocols and then compare doctors with non-doctors. Such a randomised trial has actually been done, although it was a long time ago now, in 1987. It is one of the classic Baxt and Moody papers and was published in JAMA.
Patients were randomly assigned to a helicopter staffed by a flight nurse/paramedic or a flight nurse/emergency physician. The flight nurse and emergency physicians could perform the same procedures under the same protocols including intubation, chest tubes, surgical airways and pericardiocentesis. By TRISS methodology there was lower mortality in the group that included the doctor and the suggestion was this might be related to how they judged the necessity for intervention, rather than technical skill. This study is well worth a read. They note that the outcome difference might have been removed if the nurse/paramedic team was more highly trained, but where does this end? We then move into the question of how much training is enough training, and this is an area that I think is still in its infancy. Each time you do some research you prompt a whole lot of extra, usually interesting, questions.
All That Methods Stuff
Anyway, back to this paper. All analyses presented in this paper were pre-specified in the statistical analysis plan. Although the protocol paper was not published till 2013, the statistical analysis plan (SAP) was finalised by the NHMRC Clinical Trials Centre in August 2010, more than a year prior to follow up of the last recruited patients. Copies of the SAP were then provided to the trial funders and NSW Ambulance at the time it was finalised in 2010. Along the way we have presented data in other settings, mostly at the request of interested parties (such as the Motor Accidents Authority who specifically requested analyses of road trauma cases) and in retrieval reviews. This is why there has been the opportunity for extra public scrutiny by experts like Belinda Gabbe. And public scrutiny is a good thing.
And Standard Treatments?
I’m very happy to provide some reassurance that this study did not rely on junior doctors being put through EMST/ATLS and then sent out to manage severe prehospital trauma patients. Rather the trial protocol states that treatment was according to ATLS principles. In 2003 there was no other external standard of care that we could cite for trauma patient management that was widely and internationally recognised.
The specialists had of course all completed EMST/ATLS but they were also all critical care specialists in active practice in major trauma centres in Sydney with ongoing exposure to severe trauma patients. The average prehospital trauma management experience held by this group of doctors at the beginning of the trial was more than 12 years each. They operated to that high level of treatment standards, with regular reviews of management to make sure it reflected current best practice over the life of a trial that ended up being longer than we hoped.
Other Dimensions of Time
And time wasn’t a friend. Recruitment was indeed slower than planned. This is a common problem in prospective trials. Our estimates of how long it would take to recruit the required sample size were based on a written survey of the major trauma centres in Sydney in 2003 to determine how many unconscious adult blunt trauma patients they were seeing each year. This was reduced to 60% to reflect the fact the trial would recruit for only 12 hours each day (although during the busiest part of the day) and the time needed to recruit was then estimated at 3 years. We in fact planned for 4 years to allow for the fact that patients usually disappear when you go looking for them prospectively. This of course is exactly what happened but to a greater degree than we planned.
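For anyone who likes to see the arithmetic, here is a minimal sketch of that kind of estimate. The function name and the input numbers are made up purely for illustration and are not the trial's figures; only the 60% adjustment comes from the description above.

```python
def years_to_recruit(target_sample, eligible_per_year, coverage=0.60):
    """Estimate years needed to reach a target sample size.

    target_sample     -- number of patients the trial needs (hypothetical input)
    eligible_per_year -- eligible patients seen per year across all centres,
                         e.g. from a survey (hypothetical input)
    coverage          -- fraction of those patients the trial can actually
                         reach; 0.60 mirrors the 12-hours-a-day adjustment
    """
    if target_sample <= 0 or eligible_per_year <= 0:
        raise ValueError("target_sample and eligible_per_year must be positive")
    if not 0 < coverage <= 1:
        raise ValueError("coverage must be in the interval (0, 1]")
    return target_sample / (eligible_per_year * coverage)


# Hypothetical numbers only, chosen so the planning estimate is about 3 years.
planned = years_to_recruit(target_sample=450, eligible_per_year=250)

# If only half as many eligible patients turn up prospectively,
# the same target takes twice as long.
realised = years_to_recruit(target_sample=450, eligible_per_year=125)

print(f"planned: {planned:.1f} years, realised: {realised:.1f} years")
```

The point of the sketch is simply that recruitment time scales inversely with the number of eligible patients who actually appear, so any shortfall against the survey figures stretches the timeline in proportion.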
I agree it would have been nice to have the results formally published earlier. We did present some results at the ICEM in Dublin in June 2012. It is interesting to note that Lars Wik spoke immediately before me at this conference, presenting the results of the CIRC trial on the Autopulse device. This study was finally published online in Resuscitation in March 2014, more than three years from recruitment of their last patient, and this trial did not include a six month neurological assessment as HIRT did. Getting RCTs published takes time. Given we did have to perform six month outcome assessments I don’t think we were too far out of the ballpark.
Randomising in Time Critical Systems
Just to be sure that I really have the right end of the stick on the question of excluding patients after randomisation, I ascended the methodology mountain to consult the guru. For those that don’t know Val Gebski, he is Professor and Director, Biostatistics and Research Methodology at the NHMRC Clinical Trials Centre in Sydney. He was our methodology expert from the beginning of planning for the trial.
When I reached the mountain top I had to leave a voice message but Val did eventually get back to me. He tells me excluding patients post randomisation is completely legit as long as they are not excluded on the basis of either treatment received or their outcome. This is why he put it in the study design.
These are essentially patients that you would have excluded prior to randomisation had you been able to assess them properly and of course in our study context that was not possible. The CIRC study that I have already discussed also adopted this approach and excluded patients that did not meet inclusion criteria after enrolment.
Prehospital studies where you have to allocate patients before you have been able to properly assess them are always going to have these kinds of difficulties. The alternative for a prehospital RCT would be to wait until you know every element of history that might make you exclude a patient. How many of us have that sort of detail even when we arrive at the hospital?
Extra Details to Help Along the Discussion
The newly met reader might also like to know that the call off rate was about 45% during the trial, not 75%. This is not different to many European systems. If you don’t have a reasonably high call off rate then you will arrive late for many severely injured patients.
And of course the HIRT study didn’t involve “self-tasking”. The system randomised cases on a strict set of dispatch guidelines, not on the feelings of the team on the day. This process was followed for nearly 6 years. There was not a single safety report of even a minor nature during that time. Compliance with the tasking guidelines was audited and found to be very high. Such protocolised tasking isn’t inherently dangerous and I’m not aware of any evidence suggesting it is.
It’s reassuring to know that other systems essentially do the same thing, though perhaps with different logistics. For example, in London HEMS a member of the clinical crew rotates into the central control room and tasks the helicopter using an agreed set of dispatch criteria. This started in 1990 when it was found that the central control room was poor at selecting cases, and it resulted in the call off rate falling from 80% to 50%. The tasking is still by a member of the HEMS team; they just happen to be in the central control room for the day rather than sitting by the helicopter.
A more recent study from last year of the London system found that a flight paramedic from the HEMS service interrogating the emergency call was as accurate as a road crew assessing the patient on scene. This mirrors our experience of incorporating callbacks for HIRT.
The great advantage of visualising the ambulance Computer Assisted Dispatch system from the HIRT operations base by weblink was that the duty crew could work in parallel in real time to discuss additional safety checks and advise immediately on potential aviation risks that might be a factor.
To consider it another way, why is the model safe if the flight paramedic is sitting at one location screening the calls but dangerous if he is sitting at another? What is the real difference between these models and why is one presumably a safe mature system and the other inherently dangerous?
I agree that the introduction of the RLTC to mirror the HIRT approach of monitoring screens and activating advanced care resources (with extension to a broader range) was a good thing for rural NSW. However, they did also activate medical teams into very urban areas of Sydney, to patients who were not a long way from a trauma centre and for whom there was no suggestion of entrapment. Prior to the RLTC the Ambulance dispatch policy for medical teams was specifically for circumstances where it would take the patient more than 30 mins to reach a trauma centre due to geography or entrapment. Crossover cases obviously didn’t explain the whole of our frustrating experience of recruitment, but it was one extra hurdle that finally led us to wrap recruitment up.
You Can’t Bite It All Off at Once
In a study where you collect lots of data, there’s no publication that will let you cram it all into a single paper. So there are definitely more issues to cover from the data we have. This includes other aspects of patient treatment. So I will be working with the other authors to get it out there. It might just require a little bit of time while we get more bits ready to contribute to the whole picture.
Of course, if you made it to the end of this post, I’m hoping you might just have the patience for that.
Here are those reference links again:
That Swiss paper (best appreciated with a German speaker).
The earlier London HEMS tasking paper.