Why do companies so often run single-arm studies in development?
What are the potential drawbacks, and why is it not a binary choice between a full comparative study and a single-arm trial?
What in-between solutions can strike a balance between feasibility and rigour?

Single-arm studies are a popular method for collecting data, despite having been critiqued for decades. In this type of research design, the investigator observes only one group of participants over time.

This design is often used when sponsors claim it would be unethical to randomise patients into different groups, or when comparing two treatments would be too difficult.

However, there are several drawbacks associated with single-arm studies. First of all, bias can easily creep into the results.

Anja Schiel – the most recognised regulatory statistician in Europe – and I discuss the potential drawbacks of running single-arm studies, as well as ways to balance scientific rigour with feasibility.

Tune in as Anja and I share the advice we have come up with.

We also discuss the following points:

  • Are you seeing a return to single-arm studies from an HTA perspective?
  • What potential drawbacks should be considered when opting for non-randomized comparative trials instead of randomized controlled ones?
  • If a full RCT isn’t feasible, what other viable approaches are available as substitutes?
  • How can sponsors best communicate these alternative solutions to HTA bodies and explain the value they offer in comparison to an RCT design?

Listen to this episode and share it with your friends and colleagues!

Never miss an episode!

Join thousands of your peers and subscribe to get our latest updates by email!

Get the show notes of our podcast episodes plus tips and tricks to increase your impact at work and boost your career!

We won’t send you spam. Unsubscribe at any time.

Learn on demand

Click on the button to see our Teachable Inc. courses.


Anja Schiel

Special Adviser / Statistician / Methodologist at the Norwegian Medicines Agency (NoMA)

She studied Biology at Johannes Gutenberg University in Mainz, Germany, received her PhD from the Free University in Amsterdam in 2006, and worked for several years as a post-doc on a range of subjects focusing on oncology, immunology, and molecular biology, first at the University of Leiden and later at the University of Oslo, before starting at the Norwegian Medicines Agency (NoMA) in 2012.

At NoMA she works as a Special Adviser/Statistician/Methodologist on both regulatory (EMA) and HTA projects. She chaired EMA’s Biostatistics Working Party from 2017 to 2019 and the Scientific Advice Working Party (SAWP) from 2019 to 2022. She currently continues as an alternate member of the SAWP and is a member of the new Methodology Working Party (MWP) recently established at EMA.

Since it was established in 2020, she has furthermore been a member of the Big Data Steering Group, working on utilising the full potential of the vast amount of healthcare data for regulatory decision-making.

In her role as team leader for international HTA (iHTA) at NoMA, she has been heavily involved in EUnetHTA JA3 and its successor, EUnetHTA 21. Her particular focus is on parallel EMA-HTA scientific advice, now in her role as Vice-Chair of the JSC-CSCQ.

She is furthermore involved in several academic and PPP projects as a member of scientific advisory boards on subjects such as RWD, patient-reported outcomes, rare diseases, paediatric drug development, decentralised trials, and digital tools.

Transcript

Interview Anja Schiel about single arm studies

[00:00:00] Alexander: Welcome to another episode! Today I’m really excited about a guest that I was thinking about having on for a very long time: Anja Schiel. How are you doing, Anja?

[00:00:16] Anja: Fine. Thanks Alexander.

[00:00:18] Alexander: Okay. If you are working in the HTA area, Anja probably doesn’t need any further introduction. For all of the others: Anja, maybe you can explain a little bit what you do and how you got there.

[00:00:33] Anja: Yes, I’ve been working at the Norwegian Medicines Agency since 2012. I have a background in biology, actually. I mostly drifted off into statistics once microarrays came into the picture and datasets started looking a little bit more complex than the usual 10 mice in our experiments.

And I would say that, as a woman with an interest in statistics, I was very strongly pushed in that direction by many people, because it was seen as a strange combination; most people don’t like statistics. But the real reason why I ended up doing the job I’m doing is that my agency was looking for someone who could explain complex statistical problems to people who don’t speak statistician. And that goes for lots of my colleagues, we have six by now at the agency, and there’s barely anyone who’s a real statistician. They’re all coming from sideline backgrounds, I would say for the very reason that we need to communicate, to translate. So it’s not just the facts; rather, we explain to other people what it really means for their interpretation, instead of packing everything in a language that nobody understands anyhow. And that’s how I ended up there.

[00:01:53] Alexander: But you also have a couple of additional roles now. Tell us a little bit about EUnetHTA 21 and your role there.

[00:02:02] Anja: I’m one of the vice chairs of the EUnetHTA 21 Joint Scientific Consultation Committee. The joint scientific consultations are the follow-up of what used to be called the parallel advices, which is a bit of a confusing term because there’s also parallel advice with the FDA, so I think that’s where some confusion was living. And before that it was called early dialogues. The idea is that for many years, people like me who work at the border between regulation and HTA have understood that there are problems when those two stakeholders don’t communicate properly with each other: the industry is almost trying to serve two kings, and that cannot really work unless those two kings and their kingdoms are connected and understand each other. That is the main reason why we have these joint scientific consultations. They have been one of the two elements of the three joint action programmes that EUnetHTA went through in the last decade.

They were two of the products that the industry most strongly felt were needed. And the joint scientific consultations really had enormous popularity, with the usual problem that popularity kills: the HTAs have limited capacity to do these kinds of advices, and the demand was much higher than what we could offer. But now, with the new formal legislation coming in 2025, the joint scientific consultations are one of the cornerstones, together with the joint clinical assessments.

[00:03:36] Alexander: Yeah. I hope that, both on the regulatory side as well as on the HTA side, there will be a lot of new opportunities for statisticians to work on the non-pharma side in the future, and that across Europe we get a much stronger representation there at some point. I think that would be pretty cool. Okay, but we don’t speak about EUnetHTA today, although it’s great to have someone from that perspective here. We wanted to speak about a very specific point.

We both stumbled over the same LinkedIn post that was following up on ISPOR Europe 2022, where someone said: yeah, we need to convince payers of single-arm studies. And you said something like: single-arm studies should maybe not be the default option for developing drugs and getting them through HTA. Yet on the other side, having a complete head-to-head study is very often maybe not such a perfect solution either. But there are surely other ways to move forward. Let’s start by talking about single-arm studies. Where do you see most of these single-arm studies coming in?

[00:05:00] Anja: Unfortunately they’re coming in where they are not supposed to be.

[00:05:03] Alexander: Okay.

[00:05:04] Anja: So there’s a lot in the oncology field; that’s probably the biggest area where we have a problem with the way the single-arm trial is used. Fifteen, twenty years ago, you would see a single-arm trial as a phase two, hypothesis-generating trial, which is perfectly okay if you want to do that as the pharma industry. But when it comes to approval, it starts getting to be a problem. It should be a problem, and for HTAs it’s almost a no-go, because obviously, if one understands what HTAs are supposed to do: we first need to assess the internal validity of a study. Then we have to establish relative effectiveness, not just effectiveness, relative effectiveness, and that in itself already tells you there has to be something to compare to. And then, in a last step, when we have to make this decision on whether to buy something or not, we have to run through an exercise which establishes the external validity, meaning that you have to contextualise your results within a national healthcare system. That explains why 27 different member states have 27 different external validity realities they need to compare to. But with single-arm trials, it starts simply with the internal validity: they don’t have any, if you are really honest about it. They are usually also questionable because they almost always have small sample sizes, because if you had enough patients, then I think we would all agree that you shouldn’t do a single-arm trial to begin with. There is the risk of selection bias, and every statistician is aware that that is a really huge problem if you have it. And single-arm trials also rely on a ton of assumptions. Unfortunately, the justification is often just making a statement. And as a matter of fact, we have seen in recent times that some of these assumptions have actually simply been wrong, and once you get more data, you realise that they were wrong to start with.

[00:07:02] Alexander: Do you have an example of that?

[00:07:04] Anja: You probably remember the publication about the accelerated approvals from the FDA, where they said that for some of the accelerated products either the survival data were never produced, or in some cases it was shown that there was no survival benefit, and in the worst case you even have a detriment in the long run, once you finally do the real analyses on the real data and you have comparisons. And that is because this concept of the single-arm trial is really one of go/no-go decision-making. But that is not how either regulators or HTAs work; we are not go/no-go, we have a much more complex problem. And we also have this issue with the endpoints that you can use in single-arm trials: they are not necessarily considered clinically relevant by HTAs. Observed response rate is something that is very highly debated, and there are enough clinicians who will tell you it has no clinical meaning for patients. There are patients who claim it has clinical meaning for them; it depends on who you ask. But for HTAs it is a huge problem, and the endpoints that we are usually willing to accept, the time-to-event endpoints, cannot be interpreted in a single-arm trial. It is that simple.

[00:08:26] Alexander: Yeah. So there are a lot of questions in there. The first is with observed rates: anything that has the name “observed” in it, patients’ assessments, physicians’ assessments, all these kinds of things, is prone to all kinds of different biases, as we know, and they are much more susceptible to biases from investigators. You probably have different investigators in a clinical trial compared to those you have in the real world, that’s a typical one. You probably have different sites, and all these kinds of different things. You have different ways to assess things: maybe you look more regularly, maybe you look more closely, whatsoever. All these kinds of things complicate it even more. Now, if we had something like survival rates, how would that help? Or which kinds of problems would you see remaining?

[00:09:31] Anja: The endpoint is just one aspect. The real issue for us is the comparison. It’s not even the randomization because statisticians should understand that the randomization is just an insurance that we don’t have this selection bias.

[00:09:48] Alexander: Yeah. Yeah.

[00:09:49] Anja: But once you have randomized, everything can happen in a trial; you never have full control. We understand that patients have different experiences, they do different things. Maybe they violate the protocol even without telling you, because on occasion patients might lie. So all these things are quite accepted. What I as a statistician find to be the problem is this: I can live with variability, heterogeneity, with protocol violations, but I need to be able to assess them, and I can only assess the degree and the impact if I have a comparison. Without a comparison, this is all just assumptions, promises by someone that no, it didn’t make any difference. I can’t assess this. And as a statistician, that’s the only thing I can’t accept. I can live with all kinds of wild assumptions; then it’s my task to prove that they are wrong, or that they are overly optimistic or overly pessimistic or whatever. But in a single-arm trial, I’m left with nothing that would allow me even to prove that something is wrong or biased. That is the problem that I see as a statistician, but that’s very specific to the statistician. If I have to go and take my health economist role, there are so many additional levels that come in. But just to start with, as a statistician, I’m against single-arm trials in a pivotal setting because I cannot make an inferential conclusion, and that is what I’m supposed to do. That’s at least how we work normally.

[00:11:20] Alexander: Yeah. Okay, so let’s speak about the comparison. I guess if you say comparison, you’re referring to something other than a statement like: in the literature, we have seen response rates of 2%, and here in the study we have a response rate of 20%, therefore this is a breakthrough treatment.

[00:11:43] Anja: Yes, and that’s definitely a problem. Nothing is more frowned upon, if you look into the HTA statisticians’ literature, than what they call naive comparisons. We all know that even the same company running the same trial has difficulties repeating its own trial. So how on earth am I supposed to rely on something that comes from a literature description? Yes, they had some similar inclusion and exclusion criteria, but anyone who has ever done a network meta-analysis realises that this is a guarantee for absolutely nothing, because you’re going to throw out more studies with the same inclusion and exclusion criteria than you’re going to include.

It tells you something about the fact that, as I said, randomization holds at baseline, for the very second you put this patient into the randomization system, and after that it doesn’t hold anymore; it’s only the assignment to the treatment. Then everything has an impact: your treatment centres, whether it’s multinational or national, regional preferences, local preferences, whatsoever. And patients still have rights when they are participating in a trial, so they don’t have to behave like little robots. And that’s exactly the point. So no, you cannot compare trials with each other unless you are really willing to go all the way, trying to figure out how well they match and whether there are additional variables that you can use for it. We see that quite often in HTA submissions; we live with these indirect treatment comparisons. And what I use them for is to analyse how poor the fit actually is, because if you end up with, let’s say, 20 patients out of 200 that you can really match, it tells you something about how uncomparable the data sources actually are.

[00:13:43] Alexander: Yeah, that’s a good way. I recently saw a graph at the APF meeting here in Germany, where someone talked about propensity scoring and how you need to look into the overlap. One way to look into this is the effective sample size, and if that decreases dramatically, like you just mentioned, by 80 or 90%, then the overlap is not that big, and it becomes a lot of extrapolation and a lot of additional assumptions. So then it gets really tricky. Now, there’s a lot of thinking about, okay, let’s start with some kind of target trial, the optimal trial. Let’s say, optimally, you would have a one-to-one randomized study with a comparator; let’s call the comparator the standard of care for the moment. That would be the ideal trial that you would like to run. On the other hand, you have the one-arm study, where you only have your arm and more or less nothing else, or, as I just said, some literature stuff. If you step back from the one-to-one randomized study, what would be the next solution, working towards this one-arm study?
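To make the effective sample size point concrete, here is a minimal sketch with made-up weights; the numbers, variable names, and the ess helper are illustrative, not taken from any real submission. It shows how the Kish effective sample size, (sum of w)² divided by the sum of w², collapses when a few patients carry most of the weight, which is exactly the poor-overlap warning sign Alexander mentions.

```python
# Minimal sketch (hypothetical weights): effective sample size of a weighted
# sample, as used to judge overlap in propensity-score weighting or MAIC.
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Good overlap: weights close to 1, so no patient dominates
w_good = rng.uniform(0.8, 1.2, size=n)

# Poor overlap: a handful of patients carry almost all the weight
w_poor = rng.lognormal(mean=0.0, sigma=2.0, size=n)

def ess(w):
    """Kish effective sample size: (sum w)^2 / sum w^2."""
    return w.sum() ** 2 / (w ** 2).sum()

print(f"n = {n}")
print(f"ESS, good overlap: {ess(w_good):.1f}")  # stays close to n
print(f"ESS, poor overlap: {ess(w_poor):.1f}")  # collapses, as discussed
```

With weights near 1 the ESS stays close to n; with highly variable weights it can easily drop by 80 to 90 percent, and the matched comparison then rests on a small, heavily extrapolated subset.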

[00:15:06] Anja: I have to start with a statement: as a regulatory statistician, I agree on the one-to-one, blah, blah, blah. As an HTA statistician, I would already say nope, it doesn’t necessarily have to be like that.

[00:15:19] Alexander: Okay.

[00:15:20] Anja: So our ideas about what good evidence is are not the same as the framework regulators have established. We don’t actually insist on any p-value or alpha level.

[00:15:36] Alexander: Yeah, I know.

[00:15:37] Anja: It’s okay to use it for planning purposes; it’s okay to agree that this is a success criterion, so if you don’t meet it, you shouldn’t call your drug, or your drug development program, or your trial, a success. But in the HTA world that really doesn’t count, because if you come with a primary endpoint that is irrelevant for us, we have to look at secondary endpoints anyhow, and they have not been controlled for multiplicity most of the time. So we actually look at the data and whether there is a difference. We will look at the confidence intervals, and we start instantly complaining when they overlap. No matter if somewhere, at some point, you had a statistically significant result, that’s not going to save it for you.

And that’s why the RCT is not the optimal tool for HTAs per se, but we have to live with the fact that you have to have some standards, and regulators have made these standards. It is a gold standard, but the gold standard isn’t the one-to-one randomization or the p-value in itself; it’s the comparative aspect with a concurrent control. These two Cs are the core of what we want to see. If you come with a single…

[00:16:54] Alexander: Just a moment, if you speak about a concurrent control, what’s that? What does that mean for you?

[00:16:59] Anja: It means that I don’t accept something that was generated 10 years ago. No matter how often you say it’s the same population, it isn’t; time changes things. I always say that every drug we approve is like a mini atomic bomb: it changes the world forever, because the same patient population doesn’t exist anymore. Once this drug comes to the market, it’ll have an impact, and the drugs that have been used before no longer have the same efficacy, the same relative effectiveness, in the real world because of this new kid on the block. It changes everything from that time point on. You might have a selection towards patients with a better prognosis getting the new drug, while patients with a poorer prognosis might be more conservative. Physicians might be more conservative in some indications; many physicians have a tendency to say the new is not always better, let’s be careful, so uptake is very slow. In others, like oncology, everybody jumps on everything; new is by definition better, whether it has proof or not. And that’s exactly where I say: look, these impacts cannot be underestimated in this whole picture.

[00:18:12] Alexander: Yep.

[00:18:13] Anja: And that’s where concurrent really means concurrent: for me, a patient in the trial, someone who looks like me has to be my control at the same time point as I am in the trial, not five years before, not even a year before, because that might also not be correct, since my healthcare system changes. And this has to be taken along. That’s what I mean by concurrent. But concurrent doesn’t mean included in a randomized clinical trial per se. There are options that people are not sufficiently exploring.

[00:18:46] Alexander: Yes. I think there’s also, especially if you enter areas where there are multiple options: in a clinical trial, you usually have one comparator. I rarely see anything that has more than one active comparator; maybe you have an active comparator and placebo, but two active comparators? I don’t know; I’ve actually seen that in HIV, but that was a very specific thing. But if you want to compare to all the different drugs out there, especially when later on all the health economics and these kinds of things come into play, you can’t do that within just one trial, usually. You would always need to go back to indirect comparisons, network meta-analyses, these kinds of things, wouldn’t you?

[00:19:33] Anja: In a way, yes. But on the other hand, it’s a perception that you have to compare yourself to everything out there, and that’s really not true. We just had a DIA meeting about the famous PICO, and the industry claims it’s going to be something like 400 PICOs in the European context. That’s really not true. The majority of us have one PICO, and that one likely describes our preferred first-choice comparator. But for pretty much anybody who does cost-utility analysis, the idea is that you need an anchor from which to build onto the other comparators. That’s what we are looking for; that’s why we want head-to-head comparisons, not versus placebo even; we don’t like placebo per se. I’ve been trying to explain to the industry for quite some time now that there is a strategic choice you can make, because regulators will accept several comparators in your study. But if you are really smart, and I really hope there are lots of people listening to this who are really smart, you will choose the comparator for your randomized clinical trial based on an exercise done beforehand, trying to figure out which comparator would give you the strongest network meta-analysis or indirect treatment comparison options for the HTAs. And that might, as a matter of fact, not always be the last approved drug, but rather one for which there is more evidence available. If you do that exercise beforehand, doing this network meta-analysis, you can easily figure out whether there is a comparator that has huge advantages because there are many other studies that would allow you to build a stronger network; that should be your first choice, because in the end it might be one of the PICOs. And the idea is that we can formulate PICOs with different populations or different comparators, and the difference is between “and” and “or”. We tend to say “or”, meaning we have a preferred comparator, but we would also accept another comparator, and a third comparator, in the first case. And if any of those are the ones that you can pick, and they help you to then make this extrapolation to the other comparators, the contextualization that is required for different countries, then you can make your job easier by just doing your homework beforehand, not afterwards, because that’s what always happens: you run your trial, you’ve discussed it only with the regulators, and they said, yes, this is okay. They don’t tell you something else would also be possible; they just say, yes, this is okay, because that’s the question you’re asking, instead of asking yourself: what’s the strategically best choice?

And would that still be acceptable for the regulators? Probably it would be; they wouldn’t say no to that either. But if you do your homework, you realise that it paves the way for the other analyses that are needed for the EU 27, plus every other country in the world that does cost-utility analysis; it’s not just us 27, there are many other countries that do the same. So this is strategic thinking, and it means that you have to start thinking reimbursement from day one, before you start your drug development. When you’ve made the go decision, this is going to be a drug we want to develop, then you have to start thinking reimbursement and how do I get it to the patient, not how do I get approval.
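As a rough illustration of the homework Anja describes, here is a minimal sketch that checks, for each candidate comparator, how well a new head-to-head trial against it would connect a new drug to the other likely PICO comparators. The trial network, drug names, and counts are all hypothetical, and networkx is used only for graph bookkeeping; a real exercise would be a proper network meta-analysis, not a connectivity count.

```python
# Minimal sketch (hypothetical evidence network): which anchor comparator
# gives the best-connected network for indirect treatment comparisons?
import networkx as nx

# Hypothetical published trials: (treatment_a, treatment_b, number_of_trials)
published = [
    ("placebo", "drug_A", 6),
    ("placebo", "drug_B", 4),
    ("drug_A", "drug_B", 3),
    ("drug_A", "drug_C", 2),
    ("drug_B", "drug_D", 1),
]

G = nx.Graph()
for a, b, k in published:
    G.add_edge(a, b, trials=k)

targets = {"drug_A", "drug_B", "drug_C", "drug_D"}  # comparators HTAs may ask for

for candidate in sorted(targets):
    H = G.copy()
    H.add_edge("new_drug", candidate, trials=1)  # our planned head-to-head RCT
    reachable = [t for t in targets if nx.has_path(H, "new_drug", t)]
    # Longer chains mean more bridging steps, hence more stacked assumptions.
    depth = max(nx.shortest_path_length(H, "new_drug", t) for t in reachable)
    print(f"anchor {candidate}: reaches {len(reachable)}/{len(targets)} comparators, "
          f"longest indirect chain = {depth} steps")
```

In this toy network, anchoring on drug_A keeps every indirect chain short, while anchoring on drug_D forces long chains with more assumptions stacked on top of each other; that is the kind of pre-specified argument one could bring into a joint scientific consultation.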

[00:23:08] Alexander: Honestly, having worked for about 20 years in the pharma industry, I can tell you that the statisticians who work on regulatory questions and the statisticians, if there are any, who work on HTA can actually work much more closely together and have an advantage here. The stats community within these companies is usually quite small, at least compared to all the other communities; the medics and the HTA and market access people are multiples bigger than the stats departments. And that can be an advantage: you can know each other quite well and you can help each other. But of course that means you need to reach out, you need to learn from each other. I’ve yet to come across a statistician who knows both worlds, the regulatory and the HTA world, inside out; I’m just not sure that one career is long enough to become such an expert. So there’s always the need for working together. And as soon as you start thinking about your phase two and phase three plans from a regulatory side, it is so important to work together internally and to have someone from the stats side who can also think through these kinds of strategic things.

There’s usually someone in new product planning, an HTA or market access person, at least in the bigger companies that already have one drug on the market and have gone through all of that pain. Have someone within your stats department who works with them. That is so important. Okay.

So yeah, that’s really helpful. You don’t need to compare to everything within a clinical trial; you can compare to one comparator that gives you a lot of strategic options. One of the areas I’ve worked a lot in is psoriasis, for example, and there are two drugs: one is etanercept and the other one is ustekinumab, and lots of studies have been run against these. Of course, most studies have been run against placebo, but if you look into the networks, these are the ones that really stand out. Now, etanercept is maybe not so much the standard of care anymore, while ustekinumab is a little bit more current; it’s probably still the workhorse of many dermatologists out there. So that might actually be a good choice for a comparator, where you get a lot of bridge comparisons against all the new treatments out there, because there are a lot of head-to-head studies against ustekinumab. That gives you the strength you need while also giving you something relevant for day-to-day physicians; maybe not the avant-garde dermatologists who always jump to the newest ones, but actually dermatologists are more conservative people, they stick with their standard treatments for quite some time. Maybe that’s also another problem, but as we said, there are lots of opportunities.

Okay, very good. So let’s now take the step away from the RCT. We have the relevant comparator, and it’s maybe one-to-one randomized. What would be the first step, let’s say, towards a one-arm study?

[00:26:46] Anja: I always think in terms of: what’s the question you need to answer? I’m really very much in favor of thinking adaptive design, because that really does allow you to stop your control arm, for example, once you reach something. You can do threshold crossing, for example, or you can define hallmark points that you have to reach. In some of the rarer diseases, you can probably best define something where you say: okay, now we’ve seen enough. I think this is where the statisticians sometimes get in the way on the regulatory side; they wouldn’t get in the way on the HTA side, because “enough” is not the same as statistically significant. So yes, you can accept a certain risk in accepting that you’ve seen enough, which might not be the same as the inferential framework, and you need to be willing to discuss that: what is enough? When do I feel that I have seen the data that make me certain enough on a safety or efficacy endpoint that I would accept that now the controls can go out? But it depends also on your claim. And I think something is just a bit broken in our system. Patients, payers, HTAs: we all want better drugs. But if you can’t get better, that doesn’t mean that something which is as good as couldn’t come to the market, preferably with a lower price or preferably with another advantage. So yes, there are reasons why you want additional drugs on the market, but the idea seems to be that everybody says we have to be better; but that’s so difficult, so to avoid having to be better, we do a single-arm trial. Which is a completely illogical choice. And that’s, I think, where I’m most frustrated with the system. It doesn’t have to be better; it can be just as good. The point is that you have to prove it, and the proof is only possible when it’s comparative. It’s never possible by just claiming that I am potentially better, theoretically better. No: bring me data, and let’s discuss what these data have to look like. And there are, as I say, many options. Maybe you’re not better; it doesn’t mean that you couldn’t be in the armamentarium of the dermatologist as a useful tool. Maybe your side effects are different, or the frequency of administration is preferable for some patients but not for others. That’s really not the issue. The issue is the proof, and you have to prove any claim of “better” just as well as you have to prove any claim of “not worse than”, and there always has to be some RCT part in your development program. How big that has to be, that’s a different point. And just as you say, if you understand that your trial is going to be enormously big because you have to include a very heterogeneous patient population, then the discussion has to be not around how we can minimize the costs and the burden to the company in the development program while claiming we want to spare patients. We can come back to the spare-patients argument.

The point is: which questions can only be answered by comparative data, and which questions can potentially be answered by a different approach? For example, many have published on this, including me and a couple of colleagues, on hybrid designs, where we say not everything has to happen in the RCT. You can find additional information, for example, on populations that you do not want to include in the trial due to heterogeneity. You can find information on the natural history, but please not from 20 years ago; it has to be concurrent.
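Anja’s “seen enough” idea can be formalised in many ways. Here is one minimal sketch of a Bayesian interim rule for deciding whether the concurrent control arm could be dropped in an adaptive design; the interim counts, the flat priors, and the 0.95 threshold are my illustrative assumptions, not a method she prescribes in the episode.

```python
# Minimal sketch (hypothetical interim data): a Bayesian "seen enough" rule
# that could justify stopping the concurrent control arm in an adaptive design.
import numpy as np

rng = np.random.default_rng(2023)

# Hypothetical interim data: responders / enrolled per arm
x_trt, n_trt = 18, 40
x_ctl, n_ctl = 9, 38

# Beta(1, 1) priors give conjugate Beta posteriors for the response rates
post_trt = rng.beta(1 + x_trt, 1 + n_trt - x_trt, size=100_000)
post_ctl = rng.beta(1 + x_ctl, 1 + n_ctl - x_ctl, size=100_000)

# margin = 0 encodes "at least as good as"; a positive margin encodes "better"
margin = 0.0
prob = np.mean(post_trt - post_ctl > margin)
print(f"P(treatment - control > {margin}) = {prob:.3f}")

# The threshold itself (0.95 here) is exactly the "what is enough?" question
# that would have to be pre-agreed with regulators and HTAs.
if prob > 0.95:
    print("Interim rule met: dropping the control arm is on the table.")
```

The point is not this specific rule but that “enough” is pre-specified and discussable, rather than being equated with a significant p-value.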

[00:30:54] Alexander: So if you talk about hybrid designs, it means you have some kind of external controls?

[00:31:00] Anja: Yes, observational aspects, I would call them. And there are many options; few of them are discussed, because at the moment the thinking is pretty much that if we go observational, then we do that in phase four, when everything is already broken. We have created a problem by design: we did not do the right clinical trial for everyone, we missed collecting relevant information on subpopulations, or on other comparators, or on different endpoints that are more relevant for others. But now we want to fix this problem after the fact, and that’s never going to work. We all know that’s not going to work, because, again, concurrent is the miracle word when it comes to this.
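One common way such hybrid borrowing from an external or observational control is formalised in the literature is a power prior, where the external data enter the analysis downweighted by a factor between 0 and 1 rather than pooled naively. The sketch below uses hypothetical counts and an assumed weight; the power prior is a standard device from the literature, not something Anja names in the episode.

```python
# Minimal sketch (hypothetical data): power-prior borrowing from an external
# control to supplement a small concurrent control arm.
import numpy as np

rng = np.random.default_rng(42)

x_ext, n_ext = 30, 120  # external (observational) controls
x_cc, n_cc = 7, 25      # small concurrent control arm in the trial

a0 = 0.5  # borrowing weight: 0 ignores the external data, 1 pools naively

# Beta(1, 1) prior; the power prior scales the external counts by a0
alpha = 1 + a0 * x_ext + x_cc
beta = 1 + a0 * (n_ext - x_ext) + (n_cc - x_cc)

post = rng.beta(alpha, beta, size=100_000)
print(f"Posterior control response rate: mean = {post.mean():.3f}, "
      f"95% CrI = ({np.quantile(post, 0.025):.3f}, {np.quantile(post, 0.975):.3f})")
```

How much weight a0 deserves is precisely the concurrency question: the less comparable and the less concurrent the external data, the closer to 0 it should be, and that judgement has to be defended to regulators and HTAs.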

And that’s where I think statisticians have a huge potential: to show everything between the RCT and the single-arm trial. There is a myriad of possibilities for what you could do to minimize the number of patients who have to be exposed to something they might find unattractive. But keep in mind: unless you have proven in black and white that you are better or as good as, any such claims made ahead of the development program are really misplaced. Patients also need to understand that new is not always better, and they are walking into this trap there. I always say that there’s a reason why we have double-blinding as an aim, because physicians are equally bad at judging what is really going on; they also mostly see what they want to see. And as a statistician, when I tell people what we are doing, I always say: we are the science of trying to help people avoid seeing things that are not there. That’s what we do.

[00:32:54] Alexander: Or just seeing the things that we want to see?

[00:32:57] Anja: Yes, and trying to give them a way to understand that no matter how much they want something to work, there is an alternative explanation, and if there is enough evidence supporting the alternative explanation, that it doesn’t work, then they simply have to accept that fact. That’s what statisticians are supposed to do, not tell people all the time: no, you can’t do this, you can’t do that. We have to explain to them the potential for misinterpretation, the risk they’re taking if they make a decision based on something that’s not solid enough and not robust enough, the dangers of not having a control arm. And yes, at the very end of this discussion, when we have offered them many alternatives in between that would all have benefits in terms of their design being more robust and still attractive enough for patients, then we can in the end come to the conclusion that in some instances a single-arm trial is the only option. And I agree with that. But an important aspect is always that you need to make a difference between generating just some signal of efficacy and the wish to generate scientific knowledge. I think trials have to be the latter, not the former. They need to contribute to the scientific knowledge as well, and avoiding all the problematic questions by doing a minimalistic design is the industry refusing to contribute to the building of scientific knowledge. And I also agree that we have to protect participants in trials as best as possible.

But then again, a trial is an experiment. There is no human right to participate in an experiment; it’s your free choice. There’s also no human obligation to participate in an experiment if you don’t want to. But what the experiment should, in the end, lead to is information for future patients. And at the moment, with the way we are developing drugs, we are sacrificing the future patient to the argument that we have to protect trial participants from having to do something that is unpopular. I keep wondering: all the generations before us that had to go through all the randomized clinical trials to give us the drugs we have today, have they all been idiots for doing so?

No, they haven’t. And do we have an obligation to at least try to participate in these experiments for the greater good? Yes, we do. If you don’t want to, you don’t have to, but if you do, then please participate in a good trial.

[00:35:42] Alexander: Yep. And actually, I would put it the other way around: it’s ethically questionable to put someone in a bad study, because you put them at risk without much benefit in the end. And I really like the approach with the adaptive design, because that gives you lots of opportunities, especially if you at the same time also run a prospective observational study. There you get exactly what you talked about, the concurrent treatment. You get patients from other sites that don’t participate in the study; you maybe also get patients from the participating sites who don’t fit the inclusion and exclusion criteria, or certain vulnerable populations, whatsoever. And that helps you establish exactly the framework you talked about, in terms of external, what did you say, contextualization. It also helps you understand what’s really going on at the moment, not in the studies but in the clinic. Usually we have different ways of treatment in different countries, in different regions of the world. And if you’re talking about a rare disease, it’s also a great opportunity to get in touch with all the different researchers around the world. I wouldn’t call that my primary objective here, but getting in touch with all these people who care about the same patients that you want to treat in the future, and who should benefit from the treatment in the future, has never been a bad idea. I think that’s a very good approach.

By the way, these types of studies also help you get data on many more aspects that you will get asked about throughout the HTA process: any kind of burden-of-disease data, any data about the typical treatments out there, the treatment patterns, where the problems are, what patients care about. There are all kinds of different things you can learn from this observational data that you actually cannot learn from the RCT. And I completely agree: observational studies shouldn’t only be run in phase four. I think they are very well placed before approval. Of course, in an observational study you can’t have your experimental product, but you can collect lots of data that help you contextualize your experimental work.

[00:38:21] Anja: Yes, absolutely. And sure, in those areas where everybody agrees that maybe a single-arm trial is the only option, those are often diseases where you have a hundred patients globally. And yes, you want these physicians and these patients to build a network, because most likely they will show you that they have poor data on their natural history, which kind of speaks again against the single-arm trial, and that they have an extremely heterogeneous map of treatment options, because everybody tries something, but nothing really works. That’s the disadvantage of being so rare. But instead, try to really identify the evidence gaps at the start, work towards them, and say: okay, maybe it isn’t this generation we can save, or maybe we can bring a good treatment to this generation, but for the next generation we can build a basis on which we can finally identify a working treatment when we have one. Because it’s in the orphan area where you really wonder: is the treatment really not working, or is it just that the data are so poor that we can’t see that it might be working? And that’s where you run into this, where you say: okay, that is just poor planning from everybody’s side; it’s not identifying early enough what kind of evidence is needed. And that evidence is not just the trial for my drug; it is this larger context that you’re describing, what I call the scientific knowledge. And scientific knowledge is actually what is needed for good decision-making, and decision-making, strangely enough, falls to HTAs, payers, physicians, patients, and their families. We all have decisions to make, and we all feel like we have to make them on insufficient information coming from drug development.

And that’s where the line is, where we say: this has to change. It simply has to change, because I feel uncomfortable. I’m getting old enough; I’ll be a patient sooner or later for something more serious than the little aches and blah blah that I’m having. And if I go to my doctor, I don’t want them to look at me and say: yeah, we have this armamentarium, I have this box with 10 drugs in there, make your pick; take your favorite color or your favorite package size, because as a matter of fact, I cannot tell you that you as a patient should take first the green ones, then the blue ones, and then the yellow ones, because I have evidence that supports this. Instead, they’re just offering me all of it, and it’s again back to where we started: trial and error. That was the reason why we invented statistics in the first place, why we invented RCTs, why we keep hammering on the point that we need better evidence: because we don’t want this trial and error. And yet when you go into the clinic, you start feeling like, okay, there’s an awful lot of trial and error going on, an awful lot of “in my experience” coming from your doctor. And I’m really like: I don’t want experience, I want evidence.

[00:41:35] Alexander: Yeah, I completely agree. You basically just summed up my personal vision: I want to make sure that payers, physicians, and patients, and especially patients’ caregivers, their parents, or their kids if they are elderly, have the right evidence at the right time and in the right format to make the right decisions. That last part is also really important. It’s not sufficient that the evidence is somewhere out there, hidden behind the paywall of a journal or in trial registries that nobody other than us can decipher, and sometimes even we can’t understand what’s really in there. Even if the evidence is there, it’s hardly communicated well throughout the system so that we can make informed decisions. Data literacy is surely one part of that, but as an industry, we can definitely improve it.

[00:42:38] Anja: Yeah, absolutely. And one of the things that I discovered and was really happy about: I know that not everybody is so fond of the estimands framework and feels it’s just adding another layer of complexity to everything. But I did in fact discover a couple of studies on ClinicalTrials.gov that didn’t use some kind of fancy acronym as the title of their study anymore but actually used the estimands description. And that is something that I would really love to see everybody do. Don’t give me the wonder or miracle acronym that you torture out of whatever description you have given to your trial, but tell me really explicitly, on the first page, in the description, what your trial is actually doing, because that would also help me make a better selection, to say: okay, these are not relevant anyhow, I don’t have to dig in and get frustrated. No report, no data reported, for 99% of everything I look at; apparently I’m really poor at finding studies with results, I would say. So that’s another step forward.

Everybody has to learn, and that’s where the estimands framework is still not reaching the right audience. Everybody has to learn to put it in words. As Albert Einstein said: if you can’t explain it simply, then you still haven’t understood it yourself. You need to be able to explain what your trial is going to do, which question you are going to answer. And then you can go to others and ask: is this actually a really relevant question for you? And that includes patients, and they will, I think, very often tell you: I don’t really understand why this is a relevant question to begin with. That’s where the dialogue starts. That’s where you start understanding: okay, I’m doing something, but it isn’t helping someone else to make a decision. So why is that the case, and is there something we could do to improve it? It doesn’t have to be perfect, and I agree, it’s not always something that works. You cannot ask patients what they find important when it is something that you cannot operationalize in a trial; I understand that. But you can still at least start thinking: okay, if that is so important for patients, can we somehow generate evidence around that topic in some other way? It doesn’t have to be an endpoint in a clinical study per se. Can we find some other way, or can we simply, at some point, start designing studies that are never meant for approval by the FDA or EMA, but are meant to provide relevant information for others? They don’t need to follow some kind of rigorous statistical framework. If you are just honest about it and you say what you really want to do, you can still use the statistics, but you don’t have to be a sucker for the 5% alpha.

[00:45:32] Alexander: Yeah, completely agree. Awesome. We touched on a lot of stuff in this nearly one-hour chat: HTA, regulators, one-arm studies, head-to-head clinical trials, what actually makes a good comparator and what a strategic choice is, and the overall evidence that you need to have beyond just your clinical trials. A lot of companies talk about the integrated evidence plan, and the better you are at that as a statistician or as a stats function, the more I think we can actually drive it very nicely. And we talked about a couple of different design options as well for how to get there.

Thanks so much, Anja. That was an outstanding chat. If you could leave the statisticians listening to this with one key takeaway, what would that be?

[00:46:31] Anja: That we have to see ourselves as facilitators for trial designs that answer scientific questions. We are scientists, and we should stand for that. And I know it’s very hard to pick up that fight, for anyone. It’s a hard fight on the regulatory side, it’s a hard fight on the HTA side, and it’s definitely a hard fight within pharma companies; I know the argument: why should we do more when our competitor got away with a single-arm trial? That’s where we have to push extremely hard. I saw it the other day on LinkedIn: right is right, even if nobody does it; wrong is wrong, even if everybody does it. And that should be our motto for 2023, I think.

[00:47:25] Alexander: Yeah. Upskill your leadership and influencing skills and then let’s do that together. Thanks so much. Have a nice time. And maybe we speak again on this podcast in the future.

[00:47:40] Anja: Yeah. Would love to. Thanks Alexander.

Join The Effective Statistician LinkedIn group

I want to help the community of statisticians, data scientists, programmers and other quantitative scientists to be more influential, innovative, and effective. I believe that as a community we can help our research, our regulatory and payer systems, and ultimately physicians and patients take better decisions based on better evidence.

I work to achieve a future in which everyone can access the right evidence in the right format at the right time to make sound decisions.

When my kids are sick, I want to have good evidence to discuss the different therapy choices with the physician.

When my mother is sick, I want her to have access to the evidence and be able to understand it.

When I get sick, I want to find evidence that I can trust and that helps me to have meaningful discussions with my healthcare professionals.

I want to live in a world, where the media reports correctly about medical evidence and in which society distinguishes between fake evidence and real evidence.

Let’s work together to achieve this.