In clinical trials and observational studies, missing data caused by dropouts or other reasons make it challenging to understand how well treatments work. Treatment policy estimands help us understand efficacy based on the initial treatment decision. Various approaches, such as reference-based imputation and delta adjustment, exist to speculate about what may have happened after treatment was discontinued. However, these methods are often inconsistent, and more efficient methods are required.
In this episode, Alberto and I discuss how his new approach handles different scenarios with missing data. As a 26-year veteran of the pharma statistics industry who recently completed his PhD research, Alberto Garcia brings a wealth of knowledge and experience to this topic.
So, let’s dive into the details of this innovative framework for estimating policy estimands. We cover the following:
- The framework for estimating policy estimands, and how analytical models give faster results than multiple imputation
- Competing intercurrent events and censoring
- How the framework allows users to estimate the treatment effect size using different strategies
Listen to this episode now and share this with your friends and colleagues!
Biostatistics Consultant at Argenx
Alberto is a biostatistician with over 27 years of experience in clinical trials. He has worked in various roles within contract research organizations, biotech firms, and multinational pharmaceutical companies across several countries. He is currently based in Madrid, Spain, where he works remotely as a consultant. In addition to his work, he has spent the past five years conducting PhD research, focusing on the analysis of clinical trials with missing data caused by intercurrent events.
Framework for Estimating Policy Estimands
[00:00:00] Alexander: Welcome to another episode of The Effective Statistician. Today I’m really happy to talk with Alberto about one of the hottest topics, estimands, and especially about one of the main problems: treatment policy approaches, what happens when we don’t have the data, and what we can do there. He and his co-authors just published a really nice paper about a framework that can help us accomplish lots of these different things in a very elegant way. But before we dive into the details, welcome to the show, Alberto. For those who don’t know you, maybe you can introduce yourself briefly.
[00:00:55] Alberto: Hello, Alexander. This is really nice. I’m really happy to be here on the episode. Thank you for the invitation. So I’m Alberto Garcia. I’m from Madrid, Spain. I’ve been in the industry for about 26 years. I started working in 1996 in a CRO, so not 30 years yet, but only three years away from the 30-year anniversary.
My experience is that of a pharma statistician. Initially in CROs, first a small CRO in Spain, and later at P D in the UK, where I lived in Cambridge. Then I moved to pharma. I worked in a biotech in Spain, and I worked at Astellas Pharma in the Netherlands. At the moment, I’m working from home, back in Madrid. I’m a consultant working for different clients, mainly doing what I have always done: leading the statistics in clinical trials, phase two and phase three. That is my background.
[00:01:51] Alexander: Just before we started the recording, you mentioned you took a bit of a break, maybe also because of COVID and restructurings and things like this, and went on to do your PhD.
[00:02:05] Alberto: Yes. Basically, when I worked at Astellas in the Netherlands, starting about 10 years ago, I began doing a bit of research. I published a few papers on different topics, mainly benefit-risk. And when Astellas decided to basically move R&D to the US and Japan, I thought, okay, let’s use that moment, after almost 20 years of working, to have a bit of a break and do some research. At that moment, I decided to do that research within the scope of a PhD. So I’ve been working now for about five years with Professors Pérez and Pardo in Madrid, and also Professor Rizopoulos at Erasmus in Rotterdam, doing research on missing data in the presence of intercurrent events. So it’s fully focused on the estimands topic.
[00:03:02] Alexander: Yeah. Cool. Very good. I love that you pursued this. I think there are a lot of people who think about it and talk about it, but never do it. And I think it’s a great inspiration for some of the people listening at the moment who wonder: should I do a PhD later in my career? Am I too old? Here you have someone who is finalizing it at the moment and went for it. Congratulations on that move.
[00:03:35] Alberto: Thank you. I’m now 49, so it’s true: it is never too late. Doing this at this age is maybe a bit hard, because you have to combine it with family and work. But on the other hand, I think you can really do research that you will find useful in your work, and you know more about the topic you are going to research. When you do the research at 20, maybe the professor at the university just chooses the topic for you, and in the end you might not even use it. I really enjoyed it. It was very interesting and I learned a lot.
[00:04:08] Alexander: That is cool. Yeah. And missing data, or estimands, is also our discussion today. Let’s think about the treatment policy approach, an approach where you want to understand the efficacy based on the decisions that you make at the start of the treatment. So you want to understand: if you start with treatment X as compared to treatment Y, what will be the outcome in 12 weeks, 24 weeks, two years, or whatever? That is the treatment policy approach. And the more advanced part is, of course, how exactly you define treatment X and treatment Y. So can you tell us a little bit more about the main challenges that you face with the treatment policy approach?
[00:05:13] Alberto: Okay, so this is a very good point and a good question. It’s really interesting that, if you look at the definition of the treatment policy strategy, at first sight it is the easiest strategy. You have an intercurrent event, which depends on your setting, but maybe treatment discontinuation or the use of rescue therapy. By definition, using a treatment policy strategy for that intercurrent event means that you just have to ignore it. You shouldn’t exclude data because they have been collected after that intercurrent event. So initially it’s super simple: you just ignore it and you analyze your endpoint at the final visit, for example, visit 52.
Okay? That is initially simple. However, the problem is that in some indications and in some trials, depending on how they have been designed, you can have two or three different situations. In the optimal situation, the data have been collected equally after the intercurrent event, even for the subjects with treatment discontinuation. Let’s focus on treatment discontinuation as the intercurrent event, but the same applies to other intercurrent events. So in option one, the data have been collected equally after treatment discontinuation. Equally means that missingness is there, but the likelihood of being missing is the same.
Basically, you keep collecting the data in the same way. You might lose some patients, but the intercurrent event does not really change the amount of data you have and can use. In that case, it’s super simple, because you just ignore that event and apply a straightforward standard model.
If you have a continuous endpoint, you use an MMRM. But then sometimes you have other situations. Let’s go to the second, extreme situation. In some indications and in some protocols, once the treatment is discontinued, the subjects are not followed up anymore, or at least the primary endpoint is not collected. Sometimes that is because it requires an invasive test. It’s not the same if the primary endpoint is easily observed: then, even if the subject who discontinued treatment is not happy with the trial, you can go to the patient and still collect that information. But other times...
[00:07:39] Alexander: Biopsy or something like that.
[00:07:41] Alberto: Yeah. The subject who discontinued treatment is not happy with the trial. So suppose you have an extreme situation where you don’t have data after treatment discontinuation. Or maybe very rarely you have a data point, but in essence, you don’t have the data. Then you are in the situation that is the focus of the paper we will discuss today. In that situation, the treatment policy strategy is not simple, and basically you will have to speculate about what has happened after treatment discontinuation. And there is a middle ground that is also problematic: you still have data, but the amount of missingness after treatment discontinuation is higher than the missingness before. In that case, you might want to try to model the data after treatment discontinuation, but a standard model will not capture that differential missingness. For this type of middle-ground situation, I have seen papers using the retrieved dropout approach, where basically you try to model the data after treatment discontinuation somehow differently.
Of course, in that middle-ground situation, the big question is whether you have enough data to do that modeling effort, because if in the end you have only a few data points, you may have an issue. So you have these three possible situations. The paper we will discuss later is focused on the second option, where you have no data or hardly any data, and you will have to speculate. Normally, one way to speculate about what has happened is to use reference-based imputation or delta adjustment. Basically, all of them are just ways to speculate.
[00:09:30] Alexander: There’s another case in which you can easily end up with this missingness. Imagine you have a treatment discontinuation and you then start another treatment and continue to collect the data. And now the stakeholder that you work with is not interested in this strategy, in what happens if you first start with one treatment, then discontinue and move on to the second treatment that you used in your study. They say: in our country, we switch to a different treatment, and that you haven’t observed.
So you have data, but under the wrong treatment. In your study, you start with X and, if that doesn’t work, go to Y. This stakeholder says: start with X and, if that doesn’t work, go to Z. So you have the data for treatment X followed by Y, but you don’t have it for X followed by Z. The data situation is effectively the same as having nothing, just because of the treatment policy you are interested in. For the treatment policy that you’re really interested in, you have missing data: even though you have data, they are basically missing. So that is another situation where this can easily happen.
And that’s very often the case in HTA analyses. For example, you have a control arm, and once people discontinue from the control arm, they go over to the experimental arm. In real life, that will never happen. The HTA person will say: if the control treatment doesn’t work, the patient should go to another control, not to your new drug. So you are missing data, even though you kind of still have data.
[00:11:36] Alberto: That is a very interesting situation. Formally speaking, on the framework, we have seen that after the estimands guideline, some CHMP points-to-consider guidelines on different indications, for example diabetes, Alzheimer’s and others, have been updated, and they used the update to include a paragraph on estimands. A common trend is the following: for treatment discontinuation, use treatment policy. So basically you ignore the fact that the subject is not on treatment and you use the data after.
Okay, but for rescue therapy and the type of therapies you mentioned, the recommendation is often to use a hypothetical strategy. Hypothetical means: don’t use the data after that event, because you want to estimate a hypothetical scenario where that other medication, let’s call it rescue medication, does not exist and has not been taken. So in the end, you end up in the same situation covered by this paper, because the data you do have after treatment discontinuation are often not usable, since they have been collected after those medications. Very good point. So the strategy you use for those other medications is going to be important, because you can end up with the same issue, the same lack of data. Good point.
[00:13:08] Alexander: Okay, there are a couple of existing approaches for treatment policy. What are your main points of critique of these?
[00:13:18] Alberto: Okay. So could you repeat that?
[00:13:20] Alexander: I think you make a really nice summary in your paper where you speak about all the benefits of your new approach compared to all the other existing approaches.
[00:13:33] Alberto: Ah, okay. Yeah.
[00:13:34] Alexander: Yeah. And you say your new approach can do all kinds of different things that, I think, previous approaches cannot do. That is section 4.3 in your paper.
[00:13:49] Alberto: Let me explain this, because it’s interesting, but I’m not sure we are on the same page about the tools we already have in the literature, even before that paper. As we already mentioned, the focus here is to use a treatment policy strategy for treatment discontinuation. But we don’t have the data or, as you mentioned, the data that we have might not be data we can use. The end result is the same, and we will have to speculate. To be honest, we will have to invent the data, and normally, of course, that speculation should be conservative. What options do we have? One is to use reference-based imputation, where you basically assume that, after treatment discontinuation, the subjects in the experimental group will behave on average like the control group, the reference group.
[00:14:39] Alexander: Okay.
[00:14:39] Alberto: Another option. Go ahead.
[00:14:41] Alexander: And so that would be, after treatment discontinuation, to basically assume they directly behave like the patients under reference?
[00:14:50] Alberto: Yeah. Basically, the treatment effect after treatment discontinuation is zero. So there is a treatment effect only on treatment, and after discontinuation it is zero. Another option is copy increments from reference, where you assume that the effect already achieved at that time point is maintained. Or you can apply a sort of delta adjustment, where basically you add or subtract a delta value, a bit of effect size. You can increase the effect size for placebo or decrease it, and in the end you have to fix those delta values.
[00:15:22] Alexander: In terms of the delta, you basically assume that over time the treatment effect shrinks. So you can assume that it’s maintained, that it goes directly to zero, or that it continuously shrinks.
[00:15:39] Alberto: Yeah. Normally you will use a negative delta for the experimental drug. So for the mean value of the experimental group after treatment discontinuation, you will discount an amount, and you can repeat that for different amounts. Then you end up doing the tipping point analysis, and you can put that in a plot.
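The three speculation rules discussed above can be sketched in a few lines of Python. This is a simplified illustration at the level of arm means, not the paper’s implementation: the function name, the visit means and the sign convention for delta are all assumptions made for the example.

```python
def post_discontinuation_means(mu_exp, mu_ref, last_on_trt, rule, delta=0.0):
    """Assumed mean profile for an experimental-arm patient who discontinues
    after visit index `last_on_trt` (0-based).

    mu_exp, mu_ref: per-visit means of the experimental and reference arms
    (hypothetical inputs, e.g. taken from an on-treatment model).
    """
    if len(mu_exp) != len(mu_ref):
        raise ValueError("mu_exp and mu_ref must cover the same visits")
    profile = list(mu_exp)
    # Difference versus reference already achieved at the last on-treatment visit
    gap = mu_exp[last_on_trt] - mu_ref[last_on_trt]
    for k in range(last_on_trt + 1, len(profile)):
        if rule == "J2R":
            # Jump to reference: the mean drops straight to the reference mean
            profile[k] = mu_ref[k]
        elif rule == "CIR":
            # Copy increments from reference: the achieved gap is maintained
            profile[k] = mu_ref[k] + gap
        elif rule == "delta":
            # Delta adjustment: shift the experimental mean by a fixed amount
            profile[k] = mu_exp[k] + delta
        else:
            raise ValueError(f"unknown rule: {rule}")
    return profile


# Made-up change-from-baseline means at four visits (lower is better)
mu_exp = [0.0, -2.0, -4.0, -6.0]
mu_ref = [0.0, -1.0, -2.0, -3.0]
for rule in ("J2R", "CIR", "delta"):
    print(rule, post_discontinuation_means(mu_exp, mu_ref, 1, rule, delta=1.0))
```

Printing the three profiles side by side makes the difference between the assumptions visible: only the visits after discontinuation change, and how much they change depends on the rule.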
[00:15:58] Alexander: Yeah, okay.
[00:15:59] Alberto: But all this is not new. So let’s go now to your question: how can you implement that? That is not new either. We have been doing that for many years. Sometimes maybe it was only a sensitivity analysis. As you might know, the most frequent tool to implement reference-based imputation, and at least what I used to apply in my time in different companies, is the Carpenter, Roger and Kenward algorithm. Basically, you use the multiple imputation framework. You first use an imputation model to impute your data, but you will not impute the data using your data-driven model. Your data-driven model is going to be altered to implement jump to reference.
So, for example, the estimated mean for placebo is the one you are going to use for the experimental group for the imputations. In multiple imputation you have to impute the data many times. After imputing the data, you apply maybe a simple ANCOVA model per imputation, you obtain an effect size per imputation, and you pool all of these with Rubin’s rules, the equations that are applied when you use the procedure PROC MIANALYZE. But the whole process of doing that using multiple imputation requires a lot of steps.
I experienced the implementation of this in real practice, when the programming group basically said: okay, we need to create ADaM datasets for each of these thousands of imputed datasets. To be honest, there is an increase in the budget in the end. It’s costly. It’s possible, but it’s a lot of steps. And...
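The pooling step at the end of the multiple imputation pipeline described above can be sketched as follows. This is a minimal Python version of Rubin’s rules for a single parameter; the function name and the input numbers are made up for illustration, and the per-imputation estimates are assumed to come from whatever analysis model (for example an ANCOVA) was fitted to each imputed dataset.

```python
import math


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules.

    Returns the pooled estimate and its standard error.
    """
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need at least two imputations with matching variances")
    q_bar = sum(estimates) / m                                # pooled point estimate
    within = sum(variances) / m                               # average within-imputation variance
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)  # between-imputation variance
    total = within + (1.0 + 1.0 / m) * between                # total variance
    return q_bar, math.sqrt(total)


# Made-up treatment differences and squared standard errors from five imputed datasets
effects = [-2.1, -2.4, -1.9, -2.2, -2.3]
squared_ses = [0.36, 0.34, 0.38, 0.35, 0.37]
pooled, se = pool_rubin(effects, squared_ses)
print(pooled, se)
```

The pooling itself is cheap. The cost Alberto describes comes from the steps before it: generating, storing and analyzing each imputed dataset.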
[00:17:51] Alexander: Also saving all these different things. And imagine you are doing this not just for one analysis, but for many analyses.
[00:17:59] Alberto: Yeah. Imagine that.
[00:18:01] Alexander: Yeah. You need to update it and things like that. I’m just thinking about certain analyses that are sometimes requested by the German HTA body, where you need to do these kinds of things in many different subgroups, across many different endpoints, across different populations and so on. Many may say computation time is not a problem anymore. In this case it is still a problem. And of course, with rerunning things and having different seeds and so on, it sometimes becomes really difficult to exactly replicate things.
[00:18:35] Alberto: Okay. So then the question is: what is the alternative? In this field of reference-based imputation, there are already two streams, two lines of research. One is based on multiple imputation, and I think that is what people tend to use. For the multiple imputation approach there are a number of packages, macros and software. But this paper is not the first one to propose a fully analytical solution. In the paper I mention the works from Lu, from Tang, and also from Liu and Pang. These papers, I have to be honest, I think are not that well known.
But those other papers propose fully analytical solutions. So basically you replace your imputation model with a fully analytical model that straight away gives you the parameter estimates for your problem, using jump to reference, for example. And these solutions are much faster. So if you ask me to compare the approach of this new paper, I can do this comparison in two ways. I can compare it, and I do so in the paper, against the multiple imputation approach, which is what most people are using out there. And I can also compare it against the previous analytical solutions.
And I think that is what you mentioned before, the section where I mention that, with regard to the previous analytical solutions, the new framework allows you to implement any rule. The new framework is actually more aligned with the estimand framework. The other papers I just mentioned are quite old, so they were actually published before the estimand framework. Those papers and my new framework are initially doing the same thing. Basically, we use only data up to treatment discontinuation, and we obtain an effect size. What is that? That is the effect size using the hypothetical strategy. That is by nature what you get if you censor after treatment discontinuation. In this case, you don’t need to censor, because you don’t have the data, unless you have the situation you mentioned before, where you have data you are not supposed to use. Then you censor.
That gives you an effect size, but it is what I call in the paper the effect size for process Y, not for the process of interest, which I call Z. That is the treatment policy. And that effect size, with its standard error, has to be adjusted to implement, for example, jump to reference. Basically, you multiply that effect size by the probability of not having had an intercurrent event by that time in the experimental group. The slight difference is that previous works used the multinomial distribution for that second probability, for that adjustment.
[00:21:35] Alexander: Okay?
[00:21:35] Alberto: And I am using fully survival methods, so I can better handle censoring and other situations. In essence, my paper is a continuation of the line of research that I mentioned before, analytical solutions. But it’s a bit of an extension and a bit of an improvement, to better handle competing intercurrent events or censoring.
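The jump-to-reference adjustment described in this exchange can be written as a single product. The notation here is an illustrative assumption rather than the paper’s own: $\hat{\beta}(t)$ stands for the treatment difference at visit time $t$ under the hypothetical strategy (the effect size for process Y), $\hat{S}_E(t)$ for the estimated probability that a patient in the experimental arm has had no intercurrent event by $t$, and $\hat{\theta}_{\mathrm{J2R}}(t)$ for the resulting treatment policy estimate (process Z).

```latex
\hat{\theta}_{\mathrm{J2R}}(t) \;=\; \hat{\beta}(t)\,\hat{S}_E(t)
```

As a worked example with made-up numbers: if the hypothetical-strategy difference is $\hat{\beta}(t) = -3.0$ and 80% of the experimental arm is still on treatment at $t$, so $\hat{S}_E(t) = 0.8$, the jump-to-reference estimate is $-3.0 \times 0.8 = -2.4$.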
[00:21:57] Alexander: Okay. Awesome. That is really nice. So basically, with this framework you can do all the different treatment policy approaches that you can think of, run them directly with analytical approaches, and basically write down: this is what happens if we have jump to reference, this is what happens if we assume some kind of maintenance of effect, and this is what happens if we assume some kind of gradual decrease in the effect size over time. And you can basically do the tipping point analysis and say: if the delta is that big, then the treatment effect goes away, or maybe the effect is so big that even then it doesn’t go away.
[00:22:48] Alberto: On this point, let me just remark one thing. Let’s now compare the analytical solutions, including the new solution I propose and the previous ones, with multiple imputation. They are much faster. That is very interesting if you want to implement delta adjustment and tipping point analysis, because in a tipping point analysis you need to repeat the methodology maybe 100 times to plot a nice curve. If you look at the simulations we did, on average the multiple imputation approach requires about half a minute to implement one single model of this delta. It’s not a lot, but the analytical solution requires seconds, or not even that. So basically, with an analytical approach, you will have a full tipping point analysis very fast. And with multiple imputation, you will face this extra complication of requiring ADaM datasets with thousands of records, and you will also need more time.
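To see why an analytical tipping point analysis is cheap, here is a simplified Python sketch. It is not the estimator from the paper: it assumes that discontinuers in the experimental arm differ from the reference arm by a fixed delta, so the estimate is S * beta + (1 - S) * delta, and it uses a first-order delta-method standard error that treats the estimates of beta and S as independent. All names and numbers are hypothetical.

```python
import math


def delta_adjusted_effect(beta, se_beta, surv, se_surv, delta):
    """Treatment policy estimate under a simplified delta-adjusted jump to reference.

    beta, se_beta: hypothetical-strategy treatment difference and its standard error
    surv, se_surv: probability of no discontinuation in the experimental arm and its SE
    delta: assumed mean difference versus reference among discontinuers
    """
    estimate = surv * beta + (1.0 - surv) * delta
    # Delta method, assuming the beta and survival estimates are independent
    variance = (surv * se_beta) ** 2 + ((beta - delta) * se_surv) ** 2
    return estimate, math.sqrt(variance)


def tipping_point(beta, se_beta, surv, se_surv, deltas, z=1.96):
    """Return the first delta in the grid whose confidence interval includes zero."""
    for d in deltas:
        est, se = delta_adjusted_effect(beta, se_beta, surv, se_surv, d)
        if est - z * se <= 0.0 <= est + z * se:
            return d
    return None  # the conclusion does not tip anywhere on this grid


# Made-up inputs: negative values mean benefit, delta = 0 is plain jump to reference
grid = [i / 10 for i in range(0, 51)]
print(tipping_point(beta=-3.0, se_beta=0.8, surv=0.7, se_surv=0.04, deltas=grid))
```

Each delta on the grid costs only a handful of arithmetic operations, because the model for beta and the survival probability are estimated once and reused. A multiple imputation approach would instead re-impute and re-analyze the data for every delta.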
[00:23:44] Alexander: The other point where that comes in really handy: imagine you have some kind of interactive exploratory tool, and you are looking into your data basically on the fly together with the team. If it takes half a minute or a minute to run in the background, that really has a negative effect on the communication. But if it refreshes within seconds, then the work proceeds much faster and more easily.
[00:24:14] Alberto: That is a very interesting possibility, because what this analytical approach is doing is that, in the first part, you are analyzing your longitudinal data only with data up to treatment discontinuation. That gives you the treatment effect using the hypothetical strategy. That amount is already calculated, so you can include it in your interactive software. It is there, so you don’t need more time. The same goes for the other amounts that you need to obtain from the time to treatment discontinuation.
Basically the survival probabilities, you can also obtain them. And in the end, all the estimates for the treatment policy strategy are a post-processing combination of these amounts, something that you can see in the paper, in table three. All these estimates are just a post-processing combination of the estimates from your MMRM, which you fit once at the beginning, and your survival probability for the intercurrent event. So you can totally do that. It is a real possibility to have an interactive tool based on the speculation. Let’s remember, everything you do in the end is speculation; you don’t know what happens after treatment discontinuation.
Let’s play with that speculation. Each possible speculation about that situation will give you a different effect size. And, as you can imagine, it depends a lot on how many subjects have treatment discontinuation. If you have a trial where only a few subjects have this situation, in the end all these imputation methods will give you more or less the same result, because most of your estimate is based on the on-treatment data. If you have many patients with treatment discontinuation, the speculative part will be more important, and each method will give you a very different effect size.
[00:26:15] Alexander: Yep, yep. And it would be bad if your framework didn’t reflect that. Awesome. So now, as a listener, I would think: okay, what’s next? How can people implement it? Do you help people with anything? Do you have any kind of scripts available?
[00:26:32] Alberto: Yeah. Actually, in the journal, in the supplementary materials, we have a SAS script, so in SAS it’s possible to implement. But I think that if people read the paper and have the time to really follow its logic, some of these estimators, for example the one I build for jump to reference using this framework, are simple. Imagine you have a trial where you don’t collect data after discontinuation, and maybe, as a supplementary analysis, you want to report treatment policy for this trial in an easy way. Basically, this is the estimator in the paper for jump to reference at a given visit.
You take the effect size, the treatment difference using the hypothetical strategy. That is what I call beta J2, because it is the effect size, an amount that we all know how to estimate. You multiply that by the survival at the time of the visit you want to estimate, that is, the probability of not having had a treatment discontinuation in the experimental group by that moment. And that is something that others could obtain even using non-parametric methods. I’m using flexible parametric methods because they are handy for later building the standard errors. But in the end it is the multiplication of two amounts that we all know how to obtain. Of course, I went to the simplest one, jump to reference. For others, the equation is more complex, but in the end the equation is always based on your estimates for the hypothetical strategy and the survival probabilities for treatment discontinuation. Of course, some of these possibilities, or speculations, will require more complex equations. But I think that if others want to use this method, they don’t really have to use the SAS code. It is not that complicated.
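The multiplication Alberto describes can be sketched in plain Python. This is an assumption-laden illustration, not the SAS code from the supplementary materials: the survival probability is estimated here with a simple Kaplan-Meier product (the non-parametric route he mentions, whereas the paper uses flexible parametric methods), the hypothetical-strategy effect is taken as a given number, and no standard error is computed. Function names and data are made up.

```python
def km_on_treatment_probability(times, events, t):
    """Kaplan-Meier probability of still being on treatment at time t.

    times: time to treatment discontinuation or censoring, one per patient
    events: 1 if the patient discontinued treatment at that time, 0 if censored
    """
    surv = 1.0
    event_times = sorted({ti for ti, ei in zip(times, events) if ei == 1 and ti <= t})
    for et in event_times:
        at_risk = sum(1 for ti in times if ti >= et)
        discontinued = sum(1 for ti, ei in zip(times, events) if ti == et and ei == 1)
        surv *= 1.0 - discontinued / at_risk
    return surv


def j2r_effect(beta_hypothetical, times_exp, events_exp, visit_time):
    """Jump-to-reference point estimate: hypothetical effect times the probability
    of no treatment discontinuation in the experimental arm by the visit."""
    return beta_hypothetical * km_on_treatment_probability(times_exp, events_exp, visit_time)


# Made-up experimental arm: two discontinuations (weeks 10 and 20), eight patients censored
times_exp = [10, 20, 30, 52, 52, 52, 52, 52, 52, 52]
events_exp = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(j2r_effect(-3.0, times_exp, events_exp, visit_time=52))
```

A full analysis would also need the standard error of the product, which is where the flexible parametric survival model mentioned in the interview becomes handy.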
[00:28:42] Alexander: Yeah. Okay. So have a look into this, and if you find a really nice solution, maybe if you do something in R or something like this, let us know so that we can link to your work in the show notes. And of course, in the show notes you will find a link to the paper and a link to Alberto’s LinkedIn profile, so you can see what’s out there.
Thanks so much, Alberto, for this really good discussion about treatment policy, where we talked about a couple of different limitations that are currently out there. We talked about reference-based imputation. We talked about how you can use delta adjustment to model different scenarios, and how that links with tipping point analysis. All of these are really helpful. However, it always starts with understanding what the research question actually is.
What exactly is treatment? What exactly is reference? What exactly do you want to assume and understand? I think that is still the discussion that we need to have much more, as statisticians and especially with non-statisticians, and we should continue to work on this. By the way, one remark: we are recording this on the 7th of February 2023. By a really nice coincidence, on the 8th of February 2023 there is the Wonderful Wednesday webinar, where we talk about how you can visualize all these different things that we just talked about. So by the time this comes out, there will probably already be a recording of the webinar available.
If you don’t know about the data visualization special interest group and what they are doing, check out the links here. Go to psi.org, where you can find the Visualization special interest group under SIGs, special interest groups. It’s the last one because it starts with a V. And check out all the different work that is there. It’s definitely helpful for this, and it’s definitely helpful for many other things. Alberto, any final thoughts for those who are reading your paper?
[00:31:10] Alberto: No, to be honest, I think we covered most of the interesting parts. There are some other interesting facts in the paper, but they are a bit too technical to summarize briefly. I just want to thank you for inviting me. Very interesting discussion.
[00:31:23] Alexander: Okay. Awesome. Thanks so much. Have a nice time and listen to the podcast again next week.