Why this episode made our all-time Top 9: If you’ve ever thought “non-parametric = Wilcoxon/Mann-Whitney and that’s it,” this conversation will happily destroy that myth. Frank shows how rank-based methods unlock rigorous analyses for skewed data, outliers, ordinal endpoints, small samples, composites/estimands—and how to communicate effects without relying on means.
Why You Should Listen:
You’ll walk away with:
✔ Non-parametric ≠ one test: A broad toolkit for two-group, multi-group, longitudinal, factorial, and covariate-adjusted designs.
✔ When ranks shine: Ordinal scales, heavy skew, small n (e.g., preclinical/animal studies), outliers, composite endpoints under the estimand framework.
✔ Interpretable effects without means: The probability-based “relative treatment effect”—“What’s the chance a random patient on A does better than a random patient on B?”
✔ Link to parametrics (when you must): How the rank-based effect relates to standardized mean differences under normality.
✔ Presenting results: Confidence intervals for rank-based effects and clean visualizations.
✔ Software exists: SAS macros and R packages for rank-based models (plus pointers to Frank’s book).
✔ Missing data & estimands: Practical thinking about composite strategies, treatment policy, and ongoing research for rank methods with missingness.
Episode Highlights:
00:00 – 03:31 | Welcome & setup
TES resources, PSI community, and why innovative methods often struggle with adoption.
03:32 – 06:00 | Meet Frank
From Göttingen to Munich, Texas, and back to Berlin; preclinical research focus.
06:01 – 09:11 | What are non-parametric analyses?
No strict distributional model; works for metric, ordinal, and binary data.
09:12 – 12:13 | Why ranks?
Small samples, unknown distributions; robustness when outliers occur.
12:14 – 14:35 | Where ranks are the better choice
Ordinal ratings (A/B/C/… without meaningful distances), outliers, skew, composites.
14:36 – 21:18 | Defining the treatment effect without means
Relative treatment effect as a probability (e.g., 60% = in 60% of random pairings, new treatment is better).
Connection to parametric world under normality assumptions.
21:19 – 23:13 | How to present it
Confidence intervals for rank-based effects and clear plots.
23:14 – 30:18 | Beyond two groups
Multi-arm trials, repeated measures, factorial designs, covariate adjustments; pseudo-ranks and why unweighted references improve interpretability and power properties.
30:19 – 35:33 | Missing data, real-world setups & estimands
Practical strategies (composites, treatment policy) and active research on rank methods with missingness.
35:34 – 39:41 | Collaboration & wrap-up
Research networks, software, and how statisticians can lead method adoption.
References:
- Book: Brunner, E., Bathke, A. C., & Konietschke, F. (2019). Rank and Pseudo-Rank Procedures for Independent Observations in Factorial Designs - Using R and SAS. Springer.
- Brunner, E., Konietschke, F., Pauly, M., & Puri, M. L. (2017). Rank‐based procedures in factorial designs: hypotheses about non‐parametric treatment effects. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 79(5), 1463-1485.
- Konietschke, F., Bathke, A. C., Hothorn, L. A., & Brunner, E. (2010). Testing and estimation of purely nonparametric effects in repeated measures designs. Computational Statistics & Data Analysis, 54(8), 1895-1905.
- Konietschke, F., Hothorn, L. A., & Brunner, E. (2012). Rank-based multiple test procedures and simultaneous confidence intervals. Electronic Journal of Statistics, 6, 738-759.
- Konietschke, F., Harrar, S. W., Lange, K., & Brunner, E. (2012). Ranking procedures for matched pairs with missing data—asymptotic theory and a small sample approximation. Computational Statistics & Data Analysis, 56(5), 1090-1102.
Links:
🔗 The Effective Statistician Academy – I offer free and premium resources to help you become a more effective statistician.
🔗 Medical Data Leaders Community – Join my network of statisticians and data leaders to enhance your influencing skills.
🔗 My New Book: How to Be an Effective Statistician – Volume 1 – It’s packed with insights to help statisticians, data scientists, and quantitative professionals excel as leaders, collaborators, and change-makers in healthcare and medicine.
🔗 PSI (Statistical Community in Healthcare) – Access webinars, training, and networking opportunities.
Join the Conversation:
Did you find this episode helpful? Share it with your colleagues and let me know your thoughts! Connect with me on LinkedIn and be part of the discussion.
Subscribe & Stay Updated:
Never miss an episode! Subscribe to The Effective Statistician on your favorite podcast platform and continue growing your influence as a statistician.
Frank Konietschke
Professor of Statistics at the Charité
Frank has done extensive research on methodological developments in nonparametric statistics, including ranking procedures and resampling methods for various designs and models. His results are published in numerous papers in various journals, including two papers in the Journal of the Royal Statistical Society Series B. Recently, he published a book on nonparametric statistics in the Springer Series in Statistics. He has lectured and taught nonparametric statistics on almost every continent and has been invited to speak at about 80 different universities, companies, and research institutions. Currently, he is a professor of Statistics at the Charité Berlin, where he leads a research group working on the development and application of statistical methods for translational and early clinical trials.
Transcript
[00:00:00] Alexander: You are listening to The Effective Statistician Podcast, the weekly podcast with Alexander Schacht and Ben Piske, designed to help you reach your potential, lead great science, and serve patients, all while having a great work-life balance.
[00:00:17] Alexander: In addition to our premium courses on the Effective Statistician Academy, we also have lots of free resources for you across all kinds of different topics within that academy. Head over to theeffectivestatistician.com to find the Academy and much more to help you become an effective statistician. I'm producing this podcast in association with PSI, a community dedicated to leading and promoting the use of statistics within the health industry for the benefit of patients.
[00:00:54] Alexander: Join PSI today to further your statistical capabilities with access to the ever-growing video-on-demand content library, pre-registration for all PSI webinars, and much, much more. Head over to the PSI website at psiweb.org to learn more about PSI activities and to become a PSI member.
[00:01:15] Alexander: So today we are talking with Frank Konietschke, who is a professor at the Charité and is working a lot on non-parametric analysis. Benjamin and myself have something in common with him, and you'll hear about it in today's episode. Non-parametric analyses actually offer you lots of opportunities, much more than just the Wilcoxon test that we learned at university, which is basically the same as the Mann-Whitney U test, and that is only the two-sample case. There is much more you can do with it. And today we'll also talk about how you can describe treatment effects in situations where you don't have the usual parameters, like means, to describe them.
[00:02:07] Alexander: So stay tuned for this really nice interview with Frank. You'll take away a lot of learnings, especially if you are not aware of what's going on in the non-parametric field, like actually many other statisticians. One of the problems with innovative approaches like these non-parametric methods, but also with lots of other innovative approaches, is that statisticians very often face problems implementing them, because they can't persuade their colleagues to do these new things.
[00:02:42] Alexander: There are a lot of colleagues that are very conservative and just want to do the same thing over and over again. That, of course, leads to organizations lagging behind, not reaching their full potential, not leveraging what's possible. And here statisticians need to step up and lead their organizations, not as supervisors, but by really influencing people cross-functionally.
[00:03:12] Alexander: Join PSI today to further develop your statistical capabilities with access to the video-on-demand content library, free registration to all PSI webinars, and much, much more. Visit the PSI website at psiweb.org to learn more about PSI activities and become a PSI member today.
[00:03:32] Benjamin: Welcome to another episode of The Effective Statistician. First of all, I'm Benjamin and I'm here with my cohost Alexander. Hey Alexander. Hi Benjamin. How are you doing? Thanks, very well. And we have a special guest today who has come a long way, from Texas back to Germany: Frank Konietschke. He's here with us today and we are talking about non-parametrics.
[00:03:54] Benjamin: Hi Frank. Hi. Good morning. Thank you so much for having me. I'm very excited. It's an exciting topic. We actually have a bit in common in our history, because we are all coming from Göttingen. Alexander and I started in Göttingen, and you as well, but you're actually so young that we didn't meet in Göttingen, to phrase it positively.
[00:04:14] Benjamin: So we do have a common background, including having worked with Professor Brunner on nonparametrics. It's really good to have you here. My first question would have been to ask you to introduce yourself, so I already just started, but maybe you can give a quick introduction: a little bit about you, your history, where you're coming from, where you're going, and what you're doing now.
[00:04:34] Frank: Okay, sure. Hi everybody. My name is Frank Konietschke. I'm professor of statistics at the Charité in Berlin. The Charité is one of the largest university medical centers in Europe, and I lead a research group working on statistical methods for translational research, which is basically preclinical research. Let me say some words about my background.
[00:04:58] Frank: As I said, I studied mathematics at the University of Göttingen, and I also did my PhD in mathematics at the same university, specializing in statistics. After getting my PhD, I stayed at the same university to work on my habilitation. To keep it short: after this, I got my first professor position at Ludwig Maximilian University Munich, where I stayed for a while.
[00:05:25] Frank: Then I moved to Texas, where I worked as a professor at the University of Texas at Dallas, and recently I moved back to Germany because I got a professor position at the Charité, with affiliations also at Humboldt University and the Free University in Berlin. Now I'm teaching statistics here and working in clinical research. So far so good.
[00:05:49] Frank: That’s what I’m doing.
[00:05:50] Benjamin: Excellent. Now, as we talk about non-parametric analysis, or non-parametric statistics: what actually are non-parametric analyses?
[00:06:01] Frank: You could say it like this: when you're working in statistics, you are collecting data, and what you're doing is working with data distribution models, right?
[00:06:12] Frank: That's fair to say. So non-parametric means that you don't postulate a specific distribution model for the data. You don't prespecify that the data must come from a certain distribution, for example a normal one. You relax this assumption and just allow the data distribution to be completely arbitrary.
[00:06:35] Benjamin: So this is basically parametric versus non-parametric statistics, but what actually is semi-parametric then? Just to distinguish between the three different types.
[00:06:46] Frank: Semi-parametric means you don't postulate a specific data distribution, but you assume that certain parameters exist.
[00:06:57] Frank: For example, a mean. So when you have an analysis where you assume that a mean exists, but drop the assumption of a specific data distribution, then you are in a semi-parametric framework. That's basically the compromise.
[00:07:13] Alexander: Yeah, and maybe the obvious question here is: why should a mean not exist?
[00:07:19] Alexander: I think that is one of the tricky questions we need to consider, because a mean does not exist if you can't define it, because the data doesn't have the right properties. For example, if the data doesn't allow you to define distances, you can't have a mean. If you, let's say, have data that's strictly ordinal, in the sense that
[00:07:47] Alexander: you can say whether something is bigger or smaller, but you can't say how much bigger or smaller it is, then you can't define a mean. That's one situation, for example, where the mean doesn't make any sense.
[00:08:01] Frank: Yeah. Another example is when the distribution is skewed. For example, if you measure income: income as a measure is always very skewed.
[00:08:09] Frank: That means the mean, for example, also might not be the best choice to use as a measure of interest.
[00:08:15] Alexander: Yeah. But for income, at least you can say what the mean is; it just might not be a good way to describe your distribution for certain questions. In other situations, for example, if you, let's say, have a rating that just describes, let's say, school grades, you know that A is better than B,
[00:08:38] Alexander: B is better than C, and C is better than D, but it's really difficult to say whether the difference between A and B is the same as between B and C, or whether A to C is exactly twice the difference of B to C. In these kinds of settings, it's really difficult to define a mean from the get-go.
[00:08:59] Alexander: I think the other point is that for certain types of distributions, just looking at a mean doesn't make a lot of sense. So what are alternative ways we can look into that?
[00:09:12] Frank: In this non-parametric framework, we usually work with ranking methods. Ranking methods are a non-parametric way to run statistical analyses.
[00:09:24] Frank: In these ranking frameworks, you are allowed to relax any distributional assumption. All of these methods work for any data distribution and scale: this works for metric data, as we said this works for ordinal data, and this even works for binary and dichotomous data. And let me say why we are working with ranking methods: in many trials, sample sizes are very small.
[00:09:51] Frank: In my area here at the Charité, where we are working in preclinical research, sample sizes are usually very small. We have, for example, eight or nine measurements per group. So we have a trial where we observe a few groups, and sample sizes are very small. In such a situation, you cannot estimate the distribution of the data at all.
[00:10:11] Frank: You're always in the situation that you cannot make any guess about the distribution of the data. So in all of these areas, at least for me, purely non-parametric methods are the method of choice for any analysis, and especially for planning the trial you always have an issue:
[00:10:30] Frank: if you plan your trial, like an animal trial where you come up with a very small sample size, and the planning is based on a parametric method, it's very likely that the data distribution won't satisfy the assumptions of the method on which you planned your trial.
[00:10:46] Alexander: I think, seen in that sense, it may be a more conservative way to calculate the power, because you actually build in the model uncertainty, so to say. Because if you power based on the assumption that the data is normally distributed, then you might think, okay, I'm fine here,
[00:11:09] Alexander: because I built in additional assumptions and these additional assumptions give me more power. But that's just perceived power, because you haven't built in any variability regarding your model. If you think about uncertainty in two different aspects, one is the uncertainty regarding the model and the other is the uncertainty within the model. The non-parametric approach
[00:11:35] Alexander: encompasses, so to say, both things, because there are nearly no assumptions regarding the model, or at least really easy-to-justify assumptions regarding the model; for example, that every subject follows the same distribution and is independent.
[00:11:52] Frank: It's a very powerful tool, and I think it's fair to say you are always on the safe side when you run non-parametric methods.
[00:12:00] Alexander: What are other situations where we should use ranks? We talked about small samples, when you're not sure about the distribution, and about ordinal or skewed distributions. Can you think of other situations where it's very obvious to use ranks instead of certain parametric approaches?
[00:12:24] Frank: Yeah, it’s for example. You have many outliers or some extreme outliers in your [00:12:30] dataset, then ranking methods might be a very good way to analyze the data. It’s fair to say at this point, it depends in what kind of outliers you have, so right when you have a ranking, the ranking just means you absorb the data in your list.
[00:12:43] Frank: The smallest observation gets a [00:12:45] ranking one. If you have M observations, the largest observation gets rank M for here. It doesn’t matter how far off any outlier is from the other observations. So it can be a very robust way to analyze data with outliers. Sometimes if the outlier is [00:13:00] very informative, some ranking methods might not be
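The robustness Frank describes is easy to check numerically. A quick illustration in Python (an editorial sketch, not from the episode): the ranks of a sample are identical no matter how far away the largest value lies.

```python
def simple_ranks(data):
    # Rank each value: 1 for the smallest, n for the largest.
    # (No ties here; rank methods use mid-ranks when values are tied.)
    order = sorted(range(len(data)), key=lambda i: data[i])
    r = [0] * len(data)
    for pos, i in enumerate(order, start=1):
        r[i] = pos
    return r

mild = [3.1, 2.8, 3.0, 10.0]        # one mild outlier
extreme = [3.1, 2.8, 3.0, 10000.0]  # same data, extreme outlier
# The ranks are identical in both cases: how far off the
# outlier is does not influence a rank-based analysis at all.
```

A mean-based analysis of the same two samples would give wildly different results, while any rank statistic is unchanged.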
[00:13:02] Alexander: In these kinds of settings where you actually see that outliers are important, you would probably anyway look at response variables in terms of outlier yes/no, or things like that. I think there's also the whole estimand framework, where we have composite endpoints: you build new endpoints and say, okay, if a person discontinues due to one reason, he gets a certain score;
[00:13:32] Alexander: if he discontinues due to another reason, he gets a worse score; and if he dies, he gets the worst possible score. These composite endpoints are another area that lends itself very nicely to ranks. The composite strategy within the estimand framework is one area where ranks can be very nicely applied, because there you also have these ordinal datasets.
[00:13:58] Alexander: I think that links to another episode that we recorded a couple of months ago about the composite strategy for the estimand approach, where we shortly touched on this topic. If you want to listen to that, just scroll back in your podcast player and find the episode. Yeah.
[00:14:21] Alexander: Okay. One of the problems, of course, is that we don't have means, for example, to describe our treatment effects, differences between means, and these kinds of things. How can we then actually describe a treatment effect?
[00:14:36] Frank: When you work with ranking methods, that's one of the questions: how can you describe the difference between at least two distributions in such a non-parametric framework?
[00:14:49] Frank: When you don't describe the difference based on means or any other parameter, because in our model we don't have any parameter at all, we do it in the following way. Let's say we have two groups. Then we ask: what is the probability that a randomly chosen observation from the first group is smaller than a randomly chosen observation from the second group? So you don't define a treatment effect based on a mean difference or any other parameter.
[00:15:17] Frank: You just look in which of the groups the data are larger than in the other one. That's your treatment effect. So let's say again you have two groups, and the effect is the probability that an observation from the first group is smaller than one from the second. If this probability is equal to 50%, then you can say that in neither of the two groups the data tend to be smaller or larger, which means you don't have any difference between the two distributions on this probability scale.
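The probability Frank defines can be estimated directly from the data by comparing all pairs of observations, with ties counting one half. A minimal Python sketch (the function name is mine, not from the episode's software):

```python
def relative_effect(a, b):
    """Estimate p = P(A < B) + 0.5 * P(A = B): the chance that a randomly
    chosen observation from group b is larger than one from group a."""
    wins = sum((x < y) + 0.5 * (x == y) for x in a for y in b)
    return wins / (len(a) * len(b))

# p = 0.5: neither group tends to be larger.
# p > 0.5: group b tends to have the larger values.
```

This pairwise count is the Mann-Whitney statistic divided by the number of pairs, which is why the relative treatment effect is sometimes called the Mann-Whitney effect.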
[00:15:42] Benjamin: Okay, I understand. I think that's one of the key, not problems, but maybe what's holding back the real breakthrough of non-parametrics in general
[00:15:51] Benjamin: in clinical trials: it doesn't give you a measurement or a quantity of the difference that you can see, because it's based on ranks and not on the actual observations.
[00:16:04] Alexander: What you can say is, let's say you have a relative treatment effect of 60%; then you know that if you're taking the new
[00:16:15] treatment, in 60% of the cases you'll be better off than taking the comparator treatment. If you have an 80% chance, then it's an 80-to-20 ratio. And I think that's actually a nice way to understand your data. What's also nice is that there's actually a relationship to the parametric case.
[00:16:38] Alexander: Let's say you assume a normal distribution and you assume that the standard deviation is the same in both treatment groups; then the null hypotheses match each other. Whenever you have a relative treatment effect of 50%, you will have a zero difference in your means.
[00:17:05] Alexander: However, what's also really nice: even if you have the same mean but the standard deviation is very different between the two groups, you still have a relative treatment effect of 50%. Take, let's say, a very extreme case. You have just three outcomes for the endpoint:
[00:17:30] one, two, and three. If in treatment group one everybody has a two, and in treatment group two half of them have a one and half a three, you'll get a relative effect of 50%, because in 50% of the cases A is better than B, and in 50% of the cases treatment B is better than A.
[00:17:54] Alexander: For me, it's a very intuitive way to describe a treatment effect in situations where you just can't quantify so easily how much better it is; you can only say that it's better. That's the limitation. But I think the limitation is
[00:18:16] Alexander: inherent in the data itself, not in the statistical method. Just by putting assumptions on top of it, we may even fool ourselves by thinking about mean differences when the means don't make a lot of sense. What do you say, Frank?
[00:18:30] Frank: I think, first of all, you are correct: it's defined on a probability scale, so for some people
[00:18:37] Frank: it's a little harder to interpret than comparing means. I guess that's true. On the other hand, you base your analysis on saying in which of the groups the data, or the outcome, tend to be larger than under the other condition. So for me, I don't think that this effect is harder to interpret.
[00:19:01] Frank: It's measured on a probability scale. Based on the strength of this effect, you can for sure say: the larger the data are in a certain group, the larger the probability will be, moving closer to one.
[00:19:17] Alexander: I think the other nice point is, if you think about the binary case, and you touched on that earlier:
[00:19:24] Alexander: you can also apply ranks in the binary case. There, this relative treatment effect corresponds one-to-one to the risk difference. So if you have response rates in the two groups, the difference between these response rates can be converted with a pretty simple formula directly into the relative effect.
[00:19:47] Frank: That's a major advantage of this effect: you can postulate any distribution which has certain parameters and then compute the relative effect, and this relative effect is a function of the parameters of that distribution. As you said before, for the normal distribution
[00:20:07] Frank: this effect is a function of the standardized mean difference. If you have binary data, then it's nothing other than a function of the difference between the two success probabilities. And you can continue this list for any distribution you might think of. So you can always compute this relative effect for any distribution,
[00:20:25] Frank: and then you can express this effect in terms of the parameters of that distribution. I think that's a very nice property of this effect.
[00:20:33] Alexander: Yeah, so that's the other point. If you actually assume a normal distribution, then you can also see, for a given treatment difference in terms of the means, how that
[00:20:54] Alexander: relates to the relative effect in the non-parametric case; it's just a function of the standardized mean difference then. That's also a pretty nice way to get a bit of a feeling for what a big relative effect is, because you can basically calculate it back to what it would mean in a normal distribution setting.
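For two normal distributions with common standard deviation, this back-calculation has a closed form: the relative effect equals Φ(δ/√2), where δ is the standardized mean difference and Φ the standard normal CDF. A small sketch (an editorial illustration, not from the episode):

```python
from math import erf, sqrt

def norm_cdf(z):
    # Standard normal CDF expressed via the error function
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def relative_effect_under_normality(delta):
    """Relative effect P(X1 < X2) when X_i ~ N(mu_i, sigma^2) and
    delta = (mu2 - mu1) / sigma is the standardized mean difference."""
    return norm_cdf(delta / sqrt(2.0))

# delta = 0 gives 0.5 (no effect); a "medium" delta of 0.5 gives
# roughly 0.64, i.e. a 64% chance that a random patient on the new
# treatment beats a random patient on the comparator.
```

This makes the probability scale easy to calibrate: any familiar effect size δ maps to a relative effect, and vice versa.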
[00:21:07] Benjamin: Okay. We talked a little bit about the treatment effects, but as a result, how can you visualize the treatment effect?
[00:21:15] Benjamin: How can you best present the result of a non-parametric analysis?
[00:21:19] Frank: What we do and what we favor are, for sure, confidence intervals. We did a lot of research on the computation and on deriving formulas for confidence intervals for these effects. If you have a more complex model than just two groups, you can also compute these effects for other group combinations.
[00:21:45] Frank: And I think confidence intervals are for sure one very nice way to visualize the treatment effects.
[00:21:51] Alexander: By the way, if we are talking about that, I guess that is all described in your new book that's about to come out, isn't it? We'll put a link to that in the show notes. And if I'm not mistaken, your book also comes with some programming help, doesn't it?
[00:22:06] Frank: Yes. We implemented many programs for applying these ranking methods. We implemented macros for SAS, and we implemented R packages, which are called rankFD and nparLD, among others. All of these packages emphasize different application areas.
[00:22:27] Frank: So I guess they cover a very broad range of possible data models.
[00:22:32] Alexander: As you speak about possible data models: so far we have mostly discussed the two-distribution case, and I think that is good to grasp a first understanding of the relative treatment effect and these kinds of things.
[00:22:47] Alexander: But there's been lots of research going on over the recent decades to extend that in many different directions: research in terms of having multiple treatment groups, research on having multiple time points, research about having covariates implemented into that, and looking into factorial designs.
[00:23:11] Alexander: In terms of that, it's quite flexible now, isn't it? Yes.
[00:23:14] Frank: So you picked up on the definition of the treatment effect when you have more than two groups. When you have more than two groups, one question arises: how do you define a treatment effect in such a case? What you need is a benchmark. When you want to define a treatment effect,
[00:23:26] Frank: say an effect for every group separately, you need a benchmark, which you define as an average distribution. You define an effect for every group as the probability that a randomly chosen observation from that specific group is greater than an observation from this average. We did a lot of research which goes into the definition of this average.
[00:23:54] Frank: In the previous literature, people defined the average as a weighted mean, weighted by the sample sizes. What we found is that this definition is not the best way to define the effect at all: because it's a weighted mean, your treatment effect later on will also depend on the sample sizes, which is not the way you want to fix a model constant describing a treatment effect. So we went on to define the mean as the unweighted mean of the distributions.
[00:24:24] Frank: The unweighted mean is a model constant. Later on, for the estimation, the ranking methods also change: when you go to the estimation, you don't use ranks but what we call pseudo-ranks. That's also why we call our book rank and pseudo-rank procedures. This just means that you estimate a different kind of treatment effect.
[00:24:54] Frank: We generalized these effects: the two-sample case can be extended to any general linear model; this could be factorial or longitudinal designs. We are working on methods to adjust for covariates and baselines. So there's a lot of research going on, in my group, in the group of Markus Pauly, and by other people who are working on this.
[00:25:19] Frank: For me it's a very interesting field where a lot of research is going on.
[00:25:24] Alexander: Yeah. I can remember the discussions about through the ranks, I think started about the time I [00:25:30] wasn’t getting, where we were looking into this treatment effect a little bit closer and up to that point, and the, about let’s say mid to and nineties, we always had this compared the [00:25:45] treatment effect.
[00:25:45] Alexander: to the weighted average across all the different treatment groups, right? So the weighted average of the distributions. And we found that this has some [00:26:00] nice optimality properties for sample sizing, power, and precision. But of course it has this downside in terms of interpretation:
[00:26:09] Alexander: if you don't have a completely balanced design, then your [00:26:15] treatment effect really depends on the sample sizes and on differential allocation. Yeah. And so
[00:26:21] Frank: that is exactly the direction of the research that we took, Alexander: the treatment [00:26:30] effect might depend on the sample size allocation. And think about the power.
[00:26:33] Frank: We found some very surprising, even paradoxical results. One of the surprising results is that the power of [00:26:45] these tests depends heavily on the sample sizes and allocation. What I mean is: with all these classical ranking methods, you can choose sample sizes and allocations in such a way that you either get a significant [00:27:00] or a non-significant result.
[00:27:02] Frank: The allocation may play a more important role than the differences between the distributions themselves, which was a very big drawback of these classical ranking methods. And this [00:27:15] can be repaired by defining, and estimating, the effect with pseudo-ranks.
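The allocation dependence Frank describes can be seen in a toy computation (hypothetical numbers, not from any specific published example): the classical rank-based effect benchmarks each group against the sample-size-weighted pooled distribution, so merely replicating one group's observations, without changing any distribution, moves the other groups' effects; the unweighted pseudo-rank version stays fixed.

```python
import numpy as np

def mid_ecdf(sample, x):
    """P(X < x) + 0.5 * P(X = x) for a finite sample (handles ties)."""
    s = np.asarray(sample, float)[:, None]
    x = np.asarray(x, float)[None, :]
    return (s < x).mean(axis=0) + 0.5 * (s == x).mean(axis=0)

def effects(groups, weighted):
    """Relative effect of each group w.r.t. the pooled benchmark.
    weighted=True  -> benchmark weighted by sample sizes (classical ranks)
    weighted=False -> unweighted benchmark (pseudo-ranks)"""
    ns = np.array([len(g) for g in groups], float)
    w = ns / ns.sum() if weighted else np.full(len(groups), 1 / len(groups))
    bench = lambda x: sum(wi * mid_ecdf(g, x) for wi, g in zip(w, groups))
    return [float(bench(np.asarray(g, float)).mean()) for g in groups]

a, b, c = [1, 2, 3], [2, 3, 4], [5, 6, 7]
balanced = [a, b, c]
unbalanced = [a * 10, b, c]          # same distribution for a, just 30 observations

rank_bal, rank_unbal = effects(balanced, True), effects(unbalanced, True)
pseudo_bal, pseudo_unbal = effects(balanced, False), effects(unbalanced, False)
```

Running this, the rank-based effect of group `c` changes noticeably between the two allocations even though no distribution changed, while the pseudo-rank effects are identical, which is the "repair" Frank refers to.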
[00:27:19] Alexander: I can see that happening. But the interpretation of the pseudo-ranks is basically more or less the same as that of the ranks,
[00:27:26] Alexander: since the relative treatment effect [00:27:30] then just uses the average of all the treatment groups as the reference. So basically, let's say you have four treatment groups, and they are defined by different doses of your treatment, [00:27:45] say no dose for placebo, low dose, middle dose, and high dose.
[00:27:50] Alexander: And then if you want to compare the high dose, you compare the high dose versus the other three together. So you can say: what is the [00:28:00] probability that, if you are put on the highest dose, you get a better outcome than if you are randomly put on one of the other three doses? Right?
[00:28:10] Alexander: That's basically your interpretation, isn't it?
[00:28:13] Frank: Yes. You always [00:28:15] compare each group to the mean of the other ones, and based on these values, let's say the value for group number four is the largest, then you can say immediately that the outcome under dose four [00:28:30] tends to be larger than under dose three and the others.
[00:28:33] Alexander: How does that work? Something similar works if we look into time courses, doesn't it? So if we look at multiple time points, [00:28:45] then we can say: okay, now we don't only have four distributions for the treatment doses; let's say we also have five visits, so overall we now have 20 distributions. And you look at each
[00:28:59] Alexander: [00:29:00] time point and dose combination and compare it to all the other time points and all the other doses. Basically, you compare one distribution versus the 19 others.
[00:29:12] Frank: What you want to have, also for a better description, is [00:29:15] such an effect size measure for each time and dose combination. And for this reason, we define the effect by averaging all of the
[00:29:26] Frank: distributions over the time points and [00:29:30] doses and comparing each dose-by-time combination to this mean. Then you have a very intuitive and also very easily interpretable effect measure.
[00:29:41] Alexander: And then from there you can very easily [00:29:45] derive other things, like your average treatment effect across the time points, or your average time effect for a given dose, and these kinds of things, where you just average the relative treatment effects.
[00:29:58] Alexander: Isn't that an unweighted average, then?
[00:29:59] Frank: That's [00:30:00] absolutely correct. Yes. We always work with the unweighted average of the distributions as the reference distribution when we define the treatment effects. And when you estimate this, it [00:30:15] naturally leads to the pseudo-rank assignments.
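A sketch of the longitudinal extension just described, again with invented data: every dose-by-time cell is treated as its own distribution, each cell is compared to the unweighted average of all cells, and marginal dose or time effects are then simple unweighted averages of the cell effects.

```python
import numpy as np

def mid_ecdf(sample, x):
    """P(X < x) + 0.5 * P(X = x) for a finite sample (handles ties)."""
    s = np.asarray(sample, float)[:, None]
    x = np.asarray(x, float)[None, :]
    return (s < x).mean(axis=0) + 0.5 * (s == x).mean(axis=0)

# Hypothetical data: cells[dose][time] holds the outcomes of one
# dose-by-time combination (2 doses x 3 visits -> 6 distributions).
cells = [
    [[1, 2, 3], [2, 3, 4], [3, 4, 5]],   # dose A over visits 1..3
    [[2, 3, 4], [4, 5, 6], [5, 6, 7]],   # dose B over visits 1..3
]
flat = [c for row in cells for c in row]
G = lambda x: sum(mid_ecdf(c, x) for c in flat) / len(flat)

# Relative effect of every dose-by-time cell vs. the unweighted
# average of all 6 distributions.
p = np.array([[float(G(np.asarray(c, float)).mean()) for c in row]
              for row in cells])

dose_effects = p.mean(axis=1)   # average over visits, per dose
time_effects = p.mean(axis=0)   # average over doses, per visit
```

The cell effects again average to 1/2, and the marginal averages are exactly the unweighted summaries Alexander and Frank discuss.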
[00:30:18] Benjamin: Yeah.
[00:30:18] Benjamin: And I'm just wondering: we just had an episode on real-world evidence, real-world data. Isn't this a field that [00:30:30] non-parametrics could get into, since the methods are based just on the data itself? And how is missing data then handled in the non-parametric setting?
[00:30:35] Frank: So first of all, these ranking methods are applicable in a broad range of areas. That is also my experience from lecturing on these methods at many places: [00:30:45] every place has a different field of application, and it is possible. When you have missing values, we are actually working on ways to effectively incorporate them.
[00:30:56] Frank: So what we do is [00:31:00] define the methods based on all available cases. Implementation is still an issue when you work in a purely non-parametric framework, so what we try to do is derive methods based on [00:31:15] all available cases and, for each sample, weight by the information that you have.
[00:31:20] Frank: So we have published a few papers on missing values. This is research that we are actively doing, on the way to, hopefully, [00:31:30] one day, a very nice solution for this issue.
[00:31:32] Alexander: But basically you can apply lots of similar techniques to those you would use for a parametric analysis as well, can't you? So you could have simple imputation methods, or you could [00:31:45] derive multiple imputation methods as well, couldn't you?
[00:31:48] Alexander: Depends.
[00:31:49] Frank: When you talk about imputation, it is usually based on a model. You basically need a parametric model in the background, and then you estimate [00:32:00] the missing data, right? That is what you do when you impute: you estimate the missing data based on the information that you have. That usually goes hand in hand with having a certain statistical model, which in a purely non-parametric setting you don't have.
[00:32:14] Frank: So there is not [00:32:15] really a model you can estimate data from. You might think of just replacing the missing value with, let's say, the average of the observed observations. That is surely not the best way to do it, [00:32:30] as we can all agree, because of the correlation of this imputed value with the rest, and also because of the question of how to estimate the rank
[00:32:36] Frank: that the missing observation would have gotten. If you treat this properly, it is possible, maybe if you [00:32:45] assume a very benign missing-value mechanism, missing completely at random, plus some more assumptions. There are some works where people relax this assumption and try to impute case by case.
[00:32:59] Frank: This is research [00:33:00] that is actually going on, but I am not very sure there will ever be a very nice solution. Okay, I am not sure. I think this is one of the fields where the approach maybe has its limitations.
[00:33:10] Alexander: But is it really a limitation there, or is it just that, in the [00:33:15] parametric world, we make it easy for ourselves because we just assume a parametric model?
[00:33:21] Alexander: And then we can work with that. Of course, you could first assume some kind of parametric model to fill in your missing data and then [00:33:30] move forward with a non-parametric
[00:33:31] Frank: approach. With ranking, let's go back to the starting point: where do the ranks come from? The ranks arise when you plug every observation into the empirical distribution function.
[00:33:41] Frank: So when you have missing values, the first question is: which mechanism [00:33:45] led to the missing values, right? That is what we call the missing-value mechanism. Now, if you have missing values in this ranking, what you need to do is estimate the conditional distribution function given a [00:34:00] certain missing-value mechanism, and you may already see that this is getting very complicated.
[00:34:05] Frank: You can do this when you assume missing completely at random, but as soon as you relax this assumption a little bit, say to missing at random, then the [00:34:15] estimation of the distribution function given this missing-value mechanism becomes complicated. That is why I am not sure how these things will go when you have even less strict assumptions on the missing-value mechanism, such as missing not at random.
[00:34:29] Alexander: I [00:34:30] think then the whole estimand discussion kicks in. You can say: okay, if you think all these missing data, all these dropouts, are actually treatment failures, you can assign them a certain kind of outcome. [00:34:45] Yeah. Then it is a composite strategy, and you are there. Or, if you want a treatment-policy strategy, then you just continue to collect these data, so that you do not need that assumption.
[00:34:58] Alexander: I think that is [00:35:00] more a problem you need to solve on the data itself, and for me the choice of analysis approach is then the second step.
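A small sketch of the composite strategy Alexander describes (hypothetical data): treatment-failure dropouts are assigned an outcome worse than every observed value, and because rank methods use only the ordering, the numeric placeholder itself is irrelevant. With several ordered dropout categories one would assign several ordered placeholder levels instead of a single one.

```python
import numpy as np

def mid_ecdf(sample, x):
    """P(X < x) + 0.5 * P(X = x) for a finite sample (handles ties)."""
    s = np.asarray(sample, float)[:, None]
    x = np.asarray(x, float)[None, :]
    return (s < x).mean(axis=0) + 0.5 * (s == x).mean(axis=0)

def composite(observed, n_failures, worst=-np.inf):
    """Composite strategy: each treatment-failure dropout gets an outcome
    worse than every observed value. Only the ORDER matters for rank
    methods, so the placeholder value itself is arbitrary."""
    return list(observed) + [worst] * n_failures

# Hypothetical trial: identical observed outcomes, but arm B had 3
# treatment-failure dropouts and arm A only 1.
arm_a = composite([5, 6, 7, 8, 9], n_failures=1)
arm_b = composite([5, 6, 7, 8, 9], n_failures=3)

# P(random patient on A does better than a random patient on B),
# estimated by evaluating B's mid-ECDF over arm A and averaging.
p_a_better = float(mid_ecdf(arm_b, np.asarray(arm_a, float)).mean())
```

Here the extra failures in arm B push the relative effect above 1/2 in favour of arm A, even though the observed outcomes are identical, which is exactly what a composite estimand is meant to capture.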
[00:35:09] Frank: There is a lot of research. One of my PhD students is working on pseudo- [00:35:15] rank methods with missing values. So there is research going on, and maybe one day we will have a very nice solution, also with, let's say, realistic missing-value mechanism assumptions.
[00:35:25] Frank: We are taking the first steps: we start with the standard assumptions, and [00:35:30] then we keep going and see. Maybe one day we will have a very nice solution.
[00:35:34] Alexander: You mentioned a couple of groups working on that. Are there some kind of working groups or special interest groups working on this [00:35:45] that people who are interested in doing research on non-parametrics, or in learning more about non-parametrics, could join?
[00:35:52] Frank: Let me give the background of how this came about.
[00:35:54] Frank: For me, I worked on these purely non-parametric methods, and then one day [00:36:00] I wanted to explore some better approximations, to get better results for small sample sizes. That is when I started working with resampling. Then one day I got in touch with the research group from [00:36:15] Adelphia University, where I met Markus Pauly.
[00:36:18] Frank: He was very specialized in resampling and permutation testing, and that is how our collaborative work began. He had the expertise in resampling, I had the expertise in the non-parametric ranking [00:36:30] methods. So we began to collaborate on resampling, implementation, and other work within the ranking framework.
[00:36:39] Frank: And there are some other researchers in the United States and in Canada [00:36:45] I have also been working with a lot. Everybody has a different view of the field he or she is working in; it also depends on the applications, on the areas somebody is working in. For me it has always [00:37:00] been preclinical trials, and others might work on, let's say, psychological studies.
[00:37:06] Frank: Yeah. Then we meet and try to connect these areas. From my experience that is always fruitful, and it is [00:37:15] very interesting for me. I always learn a lot when I get in touch with new research groups, and I can also see how I can enhance the models that I have studied. You always get new problems when you generalize the model that you have, or you [00:37:30] realize that you have to adapt it to settings that you never thought about.
[00:37:33] Frank: Yeah. And I think that’s how it usually goes.
[00:37:37] Alexander: Okay. Very good. Then with that, I think we had a really nice discussion about non-parametrics, and [00:37:45] as Benjamin mentioned in the beginning, it is close to the hearts of all of us, because we studied it and worked on it quite a lot very early in our careers, and you are one of the people
[00:37:58] Alexander: continuously [00:38:00] working on it. It is awesome that this research has been going on for quite some time and is now in a state where there is a whole, complete theory that you can draw from, and software solutions to [00:38:15] directly implement things. So that is really good. As a statistician listening to this episode, think about where these methods can help you in your day-to-day work.
[00:38:26] Alexander: Are there cases where you are thinking that it [00:38:30] is maybe not the best approach to just assume a normal distribution? Are there better ways to do it? I think this is one of the innovation areas where you can bring new things to your team, [00:38:45] and where you can maybe have better discussions about treatment effects and what you are actually measuring there,
[00:38:53] Alexander: and about what is really of interest in the outcome. For me, especially the [00:39:00] composite estimand approach is one of the key areas where we should apply this much more, and especially if it is not just a binary approach but we want multiple categories, depending [00:39:15] on why a patient dropped out. Then I think it is a very valuable approach. The other key use cases are outliers and these kinds of things. So thanks a lot, Frank.
[00:39:26] Alexander: Thank you for a really nice interview, [00:39:30] and for everybody who is interested: check out the show notes. You will find a link to Frank's work and lots of further material on non-parametric statistics. Thanks so much.
[00:39:41] Alexander: This show was created in association with [00:39:45] PSI. Thanks to Reine and her team at VVS, who help the show in the background, and thank you for listening. Reach your potential, lead great science, and serve patients: just be an effective statistician.
Join The Effective Statistician LinkedIn group
This group was set up to help each other become more effective statisticians. We'll run challenges in this group, e.g. around writing abstracts for conferences or other projects. I'll also post further content in this group.
I want to help the community of statisticians, data scientists, programmers and other quantitative scientists to be more influential, innovative, and effective. I believe that as a community we can help our research, our regulatory and payer systems, and ultimately physicians and patients take better decisions based on better evidence.
I work to achieve a future in which everyone can access the right evidence in the right format at the right time to make sound decisions.
When my kids are sick, I want to have good evidence to discuss the different therapy choices with the physician.
When my mother is sick, I want her to have the evidence and to be able to understand it.
When I get sick, I want to find evidence that I can trust and that helps me to have meaningful discussions with my healthcare professionals.
I want to live in a world, where the media reports correctly about medical evidence and in which society distinguishes between fake evidence and real evidence.
Let’s work together to achieve this.