Do you think R can’t be used for regulatory submissions?
Are you forced to use SAS but would like to use R instead?
Would you like to understand how you can make R submission-proof?
Then this interview with Lyn Taylor and Craig McIlloney will help you. Both work on this topic in the PSI special interest group Application and Implementation of Methodologies in Statistics (AIMS).
Furthermore, we cover the following topics:
- Use of SAS vs R in the industry for exploratory vs submission work
- Where is the rise of R coming from?
- What are the key differences between R and SAS?
- Why people aren’t using R more
- R Foundation’s documentation for the core packages
- Challenges with the use of the other packages for submission work
- AIMS SIG plan for a Validation framework and platform for sharing evidence
- Work to date on the validation framework / set up of R Validation Hub Project/partners
- Who will benefit from this outcome?
- How you can contribute
References:
- AIMS slide set based on PSI strategy day
- AIMS SIG information: https://www.psiweb.org/sigs-special-interest-groups/aims
- The R validation project documentation will be stored on: https://github.com/pharmaR
- Discussion forum: https://rvalidationhub.slack.com
Craig McIlloney
Craig McIlloney holds an MSc in Applied Statistics from Napier University and a BSc (Hons) in Statistics from the University of Glasgow. Craig has worked in the pharmaceutical industry for more than 20 years. He joined PPD in July 2002, where he has held various statistical and leadership roles and is currently Vice President of Global Biostatistics and Programming. Before PPD, Craig spent 5 years working as a statistician and programmer with another CRO and 3 years working within the finance industry. From 2013 to 2017, Craig served on the PSI Board of Directors, where he oversaw the move to a new website and helped establish the Application and Implementation of Methodologies in Statistics (AIMS) SIG. Craig is particularly focused on supporting the industry drive to expedite decision-making through quicker access to data and results, which aligns with the focus of the AIMS SIG on providing access to a wider set of tools for statisticians and programmers. Craig is also a Chartered Statistician with The Royal Statistical Society.
Lyn Taylor
Lyn Taylor received her B.Sc. in Applied Statistics from Sheffield Hallam University, her M.Sc. in Medical Statistics from Leicester University, and her Ph.D. in Statistical Modelling of Markers of Severity in Rheumatoid Arthritis from Sheffield University. Lyn has worked in medical research for over 17 years, starting at SmithKline Beecham in the pre-clinical statistics group before moving into the world of CROs. After 11 years as a statistician at PAREXEL, Lyn worked at PRA Health Sciences for 3 years and is now Sheffield Office Manager for Phastar. Lyn is the chairperson of the RSS Sheffield local group, co-ordinates a Statistical Activity Network in Sheffield (SANS), and was on PSI CALC before joining the PSI Statistical Computing Committee, which is now the PSI AIMS SIG. Lyn’s current statistical interests include multivariate modeling, estimands, and the use of R in our industry.
Transcript
The rise of R and what role the AIMS SIG plays in it
00:00
You are listening to episode number 45 of the Effective Statistician Podcast. The topic today is the rise of R and how the AIMS SIG is helping you.
00:17
Welcome to the Effective Statistician with Alexander Schacht and Benjamin Piske, the weekly podcast for statisticians in the health sector designed to improve your leadership skills, widen your business acumen and enhance your efficiency. We are coming up with a great free webinar. If you missed the first one, we will
00:37
have another one. The topic is the same: four reasons why statisticians fail to lead and how to overcome them. Sign up at theeffectivestatistician.com/webinar and don’t miss this great opportunity. In this interview today, we talk about the AIMS SIG, which is looking into
01:06
what R and SAS mean for the industry and how R is becoming more and more used in the industry to do lots of types of work, exploratory work, but also submission work. We talk about why people aren’t using R more and how R can be qualified for regulatory work.
01:31
and also what the plans for the AIMS SIG are and how that will impact your life.
01:37
This podcast is created in association with PSI, a global member organization dedicated to leading and promoting best practice and industry initiatives. Join PSI today to further develop your statistical capabilities with access to the special interest groups, the video on demand content library, free registration to all PSI webinars and much, much more. Just visit the PSI website at psiweb.org to learn more and become a PSI member
02:07
today.
02:14
Welcome to another episode of the Effective Statistician. And today I’m again with my co-host Benjamin. Hi, Benjamin, how are you doing? Hi, Alexander, very well, thanks. And we have two guests here, Craig and Lyn. How are you doing? Good, thank you. Great, thanks. OK, very good. So today we will talk about a special interest group of PSI, the Application and Implementation of Methodologies in Statistics, or
02:43
in short, the AIMS SIG. Both Craig and Lyn work on this SIG, and we will talk about a couple of very, very interesting developments today. But before we get into that, Craig and Lyn, maybe you can first introduce yourself. Sure. Of course, I’m Craig McIlloney. I’ve been in the industry for more than 20 years now,
03:12
pretty much worked as a statistician or a programmer in various roles and moved into leadership positions. Now I currently work at PPD, where I’m the vice president of global biostats and programming. So really looking over all the operations for our statistics and programming services globally and being able to…
03:39
look at ways forward and looking at what we can do to help shape the industry as well. So really looking forward to speaking to you both today. And Lyn? Yeah, so I’m Lyn Taylor. I started working as a statistician at PAREXEL back in 2001, and after about nine years working as a statistician I decided I wanted to take a bit of a career break and do a full-time PhD. So I did a PhD exploring multivariate methods to predict
04:08
the severity of rheumatoid arthritis in patients using large quantities of genetic markers and demographics and patient characteristics. And this is how I first got interested in R, because I’ve got a history of just SAS programming. I tried to do my PhD using SAS, but found quite quickly that SAS actually isn’t very good with really, really large datasets and soon struggles with kind of some of the computing power that I needed for my dataset that had over like 330,000 variables.
04:38
So I started using R, taught myself R during my PhD, and that kind of led to some of the things we’re doing with the AIMS SIG and how I can help doing that, which we’ll talk about later in the podcast. After doing my PhD, I went back to PAREXEL for two years, and then PRA Health Sciences for three, and I’m just about to change jobs again to take a new position at Phastar as the manager of their new Sheffield office.
05:07
I’ve been a member of the PSI AIMS SIG since its foundation really and the stats computing SIG before that. So I just really enjoy researching any new methods and new tools available to us to use as statisticians. So how much do you use then, like, R or SAS or whatever? How much do you actually code in your day-to-day business? Well that’s the fun thing now, I don’t use R at all day-to-day, only for kind of hobby stuff which I’ve looked at
05:35
with the AIMS SIG. So I use SAS every day, I would say, you know, in between writing analysis plans and reviewing protocols and CRFs. But I still keep myself really hands on at the moment, whether that will change when I’m at Phastar, but we’ll see. But I like to keep myself hands on with the programming. Lyn,
06:00
That’s really great. And Craig, do you also still use SAS or R in your day-to-day business? I don’t so much go in and use SAS or R. Obviously, being part of AIMS, we’ve got opportunities to see what’s going on there. But certainly, teams that I’m working with are using it day in, day out. But as we know within the industry, SAS is such a large player and really has had
06:29
a monopoly, and particularly the area I have been involved in is mainly phase 2 to 4 trials, although when I started my career I did work for a CRO that specialised in phase 1 trials. So throughout that it’s been very much regulatory based and that’s really where SAS has been used more predominantly. What I’ve seen more and more in the last few years is the need to use R
06:58
for a number of reasons. One of them being a lot of students come out and they don’t actually have SAS experience, but have R experience. There are some opportunities to use R for exploratory analyses, dealing with large data sets, et cetera, that Lyn mentioned, that are more beneficial, I guess, than SAS.
07:22
Also with regards to some of the graphical displays, etc. that you can get, which we’ll maybe talk about later. So I’ve seen it used more and more within the industry, and it’s certainly what our clients are looking for, but it’s still not really used when it comes to full regulatory submissions. And that’s something that I think is gonna change over time. And we’re already starting to see a bit of an evolution in the industry there.
07:51
So it has been really interesting in AIMS, as we’ve had to focus on R, and we’ll maybe give you a bit of insight into why we focus there as we go forward. But it’s just been interesting seeing, as we’ve brought in some people that have got more expertise than maybe where Lyn and myself have started from, their way of going about things, and being able to very quickly get to some really elegant solutions to things. It’s really been quite insightful.
08:20
Maybe you can just introduce AIMS a little bit more to the listeners and to me as well. I heard about it, obviously, in preparation for today, but really it would be quite interesting to understand what AIMS is about and the goals, the people or companies or parties that are involved. Sure. So AIMS really came about from a strategy day that we had at PSI.
08:50
I was actually on the PSI board of directors a few years ago from 2013 to 2017. And during that time I joined as the director to look after the statistical computing committee. That was really a committee that had been set up a number of years back to be able to cater for programmers.
09:15
But then as we all know, PhUSE started to evolve and more and more programmers went to PhUSE as an organization. And the statistical membership that we had obviously do a lot of programming, also looking at different tools, et cetera, but didn’t really get anything from the stats computing committee.
09:40
When I joined, I actually focused for the first two years on the website, as we had to have a reboot, and that took a lot of investment. But once that was dealt with, we looked at the validity of the computing committee and asked, was it meeting its objectives? And we decided it was more relevant to go in a different direction. And that was really where the AIMS SIG came up.
10:07
It’s quite a mouthful, so we just refer to it as AIMS, but it’s a good description of what it’s really about. It’s really the special interest group that is there to support statisticians and programmers with looking at new tools that are out there, with looking at what’s out there available as a toolkit to be able to support in day-to-day activities that statisticians are faced with.
10:35
So that could be anything from data visualization through to applying industry standards, things like CDISC, looking at freeware tools that are out there, taking on things like R and exploring where they could be used to complement what are the traditional toolkits of a statistician, which is SAS supported with a few specialist tools.
11:04
So it really was to identify where the membership could be better supported, to be able to get access and awareness of what was out there, not necessarily to tell them what to use, but actually have access to lots of different things that they could use in whatever setting they faced. And therefore, because of that, we had a variety of people that were interested in getting involved in it.
11:34
And we started off with some pharma, some CRO staff, also academics who were interested in joining it. What we then did was come up with a list of items that we thought would be really interesting, that statisticians were facing day in, day out. So there were more and more requests for things like simulation and modeling.
11:58
Estimands was a hot topic at the time, you know, complex missing data algorithms, you know, a number of different things. And we came out as a special interest group with a list of about 25 items we thought were worth exploring. I then went and presented to the European Statistical Leaders meeting back in the middle of 2016. We took those topics to them and said, what would they want us to focus on as a group?
12:28
And the message really from, you know, we had, I think we had four different breakout groups that were working on things. And every one of them independently came back and said, R. They wanted us to focus on R, particularly with regards to validation in R, because they felt that that was a tool that really could be utilized an awful lot more, but they didn’t know how to go about that in an appropriate way for submissions.
12:58
So we took that away, refined maybe some of the aims and objectives of the SIG to focus on that initially. Ultimately, we want to get back to those 25 items and more if we can, but really homed in on R. And as a result, we have changed membership because new people have joined that are particularly interested in R and maybe some of the others that were there initially have
13:26
moved on because it wasn’t necessarily what they wanted to focus on initially. So the result is that the SIG now is made up of various members coming from some large pharma, CRO, and really looking to drive things forward, providing access to vendors that are out there and could support, but also doing our own thing. And we’ve already started publishing material, et cetera, on
13:54
the various newsletters that are available with PSI and our websites. And then most recently, and we can get into this later on, we presented a whole session at the PSI conference earlier this year in Amsterdam, which was kind of a culmination of all the work over the last two years. In May of this year, we proposed to the R Consortium a project to build a validation hub, which would be open to all R users,
14:23
and the consortium’s infrastructure steering committee, the ISC, approved our application in support of our efforts and granted us $4,000 to get the project going. So in August, Andy Nicholls from GSK, he’s one of our SIG members, he was invited to present on behalf of AIMS at the inaugural R in Pharma conference and they did a validation workshop at the conference.
14:48
which has led to really an expansion of the project team really focusing on that R validation project. And now it’s including members from, you know, a real large number of the pharmaceutical companies, including regulatory bodies, such as the FDA attending our calls as well. So we’ve kind of got the original SIG, which is focusing on R right now, but then hopes to move into other areas. And we’ve got a working group, which is also working on the…
15:18
validation in R specific topic. So where is the problem actually with R and why is validation so important here? So I think there’s a lot of misunderstanding, well there was from myself when I first came into this, that I thought the FDA endorsed SAS or recommended the use of it. I thought that’s why we used it in our industry. But actually contrary to that, the FDA doesn’t endorse, require or recommend any software.
15:47
And this was announced in a clarifying statement back in 2015, where they said the FDA does not require the use of any specific software for statistical analysis. And statistical software is not explicitly discussed in 21 CFR Part 11, but they say that the software packages used for the stats analysis should be fully documented in the submission, including the version and the build identification. And obviously with something like R, which is built up of lots of different packages,
16:17
I think that’s where the challenge maybe starts and limits the use of R, because the documentation of validation isn’t perhaps as available, or if each package has got a lot of dependencies, then the actual identification of the full build that you’re using could be quite complex. But they also say, you know, in ICH E9…
16:44
the computer software used for data management and statistical analysis should be reliable, and documentation of appropriate software testing procedures should be available. So I think generally, R has been taken on for exploratory work. But when it comes to submissions, people have stuck to SAS just because the documentation and validation is clearer.
17:08
So I think given we’re not limited to SAS, it is interesting the majority of people just keep using it. And we felt from AIMS that this was due to them having no centralized location containing evidence of the validation of the packages. And also maybe a lack of training and knowledge for a lot of statisticians in the industry about the use of R because we’ve just used SAS for so long. Yeah, I think that’s a real
17:38
key factor is that SAS has been in the industry for so long, and it’s become the accepted standard. And it’s obviously going through various versions. And there are certain procedures that have been used for years and years. People know them really well. And therefore, there’s a feeling that that is obviously a lot more robust. It’s controlled because it’s coming from one organization that’s been in the industry for so long.
18:05
that’s dealing with it all, they should be going through the relevant rigor for that. Whereas obviously R can come from multiple sources and therefore there’s a bit of a fear there because SAS has been proven to get drugs through regulatory submissions. And we know that different reviewers may use different tools, but they’ve accepted what has been submitted in SAS.
18:33
So to deviate from that is, I think it’s almost a big leap of faith that the companies need to make to say, hey, we’re willing to go different routes and we can prove that this is successful, that it’s validated and we’re comfortable with it. Now, some companies are quite willing to do that. We’ve got representatives on our SIG that are really, you can see they really want to push that.
19:00
So companies like Roche, GSK, part of our SIG, you know, really moving forward significantly. And then you have others who maybe have got a proven process, a proven track record of going through certain approaches. Why should they deviate from that if it’s worked for them up to now? And I guess that’s part of the challenge, but we feel, and certainly what I got back from the stats leaders,
19:30
It’s not just necessarily about the use of R or SAS. It’s also about what people are trained in. And maybe the statisticians of today, coming through from universities, are maybe different from the statisticians of 10 years ago. They’ve got different toolkits. They’ve got different packages, products that they’re already getting experience in using them.
19:56
Why not utilize those skills instead of taking them away from that and pushing them into something that we’ve used in an industry for many years? We should take advantage of that. I mean, you can see innovation happen lots of places across the industry, things like virtual trials, synthetic control arms, all those kind of areas. So why not do that in the traditional statistical forum that we use within clinical trials?
20:24
In terms of the benefits of using R, we talked about that statisticians nowadays at universities are trained rather in R than in SAS, probably because it’s for free. At times it was for free for university license, I think. It was for free, wasn’t it? Yeah, but you can have it installed
20:48
very easily for every student on your PC. You don’t need to go through any licensing things and stuff like that. The other benefit that I see is for lots of the new statistical approaches that come out, there’s very often directly an implementation with an R package associated with the new publication of the statistical methodology,
21:18
which I think is another benefit of R. But what are, from your point of view, other benefits, why to use R instead of SAS? I think some of the R Shiny apps that are being created, and this was a big thing at the PSI conference this year, the number of people that have created their own applications using the R Shiny app and the R software, the R code. And it’s kind of…
21:48
unlimited what you can do with it and produce really nice point and click packages that can do just about anything. And there were examples at the conference such as exploring adverse event data in different ways, lab data, and loads of other really nice exploratory hands-on ways of playing with data, which SAS just doesn’t have the capabilities of doing.
22:14
Yeah, I remember very vividly this session that I actually chaired. That was really awesome. Yep. So if you have missed that session, you can still view the recordings on the video on demand platform of PSI. And I’m pretty sure at the conference next year, the conference in 2019, we will have much more of these applications there.
22:44
We already touched some of the points regarding what the benefit of having SAS is. For example, it’s been used that way for a long time and everyone knows how this works and it’s validated and all documented nicely. We talked basically the last 20 minutes about the benefits of using R in terms of being freely available and coming from university and so on.
23:13
Why aren’t people using R at the moment? I mean, there’s obviously a rising number of people using it, but why isn’t it more? What are the benefits for ourselves? Well, I think it does go back to certainly the industry we’re in, the trials are so heavily regulated. And there has been that understanding that the regulators, as Lyn said,
23:42
they expect SAS to be used. And that is the only package they’re going to look at. That’s not the case. They’re open to using any package. But there’s also then a fear and a risk of deviating away from the thing that has been used so successfully for many years. I think what we would advocate is you don’t move away from SAS. You use all the tools that are available to you for various situations.
24:12
So if SAS has well-defined processes for certain areas, certain analyses, then use SAS. If R is better for creating the apps through R Shiny, visualization, use that. If Spotfire is better for visualization, then use that in different settings. What we really want to be able to do
24:40
is show people, here’s all the things that you can use, here’s the benefits of them in various settings, here’s what you can do, and maybe remove fear from that. I think that’s one of the big factors that’s holding it back. So, while you get junior staff graduates that come through with that experience, they’re not necessarily in the place of making decisions on what packages are going to get used in companies. It’s people that have got many years of experience behind them.
25:09
And what do they use? They use SAS. That’s what they’re used to. It’s very comfortable. SAS is getting better itself at things. But there’s an option to be able to get everything out there for people to make informed decisions. And I think that’s what we’re about. There isn’t enough information out there in the wider industry. And certainly,
25:37
among all the statisticians, awareness about what R is capable of and why you shouldn’t be frightened of it. And those are the barriers that we’re just trying to break down. So you’re saying it’s more about taking away the barriers to using R rather than replacing SAS. Yeah. And one of those big barriers is the validation side, because how do you validate in R? Well, actually, you could ask the same thing in SAS,
26:03
but people don’t ask that because it’s been there for so long and you’ve got a PROC MIXED. Well, that’s been there for years, that’s OK. We don’t need to do that. We should pick something up in R and then people start thinking, hey, I’ve got to validate that. How do I go about it? So that’s really part of what this is about is, well, if you could create a library of validated approaches for
26:31
various needs, various analyses, various visualizations, etc. You’ve got a great starting place to go. So what’s your plan in terms of to come up with this validation framework? So just before we go on to that, I was just thinking about what Craig said about people and using it at the moment in the industry, using R in the industry. I think that people are using R a lot more now and there are specialist groups within a lot of the big pharma companies.
27:01
who are just using R. I know within Roche there are teams that are programming R every single day. So I think the move to R or the use of R within our companies where it can be done in an exploratory way is actually happening. And I think what we’re really focused on within
27:27
a method for people to use R for submission work as well as the exploratory side. And when we look at the submission work, there’s almost two parts within R that we need to consider. The R Foundation released an update to their guidance document for the use of R in regulated clinical trial environments. And that document addresses the validation question in relation to base and recommended R packages. So they’re the ones that come,
27:58
you know, installed with R when you download R itself. And the general consensus within the AIMS SIG is that base R is already fit for use in the regulatory context. And I think that in itself is kind of a key starting point because at the moment I think people are just scared to use R in the regulatory environment because then they’re going to get asked to provide their own
28:25
documentation of the validation of it, when in reality maybe if we as a group agreed that the base R packages and the recommended packages could now be used without the fear of regulatory authorities coming back and asking us to validate them, if we are as a society in agreement that they can all be used, then we can use R now for regulatory submissions.
28:53
and all you’d have to do as individual companies is to assess them against your internal quality criteria. So that’s more verifying that the installation of R has been successful, rather than that the packages themselves are doing what they’re supposed to be doing. So where we are now, I think, within the AIMS SIG is that we’re all in agreement that the base and recommended packages can be used, and we don’t need to worry about any further
29:23
validation of those. But for the remaining R packages, which may come from anywhere, be written by anyone and might not follow a typical software development lifecycle, our discussions have really been centred around how we can help supply validation documentation for those. And it’s sort of true that all packages that are on CRAN, you know, the Comprehensive R Archive Network, all packages on there must pass
29:52
a large number of technical checks before they actually make it onto the system. But that’s not necessarily a guarantee of the quality. So the packages don’t have to contain examples or tests. Maintainers of the packages don’t have to follow development best practices. And further, since many are produced by individuals, as a company using those packages, I can’t actually audit the package maintainers for all of those different packages that I want to use.
30:21
And that’s kind of where R is different to some of the more off-the-shelf software, where at any point you could audit the SAS company and then supply tests and examples within SAS for you to actually use. So I think, first of all, the AIMS group kind of have broken it down into those base and recommended packages which are OK to use, and then these additional ones. So the validation hub which we’re looking to create is really focusing on the
30:50
add-on packages that you download yourself. And what we’ve started to do is focus on two aspects of validation. The first one is requirements and tests, and the second one is a risk assessment. And when we think of validation, it typically involves writing a test, and a test is written to test a very specific requirement, such as does it do an ANOVA?
31:18
Does the ANOVA give you the right results? And so you can write what you want the requirement to be for the package, and then you can test if it’s bringing you out the results you expect. And those requirements and tests generally aren’t available for all of these add-on R packages. So the validation hub will provide a mechanism for contributing both the requirements and the tests for the R packages. And we’re going to share all those requirements on a website so that anybody
31:48
who wants to set a requirement and test it can write it, test it, load it up and then everybody in society, not only just in pharma but in all the other areas where R is being used, such as Microsoft and Amazon and Google, they’re all using R and we can have this single portal available to store these tests and requirements and share them with everybody.
32:15
Then if a bug is actually found in a package, it can be fed back to the author, they can fix the bug and we get better-quality packages as a result of it. And all is documented. And it’s all documented, and the difference is that instead of every single company having to do this enormous effort of setting tests and requirements for every single R package, if everybody just does a small amount and loads it up to the central shared place, then we all benefit from…
32:43
being able to use a series of packages. You go onto the site, drop-down menu, select the package that you want to use, and then you can actually download the validation documentation from there and store it along with your submission. Then if the FDA comes back or the EMA comes back and they say, have you got evidence that this package was tested, you can say, well, we ran these tests through and we got the results we expected from them.
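As a rough illustration of the requirement-and-test idea described above, the following Python sketch pairs a written requirement with an executable test and records the outcome against a specific package version. Python is used only for illustration; the same pattern would apply to an R package. Everything named here is hypothetical: `one_way_anova_f` is a hand-written stand-in for a package function under test, the package name `examplepkg` and version `1.2.0` are made up, and the record layout is an assumption rather than the validation hub's actual design.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


def one_way_anova_f(groups: Sequence[Sequence[float]]) -> float:
    """Hypothetical stand-in for the package function under test."""
    values = [x for g in groups for x in g]
    grand_mean = sum(values) / len(values)
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


@dataclass(frozen=True)
class Requirement:
    req_id: str
    description: str
    test: Callable[[], bool]  # returns True when the requirement is met


@dataclass(frozen=True)
class Evidence:
    package: str   # hypothetical package name
    version: str   # evidence is tied to the exact version that was tested
    req_id: str
    passed: bool


def check_anova_matches_hand_calculation() -> bool:
    # Group means are 2, 3 and 4; the grand mean is 3.
    # Between-group SS = 6 on 2 df, within-group SS = 6 on 6 df, so F = 3.
    groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [3.0, 4.0, 5.0]]
    return abs(one_way_anova_f(groups) - 3.0) < 1e-9


def run_requirements(package: str, version: str,
                     requirements: List[Requirement]) -> List[Evidence]:
    """Run every requirement's test and record the result per version."""
    return [Evidence(package, version, r.req_id, r.test()) for r in requirements]


requirements = [
    Requirement("REQ-001",
                "One-way ANOVA returns the correct F statistic",
                check_anova_matches_hand_calculation),
]
evidence = run_requirements("examplepkg", "1.2.0", requirements)
for item in evidence:
    print(item)
```

Keeping the version on every evidence record reflects the point made in the interview that the documentation has to be version-based: evidence gathered for one release says nothing about the next.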
33:12
And it will be all version-based as well, so that you select the version that has had the tests run on it and use that as an example. But there is another side of it, and that’s the risk assessment. And this is actually the part that we’re starting on. So if you’re using open source tools, you need some kind of an idea about how risky they are, what you’re using.
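To make the risk-assessment idea concrete, here is a small Python sketch that turns package metadata of the kind discussed in this part of the interview (downloads, package age, time since the last release, open bug reports) into a coarse risk level. The field names, the thresholds, and both example packages are hypothetical assumptions for illustration only; each organization would set its own criteria and decide what level of risk it accepts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageMetrics:
    name: str
    monthly_downloads: int
    age_years: float
    days_since_last_release: int
    open_bug_reports: int


def risk_level(m: PackageMetrics) -> str:
    """Return 'low', 'medium' or 'high' using hypothetical thresholds."""
    score = 0
    if m.monthly_downloads >= 10_000:      # wide exposure to the user community
        score += 1
    if m.age_years >= 2:                   # package has been around for a while
        score += 1
    if m.days_since_last_release <= 365:   # actively maintained
        score += 1
    if m.open_bug_reports <= 10:           # bugs are being tracked and fixed
        score += 1
    if score == 4:
        return "low"
    if score >= 2:
        return "medium"
    return "high"


well_used = PackageMetrics("examplepkg", 50_000, 6.0, 90, 3)
abandoned = PackageMetrics("oldpkg", 200, 1.0, 1500, 40)
print(well_used.name, risk_level(well_used))
print(abandoned.name, risk_level(abandoned))
```

A simple additive score like this is only one possible design; the point is that the inputs are collected metrics, and the judgment about what counts as acceptable stays with the organization using the package.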
33:41
And so if an R package is being actively maintained, it’s got wide exposure to the user community, and bugs are clearly being tracked and fixed in an ongoing way, then we might consider it to be following some good development practice. And we may determine the use of that package is then low risk to our organization. And so for the hub, the validation hub, our goal is to determine a set of metrics that companies could use.
34:10
to actually assess the risk of R packages. We provide guidance for these metrics so that the organization can judge whether the risks are acceptable and whether any additional testing of the packages would then be required. And the idea is the metrics are run live from the site, and it’s just really collecting information. And at the moment, that information is not available on a single site. So you might want something like the number of downloads, the age of the package, the number of revisions it’s had,
34:39
time since the last revision, credentials about the author, whether it’s been widely cited, and all these different metrics which give you evidence of how good a package might be. If we can get all that together somewhere, it’ll allow you to make a judgment of the level of risk of the package. Within each company, you may have different levels of risk that you’d be willing to take. But if we can somehow get this information…
35:08
into a central place, then it allows us to interpret that and make that decision ourselves based on real data, real-time data on the large number of packages that are available. You already mentioned that R is being used not only in the pharma or CRO environment, but just looking at the work from AIMS, who are you then actually aiming at with all your activities? I mean, obviously,
35:38
needs to be somewhere around the PSI, let’s say, so around the pharma statisticians. So is there anyone you are aiming at with your work primarily or is this very general? So I think our industry is probably the most, or one of the most regulated industries. So really this problem that we’ve got about being, if you like, pedantic about validation.
36:06
We’re trying to be really thorough like we have to be and make sure that the information that we get out of packages that we’re using is accurate and precise and correct. So it’s such an important part in our industry that we almost have to lead this drive because we need it the most. But the way that I see other industries taking advantage of this is
36:34
validation of all of these packages would be an enormous job. So if we can start within the pharmaceutical industry and get people within our industry to contribute to providing evidence of testing, then there’s nothing to stop PhD students, people working for these other industries, you know, even finance sectors, banking, and like I said, Amazon, Google, if they are also interested in uploading
37:04
evidence of tests and requirements, then that just helps us compile onto a central hub all the information in one place. So that if you’re doing a big submission, you’ve used a lot of packages and you want evidence of the validation of those packages, that we have more information on that site to download. And you don’t then yourself have to…
37:30
put together validation documentation for every single package that you want to use. That’s awesome. So basically everybody will benefit from this outcome. Yeah, I guess I would add to what Lyn said there. I mean, our industry is so heavily regulated because patient safety and health are involved in that. And so we need to be correct with providing
38:00
information that decisions can be made off the back of. So I think that’s why there’s a real drive within our industry. But a lot of statistical techniques are used across multiple industries. I think, though, going back to who’s involved within our industry, yes, this came from PSI initially. And then it went to EFSPI. So obviously, Europe.
38:27
European statisticians involved in that. But we’ve taken it much wider than that. So as Lyn mentioned earlier about the funding from the R Consortium, we’ve reached out to the R Consortium with the ideas here, and we’ve reached out to the American Statistical Association, so the biopharmaceutical section there, and they’ve got a working group that’s looking at software. So they’re linked into it. We have representation from TransCelerate.
38:57
We’ve got regulators, it’s the FDA at the moment, but we hope to get some from EMA and wider as well involved. And just to give you a list of companies that are involved in the wider group, we’ve got Amfi, Amgen, Biogen, Eli Lilly, GSK, J&J, Novartis, Merck, Pfizer, PPD, PRA, Roche,
39:26
and Synchronon. So really quite a wide representation. We would love to extend beyond that and get even more representation there because, you know, there’s a lot for individual companies to do, and some of those companies have multiple people joining the group. And the momentum that we now have there, because we’ve got the backing of the R Consortium in this and
39:56
the PSI conference we had earlier this year, is really significant. It took us a bit of time building up towards this, but I think we really are seeing that traction happen pretty significantly going forward, and it’s quite exciting to see. And really, we would encourage more companies to get involved in it because it’s really the industry that needs to make the move, not individual companies.
40:23
Yeah, Craig, I think it’s even more companies than the list now as well, because we’ve had people joining over the last few weeks. Just to say for those that don’t know what TransCelerate is, this is a new initiative. I think it’s still currently under review for approval, but it might almost be approved now. But they’re going to be tasked with conducting modernised data analytics for clinical development with R. And so it’s kind of a funded group specifically
40:53
tasked with objectives with a limited time frame to meet those objectives, funded by the pharmaceutical industry. So what we’re making sure as a group is that we’ve got representation from them on our group and vice versa, to make sure the two initiatives remain closely aligned and avoid any duplication of effort, which is great because they’re going to have sort of not full-time staff, but they’re going to have staff with time
41:20
dedicated to working on R in the industry, as opposed to a lot of the work we do, which is in our volunteer time. And also to add, Craig, I’m not sure we mentioned that we had three members of the FDA actually join our calls regarding how the framework will look on the validation hub and give us their input as to what they see as the important parts of validation and risk and testing.
41:49
So that was great to have their valuable input as well on the group. That’s awesome. Actually, in terms of TransCelerate, next week’s episode is all about TransCelerate. And if you’re more interested in that, then just stay tuned and listen to next week’s episode. In terms of contribution to this overall topic, you seem to be quite passionate about that. Why is that the case?
42:18
I think I’ve been involved in a number of projects over the years which have done big things with very little or no budget, and I’ve seen the power of people working together to achieve something by just dedicating a small amount of their time. I mean, I did some work with the RSS on careers workshops for students that are just coming out of their GCSEs, so like 15, 16 year olds and 16, 17 year olds,
42:47
to try and encourage them to take statistics further into their degree, just to kind of bring more high calibre students through into careers as statisticians, because I felt quite passionate about it being such a great career to take. And I think a lot of teachers haven’t got that information there. So with the RSS we developed around five mini workshops in different statistical areas. And they were in like…
43:17
you know, statistics in the environment or marketing statistics, it wasn’t all pharmaceutical statistics, but one of them was pharmaceutical stats. And the way we did that was just, you know, by donating a bit of our time and all working together. And I see this project as kind of the same thing, in that R has always been free software. And it’s not just because the pharmaceutical industry is heavily regulated; everybody needs the software to be of a high quality.
43:46
And I think if you share things, it’s just a way of not repeating the same research, but actually standing, I guess, on each other’s shoulders and just helping each other to be able to do more by working together, rather than repeating the same research and actually not getting as far as you could if you worked together in a bigger way. It’s a bit of a dream. That’s completely fine. That’s awesome.
44:16
So in terms of contribution, why should someone else contribute to this big project and to make this dream come alive? Do you know, I think I was thinking back to what Lyn was saying there about why she’s passionate about it. I guess the journey has been there for both of us on this. We’ve been the two people involved right from the beginning and it’s so exciting to see something that was an idea actually start to…
44:45
take shape, and I think there’s nothing more fulfilling than actually giving people a voice and letting them run forward, because what I guess people didn’t have was a place to go. I mean, R is the topic that we’re moving forward at the moment, but, you know, where do you go with that? What’s been really exciting to see is there has been a bit of a path that’s started to be paved.
45:15
You’ve got people who are really passionate about it and are way more expert than myself, for sure, coming on board and actually applying their skills where it should be. I think we’re at an interesting time in the industry. At the moment, there are lots of changes taking place. We know that across multiple industries, there’s lots of technology innovation coming into play that is changing the shape of
45:44
the way we do things. And it’s really starting to have an impact, certainly within the clinical trial space. And therefore, I think it’s a time where you can be involved in shaping the industry and what the future of it looks like. That’s really exciting. And this group, and certainly the wider working group round about R, is very much in that place. And as a programmer, as somebody that’s
46:14
interested in technology, knowing that you’ve had an influence on the future and that your skills are being used beyond just what your day-to-day job is, I think, is really exciting. I think you can really leave a legacy there if you contribute here. So guys out there, if you’re a programmer and would like to shape the world a little bit, here you go. Absolutely.
46:44
And I think it’s good to recognise as well that not everybody is going to have the same skill set. I mean, I am by no means an expert in R, and what’s been really nice is to be able to contribute in my own way. I’m pretty good at organising meetings and doing meeting minutes and getting people to contribute and enthuse people. And Andy Nicholls has been fantastic over the last six months. I have to mention him on here because he’s really given us
47:12
a drive and direction because of his expertise in R itself and in the validation of R. And then we’ve just managed on the 12th of December, so just last week, to go live with the website itself. And that really is thanks to Reinhard Koch from Roche; his expertise in websites and how to actually do that has just been fundamental as well. And between us, with all our different kinds of skills and things that we can do,
47:41
we can really make things happen. But I think individually, it would have taken a lot longer or not happened at all. So it’s really nice to mention a few individuals. There’s a lot more people that are really helping us and contributing in this wider collaboration, which we don’t have time to mention today, but just to thank everybody that is involved now and who is gonna be involved. And like you say, to ask people, anybody wanting to get involved in the validation hub, to contribute to…
48:10
and review metrics and to really design what we’re doing, please get in touch. You’re very welcome to attend our meetings. Our next one is the 22nd of January at 5 p.m. UK time. So if you get in touch with me, Lyn Taylor, I can get you invited to that. That’d be great. Yeah, and in terms of all the details, you will find that on the homepage of the Effective Statistician.
48:38
Just check out thee and you will also find contact details for Lyn Taylor and for Craig. Awesome. Thanks so much for this really, really interesting and forward-looking interview. I wish you all the success you need to accomplish these goals and to make your dreams come true, because we’ll all benefit from that in the future.
49:06
Thanks a lot, Craig. Thank you very much, I appreciate it. Great speech, you guys. So don’t forget to sign up for the leadership webinar. It’ll be awesome. And you don’t want to miss out on this one. The webinar is of course for free. Just sign up at thee The show was, as usual, created in association with PSI. And next week you’ll learn about another initiative
49:35
and the impact it has on you as a statistician. Thanks for listening, bye.