R versus SAS – which is the better tool?

Today, we talk about a controversial topic. If you go online and look into social media, you’ll find a lot of data scientists who talk about R versus Python. In medical and clinical research, it’s more about R versus SAS.

For this discussion, we have Sam Gardner and Thomas Neitmann representing SAS and R, respectively.

Join us while we discuss the following points:
  • Our first experience with R versus SAS
  • How long does it take someone completely new to learn R versus SAS?
  • How easy is R versus SAS when managing day-to-day tasks?
  • How do the R and SAS communities and events help?
  • How do updates work with R versus SAS?
  • How easy is it to submit study data to regulators using R versus SAS?
Reference:

Thomas Adventure Blog

Listen to this very insightful episode and share this with your friends and colleagues!

Thomas Neitmann

He is an R enthusiast currently working for the Swiss pharmaceutical company Roche as a Statistical Programmer Analyst for late-phase clinical trials in neuroscience indications.

His R journey began in 2014 when a coworker told him to run an R script to “analyze some data”. Having never programmed before at that time, he was overwhelmed. But he took on the challenge and soon realized the power and joy of programming.

Since then, he has learned a couple of other programming languages, including Matlab, Python, and SAS. But R is still his favorite by far.

He enjoys sharing his knowledge and started doing so publicly on LinkedIn in late 2019. Since then he went from around 300 to 7000+ followers. Many of those encouraged him to create a blog to have a central place for all his posts. So that’s what he did.

As the name suggests this blog focuses predominantly on R. He will occasionally cover other, related topics such as git, though. He also enjoys data visualization a lot so you’ll likely find some posts on that, too.

If you have a specific topic you would like him to write about, please feel free to reach out. The best option to do so is via his LinkedIn. If you are not yet connected with him, make sure to send him a request!

Transcript

Alexander: You are listening to The Effective Statistician podcast, a weekly podcast with Alexander Schacht, Benjamin Piske and Sam Gardner, designed to help you reach your potential, lead great science and serve patients without becoming overwhelmed by work.

Today we are talking about a controversial topic: R or SAS – which is the better tool in pharmaceutical research? Stay tuned!

If you go online and look into social media, you’ll find lots of data scientists talking about R versus Python. Well, in medical and clinical research, it’s not so much about R versus Python. It’s more about R versus SAS. SAS has been there for decades, but R is the new guy that is getting stronger and stronger every day. Well, SAS has also invested quite a lot. So what’s the tool of choice? Stay tuned. This is a really, really interesting discussion happening in this episode.

I’m producing this podcast in association with PSI, a community dedicated to leading and promoting the use of statistics within the healthcare industry for the benefit of patients. Join PSI today to further develop your statistical capabilities, with access to the ever-growing video-on-demand content library, free registration to all PSI webinars and much, much more. The reduced rate is only 20 pounds annually for non-high-income countries, and it’s only 95 pounds annually for high-income countries. Visit the PSI website at PSIweb.org to learn more about PSI activities and become a PSI member today.

Welcome to another episode. Today we are talking about a controversial topic that Sam actually came up with. Sam, you suggested that we do a podcast on R versus SAS, so maybe you can tell a little bit of the story of your first experience, or touch point, with R. When did you first hear about it?

Sam: I was in graduate school, studying statistics, and the primary tool we used to learn statistics was SAS. It was old-school SAS on the console, you know, where you submit code and wait for it to run, then go open a file with the output results or print them out. It was not even like the program editor, which I think existed on Unix. We were just doing it on, like, an IBM console. We used some other tools like Matlab and S, and Omnitab, which was like the predecessor to Minitab and was used sometimes for teaching. I’m not even sure when R actually came into existence. I’m not sure it was around when I started graduate school, but near the end of my program, when I was there full time in school, there was a professor down the hall who started using this new code called R, which was a clone of S, and we were all whispering about it in the hallway. Like, oh, he’s using this new software, and I’m not really sure about it. It’s just trying to copy S, and is it really going to go anywhere?

I guess it really did go somewhere in the end. But yeah, at the time it was, you know, sort of on the fringe of what people were doing in statistics, and it was very much just an experimental research tool. So that was my first exposure to R and seeing it used. And then I didn’t really see much of it until probably the last 10 years, when I’ve really started to see it used and recognized as a tool that could be used for more than just academic purposes.

Alexander: Yeah, I also started with SAS. Well, I actually started with C, but that was even earlier in my career. When I started to do some statistics, I was using SAS. I’m not sure whether it was version 2 or 3 point something. I think we were quite excited when version 3.11 came out because it had something new in it. But yeah, so you can hear how old Sam and I are. But because we are so old, we also have someone from the younger generation participating today. So Thomas, great to have you here on the show as well.

Thomas: Absolutely, thanks for having me.

Alexander: Maybe you can start a little bit with an introduction of yourself.

Thomas: Sure. So I’m currently working at Roche as a statistical programmer. I’ve been there for one and a half years or so now. In my current role, I’m actually leading a team that develops an R package to create ADaM datasets according to the CDISC standard, which is actually a collaboration with another pharmaceutical company, GSK. So you can see R is in high demand in the industry, which for a person like me is a good situation, because the very first language I was ever introduced to was R. That was back in 2014, I believe, when I started working at a research lab and a PhD student handed me a script and said, yeah, just run that script and you will get the output from this data that we collected. Obviously, having never coded before, I didn’t even know how to open that file.

So that was certainly a frustrating experience with R, but I downloaded RStudio and, you know, I managed to run that program. Then I thought, wouldn’t it be so cool if I could change some things and adjust it to my needs? And then I really went down the rabbit hole, learned R and became pretty proficient in it. And now, yeah, I’m a statistical programmer, and it’s kind of my daily bread and butter. When I decided to join the pharmaceutical industry, I learned about SAS for the first time. That was, I guess, 2019-ish or 2018. Then I learned SAS, and with my R background, SAS was very, very different, but it was certainly an interesting experience. I did all these certifications that SAS offers and landed my first job at a CRO, and back then, at that job, it was really full-time SAS, the classical thing in the pharma industry. But once I moved over and joined Roche, people very soon saw that I’m proficient at R, so they handed me all this work. I started from there, and now that’s basically what I am doing.

Alexander: Very, very interesting. So let’s talk today a little bit about these two different software tools. I have a couple of dimensions in mind. By the way, I haven’t done any programming for quite some time, so I’m probably a good moderator here but can’t contribute a lot of content. I can only see what the output is, and I can’t really tell how flexible each tool is. I tried a little bit to program, but that was quite some time ago. Okay, so let’s talk first about getting up to speed. Sam, how long do you think someone who is completely new and has never coded before needs to learn a decent amount of SAS, enough to get along on a day-to-day basis?

Sam: That’s a good question, and I’m not really sure. I think people learn it the way I learned it: you learn SAS by shamelessly stealing other people’s code. You say, well, how did you do this before? You see the code, you start to see how it works, and then you start to learn the syntax. Back in the day, a long time ago, we actually had paper printed manuals, right? They sat on the shelf, all the SAS manuals, and you could flip through them to find the syntax and examples. Now that’s all online. So I think with the availability of the online documentation, it’s a lot easier that way, but I think there’s a bit of a learning curve you have to climb, particularly if your first exposure was not SAS. If you’re just starting from scratch and you’ve never done any statistical computing, you’re going to struggle just like anybody else learning something new. But if you started using R, for instance, like Thomas, and then you get thrown into SAS, you immediately see that it’s completely different: the whole paradigm, in many ways, for how you get data in to do analysis, and then how you actually do the analysis and get results out, is very different between SAS and R. A lot of that in SAS is carried over from its initial architecture, when it was designed to be run on mainframe computers back in the 80s and 90s. So some of it still carries that flavor, and it seems a little antiquated sometimes. But once you learn it, you also see some of the real power that’s there, and that some of those initial design decisions around the software really had staying power. They’re still good ideas about how you would design a software system to read in a lot of data, manage a lot of data and do analysis on it. It’s pretty good.

Alexander: So, Thomas, what would be your kind of perception? How do you learn R? Do you also give them some code and then work from there?

Thomas: Yeah. And actually, if you Google for R resources, there is so much that it’s hard to decide where to start and which course to take. There are so many resources out there, and there are some great books. For example, one I recommend frequently is R for Data Science, which is a really good one. But there are also many platforms that offer MOOC-style courses, which I also took back in the day and which really helped me. And then again, I think if you’re new, if you’ve never really done computer programming, then it certainly will take time, just like SAS. I think if you have a background in something like Python, for example, you can get up to speed very, very quickly because the general principles are fairly similar, whereas, as mentioned, if you come from an R background and then get into SAS, SAS is very, very different from many of, I would say, today’s general-purpose languages.

Sam: Yeah, R feels more like a modern programming language, or just a general programming language. If you learned how to program in C or C# or something, one of those more object-oriented programming languages, R seems a little bit more like that.

Thomas: Although, if you ask a Python programmer, they will tell you that it’s a horrible language, for whatever reason. I don’t understand it, but yeah, everyone has their preferences, I guess.

Alexander: Yeah. There are a lot of these Python versus R debates, which is interesting. But I think in the pharmaceutical industry it’s actually more a question of R versus SAS, interestingly, and not so much Python.

Thomas: From what I’ve seen, some people are playing around with Python, but it’s really, you know, a niche thing, whereas for R, I think every pharma company probably now has an R strategy, so to speak, and is really trying to develop tools along those lines. I think Python is used more outside of classical clinical trials, if you are really heavy into machine learning or doing neural networks.

Sam: I think that’s where Python shines; it has great libraries. Yeah. So for general-purpose data work, in some of the non-clinical areas and in areas where you have lots of data to manage and you’re using mostly open source tools, Python has a lot more use there as well.

Alexander: Now, in terms of managing things on a day-to-day basis: when you need to update things, when you need to change things, how easy would you say that is in R versus SAS? This time, maybe we can start with Thomas.

Thomas: So where exactly are you going with the question? Kind of updating existing code, or...?

Alexander: Yes, the ease of use on a day-to-day basis.

Thomas: Well, I would say it’s fairly similar. For example, in R many people use RStudio as their favorite editor, and now SAS has SAS Studio, which is in many ways very similar. It’s a nice interface to work with; at least I like it very much. I know old-school SAS programmers are more on the PC SAS side, but everyone has their favorite tools.

Alexander: I was quite happy when, you know, SAS introduced some kind of help, in terms of showing when you have missed a semicolon.

Thomas: Yeah, probably. Yeah.

Sam: That’s a whole different story, trying to pry the enhanced SAS editor out of programmers’ cold, dead fingers, because they do not want to change. Yeah.

Alexander: Okay, so an IDE is really, really nice to have. Next, I mentioned community. Sam, what do you think about the SAS community and all the events around it and things like that?

Thomas: I think an IDE is really a great tool to have. I mean, I really cannot imagine not having one and using just a plain text editor. I would probably really suck because I would make so many simple mistakes.

Sam: Yes. Back in the day, that’s how we wrote SAS code. Open up Emacs or vi and type your code, save the file, type “sas”, space, the filename, and let it run. That was it. You know, there are different types of community. SAS itself has a very large sales and marketing organization, and they put a lot of effort into creating events and, I guess, environments where you can meet other SAS users, share stories with them, get to know them and ask questions. Their technical support is very good too: if you have a technical question, they’re generally very helpful. So I think they do a lot there. I used to work at SAS, not actually selling SAS products; I worked in sales and marketing for JMP. So I know a little bit about how things work inside SAS, or at least how they used to work. And, you know, SAS makes its money by selling big solutions to companies, right?

They don’t really make their money by selling PC SAS to an individual, right? Where the big revenue comes from is a large company, like a pharma company, buying SAS to be used by hundreds of people, or in systems that do fraud detection and things like that. So they organize a lot of their communities around those solutions, I guess is what I’m trying to get at. You may go to an event and they’ll talk about what SAS can do for fraud detection, or you may go to a conference where they have a lot of talks on “this is how I use SAS as an integrated system, with SAS as the foundation, to solve my problem.”

But then you also have the SAS user groups, at least in the United States. I don’t know if those exist in Europe. In the United States there are a lot of regional SAS user groups, and they often run their own conferences. Those are actually some of the best ones to go to if you really just want to learn new things about SAS, because they tend to be run by users, not by SAS, and that’s nice.

Alexander: Yeah, there’s also SAS Global Forum, which I attended two years ago, and I’m speaking there again in 2021. So, depending on when that is.

Sam: Those are great events. If you can afford it and have the time to go, you learn a lot, and they treat you pretty well. And you get an opportunity to get broad exposure to people who are using SAS in a lot of different companies. Yeah, not just pharma.

Alexander: And thousands of participants. It’s huge. It’s really huge. So you need a big convention center. Okay, Thomas, what do you think about the R community?

Thomas: I think it’s a really great community, especially if you’re an internet-first person, you know, kind of my generation. If you go on Twitter, there’s the hashtag #rstats, for example, with people posting the new solutions they come up with, asking questions and helping others out. It’s really great. It has also started picking up on LinkedIn. And then you obviously have things like Stack Overflow, where you can probably get an answer to any question you can think of. So if you just type R and then your question into Google, you’ll probably find an entry from another user who faced a similar problem.

And in general, the community is really trying to help each other and to improve R and the R ecosystem itself. There’s also a heavy emphasis on diversity, trying to include as many groups of people as possible. So it’s a fantastic community to be part of. Obviously, compared to SAS, there is no help desk that you can call. There’s no company behind it; it’s an open source language. There’s obviously the core team, but that’s a couple of statistics professors who have a real full-time job. You can’t just call them up and ask them a question, but that’s really where the community shines. And then obviously there are also a lot of conferences. For example, in my field R/Pharma is a big one. RStudio has their RStudio conference. You have European conferences and whatnot. There are also R meetups, kind of similar to the SAS user groups, and that’s also huge. I feel like it’s just getting more and more. And yeah, it’s great.

Alexander: Yes, it’s actually interesting. There’s the R Consortium, which is a kind of much bigger society, or association, around R. And then there’s RStudio, but RStudio is actually a company. So they actually have a help desk that you can call, and they also have sales people and things like that.

Thomas: But RStudio the IDE is what they developed first, and now obviously they have scaled to many more products, but it’s not like they are behind R the language itself. Definitely, I think we’re very lucky to have RStudio as a company in the space. I think they recently transitioned to what is called a public benefit corporation in the US. So you could kind of say, I think, it’s a not-for-profit company; it’s trying to work for the greater good of the R community, so to speak. And I think every pharma company that uses R uses RStudio products and is very happy to have them because they are really good.

Sam: And I like the RStudio IDE quite a bit. It’s a nice tool. I’ve tried a couple of other things, and that’s one of the best, I think.

Alexander: And they also have, you know, the blog, like R Views, and things like that.

Thomas: I think they’re also really heavy on the educational content side, and they now also have an instructor program, teaching others how to teach R. I think that in programming in general this is kind of a problem, because there are lots and lots of great, skilled programmers out there, but actually teaching someone how to program is a whole different skill set. I was lucky enough to attend that session last year. It’s really interesting if you think about the methodology behind how you can actually bring someone up to speed and teach them to program using empirical best practices. Because oftentimes, yeah, you give someone a script and then it’s like, figure it out, and if you’re smart enough, you can do it. That’s probably not the best way to get someone started.

Sam: You know, you said something there that triggered me a little bit. I don’t want this to turn into a war, right? Because this is not about which software is the best software. But one thing triggered me a little bit: you said you have to teach somebody how to do programming. My experience has been that I work with a lot of people, scientists and engineers, who just want answers. They want to solve problems, right? And what they want is the tools to help them solve problems. I’m a big user of JMP, which is one of SAS’s products, and its real power is that its main interface is point-and-click; it has a graphical user interface. I don’t really know how much of that actually exists for R. Even for SAS, you can drop in little sections of code that you want to run, say, to open up data sets, and it does a lot of the actual code generation. Code generators are what those graphical user interfaces are for SAS, and they do a pretty good job for the basic user who doesn’t really need to learn code. So what would be your response to that? What if I’m working with someone who doesn’t want to code?

Alexander: Yeah. Is there some kind of JMP-like version of R? I don’t know.

Thomas: I mean, I never really used JMP, so I cannot make a really accurate comparison, but I don’t think there is a drag-and-drop kind of interface. Obviously R Shiny is a very popular tool with which you can build solutions, but then you get into real programming and development. We do that extensively at Roche, for example: we use the R Shiny framework and build applications in which our scientists can explore the data in a way that is a bit more tailored towards, let’s say, the analysis that has been written down in the statistical analysis plan, rather than being able to drag and drop everything around and having the ultimate freedom to explore. That might not sound good, but sometimes it’s actually good to push them in a certain direction as to where they can go, because clinical scientists can be very creative in finding very interesting subgroups.

Sam: I think there is a package called R Commander that someone has out there, which has some GUI functionality in it. So yeah.

Alexander: But, you know, this is actually one of these things: the more flexible something becomes, the more difficult it also becomes. I think JMP is something that is less flexible but then, of course, much easier to use, because you have the power of templates, of defaults and all these kinds of things. That helps, of course, to minimize the options and standardize lots of things, and that makes it easy to get lots of things done quite quickly. One of the nice things, and maybe the next thing I want to talk about, is visualization. With JMP you can very quickly get lots of easy visualizations.

Sam: And SAS too: the newer versions of SAS have come along with web-based, HTML-based interfaces where you can do some of that interactive visualization. What do they call that? Visual Studio, I think that’s what it’s called in SAS. I’ve used it a little bit, and it’s nice. You have to have your data set prepared in advance, right? But once you’ve got it set up, you can drag and drop variables onto a graph, make different types of graphs, save them and things like that. That’s very nice. Even though in this debate I’m kind of on the SAS side, I will say that really high-quality, publication-quality graphs are hard to make in SAS. Let’s just admit it.

There are people who’ve made their whole careers out of knowing PROC GPLOT; they’re experts in PROC GPLOT because they can do it. They’ve improved SAS quite a bit in a lot of ways. A lot of the statistical procedures now automatically generate relevant statistical graphics that you can output, which is nice, and that’s just an option in the commands when you run them. But if you just want a really refined, precise graph of your data, sometimes you have to work pretty hard at it. I don’t know whether that’s different in R or not, though.

Alexander: What’s your reply in terms of visualization in R?

Thomas: From my point of view, yes. If you really want to create a publication-ready, high-quality graph, R is one of the best tools out there, I think, especially the ggplot2 package, which builds on what’s called the grammar of graphics. It’s really, let’s say, plots broken down on a theoretical level: which layers are there, and so forth. So rather than thinking about bar plots and scatter plots and so forth, it really takes a step back and tries to define this grammar. That can make it, I would say, initially difficult to work with, because you have to understand these concepts. But once it clicks, you’re really able to create extremely refined graphics and also combine graphics in a very elegant way. And the more proficient you get with theming, the more you can make it look like any newspaper out there, because it’s so flexible. If you just stick to the default, well, then it will look like a ggplot.

That’s the default gray and white background. But it’s a super, super flexible tool, extremely powerful. So that’s, again, one of the innovations that came out of RStudio, where they develop this package. And I think the community also really appreciates that, because R also has built-in graphics, which I think at the time were also extremely powerful, and I think they still are. But I would say ggplot just adds, you know, another bit of functionality on top, a bit more sophistication, and you can get even better results.
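The layering idea Thomas describes can be sketched in a few lines of Python. This is a toy model only, not ggplot2 and not its API: the names `Plot`, `Layer`, `Theme` and `describe` are hypothetical, the column names are made up, and nothing is drawn. It only shows how a plot can be specified as a mapping plus independent layers plus a theme, instead of as a fixed chart type.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Layer:
    """One geometric layer, e.g. points or a smoothed line (hypothetical)."""
    geom: str


@dataclass(frozen=True)
class Theme:
    """Controls only the look of the plot, not its content (hypothetical)."""
    name: str


@dataclass(frozen=True)
class Plot:
    """A plot specification: aesthetic mapping + layers + theme."""
    mapping: dict                 # which data column feeds which aesthetic
    layers: tuple = ()            # layers are kept in the order they were added
    theme: Theme = Theme("gray")  # used when no theme is added explicitly

    def __add__(self, part):
        # Adding a part returns a new specification; the original is unchanged.
        if isinstance(part, Layer):
            return replace(self, layers=self.layers + (part,))
        if isinstance(part, Theme):
            return replace(self, theme=part)
        return NotImplemented

    def describe(self):
        aes = ", ".join(f"{key}={col}" for key, col in self.mapping.items())
        geoms = " + ".join(layer.geom for layer in self.layers) or "(no layers)"
        return f"aes({aes}) | {geoms} | theme: {self.theme.name}"


base = Plot(mapping={"x": "dose", "y": "response"})
scatter = base + Layer("point")
full = scatter + Layer("smooth") + Theme("newspaper")

print(scatter.describe())  # keeps the default theme
print(full.describe())     # same data mapping, one more layer, custom theme
```

The point of the design is that a scatter plot and a scatter plot with a trend line are not two chart types but one specification with one or two layers, and that theming is a separate part that can be swapped without touching the content.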

Alexander: Okay. You just mentioned that ggplot is kind of an update. So let’s talk a little bit about updates. How does it work with updates in R?

Thomas: Yeah. So if you go to the R Project website, that would be r-project.org, I believe, and I hope that’s right, you would see that the most recent version is, I think, R 4.0.3, something along those lines. Whenever you get to that page, you will be able to download the latest version. That’s kind of the default, but you can go back and download basically any version, from version 1 on. Once you’ve installed it on your computer, it’s there until you update it yourself. I don’t think you get any notification that there’s a newer version, unless you sign up for one of these R newsletters. So if you want to update, it’s all up to you: you just go to the page again and download the more recent R version.

What you will update much more often is packages. Packages are user-contributed sets of functions that you can download and install. Depending on the pace of the developer, there may be a new version every month or every six months or, for very stable packages, maybe every other year or so. So I would say a lot of innovation in the R space happens on the package side. R itself is pretty stable and backward compatible, so you will not see any major changes. Even if you compare version 3 with version 4, there are some breaking changes, but they are actually pretty minor, and maybe some functionality is added. As I said, most of the innovation happens in packages.
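When deciding whether an update is a routine one or a major-version jump of the kind Thomas mentions (version 3 to version 4), the version numbers have to be compared as numbers, not as text. The following is a minimal Python sketch under stated assumptions: the helper names are hypothetical, it assumes purely numeric dotted versions, and apart from 4.0.3, which is mentioned above, the version numbers are made up for illustration.

```python
def parse_version(text):
    """Turn a dotted version string such as '4.0.3' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_newer(candidate, installed):
    """True if `candidate` is a later version than `installed`."""
    return parse_version(candidate) > parse_version(installed)


def is_major_upgrade(candidate, installed):
    """True if the first (major) component increases, e.g. 3.x to 4.x."""
    return parse_version(candidate)[0] > parse_version(installed)[0]


# Comparing parsed tuples handles multi-digit components correctly ...
print(is_newer("1.10.0", "1.9.2"))
# ... whereas comparing the raw strings puts "1.10.0" before "1.9.2".
print("1.10.0" > "1.9.2")

# A jump in the major component is where breaking changes are most likely.
print(is_major_upgrade("4.0.3", "3.6.3"))
print(is_major_upgrade("4.0.3", "4.0.2"))
```

Comparing tuples of integers rather than raw strings is the design choice that matters here; a real tool would also need to handle non-numeric suffixes, which this sketch deliberately ignores.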

Alexander: Okay. Okay, how does it work on the SAS side?

Sam: It used to be that if you wanted to upgrade or improve SAS, there would be some major release or some maintenance release that would come out, and it was almost like completely reinstalling SAS again, sometimes, to get it to work. In the last decade or so they came up with the concept of providing updates to particular parts of SAS. Take SAS/STAT, the package that includes all the big statistics functions: you can have SAS 9.4 installed and sometimes get an update to SAS/STAT that sits on top of SAS 9.4. That’s the way it works. But to be honest, that’s one of the challenges of running SAS: the system administration of it. As Thomas was saying, if you want R and you want the latest packages, you pretty much sit down at your computer and you download them, right? And it works pretty well. SAS sometimes takes a bit more configuration work, particularly if you’re not running it locally on a PC. If you’re running it on a server somewhere, then you really have to get IT partners involved for configuring that.

Alexander: And that’s also something interesting. You have, how should I say it, Base SAS, which is maybe comparable to R in itself, and then you also have packages like SAS/STAT and, like you just mentioned, the SAS visuals, and all kinds of other things on top of that.

Sam: I used to understand all that a lot better. But Base SAS comes with the basics: you can read data into SAS and you can do the data step, a lot of the data step things, which are very powerful in SAS. Don’t get me wrong, the data step in SAS is amazing in what it can do. Things that are difficult to do in other packages you can do very simply with the way they’ve written that. But then, you know, you want something to do regression, right? You need SAS/STAT, because it’s got PROC REG, PROC GLM, PROC MIXED and all that included. They’ve also added a lot of what they call high-performance routines. So there’s a PROC MIXED and there’s a PROC HPMIXED, and PROC HPMIXED is designed to run on a system that’s got a lot more memory and a lot more processors. If you have a multi-core system, even just on your PC, it will use all the cores rather than just one core, and things can run a lot faster that way in those high-performance routines, which are designed for more parallel processing within a CPU system.

Alexander: Now, let’s talk about what is probably one of the biggest differences: cost. So…

Sam: Here’s where the war starts, because I know the cost Thomas is going to say.

Alexander: What’s the answer Thomas?

Thomas: Well, R is an open source tool; you can get it for free from the internet. What you have to invest is some effort, obviously, in getting up to speed. RStudio is the most popular IDE, and there is a free version of that as well. If you get more into enterprise applications, then you would also pay for that, getting RStudio Cloud, for example. But again, the language itself is open source and free. So you could say no costs.

Alexander: And I think it’s even written into the R Consortium kind of things, you know, that everything must be free, isn’t it? There’s some…

Sam: That’s generally part of the open source license agreements, depending on the license agreement.

Thomas: R is licensed under GPL version 3, so the GNU General Public License, which states that this is free software and that you cannot just take it and wrap it into a commercial product. So it is free by design. And if you want to reuse it in any way, then your solution must also be free.

Sam: And it really depends. If you have software that uses an API that calls R, but R’s doing the work, that’s not violating the license. But if you’re saying, I’m going to take the code, the actual base code, and incorporate that in my code and compile that, that’s where you get into that issue where you can’t do that. You can’t steal the open source code to use as your own and sell it commercially.

Alexander: Yeah. So how was it for SAS?

Sam: Well, SAS has a price, right? And it depends a lot on how you use it. I think one of the things that people get concerned about with SAS sometimes is how they do their pricing and their licensing model. Oftentimes their pricing is based on the number of people that are going to be using SAS at the same time. When I started doing consulting eight months ago, I checked into getting a copy of SAS, just in case I might need it, and the price they quoted me was just a bit too expensive for me as an individual consultant to want to buy, unless I was going to be solely doing SAS programming work. If that was all I was going to do 40 hours a week, it would make sense, right? You’d build at least a little bit into your consulting fees to pay for the license cost. But if you’re going to use it occasionally, like I would, maybe three or four times a year now, it’s not worth it, you know? So I would definitely go use…

Yeah. I’d probably use R to do what I need to do if I needed something like that, unless it was absolutely something that only SAS does, right, that R can’t do. It would be hard to find many cases where SAS is the only option nowadays. That being said, for large companies that have a budget and have big problems to solve, I think SAS is reasonable. The price of SAS is typically reasonable compared to other similar commercial software. So I guess I would leave it at that. And I think the price is always negotiable too, right? You can always talk with them and try to get a better price. But, I was going to say, sometimes you get what you pay for. As Thomas said, free is not free. Free means you’ve got to invest in the knowledge, the people knowledge, the people infrastructure. If you’re running R on a server or something, you need the hardware and the IT support to get that. So it’s not completely free. All that comes along with SAS too; you’ve got to have the same things. But I’ve found you can get really good support from SAS on getting yourself set up and keeping your system running. They also do a lot of, I think they call it customer care, where it’s, hey, we’ve got a group of SAS programmers who want to learn something about SAS, and they’ll get one of their expert programmers to come in and give you a seminar on it, right? That’s just part of what comes along with it, and usually it’s not formally in the agreement that you have with them. They just do that because they want to keep you as a customer. They’re good about that, and their technical support is really good.

Alexander: Yeah, we recently had a person from JMP coming in, just giving a 90-minute presentation, and things like that too. Yeah. It’s kind of part of the sales programs you get for free.

Sam: You know, it’s hard to compete with free, to be honest. Everyone in college now, what are they learning? They’re learning R, because R is free. I have two daughters in statistics: one studying for an undergraduate degree in statistics, and another one earning a graduate degree in statistics. So that’s two of my three daughters, the youngest ones, in statistics, and they’re learning R for most of what they do. They do need a little bit of SAS too; I think their universities want them to have some broader skills. But most of what they do is in R.

Alexander: Last point: acceptability. So how easy is it to submit study data and things like that to regulators using SAS versus R?

Thomas: Yeah, I mean, certainly SAS has been used for 30 to 40 years in this industry, so it is the standard for sure. R is relatively new in this space, and I think very few companies have yet submitted R code to the FDA, for example. That being said, there’s a clear guidance document from the FDA stating that there is no need to use any particular software, or specifically SAS. What you have to make sure is that you used a validated system. And I think with SAS, having a company behind it makes it easier to get a validated system. On the R side, that’s what companies are now heavily investing in: developing a validated R system in-house. But actually it’s not only in-house; there are things like the R Consortium, where there are working groups along those lines.

That being said, from my experience at Roche, the usage is maybe not skyrocketing, I would not go that far, but it is certainly on a very high trend. I know several teams that work on studies that hopefully make it to submission, and they are doing that work in R, so Roche has invested quite a bit in that space. I think the thing that is missing now is one of the big pharma companies submitting one of their key trials and saying, we’ve done it all in R. I think once that’s out there, everyone sees that, oh yeah, this actually works. Because if you talk to people now, it’s still kind of: oh, but can I use R? Is it a validated system? So people are still very cautious and maybe even anxious about it. But I think if you ask that question five years down the road, it will just be: yeah, sure, use SAS, use R, whatever.

Sam: That certainly is the advantage SAS has. Because of history and inertia, it’s generally accepted as good. It’s generally accepted that when you use SAS, you’re going to get the answer right. And I’m sure a lot of the efforts that are going on, even like the R in Pharma work, are to show that the functionality in R, in the packages that you might include in the subset of packages that are considered validated, matches the SAS output at least somewhat, or at least that you know why they don’t match if they don’t match.

So it is kind of the gold standard, and that’s an advantage for SAS. But you’re right. I think the tipping point will be, probably not one pharma company, but I think it’s three pharma companies that have done three submissions with R, and it’s been accepted by the FDA for the clinical trial analysis, and all the tables, figures, and listings were generated that way. Then, yeah, you’re going to see a potentially much bigger change. My concern would be this: do not underestimate the amount of effort it’s going to take to get to a state where people have the same level of confidence in R that they have in SAS. Some people have this debate about not-for-profit versus for-profit, but the for-profit companies have a real incentive to make sure that their software works, because if it doesn’t, people aren’t going to buy it anymore and they’re not going to have that revenue stream anymore. So there’s a lot of work that goes into ensuring that the software is of good quality. And I think in general a lot of R packages are of pretty good quality, but I wonder when I get an R package that’s at version 0.65, which kind of communicates to me: well, I kind of started this, but I didn’t really finish the package. It may do what I need it to do, but I’m not sure I’d want to use that for regulatory submissions. So…

Thomas: I think you raise a good point. Certainly, if you use R, you have to do your due diligence and be sure, especially on the package side, that what you use is fit for purpose. Personally, I would consider R itself and the packages that come with it to be of the same standard as SAS. It’s maybe not a for-profit company behind it, but it’s certainly a set of people that follow similar software engineering best practices. And if you then use popular packages from the tidyverse, coming out of RStudio, I would consider those of equally good quality. But then, yeah, if you download that package from GitHub that my buddy wrote the other weekend, maybe be a bit skeptical about that.

Sam: It is funny. When I worked for Elanco Animal Health, some of their products were regulated not by the FDA but by the US Department of Agriculture. They have a statistics group in there, and they’ve written their own R packages that they like people to use to do the analyses that they typically do. So that’s actually some of my most recent examples of using R, where I’ve used their packages to do analyses that are actually used for batch release testing or for submissions to that regulatory agency, but it’s not the FDA.

Alexander: So interesting. Yeah, I think that wraps up a very, very good discussion about R versus SAS. And, as you can hear, they are both very, very powerful tools for sure. Both are very flexible and can do lots of different things. And yeah, maybe it’s also a little bit of an age topic.

Sam: Yeah. I think there is an old versus young aspect here: what you learned when you were in school, or what you learned when you started your career.

Thomas: I think you really should not underestimate this point. Because if you now recruit statistical programmers and you ask them to be SAS experts, then your pool of candidates becomes somewhat slim. Whereas if you ask for R or Python skills, you have a much, much larger pool, because that’s what folks are introduced to these days. And so I think that’s also one of the driving forces behind why big pharma companies are making a shift, even though they have a lot of production legacy code that is working on the SAS side.

Alexander: I’m pretty sure because of all the legacy code that there is, SAS will not die very, very fast.

Thomas: No I wouldn’t count on that either. No. No.

Sam: I think, you know, any commercial company is going to innovate or it’s going to die, and I think SAS is innovating in a lot of ways. If you see what they’re doing with the modern versions of their platforms, where you install SAS, on this platform you can be running SAS and R and Python, and have a Shiny server and everything, all on that system. It’s really designed to integrate well, so people can have that collection of tools in one place and, you know, call R from SAS, right? You can do that. You can actually call SAS from R too; there are different ways to do that. So I foresee a future where maybe there’s more of a cohesive environment, where people can choose to say: I’m going to have a computing environment, and on that computing environment I’m going to have a collection of tools, and I’ll use the tools that I think are appropriate to solve my problems.

Alexander: Awesome. Thanks, Thomas. Thanks, Sam. That was an awesome episode. We talked about getting up to speed and ease of use. We talked about communities, which is, I think, one of my favorite parts of it, because I really love to work with communities. We talked about updates, costs, acceptability, and visualization. So lots of different dimensions of this debate. Stay tuned; we’ll surely publish more that’s related to that in the future. Thanks so much.

Sam: All right, take care.

Thomas: Thanks for having me.

Alexander: I hope you enjoyed this show, which was created in association with PSI. Thanks to Reine, who helps the show in the background, and thank you for listening. Head over to theeffectivestatistician.com to find much more that can help you boost your career as a statistician in the health sector. Reach your potential, lead great science, and serve patients. Just be an effective statistician.

Join The Effective Statistician LinkedIn group

I want to help the community of statisticians, data scientists, programmers, and other quantitative scientists to be more influential, innovative, and effective. I believe that as a community we can help our research, our regulatory and payer systems, and ultimately physicians and patients make better decisions based on better evidence.

I work to achieve a future in which everyone can access the right evidence in the right format at the right time to make sound decisions.

When my kids are sick, I want to have good evidence to discuss with the physician about the different therapy choices.

When my mother is sick, I want her to be able to understand the evidence.

When I get sick, I want to find evidence that I can trust and that helps me to have meaningful discussions with my healthcare professionals.

I want to live in a world where the media reports correctly about medical evidence and in which society distinguishes between fake evidence and real evidence.

Let’s work together to achieve this.