In this special replay of one of our all-time most popular episodes, we dive deep into one of the most debated topics in the pharmaceutical industry: R vs SAS.

Together with Thomas Neitmann and my co-host Sam Gardner, we compare these two powerful statistical programming tools from multiple angles — ease of learning, day-to-day usability, community support, visualization capabilities, regulatory acceptance, and more.

Whether you are a seasoned SAS programmer, an R enthusiast, or someone deciding which tool to focus on, this conversation will give you valuable insights into where each shines, where they struggle, and how the industry is evolving.

What You’ll Learn:

✔ How Thomas and Sam were first introduced to SAS and R — and how their early experiences shaped their preferences.

✔ Key differences in learning curves and the resources available for beginners.

✔ How each tool fares in day-to-day work and long-term maintainability.

✔ Strengths and weaknesses of SAS and R communities — and the events, forums, and support structures that keep them thriving.

✔ The impact of cost, licensing, and open-source vs proprietary models on adoption.

✔ How both tools handle data visualization and producing publication-quality graphics.

✔ Regulatory acceptance: How far R has come in being used for submissions to agencies like the FDA — and what’s still needed for broader acceptance.

✔ Why your choice of tool might also depend on generational trends and the skillsets of new talent entering the field.

Why You Should Listen:


This isn’t just a technical comparison — it’s a candid, practical discussion based on real-world pharmaceutical research experience. You’ll hear about cultural differences between the SAS and R worlds, the business factors that influence adoption, and the ways companies are moving toward hybrid environments where both can thrive.

If you’re making decisions about tools for your team or career, this episode will help you navigate the trade-offs with eyes wide open.

Links:

🔗 Thomas’ Adventure Blog

🔗 The Effective Statistician Academy – I offer free and premium resources to help you become a more effective statistician.

🔗 Medical Data Leaders Community – Join my network of statisticians and data leaders to enhance your influencing skills.

🔗 My New Book: How to Be an Effective Statistician – Volume 1 – It’s packed with insights to help statisticians, data scientists, and quantitative professionals excel as leaders, collaborators, and change-makers in healthcare and medicine.

🔗 PSI (Statistical Community in Healthcare) – Access webinars, training, and networking opportunities.

Join the Conversation:
Did you find this episode helpful? Share it with your colleagues and let me know your thoughts! Connect with me on LinkedIn and be part of the discussion.

Subscribe & Stay Updated:
Never miss an episode! Subscribe to The Effective Statistician on your favorite podcast platform and continue growing your influence as a statistician.

Thomas Neitmann

Director, Data Science | Statistical Programming | Open Source | R in Pharma

Thomas Neitmann is an R enthusiast and data science leader on a mission to leverage open-source software to advance treatments for neurodegenerative diseases such as Alzheimer’s, Parkinson’s, and ALS.

As Director of Data Science at Denali Therapeutics, Thomas is leading the transition away from legacy commercial software toward a fully R-based pipeline that streamlines clinical trial data processing, analysis, and reporting. His team’s work aims to accelerate drug development and bring life-changing therapies to patients faster.

Thomas discovered R in 2014 when a colleague handed him a script and told him to “analyze some data.” Having never programmed before, he took on the challenge — and quickly realized the power and joy of coding. Over the years, he has also worked with Matlab, Python, and SAS, but R remains his favorite tool.

Beyond his technical work, Thomas is passionate about sharing knowledge. Since 2019, he has grown his LinkedIn following from 300 to over 7,000, launched a blog focused on R programming, and contributed extensively to the R community, with a special interest in data visualization and reproducible workflows.

Sam Gardner

Director Product Development Statistics at Eli Lilly and Company

Sam holds a BS in Mathematics and Chemistry from Purdue University, an MS in Mathematics from Creighton University, and an MS in Statistics from the University of Kentucky.

Sam has held numerous roles in his career. Right out of college, he earned a commission as an officer in the United States Air Force. For 12 years, he worked as a military scientific analyst in several roles, including weapon systems modeling and simulation, flight test planning and data analysis, and human factors research and development. He was also a member of the faculty at the Air Force Institute of Technology, where he taught statistics and experimental design to fellow officers in the graduate engineering, science, and logistics programs.

After his military service, Sam joined Eli Lilly and Company and worked in research and development and manufacturing as a statistician and as a process chemist. He later made a career transition to the SAS Institute, where he was a technical expert in the JMP Division, a role that gave him broad exposure to hundreds of companies that are using statistics to solve business problems. He returned to Eli Lilly in the Advanced Analytics group, focusing on the application of predictive modeling in support of sales and marketing. He then moved back into research and development at Elanco Animal Health, where he led the Experimental Design and Statistics team with a focus on applying Quality by Design to the development of new pharmaceutical products.

After three decades of work in government and industry, Sam decided to launch his own business to provide consulting services that utilize his talents in statistical thinking, problem solving, experimental design, and statistics and predictive modeling.

Sam is a member of the American Statistical Association, the International Statistical Engineering Association, and the American Association of Pharmaceutical Scientists, and he also serves as a volunteer member of the United States Pharmacopeia’s Statistics Expert Committee.

Sam lives in Lafayette, Indiana, USA, with his wife, Susan, and together they have four adult children.

Transcript

00:00
You are listening to the Effective Statistician Podcast, the weekly podcast with Alexander Schacht and Benjamin Piske designed to help you reach your potential, lead great science, and serve patients while having a great work-life balance.

00:22
In addition to our premium courses on the Effective Statistician Academy, we also have lots of free resources for you across all kinds of different topics within that academy. Head over to theeffectivestatistician.com and find the academy and much more for you to become an effective statistician.

00:49
I’m producing this podcast in association with PSI, a community dedicated to leading and promoting the use of statistics within the healthcare industry for the benefit of patients. Join PSI today to further develop your statistical capabilities with access to the ever-growing video-on-demand content library, free registration to all PSI webinars, and much more.

01:13
Head over to the PSI website at PSIweb.org to learn more about PSI activities and become a PSI member today.

01:30
Welcome to another episode. Today we are talking about a controversial topic that Sam actually came up with. Sam, you suggested that we do a podcast on R versus SAS. Maybe you can tell a little bit of the story of your first experience or touch point with R. When did you first hear about it? When I was in graduate school studying statistics.

02:00
The primary tool we used to learn statistics was SAS. And this was old-school SAS on the console, where you submit code, wait for it to run, and then open a file with the output results or print them out. There wasn’t even a program editor; I think that existed on Unix. We were just doing it on, like, an IBM console, and we used some other tools like Matlab, S, and Omnitab, which was the predecessor to Minitab.

02:29
That was used for some time for teaching. I’m not even sure when R actually came into existence; I’m not sure it was around when I started graduate school. But near the end of my program, when I was in school full time, there was a professor down the hall who had started using this new software called R. It was like a clone of SAS, or rather a clone of S, and we were all whispering about it in the hallway: “Oh, he’s using this new software. I’m not really sure about it. It’s just trying to copy S, and is it really going to go anywhere?” I guess.

02:59
It really did in the end, but at the time it was on the fringe of what people were doing in statistics. It was very much just an experimental research tool at that time. So that was my first exposure to R and seeing it used. And then I didn’t really see much of it until probably the last 10 years, when I’ve really started to see it used and recognized as a tool that can be used for more than just academic purposes.

03:28
Yeah, I also started with SAS. I remember I actually started with C, but that was even earlier in my career. But when I started to do some statistics, I was using SAS. I’m not sure whether it was some kind of version two or version three point something. I think we were quite excited when version 3.11 came out because that had some new stuff in it. Yeah. So.

03:57
So you can see, or rather hear, how old Sam and I are. But because we are so old, we have a participant from a younger generation today as well. So Thomas, great to have you here on the show. Yeah, absolutely. Thanks for having me. Maybe you can start with a little introduction of yourself. Sure. So I’m currently working at Roche as a statistical programmer.

04:27
I’ve been there for one and a half years or so now. In my current role, I’m actually leading a team that develops an R package to create ADaM datasets according to the CDISC standard, which is a collaboration with another pharmaceutical company, GSK. So you can see R is in high demand in the industry, which for a person like me is a good situation, because the very first programming language I was ever introduced to was R. That was back in 2014, I believe.

04:55
When I started working at a research lab, a PhD student handed a script over to me and said, “Just run that script and you will get the output from this data that we collected.” Having never coded before, I didn’t even know how to open that file. So that was certainly a frustrating experience, but I downloaded RStudio and managed to run that program. And then I thought, wouldn’t it be cool if I could change some things and adjust it to my needs?

05:23
And then I really went down the rabbit hole, learned R, and became pretty proficient in it. And now I’m a statistical programmer and it’s my daily bread and butter. But when I decided to join the pharmaceutical industry, I learned about SAS for the first time. That was, I guess, 2018 or 2019. I learned SAS with my R background, and SAS was very different, but it certainly was an interesting

05:52
experience. Then I did all the certifications that SAS offers and landed my first job at a CRO. Back then, at that job, it was really full-time SAS, the classical thing in the pharma industry. But once I joined Roche, people very soon saw that I’m proficient in R, so they handed all this work over to me, and it started from there. And now that’s basically all I’m doing. Very interesting. So let’s talk today a little bit about

06:21
these two different pieces of software. I have a couple of dimensions written down for myself. By the way, I haven’t done any programming for quite some time. So I’m probably a good moderator here, but I can’t contribute much content. I can only see what the output is; I can’t really tell how flexible things are. I tried a little bit to program an output, but that’s quite some time ago. Okay. So let’s talk about

06:51
first getting up to speed. So Sam, how long do you think someone who is completely new and has never coded before needs to learn a decent amount of SAS, enough to get along on a day-to-day basis? That’s a good question, and I’m not really sure. I think if

07:19
People learn SAS the way I learned it. You learn SAS by shamelessly stealing other people’s code. You ask, “How did you do this before?” You see the code and you start to see how it works, and then you start to learn the syntax. Back in the day, a long time ago, we actually had printed paper manuals, right? They sat on the shelf, all the SAS manuals. The blue ones. Yeah, to find the syntax and examples. Now that’s all online.

07:47
I think with the availability of the online documentation, it’s a lot easier that way. But I think there’s a bit of a learning curve that you have to climb, particularly if your first exposure was not SAS. If you’re just starting from scratch and you’ve never done any statistical computing, you’re going to struggle just like anybody else learning something new. But if you started using R, for instance, like Thomas, and then you get thrown into SAS, immediately you’re like, “It’s completely different.” The whole paradigm

08:15
in many ways, for how you get data in to do analysis and then how you actually do the analysis and get results out, is very different between SAS and R. A lot of that in SAS is carried over from its initial architecture, when it was designed to be run on mainframe computers back in the 80s and 90s. Some of that flavor still carries over, and it seems a little antiquated sometimes.

08:44
But once you learn it, I also think you see some of the real power that’s there, and that some of those initial design decisions around the software really had staying power. They’re still good ideas about how you would design a software system to read in a lot of data, manage a lot of data, and do analysis on it. It’s pretty good. Thomas, what would be your perception? How do you learn R? Are you also given some code and then work from there? Yeah. And actually, if you

09:14
kind of Google for R resources, there is so much that it’s actually hard to decide where to start and which course to take. There are so many resources out there. There are some great books; one I recommend frequently is R for Data Science, which is a really good one. But there are also many platforms that offer MOOC-style courses, which I also took back in the day and which really helped me. And then again, I think if you’re new, if you’ve never really done computer programming, then it certainly will take time,

09:44
just like SAS. I think if you have a background in something like Python, for example, you can get up to speed very quickly because the general principles are fairly similar. Whereas, as Sam mentioned, if you come from an R background and then get to SAS, SAS is very different from many of, I would say, today’s general-purpose languages. Yeah, R feels more like modern programming languages, or just general programming languages, if you learn how to program in C

10:13
or C# or something like that. With some of those more object-oriented programming languages, R seems a little bit… Although if you ask Python programmers, they will tell you that it’s a horrible language, for whatever reason. I don’t understand, but everyone has their preference. Yeah. There are a lot of these Python versus R debates, which is interesting. But today, I think, in the pharmaceutical industry it’s actually more a question of R versus SAS. It’s interesting. Not so much.

10:43
I think, from what I’ve seen, some people are playing around with Python, but it’s really a niche thing. Whereas with R, I think every pharma company probably now has an R strategy, so to speak, and is really trying to develop tools along those lines. I think Python is used more outside of classical clinical trials. If you’re really heavy into machine learning, doing neural networks, I think that’s where Python shines, as it has great libraries. Yeah. So general-purpose data science. I think in some of the

11:13
non-clinical areas, and areas where you have lots of data to manage and you’re using mostly open-source tools, Python has a lot more use. Now, in terms of managing things on a day-to-day basis: when you need to update things, when you need to change things, how easy would you say that is in R versus SAS? This time maybe I can start with Thomas.

11:42
So where exactly are you going with the question? Updating existing code, or…? Yeah. So it’s the ease of use on a day-to-day basis. I would say it’s fairly similar. For example, in R, many people use RStudio as their favorite editor. And now SAS has SAS Studio, which is in many ways very similar, a nice interface to work with. At least I like them very much. I know old-school SAS programmers are more on the PC SAS side, but everyone has their favorite tools.

12:10
I was actually quite happy when SAS introduced some kind of help, in terms of showing you when you have missed a semicolon. That’s a whole different story, trying to pry the Enhanced Editor out of programmers’ cold, dead fingers, because they do not want to change. Maybe you do that on a different… Exactly. Yeah.

12:36
Okay. The IDE is really nice to have. Next dimension: community. Sam, what do you think about the SAS community and all the events around it? I think an IDE is really a great tool to have. At least I really cannot imagine not having one, like using just a plain text editor. I would probably really suck because I would have so many syntax errors. Yeah. And back in the day, that’s how we did SAS coding:

13:04
open up Emacs or vi, type your code, save the file, then type SAS, space, the file name, and let it run. That was it. There are different types of communities. SAS itself has a very large sales and marketing organization, and they put a lot of effort into creating events and, I guess, environments where you can meet other SAS users, share stories with them, get to know them, and ask questions.

13:32
Their technical support is very good too. If you have a technical question, they generally are very helpful. So I think they do a lot of that. I used to work at SAS, not actually selling SAS products; I worked in sales and marketing for JMP. So I know a little bit about how things work inside SAS, or at least how they used to work. And SAS makes its money by selling big solutions to companies, right? They don’t really make their money by selling PC SAS to an individual.

14:02
Where the big revenue comes from is a large company, like a pharma company, that will buy SAS to be used by hundreds of people, or systems that do fraud detection and things like that. So they organize a lot of their communities around those solutions, I guess is what I’m trying to get at. You may go to an event and they’ll talk about what SAS can do for fraud detection, or you may go to a conference where they have a lot of talks along the lines of: this is how I use this SAS system,

14:29
an integrated system using SAS as the foundation to solve a problem. But then you also have the SAS user groups. I don’t know if those exist in Europe, but in the United States they have a lot of regional SAS user groups that often run their own conferences. Those are actually some of the best ones to go to if you really just want to learn new things about SAS, because they tend to be run by users, not by SAS. And that’s nice. Yeah. There’s also the SAS Global Forum,

14:58
which I attended two years ago, and I’m speaking again in 2021, depending on when this… Those are great events. If you can afford it and you have the time to go, you learn a lot, they treat you pretty well, and you get an opportunity for broad exposure to people who are using SAS at all different companies. And thousands of participants. It’s huge. It’s really huge. Yeah. So you need a big convention center to run that. Okay.

15:27
Thomas, what do you think about the R community? I think it’s a really great community, especially if you’re, how to call it, an internet-first person of my generation. If you go on Twitter, there’s the hashtag #rstats, for example, with people posting new solutions they’ve come up with, asking questions, and helping others out. It’s really great. It has also started picking up on LinkedIn, and then you obviously have things like Stack Overflow, where you can probably get an answer to any question you can think of.

15:56
So if you just type R and then your question into Google, you’ll probably find an entry from another user who faced a similar problem. And in general, the community is really trying to help each other and to improve R and the R ecosystem itself. There’s also a heavy emphasis on diversity, trying to include as many groups of people as possible. It’s a fantastic community to be part of. Then obviously, compared to SAS,

16:22
There is no R help desk that you can call. There’s no company behind it. It’s an open-source language. There’s obviously the R Core Team, but that’s basically a couple of statistics professors who have a real full-time job. You can’t just call them up and ask them a question, but that’s really where the community shines. And then obviously there are also a lot of conferences. For example, in my field, R in Pharma is a big one. RStudio does their rstudio::conf.

16:48
You have European conferences and whatnot. There are also R meetups, similar to the SAS user groups. That’s also huge. And I feel like it’s just getting more and more, and it’s great. Yeah, that’s actually interesting. There’s the R Consortium, which is a much bigger association around R, I would say. And then there’s RStudio, but RStudio is actually a company. Yeah. So.

17:17
They actually have a help desk that you can call, and they also have salespeople and things like that. But RStudio is really the IDE that they developed. Now they have obviously scaled to many more products, but they are not behind R, the language itself. Still, I think we’re very lucky to have RStudio as a company in this space. And I think they recently transitioned to what is called a public benefit company in the US, so you could say, I think,

17:45
It’s a not-for-profit company. It’s trying to work for the greater good of the R community, so to speak. And I think every pharma company that uses RStudio products is very happy to have them, because they are really good. Yeah. I like the RStudio IDE quite a bit. It’s a nice tool. I’ve tried a couple of other things, and that’s one of the best, I think. And they also have blogs like R Views and stuff like this. Yes.

18:13
Yeah, I think they’re also really heavy on the educational content side. And they now also have an instructor program, teaching others how to teach R. I think that’s a general problem in programming, because there are lots and lots of great, skilled programmers out there, but actually teaching someone how to program is a whole different skill set. I was lucky enough to attend that session last year. It’s really interesting if you think of the methodology behind how you can actually bring someone

18:41
up to speed and teach them to program using empirical best practices. Because oftentimes it’s like, you give someone a script and say, “Figure it out,” and if you’re smart enough, you can do it. But that’s probably not the best way to get someone… You said something there that triggered me a little bit. And I don’t want this to turn into a war, right? Because this is not about which software is the best software. But one thing that triggered me a little bit is you said, yeah, to teach somebody how to do programming.

19:10
My experience has been that… I work with a lot of people, scientists and engineers. They just want answers. They want to solve problems, and what they want is the tools to help them solve problems. I’m a big user of JMP, which is one of SAS’s products. Its real power is that it’s primarily, or at least in its main interface, point and click; it has a graphical user interface. I don’t know how much of that actually exists for R, or even for SAS. And you can just drop in little

19:41
sections of code that you want to run, say, to open up data sets. And it does a lot of the actual code generation. Those graphical user interfaces for SAS are code generators. It does a pretty good job of that for the basic user, who doesn’t really need to learn code. So what would be your response to that? What if I’m working with someone who doesn’t want to code? Is there some kind of JMP-like version of that? I don’t know. I never really used JMP, so I cannot make a

20:08
really accurate comparison, but I don’t think there is a drag-and-drop kind of interface to R. Obviously R Shiny is a very popular tool with which you can build solutions, but then you get into real programming and development. We, for example, do that extensively at Roche. We use the R Shiny framework and provide web applications where our scientists can explore their data in a way that is a bit more…

20:33
tailored towards, let’s say, the analysis that has been written down in a statistical analysis plan, rather than letting them drag and drop everything around and giving them the ultimate freedom to explore. That might not sound good, but sometimes it’s actually good to push them in a certain direction, because clinical scientists can be very creative in finding very interesting subgroups. I think there is a package.

21:02
There is a package called R Commander that someone has out there that has some GUI functionality in it. That is actually one of the things: the more flexible something becomes, the more difficult it also becomes. Yeah. And I think JMP is then something that is less flexible, but of course much easier to use, because then you have the power of

21:30
templates, defaults, and all these kinds of things. And that of course helps to minimize the options and standardize lots of things, which makes it easy to get lots of things done quite quickly. I think that’s one of the nice things, and maybe the next thing I want to talk about is visualization. With JMP, you can very quickly get lots of easy visualizations. SAS too: the newer versions of SAS have

22:00
come along with web-based, HTML-based interfaces where you can do some of that interactive visualization too. What do they call that? They call it Visual Studio, I think, in SAS. I’ve used it a little bit. It’s nice. You have to have your data set prepared in advance, right? But once you’ve got it set up, you can drag and drop variables onto a graph, make different types of graphs, save them, and things like that. And that’s very nice. Even though I’m on the SAS side of this debate, I will say that

22:29
really high-quality, publication-quality graphs are hard to make in SAS. I’ll just admit that. There are people who’ve made their whole careers out of knowing PROC GPLOT; they’re experts in PROC GPLOT. SAS has been improved quite a bit in a lot of ways. A lot of the statistical procedures now automatically generate relevant statistical graphics that you can output, which is nice, and that’s just an option in the commands when you run them. But if you just want a really

22:58
refined, precise graph of your data, sometimes you have to work pretty hard at it. But I don’t know whether that’s different in R or not. Thomas, what’s your reply in terms of visualization in R? From my point of view, as Sam mentioned, if you really want to create a publication-ready, high-quality graph, R is one of the best tools out there, I think. Especially the ggplot2 package, which builds on what is called the grammar of graphics. So it’s really a

23:26
let’s say, plot broken down on a theoretical level into which layers there are and so forth. So rather than thinking about bar plots and scatter plots and so on, it takes a step back and tries to define this grammar. That can make it, I would say, initially difficult to work with, because you have to understand these concepts. But once it clicks, you’re really able to create extremely refined graphics and also combine graphics in a very elegant way.

23:56
And the more proficient you get with theming, you can basically make it look like any newspaper out there, because it’s so flexible. If you just stick to the defaults, with the default gray-and-white background, it will look like a ggplot. But it’s a super flexible, extremely powerful tool. So that’s again one of the innovations that came out of RStudio and Hadley Wickham, who developed this package. And I think the community also really appreciates having it, because R also has built-in graphics,

24:25
which I think at that time were also extremely powerful, and I think they still are, but I would say ggplot just adds another bit of functionality on top, a bit more sophistication, and you can get even better results. Okay. You just mentioned ggplot is an update. Yeah. So let’s talk a little bit about updates. How does it work with updates in R?

24:50
Yeah. If you go to the R project website, that would be r-project.org, I believe, and I hope that’s right, you would see that the most recent version is, I think, R 4.0.3 or something along those lines. Whenever you get to that page, you will be able to download the latest version. That’s the default, but you can go back and download basically any version, starting from version one. And then once you’ve installed it on your computer, it’s there until you

25:17
update it yourself. I don’t think you get any notifications that there’s a new R version unless you sign up for one of these R newsletters. So if you want to update, that’s all up to you. You just go to the page again and download the more recent R version. What you will update much more often is packages. Packages are basically user-contributed sets of functions that you can download and install. And depending on the pace of the developer,

25:43
There may be a new version every month or every six months or with very stable packages now, maybe every other year or so. So I would say a lot of innovation in the R space happens on the package side. The actual R itself is pretty stable and backward compatible. So you will not see any major changes. Even if you compare version three with version four, there are some breaking changes, but actually they are pretty minor and maybe some functionality is added. But as I said, most of the innovation happens in packages.

26:14
Okay. Okay. How does it work on the SAS side? With SAS, it used to be that if you wanted to upgrade or improve SAS, there would be some major release or some maintenance release that would come out. And sometimes it’s almost like completely reinstalling SAS again to get it to work. They did come up with this concept, I don’t know, in the last decade or so, where they can provide

27:04
sort of updates of particular parts of SAS. So take SAS/STAT, the package that includes all of the main big statistics functions. You can have SAS 9.4 installed, but you can sometimes get an update to SAS/STAT that sits on top of SAS 9.4. That’s the way it works. And to be honest, that’s one of the challenges with running SAS: just the system administration of it. As Thomas was saying, if you want R and you want the latest packages, you pretty much sit down at your computer and download it, right?

27:34
And it works, usually works pretty well. SAS can sometimes take a little bit more configuration work, particularly if you’re not running it locally, like on a PC. If you’re running it on a server somewhere, then you really have to get your IT partners involved for configuring. Yeah. And that’s also something interesting. You have, how should I say it, Base SAS. Yeah. That’s maybe comparable to R itself. And then you have

28:02
also packages like SAS/STAT, like you just mentioned, and SAS visuals and all the other things on top of it. I used to understand that a lot better, but Base SAS comes with the basics: you can read data into SAS, and you can do the data step and a lot of the data step things, which are very powerful in SAS. Don’t get me wrong. The data step in SAS is amazing in what it can do. Very simple things that are difficult to do in other packages

28:31
you can do very simply with the way they’ve written that. But then you want something to do regression, right? You need SAS/STAT, right? Because it’s got PROC REG and PROC GLM and PROC MIXED and all that included in it. Right. They’ve also added a lot of what they call high-performance routines in those. So there’s a PROC MIXED and there’s a PROC HPMIXED. And the difference is that HPMIXED is designed to run on a system that’s got a lot more memory, a

29:01
lot more processors. If you have a multi-core system, even just on your PC, it’ll use all the cores rather than just one core. And things can run a lot faster that way. Those high-performance routines are designed for more sort of parallel processing within a CPU system. Now let’s talk about one of the probably biggest differences. Here’s where the war starts.

29:31
Because I know the cost Thomas is going to say. R is an open source tool. You can get it for free from the internet. What you have to invest is some effort, obviously, getting up to speed. There is also a free version of RStudio, the most popular IDE. If you get more into enterprise applications, then you would also pay for that, getting RStudio Cloud, for example. But again, the language itself is open source and free. So you could say no costs.

30:01
Okay. And I think it’s even written into the R Consortium kind of things that everything must be free, isn’t it? That’s generally part of open source license agreements, depending on the license agreement. Yes. I think R is licensed under GPL version three, so the GNU General Public License, which basically states that this is free software and you cannot just take it and wrap it into a commercial product.

30:30
So it is free by design. And if you want to reuse it in any way, then your solution must also be free. It really depends. If you have software that uses an API that calls R, but R is doing the work, that’s, you know, not violating the license. But if you’re saying I’m going to take the code, the actual base code, and incorporate that in my code and compile it, that’s where you get into that issue, where you can’t do that and steal the open source code to use as your own and sell it commercially. How is it for SAS?

31:01
SAS has a price, right? And it depends a lot on how you use it. That is, I think, one of the things that people get concerned about with SAS sometimes is how they do their pricing and their licensing model, their pricing model. But oftentimes, their pricing is based on the number of people that are going to be using SAS at the same time. And so if, you know,

31:24
When I started doing consulting eight months ago, I checked into getting a copy of SAS just in case I might need it. And the price they quoted me was just a bit too expensive for me as an individual consultant to want to buy, unless I was going to be solely doing SAS programming work. Like if that was all I was going to do, 40 hours a week, it would make sense. Just add a little bit into your consulting fees to pay for the license costs. But if you’re going to use it occasionally, like I would, maybe three or four times a year right now,

31:54
it’s not worth it. So I would definitely go use, yeah, I would go use R probably to do what I need to do if I needed something like that. Unless it was absolutely something that only SAS does, right? That R can’t do. It would be hard to find many cases where SAS is the only option nowadays. But that being said, for large companies that have a budget and have big problems to solve, I think SAS is reasonable, given that the price of SAS is typically reasonable

32:21
compared to other types of similar commercial software. So I guess I would leave it at that. And I think the price is always negotiable too, right? So you can always talk with them and try to get a better price. But you get what you pay for, or I should say sometimes you get what you pay for. And as Thomas said, free is not free: you’ve got to invest in the knowledge, the people, the infrastructure. If you’re running R on a server

32:50
or something like that, there’s the hardware and the IT support to get that. So it’s not completely free. Now all that comes along with SAS too. You’ve got to have the same things, but I’ve found you can sometimes get really good support from SAS on getting yourself set up and keeping your system running. They also do a lot of, I think they call it customer care, where, hey, we’ve got a group of SAS programmers who want to learn something about SAS, and they’ll get one of their expert programmers to come in and give you a seminar on it. And that’s just part of what comes along with it.

33:21
And it’s usually not formal in the agreement that you have with them; they just do that because they want to keep you as a customer, and they’re good about that. And their technical support is really good. Yeah. We recently had a person from JMP coming in and just giving a presentation about it and things like that too. Yeah. It’s part of the sales programs. You basically get it for free.

33:45
It’s hard to compete with free, I’ll be honest. And everyone in college right now, what are they learning? They’re learning R, because R is free. I have two daughters, one studying for an undergraduate degree in statistics and another one earning a graduate degree in statistics. Two of my three daughters, the two youngest, are studying statistics, and they are learning R for most of what they do. They’ve learned a little bit of SAS too. I think their universities just want them to have some broader skills, but most of what they do is in R. Last point.

34:14
So how easy is it to submit study data and things like that to regulators using SAS versus R? Yeah, certainly SAS has been used for 30, 40 years in this industry. So it is the standard for sure. R is relatively new in this space. And I think very few companies have yet submitted R code to the FDA, for example.

34:43
That being said, there’s a clear guidance document from the FDA stating that there is no need to submit in any particular software or specifically SAS. What you have to make sure is that you used a validated system. And I think with SAS having a company behind it, that makes it easier to set up a validated system. So on the R side, that’s basically what companies are now heavily investing in: developing, in-house, a

35:11
validated R system, but actually not only in-house; there are things like the R Consortium, where there are working groups along those lines. That being said, from my experience at Roche, the usage is, yeah, maybe not skyrocketing, I would not go that far, but certainly on a very strong upward trend. And I know several teams that work on studies that hopefully make it to submission, and they are doing their work in R. So Roche has invested quite a bit in that space. And I think

35:40
The thing that is missing now is one of the big pharma companies submitting one of their key trials and basically saying we’ve done it all in R. I think once that’s out there, everyone sees that, yeah, this actually works. Because if you now talk to people, it’s still, oh, but can I use R? Is it a validated system? So people are still very cautious and maybe even anxious about it. But I think if you ask that question five years down the road,

36:06
you will just hear, yeah, sure, use SAS, use R, whatever. That certainly is an advantage SAS has because of history and inertia. It’s generally accepted as good, right? Generally accepted that when you use SAS, you’re going to get the right answer. I’m sure a lot of the effort that is going in, even like the R in Pharma work, is to show that the functionality in R, and in the subset of packages that are considered validated, matches, at least somewhat,

36:34
SAS output, or at least to explain why they don’t match if they don’t match. So it is the gold standard. And that’s an advantage for SAS, right? But you’re right. I think the tipping point will be, probably not one pharma company, but when it gets to three pharma companies that have done three submissions with R, and it’s been accepted by the FDA, and the clinical trial analysis and all the tables, figures, and listings were generated that way,

37:02
then yeah, you’re going to see it be a much bigger, potentially bigger change. My concern would be to not underestimate the amount of effort it’s going to take to get to a state where people have the level of confidence in R that they do with SAS. And that’s one of the things, some people say not-for-profit versus for-profit, I have this debate, but the for-profit companies have a real incentive to make sure that their software works. Right. Because if it doesn’t,

37:32
people aren’t going to buy it anymore. And they’re not going to have that revenue stream anymore. So there’s a lot of work that goes into ensuring that the software is of good quality. And I think in general, a lot of R packages are pretty good quality, but I wonder when I get an R package that’s at version 0.65,

37:53
which kind of communicates to me that I started this, but I didn’t really finish the package. It may do what I need it to do, but I’m not sure I’d want to use that for regulatory submissions. So I think you raise a good point there. Certainly if you use R you have to do your due diligence and be sure, especially on the package side, that what you use is fit for purpose. Personally, I would consider R itself and the packages that come with it to be of the same standard as SAS.

38:21
There may not be a for-profit company behind it, but there’s certainly a set of people that follow similar software engineering best practices. And if you then use popular packages, for example the tidyverse out of RStudio, I would consider that of equally good quality. But then, yeah, if you download that package from GitHub that my buddy wrote the other weekend, maybe be a bit skeptical about that. Yeah, it’s funny, when I worked for Elanco Animal Health, some of their products were regulated by,

38:51
not the FDA, but the USDA, the US Department of Agriculture. And they have a statistics group in there, and they’ve written their own R packages that they like people to use to do the analyses that they typically do. That’s actually some of my most recent examples of using R, where I’ve used their packages to do analyses that are actually used for batch release testing or submissions to that regulatory agency. But it’s not the FDA, it’s not the EMEA. Interesting. Yeah. I think that

39:19
wraps up a very good discussion about R versus SAS. And as you can hear, they are both very powerful tools for sure. Both are very flexible, with lots of different things. And yeah, maybe it’s also a little bit of an age topic. Yeah. I think there is an old versus young aspect here: what you learned when you were in school

39:47
or what you learned when you started your career. I think you should really not underestimate this point, because if you now recruit statistical programmers and ask them to be SAS experts, then your pool of candidates becomes somewhat slim. Whereas if you ask for R or Python, you have a much, much larger pool, because that’s what folks are introduced to these days. And so I think that’s also one of the driving forces behind why big pharma companies are making a shift, even though they have a lot of

40:14
production legacy code that is working on the SAS side. I’m pretty sure, because of all the legacy code that there is, SAS will not die very fast. I wouldn’t count on it. I think it will stick around. I think any commercial company is going to innovate or going to die. And I think SAS is innovating in a lot of ways. You see what they’re doing with the modern versions of their platforms, where you install SAS and you now can

40:42
work with, on this platform, SAS and R and Python, and have a Shiny server and everything all on that system. And it’s really designed to integrate well, so people can have that collection of tools in one place and call R from SAS, right? You can do that, right? You can actually call SAS from R too, right? There are different ways to do that. So I foresee a future where maybe there’s more of a cohesive

41:11
environment where people can choose to say, I’m going to have a computing environment, and on that computing environment I’m going to have a collection of tools and use the tools that I think are appropriate to solve my problems. Awesome. Thanks, Thomas. Thanks, Sam. That was an awesome episode. We talked about getting up to speed and ease of use. We talked about communities, which is, I think, one of my favorite parts of it because I really love to work with communities. We talked about

41:38

updates and costs and acceptability and visualization, lots of different dimensions of this debate. Stay tuned; we’ll surely have topics more or less related to that in the future. Thanks so much.

42:00
This show was created in association with PSI. Thanks to Reine and her team at VVS for help with the show in the background, and thank you for listening. Reach your potential, lead great science, and serve patients. Just be an effective statistician.

Join The Effective Statistician LinkedIn group

I want to help the community of statisticians, data scientists, programmers and other quantitative scientists to be more influential, innovative, and effective. I believe that as a community we can help our research, our regulatory and payer systems, and ultimately physicians and patients take better decisions based on better evidence.

I work to achieve a future in which everyone can access the right evidence in the right format at the right time to make sound decisions.

When my kids are sick, I want to have good evidence to discuss with the physician about the different therapy choices.

When my mother is sick, I want her to have access to the evidence and to be able to understand it.

When I get sick, I want to find evidence that I can trust and that helps me to have meaningful discussions with my healthcare professionals.

I want to live in a world where the media reports correctly about medical evidence and in which society distinguishes between fake evidence and real evidence.

Let’s work together to achieve this.