How can artificial intelligence simplify your daily work?

How do tools like GitHub Copilot streamline tasks for statisticians and data scientists? 

In this episode of The Effective Statistician, I explore these questions with Paolo Eusebi, who shares his direct experience using Copilot in RStudio. We dive into how AI can suggest accurate code, assist with data manipulation, and help you quickly learn new programming techniques. We also discuss the critical need to safeguard data privacy when using these tools in larger organizations.

Join us to see how AI can boost your productivity and transform how you work!

Key Points:
  • AI in Daily Work
  • GitHub Copilot
  • Code Suggestions
  • Learning Tool
  • RStudio Integration
  • Productivity Boost
  • Data Privacy
  • Future of AI

As AI continues to revolutionize the way we work, tools like GitHub Copilot offer incredible opportunities to enhance productivity, simplify complex tasks, and accelerate learning. In this episode, Paolo and I dive into the practical applications of Copilot in our day-to-day work, sharing insights that can benefit statisticians and data scientists alike. Don’t miss out on this valuable discussion! 

Tune in to the episode now, and if you find it helpful, be sure to share it with your friends and colleagues who could also benefit from AI tools like Copilot. Let’s spread the knowledge and make work easier for everyone!

P.S. Don’t miss out on our upcoming Effective Statistician Online Conference: Fall 2024 – Transforming Healthcare!

Join us this November for an exciting opportunity to unlock the future of medical data with a special focus on RCT vs RWE, AI, and EU HTA. Whether you’re looking for free access to keynotes and panel discussions or want to dive deeper with premium workshops on topics like Machine Learning, Data Visualization, and Personal Productivity, there’s something for everyone. Secure your spot now and connect with over 800 professionals from the healthcare industry! Mark your calendars: Nov 7-8 & Nov 11-12, 2024.

Transform Your Career at The Effective Statistician Conference 2024!

  • Exceptional Speakers: Insights from leaders in statistics.
  • Networking: Connect with peers and experts.
  • Interactive Workshops: Hands-on learning experiences with Q&A.
  • Free Access: Selected presentations and networking.
  • All Access Pass: Comprehensive experience with recordings and workshops.


Paolo Eusebi

Senior Consultant, Statistics & Psychometrics at IQVIA 

Statistician with broad experience in all aspects of biostatistics, epidemiology, and health services evaluation. Interested in consulting offers.

Specialties: Data management, data analysis, research projects. Knowledge of main statistical software packages such as SAS, STATA, and R. 

Transcript

How AI Can Help Us With Our Day-To-Day Work! Experiences With Copilot

Alexander: [00:00:00] Welcome to a new episode and today I’m super happy that we have Paolo again. Hi Paolo, how are you doing? 

Paolo: Alexander, I’m doing very well, you? 

Alexander: Very good. It’s spring outside, a lot of fresh things are happening, and of course there are a lot of very interesting things happening out there. And I think at the moment you can’t actually look into any news or social media whatsoever and not stumble over AI.

It is. Yeah, definitely. It’s definitely everywhere. And today we want to talk about how all these tools, and specifically Copilot, can help you at work now and what that could look like in the future. And Paolo, you have [00:01:00] now been working with Copilot a little bit and can share your experiences and your learnings.

But before we get into that, how did you come across Copilot? When did you first learn about it?

Paolo: I learned about it when I was scrolling my LinkedIn feed and saw a lot of buzz around it. And it seemed, you know, something really cool, at least to try out.

Yeah, because my first idea was to try out the tool and see how this kind of tool can work for us. Are they really useful? Is the buzz worth it? And it was also a way to be, you know, part of the conversation because, as you said, it’s really impossible to avoid AI, generative AI, large [00:02:00] language models, and copilots in any discussion.

If you are at a conference, or in a virtual coffee with a colleague, it’s something that pops up all the time. So it was not really because it was something that I needed for my daily work, because I was stuck and needed another tool. It was just for trying it out.

Alexander: Yeah. So an interesting learning is you learn about all these kinds of different things if you scroll through the right social media and I have, I’ve learned a lot about all kinds of different things through LinkedIn. And so I think that stresses once again, the importance to leverage that social media network for your advantage.

I think it’s different from lots of the other [00:03:00] social media networks that are nice distractions and kind of entertaining. I think LinkedIn is part of work, and I think checking it, posting on it, being active on it, and having a good profile helps you a lot with your job.

Paolo: Yeah, of course. 

Alexander: So Copilot. What is that actually? 

Paolo: Yeah. I mean, there are different versions of Copilot. I tried GitHub Copilot integrated in RStudio. Basically, a copilot is an application integrated in another application for simplifying your work. So, for example, GitHub Copilot integrated in RStudio is an application driven by AI models that, when given a prompt, will output code for you

[00:04:00] in a variety of languages, because it could be R and Python, for example, if you use it in the RStudio IDE. And it’s a bit surprising, because it often provides you with a very accurate version of what you’re looking for. Of course, Copilot leverages the code written by other software developers and researchers.

So for the most basic stuff, it’s really impressive. I mean, if you want to read a data set, perform a linear regression, perform a logistic regression, and maybe summarize the results and put them in a table, it was just fine.

Alexander: What do these prompts look like? Is it kind of: take data set X, use Y as an [00:05:00] endpoint, use these variables X, Y, Z as covariates, and create a linear model?

Paolo: Yes. And, for example, in R you can have comments in your code, as in other languages.

And there are different ways of commenting your code. So basically you can start by putting your comment in the code, like “a logistic regression with Y as dependent variable and X as covariate,” where Y and X come from a data set that you have already loaded into the software.

And then you go to the next line, and Copilot will do some auto-completion and suggest code. So you can see a line of [00:06:00] code suggested, and you can simply press Tab to accept the suggestion, or you can refuse the suggestion.
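As a minimal sketch of the comment-driven workflow Paolo describes, the block below starts from a plain-language comment and follows it with the kind of completion a tool like Copilot might propose. The data frame `dat` and its columns `y` and `x` are made up for illustration, and the `glm()` call is one plausible suggestion, not a record of actual Copilot output.

```r
# Hypothetical example data (an assumption for illustration only):
# a binary outcome y and one continuous covariate x
set.seed(42)
dat <- data.frame(x = rnorm(200))
dat$y <- rbinom(200, size = 1, prob = plogis(-0.5 + 1.2 * dat$x))

# logistic regression with y as dependent variable and x as covariate
fit <- glm(y ~ x, data = dat, family = binomial(link = "logit"))

# summarize the results and put them in a table
coef_table <- summary(fit)$coefficients
print(round(coef_table, 3))
```

The comment line is the prompt; everything after it is what you would accept with Tab or reject. Whatever is suggested still needs to be checked against the analysis you actually intend.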

Because maybe it’s not accurate, maybe it’s not what you’re looking for. But again, in general, for basic stuff it works. And sometimes I was surprised. Let’s suppose you’re writing some code for manipulating data, and at some point you have a few lines for manipulating the levels of a categorical variable, because you want to fine-tune the categories to present them in a better way.

Then, after the first categories, it starts to suggest the next category and the next one, and they are always accurate, because of course it reads the levels of your variable in the [00:07:00] data set. It is able to use the data set you loaded, code from other developers, and the code you entered before.

For example, when I was doing this kind of manipulation, at some point I created missing values for some specific categories, with the purpose of removing these missing values at the end.

And really, at the end, it suggested the na.omit command for removing all the records with missing values in that variable. So it’s,
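The recode-then-remove pattern from this part of the conversation could look roughly like the sketch below. The data frame, the `smoking` variable, and its category labels are hypothetical; only `na.omit()` is a real base R function, and the lines after the first recode stand in for the kind of follow-up suggestions described, not for actual Copilot output.

```r
# Hypothetical data set with a categorical variable to tidy up
dat <- data.frame(
  id = 1:6,
  smoking = c("never", "former", "current", "unknown", "refused", "never"),
  stringsAsFactors = FALSE
)

# Recode the categories one by one; after the first line or two,
# a code assistant may propose the remaining levels
dat$smoking[dat$smoking == "never"] <- "Never smoker"
dat$smoking[dat$smoking == "former"] <- "Former smoker"
dat$smoking[dat$smoking == "current"] <- "Current smoker"

# Deliberately set uninformative categories to missing
dat$smoking[dat$smoking %in% c("unknown", "refused")] <- NA

# Remove all records with missing values
# (na.omit() drops rows with NA in any column; here only smoking has NAs)
dat_clean <- na.omit(dat)
print(dat_clean)
```

Note that `na.omit()` works on the whole data frame, so in a data set with missing values in other columns you would filter on the one variable instead, for example with `dat[!is.na(dat$smoking), ]`.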

Alexander: This is interesting. Yeah, so it basically reads the code while you’re typing it, compares it to other existing code, and makes suggestions

Paolo: For you. [00:08:00] For you. 

Alexander: When you then do, for example, these, like a logistic model and you have entered two, three different variables. Would it also suggest kind of, Oh, you could add interactions or interaction terms or things like that? 

Paolo: No, but maybe you can, I’m simply thinking aloud, add a comment like “the same model with interaction.”

And then on the next line you may get the suggested model, including the original variables and the interactions. That could work, of course. For example, for logistic regression it provides you with the binomial distribution and the link set to the logit link.

And of course there could be other options. I mean, it’s a copilot and it is not a statistician, but maybe it’s a good starting point for [00:09:00] already having something trusted. Then you can dig into the documentation and see other ways of doing logistic regressions, using other links, for example.
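To make the “same model with interaction” idea concrete, here is a small sketch under stated assumptions: the data frame `dat` and the variables `y`, `x1`, and `x2` are invented, and the probit model is just one example of the “other links” that the `glm()` documentation lists for the binomial family.

```r
# Hypothetical data: binary outcome y with two covariates x1 and x2
set.seed(1)
dat <- data.frame(x1 = rnorm(300), x2 = rbinom(300, size = 1, prob = 0.5))
dat$y <- rbinom(300, size = 1, prob = plogis(-0.3 + 0.8 * dat$x1 + 0.5 * dat$x2))

# logistic regression with y as dependent variable and x1, x2 as covariates
fit_main <- glm(y ~ x1 + x2, data = dat, family = binomial(link = "logit"))

# the same model with interaction
fit_int <- glm(y ~ x1 * x2, data = dat, family = binomial(link = "logit"))

# the same model with a probit link instead of the logit link
fit_probit <- glm(y ~ x1 * x2, data = dat, family = binomial(link = "probit"))

# likelihood ratio test: does the interaction term improve the fit?
lrt <- anova(fit_main, fit_int, test = "Chisq")
print(lrt)
```

Whether an interaction or a different link is appropriate remains a statistical judgment; the comment only tells the tool what to write, not whether it is the right model.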

Alexander: Then the output is often something that we can spend a lot of time on, kind of creating a nice table or a nice figure. What’s your experience there?

Paolo: Yeah, for the figures, for example, I have to invest more time, but it is able to create ggplot code from zero with just one line of comment. So if you want a box plot, scatter plots, or stuff like that, you can have it, and these basic data visualizations are correct. [00:10:00] And also for learning it could be important, because ggplot is really cool, but you have to know it.

And maybe you are coming from another language. You’re just using R because it’s convenient for a piece of work you’re doing. But maybe you usually create figures with, I don’t know, Python maybe. But instead of going back to Python for the figures, you can stay in R because you can have basic suggestions from Copilot.
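For readers who have not used ggplot before, the sketch below shows what a one-line comment and a matching box plot might look like. It assumes the ggplot2 package is installed; the data frame `dat` and its columns `group` and `value` are hypothetical, and the plotting code is an illustration rather than actual Copilot output.

```r
library(ggplot2)

# Hypothetical data: a continuous outcome measured in three groups
set.seed(7)
dat <- data.frame(
  group = rep(c("A", "B", "C"), each = 30),
  value = rnorm(90, mean = rep(c(10, 12, 15), each = 30), sd = 2)
)

# box plot of value by group
p <- ggplot(dat, aes(x = group, y = value)) +
  geom_boxplot() +
  labs(x = "Group", y = "Value")

print(p)
```

Swapping `geom_boxplot()` for `geom_point()` with two continuous variables gives a basic scatter plot, which is the kind of small variation that is easy to pick up from suggestions.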

Alexander: Yeah. How, how does it work with translating code? 

Paolo: It can also translate code. In general, for this kind of process you can also use the Copilot application in Microsoft Edge, for example, just to name an option, which [00:11:00] is integrated in Bing.

So, for example, even outside an IDE, you can give it a piece of code in SAS and have it translated into R.

Alexander: This is so cool. Yeah. You basically have a DeepL for programming languages, rather than for translating from French to German and from German to English and so on.

Paolo: Yeah. Of course, software is a language.

Alexander: Yeah. That’s pretty cool. Now, if I come across code, for example code from a homepage, from a friend, from a colleague, from an expert, can Copilot help me understand this code and what it does?

Paolo: Yes. And that’s the nice thing [00:12:00] about Copilot, also the application integrated in Windows, for example. Once I was reviewing SAS code. And you know that in production you have someone using one kind of code and another programmer using another piece of code, and maybe they’re using different parts of the language for doing the same task.

And maybe you’re not super familiar with DO loops, because you used other ways of working with data steps. You can pick up the code, pass it to Copilot, and ask Copilot for an explanation: how does it work? And you can even try to fix it if it doesn’t work.

Of course, this is more challenging. And, I mean, [00:13:00] it’s not always accurate. But again, if you are a fantastic programmer in this specific language, maybe you need it for speeding up repetitive tasks, and maybe it’s not 100 percent useful for fixing issues. But if you are a beginner in a language, then you can learn a lot and use it a lot in your work.

Alexander: Yeah. I love this idea of getting started faster. You can create a draft program and then, based on your knowledge, fine-tune it and improve it step by step over time. And of course for learners, it’s a great opportunity [00:14:00] to understand code from others. It’s a great opportunity to get started faster.

And as you said, for example with ggplot, understanding what you actually need to put in for a box plot or a scatterplot or some other plot: it is much easier to let Copilot find these different things than for you to search for them. So there are a lot of things that make that really nice and easy.

Paolo: Yeah. And speaking of learning, I read a paper on using GitHub Copilot in R for teaching statistics. I mean, it’s tricky, but you have a classroom of students, maybe they are [00:15:00] studying business or psychology, and they have to learn a bit of statistics. You can install R, RStudio, and GitHub Copilot and start teaching statistics directly with Copilot, without going through all the steps of explaining how the syntax works and stuff like that.

Because in one hour you are able to give the students an experience where they can interact with the data and get their first results. Of course, that is tricky, because for some students it’s like Copilot is the programming language, R, and RStudio.

Alexander: Yeah, I think it’s great to get started. It’s great to learn a lot with it. And you still can’t switch your brain off. You still need to learn about these kinds of things. But I think it can greatly enhance how you do your day-to-day job. Of [00:16:00] course, there is one thing that you need to be aware of: when you’re working in a big organization, be careful how you use it.

Lots of these tools are basically public, so everything you do in there is basically public. That’s why more and more companies move to large language models that sit within their own walls, AI security walls, so to say. And if you have access to that, then this is really great.

However, for just for learning, for experimenting, for understanding how all kinds of different things work, I think this is a great, great tool. 

Paolo: Yeah. And we are just at the beginning, and these tools are here to stay, so

Alexander: Absolutely. We’ll see. Thanks so much, Paolo, for this awesome discussion about Copilot.

And stay tuned. We’ll have [00:17:00] much more about AI and machine learning and these kinds of things in the future. Thanks so much.

Paolo: Thanks, bye.

Join The Effective Statistician LinkedIn group

I want to help the community of statisticians, data scientists, programmers and other quantitative scientists to be more influential, innovative, and effective. I believe that as a community we can help our research, our regulatory and payer systems, and ultimately physicians and patients make better decisions based on better evidence.

I work to achieve a future in which everyone can access the right evidence in the right format at the right time to make sound decisions.

When my kids are sick, I want to have good evidence to discuss with the physician about the different therapy choices.

When my mother is sick, I want her to be able to understand the evidence.

When I get sick, I want to find evidence that I can trust and that helps me to have meaningful discussions with my healthcare professionals.

I want to live in a world where the media reports correctly about medical evidence and in which society distinguishes between fake evidence and real evidence.

Let’s work together to achieve this.