
Transformation in Trials
A podcast about the transformations in clinical trials. As life science companies are pressured to deliver novel drugs faster, data, processes, applications, roles and change itself are changing. We speak to people in the industry who experience these transformations up close and make sense of how the pressure can become a catalyst for transformation.
Data Governance in the Age of Generative AI with Sebastian Andruszczak
Sebastian Andruszczak, Chief Growth Officer at Holisticon (part of the Nexer Group), brings fresh perspective to the conversation around generative AI adoption in pharmaceutical companies. Drawing from his unique background in sales, marketing, and technology, Sebastian cuts through the hype to address fundamental challenges that prevent successful AI implementation in life sciences.
The conversation reveals a critical insight often overlooked in the rush to adopt new technology: without proper data governance, organizations risk "scaling disaster." Sebastian walks us through the specific challenges pharmaceutical companies face - from ever-changing data sources creating inconsistent AI responses, to linguistic variations causing subtle differences in interpretation across global teams. These issues become magnified when implementing AI at enterprise scale, potentially undermining the very efficiencies these technologies promise.
Sebastian outlines a methodical, four-part approach to building effective AI systems: starting with data governance fundamentals, then data engineering, followed by traditional machine learning, and finally generative AI applications. This structured approach has proven successful for companies like Boehringer Ingelheim, whose ambitious knowledge management system has already saved one million working hours by processing over 800,000 documents across 200 departments in 70+ countries.
What makes this episode particularly valuable is Sebastian's candid assessment of the industry's current state. While acknowledging the transformative potential of strategic AI initiatives in drug discovery and knowledge management, he challenges the fear-driven implementation happening in many organizations. His "magic wand" wish for the industry reflects this pragmatism: focus first on data quality - a decades-old challenge that finally has the perfect justification for investment.
Whether you're directly involved in pharmaceutical technology implementation or interested in how AI adoption affects life sciences advancement, this episode provides practical insights on building systems that deliver genuine value rather than just following technological trends. Connect with Sebastian on LinkedIn to continue the conversation.
________
Reach out to Ivanna Rosendal
Join the conversation on our LinkedIn page
Welcome to another episode of Transformation in Trials. I'm your host, Ivanna Rosendal. In this podcast, we explore how clinical trials are currently transforming so we can identify trends that can be further accelerated. We want to ensure that no patient has to wait for treatment and that we get drugs to them as quickly as possible. Welcome to another episode of Transformation in Trials. Today in the studio with me I have Sebastian Andruszczak. Hello, Sebastian.
Speaker 2:Hello, Ivanna. Yep, my name is Sebastian Andruszczak and I am the Chief Growth Officer at Holisticon. I'm happy to be here.
Speaker 1:I am very excited to be here with you, and today's episode is new in multiple ways. One, we're trying out a new question today. Two, this is actually the first time I'm recording with a new technology. And three, this is the first time that the guest, Sebastian, actually prepared the questions instead of me preparing them. So I feel very fortunate to be working with someone who is as eager to be here as I am. Thank you for that, Sebastian.
Speaker 2:Thank you very much for this opportunity as well.
Speaker 1:Well, I would like to hear more about what you're currently up to at both Holisticon and Nexer. Tell me more about what occupies your mind right now.
Speaker 2:Okay, so, starting from the company I work for, Holisticon Connect, which is part of Nexer. And maybe starting with Nexer: Nexer is a group of software development companies. It was founded in Sweden in 1992 and currently has more than 2,200 people. Holisticon Connect is one part of Nexer, which, by the way, is part of the Danir Group that employs more than 11,000 people across the world.
Speaker 2:As for what we do, we are a software development company that focuses mostly on offerings around cloud, data and engineering services, and by engineering I mean mostly embedded, connectivity and IoT. We are working with clients from the Nordics and the UK, and we are also considering opening a few new markets, taking into account the current geopolitical and economic situation. As for industries, we mostly work with clients from pharma, biotech, telecoms and automotive. My role connects three areas: sales, marketing and the offering. I really like to think about it as making sure that we are reaching the right people via the right channels with the right value proposition. So I'm not really thinking about hard selling; I'm rather thinking about where the gaps are that we could fill to make sure that our clients are happy and can build their competitive edge.
Speaker 1:That sounds good. You and I met over a LinkedIn correspondence, which is the best way to meet other interesting people, and one of the things we felt we connected on was data governance and how generative AI doesn't really solve the issues of data governance. But before we get into the meat of this episode: if you were to explain it to a 17-year-old still in high school, how would you explain what data governance actually is?
Speaker 2:Yeah, 17 years old is a rather old person.
Speaker 1:But do they know what data governance is?
Speaker 2:Yeah, I would say that it is a set of rules and processes to ensure that data is accurate, safe and easy to find. Generally speaking, it defines specific roles for working with data, such as data manager, data steward and some kind of user, and it creates rules about who is responsible for making sure that the data is accurate. Sometimes, especially in big organizations, you have 47 sources for the company name, and then you want to build some report, like a source calculator, and if no one is responsible, you run into relatively small but very disturbing issues. So data governance is mainly the set of rules and processes to make sure that won't happen.
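A stewardship rule of the kind described here can be sketched in a few lines of Python. This is only an illustration: the source systems and field values below are invented, and a real master-data check would run against actual systems of record.

```python
# Illustrative data-governance consistency rule: flag any field whose value
# differs across source systems, so the responsible data steward can fix it.

def find_conflicts(records_by_source):
    """Map each conflicting field to its per-source values."""
    conflicts = {}
    fields = {f for rec in records_by_source.values() for f in rec}
    for field in fields:
        values = {src: rec[field]
                  for src, rec in records_by_source.items() if field in rec}
        if len(set(values.values())) > 1:  # more than one distinct value
            conflicts[field] = values
    return conflicts

# Hypothetical example: the same company spelled three ways across systems.
sources = {
    "CRM": {"company_name": "Holisticon Connect", "country": "PL"},
    "ERP": {"company_name": "Holisticon Connect Sp. z o.o.", "country": "PL"},
    "HR":  {"company_name": "Holisticon", "country": "PL"},
}
print(find_conflicts(sources))  # only company_name is flagged
```

Without an owner for the company-name field, every report picks its own spelling; the governance process decides who resolves the flagged conflict.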
Speaker 1:Yeah, that's a good way of explaining it. I have a relatively fresh example where we were trying to define country names, and this should be a simple task, right? Italy is Italy, or is it? Are we writing the country name in the country's native language? Are we just abbreviating it to the letters on the license plate? Are we using a number as a code, like the prefix a phone number always has so you can see which country it's from?
Speaker 1:Can we use that. So when we're just trying to define, like, how do we define a list of countries, that is the same, all these questions. So apparently it's not that simple to define what data actually is.
Speaker 2:Yeah, this is true. Probably seven years ago I learned that Germany in Spanish is apparently not Germany, it's Alemania.
Speaker 1:I would never conclude it by my own.
Speaker 2:So, yeah, it's not as simple, but it's definitely doable.
Speaker 1:Yeah, great. Well, hopefully the 17-year-olds amongst our audience are a little bit more educated now. Good. Well, the reason why we're talking about data governance is that when we talk about generative AI, we often encounter problems that are really old problems, data governance problems that just get translated into the generative AI landscape. But tell me more about your recent thoughts on generative AI adoption in pharma and biotech.
Speaker 2:Yeah, okay. So, to give some background: taking into account my role and my vision of having the best possible value proposition for the right people, I started talking with representatives from pharmaceutical companies some time ago in order to validate where they are with the adoption of generative AI. There are two general approaches. The first is the rather pragmatic one: yeah, let's have a generative AI-based chat and let's do whatever we can with it, like translate, summarize and so on. This is relatively simple, but it doesn't really offer a huge impact on the business. Yes, you will read emails a little quicker, but if you really want very accurate communication, it's not that much quicker, in my personal opinion. And then you have more top-down, strategic kinds of initiatives, and those include, for example, knowledge management and learning systems (I will give the example of Boehringer Ingelheim and what they have done over the years) and drug discovery. Those are much more ambitious and impactful ways to use and adopt generative AI in a pharmaceutical business. But there are some challenges, and the challenges are related to a few facts, I would say. The first one is related to the fact that you have ever-changing data sources and ever-changing needs. But sorry, just to conclude my previous thought: what I'm trying to avoid when working on this offering is going to big pharmaceutical clients and saying, yeah, we can connect your SharePoint with an LLM. I'm sorry, but I can't bear it. Companies are actually doing it, and I have a lack of understanding for this, because solution providers such as AWS and Microsoft have built their solutions in such a way that companies can connect those data sources on their own. From my perspective, there is no value proposition in that. That's why I'm starting by talking with people.
Speaker 2:Then, coming back to the challenges, there are a few. The first one is related to ever-changing inputs. At the end of the day, in most cases you are connecting some data sources to an LLM, and those data sources hold different data over time: the data is different on day one and at the end of the year. On top of this, you have ever-changing needs for the output: people ask different questions and have different perspectives, and the perspective changes over time. If you add one to the other, in some cases you can't really rely on the answers, because the answers will be different. Sometimes that's fine: if you ask what the market share for the Nordic countries was in the first quarter of 2024, and then you ask the same question for the current quarter, this variance between the answers is right.
Speaker 2:But then, if you want to know what the core principles of our approach to strategic clients are (a very cheesy example, I would say, but just as an example), then you would expect the answers to be exactly the same.
Speaker 2:And even if you are not influencing the LLM by providing new definitions of the principles, still, if you add new presentations to your PowerPoint and they don't talk about the principles in a straightforward way but somewhere around them, they can be taken into consideration. So in the first quarter and the third quarter you will get different answers, and that's not really what is expected. So this is the first thing. Then we have an issue, I would say, related to the language in which you ask the question. You can have exactly the same sources, but if you have a 100,000-person company and some people ask a question in Chinese and some ask in Polish or in English, then, given the different semantics of those languages, you will get slightly different answers. And this is actually interesting, because on the one hand it could simply be accepted: at the end of the day, if those people in China, Poland and England read the same content manually, they still understand and process the information in slightly different ways.
Speaker 1:Yeah, it's the structure of languages.
Speaker 2:Yeah, but when we implement things in the IT world, we are used to thinking about standardization, standards, consistency, accuracy. So it could be acceptable, but it's not really what is expected.
Speaker 2:So this is, let's say, I don't want to call it a minor issue, but it is another issue. And then the third issue is related to a particular LLM's flavor, and by flavor I mean that if you take different LLMs, they are all trained on different data sets and they all have different sets of censorship. Based on my personal perspective (and I have confirmed this with some representatives from pharmaceutical companies in the enterprise segment), it's a little like imagining that you have 100 people: 97 of them are from your company and three are from OpenAI. On a strategic level, you don't really want that. It's not an intentional decision, but that's how it is: with the same data sets and three different LLMs, you will get slightly different answers. And if you take into account those 100,000 employees, at the end of the day small variances and inaccuracies will create a huge impact. I think this is the butterfly effect, yeah.
Speaker 1:I see. And it's also a new problem that arises with this specific technology, a different kind of data governance problem than we had before. It is still data governance, but now the information is ever-changing, and so are the answers to the questions we ask, even the ones about the past.
Speaker 2:Yeah, this is a very valid observation. But another observation that makes this scenario and setup a little different than before is the fact that with generative AI you can scale things (I'm not really sure if exponentially, but very quickly), and you don't really want to scale up a disaster, and it's very easy to make that happen. So this is another risk that wasn't really as relevant before. So, yeah, those are the challenges.
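The drift described above (the same question yielding different answers because the corpus changed underneath) can be made concrete with a toy sketch. This is not a real system: retrieval here is a naive word-overlap score standing in for a vector search, and all documents are invented.

```python
# Toy sketch of "answer drift": the same question asked against two snapshots
# of a corpus can retrieve different context, and hence produce a different answer.

def retrieve(question, corpus, k=1):
    """Rank documents by naive word overlap with the question (a stand-in
    for a real embedding search) and return the top k."""
    q = set(question.lower().split())
    ranked = sorted(corpus,
                    key=lambda d: len(q & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def drift(question, snapshot_a, snapshot_b):
    """True if the top retrieved context differs between the snapshots."""
    return retrieve(question, snapshot_a) != retrieve(question, snapshot_b)

question = "what are our core principles for strategic clients"
january = [
    "Our core principles for strategic clients are trust and transparency.",
    "Q1 market share for the Nordics was 12 percent.",
]
# Later, a new deck mentions the principles only "somewhere around" the topic.
july = january + [
    "Team notes: what are our core principles for strategic clients after the new strategy deck",
]

print(drift(question, january, july))  # -> True: the top document changed
```

Nothing in the principles themselves changed between the snapshots; only an extra document was added, yet the retrieved context, and so the answer, shifted.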
Speaker 1:I love that image: you can very quickly scale up disaster. And not only is the ancient problem of data governance more relevant, so is the ancient problem of the different structures of language and how we understand things. Do the models that we have account for the different understandings across languages, and do they highlight that this understanding is actually different in China compared to England?
Speaker 2:Yeah, and I wouldn't really like to go that deep, but just as a highlight: based on social psychology, it works both ways. Language impacts our perception, and perception impacts our language. You probably remember the example that in Iceland people have a huge number of adjectives to describe snow. So if you use that illustration and think about the potential impact of understanding things from totally different perspectives, it could be huge. If you have a meeting for all employees at the end of the year and they all use the same tool, it could turn out that they are talking about totally different things.
Speaker 1:Yeah, yeah.
Speaker 2:So yeah.
Speaker 1:That is very interesting. I feel like I should get like a language professor in the studio to talk about this more. That would be very interesting, but that is a topic for another day.
Speaker 2:Yeah.
Speaker 1:And Sebastian, I would just say that I am fascinated by your approach to designing this offering, that you're trying to understand what will actually make an impact and what is the structure of this new technology and what are the needs of life sciences, before you kind of just jump in and say here's the new thing. I think that's a really cool way to go about building a product.
Speaker 2:Thank you very much.
Speaker 1:I would be curious to talk more about solutions then, because these are the problems we have. What can we take from things that we know work in the past for data problems, and what is potentially something where we need to think new thoughts to solve some of the problems?
Speaker 2:Okay, thank you very much for this question. The offering that I'm working on at this moment consists of four, let's say, sub-offerings (not parts, rather sub-offerings): data governance, data engineering, something that I call deep AI/ML, and then generative AI. And it's built in a way that follows the right order, I believe. In order to deal with generative AI, you need to ensure that your data is right, that you have the right rules, and that the right people have the right responsibilities: that is data governance. Then comes data engineering: given that you have so many different data sources, it's not like all people need all the data, so you need to create data pipelines and ultimately end up with data sets that are accessible to the right departments and the right teams for the right, described purposes. Then we have deep AI/ML. Maybe I should say old-school AI, but also very serious AI: implementing machine learning algorithms, different kinds of regression, to predict the future based on a limited, let's say, amount of information (I'm not sure if limited is the right word here, but you know what I mean). And then we can go to the generative AI-based solutions, and here I also have an idea that addresses the challenges we were discussing before: data validation AI agents, so something that still works in the area of making the data right. So, starting from data governance: I'm not very sure how I should start, since I have already mentioned what it is.
Speaker 2:The general concept of our offering in particular is related to the fact that, based on my current conversations (not all of them, but the majority), I have found that sometimes triggering such strategic initiatives is easier from the external world than from the internal world. Inside huge organizations, whether you want it this way or not, there is so much politics, and so many agreements and other things that make working on data quality very difficult, especially since for some stakeholders and users the results, the effects and the impact are not noticeable immediately. It's more like building the basics, and then in two or three quarters we can start building the generative phase. So there is no instant reward, and sometimes that makes it difficult. That's why I have used the opportunities I have from working within Nexer: I started a strategic conversation with two other business areas of Nexer, Nexer Data Management and Nexer Enterprise Applications UK, and we began discussing how this offering should look. So data governance is definitely the first step. Then we have data engineering, but I think I have described it already: it's about making sure that the right people, tools and processes have the right data sets to work with, and that these data sets are accurate, consistent and up to date. Then we have predictions and so on.
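The "old-school but very serious AI" step mentioned above, regression to predict a number from history, can be as small as an ordinary least-squares line fit. The quarterly figures below are invented for illustration.

```python
# Plain ordinary-least-squares line fit, y = a*x + b, in pure Python.
def fit_line(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    return a, my - a * mx  # slope, intercept

quarters = [1, 2, 3, 4, 5, 6]
sales = [10.0, 11.8, 14.1, 15.9, 18.2, 20.0]  # hypothetical figures
a, b = fit_line(quarters, sales)
print(round(a * 7 + b, 1))  # forecast for quarter 7 -> 22.1
```

The point of putting this step before generative AI is that a forecast like this is only as good as the engineered data sets feeding it.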
Speaker 2:But I would like to say a few more words about those data validation AI agents, because it goes back to one of the challenges related to data consistency: you ask a question at different moments in time and you get different answers. Theoretically, you could have a, let's say, book of principles and check the output against it. You could ask just a few questions, like: is the output consistent with the items in the book of principles? If not, flag it. You could do this manually, but that would defeat the point of the whole generative AI tool, because it wouldn't be question and answer anymore; it would be: ask a question, wait for the data validation team to check the answer, and get it back by the end of the day. So that would be the first step, the first concept. But you could also truly automate it: you could implement data validation done by AI agents inside the process of your solution, so that answers are validated either semi-automatically or automatically. I think this is something that we would like to help our clients develop; I have huge trust in this direction. And I also promised to say a few words about Boehringer Ingelheim. I'm looking through my notes so I don't miss anything. Yeah, I have it.
Speaker 2:Boehringer Ingelheim is actually doing something very similar. They are planning to complete the project by the end of 2026. It started in 2020, so before generative AI, as a data digestion and comprehension kind of tool. Just to give a few interesting highlights: it processes, digests and comprehends 800,000 external documents and 25,000 internal documents, and it scales to more than 200 departments in more than 70 countries. Until now, they have saved 1 million hours. Holy moly, yeah. You know, I was speaking with many different companies, and Boehringer Ingelheim, at least based on my current knowledge (obviously, companies can't really share everything), is the furthest along when it comes to the development of such tools. Some companies are trying or considering it, and a huge bulk of them haven't really got top-stakeholder buy-in for it; people are not really convinced that it makes sense. So I think our role in this process is to convince people, to show them the real impact and the potential output, and to help them develop such tools. So this is about solutions.
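The data-validation agent idea described above can be sketched as a gate that checks each generated answer before it reaches the user. In a real pipeline the check itself might be another LLM call; here a simple substring test stands in for it, and the "book of principles" contents are invented.

```python
# Sketch of an automated validation gate for LLM output: every answer is
# checked against a "book of principles" before being shown to the user.
# A substring check stands in for what would really be an LLM-based judge.

BOOK_OF_PRINCIPLES = ["trust", "transparency", "long-term partnership"]

def validate_answer(answer, principles=BOOK_OF_PRINCIPLES):
    """Return (ok, missing): ok only if every principle is mentioned."""
    text = answer.lower()
    missing = [p for p in principles if p not in text]
    return not missing, missing

ok, missing = validate_answer(
    "Our approach to strategic clients rests on trust and transparency."
)
print(ok, missing)  # -> False ['long-term partnership']
```

In the semi-automatic variant, a failed check routes the answer to a human reviewer; in the automatic variant, the system regenerates or annotates the answer itself.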
Speaker 1:Yeah, and I think it's helpful that you say: this is the direction that we believe will work. Just having some idea of what it takes to actually get there is very helpful to the rest of the industry, who are trying to figure out which path to take.
Speaker 2:Yeah, thank you very much for those words. It makes me even more optimistic when I think about this. Personally, I don't have a doubt that this is the right direction, but then, yeah, I just need to validate it with my potential clients.
Speaker 1:Well, Sebastian, I am curious: how did you end up in this space in the first place? Life sciences, AI, building products. What happened in your career?
Speaker 2:I wouldn't really like to disappoint you, especially when it comes to life sciences, because, to be perfectly honest, it's an interesting but at the same time difficult question. You know, I'm coming from the sales and marketing world. At the same time, I just like technology, and not in the sense of liking to use Canva. Sometimes in interviews people tell me, yeah, I'm very interested in technology. So I ask, okay, what technology are you interested in? Oh, I'm a heavy user. And they give examples like Canva or Trello. So no, I'm interested in technology in a different way.
Speaker 2:I have implemented a few CRMs in the past for rather small and medium companies, and I learned it on my own. I know how to build websites; I have built probably more than 100 of them. Over the last years I have completed a few interesting certificates from GCP and from Mendix, which is an enterprise-grade low-code solution, and I'm trying to stay on top of this. So at some point I decided that I don't want to work with companies that are providing solutions; I want to work for companies that are building them.
Speaker 1:Yes, an important distinction.
Speaker 2:You know, the reason is that I think technology gives us so many opportunities and areas where we could do something really spectacular that I just didn't want to be limited. That's why I decided to move my career from solution providers, let's say, to software development companies. And when I started to work at Holisticon, life science, meaning biotechnology and pharma, was one of our target markets, and naturally I'm a rather involved employee. So, in order to be able to conclude all the things that I have shared today, I just needed to go a few extra miles and understand the specific processes and general business principles of those segments. So this is my general connection to the business at this moment.
Speaker 1:That makes sense. And what do you think happens from here when we think about AI? Where is it going to go? And maybe just let's keep it pretty short term, because a lot is happening quickly. Where do you think we're going?
Speaker 2:That's a very good question. I think I will refer to what I said previously: it goes in parallel. On the one hand, we have those low-effort, low-impact actions, and I believe they will be supported by solution providers. As of now, it's not only about LLMs per se and AWS, Google, Microsoft, OpenAI and so on; it's also about generative AI-based solutions that are already embedded in bigger solutions and ecosystems such as SAP, IBM, Salesforce, and even low-code ones like Mendix. So I think this is going to be one direction, and initiatives here are going to be triggered rather bottom-up. It can provide some interesting results, but I don't really feel that it can be very impactful and spectacular. Then we have those huge initiatives, and those, I believe, are still going to be mostly drug discovery.
Speaker 2:Then there's the other thing, about knowledge digestion and comprehension. I think it's also valid, and the comprehension part is really important, because five or seven years ago we were like: let's have our data clean, let's have a lot of it, then let's run some regression and predict some numbers, and that's what companies were aiming for. Now I think it's also about deep understanding of the data, especially unstructured data. So I wouldn't really expect some spectacular development per se; I think people will just try to do as much as they can in this area. The technology is going to get better, but at the same time the hype, I hope, is going to decrease a little, because the topic is overhyped. Yes, agreed, I understand this, but I don't really like it.
Speaker 1:It's counterproductive that the hype is so high. It creates unnecessary tension and pressure without leading to better solutions.
Speaker 2:That's a very good point. There was actually some research (I don't remember who provided the data) about the reasons for implementing generative AI in huge organizations, and apparently a huge percentage of respondents said they are doing it because of fear of missing out. I don't believe that such initiatives should be triggered by that kind of reason. Again, I understand, but I don't agree.
Speaker 1:Yeah, I completely agree with your disagreement on this. Well, Sebastian, as we start rounding off, we always ask our guests the same question. I'm especially curious about your answer because you've had a different entrance into the life sciences space. You have the luxury of somewhat seeing it from the outside and not just being brought up through the life sciences. If I gave you the transformation in trials magic wand that can change one thing in the life sciences industry, what would you wish to change?
Speaker 2:Very easy, to be fair: the general approach to data quality. And, now I will be serious, I understand that I see only a small part of the whole environment, so you should probably treat this as a, let's say, half joke or 30% joke. I understand that there could be things beyond my comprehension that are much more crucial and important, but sorry, I can't talk about something if I don't understand it. But if we look at life science, and not only life science, and our current technological landscape, I believe that this is a perfect moment, a perfect opportunity, to solve issues that we have been talking about for two decades. Everyone has heard garbage in, garbage out. Organizations keep talking about low quality of data, poor quality, inaccuracies and so on, and, to refer back to what I said before, no one wants to scale disaster.
Speaker 2:So I think generative AI now gives us an additional benefit: we have a great justification to actually run those initiatives focused on getting the data quality right. And on the other hand, you know, I said that I can't see the whole picture, but at the same time, based on the set of values that I believe in: if we want to take the right decisions, data are crucial. So I think it's not really an overshoot. And for these data quality related projects, I have also put two other keywords in my notes: ontology and knowledge graphs. I think this is something that should be taken into account. So, yeah, I would change the general approach, so that C-level stakeholders would say: yeah, let's start to work on our data quality today, it's a great idea, Sebastian.
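The ontology and knowledge-graph keywords mentioned above can be illustrated with a toy triple store: facts kept as subject-predicate-object triples plus a wildcard query. All entity names here are invented; a real system would use an RDF store or graph database.

```python
# Toy knowledge graph: facts as (subject, predicate, object) triples.
triples = {
    ("drug_x", "treats", "condition_y"),
    ("drug_x", "developed_by", "pharma_co"),
    ("condition_y", "subtype_of", "condition_z"),
}

def query(s=None, p=None, o=None):
    """Return matching triples, sorted; None acts as a wildcard."""
    return sorted(t for t in triples
                  if (s is None or t[0] == s)
                  and (p is None or t[1] == p)
                  and (o is None or t[2] == o))

print(query(s="drug_x"))  # every fact held about drug_x
```

The appeal for data quality is that each fact lives in exactly one place with explicit relationships, instead of being restated slightly differently across 47 documents.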
Speaker 1:That would be great. And also, if people do want to reach out to you and talk more about helping build your product or more about knowledge graphs, where can they find you, Sebastian?
Speaker 2:The easiest way would be on LinkedIn. My last name is rather difficult to pronounce.
Speaker 1:We'll put it in the show notes.
Speaker 2:So yeah, that would probably be the perfect first step, and from there we would move the conversation to our company emails, arrange a meeting and have a great conversation.
Speaker 1:That sounds great, Sebastian. This has been an absolute pleasure. Thank you so much for joining me today.
Speaker 2:Thank you so much for this opportunity. I really appreciate it.
Speaker 1:You're listening to Transformation in Trials. Remember to subscribe and get the episodes hot off the editor.