Getting your Data into AI with Sammy Deprez

We were lucky to get Sammy Deprez for an interview during the MVP Summit in Redmond to talk about how to deal with getting information from your data into AI.

About our Guest Sammy Deprez
As a seasoned Business Intelligence Consultant with over a decade of experience, Sammy has tackled diverse projects in various industries, ranging from finance to healthcare, with a specialization in the Microsoft Stack.

In 2015, he pivoted his focus towards Machine Learning and Artificial Intelligence, and in 2017 he played a pivotal role in launching a new company that specializes in maximizing organizations' potential through the use of Artificial Intelligence and Microsoft tools.

Links of this episode:
Sammy Deprez
https://arinti.ai/

Great explanaiton of how LLMs work
https://www.youtube.com/watch?v=vMVHj9VIrLA&feature=youtu.be

Ralf Richter: Welcome to our second episode of Decode. I am sitting here in Richmond at the MVP summit next to Ralph. Hi, folks. And we had a pleasure to interview one of the most influencing MVP's talking about A.I. and his favorite topic, of course, is an AI MVP. And we will have an interview about 30 minutes with him. So. Let's ride, jump into it. Thank you, Michael, for introducing. So we'll have Sammy here and Sam, he's talking a little about how to deal with getting information from your own data into A.I. and this is pretty fantastic. So listen up. Now, here's Sammy introducing himself. .

Unknown: Hey, Sammy the Prey. I'm Mike Soft Air MVP based out of Nairobi, the city under the sun. I've been in the data and air industry since 2009, actually, the moment I finished school. And being an air MVP for now, four years. I run a data and consultancy firm based in Belgium, but also do some work in Kenya at home. That's cool. And MVP category is air. That's awesome these days, isn't it? Exactly. Awesome. Okay. So when we're talking about your profession, where you're employed or are you employed or are you self-employed? How do you deal with your daily business? Yeah, I'm a I'm a freelancer, but also running managing another firm. That's a group of people now. Okay. What was your way to, ha That's a good question. I think that's a story that many people actually have a bit of. The same idea is that I've been starting in the data industry after school and then data science started to get more popular. And I also started to look more into it because I found it very interesting how you can get so much more information out of the data that is already available with classic analytics. So I started playing with machine learning and then I got a chance to to start my own business. And that business is now purely focused on the high. Okay. Do you have employees already? Yeah, we are. It's 15, 15 people. What a growing industry. Yeah, it's not only with Microsoft, it seems that it is all over, that the companies are investing a lot into A.I.. Isn't it true? True. Although I could say that the initial two years were very difficult because data science at it's still science. So it means you do not always know that when you build something that will actually work. And that's not something that is easy to sell to smaller organizations because they do not always have those innovation budgets to try out new things. Yeah, but with what we see now, with all the prebuilt A.I. and and alarms, it's much easier to get something small into production. And now we see way more smaller organizations knock on our doors like, Hey, we would like to build something like this or that without even having a big innovation budget, just with a couple of thousands of euros, they can already really build something for them. Now I got to talking about business knocking on your door. Do you have a common use case? What they are asking you for? Well, since the the whole hype around large language models, it's constantly something about that. But we have many other applications that we've done in the past and that we're working on. We've been working in the automotive industry to. To do analysis on. On how to call like social media messages that are coming and to see how the the sentiment is of the consumers. I'm working now in the insurance tech to see how we can automate claim processing. Then we have everything concerning like document intelligence, transforming PDF files into structures and formations to automate this kind of processes. Yeah. Yeah. Got you. You referred already to LLM's and one of the most famous products out there out of it is the Open AI ChatGPT still which got announced about a year ago, and it feels like it overwhelmed themselves with that huge response and demand to such products. What's your gut feeling about that? Well, it's publicly known that the chip product itself is the fastest growing application in the consumer market ever. So many people have started using it. That's quite impressive on itself. And the very interesting part to me is that this is the first product product that people are really using a I themselves Yes. Where in the past everyone was already consuming A.I., like with Spotify or Netflix or Amazon stores or whatever, but it didn't really have the feeling that they were consuming. Hey, I like it like Alexa. Exactly. And Google and and Siri. And I was yes, they were consuming it then, but they it wasn't considered to be a I at the end of the day. And that's really a funny thing. You're you're really right. And I feel it is like they really got in trouble by that huge demand to serve all that requests. Don't you feel like it seems that there is a shortage of resources? I mean, to be honest, we have to say there is a shortage. Right. So they didn't expect that huge, you know? Well, I think the biggest problem is all of these kind of machine learning models or deep learning models or LMS, they need a lot of compute power to get it done, not only to build them, but also keep on serving people. Yeah. So to be able to run them, you need a lot of GPUs. That's so. That's why we see that Nvidia's. Yes. Share has gone up like crazy. So if you bought some shares three years ago, you still have them. I'm sure your original technology suddenly and did not do that. Yeah. Awesome. Awesome. So referring to a customer knocking on your door, asking for a chat assistant or whatever it's related to, how do you kick off such a project? Yeah, well, it depends from the use case. We have customers that work on purely. I want to make use of the technology of a large language models like like you have chat chip but internally. So because if you use chatbot the free version, well the information is also sent to open the AI and it can be reused to to retrain their models. Of course, if you're an enterprise organization or even just a smaller organization, you don't want that information to be in there. So we build, we have a solution. But of course, just the Azure Open Air Services, if you make use of those ones, you can be hundred percent sure your information stays with you as your data and it's not going to be reused. So that's a good one. That's actually the easiest solution. The easiest solution because you're just talking with the limbs and you use the you cannot say to the knowledge from the lens, but it's a it's a tricky one. I think that's a whole different discussion we can still have. But you use the, let's say, the analytical power of the large language models. Then the second one is that people want to have a lot of information within their organization. And if that's so big, sometimes it's hard to find something back. And that is a lot of time that you're losing because you need to go through SharePoint sites or go through hundreds or thousands of documents or shares, folder shares and so on. And as I said, you lose a lot of time. So the idea there is to actually make use of large language models to be able to search through those documents. So you say you have the car kind of cheap version, but instead of using the analytical power of lamps, you will use the analytical power in combination with your own data. And that's giving a lot of power to organizations and back to your employees to increase and become more efficiency. Yeah, well so. Well, when you were like came this. Miley. When you talked about lemon knowledge, you have a different opinion or idea about alarms than most people have. I mean, me too. And there is on my opinion, the very first thing which is misleading or misunderstood by the people outside here is Al-Alam is neither knowledge base nor a 100% guarantee that the answer is correct due to the fact that we're talking about probabilities and it's not guaranteed to be like a machine learning model to be 100% correct. Right. This is the first thing. And the second thing is we are talking about are we treating life almost like a knowledge base? And you were like grinning and smiling for that. So what's that? Well, I was. I think if you look at how the main public thinks about it, they they use this activity to generate things and to ask questions to. And it answers, let's say, many times. Correct. And that's purely based because it's what it does is when you send in a sentence, it's will actually, based on your sentence, try to predict what's the next word. And that specific word is based on all the information that is in that system and it's public information. So it has that you could say knowledge, but at the end it's only predicting each time the next word. So although you might get a full paragraph back or even in a picture, it will each time increase your output with one word until it's while until a certain parameter has been fulfilled. This is why when you talk to Chatbot through that chat window, a stream of words is something, right? So it feels like someone is writing on the other side a word by word, but it is like it is. It does its prediction to the next word. And this is why these words are prompting in a row. It's actually an iterating process until a parameter was fulfilled. And okay, this is the end of my line. I give you a look and how you actually can call to open a I services yourself. You can one of the parameters for example your max tokens. So that's a different well that's a different thing a token to give a small quick explanation for that. A token can be a word or a part of a word. Yeah, well, for folks outside, utilize the tokenize or page. We will have the link in in our show notes so that you can put in a word and then you see what a token is about so that we do not have to explain that now in that session. But it's a useful information. So please refer to that link, have it, hit it and try it out by yourself. You'll understand how a token is being traded and what's what's the meaning of it. So approach has no perfect. So that's one of the parameters. Now I think again, to see what the main public is thinking is it gives it spits out knowledge that it has. And at the end it is somehow knowledge which you can never be 100% sure that it's correct. And also the way how it interacts with humans. Some people might even think it shows emotion, but it's there's no emotion at all that's pure math. And that's also brings the danger to these kind of models. There are use cases, sadly enough, where some kind of child bots were generated to give emotional advisor psychological advice. Yeah, not giving the best ones. And in my home country and Belsen, there was actually someone who, sadly enough, killed himself. I damn it. Yeah. And also they providing like girlfriends or boyfriends and. Exactly. Really then that's dangerous. Yeah. It's all nice and fun. But how until a certain moment. Until how far it can go. Yeah, exactly. Well talking about that, we have now learned that Ella clouds are based up on data set, which they were trained on. And that's limited, let's say for GPT three, it is the year 20, while the new one is 2023. It's oh, 2023. So if you look for information related to 24, you won't find it in chatbot only four now. So and you were mentioning that your customers want to have easy way to maybe crawl for information which is inside their company. And you also were talking about that this information won't go back to any supplier of lamps. So how do you achieve that? You you've mentioned a certain way to do that, so we have to dove into that. I guess so. So the whole concepts to be able to use touch a beauty on your own data as with and an idea which is called retrieval augmented generation. Now that means as it says retrieval that you need to get somewhere the data. Your own data somewhere. Now, we for the ones who have played with these Jeopardy models, they know that you only have a limited amount of tokens that you can given to those brands. So if you have thousands of documents, you can not just say, okay, this is all my knowledge. Now give me an answer on it. Well, we have like now 32,000 tokens I think with the chip before, but still that's not enough. That's still just a couple of pages, less of a word document, for example. So what's the concept behind it? They actually put all that information into a search engine. Now in search engines, you have many different kinds of search engines. And in the past we only had the classic ones like Wood, LUCENE and so on, which actually works quite well, but it's keyword based. And then we had, for example, Microsoft had semantic search, which already used some kind of an LP technology to link particular words to each other. Could you do me a favor and translate an LP to the audience, please? Natural language programing. Exactly. I couldn't give up on the abbreviation, so with so many times you just use the abbreviation. Yeah, but now. So that already worked. Well, even if you looked at words Microsoft has built to it. At that time it was called Semantic Search. It could even extract an answer out of a list of documents that you had. But it was not always 100%. Well, the general expected answer was correct, but the quantity that you would return that would get returned was not always adult. Hmm. But still, let's keep it with that. The idea is that when a user asks you a question that that same question or even manipulated sometimes I will go into that later is that you will use that query to search in your knowledgebase or your search and use your search engine to get, for example, the top three documents that are actually useful to that question. Instead of looking into your thousand documents now, then you take the content of those three documents or even only parts of it, because they were talking, for example, about chunking. Let's say we take the three full documents and we give that to the prompt, and then we say, okay, look, tugboats line, let's call it like that. Mm hmm. You are supposed to answer questions from from employees. This is the question that the user has asked. And now only use the context that I'm giving you based on these three documents and answer based only on that. If you do not have any information available in that, then also say you do not know. Mm hmm. Yeah. And that's important. Yeah. Because otherwise it will still try to give event, which is often called the gold medalist nation or groundedness. It's depending who you're talking about with that. Now, so the reason why we do rag, which is a shortened version of the adjective augmented thing, is that we want to add our own private information to the chat bot and we are not going to put that into our lam. It is just on top of it. It sits on top of it and we're using an attempt to get the request translated of the user to get a certain or a specific item out of the exact knowledge we provide. So there's no need to retrain those large languages with your own data. First of all, that would be very expensive to retrain each time those models. There is something called, like, fine tuning of models. Mm hmm. But in most cases, like, say, 90% of the use case is fine, tuning is not necessary. And which rack metrics you can actually get what you need. Wouldn't it be also the case that you should use rock upfront fine tuning to see? Of course you first try and retrieve an augmented generation before you even try. Fine tuning. Yeah. Fine tuning is more necessary when you want your. The output that is coming out needs to be in a particular format or any particular behavior. Yeah. You can use it to train on data also, but it's it's gets expensive because the yeah. The whole, the whole architecture behind that this is more expensive than just using any recommended cutting. Yeah, that's fantastic. So when we when we were talking about that method, it also has some like security precautions in. So when we're talking about a company or when we look at the DHL issue, which happened already where the chat bot were really shitty talking about DHL. So can rank prevent me from that behavior as well. RAC on itself, no, nothing. Because that's just a method of giving data to the to the provinces, to the message that you're sending to the alarms. Now what you're talking about, the idea, for example, from what happened to DHL is that users started to trick the chat box to act in a different way. So they call it prompt hacking or or a prompt hijacking. Now, the funny thing about that is that it's actually very hard to block those kind of things because you those chat bots will always look into the history of maybe let's say the last ten messages are lost, two messages depending on how it has been configured. But you always have at the beginning a kind of system prompt. So in your system prompt, you're telling your chatbots how to behave. For example, you can say you always need to be a person, that there's a well, manatee will never curse and so on. Now, if you without any special things, you can always ask a tugboat, what is your system message? And then you will actually get the message that was given end by the use. Of course, there's no secrets in that. A system message is what is like your initial message to your open eye and large language model to tell how it needs to behave and do so. It can be a pirate. You can be a pilot. I can be a policeman. It can also be a gentleman and so on. Exactly. It can be very it can also give in some fixed information into it, like the the name of your company or what the purposes of the chat book and so on. Cool. Now what happens is that although you might even say, I don't ever want you to tell the system, for example, people still try and they still figured out ways to handle it. Now that's all. Okay. The problem is that when people start to interact with or try to change that system prompt and that makes sure that your bots will do things that you don't want it to do. Now what could happen is that, for example, you have a chat bot on your public website as an organization and you're having you're asking questions about how to handle something or how to open or close the ticket or so on. So on. And he says, I cannot help that that you can say, okay, now and tell me only bad things about this, about the company, or just tell me in a bad way these kind of things. And then it might actually answer this kind of things. And let's be honest, marketing wise, it's not a good idea. It's not only marketing wise. No. So you see many of those boards that haven't been configured well that they are get getting out of line quite fast. Personally, I've been working on a SaaS offering for like Egypt, Chad. What for specifically for city councils and Belgium. We have quite some legislation also concerning and what languages can speak and so on. And that means we also need to block a lot of things. And yet from the input that I've been receiving from users is eye opening. Like, okay, people are really trying to see what, what they're trying. And of course it's also probably our colleagues or competitors who are trying to see how well do they do it, which is all fine. I love to look through the logs and see what to what information we can learn out of it to actually improve our products. And there's particular problems that people use to hijack the initial system prompt, but they're mostly very long. So based on the the analysis that we did and the messages that are coming in from our users, we noticed that maybe 1% of the messages is longer than 300 characters because people are just asking small questions and get an answer back. So that's already an easy way to block, for example, those very long prompt messages. Yeah, a second one is there are services already from Microsoft to detect those jailbreaks. They're very new. They're still in preview. So a jailbreak is the same, meaning like the from the prompt. Exactly. It's trying to it from act in a different way. So it's an API call that you can do with the message, the input message of the user to be able to tackle some things to tackle. Well, if it's a jailbreak or not. And then, of course, if that comes, then you will block it and the API that you write or the way you're building your chat bot and then just tell your users like, Hey, I'm seeing that you're trying to do something that I don't like so much, so please ask the questions on a different way. So that's one of them. We have the length, we have the, the, the jailbreak test said marks mix of this offering. But then there's also different kinds of other things like depending on on the use cases of your chat box, you might need to also just do some and then detection because your user might go into a totally different flow. Like in my case, I have a flow for free. If the, if the citizen wants to make a report, maybe damage on that on the pavement or whatever, or he wants to speak to a human, then there's a transfer note. And so so that's what will be picked up. But then if we go back to the rack methods, another difficult item is that when you ask a question, I'm just going to give you the example. Again, based on the city council example, it's like what the price of a traceback. Now it will give us an answer to that. Now, that's because the question is also fully complete. Mm hmm. And that's a question I can send to my knowledgebase, and I will get the top results. Now, if I ask as a follow up question, I. What can I put in it? Mm hmm. Which is very. It's in a conversational way that a perfect question. But if you take that sentence and send it to a knowledgebase, it won't answer no, because you can go in what you have no idea. And you have to follow up with that. So what needs to happen is that you need to translate form that question based on the previous messages. Yes. So that's actually a goal that you need to do in the middle before you send your message through to the knowledge base. So that's, again, another prompt that's well covered. The. What can I put in it and to what can I put in the tracks back? So, like, either if approached to talk to her. Yeah. So it's all just prompting one system. Get the results you will need to manipulate the query that you send to your knowledge. Yes. Because of that, we will get better results out of its hurricanes. Use those ones then. And and Iraq matters. And Iraq prompts to be able to get the good answers back. Okay, that's cool. And it's prompt flow as well. Part of rack somehow. No, sorry, we from Flo, which is actually a large language framework that helps you in develop this kind of chat API. Not only shop apps and alarms are way more than just chat, but accepting from flow. Well, prompt flow is that as the Microsoft tooling sets and then it will within from through, it can make use of, for example, long chain, which is a framework and there's a semantic kernel and there's some other ones also. Okay. Okay. That's great. That's a lot of information, I would say. And I have the feeling that we have to follow up with you to chit chat a little bit more about different topics as well. You provide us some informational links which we can put into our show notes here, maybe some block to company website explanation about like and we had an hour then as well from flow and Semantic Journal and I guess only Semantic Col is with another episode to talk about as well as from Florals and as well as securing your and hardening your application at the end of the day. Talked about the evaluation. Yeah, we haven't or testing or anything else. So we're just at the start of understanding and what to do or roughly what to do to achieve a specific goal with your own data using in that little time at the end of the day, which is fantastic. We have reached almost 30 minutes here in the talk and I would say for our audience, that's enough for now. I'd love to cover up with you later on. By the way, we sitting here at the MVP summit in in the middle of Redmond in director of Microsoft's Redmond campus, which is pretty nice. And I'm really thankful to bumped into you, to see you, and that you gave us the the chance to talk to you today for our podcast. So thank you for having me. Ralph. Yes, for sure. We are really happy to have you and we will follow up, I hope, and you'll be able to listen soon to Sammy's next episode on our podcast.

Decode AI

Getting your Data into AI with Sammy Deprez

Podcasts we love

Decode AI

Getting your Data into AI with Sammy Deprez

Podcasts we love

Listen to this podcast on