Joint Authors: Alen Michael, Clayton Pilat, Reagan Burpee and Etienne Oosthuysen
Generative AI is currently being baked into the various tools we love and use in our data workloads, with more announcements on the way. Make no mistake, every technology vendor is active in this area behind the scenes, and exposé is experimenting hands-on to uncover how all these moving parts will work and come together, so that we can help our clients navigate this exciting disruption.
Whilst we await some key announcements from technology partners such as Microsoft, AWS, Databricks and others on the integration of Generative AI into their technologies, we decided to take some of the existing technologies and interface them in a way that lets us use the OpenAI Service in Azure, to showcase the art of the possible. Towards the end of this article we also describe some other potential use cases for the technologies and principles applied here.
Our proof-of-concept use case
A bot that interfaces with data in our data lake, using a generative large language model (LLM) with guided prompt engineering in the background, allowing the user to ask relevant questions of the data without having to rely on a report or a business model.
In this innovation lab, we used internal company datasets covering consultant skills across our consulting group. The three datasets are:
A skills matrix that is used by consultants to self-rate.
A repository of technologies used per client project.
A repository of technologies envisaged for the upcoming pipeline of work.
The new GPT-based application we created as part of this PoC allows stakeholders to ask questions of this data, for example “Which consultants have skills of 3 or higher across Databricks, Synapse, Power BI and machine learning studio, with such experience within the previous 36 months?”, or “Given our pipeline of work, are there any skills gaps we should be aware of?”.
As you can imagine, the possibilities for such a GPT-based application applied across corporate data are tremendous. More on that later.
Here’s how we did it:
We deployed and used the following services in our Innovation Lab in Azure:
- Azure Data Lake
- Azure Cognitive Services (Cognitive Search)
- Azure OpenAI Service (GPT-3.5 Turbo)
- Azure App Service
Solution steps and components:
- We loaded the relevant CSVs to our data lake. This data includes the various technologies we use, consultant names and self-ratings by technology, as well as pipeline and past project information including dates and technologies used.
- We used Azure Cognitive Services, specifically Cognitive Search to do the following:
a. Connect to the data.
b. Create an Index and an Index Definition specific to the content that includes, among others, Technologies, Consultants, self-rating Scores, Projects, Technologies used, and Dates.
c. Create an Indexer that includes the JSON logic that maps the scanned content to the Index Definition.
d. Store the results in a NoSQL environment under the covers.
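To make the indexing step concrete, here is a minimal sketch of what an index definition along these lines could look like, expressed as the JSON payload Cognitive Search accepts. The index name and field names (`consultant`, `technology`, `self_rating`, `project`, `date_used`) are illustrative assumptions, not the exact schema used in the PoC:

```python
# Sketch of a Cognitive Search index definition for the skills data.
# All names below are hypothetical placeholders for illustration.
skills_index = {
    "name": "consultant-skills-index",
    "fields": [
        # Every index needs exactly one key field.
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "consultant", "type": "Edm.String", "searchable": True, "filterable": True},
        {"name": "technology", "type": "Edm.String", "searchable": True, "filterable": True},
        {"name": "self_rating", "type": "Edm.Int32", "filterable": True, "sortable": True},
        {"name": "project", "type": "Edm.String", "searchable": True},
        {"name": "date_used", "type": "Edm.DateTimeOffset", "filterable": True, "sortable": True},
    ],
}
```

In the real service this definition would be PUT to the indexes endpoint of the search service (with the appropriate `api-version`), and the Indexer's field mappings would map the scanned CSV content onto these fields.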
- We created a Python App using Flask that contains both a back- and a front-end, hosted in Azure App Services.
a. This allows the user to ask relevant questions of the data. We used GPT-3.5 Turbo, one of the models deployed as part of the OpenAI Service offering in Azure. It supports a question-and-answer method without the need for ‘chat’ functionality, specifically using a ‘few-shot learning’ (FSL) approach, which works best for our specific use case.
b. Within the app, we included the applicable Prompt Engineering for our use case. In this specific example, we do not fine tune the model, but rather provide it context so that it knows how to respond. With the few-shot learning method, we prefix sample questions, thoughts and actions to the user’s question.
c. To GPT we therefore send:
i: Sample questions
ii: Thoughts which show the model how to think, and
iii: Actions which show the model what to do from the thought.
iv: The user question – “Which consultants have skills of 3 or higher across Databricks, Synapse, Power BI and machine learning studio, with such experience within the previous 36 months?”.
Here is an example of the FSL method prefixed to the question to guide GPT:
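The following is an illustrative reconstruction of that prefixing step, not the exact prompt text from the PoC; the sample question, thought and action are our own hypothetical examples in the style described above:

```python
# Hypothetical few-shot prefix: a sample question, a thought showing the model
# how to reason, and an action showing it what to do with that thought.
FEW_SHOT_PREFIX = """\
Question: Which consultants have a Power BI self-rating of 4 or higher?
Thought: I need consultant names and self-ratings for Power BI, so I should query the skills index.
Action: Search the index filtering on technology 'Power BI' and self-rating >= 4, then list the consultant names.

"""

def build_prompt(user_question: str) -> str:
    """Prefix the few-shot examples to the user's question before sending it to GPT."""
    return FEW_SHOT_PREFIX + "Question: " + user_question

prompt = build_prompt(
    "Which consultants have skills of 3 or higher across Databricks, Synapse, "
    "Power BI and machine learning studio, with such experience within the "
    "previous 36 months?"
)
```

The model sees the sample question–thought–action triple first, then the real question, and imitates the demonstrated pattern when answering.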
- GPT now knows what to do: it accesses the stored results, retrieves the information, and crafts the response back to the app. It is worth noting that the few-shot learning text, the actual question, and the response (completion) together cannot exceed roughly 4,000 tokens.
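A simple guard can keep the combined prompt and expected completion within that budget. The sketch below uses the common rule of thumb of roughly four characters per token for English text; this is an approximation we have introduced for illustration (a real tokenizer would give exact counts), and the 4,096 figure is GPT-3.5 Turbo's published context window:

```python
# Rough token-budget check before sending a prompt to the model.
# The 4-characters-per-token ratio is an approximation, not an exact count.
CONTEXT_LIMIT_TOKENS = 4096  # GPT-3.5 Turbo context window

def estimate_tokens(text: str) -> int:
    """Crude estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_in_context(prompt: str, max_completion_tokens: int = 500) -> bool:
    """Check that the prompt plus a completion allowance stays within the window."""
    return estimate_tokens(prompt) + max_completion_tokens <= CONTEXT_LIMIT_TOKENS
```

If the check fails, the app could trim the few-shot prefix or the retrieved context before calling the model.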
So how does this work in motion?
A) The user poses their question to the app.
B) Inside the app, the few-shot learning mechanics kick in: prompts are prefixed to the actual question using the applicable prompt engineering.
C) This is passed to GPT, which now understands what to do, where to source the information, and which tool to use to retrieve it (in this instance, Cognitive Search).
D) Cognitive Search is then queried for the names, dates and skills from the indexed results, and GPT determines how to respond based on the FSL examples. In a more scaled-out version, it would also be possible to teach the model to use multiple tools via prompts, within the token limits.
E) The information is then passed back via the app to the user.
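The A–E flow above can be sketched end to end as follows. The search and GPT calls are stubbed out here, and the function names and return shapes are our own assumptions for illustration, not the PoC's actual interfaces:

```python
# Sketch of the request flow A-E with external calls stubbed out.
def search_index(query: str) -> list[dict]:
    # In the real app this would query Cognitive Search; stubbed for illustration.
    return [{"consultant": "A. Consultant", "technology": "Databricks", "self_rating": 4}]

def call_gpt(prompt: str, search_results: list[dict]) -> str:
    # In the real app this would call the model's completion endpoint; stubbed here.
    names = ", ".join(r["consultant"] for r in search_results)
    return f"Based on the indexed data: {names}"

def answer_question(user_question: str) -> str:
    # A) the user's question arrives; B) the few-shot prefix is prepended
    prompt = "...few-shot examples...\nQuestion: " + user_question
    # C/D) the search tool is used to retrieve the indexed results
    results = search_index(user_question)
    # E) the completion is returned to the user via the app
    return call_gpt(prompt, results)
```

In the Flask app, `answer_question` would sit behind the route that the front-end posts the user's question to.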
What are some of the use cases for a productionised application such as this?
The possibilities for something like this, where a simple Q&A mechanism allows a user to ask contextual questions of data indexed by Cognitive Search and interpreted by the power of LLMs, are endless. For example:
- Matching expenditures with scanned and uploaded receipts and invoices in a very large organisation could take an auditor days. With a well-engineered Generative AI powered app, the auditor could simply ‘ask’ questions about expenditures and find the supporting documents.
- Medical diagnosis, where a well-engineered Generative AI powered app could help a doctor find and understand a patient’s blood work and scans, and cross-check them against a database for diagnosis purposes.
- Legal precedent, where a well-engineered Generative AI powered app could help lawyers find cases with similar characteristics.