Author: Etienne Oosthuysen
There are a number of things in Microsoft’s evolving AI portfolio, which is in various stages (planned, private preview and public preview), that will allow data workers to leverage the power of Generative AI and Large Language Models, notably GPT, to analyse your corporate data. These include Copilot for Microsoft 365, which will bring ChatGPT to (amongst others) Excel, and of course, all the planned AI functionality in the form of Copilot for Microsoft Fabric. And now there’s Azure OpenAI on your data! Sweet!
Introducing Azure OpenAI on your data
Announced today in public preview, is something pretty significant: Azure OpenAI on your data. Now, Azure OpenAI Services allows you to connect to your data sources and leverage the power of Azure Cognitive Services, particularly a Cognitive Search index.
With Azure OpenAI on your data, when a user provides a prompt, two things will happen:
(a) Azure OpenAI on your data, and a Cognitive Search Index, determines what data to retrieve based on your user prompt and the preceding conversation history,
(b) The retrieved data is then appended to the original prompt and sent as a new prompt to GPT (within Azure), which uses this information to provide completion.
All of this is possible over .txt, .md, .html, Microsoft Word, Microsoft PowerPoint, and PDF. And through an easy-to-deploy app.
What about security
But what about security, as this is, after all, ChatGPT accessing your data? Not really. Yes, it is ChatGPT (or, more accurately, gpt-35-turbo and GPT-4 language models), but it is entirely hosted within Azure, and all data, therefore, remains within the Azure OpenAI service and, therefore, within the Azure backbone. This is described in more detail here.
Why is Azure OpenAI on your data different from those other things in Microsoft’s evolving AI portfolio?
Copilot is due to release within Microsoft 365 applications and within Fabric in the hopefully not-too-distant future. This means its Generative AI role will enhance the productivity within those technologies. That’s awesome if you are a data worker using those technologies.
Azure OpenAI on your data, on the other hand, will allow you to step beyond the confines of those technologies and create something really bespoke, yet in an amazingly simple way.
How easy is this really?
For this test drive, I will use a dataset within my Azure Data Lake. It contains over 16,000 records of video games with sales greater than 100,000 copies for the period 1980 through to 2016 and is stored as .txt in a comma-separated format.
The original dataset used is available here.
Here are the steps I followed:
Set up Azure OpenAI on your data and the Cognitive Search Index
A) In my OpenAI Service, I selected BYO Data preview:
B) I then selected my data source – Azure Blob Storage. You can upload a file, use an existing Azure Cognitive Search index, or point to your data in Azure Data Lake (Blob Storage). I am sure each of these options has its own pros and cons, but that is a discussion for another time.
C) I added my data source information and allowed Azure OpenAI on your data to create an Azure Cognitive Search index as part of the set-up for me.
D) Once the set-up and index were completed, all that was left to do was to craft a system message. I used “You are an AI assistant, and you are useful for answering questions from video game sales. The dataset has the columns Rank, Name, Platform, Year, Genre, Publisher, NA_Sales, EU_Sales, JP_Sales, Other_Sales, and Global_Sales. All sales are in millions. Please answer questions by parsing through all dialogue.”
E) The final step was to play around with the session settings and parameters, after which time I was able to do a test drive within the playground chat session.
Test drive in the playground chat
The first prompt, the response and the code is shown below:
Deploy the app
This literally involved a single-page configuration:
The final product
Here is a recording of some of my prompt interactions:
There are clearly some issues, both with the accuracy of the results returned (this seemed somewhat minor, but inconsistent), and its ability to return results on the first go. BUT this product is still in public preview, and it seems as if issues are expected, especially when you read MSFT material stating, “If you receive incorrect answers, report it as a quality bug”. So, this is simply too early for an unequivocal judgement which will have to wait until later in the public preview and once it goes into general availability in the hopefully not-too-distant future.
However, if it does what I think it can, then this has the potential to move conversational AI and data analysis forward quite a bit. My gut feeling is that Azure OpenAI for your data will provide a skeleton only, and that’s okay, as it will remove or expedite some of the lower-value work. Yet what is truly exciting is what organisations can build on top of this foundation.
This article was originally published here: https://www.linkedin.com/pulse/chatgpt-your-own-data-introducing-azure-openai-etienne-oosthuysen/