MSUSAzureAccelerators
diff --git a/01-Load-Data-ACogSearch.ipynb b/01-Load-Data-ACogSearch.ipynb
+234 -283
diff --git a/02-LoadCSVOneToMany-ACogSearch.ipynb b/02-LoadCSVOneToMany-ACogSearch.ipynb
+217 -209
diff --git a/03-Quering-AOpenAI.ipynb b/03-Quering-AOpenAI.ipynb
+476 -522
diff --git a/04-Complex-Docs.ipynb b/04-Complex-Docs.ipynb
+255 -265
diff --git a/05-Adding_Memory.ipynb b/05-Adding_Memory.ipynb
+318 -401
diff --git a/06-First-RAG.ipynb b/06-First-RAG.ipynb
+713
diff --git a/07-SQLDB_QA.ipynb b/07-SQLDB_QA.ipynb
-705
diff --git a/06-TabularDataQA.ipynb b/07-TabularDataQA.ipynb
+218 -165
diff --git a/08-BingChatClone.ipynb b/08-BingChatClone.ipynb
-492
diff --git a/08-SQLDB_QA.ipynb b/08-SQLDB_QA.ipynb
+589
diff --git a/09-BingChatClone.ipynb b/09-BingChatClone.ipynb
+615
diff --git a/09-API-Search.ipynb b/10-API-Search.ipynb
+195 -126
diff --git a/10-Smart_Agent.ipynb b/10-Smart_Agent.ipynb
-1,317
diff --git a/11-Smart_Agent.ipynb b/11-Smart_Agent.ipynb
+1,621
diff --git a/11-Building-Apps.ipynb b/12-Building-Apps.ipynb
diff --git a/README.md b/README.md
+17 -16
diff --git a/apps/backend/backend.zip b/apps/backend/backend.zip
3.24 KB
diff --git a/apps/backend/bot.py b/apps/backend/bot.py
+69 -43
diff --git a/common/callbacks.py b/common/callbacks.py
+5 -5
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 ![image](https://user-images.githubusercontent.com/113465005/226238596-cc76039e-67c2-46b6-b0bb-35d037ae66e1.png)
 
-# 3 or 5 days POC VBD powered by: Azure Search + Azure OpenAI + Bot Framework + Langchain + Azure SQL + CosmosDB + Bing Search API
+# 3 or 5 days POC VBD powered by: Azure AI Search + Azure OpenAI + Bot Framework + Langchain + Azure SQL + CosmosDB + Bing Search API + Document Intelligence SDK
 [![Open in GitHub Codespaces](https://github.com/codespaces/badge.svg)](https://codespaces.new/MSUSAzureAccelerators/Azure-Cognitive-Search-Azure-OpenAI-Accelerator?quickstart=1)
 [![Open in VS Code Dev Containers](https://img.shields.io/static/v1?style=for-the-badge&label=Remote%20-%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/MSUSAzureAccelerators/Azure-Cognitive-Search-Azure-OpenAI-Accelerator)
 
@@ -33,7 +33,8 @@ The repo is made to teach you step-by-step on how to build a OpenAI-based Smart
 * The customer team and the Microsoft team must have Contributor permissions to this resource group so they can set everything up 2 weeks prior to the workshop
 * A storage account must be set in place in the RG.
 * Customer Data/Documents must be uploaded to the blob storage account, at least two weeks prior to the workshop date
-* Multi-Tenant App Registration (Service Principal).
+* A Multi-Tenant App Registration (Service Principal) must be created by the customer (save the Client Id and Secret Value).
+* The customer must provide the Microsoft team with 10-20 questions (easy to hard) that they want the bot to answer correctly.
 * For IDE collaboration and standardization during the workshop, AML compute instances with Jupyter Lab will be used; for this, an Azure Machine Learning Workspace must be deployed in the RG
    * Note: Please ensure you have enough core compute quota in your Azure Machine Learning workspace 
 
@@ -48,9 +49,12 @@ The repo is made to teach you step-by-step on how to build a OpenAI-based Smart
    * 3a. Azure SQL Database - contains COVID-related statistics in the US.
    * 3b. API Endpoints - RESTful OpenAPI 3.0 API containing up-to-date statistics about Covid.
    * 3c. Azure Bing Search API - provides access to the internet, allowing scenarios like QnA on public websites.
-   * 3d. Azure AI Text Search - contains AI-enriched documents from Blob Storage (10k PDFs and 90k articles).
-   * 3e. Azure AI Vector Search - contains 5 lenghty PDF books vectorized per page.
+   * 3d. Azure AI Search - contains AI-enriched documents from Blob Storage:
+       - 10,000 Arxiv Computer Science PDFs  
+       - 90,000 Covid publication abstracts
+       - 5 lengthy PDF books
    * 3f. CSV Tabular File - contains COVID-related statistics in the US.
+   * 3g. Kraken broker API for currencies
 4. The app retrieves the result from the source and crafts the answer.
 5. The tuple (Question and Answer) is saved to CosmosDB as persistent memory and for further analysis.
 6. The answer is delivered to the user.
@@ -68,9 +72,8 @@ https://gptsmartsearch.azurewebsites.net/
 
    - Uses [Bot Framework](https://dev.botframework.com/) and [Bot Service](https://azure.microsoft.com/en-us/products/bot-services/) to Host the Bot API Backend and to expose it to multiple channels including MS Teams.
    - 100% Python.
-   - Uses [Azure Cognitive Services](https://azure.microsoft.com/en-us/products/cognitive-services/) to index and enrich unstructured documents: Detect Language, OCR images, Key-phrases extraction, entity recognition (persons, emails, addresses, organizations, urls).
-   - Uses Vector Search Capabilities of Azure Cognitive Search to provide the best semantic answer.
-   - Creates vectors on-demand as users interact with the system. (versus vectorizing the whole datalake at the beginning)
+   - Uses [Azure Cognitive Services](https://azure.microsoft.com/en-us/products/cognitive-services/) to index and enrich unstructured documents: OCR over images, chunking, and automated vectorization.
+   - Uses Hybrid Search Capabilities of Azure AI Search to provide the best semantic answer (Text and Vector search combined).
    - Uses [LangChain](https://langchain.readthedocs.io/en/latest/) as a wrapper for interacting with Azure OpenAI, vector stores, constructing prompts and creating agents.
    - Multi-Lingual (ingests, indexes and understands any language)
    - Multi-Index -> multiple search indexes
@@ -89,22 +92,20 @@ https://gptsmartsearch.azurewebsites.net/
 Note: (Pre-requisite) You need to have an Azure OpenAI service already created
 
 1. Fork this repo to your Github account.
-2. In Azure OpenAI studio, deploy these models: **Make sure that the deployment name is the same as the model name.**
-   - "gpt-35-turbo"
-   - "gpt-35-turbo-16k"
-   - "gpt-4"
-   - "gpt-4-32k"
-   - "text-embedding-ada-002"
+2. In Azure OpenAI studio, deploy these models (models older than the ones listed below won't work):
+   - "gpt-35-turbo-1106" (or newer)
+   - "gpt-4-turbo-1106" (or newer)
+   - "text-embedding-ada-002" (or newer)
 3. Create a Resource Group where all the assets of this accelerator are going to be. Azure OpenAI can be in different RG or a different Subscription.
-4. ClICK BELOW to create all the Azure Infrastructure needed to run the Notebooks (Azure Cognitive Search, Cognitive Services, etc):
+4. CLICK BELOW to create all the Azure Infrastructure needed to run the Notebooks (Azure AI Search, Cognitive Services, etc):
 
 [![Deploy To Azure](https://aka.ms/deploytoazurebutton)](https://portal.azure.com/#create/Microsoft.Template/uri/https%3A%2F%2Fraw.githubusercontent.com%2Fpablomarin%2FGPT-Azure-Search-Engine%2Fmain%2Fazuredeploy.json) 
 
 **Note**: If you have never created a `Azure AI Services Multi-Service account` before, please create one manually in the azure portal to read and accept the Responsible AI terms. Once this is deployed, delete this and then use the above deployment button.
 
 5. Clone your Forked repo to your AML Compute Instance. If your repo is private, see below in Troubleshooting section how to clone a private repo.
 
-6. Make sure you run the notebooks on a **Python 3.10 conda enviroment**
+6. Make sure you run the notebooks in a **Python 3.10 (or newer) conda environment**
 7. Install the dependencies on your machine. Make sure you run the pip command below in the same conda environment in which you are going to run the notebooks. For example, in an AZML compute instance run:
 ```
 conda activate azureml_py310_sdkv2
@@ -124,7 +125,7 @@ You might get some pip dependancies errors, but that is ok, the libraries were i
 
 ## **FAQs**
 
-1. **Why use Azure Cognitive Search engine to provide the context for the LLM and not fine tune the LLM instead?**
+1. **Why use Azure AI Search engine to provide the context for the LLM and not fine tune the LLM instead?**
 
 A: Quoting the [OpenAI documentation](https://platform.openai.com/docs/guides/fine-tuning): "GPT-3 has been pre-trained on a vast amount of text from the open internet. When given a prompt with just a few examples, it can often intuit what task you are trying to perform and generate a plausible completion. This is often called "few-shot learning.
 Fine-tuning improves on few-shot learning by training on many more examples than can fit in the prompt, letting you achieve better results on a wide number of tasks. Once a model has been fine-tuned, you won't need to provide examples in the prompt anymore. This **saves costs and enables lower-latency requests**"
 
diff --git a/apps/backend/bot.py b/apps/backend/bot.py
@@ -7,19 +7,30 @@
 import requests
 import json
 from concurrent.futures import ThreadPoolExecutor
-from langchain.chat_models import AzureChatOpenAI
+from langchain_openai import AzureChatOpenAI
 from langchain.utilities import BingSearchAPIWrapper
-from langchain.memory import ConversationBufferWindowMemory
 from langchain.memory import CosmosDBChatMessageHistory
+from langchain.agents import AgentExecutor, create_openai_tools_agent
+from langchain_core.runnables import ConfigurableField, ConfigurableFieldSpec
+from langchain_core.chat_history import BaseChatMessageHistory
+from langchain_community.chat_message_histories import ChatMessageHistory, CosmosDBChatMessageHistory
 from langchain.agents import ConversationalChatAgent, AgentExecutor, Tool
 from typing import Any, Dict, List, Optional, Union
 from langchain.callbacks.base import BaseCallbackHandler
 from langchain.callbacks.manager import CallbackManager
 from langchain.schema import AgentAction, AgentFinish, LLMResult
+from langchain_community.chat_message_histories import ChatMessageHistory
+from langchain_core.runnables.history import RunnableWithMessageHistory
 
 #custom libraries that we will use later in the app
-from utils import DocSearchAgent, CSVTabularAgent, SQLSearchAgent, ChatGPTTool, BingSearchAgent, APISearchAgent, run_agent, reduce_openapi_spec
-from prompts import WELCOME_MESSAGE, CUSTOM_CHATBOT_PREFIX, CUSTOM_CHATBOT_SUFFIX
+from utils import (
+    DocSearchAgent, 
+    CSVTabularAgent, 
+    SQLSearchAgent, 
+    ChatGPTTool, 
+    BingSearchAgent
+)
+from prompts import CUSTOM_CHATBOT_PROMPT, WELCOME_MESSAGE
 
 from botbuilder.core import ActivityHandler, TurnContext
 from botbuilder.schema import ChannelAccount, Activity, ActivityTypes
@@ -53,6 +64,20 @@ class MyBot(ActivityHandler):
 
     def __init__(self):
         self.model_name = os.environ.get("AZURE_OPENAI_MODEL_NAME") 
+        
+    def get_session_history(self, session_id: str, user_id: str) -> CosmosDBChatMessageHistory:
+        cosmos = CosmosDBChatMessageHistory(
+            cosmos_endpoint=os.environ['AZURE_COSMOSDB_ENDPOINT'],
+            cosmos_database=os.environ['AZURE_COSMOSDB_NAME'],
+            cosmos_container=os.environ['AZURE_COSMOSDB_CONTAINER_NAME'],
+            connection_string=os.environ['AZURE_COMOSDB_CONNECTION_STRING'],
+            session_id=session_id,
+            user_id=user_id
+            )
+
+        # prepare the cosmosdb instance
+        cosmos.prepare_cosmos()
+        return cosmos
 
     # Function to show welcome message to new users
     async def on_members_added_activity(self, members_added: ChannelAccount, turn_context: TurnContext):
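The `get_session_history` method added in this hunk is a history factory: it returns one chat history per user and conversation. The following is a minimal, library-free sketch of that idea, not the LangChain implementation. `InMemoryHistory` and `resolve_history` are hypothetical stand-ins, and the shape of `config` (a `"configurable"` dict keyed by `user_id` and `session_id`) is an assumption based on the field ids declared in `history_factory_config`.

```python
from typing import Callable, Dict, List, Tuple


class InMemoryHistory:
    """Hypothetical stand-in for CosmosDBChatMessageHistory: keeps messages in a list."""

    def __init__(self) -> None:
        self.messages: List[str] = []

    def add_message(self, message: str) -> None:
        self.messages.append(message)


# One history object per (user_id, session_id) pair.
_STORE: Dict[Tuple[str, str], InMemoryHistory] = {}


def get_session_history(session_id: str, user_id: str) -> InMemoryHistory:
    """Return the history for this user and conversation, creating it on first use."""
    return _STORE.setdefault((user_id, session_id), InMemoryHistory())


def resolve_history(factory: Callable[..., InMemoryHistory], config: dict) -> InMemoryHistory:
    """Hypothetical helper: pass the declared keys from config["configurable"]
    to the factory as keyword arguments."""
    configurable = config.get("configurable", {})
    keys = ("user_id", "session_id")
    missing = [k for k in keys if k not in configurable]
    if missing:
        raise ValueError(f"missing configurable keys: {missing}")
    return factory(**{k: configurable[k] for k in keys})


config = {"configurable": {"user_id": "alice-msteams", "session_id": "conv-1"}}
history = resolve_history(get_session_history, config)
history.add_message("What is hybrid search?")
```

Calling `resolve_history` again with the same ids returns the same history object, while a different `user_id` gets a fresh one, which is the isolation the bot relies on when several users share a channel.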
@@ -67,6 +92,7 @@ async def on_message_activity(self, turn_context: TurnContext):
         # Extract info from TurnContext - You can change this to whatever , this is just one option 
         session_id = turn_context.activity.conversation.id
         user_id = turn_context.activity.from_property.id + "-" + turn_context.activity.channel_id
+        
         input_text_metadata = dict()
         input_text_metadata["local_timestamp"] = turn_context.activity.local_timestamp.strftime("%I:%M:%S %p, %A, %B %d of %Y")
         input_text_metadata["local_timezone"] = turn_context.activity.local_timezone
@@ -80,55 +106,55 @@ async def on_message_activity(self, turn_context: TurnContext):
         cb_manager = CallbackManager(handlers=[cb_handler])
 
         # Set LLM 
-        llm = AzureChatOpenAI(deployment_name=self.model_name, temperature=0.5, max_tokens=1000, callback_manager=cb_manager)
+        llm = AzureChatOpenAI(deployment_name=self.model_name, temperature=0.5, 
+                              max_tokens=1500, callback_manager=cb_manager, streaming=True)
 
         # Initialize our Tools/Experts
-        text_indexes = ["cogsrch-index-files", "cogsrch-index-csv"]
+        text_indexes = ["cogsrch-index-files", "cogsrch-index-csv", "cogsrch-index-books"]
+        
         doc_search = DocSearchAgent(llm=llm, indexes=text_indexes,
-                           k=10, similarity_k=4, reranker_th=1,
-                           sas_token=os.environ['BLOB_SAS_TOKEN'],
-                           callback_manager=cb_manager, return_direct=True)
-        vector_only_indexes = ["cogsrch-index-books-vector"]
-        book_search = DocSearchAgent(llm=llm, vector_only_indexes = vector_only_indexes,
-                           k=10, similarity_k=10, reranker_th=1,
+                           k=6, reranker_th=1,
                            sas_token=os.environ['BLOB_SAS_TOKEN'],
-                           callback_manager=cb_manager, return_direct=True,
-                           name="@booksearch",
-                           description="useful when the questions includes the term: @booksearch.\n")
-        www_search = BingSearchAgent(llm=llm, k=5, callback_manager=cb_manager, return_direct=True)
-        sql_search = SQLSearchAgent(llm=llm, k=10, callback_manager=cb_manager, return_direct=True)
-        chatgpt_search = ChatGPTTool(llm=llm, callback_manager=cb_manager, return_direct=True)
+                           callback_manager=cb_manager, verbose=False)
 
-        url = 'https://disease.sh/apidocs/swagger_v3.json'
-        spec = requests.get(url).json()
-
-        api_search = APISearchAgent(llm=llm,
-                            llm_search=AzureChatOpenAI(deployment_name="gpt-35-turbo-16k", temperature=0, max_tokens=1000),
-                            api_spec=str(reduce_openapi_spec(spec)), 
-                            limit_to_domains=["https://disease.sh/"],
-                            callback_manager=cb_manager, return_direct=True)
-
-        tools = [www_search, sql_search, doc_search, chatgpt_search, book_search, api_search]
-
-        # Set brain Agent with persisten memory in CosmosDB
-        cosmos = CosmosDBChatMessageHistory(
-                        cosmos_endpoint=os.environ['AZURE_COSMOSDB_ENDPOINT'],
-                        cosmos_database=os.environ['AZURE_COSMOSDB_NAME'],
-                        cosmos_container=os.environ['AZURE_COSMOSDB_CONTAINER_NAME'],
-                        connection_string=os.environ['AZURE_COMOSDB_CONNECTION_STRING'],
-                        session_id=session_id,
-                        user_id=user_id
-                    )
-        cosmos.prepare_cosmos()
-        memory = ConversationBufferWindowMemory(memory_key="chat_history", return_messages=True, k=30, chat_memory=cosmos)
-        agent = ConversationalChatAgent.from_llm_and_tools(llm=llm, tools=tools,system_message=CUSTOM_CHATBOT_PREFIX,human_message=CUSTOM_CHATBOT_SUFFIX)
-        agent_chain = AgentExecutor.from_agent_and_tools(agent=agent, tools=tools, memory=memory, handle_parsing_errors=True)
+        www_search = BingSearchAgent(llm=llm, k=5, callback_manager=cb_manager)
+        sql_search = SQLSearchAgent(llm=llm, k=30, callback_manager=cb_manager)
+        chatgpt_search = ChatGPTTool(llm=llm, callback_manager=cb_manager)
+        
+        tools = [doc_search, www_search, sql_search, chatgpt_search]
+        
+        agent = create_openai_tools_agent(llm, tools, CUSTOM_CHATBOT_PROMPT)
+        agent_executor = AgentExecutor(agent=agent, tools=tools)
+        brain_agent_executor = RunnableWithMessageHistory(
+            agent_executor,
+            self.get_session_history,
+            input_messages_key="question",
+            history_messages_key="history",
+            history_factory_config=[
+                ConfigurableFieldSpec(
+                    id="user_id",
+                    annotation=str,
+                    name="User ID",
+                    description="Unique identifier for the user.",
+                    default="",
+                    is_shared=True,
+                ),
+                ConfigurableFieldSpec(
+                    id="session_id",
+                    annotation=str,
+                    name="Session ID",
+                    description="Unique identifier for the conversation.",
+                    default="",
+                    is_shared=True,
+                ),
+            ],
+        )
 
         await turn_context.send_activity(Activity(type=ActivityTypes.typing))
 
         # Please note below that running a non-async function like run_agent in a separate thread won't make it truly asynchronous. It allows the function to be called without blocking the event loop, but it may still have synchronous behavior internally.
         loop = asyncio.get_event_loop()
-        answer = await loop.run_in_executor(ThreadPoolExecutor(), run_agent, input_text, agent_chain)
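+        # Route the stored history by user and conversation (keys match history_factory_config above)
+        config = {"configurable": {"session_id": session_id, "user_id": user_id}}
+        answer = brain_agent_executor.invoke({"question": input_text}, config=config)["output"]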
 
         await turn_context.send_activity(answer)
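The comment in this hunk notes that running a non-async function in a separate thread keeps the event loop free, yet the new code calls `invoke` directly inside the async handler. A minimal sketch of the thread-offloading pattern, using only the standard library, could look like the following. `blocking_invoke` is a hypothetical stand-in for the synchronous agent call and is not part of the repository.

```python
import asyncio
import functools
import time


def blocking_invoke(question: str, delay: float = 0.05) -> dict:
    """Hypothetical stand-in for a synchronous agent call."""
    time.sleep(delay)  # simulates network and LLM latency
    return {"output": f"echo: {question}"}


async def answer_without_blocking(question: str) -> str:
    loop = asyncio.get_running_loop()
    # Passing None selects the loop's default ThreadPoolExecutor.
    result = await loop.run_in_executor(
        None, functools.partial(blocking_invoke, question)
    )
    return result["output"]


async def main() -> list:
    # The two calls overlap because each one waits in its own worker thread.
    return await asyncio.gather(
        answer_without_blocking("first"),
        answer_without_blocking("second"),
    )


answers = asyncio.run(main())
```

The call itself stays synchronous inside its worker thread; the benefit is only that other coroutines, such as typing indicators or other users' turns, can keep running while it waits.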
 
 
diff --git a/common/callbacks.py b/common/callbacks.py
@@ -20,13 +20,13 @@ def on_llm_error(self, error: Union[Exception, KeyboardInterrupt], **kwargs: Any
         """Run when LLM errors."""
         sys.stdout.write(f"LLM Error: {error}\n")
 
-    def on_chain_start(self, serialized: Dict[str, Any], inputs: Dict[str, Any], **kwargs: Any) -> Any:
-        """Print out that we are entering a chain."""
-
     def on_tool_start(self, serialized: Dict[str, Any], input_str: str, **kwargs: Any) -> Any:
         sys.stdout.write(f"Tool: {serialized['name']}\n")
-
+        
+    def on_retriever_start(self, serialized: Dict[str, Any], query: str, **kwargs: Any) -> Any:
+        sys.stdout.write(f"Retriever: {serialized}\n")
+        
     def on_agent_action(self, action: AgentAction, **kwargs: Any) -> Any:
-        sys.stdout.write(f"{action.log}\n")
+        sys.stdout.write(f"Agent Action: {action.log}\n")