updated requirements and credentials files #37

Draft
wants to merge 115 commits into
base: main
115 commits
d943ca8
Add files via upload
davyu-azfed Jul 26, 2023
0c121a4
Update README.md
davyu-azfed Jul 26, 2023
0a837be
Merge pull request #1 from FederalCSUMission/davyu-azfed-patch-1
davyu-azfed Jul 26, 2023
78422f3
Backend app modifications to enable running in Azure Gov
davyu-azfed Jul 26, 2023
c1256aa
Update ARM template to match Bicep
github-actions[bot] Jul 26, 2023
f0edc8b
removing backend.zip
davyu-azfed Jul 26, 2023
f40aafd
Merge branch 'davyu_updateAppAzureGov' of https://github.com/FederalC…
davyu-azfed Jul 26, 2023
0d03d48
front end app update to run in Azure Gov
davyu-azfed Jul 27, 2023
e8912ed
Update ARM template to match Bicep
github-actions[bot] Jul 27, 2023
3084690
Adding Deploy to Azure Gov button
davyu-azfed Jul 27, 2023
4d6d264
Merge branch 'davyu_updateAppAzureGov' of https://github.com/FederalC…
davyu-azfed Jul 27, 2023
68a5ada
Remove Deploy to Azure button
davyu-azfed Jul 27, 2023
d65ad35
Update bicep files
davyu-azfed Jul 27, 2023
bf3387f
Update ARM template to match Bicep
github-actions[bot] Jul 27, 2023
3e4fdf4
updated readme and bicep, arm to deploy to Azure Gov.
davyu-azfed Jul 28, 2023
b4fae25
Merge branch 'davyu_updateAppAzureGov' of https://github.com/FederalC…
davyu-azfed Jul 28, 2023
e1650db
Update ARM template to match Bicep
github-actions[bot] Jul 28, 2023
ea2e89b
More updates on Readme
davyu-azfed Jul 28, 2023
3587f49
Merge branch 'davyu_updateAppAzureGov' of https://github.com/FederalC…
davyu-azfed Jul 28, 2023
2799f19
Creating branch for VectorDB notbook
josephyassin-git Aug 2, 2023
17dc8da
Merge remote-tracking branch 'upstream/main'
davyu-azfed Aug 16, 2023
927b296
Added Weaviate API Key
josephyassin-git Aug 17, 2023
4014725
Reading data from Arxiv
josephyassin-git Aug 17, 2023
3853a46
making mods to remove Azure Gov incompatible scripts
davyu-azfed Aug 22, 2023
15092b8
Merge pull request #4 from FederalCSUMission/davyu-deployTest817
cheruvu1 Aug 23, 2023
c84c426
Merge remote-tracking branch 'origin/main' into VectorDB_IPYNB_Yassin
josephyassin-git Aug 30, 2023
0be7736
Added pdf load from blob storage
josephyassin-git Aug 30, 2023
767b0fc
Writing the first few thousand PDfs to weaviate
josephyassin-git Aug 31, 2023
c198533
Renaming files to align the main repository
marktab Sep 3, 2023
698a8b3
Merge pull request #5 from marktab/marktab-openai
davyu-azfed Sep 5, 2023
4692a25
updated architecture diagram to include AKS, weaviate, and dev enviro…
davyu-azfed Sep 7, 2023
ca90d59
updated architecture to include aks, weaviate, and azure machine lear…
davyu-azfed Sep 7, 2023
ab96056
Added notebook for querying loaded data
josephyassin-git Sep 7, 2023
c1622e6
Updated Markdown to include AML
josephyassin-git Sep 8, 2023
c88f01a
Merge pull request #6 from FederalCSUMission/davyu-updateArch
davyu-azfed Sep 8, 2023
628ae14
Merge branch 'main' of https://github.com/FederalCSUMission/Azure-Ope…
davyu-azfed Sep 8, 2023
bf74c44
updates to readme and fix frontend app
davyu-azfed Sep 8, 2023
6f53edc
updates to readme and fix frontend app
davyu-azfed Sep 8, 2023
c46e1be
Paramaterized the notebook
josephyassin-git Sep 8, 2023
0707927
Removed spacing in .env
josephyassin-git Sep 8, 2023
ed5ce33
Merge pull request #7 from FederalCSUMission/davyu-govupdates
josephyassin Sep 8, 2023
a91e0d5
Merge pull request #8 from FederalCSUMission/VectorDB_IPYNB_Yassin
josephyassin Sep 8, 2023
4e986af
Added notebook for querying VDB w/o reloading docs
josephyassin-git Sep 11, 2023
b819da4
Merge pull request #9 from FederalCSUMission/VectorDB_IPYNB_Yassin
josephyassin Sep 11, 2023
549a0a9
Fixing deploy to azure gov issue, updating instructions in frontend a…
davyu-azfed Sep 12, 2023
d332c35
Merge branch 'davyu-fixesandupdates091223' of https://github.com/Fede…
davyu-azfed Sep 12, 2023
469bb82
Merge pull request #10 from FederalCSUMission/davyu-fixesandupdates09…
davyu-azfed Sep 12, 2023
e89a42d
Update utils.py
WookieeOnTheRun Sep 18, 2023
525f5f5
Merge pull request #12 from FederalCSUMission/andyc-federalized-20230918
marktab Sep 18, 2023
e45f142
Adding AKS deployment to main arm template. Adding param files for …
davyu-azfed Sep 19, 2023
277a7ab
Merge branch 'davyu-updateIaaC' of https://github.com/FederalCSUMissi…
davyu-azfed Sep 19, 2023
99c6fe1
Resource Validation
marktab Sep 19, 2023
e2ce051
Update utils.py
marktab Sep 19, 2023
abb276f
Notebook 1
marktab Sep 19, 2023
8e958b3
Notebook 2 & 3 Refresh
marktab Sep 19, 2023
caac5be
Update credentials.env
marktab Sep 19, 2023
9a32cb8
Merge pull request #14 from FederalCSUMission/davyu-updateIaaC
marktab Sep 20, 2023
d2a6bd0
Merge pull request #13 from FederalCSUMission/marktab-20230919
marktab Sep 20, 2023
b9cdb2e
Update 01-Load-Data-ACogSearch.ipynb
marktab Sep 24, 2023
9bf6ea3
Update 02-LoadCSVOneToMany-ACogSearch.ipynb
marktab Sep 24, 2023
2be7493
Merge pull request #15 from FederalCSUMission/marktab-20230924
davyu-azfed Sep 25, 2023
3c3d25f
Add files via upload
WookieeOnTheRun Sep 29, 2023
eb83535
Add files via upload
WookieeOnTheRun Oct 2, 2023
e6cacbd
Delete utils.py
WookieeOnTheRun Oct 2, 2023
861c8ab
Delete requirements.txt
WookieeOnTheRun Oct 2, 2023
d57efc7
Update credentials.env
WookieeOnTheRun Oct 2, 2023
3108f65
Merge pull request #16 from FederalCSUMission/andyc-federalized-20230929
marktab Oct 3, 2023
3a4a2ca
Merge pull request #17 from FederalCSUMission/andyc-federalized-20231002
marktab Oct 3, 2023
707bc15
Adding modified credentials and utils
marktab Oct 12, 2023
78f6973
Sync Notebooks 1-10 with Main Repo
marktab Oct 12, 2023
9fa4b1e
Update 01-Load-Data-ACogSearch.ipynb
marktab Oct 12, 2023
b41c65c
Merge pull request #18 from FederalCSUMission/marktab-20231011
WookieeOnTheRun Oct 12, 2023
277d49d
Add files via upload
marktab Oct 12, 2023
5dd53bd
Delete 03-Quering-AOpenAI.ipynb
marktab Oct 12, 2023
3e6ab59
Delete 04-Complex-Docs.ipynb
marktab Oct 12, 2023
a8493cf
Delete 10-Building-Apps.ipynb
marktab Oct 12, 2023
cbff15a
Sync Notebooks 1-10 with Main Repo
marktab Oct 12, 2023
2b10a7e
Merge pull request #19 from FederalCSUMission/marktab-20231011
WookieeOnTheRun Oct 12, 2023
93ecc0d
Delete utils.py
marktab Oct 12, 2023
9681a1f
Adding Utils.py
marktab Oct 12, 2023
e6f8ca5
Adding Markdown Files
marktab Oct 12, 2023
912a533
Merge pull request #20 from FederalCSUMission/marktab-20231011
WookieeOnTheRun Oct 12, 2023
b57730a
Reformatting to fix error in Utils
josephyassin-git Oct 19, 2023
cee0d33
Merge pull request #21 from FederalCSUMission/VectorDB_IPYNB_Yassin
marktab Oct 19, 2023
c35f32a
Add files via upload
WookieeOnTheRun Oct 20, 2023
17f9240
Add files via upload
WookieeOnTheRun Oct 20, 2023
3a43569
Merge pull request #22 from FederalCSUMission/andyc-federalized-20231020
josephyassin Oct 20, 2023
df493c1
Added YOUR_NAME to creds file
josephyassin-git Oct 23, 2023
167dde5
Merge branch 'main' into VectorDB_IPYNB_Yassin
davyu-azfed Oct 23, 2023
24e0759
Merge pull request #24 from FederalCSUMission/VectorDB_IPYNB_Yassin
davyu-azfed Oct 23, 2023
65ea23e
clear jupyter notebook cell output and update override=True for env f…
davyu-azfed Oct 23, 2023
fd913b9
cleared outputs and added override=True
davyu-azfed Oct 23, 2023
e00571a
remove language detection
davyu-azfed Oct 24, 2023
0054250
Merge pull request #26 from FederalCSUMission/davyu-updates-1023
josephyassin Oct 24, 2023
f0f16cb
reverted naming changes
josephyassin-git Oct 31, 2023
3b50284
Merge branch 'VectorDB_IPYNB_Yassin' of https://github.com/FederalCSU…
josephyassin-git Oct 31, 2023
337e170
Merge pull request #28 from FederalCSUMission/VectorDB_IPYNB_Yassin
davyu-azfed Oct 31, 2023
9c8a3c7
fixing jupyter notebook merge issues that caused the notebooks to not…
davyu-azfed Oct 31, 2023
4d9db73
Merge pull request #29 from FederalCSUMission/davyu-updates-103123
josephyassin Oct 31, 2023
36999f1
Updated to make aks and azure search optional
davyu-azfed Nov 6, 2023
6a328a2
Merge pull request #30 from FederalCSUMission/davyu-updateaks1106
minhvu10 Nov 6, 2023
9ec93ea
Update requirements.txt
minhvu10 Nov 8, 2023
25f075c
Update credentials.env
minhvu10 Nov 8, 2023
c7594f4
Update 05-Adding_Memory.ipynb
minhvu10 Nov 13, 2023
93d34d6
Update 05-Adding_Memory.ipynb
minhvu10 Nov 13, 2023
37934bb
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
7594f27
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
dcbbed2
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
036318f
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
c2a38ea
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
132c0e7
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
a649156
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 13, 2023
66debea
Update credentials.env
minhvu10 Nov 13, 2023
ec92ac5
Update 06-TabularDataQA.ipynb
minhvu10 Nov 15, 2023
5cbec78
Update 07-SQLDB_QA.ipynb
minhvu10 Nov 15, 2023
438 changes: 438 additions & 0 deletions 00-Resource-Validation.ipynb

Large diffs are not rendered by default.

272 changes: 140 additions & 132 deletions 01-Load-Data-ACogSearch.ipynb

Large diffs are not rendered by default.

264 changes: 113 additions & 151 deletions 02-LoadCSVOneToMany-ACogSearch.ipynb

Large diffs are not rendered by default.

2,177 changes: 834 additions & 1,343 deletions 03-Quering-AOpenAI.ipynb

Large diffs are not rendered by default.

773 changes: 0 additions & 773 deletions 04-Adding_Memory.ipynb

This file was deleted.

690 changes: 690 additions & 0 deletions 04-Complex-Docs.ipynb

Large diffs are not rendered by default.

725 changes: 725 additions & 0 deletions 05-Adding_Memory.ipynb

Large diffs are not rendered by default.

831 changes: 0 additions & 831 deletions 05-TabularDataQA.ipynb

This file was deleted.

784 changes: 0 additions & 784 deletions 06-SQLDB_QA.ipynb

This file was deleted.

414 changes: 414 additions & 0 deletions 06-TabularDataQA.ipynb

Large diffs are not rendered by default.

434 changes: 0 additions & 434 deletions 07-BingChatClone.ipynb

This file was deleted.

458 changes: 458 additions & 0 deletions 07-SQLDB_QA.ipynb

Large diffs are not rendered by default.

415 changes: 415 additions & 0 deletions 08-BingChatClone.ipynb

Large diffs are not rendered by default.

1,152 changes: 0 additions & 1,152 deletions 08-Smart_Agent.ipynb

This file was deleted.

716 changes: 716 additions & 0 deletions 09-Smart_Agent.ipynb

Large diffs are not rendered by default.

File renamed without changes.
328 changes: 328 additions & 0 deletions 11-VectorDB_Load.ipynb

Large diffs are not rendered by default.

274 changes: 274 additions & 0 deletions 12-VectorDB_QA.ipynb
@@ -0,0 +1,274 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"# Question and answer LLM using the Vector Database data as context"
]
},
{
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"Prerequisites: \n",
"\n",
"1. This notebook assumes you've completed the previous notebook and have data loaded into your vector store. \n",
"\n",
"2. Azure OpenAI endpoint\n",
" Confirm that you've deployed both an embedding model and an LLM: https://learn.microsoft.com/en-us/azure/ai-services/openai/how-to/create-resource?pivots=web-portal\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"jupyter": {
"outputs_hidden": true,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"%pip install weaviate-client\n",
"%pip install langchain\n",
"%pip install openai[datalib]\n",
"%pip install tiktoken\n",
"%pip install python-dotenv"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1694446501722
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"import os\n",
"from dotenv import load_dotenv\n",
"load_dotenv(\"credentials.env\", override=True)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1694446501905
},
"jupyter": {
"outputs_hidden": true
}
},
"outputs": [],
"source": [
"from langchain.vectorstores import Weaviate\n",
"import weaviate\n",
"\n",
"WEAVIATE_URL = os.environ[\"VECTOR_DB_WEAVIATE_URL\"]\n",
"WEAVIATE_API_KEY = os.environ[\"WEAVIATE_API_KEY\"]\n",
"\n",
"# Create a client to interact with Weaviate\n",
"client = weaviate.Client(url=WEAVIATE_URL, auth_client_secret=weaviate.AuthApiKey(WEAVIATE_API_KEY))\n",
"# Print the schemas in Weaviate; you should see the index from the previous notebook, named \"arxivcs_index\"\n",
"client.schema.get()"
]
},
{
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"### Set embedding parameters and LLM parameters"
]
},
{
"cell_type": "markdown",
"metadata": {
"nteract": {
"transient": {
"deleting": false
}
}
},
"source": [
"We are not embedding documents into Weaviate here, but we still need the embedding model to convert the prompt into a vector for the similarity search. Use the same embedding model that was used to write data to the vector database, because vectors produced by different models are not comparable. "
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1694446502048
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"#embedding parameters, should be the same as what was used to embed documents into weaviate\n",
"import openai\n",
"\n",
"openai.api_type = \"azure\"\n",
"openai.api_key = os.environ['openai_api_key']\n",
"openai.api_base = os.environ[\"AZURE_OPENAI_ENDPOINT\"]\n",
"openai.api_version = os.environ[\"AZURE_OPENAI_API_VERSION\"]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1694446502196
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"# LLM parameters\n",
"from langchain.chat_models import AzureChatOpenAI\n",
"\n",
"llm = AzureChatOpenAI(\n",
" openai_api_base=os.environ[\"AZURE_OPENAI_ENDPOINT\"],\n",
" openai_api_version=os.environ[\"AZURE_OPENAI_API_VERSION\"],\n",
" deployment_name=os.environ[\"AZURE_OPENAI_LLM_DEPLOYMENT\"],\n",
" openai_api_key=os.environ['openai_api_key'],\n",
" openai_api_type=\"azure\",\n",
")"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"gather": {
"logged": 1694446250628
},
"jupyter": {
"outputs_hidden": false,
"source_hidden": false
},
"nteract": {
"transient": {
"deleting": false
}
}
},
"outputs": [],
"source": [
"from langchain import PromptTemplate, LLMChain\n",
"\n",
"# Prompt\n",
"prompt = PromptTemplate.from_template(\n",
" \"Summarize the main themes in these retrieved docs: {docs}\"\n",
")\n",
"\n",
"# Chain\n",
"llm_chain = LLMChain(llm=llm, prompt=prompt)\n",
"\n",
"#Converting our input question to an Embedding to use for search\n",
"response = openai.Embedding.create(\n",
" input=\"What is Quantum mechanics?\",\n",
" engine=\"textembedding\"\n",
")\n",
"embeddings = response['data'][0]['embedding']\n",
"\n",
"# Run\n",
"db = Weaviate(client=client, index_name=\"arxivcs_index\", text_key=\"text\")\n",
"docs = db.similarity_search_by_vector(embedding=embeddings)\n",
"result = llm_chain(docs)\n",
"\n",
"# Output\n",
"result['text']"
]
}
],
"metadata": {
"kernel_info": {
"name": "python310-sdkv2"
},
"kernelspec": {
"display_name": "Python 3.10 - SDK v2",
"language": "python",
"name": "python310-sdkv2"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
},
"microsoft": {
"host": {
"AzureML": {
"notebookHasBeenCompleted": true
}
},
"ms_spell_check": {
"ms_spell_check_language": "en"
}
},
"nteract": {
"version": "nteract-front-end@1.0.0"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
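The final cell of `12-VectorDB_QA.ipynb` embeds the question and passes the vector to `similarity_search_by_vector`. As a rough illustration of what a vector similarity search does conceptually, here is a minimal pure-Python sketch, assuming cosine similarity as the metric. The names `cosine_similarity`, `top_k`, and `toy_store`, and the three-dimensional toy vectors, are hypothetical and are not part of the notebook, LangChain, or Weaviate. A real vector database typically uses an approximate index rather than this exhaustive scan.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        # A zero vector has no direction; treat it as dissimilar to everything.
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query_vec, store, k=2):
    """Return the texts of the k stored items most similar to query_vec.

    `store` is a list of (text, vector) pairs (a hypothetical stand-in
    for documents already embedded and loaded by the previous notebook).
    """
    scored = [(cosine_similarity(query_vec, vec), text) for text, vec in store]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy data: made-up 3-dimensional "embeddings" for illustration only.
toy_store = [
    ("quantum mechanics primer", [0.9, 0.1, 0.0]),
    ("graph neural networks", [0.1, 0.9, 0.1]),
    ("quantum error correction", [0.8, 0.2, 0.1]),
]
query = [1.0, 0.0, 0.0]  # stands in for the embedded question
results = top_k(query, toy_store, k=2)
print(results)
```

The sketch only works because the query and the stored vectors share one vector space, which is why the notebook asks for the same embedding model at query time as at load time.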
9 changes: 9 additions & 0 deletions CODE_OF_CONDUCT.md
@@ -0,0 +1,9 @@
# Microsoft Open Source Code of Conduct

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).

Resources:

- [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/)
- [Microsoft Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
- Contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with questions or concerns
14 changes: 14 additions & 0 deletions CONTRIBUTING.md
@@ -0,0 +1,14 @@
# Contributing

This project welcomes contributions and suggestions. Most contributions require you to
agree to a Contributor License Agreement (CLA) declaring that you have the right to,
and actually do, grant us the rights to use your contribution. For details, visit
https://cla.microsoft.com.

When you submit a pull request, a CLA-bot will automatically determine whether you need
to provide a CLA and decorate the PR appropriately (e.g., label, comment). Simply follow the
instructions provided by the bot. You will only need to do this once across all repositories using our CLA.

This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/).
For more information see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/)
or contact [opencode@microsoft.com](mailto:opencode@microsoft.com) with any additional questions or comments.
21 changes: 21 additions & 0 deletions LICENSE.txt
@@ -0,0 +1,21 @@
Copyright (c) Microsoft Corporation.

MIT License

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED *AS IS*, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.