diff --git "a/task3_model_building/Research/model_Research_docsummaryindex.ipynb" "b/task3_model_building/Research/model_Research_docsummaryindex.ipynb" new file mode 100644--- /dev/null +++ "b/task3_model_building/Research/model_Research_docsummaryindex.ipynb" @@ -0,0 +1,731 @@ +{ + "cells": [ + { + "attachments": { + "dfbc8159-1f08-4664-8b3f-e37b030dbcfc.png": { + "image/png": "" + } + }, + "cell_type": "markdown", + "id": "bad29816-c1ce-4c8c-845a-5f6abcb2df3f", + "metadata": {}, + "source": [ + "![image.png](attachment:dfbc8159-1f08-4664-8b3f-e37b030dbcfc.png)" + ] + }, + { + "cell_type": "markdown", + "id": "86c50a6e-8fb7-466c-b5ab-1337caf066f5", + "metadata": {}, + "source": [ + "## Limitations of Existing Approaches\n", + "There are a few limitations of embedding retrieval using text chunks.\n", + "\n", + "Text chunks lack global context. Oftentimes the question requires context beyond what is indexed in a specific chunk.\n", + "The top-k / similarity score thresholds need careful tuning. Make the value too small and you’ll miss context. Make the value too big and cost/latency might increase with more irrelevant context.\n", + "Embeddings don’t always select the most relevant context for a question. The embeddings of the question and of the context are computed separately from each other.\n", + "Adding keyword filters is one way to enhance the retrieval results. But that comes with its own set of challenges. We would need to adequately determine the proper keywords for each document, either manually or through an NLP keyword extraction/topic tagging model. We would also need to adequately infer the proper keywords from the query." + ] + }, + { + "cell_type": "markdown", + "id": "8a7bca69-325a-4dad-bb07-66bbeddc8ed5", + "metadata": {}, + "source": [ + "## How It Works (Document Summary Index)\n", + "At build time, we ingest each document and use an LLM to extract a summary from each document. We also split the document up into text chunks (nodes). 
Both the summary and the nodes are stored within our Document Store abstraction. We maintain a mapping from the summary to the source document/nodes.\n", + "\n", + "At query time, we retrieve the documents relevant to the query based on their summaries, using one of the following approaches:\n", + "\n", + "LLM-based Retrieval: We present sets of document summaries to the LLM, and ask the LLM to determine which documents are relevant and to assign each a relevance score.\n", + "Embedding-based Retrieval: We retrieve relevant documents based on summary embedding similarity (with a top-k cutoff).\n", + "Note that this approach to retrieval over document summaries (even with the embedding-based approach) is different from embedding-based retrieval over text chunks. The retrieval classes for the document summary index retrieve all nodes for any selected document, instead of returning relevant chunks at the node level.\n", + "\n", + "Storing summaries for a document also enables LLM-based retrieval. Instead of feeding the entire document to the LLM at the outset, we can first have the LLM inspect the concise document summary to see if it’s relevant to the query at all. This leverages the reasoning capabilities of LLMs, which are more advanced than embedding-based lookup, but avoids the cost/latency of feeding the entire document to the LLM.\n", + "\n", + "### Additional Insights\n", + "Document retrieval with summaries can be thought of as a “middle ground” between semantic search and brute-force summarization across all docs. We look up documents based on summary relevance to the given query, and then return all *nodes* corresponding to the retrieved docs.\n", + "\n", + "Why should we do this? This retrieval method gives the user more context than top-k over text chunks, by retrieving context at the document level. But it’s also a more flexible/automatic approach than topic modeling; no more worrying about whether your text has the right keyword tags!" 
+ ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "13662caa-7ed4-498b-9820-e9719524aae1", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "4275600d-586e-4001-a458-d7f3cba3226e", + "metadata": {}, + "outputs": [], + "source": [ + "!pip install llama-index-llms-huggingface-api\n", + "!pip install llama-index-embeddings-huggingface\n", + "!pip install llama-index-llms-llama-cpp\n", + "!pip install llama-index\n", + "!pip install huggingface_hub\n", + "!pip install transformers\n", + "!pip install torch\n", + "!pip install gradio\n", + "!pip install llama-index-llms-huggingface\n", + "! pip install llama-index-llms-groq\n", + "!pip install llama-index-llms-gemini" + ] + }, + { + "cell_type": "code", + "execution_count": 4, + "id": "e489b70d-a252-481b-884b-318e8fc38186", + "metadata": {}, + "outputs": [ + { + "name": "stderr", + "output_type": "stream", + "text": [ + "/home/zeus/miniconda3/envs/cloudspace/lib/python3.10/site-packages/pydantic/_internal/_fields.py:161: UserWarning: Field \"model_id\" has conflict with protected namespace \"model_\".\n", + "\n", + "You may be able to resolve this warning by setting `model_config['protected_namespaces'] = ()`.\n", + " warnings.warn(\n" + ] + } + ], + "source": [ + "from llama_index.core import VectorStoreIndex, SimpleDirectoryReader\n", + "from llama_index.core.tools import QueryEngineTool, ToolMetadata\n", + "from llama_index.core.callbacks import CallbackManager, LlamaDebugHandler\n", + "from llama_index.core import ServiceContext, StorageContext\n", + "from llama_index.llms.huggingface import HuggingFaceLLM\n", + "from llama_index.llms.huggingface_api import HuggingFaceInferenceAPI\n", + "from llama_index.core import Settings\n", + "from llama_index.embeddings.huggingface import HuggingFaceEmbedding\n", + "from llama_index.core.retrievers import VectorIndexRetriever\n", + "from llama_index.core.query_engine import 
RetrieverQueryEngine\n", + "from llama_index.core.postprocessor import SimilarityPostprocessor\n", + "from llama_index.core import SimpleDirectoryReader, load_index_from_storage\n", + "import os\n", + "import nest_asyncio\n", + "import os\n", + "from huggingface_hub import login\n", + "\n", + "from llama_index.core import SimpleDirectoryReader, get_response_synthesizer\n", + "from llama_index.core import DocumentSummaryIndex\n", + "from llama_index.core.node_parser import SentenceSplitter" + ] + }, + { + "cell_type": "code", + "execution_count": 5, + "id": "b5dc3276-7f37-4284-9c83-3f699fd10b29", + "metadata": {}, + "outputs": [], + "source": [ + "import nest_asyncio\n", + "\n", + "nest_asyncio.apply()" + ] + }, + { + "cell_type": "code", + "execution_count": 6, + "id": "77b089e0-1418-40aa-a060-a123f9c1439b", + "metadata": {}, + "outputs": [], + "source": [ + "import logging\n", + "import sys\n", + "\n", + "logging.basicConfig(stream=sys.stdout, level=logging.WARNING)\n", + "logging.getLogger().addHandler(logging.StreamHandler(stream=sys.stdout))\n", + "\n", + "# # Uncomment if you want to temporarily disable logger\n", + "#logger = logging.getLogger()\n", + "#logger.disabled = True" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "d65a45d9-cd4e-44e5-b5c8-26aa42bdd54f", + "metadata": {}, + "outputs": [], + "source": [ + "#if you wann run on your local compute \n", + "\n", + "from llama_index.llms.llama_cpp import LlamaCPP\n", + "\n", + "def messages_to_prompt(messages):\n", + " prompt = \"\"\n", + " for message in messages:\n", + " if message.role == 'system':\n", + " prompt += f\"<|system|>\\n{message.content}\\n\"\n", + " elif message.role == 'user':\n", + " prompt += f\"<|user|>\\n{message.content}\\n\"\n", + " elif message.role == 'assistant':\n", + " prompt += f\"<|assistant|>\\n{message.content}\\n\"\n", + "\n", + " # ensure we start with a system prompt, insert blank if needed\n", + " if not prompt.startswith(\"<|system|>\\n\"):\n", + " 
prompt = \"<|system|>\\n\\n\" + prompt\n", + "\n", + " # add final assistant prompt\n", + " prompt = prompt + \"<|assistant|>\\n\"\n", + "\n", + " return prompt\n", + "\n", + "def completion_to_prompt(completion):\n", + " return f\"<|system|>\\n\\n<|user|>\\n{completion}\\n<|assistant|>\\n\"\n", + "\n", + "model_url = \"https://huggingface.co/TheBloke/zephyr-7B-beta-GGUF/resolve/main/zephyr-7b-beta.Q4_0.gguf\"\n", + "\n", + "llm = LlamaCPP(\n", + " model_url=model_url,\n", + " model_path=None,\n", + " temperature=0.1,\n", + " max_new_tokens=2000,\n", + " context_window= 32769,\n", + " generate_kwargs={},\n", + " messages_to_prompt=messages_to_prompt,\n", + " completion_to_prompt=completion_to_prompt,\n", + " verbose=True,\n", + ")\n", + "\n", + "response = llm.complete(\"Hello, how are you?\")\n", + "print(str(response))" + ] + }, + { + "cell_type": "code", + "execution_count": 7, + "id": "cfb1f727-f3fd-4142-ba8a-5790a69f9214", + "metadata": {}, + "outputs": [], + "source": [ + "##If you wish for groq =API\n", + "#from llama_index.llms.groq import Groq\n", + "\n", + "#Settings.llm = Groq(model=\"llama-3.1-70b-versatile\", api_key=\"gsk_..........\")" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "21caa939-247a-4272-af3b-221b16acc33f", + "metadata": {}, + "outputs": [ + { + "ename": "SyntaxError", + "evalue": "invalid syntax (2752512220.py, line 5)", + "output_type": "error", + "traceback": [ + "\u001b[0;36m Cell \u001b[0;32mIn[5], line 5\u001b[0;36m\u001b[0m\n\u001b[0;31m --------------------------------------------------------------------------\u001b[0m\n\u001b[0m ^\u001b[0m\n\u001b[0;31mSyntaxError\u001b[0m\u001b[0;31m:\u001b[0m invalid syntax\n" + ] + } + ], + "source": [ + "#--------------------------------------------------------------------------\n", + "#If you wish for huggingfaceAPI\n", + "#HF_TOKEN = \"hf_..............\" \n", + "#login(token=HF_TOKEN)\n", + "#os.environ['HuggingFace_API_TOKEN'] = HF_TOKEN\n", + "\n", + "# 
Settings.llm = HuggingFaceInferenceAPI(model_name = \"meta-llama/Meta-Llama-3-8B-Instruct\", token=HF_TOKEN)\n", + "\n", + "#-----------------------------------\n", + "#from llama_index.llms.gemini import Gemini\n", + "\n", + "#Settings.llm = Gemini(model=\"models/gemini-1.5-flash\", api_key=\"AI..............\")\n", + "#resp = Settings.llm.complete(\"Write a poem about a magic backpack\")\n", + "#print(resp)" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "a5019381-df68-400b-b8bc-c3f37f725d0f", + "metadata": {}, + "outputs": [], + "source": [ + "#from llama_index.llms.gemini import Gemini\n", + "\n", + "#Settings.llm = Gemini(model=\"models/gemini-1.5-flash\", api_key=\"AI....................\")\n", + "#resp = Settings.llm.complete(\"Write a poem about a magic backpack\")\n", + "#print(resp)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "2dfd52cd-efed-437a-8ea0-a0b59f4f1cad", + "metadata": {}, + "outputs": [], + "source": [ + "embed_model = HuggingFaceEmbedding(model_name=\"thenlper/gte-large\")\n", + "Settings.embed_model = embed_model" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "7d01a4b9-c203-498f-9c4e-cc79edc42ac3", + "metadata": {}, + "outputs": [], + "source": [ + "# LLM: whichever model is assigned to Settings.llm above\n", + "llm = Settings.llm\n", + "splitter = SentenceSplitter(chunk_size=4000)" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "3feb274f-bdd5-4bad-9960-3a8679e3366f", + "metadata": {}, + "outputs": [], + "source": [ + "# Skip this cell if you only want to load an already persisted index\n", + "# Define the directory containing the documents to index\n", + "reader = SimpleDirectoryReader(input_dir=\"./bill24\")\n", + "# Load documents with parallel processing\n", + "documents = reader.load_data(num_workers=4)\n", + "print(f\"Loaded {len(documents)} documents.\")\n", + "\n", + "# default mode of building the index\n", + "response_synthesizer = get_response_synthesizer(\n", + " response_mode=\"tree_summarize\", use_async=True\n", + ")\n", + 
"doc_summary_index = DocumentSummaryIndex.from_documents(\n", + " documents,\n", + " llm=llm,\n", + " transformations=[splitter],\n", + " response_synthesizer=response_synthesizer,\n", + " show_progress=True,\n", + ")\n", + "\n", + "doc_summary_index.storage_context.persist(\"index_bill24\")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 17, + "id": "d8cd673f-784c-4720-9ad4-1c968d7407b3", + "metadata": {}, + "outputs": [], + "source": [ + "\n", + "#from llama_index.core import load_index_from_storage\n", + "#from llama_index.core import StorageContext\n", + "# rebuild storage context\n", + "storage_context1 = StorageContext.from_defaults(persist_dir=\"./index_bill1\")\n", + "index_bill1 = load_index_from_storage(storage_context1)\n", + "\n", + "storage_context2 = StorageContext.from_defaults(persist_dir=\"./index_bill2\")\n", + "index_bill2 = load_index_from_storage(storage_context2)\n", + "\n", + "storage_context3 = StorageContext.from_defaults(persist_dir=\"./index_bill3\")\n", + "index_bill3 = load_index_from_storage(storage_context3)\n", + "\n", + "storage_context4 = StorageContext.from_defaults(persist_dir=\"./index_bill4\")\n", + "index_bill4 = load_index_from_storage(storage_context4)\n", + "\n", + "storage_context5 = StorageContext.from_defaults(persist_dir=\"./index_bill5\")\n", + "index_bill5 = load_index_from_storage(storage_context5)\n", + "\n", + "storage_context6 = StorageContext.from_defaults(persist_dir=\"./index_bill6\")\n", + "index_bill6 = load_index_from_storage(storage_context6)\n", + "\n", + "storage_context7 = StorageContext.from_defaults(persist_dir=\"./index_bill7\")\n", + "index_bill7 = load_index_from_storage(storage_context7)\n", + "\n", + "storage_context8 = StorageContext.from_defaults(persist_dir=\"./index_bill8\")\n", + "index_bill8 = load_index_from_storage(storage_context8)\n", + "\n", + "storage_context9 = StorageContext.from_defaults(persist_dir=\"./index_bill9\")\n", + "index_bill9 = 
load_index_from_storage(storage_context9)\n", + "\n", + "storage_context10 = StorageContext.from_defaults(persist_dir=\"./index_bill10\")\n", + "index_bill10 = load_index_from_storage(storage_context10)\n", + "\n", + "storage_context11 = StorageContext.from_defaults(persist_dir=\"./index_bill11\")\n", + "index_bill11 = load_index_from_storage(storage_context11)\n", + "\n", + "storage_context12 = StorageContext.from_defaults(persist_dir=\"./index_bill12\")\n", + "index_bill12 = load_index_from_storage(storage_context12)\n", + "\n", + "storage_context13 = StorageContext.from_defaults(persist_dir=\"./index_bill13\")\n", + "index_bill13 = load_index_from_storage(storage_context13)\n", + "\n", + "storage_context14 = StorageContext.from_defaults(persist_dir=\"./index_bill14\")\n", + "index_bill14 = load_index_from_storage(storage_context14)\n", + "\n", + "storage_context15 = StorageContext.from_defaults(persist_dir=\"./index_bill15\")\n", + "index_bill15 = load_index_from_storage(storage_context15)\n", + "\n", + "storage_context16 = StorageContext.from_defaults(persist_dir=\"./index_bill16\")\n", + "index_bill16 = load_index_from_storage(storage_context16)\n", + "\n", + "storage_context17 = StorageContext.from_defaults(persist_dir=\"./index_bill17\")\n", + "index_bill17 = load_index_from_storage(storage_context17)\n", + "\n", + "storage_context18 = StorageContext.from_defaults(persist_dir=\"./index_bill18\")\n", + "index_bill18 = load_index_from_storage(storage_context18)\n", + "\n", + "storage_context19 = StorageContext.from_defaults(persist_dir=\"./index_bill19\")\n", + "index_bill19 = load_index_from_storage(storage_context19)\n", + "\n", + "storage_context20 = StorageContext.from_defaults(persist_dir=\"./index_bill20\")\n", + "index_bill20 = load_index_from_storage(storage_context20)\n", + "\n", + "storage_context21 = StorageContext.from_defaults(persist_dir=\"./index_bill21\")\n", + "index_bill21 = load_index_from_storage(storage_context21)\n", + "\n", + 
"storage_context22 = StorageContext.from_defaults(persist_dir=\"./index_bill22\")\n", + "index_bill22 = load_index_from_storage(storage_context22)\n", + "\n", + "storage_context23 = StorageContext.from_defaults(persist_dir=\"./index_bill23\")\n", + "index_bill23 = load_index_from_storage(storage_context23)\n", + "\n", + "storage_context24 = StorageContext.from_defaults(persist_dir=\"./index_bill24\")\n", + "index_bill24 = load_index_from_storage(storage_context24)\n", + "\n", + "storage_context25 = StorageContext.from_defaults(persist_dir=\"./index_bill25\")\n", + "index_bill25 = load_index_from_storage(storage_context25)\n", + "\n", + "storage_context26 = StorageContext.from_defaults(persist_dir=\"./index_bill26\")\n", + "index_bill26 = load_index_from_storage(storage_context26)\n", + "\n", + "storage_context27 = StorageContext.from_defaults(persist_dir=\"./index_bill27\")\n", + "index_bill27 = load_index_from_storage(storage_context27)\n", + "\n", + "storage_context28 = StorageContext.from_defaults(persist_dir=\"./index_bill28\")\n", + "index_bill28 = load_index_from_storage(storage_context28)" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "d58e9c31-013f-46c2-b14b-24c4af3087bc", + "metadata": {}, + "outputs": [], + "source": [ + "from llama_index.core.retrievers import QueryFusionRetriever\n", + "\n", + "retriever = QueryFusionRetriever(\n", + " [index_bill1.as_retriever(),\n", + " index_bill2.as_retriever(),\n", + " index_bill3.as_retriever(),\n", + " index_bill4.as_retriever(),\n", + " index_bill5.as_retriever(),\n", + " index_bill6.as_retriever(),\n", + " index_bill7.as_retriever(),\n", + " index_bill8.as_retriever(),\n", + " index_bill9.as_retriever(),\n", + " index_bill10.as_retriever(),\n", + " index_bill11.as_retriever(),\n", + " index_bill12.as_retriever(),\n", + " index_bill13.as_retriever(),\n", + " index_bill14.as_retriever(),\n", + " index_bill15.as_retriever(),\n", + " index_bill16.as_retriever(),\n", + " 
index_bill17.as_retriever(),\n", + " index_bill18.as_retriever(),\n", + " index_bill19.as_retriever(),\n", + " index_bill20.as_retriever(),\n", + " index_bill21.as_retriever(),\n", + " index_bill22.as_retriever(),\n", + " index_bill23.as_retriever(),\n", + " index_bill24.as_retriever(),\n", + " index_bill25.as_retriever(),\n", + " index_bill26.as_retriever(),\n", + " index_bill27.as_retriever(),\n", + " index_bill28.as_retriever()],\n", + " similarity_top_k=4,\n", + " num_queries=1, # set this to 1 to disable query generation\n", + " use_async=True,\n", + " verbose=True,\n", + " query_gen_prompt= (\"\"\"\\\n", + "You are a helpful QA assistant. Using the context information provided below, \\\n", + "respond to the query accurately and comprehensively, relying solely on the context and not on any prior knowledge. \\\n", + "Ensure your response is clear and concise.\\\n", + "\"\"\")\n", + ")\n" + ] + }, + { + "cell_type": "code", + "execution_count": 15, + "id": "c663cff3-c79e-45d6-92d4-ba095594e3f4", + "metadata": {}, + "outputs": [], + "source": [ + "#model default prompt\n", + "\n", + "#QUERY_GEN_PROMPT = (\n", + "# \"You are a helpful assistant that generates multiple search queries based on a \"\n", + " # \"single input query. 
Generate {num_queries} search queries, one on each line, \"\n", + " # \"related to the following input query:\\n\"\n", + " # \"Query: {query}\\n\"\n", + " # \"Queries:\\n\"\n", + "# )" + ] + }, + { + "cell_type": "code", + "execution_count": 19, + "id": "8ca6ff3f-5782-4a84-9b7d-22fc2aa503e9", + "metadata": {}, + "outputs": [], + "source": [ + "# use retriever as part of a query engine\n", + "from llama_index.core.query_engine import RetrieverQueryEngine\n", + "\n", + "# configure response synthesizer\n", + "response_synthesizer = get_response_synthesizer(response_mode=\"tree_summarize\")\n", + "\n", + "# assemble query engine\n", + "query_engine = RetrieverQueryEngine(\n", + " retriever=retriever,\n", + " response_synthesizer=response_synthesizer,\n", + ")\n", + "\n", + "# query\n" + ] + }, + { + "cell_type": "code", + "execution_count": 20, + "id": "aa7ff231-b4cc-4891-b98b-3d708641136d", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "This document does not contain information about sports teams in Toronto. \n", + "\n" + ] + } + ], + "source": [ + "response = query_engine.query(\"What are the sports teams in Toronto?\")\n", + "print(response)" + ] + }, + { + "cell_type": "code", + "execution_count": 21, + "id": "27f397db-6fda-4296-a7c0-e3f0d569b8fa", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "This act authorizes the Minister of Health to make payments of up to $2.5 billion from the Consolidated Revenue Fund for expenses related to COVID-19 tests. It also allows the Minister to transfer COVID-19 tests and instruments used in relation to those tests to provinces, territories, and other bodies and persons in Canada. 
\n", + "\n" + ] + } + ], + "source": [ + "response = query_engine.query(\"explain An Act respecting certain measures related \"\n", + "\"to COVID-19\")\n", + "print(response)" + ] + }, + { + "cell_type": "code", + "execution_count": 22, + "id": "cdb26489-5ff7-4d52-aece-112f13519d09", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "The Online Streaming Act amends the Broadcasting Act to include online undertakings as a distinct class of broadcasting undertakings. It also specifies that the Act does not apply to programs uploaded to an online undertaking that provides a social media service by a user of the service, unless the programs are prescribed by regulation. The Act updates the broadcasting policy for Canada, enhancing the vitality of official language minority communities in Canada and fostering the full recognition and use of both English and French in Canadian society. It also provides the Commission with the power to require that persons carrying on broadcasting undertakings make expenditures to support the Canadian broadcasting system. \n", + "\n" + ] + } + ], + "source": [ + "response = query_engine.query(\"list all context\")\n", + "print(response)" + ] + }, + { + "cell_type": "code", + "execution_count": 23, + "id": "97c71c98-7ccb-4491-a84a-17f4776ccf3c", + "metadata": {}, + "outputs": [ + { + "data": { + "text/markdown": [ + "**`Final Response:`** The provided text discusses amendments to the Criminal Code, Firearms Act, Nuclear Safety and Control Act, Immigration and Refugee Protection Act, and An Act to amend certain Acts and Regulations in relation to firearms." 
+ ], + "text/plain": [ + "" + ] + }, + "metadata": {}, + "output_type": "display_data" + } + ], + "source": [ + "response = query_engine.query(\"list all titles that you can answer\")\n", + "\n", + "from llama_index.core.response.notebook_utils import display_response\n", + "\n", + "display_response(response)" + ] + }, + { + "cell_type": "code", + "execution_count": 16, + "id": "cc177e35-9444-4294-b853-3cac0e6d5a04", + "metadata": {}, + "outputs": [], + "source": [ + "#explore query_engines" + ] + }, + { + "cell_type": "code", + "execution_count": 18, + "id": "8d4bb416-55bb-4c1c-8663-a739d8e7fccc", + "metadata": {}, + "outputs": [ + { + "name": "stdout", + "output_type": "stream", + "text": [ + "4\n", + "None\n", + "Page 1 \n", + "First Session, Forty-fourth Parliament,\n", + "70-71 Elizabeth II, 2021-2022\n", + "STATUTES OF CANADA 2022\n", + "CHAPTER 2\n", + "An Act respecting certain measures related\n", + "to COVID-19\n", + "ASSENTED TO\n", + "MARCH 4, 2022\n", + "BILL C-10\n", + "\n", + "Page 2 \n", + "RECOMMENDATION\n", + "Her Excellency the Governor General recommends to the House\n", + "of Commons the appropriation of public revenue under the cir-\n", + "cumstances, in the manner and for the purposes set out in a\n", + "measure entitled “An Act respecting certain measures related to\n", + "COVID-19”.\n", + "SUMMARY\n", + "This enactment authorizes the Minister of Health to make pay-\n", + "ments of up to $2.5 billion out of the Consolidated Revenue Fund\n", + "in relation to coronavirus disease 2019 (COVID-19) tests.\n", + "It also authorizes that Minister to transfer COVID-19 tests and in-\n", + "struments used in relation to those tests to the provinces and\n", + "territories and to bodies and persons in Canada.\n", + "Available on the House of Commons website at the following address:\n", + "www.ourcommons.ca\n", + "2021-2022\n", + "\n", + "Page 3 \n", + "70-71 ELIZABETH II\n", + "CHAPTER 2\n", + "An Act respecting certain measures related to\n", + 
"COVID-19\n", + "[Assented to 4th March, 2022]\n", + "Her Majesty, by and with the advice and consent of\n", + "the Senate and House of Commons of Canada,\n", + "enacts as follows:\n", + "Payments out of C.R.F.\n", + "1 The Minister of Health may make payments, the total\n", + "of which may not exceed $2.5 billion, out of the Consoli-\n", + "dated Revenue Fund for any expenses incurred on or af-\n", + "ter January 1, 2022 in relation to coronavirus disease\n", + "2019 (COVID-19) tests.\n", + "Transfers\n", + "2 The Minister of Health may transfer to any province or\n", + "territory, or to any body or person in Canada, any coro-\n", + "navirus disease 2019 (COVID-19) tests or instruments\n", + "used in relation to those tests acquired by Her Majesty in\n", + "right of Canada on or after April 1, 2021.\n", + "Published under authority of the Speaker of the House of Commons\n", + "2021-2022\n", + "\n", + "Page 4 \n", + "Available on the House of Commons website\n", + "Disponible sur le site Web de la Chambre des com\n" + ] + } + ], + "source": [ + "retrieved_nodes = retriever.retrieve(\"explain covid\")\n", + "\n", + "print(len(retrieved_nodes))\n", + "\n", + "print(retrieved_nodes[0].score)\n", + "print(retrieved_nodes[0].node.get_text())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "a4efb2a7-1a80-4935-960f-2bc86a8d7629", + "metadata": {}, + "outputs": [], + "source": [ + "retrieved_nodes = retriever.retrieve(\"An Act to provide for the establishment of a national council for reconciliation\"\n", + "\"ASSENTED TO APRIL 30, 2024\"\n", + "\"BILL C-29\")\n", + "\n", + "print(len(retrieved_nodes))\n", + "\n", + "print(retrieved_nodes[0].score)\n", + "print(retrieved_nodes[0].node.get_text())" + ] + }, + { + "cell_type": "code", + "execution_count": null, + "id": "63a04ae1-a0ae-4df4-89d0-8a42a147aaf5", + "metadata": {}, + "outputs": [], + "source": [] + }, + { + "cell_type": "code", + "execution_count": null, + "id": 
"beb2308d-c418-403d-8826-0bbf3495a91f", + "metadata": {}, + "outputs": [], + "source": [] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.10.10" + } + }, + "nbformat": 4, + "nbformat_minor": 5 +}