PsTuts-RAG / pstuts_rag /CERT_SUBMISSION.md

mbudisic

Added cert narrative docs

07a933a 6 months ago


	# Task 1: Defining your Problem and Audience

	You are an AI Solutions Engineer.

	What problem do you want to solve? Who is it a problem for?

	<aside>
	📝

	Task 1: Articulate the problem and the user of your application

	Hints:

	- Create a list of potential questions that your user is likely to ask!
	- What is the user’s job title, and what is the part of their job function that you’re trying to automate?
	</aside>

	✅ Deliverables

	1. Write a succinct 1-sentence description of the problem
	2. Write 1-2 paragraphs on why this is a problem for your specific user

	<aside>
	⚠️

	If you cannot come up with a problem worth solving, use this one as a default.

	⚖️ Default Problem: Legal documents are too hard for the average person to understand

	Default Solution: Build a fine-tuned, agentic RAG application that can answer questions in simple language about a court case based on source documents and additional relevant retrieved information

	</aside>

	# Task 2: Propose a Solution

	Now that you’ve defined a problem and a user, there are many possible solutions.

	Choose one, and articulate it.

	<aside>
	📝

	Task 2: Articulate your proposed solution

	Hints:

	- Paint a picture of the “better world” that your user will live in. How will they save time, make money, or produce higher-quality output?
	- Recall the [LLM Application stack](https://a16z.com/emerging-architectures-for-llm-applications/) we’ve discussed at length
	</aside>

	✅ Deliverables

	1. Write 1-2 paragraphs on your proposed solution. How will it look and feel to the user?
	2. Describe the tools you plan to use in each part of your stack. Write one sentence on why you made each tooling choice.
	    1. LLM
	    2. Embedding Model
	    3. Orchestration
	    4. Vector Database
	    5. Monitoring
	    6. Evaluation
	    7. User Interface
	    8. (Optional) Serving & Inference
	3. Where will you use an agent or agents? What will you use “agentic reasoning” for in your app?

	# Task 3: Dealing with the Data

	You are an AI Systems Engineer. The AI Solutions Engineer has handed off the plan to you. Now you must identify some source data that you can use for your application.

	Assume that you’ll be doing at least RAG (e.g., over a PDF) combined with general agentic search (e.g., through a search API like [Tavily](https://tavily.com/) or [SerpApi](https://serpapi.com/)).

	Do you also plan to do fine-tuning or alignment? Should you collect data, use Synthetic Data Generation, or use an off-the-shelf dataset from [HF Datasets](https://huggingface.co/docs/datasets/en/index) or [Kaggle](https://www.kaggle.com/datasets)?

	<aside>
	📝

	Task 3: Collect data for (at least) RAG and choose (at least) one external API

	Hints:

	- Ask other real people (ideally the people you’re building for!) what they think.
	- What are the specific questions that your user is likely to ask of your application? Write these down.
	</aside>

	✅ Deliverables

	1. Describe all of your data sources and external APIs, and describe what you’ll use them for.
	2. Describe the default chunking strategy that you will use. Why did you make this decision?
	3. [Optional] Will you need specific data for any other part of your application? If so, explain.
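
For deliverable 2, one common default is fixed-size chunking with overlap, so that a sentence cut at a chunk boundary still appears whole in a neighboring chunk. The sketch below is a minimal, library-free illustration. The function name `chunk_text` and the default sizes (500 characters, 50 characters of overlap) are assumptions for illustration, not values this assignment prescribes.

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap.

    Hypothetical helper: the name and default sizes are illustrative only.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap  # how far each new window advances
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks
```

Character windows are the simplest baseline. Splitting on sentences, paragraphs, or transcript segments may suit some sources better, and that choice is worth justifying in the write-up.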

	# Task 4: Building a Quick End-to-End Prototype

	<aside>
	📝

	Task 4: Build an end-to-end RAG application using an industry-standard open-source stack and your choice of commercial off-the-shelf models

	</aside>

	✅ Deliverables

	1. Build an end-to-end prototype and deploy it to a Hugging Face Space (or other endpoint)

	# Task 5: Creating a Golden Test Data Set

	You are an AI Evaluation & Performance Engineer. The AI Systems Engineer who built the initial RAG system has asked for your help and expertise in creating a "Golden Data Set" for evaluation.

	<aside>
	📝

	Task 5: Generate a synthetic test data set to baseline an initial evaluation with RAGAS

	</aside>

	✅ Deliverables

	1. Assess your pipeline using the RAGAS framework, including the key metrics faithfulness, response relevance, context precision, and context recall. Provide a table of your results.
	2. What conclusions can you draw about the performance and effectiveness of your pipeline with this information?
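
The results table requested in deliverable 1 could be produced with a small helper like the one below. This is a sketch that assumes the evaluation scores have already been collected into a plain dictionary. The function name `results_table` is hypothetical, and the scores shown are placeholders, not measured results.

```python
def results_table(scores: dict[str, float]) -> str:
    """Render metric scores as a Markdown table (hypothetical helper)."""
    lines = ["| Metric | Score |", "| --- | --- |"]
    for name, value in scores.items():
        lines.append(f"| {name} | {value:.3f} |")
    return "\n".join(lines)


# Placeholder values only: replace with the scores from your own evaluation run.
placeholder_scores = {
    "faithfulness": 0.0,
    "response relevance": 0.0,
    "context precision": 0.0,
    "context recall": 0.0,
}
print(results_table(placeholder_scores))
```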

	# Task 6: Fine-Tuning Open-Source Embeddings

	You are a Machine Learning Engineer. The AI Evaluation and Performance Engineer has asked for your help to fine-tune the embedding model.

	<aside>
	📝

	Task 6: Generate synthetic fine-tuning data and complete fine-tuning of the open-source embedding model

	</aside>

	✅ Deliverables

	1. Swap out your existing embedding model for the new fine-tuned version. Provide a link to your fine-tuned embedding model on the Hugging Face Hub.
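
Synthetic fine-tuning data for an embedding model is often organized as (question, relevant chunk) pairs, with a portion held out for validation. The sketch below shows one way to make that split reproducibly using only the standard library. The function name `split_pairs`, the 10% validation fraction, and the seed are assumptions for illustration. The actual fine-tuning step depends on the training library you choose.

```python
import random


def split_pairs(
    pairs: list[tuple[str, str]],
    val_fraction: float = 0.1,
    seed: int = 42,
) -> tuple[list[tuple[str, str]], list[tuple[str, str]]]:
    """Shuffle (question, chunk) pairs and split them into train and validation.

    Hypothetical helper: the name, fraction, and seed are illustrative only.
    """
    shuffled = pairs[:]  # copy so the caller's list is left untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    # Hold out at least one pair whenever there is any data at all.
    n_val = max(1, int(len(shuffled) * val_fraction)) if shuffled else 0
    return shuffled[n_val:], shuffled[:n_val]
```

Keeping the validation pairs out of training gives a fairer comparison between the original and fine-tuned embedding models.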

	# Task 7: Assessing Performance

	You are the AI Evaluation & Performance Engineer. It's time to assess all options for this product.

	<aside>
	📝

	Task 7: Assess the performance of the fine-tuned agentic RAG application

	</aside>

	✅ Deliverables

	1. How does the performance compare to your original RAG application? Test the fine-tuned embedding model using the RAGAS framework to quantify any improvements. Provide results in a table.
	2. Articulate the changes that you expect to make to your app in the second half of the course. How will you improve your application?
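
To quantify improvements for deliverable 1, the baseline and fine-tuned scores can be placed side by side with a per-metric difference. The sketch below assumes both runs are available as dictionaries keyed by metric name. The function name `compare_runs` is hypothetical, and no real scores are implied.

```python
def compare_runs(baseline: dict[str, float], tuned: dict[str, float]) -> str:
    """Render a Markdown table of baseline vs. fine-tuned scores with deltas.

    Hypothetical helper: only metrics present in both runs are compared.
    """
    lines = [
        "| Metric | Baseline | Fine-tuned | Delta |",
        "| --- | --- | --- | --- |",
    ]
    for name, base in baseline.items():
        if name not in tuned:
            continue  # skip metrics that were not measured in both runs
        delta = tuned[name] - base
        lines.append(f"| {name} | {base:.3f} | {tuned[name]:.3f} | {delta:+.3f} |")
    return "\n".join(lines)
```

A signed delta column makes regressions as visible as gains, which helps when drawing conclusions about whether fine-tuning was worthwhile.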

	# Your Final Submission

	Please include the following in your final submission:

	1. A public (or otherwise shared) link to a GitHub repo that contains:
	    1. A 5-minute (OR LESS) Loom video of a live demo of your application that also describes the use case
	    2. A written document addressing each deliverable and answering each question
	    3. All relevant code
	2. A public (or otherwise shared) link to the final version of your public application on Hugging Face (or another platform)
	3. A public link to your fine-tuned embedding model on Hugging Face