---
language:
- en
license: mit
pretty_name: PsTuts-RAG Q&A Dataset
size_categories:
- n<1K
tags:
- rag
- question-answering
- photoshop
- ragas
---
# PsTuts-RAG Q&A Dataset
This dataset contains question-answer pairs generated using [RAGAS](https://github.com/explodinggradients/ragas) from Photoshop tutorial video transcripts. It's designed for training and evaluating RAG (Retrieval-Augmented Generation) systems focused on Photoshop tutorials.
## Dataset Description
### Dataset Summary
The dataset contains 100 question-answer pairs about Photoshop usage, generated from video transcripts using RAGAS's knowledge graph and testset generation capabilities. The questions are written from the perspective of two user personas: Beginner Photoshop User and Photoshop Trainer.
### Dataset Creation
The dataset was created through the following process:
1. Loading transcripts from Photoshop tutorial videos
2. Building a knowledge graph using RAGAS with the following transformations:
   - Headlines extraction
   - Headline splitting
   - Summary extraction
   - Embedding extraction
   - Theme extraction
   - NER extraction
   - Similarity calculations
3. Generating synthetic question-answer pairs using different query synthesizers:
   - SingleHopSpecificQuerySynthesizer (80%)
   - MultiHopAbstractQuerySynthesizer (10%)
   - MultiHopSpecificQuerySynthesizer (10%)
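As a rough illustration of what the 80/10/10 synthesizer mix means for a 100-question testset, the sketch below works through the arithmetic in plain Python. `allocate_questions` is a hypothetical helper written for this card, not a RAGAS function, and RAGAS itself may sample synthesizers rather than assign exact counts, so the real per-synthesizer totals may differ slightly.

```python
from fractions import Fraction

# Synthesizer shares as stated in this card.
QUERY_DISTRIBUTION = {
    "SingleHopSpecificQuerySynthesizer": Fraction(80, 100),
    "MultiHopAbstractQuerySynthesizer": Fraction(10, 100),
    "MultiHopSpecificQuerySynthesizer": Fraction(10, 100),
}


def allocate_questions(distribution, testset_size):
    """Split testset_size across synthesizers in proportion to their share.

    Hypothetical helper for illustration only. Counts are rounded down,
    then any leftover questions go to the largest remainders.
    """
    if sum(distribution.values()) != 1:
        raise ValueError("shares must sum to 1")
    exact = {name: share * testset_size for name, share in distribution.items()}
    counts = {name: int(value) for name, value in exact.items()}
    leftover = testset_size - sum(counts.values())
    by_remainder = sorted(
        exact, key=lambda name: exact[name] - counts[name], reverse=True
    )
    for name in by_remainder[:leftover]:
        counts[name] += 1
    return counts


counts = allocate_questions(QUERY_DISTRIBUTION, testset_size=100)
for name, count in counts.items():
    print(f"{name}: {count}")
```

Exact fractions are used instead of floats so that the shares sum to exactly 1 and the rounding step stays predictable.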
### Languages
The dataset is in English.
## Dataset Structure
### Data Instances
Each instance in the dataset contains:
- `user_input`: A question about Photoshop
- `reference`: The reference answer
- Additional metadata from RAGAS generation
Example:
```json
{
  "user_input": "How can I use the Move tool to move many layers at once in Photoshop?",
  "reference": "If you have the Move tool selected in Photoshop, you can move multiple layers at once by selecting those layers in the Layers panel first, then dragging any of the selected layers with the Move tool."
}
```
### Data Fields
- `user_input`: String containing the question
- `reference`: String containing the reference answer
- Additional RAGAS metadata fields
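Since only `user_input` and `reference` are documented here, downstream code may want to check for those two fields and ignore the rest. The following is a minimal sketch of such a check; `to_qa_pair` and `REQUIRED_FIELDS` are hypothetical names introduced for illustration, and the sample record is the one shown in the example above.

```python
REQUIRED_FIELDS = ("user_input", "reference")


def to_qa_pair(record):
    """Return (question, reference_answer) from one dataset record.

    Raises ValueError if a documented field is missing or is not a
    non-empty string. Any additional RAGAS metadata keys are ignored.
    """
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if not isinstance(value, str) or not value.strip():
            raise ValueError(f"record lacks a non-empty string field: {field!r}")
    return record["user_input"], record["reference"]


# Record copied from the example in this card.
example = {
    "user_input": "How can I use the Move tool to move many layers at once in Photoshop?",
    "reference": (
        "If you have the Move tool selected in Photoshop, you can move multiple "
        "layers at once by selecting those layers in the Layers panel first, "
        "then dragging any of the selected layers with the Move tool."
    ),
}

question, answer = to_qa_pair(example)
print(question)
```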
### Data Splits
The dataset was generated from the test and dev splits of the original transcripts.
## Usage
This dataset can be used for:
- Fine-tuning RAG systems for Photoshop-related queries
- Evaluating RAG system performance on domain-specific (Photoshop) knowledge
- Benchmarking question-answering models in the design/creative software domain
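To make the evaluation use case concrete, here is a minimal sketch that scores a system's answers against the `reference` field. It uses token-overlap F1 as a simple stand-in metric, not one of the RAGAS metrics, and `answer_fn` is a hypothetical placeholder for whatever RAG system is being evaluated. The sample record and the canned answer are made up for the demonstration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def token_f1(prediction, reference):
    """Token-overlap F1 between a predicted and a reference answer."""
    pred_tokens = tokenize(prediction)
    ref_tokens = tokenize(reference)
    if not pred_tokens or not ref_tokens:
        # Two empty answers count as a match; one empty answer does not.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def evaluate(records, answer_fn):
    """Average token F1 of answer_fn(question) over all records."""
    scores = [
        token_f1(answer_fn(record["user_input"]), record["reference"])
        for record in records
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Made-up record and canned system answer, for demonstration only.
records = [
    {
        "user_input": "How do I move several layers at once?",
        "reference": "Select the layers in the Layers panel, then drag with the Move tool.",
    }
]


def canned_system(question):
    return "Select the layers, then drag them with the Move tool."


score = evaluate(records, canned_system)
print(f"mean token F1: {score:.2f}")
```

Token overlap rewards shared wording rather than correctness, so it is best treated as a quick sanity check before running a fuller evaluation.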
### Loading the Dataset
```python
from datasets import load_dataset

# Download the dataset from the Hugging Face Hub
dataset = load_dataset("mbudisic/pstuts_rag_qa")
```
## Additional Information
### Source Data
The source data consists of transcripts from Photoshop tutorial videos, processed and transformed into a knowledge graph using RAGAS.
### Personas Used in Generation
1. **Beginner Photoshop User**: Learning to complete simple tasks, use tools in Photoshop, and navigate the UI
2. **Photoshop Trainer**: Experienced trainer looking to develop step-by-step guides for Photoshop beginners
### Citation
If you use this dataset in your research, please cite:
```bibtex
@misc{pstuts_rag_qa,
  author = {Budisic, Marko},
  title = {PsTuts-RAG Q\&A Dataset},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/datasets/mbudisic/pstuts_rag_qa}}
}
```
### Contributions
Thanks to RAGAS for providing the framework to generate this dataset.