Christopher Glaze committed
Commit dfc3b71 · 1 Parent(s): c894c5e

Update readme

Files changed (2):
  1. README.md +11 -5
  2. curating_model_eval.png +0 -0
README.md CHANGED
@@ -25,16 +25,22 @@ The instruction classification schema is based on prior work in large language m
 # Model evaluation
 Model response quality scores were evaluated with double-blind A/B testing that compared dataset responses against responses generated by ChatGPT (version 3.5 turbo). Our evaluation confirmed that the response quality score predicted preference for the dataset response over ChatGPT's:
 
-<center>
-<img src="curating_model_eval.png" width="300"/>
-</center>
+| Model response score | Win rate over ChatGPT |
+| ----------- | ----------- |
+| 0-0.25 | 0.25 |
+| 0.25-0.5 | 0.28 |
+| 0.5-0.75 | 0.43 |
+| 0.75-1.0 | 0.47 |
 
 # Usage
 The model accepts either a single dict or a list of dicts as input. Each dict needs an ```instruction``` field at a bare minimum (in which case the model simply classifies the instruction). If a ```response``` field is included, a response quality score is also returned. Users can also provide a ```dataset``` field, which changes model predictions only if it matches one of the sources we trained on: dolly, helpful-instructions, or open-assistant (otherwise it can be left blank).
 
 ## Example
-Input:
+Input:
+<br>
 ```{'instruction': 'What are ways I can stay energized throughout the day?', 'response': 'Drink lots of coffee!'}```
-
+<br>
+<br>
 Model output:
+<br>
 ```{'instruction class': 'brainstorming', 'instruction class confidence': 0.9683452, 'response quality': 0.08076164}```
curating_model_eval.png DELETED
Binary file (64.7 kB)
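
The updated Usage section describes the dict-in/dict-out interface but not how to load the model and obtain a prediction end to end. Below is a minimal sketch, assuming the model is shipped as a joblib artifact in this repo and that the loaded object exposes a scikit-learn-style ```predict``` method; the repo id, the filename ```model.joblib```, and the method name are illustrative assumptions, not confirmed by this commit.

```python
# Hypothetical usage sketch. Assumptions (not confirmed by this commit):
# the repo ships a joblib artifact named "model.joblib", and the loaded
# object exposes a predict() method that accepts a dict or a list of
# dicts as described in the README's Usage section.
import joblib
from huggingface_hub import hf_hub_download

# Placeholders: substitute the actual repo id and artifact filename.
model_path = hf_hub_download(repo_id="<org>/<model>", filename="model.joblib")
model = joblib.load(model_path)

# 'instruction' is required; adding 'response' also yields a quality
# score; 'dataset' is optional (dolly, helpful-instructions, open-assistant).
example = {
    "instruction": "What are ways I can stay energized throughout the day?",
    "response": "Drink lots of coffee!",
}
print(model.predict(example))
# Expected output shape, per the README's example:
# {'instruction class': 'brainstorming',
#  'instruction class confidence': 0.9683452,
#  'response quality': 0.08076164}
```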