---
license: cc-by-nc-4.0
---

# Precious3-GPT-Multi-Modal inference

Model inference runs on a Hugging Face Inference Endpoint.


## Definitions

- **Signature**: lists of up- and down-regulated genes

---

## Run generation step by step


### Step 1 - connect to endpoint
```python
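import requests

# Inference endpoint URL; replace hf_XXXX with your own Hugging Face access token.
API_URL = "https://cu2s6lgb4jew3tht.us-east-1.aws.endpoints.huggingface.cloud"
headers = {
    "Accept": "application/json",
    "Authorization": "Bearer hf_XXXX",
    "Content-Type": "application/json",
}

def query(payload):
    """POST the payload to the endpoint and return the decoded JSON response."""
    response = requests.post(API_URL, headers=headers, json=payload)
    return response.json()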
```

### Step 2 - create input for endpoint

```python
import json
with open('./generation-configs/meta2diff.json', 'r') as f:
    config_data = json.load(f)

# prepare sample
config_sample = {"inputs": config_data, "mode": "meta2diff", "parameters": {
    "temperature": 0.8,
    "top_p": 0.2,
    "top_k": 3550,
    "n_next_tokens": 50,
    "random_seed": 137
}}

```

### Expected input at Step 2.
```json
{
    "inputs": {
        "instruction": "disease2diff2disease", 
        "tissue": ["whole blood"],
        "age": 60,
        "cell": "", 
        "efo": "Orphanet_139399", 
        "datatype": "",
        "drug": "",
        "dose": "",
        "time": "",
        "case": "",
        "control": "",
        "dataset_type": "expression",
        "gender": "m",
        "species": "human",
        "up": [],
        "down": []
    }, 
    "mode": "meta2diff", 
    "parameters": {
        "temperature": 0.8, "top_p": 0.2, "top_k": 3550, "n_next_tokens": 50, "random_seed": 137
    }
}
```

### Step 3 - send request to endpoint
```python
output = query(config_sample)
```


### Output structure
```json
{
    "output": {
        "up": List, 
        "down": List
    },
    "mode": String, // Generation mode that was selected
    "message": "Done!",  // or an error message
    "input": String // Input prompt that was passed to the model

}
```
NOTE: If the selected ```mode``` generates compounds, the output also contains ```compounds: List```.
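
A minimal sketch of how the response described above could be unpacked. The helper `extract_signature` is hypothetical, and it assumes two things the structure above does not state outright: that any `message` other than `"Done!"` indicates a failure, and that `compounds`, when present, sits inside `output`. The gene symbols are placeholders.

```python
from typing import Any, Dict, List


def extract_signature(response: Dict[str, Any]) -> Dict[str, List[str]]:
    """Return the generated lists from a response shaped like the one above."""
    # Assumption: any message other than "Done!" signals an error.
    message = response.get("message")
    if message != "Done!":
        raise RuntimeError(f"Generation failed: {message!r}")

    output = response.get("output", {})
    result = {
        "up": list(output.get("up", [])),
        "down": list(output.get("down", [])),
    }
    # Assumption: compound-generating modes put `compounds` inside `output`.
    if "compounds" in output:
        result["compounds"] = list(output["compounds"])
    return result


# Hypothetical response with placeholder gene symbols, for illustration only.
example_response = {
    "output": {"up": ["GENE_A", "GENE_B"], "down": ["GENE_C"]},
    "mode": "meta2diff",
    "message": "Done!",
    "input": "...",
}
signature = extract_signature(example_response)
print(signature["up"], signature["down"])
```

In practice `example_response` would be replaced by the `output` value returned from `query(config_sample)`.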

---

## Generation Modes (`mode` in config)

Choose the appropriate mode based on your requirements:

1. **meta2diff**: Generate a signature from meta-data such as tissue, compound, gender, etc.
2. **diff2compound**: Predict compounds from a signature.
3. **meta2diff2compound**: Generate signatures from meta-data, then predict compounds from the generated signatures.

---


### Instruction (`inputs.instruction` in config)

You can use the following instructions (one or several at a time):

1. disease2diff2disease - generate signature for disease
2. compound2diff2compound - generate signature for compound
3. age_group2diff2age_group - generate signature for age group / predict age group based on signature


### Other meta-data (`inputs.` in config)

1. Age (```age```): in years for human, in days for macaque and mouse

The full list of available values for each meta-data item is in ```p3_entities_with_type.csv```.
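
As a rough sketch of the age units above, the hypothetical helper below converts an age given in years into the unit documented for each species. The conversion factor of 365.25 days per year and the species strings `"macaque"` and `"mouse"` are assumptions; check ```p3_entities_with_type.csv``` for the exact values the endpoint accepts.

```python
DAYS_PER_YEAR = 365.25  # assumption: average year length used for the conversion

# Assumption: species values spelled as in the prose above.
AGE_IN_DAYS_SPECIES = {"macaque", "mouse"}


def age_for_species(age_years: float, species: str) -> float:
    """Convert an age in years to the unit documented for the given species."""
    if species == "human":
        return age_years  # humans: age stays in years
    if species in AGE_IN_DAYS_SPECIES:
        return age_years * DAYS_PER_YEAR  # macaque and mouse: age in days
    raise ValueError(f"No documented age unit for species {species!r}")


print(age_for_species(60, "human"))  # 60
print(age_for_species(2, "mouse"))   # 730.5
```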



## Examples

The following examples specify every available configuration field. You can leave meta-data fields in the ```inputs``` section as an empty string (```""```) or an empty list (```[]```).

_Example 1_

To generate a signature for specific meta-data, use the following configuration. Note that the ```up``` and ```down``` fields are empty lists because they are what the model generates.

```json
{
    "inputs": {
        "instruction": "disease2diff2disease",
        "tissue": ["whole blood"],
        "age": "",
        "cell": "",
        "efo": "Orphanet_139399",
        "datatype": "",
        "drug": "",
        "dose": "",
        "time": "",
        "case": "",
        "control": "",
        "dataset_type": "expression",
        "gender": "m",
        "species": "human",
        "up": [],
        "down": []
    },
    "mode": "meta2diff",
    "parameters": {
        "temperature": 0.8,
        "top_p": 0.2,
        "top_k": 3550,
        "n_next_tokens": 50,
        "random_seed": 137
    }
}

```
Here the model is asked to generate a signature for a human male, in whole blood tissue, with disease Orphanet_139399.
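
A configuration like the one above can also be assembled in code. The sketch below is one possible approach, not part of the project: `build_config` is a hypothetical helper, and it assumes the modes and instructions listed in this README are the complete set. Field names and default parameters are copied from Example 1.

```python
import copy
from typing import Any, Dict, List, Optional, Union

# Assumption: the modes and instructions listed in this README are exhaustive.
VALID_MODES = {"meta2diff", "diff2compound", "meta2diff2compound"}
VALID_INSTRUCTIONS = {
    "disease2diff2disease",
    "compound2diff2compound",
    "age_group2diff2age_group",
}

# Field names copied from Example 1; every meta-data field starts empty.
EMPTY_INPUTS: Dict[str, Any] = {
    "instruction": "", "tissue": [], "age": "", "cell": "", "efo": "",
    "datatype": "", "drug": "", "dose": "", "time": "", "case": "",
    "control": "", "dataset_type": "", "gender": "", "species": "",
    "up": [], "down": [],
}
DEFAULT_PARAMETERS = {
    "temperature": 0.8, "top_p": 0.2, "top_k": 3550,
    "n_next_tokens": 50, "random_seed": 137,
}


def build_config(
    mode: str,
    instruction: Union[str, List[str]],
    parameters: Optional[Dict[str, Any]] = None,
    **meta: Any,
) -> Dict[str, Any]:
    """Build a request payload, rejecting unknown modes, instructions, fields."""
    if mode not in VALID_MODES:
        raise ValueError(f"Unknown mode: {mode!r}")
    instructions = [instruction] if isinstance(instruction, str) else list(instruction)
    unknown = set(instructions) - VALID_INSTRUCTIONS
    if unknown:
        raise ValueError(f"Unknown instruction(s): {sorted(unknown)}")
    extra = set(meta) - set(EMPTY_INPUTS)
    if extra:
        raise ValueError(f"Unknown meta-data field(s): {sorted(extra)}")

    inputs = copy.deepcopy(EMPTY_INPUTS)  # avoid sharing the default lists
    inputs.update(meta)
    inputs["instruction"] = instruction
    return {
        "inputs": inputs,
        "mode": mode,
        "parameters": {**DEFAULT_PARAMETERS, **(parameters or {})},
    }


# Rebuild Example 1 from its non-empty fields.
config = build_config(
    "meta2diff",
    "disease2diff2disease",
    tissue=["whole blood"],
    efo="Orphanet_139399",
    dataset_type="expression",
    gender="m",
    species="human",
)
print(config["inputs"]["efo"], config["mode"])
```

Validating names locally catches typos such as a misspelled mode before a request is sent.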


_Example 2_

Here the goal is to generate a signature for a healthy human male, aged 40-50 years, in whole blood tissue.
```json
{
    "inputs": {
        "instruction": ["disease2diff2disease", "age_group2diff2age_group"],
        "tissue": ["whole blood"],
        "age": "",
        "cell": "",
        "efo": "",
        "datatype": "",
        "drug": "",
        "dose": "",
        "time": "",
        "case": "40.0-50.0",
        "control": "",
        "dataset_type": "expression",
        "gender": "m",
        "species": "human",
        "up": [],
        "down": []
    },
    "mode": "meta2diff",
    "parameters": {
        "temperature": 0.8,
        "top_p": 0.2,
        "top_k": 3550,
        "n_next_tokens": 50,
        "random_seed": 137
    }
}

```
Note that the ```disease2diff2disease``` instruction is used here even though the signature is for a healthy human, which is why ```efo``` is set to the empty string ```""```.
The second instruction is optional: the configuration above passes both as ```"instruction": ["disease2diff2disease", "age_group2diff2age_group"]```, but ```disease2diff2disease``` alone can also be used.

---

## Multi-Modality
Multi-modality applies by default in tasks where you pass a signature. For each gene in the up- and down-lists, the model gets embeddings from the Knowledge Graph and Text NNs. The embeddings are then averaged to obtain one embedding per modality for each gene list (2 modalities × 2 gene lists = 4 averaged embeddings in total).
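
The averaging step can be illustrated with a toy sketch. This is not the model's actual implementation: the function names, the 2-dimensional vectors, and the gene symbols are invented for illustration, and a plain arithmetic mean is assumed.

```python
from typing import Dict, List, Sequence

Vector = List[float]


def mean_embedding(vectors: Sequence[Vector]) -> Vector:
    """Element-wise arithmetic mean of equally sized vectors."""
    if not vectors:
        raise ValueError("Cannot average an empty gene list")
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def average_per_modality(
    genes: Sequence[str], tables: Dict[str, Dict[str, Vector]]
) -> Dict[str, Vector]:
    """Average the embeddings of `genes` separately for each modality."""
    return {
        modality: mean_embedding([table[gene] for gene in genes])
        for modality, table in tables.items()
    }


# Toy per-gene embeddings for the two modalities (all values invented).
tables = {
    "knowledge_graph": {"GENE_A": [1.0, 0.0], "GENE_B": [3.0, 2.0], "GENE_C": [0.0, 4.0]},
    "text": {"GENE_A": [0.0, 2.0], "GENE_B": [2.0, 2.0], "GENE_C": [1.0, 1.0]},
}
up, down = ["GENE_A", "GENE_B"], ["GENE_C"]

# 2 gene lists x 2 modalities = 4 averaged embeddings.
averaged = {
    "up": average_per_modality(up, tables),
    "down": average_per_modality(down, tables),
}
print(averaged["up"]["knowledge_graph"])  # [2.0, 1.0]
```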