wanglingfeng committed on
Commit 6200954 · 1 Parent(s): ca284c7
ALTo.pth ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:320057d9b06177a73b3cc7cdba8a090149a7e41b2071f5eb3f799578bec706e6
+ size 3853850831
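The `ALTo.pth` weights are not stored in the diff itself: what was committed is a Git LFS pointer file, whose three `key value` lines identify the real blob by SHA-256 and size (about 3.85 GB). A minimal sketch of reading such a pointer, using the exact pointer text from this commit; the `parse_lfs_pointer` helper name is our own, not part of Git LFS tooling:

```python
# Parse a Git LFS pointer file (three "key value" lines) into a dict.
def parse_lfs_pointer(text: str) -> dict:
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields

# Pointer contents copied verbatim from the ALTo.pth diff above.
pointer = """\
version https://git-lfs.github.com/spec/v1
oid sha256:320057d9b06177a73b3cc7cdba8a090149a7e41b2071f5eb3f799578bec706e6
size 3853850831
"""

info = parse_lfs_pointer(pointer)
```

Cloning the repo without LFS installed yields only this pointer text; `git lfs pull` (or the Hub's download tooling) resolves the `oid` to the actual checkpoint.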
README.md CHANGED
@@ -1,3 +1,7 @@
 ---
 license: apache-2.0
+ base_model:
+ - OpenGVLab/InternVL2_5-8B
+ pipeline_tag: mask-generation
+ ---
+ <div align="center">
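The README change above adds YAML front matter that the Hub reads as model-card metadata (`base_model`, `pipeline_tag`). A sketch of pulling those fields out of the block shown in the diff; the hand-rolled parser handles only the simple `key: value` / `- item` subset used here (the Hub itself uses a full YAML parser), and `parse_front_matter` is our own name:

```python
# Extract the model-card front matter added in this commit. The YAML text is
# copied from the README diff above.
FRONT_MATTER = """\
license: apache-2.0
base_model:
- OpenGVLab/InternVL2_5-8B
pipeline_tag: mask-generation
"""

def parse_front_matter(text: str) -> dict:
    meta, current_key = {}, None
    for line in text.splitlines():
        if line.startswith("- ") and current_key:
            # list item belonging to the most recent bare key
            meta.setdefault(current_key, []).append(line[2:].strip())
        else:
            key, _, value = line.partition(":")
            current_key = key.strip()
            if value.strip():
                meta[current_key] = value.strip()
    return meta

meta = parse_front_matter(FRONT_MATTER)
```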
added_tokens.json ADDED
@@ -0,0 +1,1037 @@
+ {
+ "</box>": 92552,
+ "</img>": 92545,
+ "</quad>": 92548,
+ "</ref>": 92550,
+ "<ALTo_End>": 92554,
+ "<ALTo_Start>": 92553,
+ "<IMG_CONTEXT>": 92546,
+ "<TOK_0>": 92555,
+ [1,022 further entries elided here: the file lists every "<TOK_n>" for n = 0..1023, and each maps to ID 92555 + n, so the run covers "<TOK_0>" (92555) through "<TOK_1023>" (93578) with no gaps]
+ "<TOK_1023>": 93578,
+ "<box>": 92551,
+ "<img>": 92544,
+ "<quad>": 92547,
+ "<ref>": 92549
+ }
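The added vocabulary is mechanically regular: spot-checking entries from the diff (`<TOK_0>` = 92555, `<TOK_100>` = 92655, `<TOK_1000>` = 93555, `<TOK_1023>` = 93578) confirms that `<TOK_n>` maps to `92555 + n`, placed immediately after `<ALTo_End>` at 92554. The 1,024-token block can therefore be regenerated programmatically:

```python
# Rebuild the <TOK_n> portion of added_tokens.json from the pattern visible
# in the diff: <TOK_n> -> 92555 + n for n = 0..1023 (1,024 tokens placed
# right after the <ALTo_Start>/<ALTo_End> specials).
TOK_BASE = 92555
NUM_TOK = 1024

tok_ids = {f"<TOK_{n}>": TOK_BASE + n for n in range(NUM_TOK)}
```

With these 1,024 entries plus the 11 special tokens, the vocabulary ends at 93578, consistent with `"vocab_size": 93579` in `config.json`.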
config.json ADDED
@@ -0,0 +1,206 @@
+ {
+ "_commit_hash": null,
+ "_name_or_path": "/mnt/checkpoints/internvl_seg/0506_internvl_himtokv2_adaptive_exp_joint_train_ov/checkpoint-28000",
+ "architectures": [
+ "InternVLChatModel"
+ ],
+ "auto_map": {
+ "AutoConfig": "configuration_internvl_chat.InternVLChatConfig",
+ "AutoModel": "modeling_internvl_chat.InternVLChatModel",
+ "AutoModelForCausalLM": "modeling_internvl_chat.InternVLChatModel"
+ },
+ "downsample_ratio": 0.5,
+ "dynamic_image_size": true,
+ "force_image_size": 448,
+ "hidden_size": 4096,
+ "llm_config": {
+ "_attn_implementation_autoset": true,
+ "_name_or_path": "internlm/internlm2_5-7b-chat",
+ "add_cross_attention": false,
+ "architectures": [
+ "InternLM2ForCausalLM"
+ ],
+ "attn_implementation": "flash_attention_2",
+ "auto_map": {
+ "AutoConfig": "configuration_internlm2.InternLM2Config",
+ "AutoModel": "modeling_internlm2.InternLM2ForCausalLM",
+ "AutoModelForCausalLM": "modeling_internlm2.InternLM2ForCausalLM",
+ "AutoModelForSequenceClassification": "modeling_internlm2.InternLM2ForSequenceClassification"
+ },
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bias": false,
+ "bos_token_id": 1,
+ "chunk_size_feed_forward": 0,
+ "cross_attention_hidden_size": null,
+ "decoder_start_token_id": null,
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "early_stopping": false,
+ "encoder_no_repeat_ngram_size": 0,
+ "eos_token_id": 2,
+ "exponential_decay_length_penalty": null,
+ "finetuning_task": null,
+ "forced_bos_token_id": null,
+ "forced_eos_token_id": null,
+ "hidden_act": "silu",
+ "hidden_size": 4096,
+ "id2label": {
+ "0": "LABEL_0",
+ "1": "LABEL_1"
+ },
+ "initializer_range": 0.02,
+ "intermediate_size": 14336,
+ "is_decoder": false,
+ "is_encoder_decoder": false,
+ "label2id": {
+ "LABEL_0": 0,
+ "LABEL_1": 1
+ },
+ "length_penalty": 1.0,
+ "max_length": 20,
+ "max_position_embeddings": 32768,
+ "min_length": 0,
+ "model_type": "internlm2",
+ "no_repeat_ngram_size": 0,
+ "num_attention_heads": 32,
+ "num_beam_groups": 1,
+ "num_beams": 1,
+ "num_hidden_layers": 32,
+ "num_key_value_heads": 8,
+ "num_return_sequences": 1,
+ "output_attentions": false,
+ "output_hidden_states": false,
+ "output_scores": false,
+ "pad_token_id": 2,
+ "prefix": null,
+ "pretraining_tp": 1,
+ "problem_type": null,
+ "pruned_heads": {},
+ "remove_invalid_values": false,
+ "repetition_penalty": 1.0,
+ "return_dict": true,
+ "return_dict_in_generate": false,
+ "rms_norm_eps": 1e-05,
+ "rope_scaling": {
+ "factor": 2.0,
+ "type": "dynamic"
+ },
+ "rope_theta": 1000000,
+ "sep_token_id": null,
+ "suppress_tokens": null,
+ "task_specific_params": null,
+ "temperature": 1.0,
+ "tf_legacy_loss": false,
+ "tie_encoder_decoder": false,
+ "tie_word_embeddings": false,
+ "tokenizer_class": null,
+ "top_k": 50,
+ "top_p": 1.0,
+ "torch_dtype": "bfloat16",
+ "torchscript": false,
+ "transformers_version": "4.46.3",
+ "typical_p": 1.0,
+ "use_bfloat16": true,
+ "use_cache": false,
+ "vocab_size": 93579
+ },
+ "max_dynamic_patch": 1,
+ "min_dynamic_patch": 1,
+ "model_type": "internvl_chat",
+ "pad2square": false,
+ "ps_version": "v2",
+ "select_layer": -1,
+ "template": "internvl2_5",
+ "tie_word_embeddings": false,
+ "torch_dtype": "bfloat16",
+ "transformers_version": null,
+ "use_backbone_lora": 0,
+ "use_llm_lora": 0,
+ "use_thumbnail": true,
+ "vision_config": {
+ "_attn_implementation_autoset": true,
+ "_name_or_path": "",
+ "add_cross_attention": false,
+ "architectures": [
+ "InternVisionModel"
+ ],
+ "attention_dropout": 0.0,
+ "bad_words_ids": null,
+ "begin_suppress_tokens": null,
+ "bos_token_id": null,
+ "chunk_size_feed_forward": 0,
+ "cross_attention_hidden_size": null,
+ "decoder_start_token_id": null,
+ "diversity_penalty": 0.0,
+ "do_sample": false,
+ "drop_path_rate": 0.1,
138
+ "dropout": 0.0,
139
+ "early_stopping": false,
140
+ "encoder_no_repeat_ngram_size": 0,
141
+ "eos_token_id": null,
142
+ "exponential_decay_length_penalty": null,
143
+ "finetuning_task": null,
144
+ "forced_bos_token_id": null,
145
+ "forced_eos_token_id": null,
146
+ "hidden_act": "gelu",
147
+ "hidden_size": 1024,
148
+ "id2label": {
149
+ "0": "LABEL_0",
150
+ "1": "LABEL_1"
151
+ },
152
+ "image_size": 448,
153
+ "initializer_factor": 1.0,
154
+ "initializer_range": 0.02,
155
+ "intermediate_size": 4096,
156
+ "is_decoder": false,
157
+ "is_encoder_decoder": false,
158
+ "label2id": {
159
+ "LABEL_0": 0,
160
+ "LABEL_1": 1
161
+ },
162
+ "layer_norm_eps": 1e-06,
163
+ "length_penalty": 1.0,
164
+ "max_length": 20,
165
+ "min_length": 0,
166
+ "model_type": "intern_vit_6b",
167
+ "no_repeat_ngram_size": 0,
168
+ "norm_type": "layer_norm",
169
+ "num_attention_heads": 16,
170
+ "num_beam_groups": 1,
171
+ "num_beams": 1,
172
+ "num_channels": 3,
173
+ "num_hidden_layers": 24,
174
+ "num_return_sequences": 1,
175
+ "output_attentions": false,
176
+ "output_hidden_states": false,
177
+ "output_scores": false,
178
+ "pad_token_id": null,
179
+ "patch_size": 14,
180
+ "prefix": null,
181
+ "problem_type": null,
182
+ "pruned_heads": {},
183
+ "qk_normalization": false,
184
+ "qkv_bias": true,
185
+ "remove_invalid_values": false,
186
+ "repetition_penalty": 1.0,
187
+ "return_dict": true,
188
+ "return_dict_in_generate": false,
189
+ "sep_token_id": null,
190
+ "suppress_tokens": null,
191
+ "task_specific_params": null,
192
+ "temperature": 1.0,
193
+ "tf_legacy_loss": false,
194
+ "tie_encoder_decoder": false,
195
+ "tie_word_embeddings": true,
196
+ "tokenizer_class": null,
197
+ "top_k": 50,
198
+ "top_p": 1.0,
199
+ "torch_dtype": "bfloat16",
200
+ "torchscript": false,
201
+ "transformers_version": "4.46.3",
202
+ "typical_p": 1.0,
203
+ "use_bfloat16": true,
204
+ "use_flash_attn": true
205
+ }
206
+ }
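As a quick sanity check on the vision settings in the config above (`force_image_size` 448, vision `patch_size` 14, `downsample_ratio` 0.5), the number of visual tokens each image tile contributes to the LLM can be derived arithmetically. This is an illustrative sketch; the variable names are not from the repo:

```python
# Values taken from config.json above.
image_size = 448        # force_image_size / vision_config.image_size
patch_size = 14         # vision_config.patch_size
downsample_ratio = 0.5  # pixel-shuffle downsampling factor

patches_per_side = image_size // patch_size               # 448 / 14 = 32
tokens_before_shuffle = patches_per_side ** 2             # 32 * 32 = 1024
# Pixel shuffle reduces the token count by downsample_ratio along each axis.
tokens_per_tile = int(tokens_before_shuffle * downsample_ratio ** 2)
print(tokens_per_tile)  # → 256
```

So with `max_dynamic_patch` set to 1 here, each image costs 256 context tokens.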
generation_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "_from_model_config": true,
+   "eos_token_id": [
+     92542,
+     92543
+   ],
+   "transformers_version": "4.46.3"
+ }
model-00001-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f28282abf82a558e8f6e319610e68f676731b626da62fd0687f32205f36d5e98
+ size 3992013192
model-00002-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:b64d9e9875ce6006ac2559b15df439b8cdeeb2c4c57172f1fb9111da3cadf0a5
+ size 3976355888
model-00003-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:ad03429cdddba87f6a79a22503a8accf7948e315e1a33bb3300f0f76895df11b
+ size 3959578704
model-00004-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:bdcef2063296f11ba3b082db83f6af7b3c59938ebf304203e3365607a1ae5de2
+ size 3405921192
model-00005-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:d4af41d52432d8be12907d0ff351814f6dd0d4727371a2d00d58c610262fdc00
+ size 2748025063
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
preprocessor_config.json ADDED
@@ -0,0 +1,19 @@
+ {
+   "crop_size": 448,
+   "do_center_crop": true,
+   "do_normalize": true,
+   "do_resize": true,
+   "feature_extractor_type": "CLIPFeatureExtractor",
+   "image_mean": [
+     0.485,
+     0.456,
+     0.406
+   ],
+   "image_std": [
+     0.229,
+     0.224,
+     0.225
+   ],
+   "resample": 3,
+   "size": 448
+ }
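The `image_mean`/`image_std` values above are the standard ImageNet statistics, applied per RGB channel after rescaling pixels to [0, 1]. A minimal sketch of that normalization step (a hypothetical helper, not code from this repo):

```python
# ImageNet channel statistics from preprocessor_config.json above.
mean = [0.485, 0.456, 0.406]
std = [0.229, 0.224, 0.225]

def normalize_pixel(rgb):
    """Apply the per-channel (x - mean) / std normalization to one RGB pixel
    whose values are already rescaled to the [0, 1] range."""
    return [(c - m) / s for c, m, s in zip(rgb, mean, std)]

# A pixel exactly at the channel means maps to zero in every channel.
print(normalize_pixel([0.485, 0.456, 0.406]))  # → [0.0, 0.0, 0.0]
```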
special_tokens_map.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "additional_special_tokens": [
+     "<|im_start|>",
+     "<|im_end|>",
+     "<|action_start|>",
+     "<|action_end|>",
+     "<|interpreter|>",
+     "<|plugin|>",
+     "<img>",
+     "</img>",
+     "<IMG_CONTEXT>",
+     "<quad>",
+     "</quad>",
+     "<ref>",
+     "</ref>",
+     "<box>",
+     "</box>"
+   ],
+   "bos_token": {
+     "content": "<s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eos_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "pad_token": {
+     "content": "</s>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenization_internlm2.py ADDED
@@ -0,0 +1,235 @@
+ # Copyright (c) The InternLM team and The HuggingFace Inc. team. All rights reserved.
+ #
+ # This code is based on transformers/src/transformers/models/llama/tokenization_llama.py
+ #
+ # Licensed under the Apache License, Version 2.0 (the "License");
+ # you may not use this file except in compliance with the License.
+ # You may obtain a copy of the License at
+ #
+ #     http://www.apache.org/licenses/LICENSE-2.0
+ #
+ # Unless required by applicable law or agreed to in writing, software
+ # distributed under the License is distributed on an "AS IS" BASIS,
+ # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ # See the License for the specific language governing permissions and
+ # limitations under the License.
+
+ """Tokenization classes for InternLM."""
+ import os
+ from shutil import copyfile
+ from typing import Any, Dict, List, Optional, Tuple
+
+ import sentencepiece as spm
+ from transformers.tokenization_utils import PreTrainedTokenizer
+ from transformers.utils import logging
+
+ logger = logging.get_logger(__name__)
+
+ VOCAB_FILES_NAMES = {'vocab_file': './tokenizer.model'}
+
+ PRETRAINED_VOCAB_FILES_MAP = {}
+
+
+ # Modified from transformers.model.llama.tokenization_llama.LlamaTokenizer
+ class InternLM2Tokenizer(PreTrainedTokenizer):
+     """
+     Construct an InternLM2 tokenizer. Based on byte-level Byte-Pair-Encoding.
+
+     Args:
+         vocab_file (`str`):
+             Path to the vocabulary file.
+     """
+
+     vocab_files_names = VOCAB_FILES_NAMES
+     pretrained_vocab_files_map = PRETRAINED_VOCAB_FILES_MAP
+     model_input_names = ['input_ids', 'attention_mask']
+     _auto_class = 'AutoTokenizer'
+
+     def __init__(
+         self,
+         vocab_file,
+         unk_token='<unk>',
+         bos_token='<s>',
+         eos_token='</s>',
+         pad_token='</s>',
+         sp_model_kwargs: Optional[Dict[str, Any]] = None,
+         add_bos_token=True,
+         add_eos_token=False,
+         decode_with_prefix_space=False,
+         clean_up_tokenization_spaces=False,
+         **kwargs,
+     ):
+         self.sp_model_kwargs = {} if sp_model_kwargs is None else sp_model_kwargs
+         self.vocab_file = vocab_file
+         self.add_bos_token = add_bos_token
+         self.add_eos_token = add_eos_token
+         self.decode_with_prefix_space = decode_with_prefix_space
+         self.sp_model = spm.SentencePieceProcessor(**self.sp_model_kwargs)
+         self.sp_model.Load(vocab_file)
+         self._no_prefix_space_tokens = None
+         super().__init__(
+             bos_token=bos_token,
+             eos_token=eos_token,
+             unk_token=unk_token,
+             pad_token=pad_token,
+             clean_up_tokenization_spaces=clean_up_tokenization_spaces,
+             **kwargs,
+         )
+
+     @property
+     def no_prefix_space_tokens(self):
+         if self._no_prefix_space_tokens is None:
+             vocab = self.convert_ids_to_tokens(list(range(self.vocab_size)))
+             self._no_prefix_space_tokens = {i for i, tok in enumerate(vocab) if not tok.startswith('▁')}
+         return self._no_prefix_space_tokens
+
+     @property
+     def vocab_size(self):
+         """Returns vocab size"""
+         return self.sp_model.get_piece_size()
+
+     @property
+     def bos_token_id(self) -> Optional[int]:
+         return self.sp_model.bos_id()
+
+     @property
+     def eos_token_id(self) -> Optional[int]:
+         return self.sp_model.eos_id()
+
+     def get_vocab(self):
+         """Returns vocab as a dict"""
+         vocab = {self.convert_ids_to_tokens(i): i for i in range(self.vocab_size)}
+         vocab.update(self.added_tokens_encoder)
+         return vocab
+
+     def _tokenize(self, text):
+         """Returns a tokenized string."""
+         return self.sp_model.encode(text, out_type=str)
+
+     def _convert_token_to_id(self, token):
+         """Converts a token (str) to an id using the vocab."""
+         return self.sp_model.piece_to_id(token)
+
+     def _convert_id_to_token(self, index):
+         """Converts an index (integer) to a token (str) using the vocab."""
+         token = self.sp_model.IdToPiece(index)
+         return token
+
+     def _maybe_add_prefix_space(self, tokens, decoded):
+         if tokens and tokens[0] not in self.no_prefix_space_tokens:
+             return ' ' + decoded
+         else:
+             return decoded
+
+     def convert_tokens_to_string(self, tokens):
+         """Converts a sequence of tokens (strings) into a single string."""
+         current_sub_tokens = []
+         out_string = ''
+         prev_is_special = False
+         for token in tokens:
+             # make sure that special tokens are not decoded using sentencepiece model
+             if token in self.all_special_tokens:
+                 if not prev_is_special:
+                     out_string += ' '
+                 out_string += self.sp_model.decode(current_sub_tokens) + token
+                 prev_is_special = True
+                 current_sub_tokens = []
+             else:
+                 current_sub_tokens.append(token)
+                 prev_is_special = False
+         out_string += self.sp_model.decode(current_sub_tokens)
+         out_string = self.clean_up_tokenization(out_string)
+         out_string = self._maybe_add_prefix_space(tokens=tokens, decoded=out_string)
+         return out_string[1:]
+
+     def save_vocabulary(self, save_directory, filename_prefix: Optional[str] = None) -> Tuple[str]:
+         """
+         Save the vocabulary and special tokens file to a directory.
+
+         Args:
+             save_directory (`str`):
+                 The directory in which to save the vocabulary.
+
+         Returns:
+             `Tuple(str)`: Paths to the files saved.
+         """
+         if not os.path.isdir(save_directory):
+             logger.error(f'Vocabulary path ({save_directory}) should be a directory')
+             return
+         out_vocab_file = os.path.join(
+             save_directory, (filename_prefix + '-' if filename_prefix else '') + VOCAB_FILES_NAMES['vocab_file']
+         )
+
+         if os.path.abspath(self.vocab_file) != os.path.abspath(out_vocab_file) and os.path.isfile(self.vocab_file):
+             copyfile(self.vocab_file, out_vocab_file)
+         elif not os.path.isfile(self.vocab_file):
+             with open(out_vocab_file, 'wb') as fi:
+                 content_spiece_model = self.sp_model.serialized_model_proto()
+                 fi.write(content_spiece_model)
+
+         return (out_vocab_file,)
+
+     def build_inputs_with_special_tokens(self, token_ids_0, token_ids_1=None):
+         if self.add_bos_token:
+             bos_token_ids = [self.bos_token_id]
+         else:
+             bos_token_ids = []
+
+         output = bos_token_ids + token_ids_0
+
+         if token_ids_1 is not None:
+             output = output + token_ids_1
+
+         if self.add_eos_token:
+             output = output + [self.eos_token_id]
+
+         return output
+
+     def get_special_tokens_mask(
+         self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None, already_has_special_tokens: bool = False
+     ) -> List[int]:
+         """
+         Retrieve sequence ids from a token list that has no special tokens added. This method is called when adding
+         special tokens using the tokenizer `prepare_for_model` method.
+
+         Args:
+             token_ids_0 (`List[int]`):
+                 List of IDs.
+             token_ids_1 (`List[int]`, *optional*):
+                 Optional second list of IDs for sequence pairs.
+             already_has_special_tokens (`bool`, *optional*, defaults to `False`):
+                 Whether or not the token list is already formatted with special tokens for the model.
+
+         Returns:
+             `List[int]`: A list of integers in the range [0, 1]: 1 for a special token, 0 for a sequence token.
+         """
+         if already_has_special_tokens:
+             return super().get_special_tokens_mask(
+                 token_ids_0=token_ids_0, token_ids_1=token_ids_1, already_has_special_tokens=True
+             )
+
+         if token_ids_1 is None:
+             return [1] + ([0] * len(token_ids_0)) + [1]
+         return [1] + ([0] * len(token_ids_0)) + [1, 1] + ([0] * len(token_ids_1)) + [1]
+
+     def create_token_type_ids_from_sequences(
+         self, token_ids_0: List[int], token_ids_1: Optional[List[int]] = None
+     ) -> List[int]:
+         """
+         Create a mask from the two sequences passed to be used in a sequence-pair classification task. T5 does not make
+         use of token type ids, therefore a list of zeros is returned.
+
+         Args:
+             token_ids_0 (`List[int]`):
+                 List of IDs.
+             token_ids_1 (`List[int]`, *optional*):
+                 Optional second list of IDs for sequence pairs.
+
+         Returns:
+             `List[int]`: List of zeros.
+         """
+         eos = [self.eos_token_id]
+
+         if token_ids_1 is None:
+             return len(token_ids_0 + eos) * [0]
+         return len(token_ids_0 + eos + token_ids_1 + eos) * [0]
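The default special-token layout produced by `build_inputs_with_special_tokens` above (`add_bos_token=True`, `add_eos_token=False`, with BOS id 1 and EOS id 2 per the config) can be sketched standalone. This is an illustrative re-implementation, not code from the repo:

```python
def build_inputs(token_ids_0, token_ids_1=None, bos_id=1, eos_id=2,
                 add_bos=True, add_eos=False):
    # Mirrors InternLM2Tokenizer.build_inputs_with_special_tokens:
    # prepend BOS (by default), concatenate the pair, optionally append EOS.
    out = ([bos_id] if add_bos else []) + list(token_ids_0)
    if token_ids_1 is not None:
        out += list(token_ids_1)
    if add_eos:
        out += [eos_id]
    return out

print(build_inputs([10, 11, 12]))               # → [1, 10, 11, 12]
print(build_inputs([10], [20], add_eos=True))   # → [1, 10, 20, 2]
```

Note that, unlike LLaMA-style tokenizers, no separator is inserted between the two sequences of a pair; they are simply concatenated after the BOS token.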
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:f868398fc4e05ee1e8aeba95ddf18ddcc45b8bce55d5093bead5bbf80429b48b
+ size 1477754
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff