eustlb HF Staff commited on
Commit
311e5b0
·
verified ·
1 Parent(s): d9499c3

Upload processor

Browse files
Files changed (4) hide show
  1. added_tokens.json +3 -0
  2. special_tokens_map.json +4 -7
  3. tokenizer_config.json +0 -0
  4. vocab.json +990 -988
added_tokens.json ADDED
@@ -0,0 +1,3 @@
 
 
 
 
1
+ {
2
+ "</s>": 1026
3
+ }
special_tokens_map.json CHANGED
@@ -1,9 +1,6 @@
1
  {
2
- "unk_token": {
3
- "content": "<unk>",
4
- "lstrip": false,
5
- "normalized": false,
6
- "rstrip": false,
7
- "single_word": false
8
- }
9
  }
 
1
  {
2
+ "bos_token": "<s>",
3
+ "eos_token": "</s>",
4
+ "pad_token": "<pad>",
5
+ "unk_token": "<unk>"
 
 
 
6
  }
tokenizer_config.json CHANGED
The diff for this file is too large to render. See raw diff
 
vocab.json CHANGED
@@ -1,1026 +1,1028 @@
1
  {
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
2
  "<unk>": 0,
3
- "▁t": 1,
4
- "▁th": 2,
5
- "▁a": 3,
6
- "▁i": 4,
7
- "▁the": 5,
8
- "re": 6,
9
- "▁w": 7,
10
- "▁s": 8,
11
- "▁o": 9,
12
- "in": 10,
13
- "at": 11,
14
- "er": 12,
15
- "ou": 13,
16
- "nd": 14,
17
- "▁c": 15,
18
- "▁b": 16,
19
- "▁h": 17,
20
- "on": 18,
21
- "▁m": 19,
22
- "▁f": 20,
23
- "ing": 21,
24
- "▁to": 22,
25
- "en": 23,
26
- "▁p": 24,
27
- "▁and": 25,
28
- "▁d": 26,
29
- "es": 27,
30
- "or": 28,
31
  "an": 29,
32
- "ll": 30,
33
- "▁y": 31,
34
- "▁l": 32,
35
- "ed": 33,
36
- "▁of": 34,
37
- "▁in": 35,
38
- "it": 36,
39
- "is": 37,
40
- "▁you": 38,
41
- "▁that": 39,
42
  "ar": 40,
43
- "▁g": 41,
44
- "▁n": 42,
 
 
 
 
 
45
  "as": 43,
46
- "om": 44,
47
- "▁it": 45,
48
- "ic": 46,
49
- "ve": 47,
50
- "▁e": 48,
51
- "▁wh": 49,
52
- "▁be": 50,
53
- "us": 51,
54
- "le": 52,
55
- "al": 53,
56
- "ion": 54,
57
- "ow": 55,
58
- "▁we": 56,
59
- "▁re": 57,
60
- "▁is": 58,
61
- "ut": 59,
62
- "ot": 60,
63
- "ent": 61,
64
- "▁on": 62,
65
- "et": 63,
66
- "▁ha": 64,
 
 
 
67
  "ay": 65,
68
- "ct": 66,
69
- "▁he": 67,
70
- "id": 68,
71
- "▁for": 69,
72
- "▁st": 70,
73
- "ver": 71,
74
- "ly": 72,
75
- "ro": 73,
76
- "ig": 74,
77
- "▁so": 75,
78
- "ld": 76,
79
- "▁this": 77,
80
- "ke": 78,
81
- "▁u": 79,
82
- "se": 80,
83
- "all": 81,
84
- "st": 82,
85
- "ur": 83,
86
  "ce": 84,
 
 
 
 
87
  "ch": 85,
88
- "im": 86,
89
- "ith": 87,
90
- "▁as": 88,
91
- "▁k": 89,
92
- "▁an": 90,
93
- "▁was": 91,
94
- "▁j": 92,
95
- "▁with": 93,
96
- "ir": 94,
97
- "▁go": 95,
98
- "ra": 96,
99
- "▁do": 97,
100
- "▁have": 98,
101
- "▁li": 99,
102
- "▁sh": 100,
103
- "▁se": 101,
104
- "▁they": 102,
105
- "▁are": 103,
106
- "am": 104,
107
- "ht": 105,
108
- "▁but": 106,
109
- "ation": 107,
110
- "▁not": 108,
111
- "th": 109,
112
- "▁r": 110,
113
- "ally": 111,
114
- "ad": 112,
115
- "ust": 113,
116
- "▁or": 114,
117
- "▁com": 115,
118
- "ould": 116,
119
- "▁can": 117,
120
- "ill": 118,
121
- "▁ne": 119,
122
- "ight": 120,
123
- "▁ch": 121,
124
- "▁de": 122,
125
- "▁con": 123,
126
- "▁at": 124,
127
- "▁mo": 125,
128
- "ant": 126,
129
- "oo": 127,
130
- "il": 128,
131
- "▁me": 129,
132
- "▁what": 130,
133
- "▁there": 131,
134
- "ter": 132,
135
- "pe": 133,
136
- "▁ab": 134,
137
- "▁su": 135,
138
- "ere": 136,
139
  "ck": 137,
140
- "▁pro": 138,
141
- "▁al": 139,
142
- "▁fr": 140,
143
- "▁kn": 141,
144
- "▁all": 142,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
145
  "ers": 143,
146
- "▁like": 144,
147
- "ge": 145,
148
- "▁ex": 146,
149
- "▁som": 147,
150
- "ul": 148,
151
- "▁your": 149,
152
- "▁v": 150,
153
- "pp": 151,
154
- "use": 152,
155
- "▁if": 153,
156
  "ess": 154,
157
- "ate": 155,
158
  "est": 156,
159
- "▁know": 157,
160
- "out": 158,
161
- "if": 159,
162
- "▁just": 160,
163
- "ment": 161,
164
- "qu": 162,
165
- "op": 163,
166
- "ain": 164,
167
- "▁one": 165,
168
- "ol": 166,
169
- "ri": 167,
170
- "art": 168,
171
- "very": 169,
172
- "▁wor": 170,
173
- "ive": 171,
174
- "ist": 172,
175
- "▁my": 173,
176
- "nt": 174,
177
- "ab": 175,
178
- "▁from": 176,
179
- "ort": 177,
180
- "▁ma": 178,
181
- "▁about": 179,
182
- "res": 180,
183
- "ity": 181,
184
- "▁out": 182,
185
- "▁bec": 183,
186
- "▁le": 184,
187
- "our": 185,
188
- "od": 186,
189
- "and": 187,
190
- "ink": 188,
191
- "ie": 189,
192
- "▁up": 190,
193
- "ind": 191,
194
- "os": 192,
195
- "un": 193,
196
- "ause": 194,
197
- "oug": 195,
198
- "um": 196,
199
- "▁some": 197,
200
- "▁int": 198,
201
- "▁by": 199,
202
- "▁pl": 200,
203
- "▁get": 201,
204
- "el": 202,
205
- "ard": 203,
206
- "▁when": 204,
207
- "▁don": 205,
208
  "her": 206,
209
- "▁will": 207,
210
- "▁us": 208,
211
- "▁would": 209,
212
- "ook": 210,
213
- "ies": 211,
 
 
 
 
 
 
 
 
 
214
  "ich": 212,
215
- "▁because": 213,
216
- "▁think": 214,
217
- "em": 215,
218
- "▁pe": 216,
219
- "▁his": 217,
220
- "ack": 218,
221
- "▁then": 219,
222
- "▁our": 220,
223
  "ide": 221,
224
- "▁tim": 222,
225
- "▁how": 223,
226
- "ven": 224,
227
- "▁tr": 225,
228
- "▁who": 226,
229
- "▁them": 227,
230
- "ure": 228,
231
- "▁ar": 229,
232
- "▁ye": 230,
233
- "▁more": 231,
234
- "▁going": 232,
235
- "ect": 233,
236
- "▁sa": 234,
237
- "▁cl": 235,
238
- "▁had": 236,
239
- "▁now": 237,
240
- "▁which": 238,
241
- "▁here": 239,
242
- "ous": 240,
243
- "▁their": 241,
244
- "▁tw": 242,
245
- "so": 243,
246
- "▁has": 244,
247
- "ud": 245,
248
- "▁co": 246,
249
- "▁ta": 247,
250
- "ound": 248,
251
- "▁were": 249,
252
- "ast": 250,
253
- "▁peop": 251,
254
- "ough": 252,
255
- "▁no": 253,
256
- "▁really": 254,
257
- "▁any": 255,
258
- "▁people": 256,
259
- "▁want": 257,
260
- "▁she": 258,
261
- "▁en": 259,
262
- "▁fa": 260,
263
- "▁te": 261,
264
- "ame": 262,
265
  "ine": 263,
266
- "▁qu": 264,
267
- "red": 265,
268
- "▁im": 266,
269
- "▁right": 267,
270
- "ther": 268,
271
- "▁act": 269,
272
- "▁thing": 270,
273
- "king": 271,
274
- "ose": 272,
275
- "▁ad": 273,
276
- "▁see": 274,
277
- "▁time": 275,
278
- "▁these": 276,
279
- "ci": 277,
280
- "one": 278,
281
- "▁say": 279,
282
- "▁also": 280,
283
- "▁fe": 281,
284
- "per": 282,
285
- "▁ag": 283,
286
- "▁man": 284,
287
- "ore": 285,
288
- "▁un": 286,
289
- "pt": 287,
290
- "▁her": 288,
291
- "▁look": 289,
292
- "ong": 290,
293
- "ice": 291,
294
- "▁very": 292,
295
- "ff": 293,
296
  "ions": 294,
297
- "▁comp": 295,
298
- "▁did": 296,
299
- "itt": 297,
300
- "▁well": 298,
301
- "▁other": 299,
302
- "iv": 300,
303
- "ase": 301,
304
- "ree": 302,
305
- "hing": 303,
306
- "▁lo": 304,
307
- "reat": 305,
308
- "▁cont": 306,
309
- "▁part": 307,
310
- "▁into": 308,
311
- "nder": 309,
312
- "▁been": 310,
313
- "are": 311,
314
- "▁am": 312,
315
- "ans": 313,
316
- "▁sp": 314,
317
- "▁two": 315,
318
- "ue": 316,
319
- "▁way": 317,
320
- "age": 318,
321
- "▁where": 319,
322
- "ite": 320,
323
- "▁dis": 321,
324
- "▁than": 322,
325
- "▁every": 323,
326
- "▁pr": 324,
327
- "▁po": 325,
328
- "ag": 326,
329
- "▁need": 327,
330
- "ach": 328,
331
- "iff": 329,
332
- "ence": 330,
333
- "pl": 331,
334
- "own": 332,
335
- "▁ac": 333,
336
- "ble": 334,
337
- "▁over": 335,
338
- "iz": 336,
339
- "▁work": 337,
340
- "▁res": 338,
341
- "▁make": 339,
342
- "▁could": 340,
343
- "▁off": 341,
344
- "ually": 342,
345
- "▁ro": 343,
346
- "▁back": 344,
347
- "able": 345,
348
  "ip": 346,
349
- "ry": 347,
350
- "▁him": 348,
351
- "▁cour": 349,
352
- "ber": 350,
353
- "▁pre": 351,
354
- "▁fir": 352,
355
- "▁spe": 353,
356
- "ap": 354,
357
- "ars": 355,
358
- "▁diff": 356,
359
  "ire": 357,
360
- "▁somet": 358,
361
- "▁imp": 359,
362
- "▁those": 360,
363
- "▁comm": 361,
364
- "ance": 362,
365
- "ick": 363,
366
- "▁even": 364,
367
- "ated": 365,
368
- "way": 366,
369
- "sel": 367,
370
- "▁let": 368,
371
- "▁br": 369,
372
- "ty": 370,
373
- "▁per": 371,
374
- "int": 372,
375
- "▁first": 373,
376
- "▁thr": 374,
377
- "▁under": 375,
378
- "ah": 376,
379
- "▁may": 377,
380
- "▁cou": 378,
381
- "▁new": 379,
382
- "ress": 380,
383
- "act": 381,
384
- "▁gr": 382,
385
- "ep": 383,
386
- "▁said": 384,
387
- "ations": 385,
388
- "▁good": 386,
389
- "ace": 387,
390
- "ass": 388,
391
- "▁does": 389,
392
- "orm": 390,
393
  "ish": 391,
394
- "▁af": 392,
395
- "ving": 393,
396
- "co": 394,
397
- "▁app": 395,
398
- "▁lot": 396,
399
- "▁things": 397,
400
- "▁tra": 398,
401
- "ittle": 399,
402
- "▁bl": 400,
403
- "▁little": 401,
404
- "▁mu": 402,
405
- "cess": 403,
406
- "fe": 404,
407
- "ome": 405,
408
- "▁inc": 406,
409
- "▁differe": 407,
410
- "ary": 408,
411
- "ical": 409,
412
- "▁only": 410,
413
- "ult": 411,
414
- "▁again": 412,
415
- "▁got": 413,
416
- "ens": 414,
417
- "▁gu": 415,
418
- "▁kind": 416,
419
- "▁much": 417,
420
- "ord": 418,
421
- "▁through": 419,
422
  "ition": 420,
423
- "ild": 421,
424
- "▁down": 422,
425
- "▁actually": 423,
426
- "▁something": 424,
427
- "ang": 425,
428
- "ru": 426,
429
- "ces": 427,
430
- "▁fl": 428,
431
- "ile": 429,
432
- "ater": 430,
433
- "▁ra": 431,
434
- "▁take": 432,
435
- "ict": 433,
436
- "ign": 434,
437
- "▁sc": 435,
438
- "vel": 436,
439
- "▁bet": 437,
440
- "▁tal": 438,
441
- "▁yeah": 439,
442
- "▁use": 440,
443
- "fore": 441,
444
- "▁bu": 442,
445
- "▁start": 443,
446
- "ory": 444,
447
- "be": 445,
448
- "▁day": 446,
449
- "wn": 447,
450
- "xt": 448,
451
- "ia": 449,
452
- "ak": 450,
453
- "▁after": 451,
454
- "▁should": 452,
455
- "▁fo": 453,
456
- "▁ho": 454,
457
- "▁hel": 455,
458
- "▁ind": 456,
459
- "▁uh": 457,
460
  "na": 458,
461
- "ial": 459,
462
- "other": 460,
463
- "▁ke": 461,
464
- "▁call": 462,
465
- "▁most": 463,
466
- "▁ok": 464,
467
- "▁different": 465,
468
- "▁em": 466,
469
- "ting": 467,
470
- "ple": 468,
471
- "▁being": 469,
472
- "▁bo": 470,
473
  "ning": 471,
474
- "▁too": 472,
475
- "ors": 473,
476
- "▁happ": 474,
477
- "ark": 475,
 
478
  "og": 476,
479
- "▁help": 477,
480
- "▁rem": 478,
481
- "du": 479,
482
- "ction": 480,
 
 
 
 
 
 
 
 
483
  "ood": 481,
484
- "▁ser": 482,
485
- "ether": 483,
486
- "ious": 484,
487
- "▁mean": 485,
488
- "▁many": 486,
489
- "▁court": 487,
490
- "▁bel": 488,
491
- "ade": 489,
492
- "▁la": 490,
493
- "ved": 491,
494
- "▁des": 492,
495
- "▁rec": 493,
496
- "▁jo": 494,
497
- "▁dec": 495,
498
- "ves": 496,
499
- "▁before": 497,
500
- "▁put": 498,
501
- "self": 499,
502
- "▁point": 500,
503
- "te": 501,
504
- "▁ev": 502,
505
- "form": 503,
506
- "ents": 504,
507
- "▁add": 505,
508
- "ody": 506,
509
- "thing": 507,
510
- "▁case": 508,
511
- "▁pers": 509,
512
- "▁cons": 510,
513
- "iss": 511,
514
- "▁three": 512,
515
- "oth": 513,
516
- "▁ph": 514,
517
- "▁come": 515,
518
- "▁find": 516,
519
- "▁why": 517,
520
- "ull": 518,
521
- "▁show": 519,
522
- "▁bas": 520,
523
- "▁great": 521,
524
- "ily": 522,
525
- "▁rel": 523,
526
- "▁sm": 524,
527
- "▁its": 525,
528
- "▁fact": 526,
529
- "▁pos": 527,
530
  "ool": 528,
531
- "ments": 529,
532
- "ise": 530,
533
- "nds": 531,
534
- "ys": 532,
535
- "▁try": 533,
536
- "ual": 534,
537
- "ful": 535,
538
- "erm": 536,
539
- "▁inter": 537,
540
- "ons": 538,
541
- "▁quest": 539,
542
- "▁sub": 540,
543
- "we": 541,
544
- "vers": 542,
545
- "▁supp": 543,
546
- "▁feel": 544,
547
- "▁same": 545,
548
- "ub": 546,
549
- "ates": 547,
550
- "urn": 548,
551
- "ert": 549,
552
- "▁inv": 550,
553
- "day": 551,
554
- "▁rep": 552,
555
- "igh": 553,
556
- "▁sy": 554,
557
- "▁inst": 555,
558
- "▁long": 556,
559
- "▁still": 557,
560
- "▁okay": 558,
561
- "ft": 559,
562
- "ific": 560,
563
- "atch": 561,
564
  "ought": 562,
565
- "ath": 563,
566
- "▁own": 564,
567
- "▁made": 565,
568
- "ix": 566,
569
- "ced": 567,
570
- "ks": 568,
571
- "lic": 569,
572
- "▁wr": 570,
573
- "de": 571,
574
- "▁cr": 572,
575
- "▁att": 573,
576
- "▁ob": 574,
577
- "▁world": 575,
578
- "▁sure": 576,
579
- "ward": 577,
580
- "▁bit": 578,
581
- "▁life": 579,
582
- "▁person": 580,
583
- "▁pres": 581,
584
- "ph": 582,
585
- "▁vide": 583,
586
- "▁reg": 584,
587
- "▁end": 585,
588
- "ject": 586,
589
- "ange": 587,
590
- "▁fin": 588,
591
- "ied": 589,
592
- "pect": 590,
593
- "▁didn": 591,
594
- "▁around": 592,
595
- "ian": 593,
596
- "▁car": 594,
597
- "ible": 595,
598
- "▁sim": 596,
599
- "ever": 597,
600
- "▁sch": 598,
601
- "ating": 599,
602
- "▁pol": 600,
603
- "▁set": 601,
604
- "▁oh": 602,
605
- "cy": 603,
606
- "▁real": 604,
607
- "▁import": 605,
608
- "▁count": 606,
609
- "▁um": 607,
610
- "▁next": 608,
611
- "cial": 609,
612
- "les": 610,
613
- "▁hu": 611,
614
- "▁acc": 612,
615
- "▁might": 613,
616
- "▁ent": 614,
617
- "▁doing": 615,
618
- "▁ins": 616,
619
- "▁gen": 617,
620
- "▁play": 618,
621
- "▁cle": 619,
622
- "▁another": 620,
623
- "ady": 621,
624
- "ular": 622,
625
- "ib": 623,
626
- "ways": 624,
627
- "ered": 625,
628
- "ility": 626,
629
- "ities": 627,
630
- "▁op": 628,
631
- "▁def": 629,
632
- "▁years": 630,
633
- "▁never": 631,
634
  "ower": 632,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
635
  "ram": 633,
636
- "▁tell": 634,
637
- "▁sl": 635,
638
- "onna": 636,
639
- "ail": 637,
 
640
  "ren": 638,
641
- "ute": 639,
642
- "▁gonna": 640,
643
- "▁big": 641,
644
- "▁give": 642,
645
- "der": 643,
646
- "ount": 644,
647
- "▁ap": 645,
648
- "kes": 646,
649
- "▁state": 647,
650
- "▁cor": 648,
651
- "▁min": 649,
652
- "ically": 650,
653
- "▁mon": 651,
654
- "▁fam": 652,
655
- "▁important": 653,
656
- "▁always": 654,
657
- "▁high": 655,
658
- "▁four": 656,
659
- "▁gra": 657,
660
- "▁ca": 658,
661
- "▁stud": 659,
662
- "▁dist": 660,
663
- "▁talk": 661,
664
- "▁num": 662,
665
- "▁str": 663,
666
- "▁today": 664,
667
- "ract": 665,
668
- "▁while": 666,
669
- "ason": 667,
670
- "▁iss": 668,
671
- "▁sur": 669,
672
- "▁char": 670,
673
- "▁last": 671,
674
- "oy": 672,
675
- "ited": 673,
676
- "▁exper": 674,
677
- "▁place": 675,
678
- "▁tri": 676,
679
- "▁ear": 677,
680
- "▁belie": 678,
681
- "▁able": 679,
682
- "▁underst": 680,
683
- "▁che": 681,
684
- "▁both": 682,
685
- "ug": 683,
686
- "▁doesn": 684,
687
- "▁keep": 685,
688
- "▁happen": 686,
689
- "ings": 687,
690
- "iew": 688,
691
- "ather": 689,
692
- "▁ass": 690,
693
- "▁love": 691,
694
- "ative": 692,
695
- "av": 693,
696
- "▁yes": 694,
697
- "▁ele": 695,
698
- "▁year": 696,
699
- "▁such": 697,
700
- "▁video": 698,
701
- "ness": 699,
702
- "▁el": 700,
703
- "▁trans": 701,
704
- "▁five": 702,
705
- "▁produ": 703,
706
- "ave": 704,
707
- "erest": 705,
708
- "als": 706,
709
- "body": 707,
710
- "cus": 708,
711
- "▁found": 709,
712
- "atter": 710,
713
- "▁eff": 711,
714
- "▁god": 712,
715
- "▁used": 713,
716
- "llow": 714,
717
- "▁interest": 715,
718
- "▁question": 716,
719
- "hip": 717,
720
- "▁bus": 718,
721
- "▁ask": 719,
722
- "▁exam": 720,
723
- "▁prov": 721,
724
- "lud": 722,
725
- "▁form": 723,
726
- "▁law": 724,
727
- "ense": 725,
728
- "▁child": 726,
729
- "▁gl": 727,
730
- "ne": 728,
731
- "▁each": 729,
732
- "▁understand": 730,
733
- "▁care": 731,
734
  "stem": 732,
735
- "▁med": 733,
736
- "▁maybe": 734,
737
- "ably": 735,
738
- "▁det": 736,
739
- "▁coll": 737,
740
- "its": 738,
741
- "▁commun": 739,
742
- "▁hand": 740,
743
- "▁'": 741,
744
- "▁ref": 742,
745
- "▁lear": 743,
746
- "▁done": 744,
747
- "▁gener": 745,
748
- "vern": 746,
749
- "▁mr": 747,
750
- "ween": 748,
751
- "▁better": 749,
752
- "▁between": 750,
753
- "li": 751,
754
- "blem": 752,
755
- "▁system": 753,
756
- "ertain": 754,
757
- "▁school": 755,
758
- "▁eas": 756,
759
- "▁exp": 757,
760
- "▁war": 758,
761
- "ention": 759,
762
- "▁ty": 760,
763
- "▁govern": 761,
764
  "ues": 762,
765
- "▁problem": 763,
766
- "▁plan": 764,
767
- "ac": 765,
768
- "▁conf": 766,
769
- "▁course": 767,
770
- "ouse": 768,
771
- "▁mar": 769,
772
- "▁stand": 770,
773
- "▁sk": 771,
774
- "▁seco": 772,
775
- "uring": 773,
776
- "▁ed": 774,
777
- "▁mem": 775,
778
- "ros": 776,
779
- "cri": 777,
780
- "▁thought": 778,
781
- "cept": 779,
782
- "▁partic": 780,
783
- "▁test": 781,
784
- "olog": 782,
785
- "iness": 783,
786
- "▁far": 784,
787
- "led": 785,
788
- "▁col": 786,
789
- "▁looking": 787,
790
- "▁read": 788,
791
- "▁whether": 789,
792
- "▁word": 790,
793
- "me": 791,
794
- "▁once": 792,
795
- "ize": 793,
796
- "▁home": 794,
797
- "▁requ": 795,
798
- "gg": 796,
799
- "▁ide": 797,
800
- "▁thank": 798,
801
  "ures": 799,
802
- "▁called": 800,
803
- "▁cur": 801,
804
- "▁water": 802,
805
- "▁frie": 803,
806
- "▁side": 804,
807
- "▁best": 805,
808
- "▁number": 806,
809
- "oney": 807,
810
- "▁turn": 808,
811
- "ock": 809,
812
- "▁eng": 810,
813
- "▁top": 811,
814
- "▁open": 812,
815
- "ead": 813,
816
- "▁everything": 814,
817
- "▁term": 815,
818
- "▁prob": 816,
819
- "▁hard": 817,
820
- "▁fun": 818,
821
- "▁spec": 819,
822
- "▁dire": 820,
823
- "▁second": 821,
824
- "▁pa": 822,
825
- "▁build": 823,
826
- "▁run": 824,
827
- "▁sign": 825,
828
- "▁reason": 826,
829
- "▁inform": 827,
830
- "▁watch": 828,
831
  "ution": 829,
832
- "▁few": 830,
833
- "mo": 831,
834
- "▁hum": 832,
835
- "ision": 833,
836
- "▁ext": 834,
837
- "▁tog": 835,
838
- "▁conc": 836,
839
- "▁thous": 837,
840
- "▁thousand": 838,
841
- "▁support": 839,
842
- "▁together": 840,
843
- "▁six": 841,
844
- "ps": 842,
845
- "▁mark": 843,
846
- "ics": 844,
847
- "▁includ": 845,
848
- "ef": 846,
849
- "▁opp": 847,
850
- "ident": 848,
851
- "▁anything": 849,
852
- "▁met": 850,
853
- "▁bre": 851,
854
- "▁jud": 852,
855
- "▁away": 853,
856
- "▁old": 854,
857
- "▁prog": 855,
858
- "ten": 856,
859
- "▁book": 857,
860
- "▁says": 858,
861
- "▁seem": 859,
862
- "▁contin": 860,
863
- "▁process": 861,
864
- "▁sing": 862,
865
- "▁money": 863,
866
- "▁having": 864,
867
- "▁beg": 865,
868
- "▁comple": 866,
869
- "▁thir": 867,
870
- "▁using": 868,
871
- "▁ret": 869,
872
- "ger": 870,
873
- "▁head": 871,
874
- "▁cre": 872,
875
- "▁poss": 873,
876
- "enty": 874,
877
- "▁certain": 875,
878
- "▁clear": 876,
879
- "ines": 877,
880
- "▁wee": 878,
881
- "arch": 879,
882
- "▁inf": 880,
883
- "ont": 881,
884
- "▁sit": 882,
885
- "▁lead": 883,
886
- "alth": 884,
887
- "▁art": 885,
888
- "ross": 886,
889
- "▁pub": 887,
890
- "▁without": 888,
891
- "▁pret": 889,
892
- "▁getting": 890,
893
- "ient": 891,
894
- "▁z": 892,
895
- "▁wom": 893,
896
- "▁power": 894,
897
- "ational": 895,
898
- "ner": 896,
899
- "▁rest": 897,
900
- "▁believe": 898,
901
- "▁wa": 899,
902
- "▁aut": 900,
903
- "▁move": 901,
904
- "aim": 902,
905
- "▁sort": 903,
906
- "idence": 904,
907
- "▁creat": 905,
908
- "▁expl": 906,
909
- "▁name": 907,
910
- "▁went": 908,
911
- "▁eu": 909,
912
- "▁change": 910,
913
- "▁came": 911,
914
- "▁pay": 912,
915
- "ices": 913,
916
- "▁sin": 914,
917
- "▁pur": 915,
918
- "▁pass": 916,
919
- "▁whole": 917,
920
- "▁house": 918,
921
- "▁hund": 919,
922
- "▁hundred": 920,
923
- "▁pretty": 921,
924
- "▁trying": 922,
925
- "▁ple": 923,
926
- "▁allow": 924,
927
- "▁compan": 925,
928
- "▁government": 926,
929
- "▁small": 927,
930
- "▁light": 928,
931
- "▁bra": 929,
932
- "▁stu": 930,
933
- "aint": 931,
934
- "▁ah": 932,
935
- "▁prot": 933,
936
- "ets": 934,
937
- "▁cent": 935,
938
  "velop": 936,
939
- "▁family": 937,
940
- "▁business": 938,
941
- "ety": 939,
942
- "▁making": 940,
943
- "▁list": 941,
944
- "▁experi": 942,
945
- "eric": 943,
946
- "▁follow": 944,
947
- "ately": 945,
948
- "▁probably": 946,
949
- "▁appe": 947,
950
- "▁serv": 948,
951
- "▁val": 949,
952
- "▁leg": 950,
953
- "▁resp": 951,
954
- "▁develop": 952,
955
- "ready": 953,
956
- "▁already": 954,
957
- "▁sec": 955,
958
- "ell": 956,
959
- "▁saying": 957,
960
- "ash": 958,
961
- "▁hear": 959,
962
- "▁loc": 960,
963
- "▁adv": 961,
964
- "▁pri": 962,
965
- "ret": 963,
966
- "▁lar": 964,
967
- "▁beh": 965,
968
- "▁must": 966,
969
- "▁hon": 967,
970
- "▁means": 968,
971
- "ew": 969,
972
- "▁par": 970,
973
- "▁order": 971,
974
- "▁mom": 972,
975
- "gn": 973,
976
- "▁though": 974,
977
- "▁record": 975,
978
- "▁miss": 976,
979
- "▁dr": 977,
980
- "▁es": 978,
981
- "▁eight": 979,
982
- "▁ever": 980,
983
- "▁left": 981,
984
- "▁example": 982,
985
- "▁enough": 983,
986
- "osed": 984,
987
- "▁claim": 985,
988
- "ank": 986,
989
- "con": 987,
990
- "▁americ": 988,
991
- "▁information": 989,
992
- "▁arg": 990,
993
- "▁full": 991,
994
- "nce": 992,
995
- "▁consid": 993,
996
- "▁working": 994,
997
- "ature": 995,
998
- "▁": 996,
999
- "e": 997,
1000
- "t": 998,
1001
- "a": 999,
1002
- "o": 1000,
1003
- "i": 1001,
1004
- "n": 1002,
1005
- "s": 1003,
1006
- "r": 1004,
1007
- "h": 1005,
1008
- "l": 1006,
1009
- "d": 1007,
1010
- "u": 1008,
1011
- "c": 1009,
1012
- "m": 1010,
1013
- "y": 1011,
1014
  "w": 1012,
1015
- "g": 1013,
1016
- "f": 1014,
1017
- "p": 1015,
1018
- "b": 1016,
1019
- "v": 1017,
1020
- "k": 1018,
1021
- "'": 1019,
1022
- "j": 1020,
1023
  "x": 1021,
1024
- "q": 1022,
 
 
1025
  "z": 1023
1026
- }
 
1
  {
2
+ " ": 996,
3
+ " '": 741,
4
+ " a": 3,
5
+ " ab": 134,
6
+ " able": 679,
7
+ " about": 179,
8
+ " ac": 333,
9
+ " acc": 612,
10
+ " act": 269,
11
+ " actually": 423,
12
+ " ad": 273,
13
+ " add": 505,
14
+ " adv": 961,
15
+ " af": 392,
16
+ " after": 451,
17
+ " ag": 283,
18
+ " again": 412,
19
+ " ah": 932,
20
+ " al": 139,
21
+ " all": 142,
22
+ " allow": 924,
23
+ " already": 954,
24
+ " also": 280,
25
+ " always": 654,
26
+ " am": 312,
27
+ " americ": 988,
28
+ " an": 90,
29
+ " and": 25,
30
+ " another": 620,
31
+ " any": 255,
32
+ " anything": 849,
33
+ " ap": 645,
34
+ " app": 395,
35
+ " appe": 947,
36
+ " ar": 229,
37
+ " are": 103,
38
+ " arg": 990,
39
+ " around": 592,
40
+ " art": 885,
41
+ " as": 88,
42
+ " ask": 719,
43
+ " ass": 690,
44
+ " at": 124,
45
+ " att": 573,
46
+ " aut": 900,
47
+ " away": 853,
48
+ " b": 16,
49
+ " back": 344,
50
+ " bas": 520,
51
+ " be": 50,
52
+ " bec": 183,
53
+ " because": 213,
54
+ " been": 310,
55
+ " before": 497,
56
+ " beg": 865,
57
+ " beh": 965,
58
+ " being": 469,
59
+ " bel": 488,
60
+ " belie": 678,
61
+ " believe": 898,
62
+ " best": 805,
63
+ " bet": 437,
64
+ " better": 749,
65
+ " between": 750,
66
+ " big": 641,
67
+ " bit": 578,
68
+ " bl": 400,
69
+ " bo": 470,
70
+ " book": 857,
71
+ " both": 682,
72
+ " br": 369,
73
+ " bra": 929,
74
+ " bre": 851,
75
+ " bu": 442,
76
+ " build": 823,
77
+ " bus": 718,
78
+ " business": 938,
79
+ " but": 106,
80
+ " by": 199,
81
+ " c": 15,
82
+ " ca": 658,
83
+ " call": 462,
84
+ " called": 800,
85
+ " came": 911,
86
+ " can": 117,
87
+ " car": 594,
88
+ " care": 731,
89
+ " case": 508,
90
+ " cent": 935,
91
+ " certain": 875,
92
+ " ch": 121,
93
+ " change": 910,
94
+ " char": 670,
95
+ " che": 681,
96
+ " child": 726,
97
+ " cl": 235,
98
+ " claim": 985,
99
+ " cle": 619,
100
+ " clear": 876,
101
+ " co": 246,
102
+ " col": 786,
103
+ " coll": 737,
104
+ " com": 115,
105
+ " come": 515,
106
+ " comm": 361,
107
+ " commun": 739,
108
+ " comp": 295,
109
+ " compan": 925,
110
+ " comple": 866,
111
+ " con": 123,
112
+ " conc": 836,
113
+ " conf": 766,
114
+ " cons": 510,
115
+ " consid": 993,
116
+ " cont": 306,
117
+ " contin": 860,
118
+ " cor": 648,
119
+ " cou": 378,
120
+ " could": 340,
121
+ " count": 606,
122
+ " cour": 349,
123
+ " course": 767,
124
+ " court": 487,
125
+ " cr": 572,
126
+ " cre": 872,
127
+ " creat": 905,
128
+ " cur": 801,
129
+ " d": 26,
130
+ " day": 446,
131
+ " de": 122,
132
+ " dec": 495,
133
+ " def": 629,
134
+ " des": 492,
135
+ " det": 736,
136
+ " develop": 952,
137
+ " did": 296,
138
+ " didn": 591,
139
+ " diff": 356,
140
+ " differe": 407,
141
+ " different": 465,
142
+ " dire": 820,
143
+ " dis": 321,
144
+ " dist": 660,
145
+ " do": 97,
146
+ " does": 389,
147
+ " doesn": 684,
148
+ " doing": 615,
149
+ " don": 205,
150
+ " done": 744,
151
+ " down": 422,
152
+ " dr": 977,
153
+ " e": 48,
154
+ " each": 729,
155
+ " ear": 677,
156
+ " eas": 756,
157
+ " ed": 774,
158
+ " eff": 711,
159
+ " eight": 979,
160
+ " el": 700,
161
+ " ele": 695,
162
+ " em": 466,
163
+ " en": 259,
164
+ " end": 585,
165
+ " eng": 810,
166
+ " enough": 983,
167
+ " ent": 614,
168
+ " es": 978,
169
+ " eu": 909,
170
+ " ev": 502,
171
+ " even": 364,
172
+ " ever": 980,
173
+ " every": 323,
174
+ " everything": 814,
175
+ " ex": 146,
176
+ " exam": 720,
177
+ " example": 982,
178
+ " exp": 757,
179
+ " exper": 674,
180
+ " experi": 942,
181
+ " expl": 906,
182
+ " ext": 834,
183
+ " f": 20,
184
+ " fa": 260,
185
+ " fact": 526,
186
+ " fam": 652,
187
+ " family": 937,
188
+ " far": 784,
189
+ " fe": 281,
190
+ " feel": 544,
191
+ " few": 830,
192
+ " fin": 588,
193
+ " find": 516,
194
+ " fir": 352,
195
+ " first": 373,
196
+ " five": 702,
197
+ " fl": 428,
198
+ " fo": 453,
199
+ " follow": 944,
200
+ " for": 69,
201
+ " form": 723,
202
+ " found": 709,
203
+ " four": 656,
204
+ " fr": 140,
205
+ " frie": 803,
206
+ " from": 176,
207
+ " full": 991,
208
+ " fun": 818,
209
+ " g": 41,
210
+ " gen": 617,
211
+ " gener": 745,
212
+ " get": 201,
213
+ " getting": 890,
214
+ " give": 642,
215
+ " gl": 727,
216
+ " go": 95,
217
+ " god": 712,
218
+ " going": 232,
219
+ " gonna": 640,
220
+ " good": 386,
221
+ " got": 413,
222
+ " govern": 761,
223
+ " government": 926,
224
+ " gr": 382,
225
+ " gra": 657,
226
+ " great": 521,
227
+ " gu": 415,
228
+ " h": 17,
229
+ " ha": 64,
230
+ " had": 236,
231
+ " hand": 740,
232
+ " happ": 474,
233
+ " happen": 686,
234
+ " hard": 817,
235
+ " has": 244,
236
+ " have": 98,
237
+ " having": 864,
238
+ " he": 67,
239
+ " head": 871,
240
+ " hear": 959,
241
+ " hel": 455,
242
+ " help": 477,
243
+ " her": 288,
244
+ " here": 239,
245
+ " high": 655,
246
+ " him": 348,
247
+ " his": 217,
248
+ " ho": 454,
249
+ " home": 794,
250
+ " hon": 967,
251
+ " house": 918,
252
+ " how": 223,
253
+ " hu": 611,
254
+ " hum": 832,
255
+ " hund": 919,
256
+ " hundred": 920,
257
+ " i": 4,
258
+ " ide": 797,
259
+ " if": 153,
260
+ " im": 266,
261
+ " imp": 359,
262
+ " import": 605,
263
+ " important": 653,
264
+ " in": 35,
265
+ " inc": 406,
266
+ " includ": 845,
267
+ " ind": 456,
268
+ " inf": 880,
269
+ " inform": 827,
270
+ " information": 989,
271
+ " ins": 616,
272
+ " inst": 555,
273
+ " int": 198,
274
+ " inter": 537,
275
+ " interest": 715,
276
+ " into": 308,
277
+ " inv": 550,
278
+ " is": 58,
279
+ " iss": 668,
280
+ " it": 45,
281
+ " its": 525,
282
+ " j": 92,
283
+ " jo": 494,
284
+ " jud": 852,
285
+ " just": 160,
286
+ " k": 89,
287
+ " ke": 461,
288
+ " keep": 685,
289
+ " kind": 416,
290
+ " kn": 141,
291
+ " know": 157,
292
+ " l": 32,
293
+ " la": 490,
294
+ " lar": 964,
295
+ " last": 671,
296
+ " law": 724,
297
+ " le": 184,
298
+ " lead": 883,
299
+ " lear": 743,
300
+ " left": 981,
301
+ " leg": 950,
302
+ " let": 368,
303
+ " li": 99,
304
+ " life": 579,
305
+ " light": 928,
306
+ " like": 144,
307
+ " list": 941,
308
+ " little": 401,
309
+ " lo": 304,
310
+ " loc": 960,
311
+ " long": 556,
312
+ " look": 289,
313
+ " looking": 787,
314
+ " lot": 396,
315
+ " love": 691,
316
+ " m": 19,
317
+ " ma": 178,
318
+ " made": 565,
319
+ " make": 339,
320
+ " making": 940,
321
+ " man": 284,
322
+ " many": 486,
323
+ " mar": 769,
324
+ " mark": 843,
325
+ " may": 377,
326
+ " maybe": 734,
327
+ " me": 129,
328
+ " mean": 485,
329
+ " means": 968,
330
+ " med": 733,
331
+ " mem": 775,
332
+ " met": 850,
333
+ " might": 613,
334
+ " min": 649,
335
+ " miss": 976,
336
+ " mo": 125,
337
+ " mom": 972,
338
+ " mon": 651,
339
+ " money": 863,
340
+ " more": 231,
341
+ " most": 463,
342
+ " move": 901,
343
+ " mr": 747,
344
+ " mu": 402,
345
+ " much": 417,
346
+ " must": 966,
347
+ " my": 173,
348
+ " n": 42,
349
+ " name": 907,
350
+ " ne": 119,
351
+ " need": 327,
352
+ " never": 631,
353
+ " new": 379,
354
+ " next": 608,
355
+ " no": 253,
356
+ " not": 108,
357
+ " now": 237,
358
+ " num": 662,
359
+ " number": 806,
360
+ " o": 9,
361
+ " ob": 574,
362
+ " of": 34,
363
+ " off": 341,
364
+ " oh": 602,
365
+ " ok": 464,
366
+ " okay": 558,
367
+ " old": 854,
368
+ " on": 62,
369
+ " once": 792,
370
+ " one": 165,
371
+ " only": 410,
372
+ " op": 628,
373
+ " open": 812,
374
+ " opp": 847,
375
+ " or": 114,
376
+ " order": 971,
377
+ " other": 299,
378
+ " our": 220,
379
+ " out": 182,
380
+ " over": 335,
381
+ " own": 564,
382
+ " p": 24,
383
+ " pa": 822,
384
+ " par": 970,
385
+ " part": 307,
386
+ " partic": 780,
387
+ " pass": 916,
388
+ " pay": 912,
389
+ " pe": 216,
390
+ " peop": 251,
391
+ " people": 256,
392
+ " per": 371,
393
+ " pers": 509,
394
+ " person": 580,
395
+ " ph": 514,
396
+ " pl": 200,
397
+ " place": 675,
398
+ " plan": 764,
399
+ " play": 618,
400
+ " ple": 923,
401
+ " po": 325,
402
+ " point": 500,
403
+ " pol": 600,
404
+ " pos": 527,
405
+ " poss": 873,
406
+ " power": 894,
407
+ " pr": 324,
408
+ " pre": 351,
409
+ " pres": 581,
410
+ " pret": 889,
411
+ " pretty": 921,
412
+ " pri": 962,
413
+ " pro": 138,
414
+ " prob": 816,
415
+ " probably": 946,
416
+ " problem": 763,
417
+ " process": 861,
418
+ " produ": 703,
419
+ " prog": 855,
420
+ " prot": 933,
421
+ " prov": 721,
422
+ " pub": 887,
423
+ " pur": 915,
424
+ " put": 498,
425
+ " qu": 264,
426
+ " quest": 539,
427
+ " question": 716,
428
+ " r": 110,
429
+ " ra": 431,
430
+ " re": 57,
431
+ " read": 788,
432
+ " real": 604,
433
+ " really": 254,
434
+ " reason": 826,
435
+ " rec": 493,
436
+ " record": 975,
437
+ " ref": 742,
438
+ " reg": 584,
439
+ " rel": 523,
440
+ " rem": 478,
441
+ " rep": 552,
442
+ " requ": 795,
443
+ " res": 338,
444
+ " resp": 951,
445
+ " rest": 897,
446
+ " ret": 869,
447
+ " right": 267,
448
+ " ro": 343,
449
+ " run": 824,
450
+ " s": 8,
451
+ " sa": 234,
452
+ " said": 384,
453
+ " same": 545,
454
+ " say": 279,
455
+ " saying": 957,
456
+ " says": 858,
457
+ " sc": 435,
458
+ " sch": 598,
459
+ " school": 755,
460
+ " se": 101,
461
+ " sec": 955,
462
+ " seco": 772,
463
+ " second": 821,
464
+ " see": 274,
465
+ " seem": 859,
466
+ " ser": 482,
467
+ " serv": 948,
468
+ " set": 601,
469
+ " sh": 100,
470
+ " she": 258,
471
+ " should": 452,
472
+ " show": 519,
473
+ " side": 804,
474
+ " sign": 825,
475
+ " sim": 596,
476
+ " sin": 914,
477
+ " sing": 862,
478
+ " sit": 882,
479
+ " six": 841,
480
+ " sk": 771,
481
+ " sl": 635,
482
+ " sm": 524,
483
+ " small": 927,
484
+ " so": 75,
485
+ " som": 147,
486
+ " some": 197,
487
+ " somet": 358,
488
+ " something": 424,
489
+ " sort": 903,
490
+ " sp": 314,
491
+ " spe": 353,
492
+ " spec": 819,
493
+ " st": 70,
494
+ " stand": 770,
495
+ " start": 443,
496
+ " state": 647,
497
+ " still": 557,
498
+ " str": 663,
499
+ " stu": 930,
500
+ " stud": 659,
501
+ " su": 135,
502
+ " sub": 540,
503
+ " such": 697,
504
+ " supp": 543,
505
+ " support": 839,
506
+ " sur": 669,
507
+ " sure": 576,
508
+ " sy": 554,
509
+ " system": 753,
510
+ " ta": 247,
511
+ " take": 432,
512
+ " tal": 438,
513
+ " talk": 661,
514
+ " te": 261,
515
+ " tell": 634,
516
+ " term": 815,
517
+ " test": 781,
518
+ " th": 2,
519
+ " than": 322,
520
+ " thank": 798,
521
+ " that": 39,
522
+ " the": 5,
523
+ " their": 241,
524
+ " them": 227,
525
+ " then": 219,
526
+ " there": 131,
527
+ " these": 276,
528
+ " they": 102,
529
+ " thing": 270,
530
+ " things": 397,
531
+ " think": 214,
532
+ " thir": 867,
533
+ " this": 77,
534
+ " those": 360,
535
+ " though": 974,
536
+ " thought": 778,
537
+ " thous": 837,
538
+ " thousand": 838,
539
+ " thr": 374,
540
+ " three": 512,
541
+ " through": 419,
542
+ " tim": 222,
543
+ " time": 275,
544
+ " to": 22,
545
+ " today": 664,
546
+ " tog": 835,
547
+ " together": 840,
548
+ " too": 472,
549
+ " top": 811,
550
+ " tr": 225,
551
+ " tra": 398,
552
+ " trans": 701,
553
+ " tri": 676,
554
+ " try": 533,
555
+ " trying": 922,
556
+ " turn": 808,
557
+ " tw": 242,
558
+ " two": 315,
559
+ " ty": 760,
560
+ " u": 79,
561
+ " uh": 457,
562
+ " um": 607,
563
+ " un": 286,
564
+ " under": 375,
565
+ " underst": 680,
566
+ " understand": 730,
567
+ " up": 190,
568
+ " us": 208,
569
+ " use": 440,
570
+ " used": 713,
571
+ " using": 868,
572
+ " v": 150,
573
+ " val": 949,
574
+ " very": 292,
575
+ " vide": 583,
576
+ " video": 698,
577
+ " w": 7,
578
+ " wa": 899,
579
+ " want": 257,
580
+ " war": 758,
581
+ " was": 91,
582
+ " watch": 828,
583
+ " water": 802,
584
+ " way": 317,
585
+ " we": 56,
586
+ " wee": 878,
587
+ " well": 298,
588
+ " went": 908,
589
+ " were": 249,
590
+ " wh": 49,
591
+ " what": 130,
592
+ " when": 204,
593
+ " where": 319,
594
+ " whether": 789,
595
+ " which": 238,
596
+ " while": 666,
597
+ " who": 226,
598
+ " whole": 917,
599
+ " why": 517,
600
+ " will": 207,
601
+ " with": 93,
602
+ " without": 888,
603
+ " wom": 893,
604
+ " wor": 170,
605
+ " word": 790,
606
+ " work": 337,
607
+ " working": 994,
608
+ " world": 575,
609
+ " would": 209,
610
+ " wr": 570,
611
+ " y": 31,
612
+ " ye": 230,
613
+ " yeah": 439,
614
+ " year": 696,
615
+ " years": 630,
616
+ " yes": 694,
617
+ " you": 38,
618
+ " your": 149,
619
+ " z": 892,
620
+ "'": 1019,
621
+ "</s>": 1026,
622
+ "<pad>": 1024,
623
+ "<s>": 1025,
624
  "<unk>": 0,
625
+ "a": 999,
626
+ "ab": 175,
627
+ "able": 345,
628
+ "ably": 735,
629
+ "ac": 765,
630
+ "ace": 387,
631
+ "ach": 328,
632
+ "ack": 218,
633
+ "act": 381,
634
+ "ad": 112,
635
+ "ade": 489,
636
+ "ady": 621,
637
+ "ag": 326,
638
+ "age": 318,
639
+ "ah": 376,
640
+ "ail": 637,
641
+ "aim": 902,
642
+ "ain": 164,
643
+ "aint": 931,
644
+ "ak": 450,
645
+ "al": 53,
646
+ "all": 81,
647
+ "ally": 111,
648
+ "als": 706,
649
+ "alth": 884,
650
+ "am": 104,
651
+ "ame": 262,
 
652
  "an": 29,
653
+ "ance": 362,
654
+ "and": 187,
655
+ "ang": 425,
656
+ "ange": 587,
657
+ "ank": 986,
658
+ "ans": 313,
659
+ "ant": 126,
660
+ "ap": 354,
 
 
661
  "ar": 40,
662
+ "arch": 879,
663
+ "ard": 203,
664
+ "are": 311,
665
+ "ark": 475,
666
+ "ars": 355,
667
+ "art": 168,
668
+ "ary": 408,
669
  "as": 43,
670
+ "ase": 301,
671
+ "ash": 958,
672
+ "ason": 667,
673
+ "ass": 388,
674
+ "ast": 250,
675
+ "at": 11,
676
+ "atch": 561,
677
+ "ate": 155,
678
+ "ated": 365,
679
+ "ately": 945,
680
+ "ater": 430,
681
+ "ates": 547,
682
+ "ath": 563,
683
+ "ather": 689,
684
+ "ating": 599,
685
+ "ation": 107,
686
+ "ational": 895,
687
+ "ations": 385,
688
+ "ative": 692,
689
+ "atter": 710,
690
+ "ature": 995,
691
+ "ause": 194,
692
+ "av": 693,
693
+ "ave": 704,
694
  "ay": 65,
695
+ "b": 1016,
696
+ "be": 445,
697
+ "ber": 350,
698
+ "ble": 334,
699
+ "blem": 752,
700
+ "body": 707,
701
+ "c": 1009,
 
 
 
 
 
 
 
 
 
 
 
702
  "ce": 84,
703
+ "ced": 567,
704
+ "cept": 779,
705
+ "ces": 427,
706
+ "cess": 403,
707
  "ch": 85,
708
+ "ci": 277,
709
+ "cial": 609,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
710
  "ck": 137,
711
+ "co": 394,
712
+ "con": 987,
713
+ "cri": 777,
714
+ "ct": 66,
715
+ "ction": 480,
716
+ "cus": 708,
717
+ "cy": 603,
718
+ "d": 1007,
719
+ "day": 551,
720
+ "de": 571,
721
+ "der": 643,
722
+ "du": 479,
723
+ "e": 997,
724
+ "ead": 813,
725
+ "ect": 233,
726
+ "ed": 33,
727
+ "ef": 846,
728
+ "el": 202,
729
+ "ell": 956,
730
+ "em": 215,
731
+ "en": 23,
732
+ "ence": 330,
733
+ "ens": 414,
734
+ "ense": 725,
735
+ "ent": 61,
736
+ "ention": 759,
737
+ "ents": 504,
738
+ "enty": 874,
739
+ "ep": 383,
740
+ "er": 12,
741
+ "ere": 136,
742
+ "ered": 625,
743
+ "erest": 705,
744
+ "eric": 943,
745
+ "erm": 536,
746
  "ers": 143,
747
+ "ert": 549,
748
+ "ertain": 754,
749
+ "es": 27,
 
 
 
 
 
 
 
750
  "ess": 154,
 
751
  "est": 156,
752
+ "et": 63,
753
+ "ether": 483,
754
+ "ets": 934,
755
+ "ety": 939,
756
+ "ever": 597,
757
+ "ew": 969,
758
+ "f": 1014,
759
+ "fe": 404,
760
+ "ff": 293,
761
+ "fore": 441,
762
+ "form": 503,
763
+ "ft": 559,
764
+ "ful": 535,
765
+ "g": 1013,
766
+ "ge": 145,
767
+ "ger": 870,
768
+ "gg": 796,
769
+ "gn": 973,
770
+ "h": 1005,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
771
  "her": 206,
772
+ "hing": 303,
773
+ "hip": 717,
774
+ "ht": 105,
775
+ "i": 1001,
776
+ "ia": 449,
777
+ "ial": 459,
778
+ "ian": 593,
779
+ "ib": 623,
780
+ "ible": 595,
781
+ "ic": 46,
782
+ "ical": 409,
783
+ "ically": 650,
784
+ "ice": 291,
785
+ "ices": 913,
786
  "ich": 212,
787
+ "ick": 363,
788
+ "ics": 844,
789
+ "ict": 433,
790
+ "id": 68,
 
 
 
 
791
  "ide": 221,
792
+ "idence": 904,
793
+ "ident": 848,
794
+ "ie": 189,
795
+ "ied": 589,
796
+ "ient": 891,
797
+ "ies": 211,
798
+ "iew": 688,
799
+ "if": 159,
800
+ "iff": 329,
801
+ "ific": 560,
802
+ "ig": 74,
803
+ "igh": 553,
804
+ "ight": 120,
805
+ "ign": 434,
806
+ "il": 128,
807
+ "ild": 421,
808
+ "ile": 429,
809
+ "ility": 626,
810
+ "ill": 118,
811
+ "ily": 522,
812
+ "im": 86,
813
+ "in": 10,
814
+ "ind": 191,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
815
  "ine": 263,
816
+ "ines": 877,
817
+ "iness": 783,
818
+ "ing": 21,
819
+ "ings": 687,
820
+ "ink": 188,
821
+ "int": 372,
822
+ "ion": 54,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
823
  "ions": 294,
824
+ "ious": 484,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
825
  "ip": 346,
826
+ "ir": 94,
 
 
 
 
 
 
 
 
 
827
  "ire": 357,
828
+ "is": 37,
829
+ "ise": 530,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
830
  "ish": 391,
831
+ "ision": 833,
832
+ "iss": 511,
833
+ "ist": 172,
834
+ "it": 36,
835
+ "ite": 320,
836
+ "ited": 673,
837
+ "ith": 87,
838
+ "ities": 627,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
839
  "ition": 420,
840
+ "its": 738,
841
+ "itt": 297,
842
+ "ittle": 399,
843
+ "ity": 181,
844
+ "iv": 300,
845
+ "ive": 171,
846
+ "ix": 566,
847
+ "iz": 336,
848
+ "ize": 793,
849
+ "j": 1020,
850
+ "ject": 586,
851
+ "k": 1018,
852
+ "ke": 78,
853
+ "kes": 646,
854
+ "king": 271,
855
+ "ks": 568,
856
+ "l": 1006,
857
+ "ld": 76,
858
+ "le": 52,
859
+ "led": 785,
860
+ "les": 610,
861
+ "li": 751,
862
+ "lic": 569,
863
+ "ll": 30,
864
+ "llow": 714,
865
+ "lud": 722,
866
+ "ly": 72,
867
+ "m": 1010,
868
+ "me": 791,
869
+ "ment": 161,
870
+ "ments": 529,
871
+ "mo": 831,
872
+ "n": 1002,
 
 
 
 
873
  "na": 458,
874
+ "nce": 992,
875
+ "nd": 14,
876
+ "nder": 309,
877
+ "nds": 531,
878
+ "ne": 728,
879
+ "ner": 896,
880
+ "ness": 699,
 
 
 
 
 
881
  "ning": 471,
882
+ "nt": 174,
883
+ "o": 1000,
884
+ "ock": 809,
885
+ "od": 186,
886
+ "ody": 506,
887
  "og": 476,
888
+ "ol": 166,
889
+ "olog": 782,
890
+ "om": 44,
891
+ "ome": 405,
892
+ "on": 18,
893
+ "one": 278,
894
+ "oney": 807,
895
+ "ong": 290,
896
+ "onna": 636,
897
+ "ons": 538,
898
+ "ont": 881,
899
+ "oo": 127,
900
  "ood": 481,
901
+ "ook": 210,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
902
  "ool": 528,
903
+ "op": 163,
904
+ "or": 28,
905
+ "ord": 418,
906
+ "ore": 285,
907
+ "orm": 390,
908
+ "ors": 473,
909
+ "ort": 177,
910
+ "ory": 444,
911
+ "os": 192,
912
+ "ose": 272,
913
+ "osed": 984,
914
+ "ot": 60,
915
+ "oth": 513,
916
+ "other": 460,
917
+ "ou": 13,
918
+ "oug": 195,
919
+ "ough": 252,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
920
  "ought": 562,
921
+ "ould": 116,
922
+ "ound": 248,
923
+ "ount": 644,
924
+ "our": 185,
925
+ "ous": 240,
926
+ "ouse": 768,
927
+ "out": 158,
928
+ "ow": 55,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
929
  "ower": 632,
930
+ "own": 332,
931
+ "oy": 672,
932
+ "p": 1015,
933
+ "pe": 133,
934
+ "pect": 590,
935
+ "per": 282,
936
+ "ph": 582,
937
+ "pl": 331,
938
+ "ple": 468,
939
+ "pp": 151,
940
+ "ps": 842,
941
+ "pt": 287,
942
+ "q": 1022,
943
+ "qu": 162,
944
+ "r": 1004,
945
+ "ra": 96,
946
+ "ract": 665,
947
  "ram": 633,
948
+ "re": 6,
949
+ "ready": 953,
950
+ "reat": 305,
951
+ "red": 265,
952
+ "ree": 302,
953
  "ren": 638,
954
+ "res": 180,
955
+ "ress": 380,
956
+ "ret": 963,
957
+ "ri": 167,
958
+ "ro": 73,
959
+ "ros": 776,
960
+ "ross": 886,
961
+ "ru": 426,
962
+ "ry": 347,
963
+ "s": 1003,
964
+ "se": 80,
965
+ "sel": 367,
966
+ "self": 499,
967
+ "so": 243,
968
+ "st": 82,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
969
  "stem": 732,
970
+ "t": 998,
971
+ "te": 501,
972
+ "ten": 856,
973
+ "ter": 132,
974
+ "th": 109,
975
+ "ther": 268,
976
+ "thing": 507,
977
+ "ting": 467,
978
+ "ty": 370,
979
+ "u": 1008,
980
+ "ual": 534,
981
+ "ually": 342,
982
+ "ub": 546,
983
+ "ud": 245,
984
+ "ue": 316,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
985
  "ues": 762,
986
+ "ug": 683,
987
+ "ul": 148,
988
+ "ular": 622,
989
+ "ull": 518,
990
+ "ult": 411,
991
+ "um": 196,
992
+ "un": 193,
993
+ "ur": 83,
994
+ "ure": 228,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
995
  "ures": 799,
996
+ "uring": 773,
997
+ "urn": 548,
998
+ "us": 51,
999
+ "use": 152,
1000
+ "ust": 113,
1001
+ "ut": 59,
1002
+ "ute": 639,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1003
  "ution": 829,
1004
+ "v": 1017,
1005
+ "ve": 47,
1006
+ "ved": 491,
1007
+ "vel": 436,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1008
  "velop": 936,
1009
+ "ven": 224,
1010
+ "ver": 71,
1011
+ "vern": 746,
1012
+ "vers": 542,
1013
+ "very": 169,
1014
+ "ves": 496,
1015
+ "ving": 393,
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1016
  "w": 1012,
1017
+ "ward": 577,
1018
+ "way": 366,
1019
+ "ways": 624,
1020
+ "we": 541,
1021
+ "ween": 748,
1022
+ "wn": 447,
 
 
1023
  "x": 1021,
1024
+ "xt": 448,
1025
+ "y": 1011,
1026
+ "ys": 532,
1027
  "z": 1023
1028
+ }