Anthropic's client side tokenizer.
Accuracy compared to actual Claude 3 Haiku tokenizer (Claude 3 family has the same tokenizer):
Tokenization results saved to __temp.txt.tokens
Text: Hello, world! This is a simple...
Actual tokens: 17
Predicted tokens: 10
Accuracy: 58.82%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: The quick brown fox jumps over...
Actual tokens: 19
Predicted tokens: 10
Accuracy: 52.63%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: In computer programming, a hel...
Actual tokens: 29
Predicted tokens: 21
Accuracy: 72.41%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: Artificial intelligence (AI) i...
Actual tokens: 30
Predicted tokens: 24
Accuracy: 80.00%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: The Eiffel Tower is a wrought-...
Actual tokens: 56
Predicted tokens: 48
Accuracy: 85.71%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: To be, or not to be, that is t...
Actual tokens: 60
Predicted tokens: 50
Accuracy: 83.33%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: In the beginning God created t...
Actual tokens: 38
Predicted tokens: 31
Accuracy: 81.58%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: Four score and seven years ago...
Actual tokens: 41
Predicted tokens: 34
Accuracy: 82.93%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: I have a dream that one day th...
Actual tokens: 51
Predicted tokens: 43
Accuracy: 84.31%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: That's one small step for man,...
Actual tokens: 22
Predicted tokens: 14
Accuracy: 63.64%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: Here are the key points about ...
Actual tokens: 203
Predicted tokens: 195
Accuracy: 96.06%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: This appears to be an excerpt ...
Actual tokens: 179
Predicted tokens: 180
Accuracy: 99.44%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: This is the beginning of the b...
Actual tokens: 194
Predicted tokens: 191
Accuracy: 98.45%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: That is the opening lines of t...
Actual tokens: 177
Predicted tokens: 163
Accuracy: 92.09%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: That's a powerful and inspirin...
Actual tokens: 193
Predicted tokens: 190
Accuracy: 98.45%
--------------------------------------------------
Tokenization results saved to __temp.txt.tokens
Text: That famous quote is from Neil...
Actual tokens: 131
Predicted tokens: 122
Accuracy: 93.13%
--------------------------------------------------
Average accuracy: 82.69%
Inference Providers
NEW
This model isn't deployed by any Inference Provider.
๐
Ask for provider support