Commit be45168 (verified) by eaddario
Parent(s): 0ba2d3d

Generate Perplexity, KLD, ARC, HellaSwag, MMLU, Truthful QA and WinoGrande scores

This view is limited to 50 files because the commit contains too many changes. See the raw diff for the full set.

Files changed (50)
  1. scores/Qwen3-30B-A3B-pruned-F16.arc +15 -0
  2. scores/Qwen3-30B-A3B-pruned-F16.hsw +14 -0
  3. scores/Qwen3-30B-A3B-pruned-F16.mmlu +15 -0
  4. scores/Qwen3-30B-A3B-pruned-F16.tqa +15 -0
  5. scores/Qwen3-30B-A3B-pruned-F16.wng +13 -0
  6. scores/Qwen3-30B-A3B-pruned-iq3_m.arc +13 -0
  7. scores/Qwen3-30B-A3B-pruned-iq3_m.hsw +12 -0
  8. scores/Qwen3-30B-A3B-pruned-iq3_m.mmlu +13 -0
  9. scores/Qwen3-30B-A3B-pruned-iq3_m.ppx +37 -0
  10. scores/Qwen3-30B-A3B-pruned-iq3_m.tqa +13 -0
  11. scores/Qwen3-30B-A3B-pruned-iq3_m.wng +11 -0
  12. scores/Qwen3-30B-A3B-pruned-iq3_s.arc +13 -0
  13. scores/Qwen3-30B-A3B-pruned-iq3_s.hsw +12 -0
  14. scores/Qwen3-30B-A3B-pruned-iq3_s.mmlu +13 -0
  15. scores/Qwen3-30B-A3B-pruned-iq3_s.ppx +37 -0
  16. scores/Qwen3-30B-A3B-pruned-iq3_s.tqa +13 -0
  17. scores/Qwen3-30B-A3B-pruned-iq3_s.wng +11 -0
  18. scores/Qwen3-30B-A3B-pruned-iq4_nl.arc +13 -0
  19. scores/Qwen3-30B-A3B-pruned-iq4_nl.hsw +12 -0
  20. scores/Qwen3-30B-A3B-pruned-iq4_nl.mmlu +13 -0
  21. scores/Qwen3-30B-A3B-pruned-iq4_nl.ppx +37 -0
  22. scores/Qwen3-30B-A3B-pruned-iq4_nl.tqa +13 -0
  23. scores/Qwen3-30B-A3B-pruned-iq4_nl.wng +11 -0
  24. scores/Qwen3-30B-A3B-pruned-q3_k_l.arc +13 -0
  25. scores/Qwen3-30B-A3B-pruned-q3_k_l.hsw +12 -0
  26. scores/Qwen3-30B-A3B-pruned-q3_k_l.mmlu +13 -0
  27. scores/Qwen3-30B-A3B-pruned-q3_k_l.ppx +37 -0
  28. scores/Qwen3-30B-A3B-pruned-q3_k_l.tqa +13 -0
  29. scores/Qwen3-30B-A3B-pruned-q3_k_l.wng +11 -0
  30. scores/Qwen3-30B-A3B-pruned-q3_k_m.arc +13 -0
  31. scores/Qwen3-30B-A3B-pruned-q3_k_m.hsw +12 -0
  32. scores/Qwen3-30B-A3B-pruned-q3_k_m.mmlu +13 -0
  33. scores/Qwen3-30B-A3B-pruned-q3_k_m.ppx +37 -0
  34. scores/Qwen3-30B-A3B-pruned-q3_k_m.tqa +13 -0
  35. scores/Qwen3-30B-A3B-pruned-q3_k_m.wng +11 -0
  36. scores/Qwen3-30B-A3B-pruned-q3_k_s.arc +13 -0
  37. scores/Qwen3-30B-A3B-pruned-q3_k_s.hsw +12 -0
  38. scores/Qwen3-30B-A3B-pruned-q3_k_s.mmlu +13 -0
  39. scores/Qwen3-30B-A3B-pruned-q3_k_s.ppx +37 -0
  40. scores/Qwen3-30B-A3B-pruned-q3_k_s.tqa +13 -0
  41. scores/Qwen3-30B-A3B-pruned-q3_k_s.wng +11 -0
  42. scores/Qwen3-30B-A3B-pruned-q4_k_m.arc +13 -0
  43. scores/Qwen3-30B-A3B-pruned-q4_k_m.hsw +12 -0
  44. scores/Qwen3-30B-A3B-pruned-q4_k_m.mmlu +13 -0
  45. scores/Qwen3-30B-A3B-pruned-q4_k_m.ppx +37 -0
  46. scores/Qwen3-30B-A3B-pruned-q4_k_m.tqa +13 -0
  47. scores/Qwen3-30B-A3B-pruned-q4_k_m.wng +11 -0
  48. scores/Qwen3-30B-A3B-pruned-q4_k_s.arc +13 -0
  49. scores/Qwen3-30B-A3B-pruned-q4_k_s.hsw +12 -0
  50. scores/Qwen3-30B-A3B-pruned-q4_k_s.mmlu +13 -0
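The score files below are raw llama.cpp benchmark logs. For orientation only, here is a minimal sketch of the kind of llama-perplexity invocations that produce this output; the dataset file names and the exact command sequence are assumptions for illustration and are not taken from this commit.

```bash
#!/usr/bin/env bash
# Sketch only: illustrative llama.cpp benchmark runs for one quantisation.
# Dataset file names are placeholders (assumptions), not part of this commit.
BASE=./Qwen3-30B-A3B-F16.gguf       # unquantised reference, as in the logs
MODEL=./Qwen3-30B-A3B-IQ4_NL.gguf   # quantised model under test, as in the logs

# Perplexity + KL divergence (.ppx): save base-model logits once, then compare
# the quantised model against them.
./llama-perplexity -m "$BASE"  -f wiki.test.raw --kl-divergence-base logits-f16.bin
./llama-perplexity -m "$MODEL" --kl-divergence-base logits-f16.bin --kl-divergence > scores.ppx

# ARC, MMLU and TruthfulQA (.arc / .mmlu / .tqa) use the multiple-choice runner.
./llama-perplexity -m "$MODEL" --multiple-choice --multiple-choice-tasks 750 -f arc-validation.bin > scores.arc

# HellaSwag (.hsw) and WinoGrande (.wng) have dedicated modes.
./llama-perplexity -m "$MODEL" --hellaswag  --hellaswag-tasks 750  -f hellaswag_val.txt       > scores.hsw
./llama-perplexity -m "$MODEL" --winogrande --winogrande-tasks 750 -f winogrande-debiased.csv > scores.wng
```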
scores/Qwen3-30B-A3B-pruned-F16.arc ADDED
@@ -0,0 +1,15 @@
+ build: 5553 (c7e0a205) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-amazon-linux
+ llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA1 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA2 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA3 (Tesla T4) - 14810 MiB free
+ llama_model_loader: loaded meta data with 40 key-value pairs and 579 tensors from ./Qwen3-30B-A3B-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 66.6667 +/- 1.7225
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 476545.17 ms
+ llama_perf_context_print: prompt eval time = 317100.07 ms / 35972 tokens ( 8.82 ms per token, 113.44 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 320554.96 ms / 35973 tokens
scores/Qwen3-30B-A3B-pruned-F16.hsw ADDED
@@ -0,0 +1,14 @@
+ build: 5553 (c7e0a205) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-amazon-linux
+ llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA1 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA2 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA3 (Tesla T4) - 14810 MiB free
+ llama_model_loader: loaded meta data with 40 key-value pairs and 579 tensors from ./Qwen3-30B-A3B-F16.gguf (version GGUF V3 (latest))
+
+ 750 72.66666667% [69.3676%, 75.7347%]
+
+
+ llama_perf_context_print: load time = 14042.88 ms
+ llama_perf_context_print: prompt eval time = 953982.56 ms / 123581 tokens ( 7.72 ms per token, 129.54 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 971012.88 ms / 123582 tokens
scores/Qwen3-30B-A3B-pruned-F16.mmlu ADDED
@@ -0,0 +1,15 @@
+ build: 5553 (c7e0a205) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-amazon-linux
+ llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA1 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA2 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA3 (Tesla T4) - 14810 MiB free
+ llama_model_loader: loaded meta data with 40 key-value pairs and 579 tensors from ./Qwen3-30B-A3B-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 42.1333 +/- 1.8042
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 13886.47 ms
+ llama_perf_context_print: prompt eval time = 494837.42 ms / 67719 tokens ( 7.31 ms per token, 136.85 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 500285.67 ms / 67720 tokens
scores/Qwen3-30B-A3B-pruned-F16.tqa ADDED
@@ -0,0 +1,15 @@
+ build: 5553 (c7e0a205) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-amazon-linux
+ llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA1 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA2 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA3 (Tesla T4) - 14810 MiB free
+ llama_model_loader: loaded meta data with 40 key-value pairs and 579 tensors from ./Qwen3-30B-A3B-F16.gguf (version GGUF V3 (latest))
+
+ Final result: 31.2000 +/- 1.6929
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 13704.38 ms
+ llama_perf_context_print: prompt eval time = 426482.94 ms / 49696 tokens ( 8.58 ms per token, 116.53 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 433376.19 ms / 49697 tokens
scores/Qwen3-30B-A3B-pruned-F16.wng ADDED
@@ -0,0 +1,13 @@
+ build: 5553 (c7e0a205) with cc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-5) for x86_64-amazon-linux
+ llama_model_load_from_file_impl: using device CUDA0 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA1 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA2 (Tesla T4) - 14810 MiB free
+ llama_model_load_from_file_impl: using device CUDA3 (Tesla T4) - 14810 MiB free
+ llama_model_loader: loaded meta data with 40 key-value pairs and 579 tensors from ./Qwen3-30B-A3B-F16.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 75.8667 +/- 1.5635
+
+ llama_perf_context_print: load time = 13885.91 ms
+ llama_perf_context_print: prompt eval time = 165214.42 ms / 21448 tokens ( 7.70 ms per token, 129.82 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 168672.50 ms / 21449 tokens
scores/Qwen3-30B-A3B-pruned-iq3_m.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 56.8000 +/- 1.8100
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5963.39 ms
+ llama_perf_context_print: prompt eval time = 37054.73 ms / 35972 tokens ( 1.03 ms per token, 970.78 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 37976.03 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_m.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_M.gguf (version GGUF V3 (latest))
+
+ 750 70.26666667% [66.8989%, 73.4279%]
+
+
+ llama_perf_context_print: load time = 973.57 ms
+ llama_perf_context_print: prompt eval time = 124967.37 ms / 126038 tokens ( 0.99 ms per token, 1008.57 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 128697.47 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_m.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 39.0667 +/- 1.7827
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 991.14 ms
+ llama_perf_context_print: prompt eval time = 66988.49 ms / 67719 tokens ( 0.99 ms per token, 1010.91 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 68293.40 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 77.090453 ± 1.044822
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 73.55%
+ Mean ln(PPL(Q)/PPL(base)) : 2.211294 ± 0.009454
+ Mean PPL(Q)/PPL(base) : 9.127518 ± 0.086296
+ Mean PPL(Q)-PPL(base) : 68.644515 ± 0.997862
+
+ ====== KL divergence statistics ======
+ Mean KLD: 2.063818 ± 0.006856
+ Maximum KLD: 39.386982
+ 99.9% KLD: 19.179407
+ 99.0% KLD: 12.731147
+ 99.0% KLD: 12.731147
+ Median KLD: 1.246560
+ 10.0% KLD: 0.011396
+ 5.0% KLD: 0.001648
+ 1.0% KLD: 0.000063
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -9.665 ± 0.088 %
+ Maximum Δp: 99.654%
+ 99.9% Δp: 93.397%
+ 99.0% Δp: 73.865%
+ 95.0% Δp: 40.361%
+ 90.0% Δp: 19.833%
+ 75.0% Δp: 0.586%
+ Median Δp: -0.448%
+ 25.0% Δp: -15.350%
+ 10.0% Δp: -62.944%
+ 5.0% Δp: -90.459%
+ 1.0% Δp: -99.970%
+ 0.1% Δp: -100.000%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.199 ± 0.092 %
+ Same top p: 57.360 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-iq3_m.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final result: 30.6667 +/- 1.6849
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 1061.94 ms
+ llama_perf_context_print: prompt eval time = 52253.67 ms / 49696 tokens ( 1.05 ms per token, 951.05 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 54015.85 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_m.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 62.5333 +/- 1.7686
+
+ llama_perf_context_print: load time = 1084.28 ms
+ llama_perf_context_print: prompt eval time = 21513.44 ms / 21448 tokens ( 1.00 ms per token, 996.96 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 22046.16 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_s.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 48.5333 +/- 1.8262
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5752.48 ms
+ llama_perf_context_print: prompt eval time = 37048.76 ms / 35972 tokens ( 1.03 ms per token, 970.94 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 37992.87 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_s.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_S.gguf (version GGUF V3 (latest))
+
+ 750 68.66666667% [65.2590%, 71.8841%]
+
+
+ llama_perf_context_print: load time = 1004.53 ms
+ llama_perf_context_print: prompt eval time = 127701.50 ms / 126038 tokens ( 1.01 ms per token, 986.97 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 131559.01 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_s.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 37.0667 +/- 1.7648
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 1051.40 ms
+ llama_perf_context_print: prompt eval time = 67708.86 ms / 67719 tokens ( 1.00 ms per token, 1000.15 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 69083.94 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_s.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 69.935907 ± 0.918185
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 72.89%
+ Mean ln(PPL(Q)/PPL(base)) : 2.113894 ± 0.009177
+ Mean PPL(Q)/PPL(base) : 8.280419 ± 0.075991
+ Mean PPL(Q)-PPL(base) : 61.489969 ± 0.871820
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.997500 ± 0.006825
+ Maximum KLD: 36.616871
+ 99.9% KLD: 18.998100
+ 99.0% KLD: 12.936396
+ 99.0% KLD: 12.936396
+ Median KLD: 1.190034
+ 10.0% KLD: 0.013888
+ 5.0% KLD: 0.002134
+ 1.0% KLD: 0.000090
+ Minimum KLD: -0.000004
+
+ ====== Token probability statistics ======
+ Mean Δp: -10.199 ± 0.088 %
+ Maximum Δp: 99.504%
+ 99.9% Δp: 93.029%
+ 99.0% Δp: 72.891%
+ 95.0% Δp: 39.848%
+ 90.0% Δp: 19.063%
+ 75.0% Δp: 0.528%
+ Median Δp: -0.540%
+ 25.0% Δp: -16.787%
+ 10.0% Δp: -63.592%
+ 5.0% Δp: -91.195%
+ 1.0% Δp: -99.977%
+ 0.1% Δp: -100.000%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.474 ± 0.092 %
+ Same top p: 56.834 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-iq3_s.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final result: 32.0000 +/- 1.7045
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 1017.78 ms
+ llama_perf_context_print: prompt eval time = 51819.94 ms / 49696 tokens ( 1.04 ms per token, 959.01 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 53589.19 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq3_s.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ3_S.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 63.8667 +/- 1.7553
+
+ llama_perf_context_print: load time = 1065.10 ms
+ llama_perf_context_print: prompt eval time = 21437.40 ms / 21448 tokens ( 1.00 ms per token, 1000.49 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 21986.82 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq4_nl.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 61.7333 +/- 1.7759
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 7153.23 ms
+ llama_perf_context_print: prompt eval time = 37237.77 ms / 35972 tokens ( 1.04 ms per token, 966.01 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 38188.71 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq4_nl.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ 750 71.20000000% [67.8576%, 74.3263%]
+
+
+ llama_perf_context_print: load time = 1206.30 ms
+ llama_perf_context_print: prompt eval time = 127295.75 ms / 126038 tokens ( 1.01 ms per token, 990.12 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 131179.47 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq4_nl.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 40.9333 +/- 1.7967
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 1184.18 ms
+ llama_perf_context_print: prompt eval time = 68103.64 ms / 67719 tokens ( 1.01 ms per token, 994.35 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 69454.89 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq4_nl.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 58.059268 ± 0.724129
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 73.87%
+ Mean ln(PPL(Q)/PPL(base)) : 1.927779 ± 0.008539
+ Mean PPL(Q)/PPL(base) : 6.874224 ± 0.058701
+ Mean PPL(Q)-PPL(base) : 49.613331 ± 0.677412
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.827625 ± 0.006356
+ Maximum KLD: 37.203815
+ 99.9% KLD: 17.289213
+ 99.0% KLD: 12.241351
+ 99.0% KLD: 12.241351
+ Median KLD: 1.062228
+ 10.0% KLD: 0.013747
+ 5.0% KLD: 0.002407
+ 1.0% KLD: 0.000120
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -10.074 ± 0.087 %
+ Maximum Δp: 99.662%
+ 99.9% Δp: 90.888%
+ 99.0% Δp: 70.751%
+ 95.0% Δp: 38.384%
+ 90.0% Δp: 18.656%
+ 75.0% Δp: 0.543%
+ Median Δp: -0.549%
+ 25.0% Δp: -15.860%
+ 10.0% Δp: -62.678%
+ 5.0% Δp: -91.248%
+ 1.0% Δp: -99.979%
+ 0.1% Δp: -100.000%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.013 ± 0.093 %
+ Same top p: 58.204 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-iq4_nl.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final result: 30.9333 +/- 1.6889
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 1254.92 ms
+ llama_perf_context_print: prompt eval time = 52206.26 ms / 49696 tokens ( 1.05 ms per token, 951.92 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 53979.12 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-iq4_nl.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-IQ4_NL.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 65.8667 +/- 1.7325
+
+ llama_perf_context_print: load time = 1229.62 ms
+ llama_perf_context_print: prompt eval time = 21328.35 ms / 21448 tokens ( 0.99 ms per token, 1005.61 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 21840.27 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_l.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 57.8667 +/- 1.8042
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5836.97 ms
+ llama_perf_context_print: prompt eval time = 38153.96 ms / 35972 tokens ( 1.06 ms per token, 942.81 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 39110.03 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_l.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ 750 71.73333333% [68.4062%, 74.8389%]
+
+
+ llama_perf_context_print: load time = 1007.76 ms
+ llama_perf_context_print: prompt eval time = 130309.00 ms / 126038 tokens ( 1.03 ms per token, 967.22 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 134163.99 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_l.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 38.6667 +/- 1.7794
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 1034.53 ms
+ llama_perf_context_print: prompt eval time = 69376.50 ms / 67719 tokens ( 1.02 ms per token, 976.11 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 70750.75 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_l.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 60.855606 ± 0.768774
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 73.47%
+ Mean ln(PPL(Q)/PPL(base)) : 1.974818 ± 0.008712
+ Mean PPL(Q)/PPL(base) : 7.205311 ± 0.062773
+ Mean PPL(Q)-PPL(base) : 52.409668 ± 0.722246
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.886749 ± 0.006413
+ Maximum KLD: 32.243908
+ 99.9% KLD: 18.146301
+ 99.0% KLD: 12.053426
+ 99.0% KLD: 12.053426
+ Median KLD: 1.116526
+ 10.0% KLD: 0.014264
+ 5.0% KLD: 0.002443
+ 1.0% KLD: 0.000117
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -10.112 ± 0.088 %
+ Maximum Δp: 99.677%
+ 99.9% Δp: 92.123%
+ 99.0% Δp: 72.105%
+ 95.0% Δp: 39.096%
+ 90.0% Δp: 19.004%
+ 75.0% Δp: 0.549%
+ Median Δp: -0.551%
+ 25.0% Δp: -16.110%
+ 10.0% Δp: -63.590%
+ 5.0% Δp: -91.581%
+ 1.0% Δp: -99.973%
+ 0.1% Δp: -100.000%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.285 ± 0.093 %
+ Same top p: 57.653 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-q3_k_l.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final result: 32.2667 +/- 1.7082
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 1045.56 ms
+ llama_perf_context_print: prompt eval time = 53181.04 ms / 49696 tokens ( 1.07 ms per token, 934.47 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 54884.20 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_l.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_L.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 65.2000 +/- 1.7405
+
+ llama_perf_context_print: load time = 964.35 ms
+ llama_perf_context_print: prompt eval time = 21817.15 ms / 21448 tokens ( 1.02 ms per token, 983.08 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 22321.50 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_m.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 56.4000 +/- 1.8119
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5706.95 ms
+ llama_perf_context_print: prompt eval time = 37640.79 ms / 35972 tokens ( 1.05 ms per token, 955.67 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 38556.26 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_m.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ 750 70.80000000% [67.4465%, 73.9415%]
+
+
+ llama_perf_context_print: load time = 924.40 ms
+ llama_perf_context_print: prompt eval time = 127735.77 ms / 126038 tokens ( 1.01 ms per token, 986.71 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 131427.60 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_m.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 39.3333 +/- 1.7849
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 943.54 ms
+ llama_perf_context_print: prompt eval time = 67380.31 ms / 67719 tokens ( 0.99 ms per token, 1005.03 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 68692.56 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 59.072808 ± 0.741897
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 73.82%
+ Mean ln(PPL(Q)/PPL(base)) : 1.945085 ± 0.008614
+ Mean PPL(Q)/PPL(base) : 6.994227 ± 0.060246
+ Mean PPL(Q)-PPL(base) : 50.626870 ± 0.695177
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.857932 ± 0.006326
+ Maximum KLD: 31.393671
+ 99.9% KLD: 17.826597
+ 99.0% KLD: 11.908570
+ 99.0% KLD: 11.908570
+ Median KLD: 1.097884
+ 10.0% KLD: 0.013517
+ 5.0% KLD: 0.002297
+ 1.0% KLD: 0.000108
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -9.998 ± 0.087 %
+ Maximum Δp: 99.678%
+ 99.9% Δp: 91.515%
+ 99.0% Δp: 72.068%
+ 95.0% Δp: 39.410%
+ 90.0% Δp: 18.817%
+ 75.0% Δp: 0.535%
+ Median Δp: -0.541%
+ 25.0% Δp: -15.834%
+ 10.0% Δp: -62.929%
+ 5.0% Δp: -91.189%
+ 1.0% Δp: -99.971%
+ 0.1% Δp: -99.999%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.157 ± 0.092 %
+ Same top p: 57.800 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-q3_k_m.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 31.6000 +/- 1.6988
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 973.04 ms
+ llama_perf_context_print: prompt eval time = 52005.69 ms / 49696 tokens ( 1.05 ms per token, 955.59 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 53651.35 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_m.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 64.8000 +/- 1.7451
+
+ llama_perf_context_print: load time = 1010.06 ms
+ llama_perf_context_print: prompt eval time = 21536.50 ms / 21448 tokens ( 1.00 ms per token, 995.89 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 22054.60 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_s.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 58.1333 +/- 1.8026
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 5783.27 ms
+ llama_perf_context_print: prompt eval time = 37846.85 ms / 35972 tokens ( 1.05 ms per token, 950.46 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 38757.96 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_s.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ 750 71.46666667% [68.1319%, 74.5827%]
+
+
+ llama_perf_context_print: load time = 885.76 ms
+ llama_perf_context_print: prompt eval time = 127239.40 ms / 126038 tokens ( 1.01 ms per token, 990.56 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 130949.53 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_s.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 38.9333 +/- 1.7816
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 972.00 ms
+ llama_perf_context_print: prompt eval time = 67683.27 ms / 67719 tokens ( 1.00 ms per token, 1000.53 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 68987.39 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_s.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 61.676169 ± 0.780539
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 73.64%
+ Mean ln(PPL(Q)/PPL(base)) : 1.988212 ± 0.008711
+ Mean PPL(Q)/PPL(base) : 7.302465 ± 0.063613
+ Mean PPL(Q)-PPL(base) : 53.230232 ± 0.733873
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.888847 ± 0.006380
+ Maximum KLD: 33.008038
+ 99.9% KLD: 17.721254
+ 99.0% KLD: 12.006232
+ 99.0% KLD: 12.006232
+ Median KLD: 1.128817
+ 10.0% KLD: 0.013821
+ 5.0% KLD: 0.002327
+ 1.0% KLD: 0.000107
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -10.182 ± 0.088 %
+ Maximum Δp: 99.676%
+ 99.9% Δp: 91.955%
+ 99.0% Δp: 72.330%
+ 95.0% Δp: 39.211%
+ 90.0% Δp: 18.768%
+ 75.0% Δp: 0.478%
+ Median Δp: -0.585%
+ 25.0% Δp: -16.265%
+ 10.0% Δp: -63.264%
+ 5.0% Δp: -91.364%
+ 1.0% Δp: -99.971%
+ 0.1% Δp: -99.999%
+ Minimum Δp: -100.000%
+ RMS Δp : 35.283 ± 0.093 %
+ Same top p: 57.426 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-q3_k_s.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 30.9333 +/- 1.6889
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 988.65 ms
+ llama_perf_context_print: prompt eval time = 52374.28 ms / 49696 tokens ( 1.05 ms per token, 948.86 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 54015.45 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q3_k_s.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q3_K_S.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 63.0667 +/- 1.7635
+
+ llama_perf_context_print: load time = 929.52 ms
+ llama_perf_context_print: prompt eval time = 21714.40 ms / 21448 tokens ( 1.01 ms per token, 987.73 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 22212.48 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_m.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 60.5333 +/- 1.7860
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 7096.93 ms
+ llama_perf_context_print: prompt eval time = 38331.71 ms / 35972 tokens ( 1.07 ms per token, 938.44 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 39294.96 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_m.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ 750 71.46666667% [68.1319%, 74.5827%]
+
+
+ llama_perf_context_print: load time = 1212.85 ms
+ llama_perf_context_print: prompt eval time = 130689.40 ms / 126038 tokens ( 1.04 ms per token, 964.41 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 134566.53 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_m.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 41.8667 +/- 1.8026
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 1263.34 ms
+ llama_perf_context_print: prompt eval time = 69966.59 ms / 67719 tokens ( 1.03 ms per token, 967.88 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 71361.75 ms / 67720 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_m.ppx ADDED
@@ -0,0 +1,37 @@
+ ====== Perplexity statistics ======
+ Mean PPL(Q) : 58.664820 ± 0.740540
+ Mean PPL(base) : 8.445938 ± 0.065177
+ Cor(ln(PPL(Q)), ln(PPL(base))): 74.32%
+ Mean ln(PPL(Q)/PPL(base)) : 1.938155 ± 0.008608
+ Mean PPL(Q)/PPL(base) : 6.945921 ± 0.059790
+ Mean PPL(Q)-PPL(base) : 50.218882 ± 0.693471
+
+ ====== KL divergence statistics ======
+ Mean KLD: 1.826410 ± 0.006359
+ Maximum KLD: 38.350769
+ 99.9% KLD: 17.546576
+ 99.0% KLD: 12.219206
+ 99.0% KLD: 12.219206
+ Median KLD: 1.056963
+ 10.0% KLD: 0.011905
+ 5.0% KLD: 0.002018
+ 1.0% KLD: 0.000098
+ Minimum KLD: -0.000003
+
+ ====== Token probability statistics ======
+ Mean Δp: -9.585 ± 0.087 %
+ Maximum Δp: 99.633%
+ 99.9% Δp: 91.281%
+ 99.0% Δp: 72.420%
+ 95.0% Δp: 39.757%
+ 90.0% Δp: 19.570%
+ 75.0% Δp: 0.574%
+ Median Δp: -0.481%
+ 25.0% Δp: -15.306%
+ 10.0% Δp: -60.898%
+ 5.0% Δp: -90.180%
+ 1.0% Δp: -99.976%
+ 0.1% Δp: -100.000%
+ Minimum Δp: -100.000%
+ RMS Δp : 34.806 ± 0.092 %
+ Same top p: 58.557 ± 0.128 %
scores/Qwen3-30B-A3B-pruned-q4_k_m.tqa ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final result: 30.9333 +/- 1.6889
+ Random chance: 19.8992 +/- 1.4588
+
+
+ llama_perf_context_print: load time = 1267.40 ms
+ llama_perf_context_print: prompt eval time = 53738.21 ms / 49696 tokens ( 1.08 ms per token, 924.78 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 55506.24 ms / 49697 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_m.wng ADDED
@@ -0,0 +1,11 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_M.gguf (version GGUF V3 (latest))
+
+ Final Winogrande score(750 tasks): 66.1333 +/- 1.7292
+
+ llama_perf_context_print: load time = 1232.55 ms
+ llama_perf_context_print: prompt eval time = 21981.09 ms / 21448 tokens ( 1.02 ms per token, 975.75 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 22534.28 ms / 21449 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_s.arc ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 60.8000 +/- 1.7838
+ Random chance: 25.0083 +/- 1.5824
+
+
+ llama_perf_context_print: load time = 6986.96 ms
+ llama_perf_context_print: prompt eval time = 38183.57 ms / 35972 tokens ( 1.06 ms per token, 942.08 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 39137.36 ms / 35973 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_s.hsw ADDED
@@ -0,0 +1,12 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ 750 71.06666667% [67.7206%, 74.1981%]
+
+
+ llama_perf_context_print: load time = 1188.49 ms
+ llama_perf_context_print: prompt eval time = 130416.13 ms / 126038 tokens ( 1.03 ms per token, 966.43 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 134331.92 ms / 126039 tokens
+ ggml_metal_free: deallocating
scores/Qwen3-30B-A3B-pruned-q4_k_s.mmlu ADDED
@@ -0,0 +1,13 @@
+ build: 5580 (bfb1e012) with Apple clang version 17.0.0 (clang-1700.0.13.3) for arm64-apple-darwin24.4.0
+ llama_model_load_from_file_impl: using device Metal (Apple M4 Max) - 49151 MiB free
+ llama_model_loader: loaded meta data with 42 key-value pairs and 555 tensors from ./Qwen3-30B-A3B-Q4_K_S.gguf (version GGUF V3 (latest))
+
+ Final result: 41.4667 +/- 1.8002
+ Random chance: 25.0000 +/- 1.5822
+
+
+ llama_perf_context_print: load time = 1227.63 ms
+ llama_perf_context_print: prompt eval time = 69732.87 ms / 67719 tokens ( 1.03 ms per token, 971.12 tokens per second)
+ llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
+ llama_perf_context_print: total time = 71119.80 ms / 67720 tokens
+ ggml_metal_free: deallocating