Atlas: LLaMA-3.3-70B fine-tuned for Harmonized Tariff Schedule (HTS) classification

This model is presented in the paper ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification.

Atlas is a domain-specialized LLaMA-3.3-70B model fine-tuned on U.S. Customs CROSS rulings for Harmonized Tariff Schedule (HTS) code assignment.
It targets both 10-digit U.S. HTS (compliance) and 6-digit HS (globally harmonized) accuracy.

  • 10-digit exact match: 40.0%
  • 6-digit exact match: 57.5%
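
A 10-digit U.S. HTS code nests the globally harmonized 6-digit HS code as its leading prefix, which is why the two metrics above differ. A minimal decomposition sketch (illustrative only, not from the paper):

```python
def split_hts(code: str) -> dict:
    """Decompose a 10-digit U.S. HTS code into its nested levels."""
    digits = code.replace(".", "")
    return {
        "chapter": digits[:2],   # HS chapter (e.g. 29 = organic chemicals)
        "heading": digits[:4],   # HS heading
        "hs6": digits[:6],       # 6-digit HS subheading (globally harmonized)
        "hts10": digits[:10],    # full 10-digit U.S.-specific HTS code
    }

print(split_hts("2933.59.4700"))
# {'chapter': '29', 'heading': '2933', 'hs6': '293359', 'hts10': '2933594700'}
```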

Atlas outperforms general-purpose LLMs while remaining deployable/self-hostable.

Example (from the demo):

User:
What is the HTS US Code for 4[N-(2,4-Diamino-6-Pteridinylmethyl)-N-Methylamino] Benzoic Acid Sodium Salt?

Model:
HTS US Code -> 2933.59.4700
Reasoning -> Falls under heterocyclic compounds with nitrogen hetero-atom(s); specifically classified within pteridine derivatives used in pharmaceutical or biochemical applications per CROSS rulings.


TL;DR

  • Task: Assign an HTS code given a product description (and optionally rationale).
  • Why it matters: Misclassification halts shipments; 6-digit HS is global, 10-digit is U.S.-specific.
  • What's new: First open benchmark + strong open model baseline focused on semiconductors/manufacturing.

Intended use & limitations

Use cases

  • Automated HTS/HS pre-classification with human-in-the-loop review.
  • Decision support for brokers, compliance, and trade workflows.
  • Research on domain reasoning, retrieval, and alignment.

Limitations

  • Not legal advice; rulings change and are context-dependent.
  • Training data is concentrated in semiconductors/manufacturing; performance may vary elsewhere.
  • Model can produce confident but incorrect codes; keep a human validator for high-stakes usage.
  • Always verify against the current HTS/USITC and local customs guidance.

Data

  • Source: CROSS (U.S. Customs Rulings Online Search System).
  • Splits: 18,254 train / 200 valid / 200 test.
  • Each example includes:
    • product description
    • chain-of-reasoning style justification
    • ground-truth HTS code

Dataset card: flexifyai/cross_rulings_hts_dataset_for_tariffs
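
A minimal loading sketch with the 🤗 `datasets` library (split and field names are assumptions; check the dataset card for the actual schema):

```python
from datasets import load_dataset

# CROSS rulings dataset: 18,254 train / 200 valid / 200 test examples.
ds = load_dataset("flexifyai/cross_rulings_hts_dataset_for_tariffs")

# Each example carries a product description, a chain-of-reasoning
# justification, and the ground-truth HTS code (field names assumed).
example = ds["train"][0]
print(example.keys())
```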


Training setup (summary)

  • Base: LLaMA-3.3-70B (dense)
  • Objective: Supervised fine-tuning (token-level NLL)
  • Optimizer: AdamW (β1=0.9, β2=0.95, wd=0.1), cosine LR schedule, peak LR 1e-7
  • Precision: bf16, gradient accumulation (effective batch ≈ 64 seqs)
  • Hardware: 16× A100-80GB, 5 epochs (~1.4k steps)
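
A rough translation of these hyperparameters into 🤗 `transformers` `TrainingArguments` (a sketch, not the authors' training script; how the effective batch of ≈64 is split across 16 GPUs is an assumption):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="atlas-sft",
    num_train_epochs=5,
    learning_rate=1e-7,              # peak LR, decayed with a cosine schedule
    lr_scheduler_type="cosine",
    adam_beta1=0.9,
    adam_beta2=0.95,
    weight_decay=0.1,
    bf16=True,
    per_device_train_batch_size=1,   # 16 GPUs x 1 seq x 4 accumulation = 64 (assumed split)
    gradient_accumulation_steps=4,
)
```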

We chose a dense model for simpler fine-tuning/inference and reproducibility under budget constraints.
Future work: retrieval, DPO/GRPO, and smaller distilled variants.


Results (200-example held-out test)

| Model | 10-digit exact | 6-digit exact | Avg. digits correct |
|-------|---------------:|--------------:|--------------------:|
| GPT-5-Thinking | 25.0% | 55.5% | 5.61 |
| Gemini-2.5-Pro-Thinking | 13.5% | 31.0% | 2.92 |
| DeepSeek-R1 (05/28) | 2.5% | 26.5% | 3.24 |
| GPT-OSS-120B | 1.5% | 8.0% | 2.58 |
| LLaMA-3.3-70B (base) | 2.1% | 20.7% | 3.31 |
| Atlas (this model) | 40.0% | 57.5% | 6.30 |
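
The metrics can be reproduced along these lines (a sketch; the normalization, e.g. stripping dots before comparison, is an assumption):

```python
def hts_metrics(pred: str, gold: str) -> tuple[bool, bool, int]:
    """Return (10-digit exact match, 6-digit exact match, leading digits correct)."""
    p = pred.replace(".", "")[:10]
    g = gold.replace(".", "")[:10]
    # "Avg. digits correct" averages the count of matching leading digits.
    digits_correct = 0
    for a, b in zip(p, g):
        if a != b:
            break
        digits_correct += 1
    return p == g, p[:6] == g[:6], digits_correct

print(hts_metrics("2933.59.4700", "2933.59.4200"))  # (False, True, 7)
```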

💰 Cost note: Self-hosting Atlas on A100s can be significantly cheaper per 1k inferences than proprietary APIs.


Prompting

Atlas expects an instruction like:

User:
What is the HTS US Code for [product_description]?

Model:
HTS US Code -> [10-digit code]
Reasoning -> [short justification]

Minimal example

User:
What is the HTS US Code for 300mm silicon wafers, polished, un-doped, for semiconductor fabrication?

Model:
HTS US Code -> 3818.00.0000
Reasoning -> Classified under chemical elements/compounds doped for electronics; wafer form per CROSS precedents.
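
A minimal self-hosted inference sketch with 🤗 `transformers` (the chat formatting below is an assumption; defer to the tokenizer's bundled chat template):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "flexifyai/atlas-llama3.3-70b-hts-classification"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

question = (
    "What is the HTS US Code for 300mm silicon wafers, polished, "
    "un-doped, for semiconductor fabrication?"
)
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# Greedy decoding for deterministic codes; always verify against USITC.
output = model.generate(inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0][inputs.shape[-1]:], skip_special_tokens=True))
```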


Authors

  • Pritish Yuvraj (Flexify.AI) โ€” pritishyuvraj.com
  • Siva Devarakonda (Flexify.AI)

📖 Citation

If you find this work useful, please cite our paper:

@misc{yuvraj2025atlasbenchmarkingadaptingllms,
  title={ATLAS: Benchmarking and Adapting LLMs for Global Trade via Harmonized Tariff Code Classification}, 
  author={Pritish Yuvraj and Siva Devarakonda},
  year={2025},
  eprint={2509.18400},
  archivePrefix={arXiv},
  primaryClass={cs.AI},
  url={https://arxiv.org/abs/2509.18400}, 
}