MyBusModel: A Custom NER Model for Public Transport Queries
BusRouteNER is a lightweight, rule-enhanced Named Entity Recognition (NER) model fine-tuned for identifying bus numbers and stops/locations in natural language queries related to public transportation in West Bengal, India.
What does this model do?
This model is trained to extract two key entity types from user queries:
- BUS_NUMBER: Recognizes bus numbers like 12C/1, S-12, 12B, etc.
- LOCATION: Identifies source and destination locations such as Howrah, Barrackpore, Santragachi, etc.
It also filters out irrelevant noise words to give a clean and accurate entity list that can be used in downstream logic such as search, recommendations, or route-finding.
Example
Input Query:
I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?
Model Output:
Santragachi LOCATION
Barrackpore LOCATION
12C/1 BUS_NUMBER
S-12 BUS_NUMBER
How it works
- Built using spaCy (en_core_web_sm) and extended with EntityRuler for custom NER logic.
- Bus numbers and stop names are sourced from curated CSV datasets.
- Custom regex patterns identify bus numbers with formats like 12C/1, S-12, etc.
- Noise words like I, want, take, can, should are excluded from final entity extraction.
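The rule-based part of the steps above can be approximated with the standard library alone. The sketch below is not the model's actual spaCy pipeline: the stop list, the noise-word set, the regular expression, and the function name `extract_entities` are all assumptions chosen to illustrate the idea of combining a stop gazetteer, a bus-number pattern, and a noise-word filter.

```python
import re

# Hypothetical stand-ins for the curated CSV data; the real lists are not shown here.
KNOWN_STOPS = {"howrah", "barrackpore", "santragachi"}
NOISE_WORDS = {"i", "want", "take", "can", "should"}

# Assumed bus-number shapes: a letter prefix with a hyphen (S-12), or digits with
# an optional letter suffix and an optional "/digit" part (12B, 12C/1).
BUS_NUMBER_RE = re.compile(r"[A-Z]{1,3}-\d{1,3}[A-Z]?|\d{1,3}[A-Z]?(?:/\d{1,2})?")

# Tokens may contain "/" and "-" so that bus numbers stay in one piece.
TOKEN_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9/-]*")


def extract_entities(query):
    """Return (text, label) pairs for stops and bus numbers found in the query."""
    entities = []
    for token in TOKEN_RE.findall(query):
        lowered = token.lower()
        if lowered in NOISE_WORDS:
            continue  # drop noise words before any matching
        if lowered in KNOWN_STOPS:
            entities.append((token, "LOCATION"))
        elif BUS_NUMBER_RE.fullmatch(token):
            entities.append((token, "BUS_NUMBER"))
    return entities


query = "I want to go from Santragachi to Barrackpore, can I take 12C/1 or S-12?"
for text, label in extract_entities(query):
    print(text, label)
```

This simplified version only matches single-word stop names and treats any bare number as a bus number, so it is best read as an illustration of the matching logic rather than a replacement for the EntityRuler-based pipeline.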
Results
- Precision: 0.8078439964943033
- Recall: 0.6660043352601156
- F1 Score: 0.7300990099009901
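As a quick consistency check, the F1 score should be the harmonic mean of precision and recall. The sketch below recomputes it from the reported values; the helper name `f1_score` is only illustrative, and nothing here reflects how the evaluation was actually run.

```python
import math

# Values copied from the results listed above.
precision = 0.8078439964943033
recall = 0.6660043352601156
reported_f1 = 0.7300990099009901


def f1_score(p, r):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


computed_f1 = f1_score(precision, recall)
print(computed_f1)
print(math.isclose(computed_f1, reported_f1, rel_tol=1e-9))
```

The gap between precision and recall suggests the model misses more entities than it mislabels, which is typical for pattern- and gazetteer-driven extraction when stops or bus formats fall outside the curated lists.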