Model Card for vb-jd-kwextractor

Resume customization is a key step in the job search process. A customized resume can dramatically increase the chances of advancing to the next stage of a job application. Recruiters use sophisticated Applicant Tracking Systems (ATS), which rank applicants' resumes by how closely they match the job description in question and may even reject outright any resume that falls below a certain match threshold. In such a situation, how does one go about customizing a resume? We at Veersynd Bessa, a freelance project group at Humber Polytechnic in Toronto, asked the same question and found that one of the best (and most widely recommended) approaches is to pick keywords out of the job description itself and work them into the resume.

It is true that one can manually pick the keywords out of each job description and repeat the process for every application. However, this extractor can save considerable time and effort, which can then be put to productive use elsewhere. Our project team has gone one step further and built a resume customizer that works the identified keywords into resume bullet points using a Large Language Model (LLM) in the SAR (Story, Action, Result) format; you can find the link to the GitHub repository here. Either way, the first step remains the same: identifying meaningful keywords in the job description.

We turned to artificial intelligence for the task of extracting keywords. Information theory tells us that specific semantic patterns in a body of text make the whole text meaningful. It follows that specific semantic patterns also make some words in that text more important and relevant than others, i.e. keyphrases. A deep learning model can learn to exploit these patterns and predict keyphrases in any given body of text. The task could also be approached statistically (using entropy, frequency counts, and similar measures), but in our opinion a deep learning model delivers much better results thanks to its bottom-up approach, which overcomes the limitations of hand-crafted formulas. The present model has therefore been trained to identify keywords in job descriptions, which can then be used to populate a resume either manually or through an automation tool (not unlike ours).

Model Description

The present model is based on the KBIR model, which ml6team fine-tuned on the KPCrowd dataset to extract keywords from news articles; you can read more about their work here. Our model is in turn fine-tuned on a custom dataset that was painstakingly created by our team in our free time. At the time of writing, the dataset is still under development and will be made available to the public once it reaches a higher quality.

Like the KBIR-KPCrowd model, the present model is a transformer fine-tuned on a token classification task, in which each word in the document is classified as being part of a keyphrase or not:

| Label | Description |
|-------|-------------|
| B-KEY | At the beginning of a keyphrase |
| I-KEY | Inside a keyphrase |
| O | Outside a keyphrase |
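To make the labelling scheme concrete, the snippet below walks through a small hand-labelled example (illustrative only, not actual model output) and shows how contiguous B-KEY/I-KEY tags are reassembled into keyphrases:

```python
# Hand-labelled example of the B/I/O scheme described above.
tokens = ["Experience", "with", "Apache", "Spark", "is", "required"]
labels = ["O", "O", "B-KEY", "I-KEY", "O", "O"]

# Collect contiguous B-KEY/I-KEY spans into keyphrases.
keyphrases, current = [], []
for token, label in zip(tokens, labels):
    if label == "B-KEY":                 # a new keyphrase starts here
        if current:
            keyphrases.append(" ".join(current))
        current = [token]
    elif label == "I-KEY" and current:   # continue the current keyphrase
        current.append(token)
    else:                                # outside any keyphrase
        if current:
            keyphrases.append(" ".join(current))
        current = []
if current:
    keyphrases.append(" ".join(current))

print(keyphrases)  # ['Apache Spark']
```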

Uses

The model is intended to be used for token classification (or Named Entity Recognition) on job descriptions written in English ONLY.

How to Get Started with the Model

Use the code below to get started with the model.

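The snippet below is a minimal sketch using the Hugging Face transformers token-classification pipeline. The aggregation_strategy="simple" setting (which merges B-KEY/I-KEY tokens into whole phrases) and the sample job description are our assumptions about typical usage, not settings confirmed for this model.

```python
from transformers import pipeline

# Load the model and tokenizer directly from the Hub.
extractor = pipeline(
    "token-classification",
    model="penguincapo/vb-jd-kwextractor",
    aggregation_strategy="simple",  # merge B-KEY/I-KEY tokens into keyphrases
)

job_description = (
    "We are looking for a Data Engineer with strong experience in Python, "
    "Apache Spark, and cloud data warehouses such as Snowflake."
)

# Deduplicate and strip whitespace from the predicted spans.
keyphrases = sorted({result["word"].strip() for result in extractor(job_description)})
print(keyphrases)
```

Each returned span also carries a confidence score, so filtering on result["score"] is a simple way to trade recall for precision.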

Training Details

Training Data

[More Information Needed]

Training Procedure

Preprocessing [optional]

[More Information Needed]

Training Hyperparameters

  • Training regime: [More Information Needed]

Speeds, Sizes, Times [optional]

[More Information Needed]

Evaluation

Testing Data, Factors & Metrics

Testing Data

[More Information Needed]

Factors

[More Information Needed]

Metrics

[More Information Needed]

Results

[More Information Needed]

Summary

Model Examination [optional]

[More Information Needed]

Environmental Impact

Carbon emissions can be estimated using the Machine Learning Impact calculator presented in Lacoste et al. (2019).

  • Hardware Type: [More Information Needed]
  • Hours used: [More Information Needed]
  • Cloud Provider: [More Information Needed]
  • Compute Region: [More Information Needed]
  • Carbon Emitted: [More Information Needed]
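As a hedged sketch, emissions for a future training run could also be logged programmatically with the open-source codecarbon package; this is our suggestion for filling in the figures above, not a record of how (or whether) the original run was tracked.

```python
from codecarbon import EmissionsTracker

# Wrap the fine-tuning loop in a tracker to estimate energy use and CO2.
tracker = EmissionsTracker(project_name="vb-jd-kwextractor-finetune")
tracker.start()
# ... fine-tuning code goes here ...
emissions = tracker.stop()  # estimated emissions in kg CO2-equivalent
print(f"Estimated emissions: {emissions:.4f} kg CO2eq")
```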

Technical Specifications [optional]

Model Architecture and Objective

A transformer encoder (base model: bloomberg/KBIR; 354M parameters, F32 safetensors weights) with a token classification head that labels each token as B-KEY, I-KEY, or O.

Compute Infrastructure

[More Information Needed]

Hardware

[More Information Needed]

Software

[More Information Needed]

Citation [optional]

BibTeX:

[More Information Needed]

APA:

[More Information Needed]

Glossary [optional]

[More Information Needed]

More Information [optional]

[More Information Needed]

Model Card Authors [optional]

[More Information Needed]

Model Card Contact

[More Information Needed]
