parameterization
How do you do the parameterization? I want to use your model, but I don't know how to control the compression ratio.
Hi Wen!
Thanks for your interest in our model!
We provide a `threshold` argument in the `provence.process` function which increases or decreases the compression level, e.g. `provence_output = provence.process(question, context, threshold=0.5, always_select_title=True)`. This is a threshold on the per-token pruning probabilities output by the model. For each sentence, we binarize these probabilities using the threshold and then apply what we call sentence rounding: the sentence is removed if more of its tokens are pruned than kept.
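To make the thresholding and sentence rounding concrete, here is a minimal standalone sketch. It is not the code from `modeling_provence.py`: the function name `round_sentences` and the toy scores are made up for illustration, and it assumes that a score above the threshold marks a token as kept (flip the comparison if your scores mean the opposite).

```python
from typing import List


def round_sentences(sentence_scores: List[List[float]], threshold: float) -> List[bool]:
    """Return one flag per sentence: True if the sentence is kept.

    Hypothetical helper for illustration only.
    Assumption: a token whose score is above `threshold` counts as kept.
    """
    keep_flags = []
    for scores in sentence_scores:
        # Binarize the per-token scores with the threshold.
        kept_tokens = sum(score > threshold for score in scores)
        pruned_tokens = len(scores) - kept_tokens
        # Sentence rounding: drop the sentence only if more tokens are pruned than kept.
        keep_flags.append(pruned_tokens <= kept_tokens)
    return keep_flags


# Toy per-token scores for three sentences (made-up numbers).
toy_scores = [[0.9, 0.8, 0.1], [0.2, 0.1, 0.7], [0.6, 0.4]]
print(round_sentences(toy_scores, threshold=0.5))
print(round_sentences(toy_scores, threshold=0.85))
```

Under this assumption, raising the threshold removes more sentences, so the threshold only steers the compression level; the actual ratio still depends on the scores for each question-context pair.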
If you mean the ability to hit a pre-specified target compression ratio for each context, Provence does not provide this: the idea of Provence is that the compression ratio for each question-context pair is determined by the data plus the threshold. In principle, however, you can modify the code in https://huggingface.co/naver/provence-reranker-debertav3-v1/blob/main/modeling_provence.py to reach a given compression ratio for each context, e.g. select the top p% of tokens instead of pruning by a threshold in line 181. Even then, the compression ratio will not be exactly p%, because of sentence rounding.
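As a rough illustration of the top-p% idea, the sketch below picks the highest-scoring fraction of tokens and then applies sentence rounding. It is a standalone toy, not a patch for `modeling_provence.py`: `select_top_fraction`, its arguments, and the scores are hypothetical, and it assumes higher scores mean a token is more likely to be kept.

```python
from typing import List, Tuple


def select_top_fraction(
    sentence_scores: List[List[float]], fraction: float
) -> Tuple[List[bool], float]:
    """Keep the top `fraction` of tokens by score, then round to whole sentences.

    Hypothetical helper for illustration only.
    Returns the per-sentence keep flags and the fraction of tokens actually kept.
    """
    flat = [score for scores in sentence_scores for score in scores]
    n_keep = int(fraction * len(flat))  # number of tokens to select (rounded down)
    # Indices of the n_keep highest-scoring tokens across the whole context.
    ranked = sorted(range(len(flat)), key=lambda i: flat[i], reverse=True)
    selected = set(ranked[:n_keep])

    keep_flags = []
    kept_token_count = 0
    offset = 0
    for scores in sentence_scores:
        kept = sum((offset + i) in selected for i in range(len(scores)))
        pruned = len(scores) - kept
        # Sentence rounding: drop the sentence only if more tokens are pruned than kept.
        keep_sentence = pruned <= kept
        keep_flags.append(keep_sentence)
        if keep_sentence:
            kept_token_count += len(scores)
        offset += len(scores)
    return keep_flags, kept_token_count / len(flat)


# Toy per-token scores for three sentences (made-up numbers).
toy_scores = [[0.9, 0.8, 0.1], [0.2, 0.1, 0.7], [0.6, 0.4]]
flags, actual_fraction = select_top_fraction(toy_scores, fraction=0.5)
print(flags, actual_fraction)
```

With these toy scores the target is 50% of tokens, but after rounding to whole sentences 5 of 8 tokens (62.5%) remain, which is the mismatch mentioned above.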
I hope it helps!