|
--- |
|
language: |
|
- en |
|
library_name: transformers |
|
license: cc-by-4.0 |
|
tags: |
|
- kl3m |
|
- kl3m-002 |
|
- patent |
|
- all the patents |
|
- slm |
|
date: '2024-03-12T00:00:00.000Z' |
|
pipeline_tag: text-generation |
|
widget:
  - text: "# Patent\n\n## Title"
inference:
  parameters:
    temperature: 0.3
    do_sample: true
|
--- |
|
|
|
# All the Patents 170m Model |
|
|
|
`kl3m-002-170m-patent` is a (very) small language model (SLM) fine-tuned from `kl3m-002-170m` to
|
generate "realistic" patent text. For more information about the base model, |
|
please see [its model page](https://huggingface.co/alea-institute/kl3m-002-170m). |
|
|
|
# All the Patents |
|
|
|
## Why? |
|
|
|
#### If a GPT-2-sized model can generate a valid set of claims, should anyone be able to monopolize the invention?
|
|
|
At their heart, patents are a temporary, sanctioned monopoly on an invention through a license to sue. This monopoly |
|
is justified by the public good created by encouraging innovation and the long-term impact of that innovation being |
|
shared in the public domain. |
|
|
|
Unfortunately, this worthy policy goal has been lost in the chaos and misuse of the patent system. |
|
|
|
One of the most common sources of frustration is the granting of "obvious" patents. While some inventions are clearly novel |
|
and non-obvious, many are not - but still slip through the examination process. These obvious but granted patents then |
|
loom large over the market, creating a "thicket" that discourages use or subsequent invention in the area of the granted |
|
patent. "Undoing" the grant of a patent is a costly and time-consuming process with possible negative consequences, and |
|
so many of these patents simply sit on the books as prior art, even if the patent holder knows they could never enforce them.
|
|
|
Congress and various stakeholders have discussed and proposed changes over time, including most recently the |
|
America Invents Act (AIA), but the problem of obvious patents persists. |
|
|
|
But what if someone were to generate all the obvious inventions and make them public? |
|
|
|
What if we shared the means of producing these obvious inventions so that everyone could help generate them on a normal CPU or consumer GPU? |
|
|
|
And what if we could then make those obvious inventions easily searchable for anyone, including PTO examiners themselves, to use? |
|
|
|
## How it Works |
|
|
|
We start with a small, GPT-2-sized language model - [kl3m-002-170m](https://273ventures.com/kl3m-the-first-legal-large-language-model/) - which was trained on a clean, copyright-free dataset.
|
This helps ensure that generations do not include copyrighted text, which would allow third parties to interfere with the project
|
via DMCA takedown requests. |
|
|
|
Next, we fine-tune this model on two simultaneous tasks: |
|
|
|
1. **Top-down drafting**: We start from the most abstract parts of the patent - the title and abstract - and then generate the detailed claims. This is a traditional next-token prediction order. |
|
|
|
```text |
|
# Patent |
|
|
|
## Title |
|
{title} |
|
|
|
## Abstract |
|
{abstract} |
|
|
|
## Claims |
|
|
|
1. {claim 1} |
|
|
|
2. {claim 2} |
|
|
|
... |
|
``` |
|
|
|
2. **Bottom-up**: We start from the most detailed part of the patent - the claims - and then generate the abstract and title. This reversed order can be thought of as similar to traditional extractive/abstractive summarization tasks. |
|
|
|
```text |
|
# Patent |
|
|
|
## Claims |
|
|
|
1. {claim 1} |
|
|
|
2. {claim 2} |
|
|
|
... |
|
|
|
## Abstract |
|
{abstract} |
|
|
|
## Title |
|
{title} |
|
``` |
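For concreteness, here is a minimal sketch of how a single patent record could be rendered into both training formats. The `patent` record and its field names are illustrative assumptions, not the actual KL3M fine-tuning pipeline:

```python
# Illustrative only: assemble both training formats from one (assumed) record.
patent = {
    "title": "Adjustable widget",
    "abstract": "A widget with an adjustable frobnicator.",
    "claims": [
        "A widget comprising a frobnicator.",
        "The widget of claim 1, wherein the frobnicator is adjustable.",
    ],
}

# Claims are numbered as in the templates above.
claims_text = "\n\n".join(
    f"{i}. {claim}" for i, claim in enumerate(patent["claims"], start=1)
)

# Task 1: top-down (title -> abstract -> claims)
top_down = (
    f"# Patent\n\n## Title\n{patent['title']}\n\n"
    f"## Abstract\n{patent['abstract']}\n\n## Claims\n\n{claims_text}"
)

# Task 2: bottom-up (claims -> abstract -> title)
bottom_up = (
    f"# Patent\n\n## Claims\n\n{claims_text}\n\n"
    f"## Abstract\n{patent['abstract']}\n\n## Title\n{patent['title']}"
)
```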
|
|
|
Once this fine-tuning is complete, we can then generate new patents using either technique by prompting the model as follows: |
|
|
|
1. **Top-down prompt**: `"# Patent\n\n## Title"` |
|
|
|
2. **Bottom-up prompt**: `"# Patent\n\n## Claims"` |
|
|
|
It's critical that generation occurs with sufficient randomness and diversity to ensure that the generated patents are not |
|
simply reproductions of the training data. This is a key area of ongoing research and development. |
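One simple guard, sketched below as an illustration (not the project's actual tooling), is to flag generations whose n-grams overlap heavily with a reference corpus; the choice of `n` and the flagging threshold are assumptions:

```python
# A minimal sketch of an n-gram overlap check between a generated patent and
# reference text. Not the project's actual deduplication tooling.
def ngrams(text: str, n: int = 8) -> set:
    tokens = text.split()
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def overlap_ratio(generated: str, reference: str, n: int = 8) -> float:
    """Fraction of the generation's n-grams appearing verbatim in the reference."""
    gen = ngrams(generated, n)
    if not gen:
        return 0.0
    return len(gen & ngrams(reference, n)) / len(gen)

# e.g., flag a generation if overlap_ratio(gen_text, corpus_text) > 0.2
```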
|
|
|
**Much like the real process of invention, most of the "ideas" generated by this process will be either nonsense or
otherwise unpatentable. Our goal is to estimate the "hit rate" of the model and continue to improve the efficiency and
|
accessibility of the generation process so that the "cost per obvious invention" is as low as possible.** |
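As a back-of-the-envelope illustration (both numbers below are assumptions for the sake of the arithmetic, not measurements):

```python
# Illustrative arithmetic only: both inputs are assumed, not measured.
cost_per_candidate = 0.0001  # assumed cost (USD) to generate one candidate patent
hit_rate = 1 / 1000          # assumed fraction of candidates that are coherent and "obvious"

print(f"${cost_per_candidate / hit_rate:.2f} per obvious invention")  # $0.10 under these assumptions
```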
|
|
|
## Current Status |
|
|
|
This project is still in its infancy. We're doing R&D to develop prototype tools that demonstrate the feasibility and
cost of generating and sharing these obvious inventions. This R&D is currently focused on data collection,
|
data curation, model training, and model evaluation. |
|
|
|
|
|
## Generation |
|
|
|
You can generate your own examples as follows. For a "complete" patent, increase `max_new_tokens` as far as your available VRAM (or RAM, on CPU) allows.
|
|
|
```python |
|
import json |
|
from transformers import pipeline |
|
|
|
# Load the model and tokenizer on CPU |
|
p = pipeline('text-generation', 'alea-institute/kl3m-002-170m-patent', device='cpu') |
|
|
|
# Example usage on CPU |
|
text = "# Patent\n\n## Title" |
|
print( |
|
json.dumps( |
|
[ |
|
r.get("generated_text") |
|
for r in p(text, do_sample=True, temperature=0.5, num_return_sequences=3, max_new_tokens=32) |
|
], |
|
indent=2 |
|
) |
|
) |
|
``` |
|
|
|
```json |
|
[ |
|
"# Patent\n\n## Title\nMethod for manufacturing a temperature-controllable polyurethane composition and method", |
|
"# Patent\n\n## Title\nElectronic device\n\n## Abstract\nAn electronic device includes a display panel and a", |
|
"# Patent\n\n## Title\nMethods and devices for tissue repair using a neural network\n\n## Abstract" |
|
] |
|
``` |
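The bottom-up technique works the same way, reusing the `p` pipeline and `json` import from the example above:

```python
# Bottom-up generation: start from the claims instead of the title.
results = p(
    "# Patent\n\n## Claims",
    do_sample=True,
    temperature=0.5,
    num_return_sequences=3,
    max_new_tokens=32,
)
print(json.dumps([r.get("generated_text") for r in results], indent=2))
```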
|
|
|
### Related Material |
|
|
|
* https://www.federalregister.gov/documents/2024/02/27/2024-03967/updated-guidance-for-making-a-proper-determination-of-obviousness |
|
|
|
## License |
|
|
|
This model was originally developed by 273 Ventures and has been donated to the ALEA Institute. |
|
|
|
The model weights are released under the CC-BY 4.0 License. |
|
|
|
## Contact |
|
|
|
The KL3M model family is now maintained by the [ALEA Institute](https://aleainstitute.ai). For technical support, collaboration opportunities, or general inquiries: |
|
|
|
- GitHub: https://github.com/alea-institute/kl3m-model-research |
|
- Email: [email protected] |
|
- Website: https://aleainstitute.ai |
|
|
|
## Acknowledgments |
|
|
|
Special thanks to 273 Ventures for developing and donating this model to the open-source community through the ALEA Institute.
|
|
|
|
|
## Citation |
|
|
|
Tokenizer, dataset, and model publications are pending. |
|
|
|
|
|
|
 |