nielsr (HF Staff) committed
Commit c530556 · verified · 1 Parent(s): 3ee0085

Add sample usage to model card


This PR enhances the model card for MMaDA-8B-MixCoT by adding a "Sample Usage" section.

The new section provides a Python code snippet, derived from the project's GitHub repository, demonstrating how to perform image generation with the model. This makes it easier for users to quickly get started with the model's capabilities.

Files changed (1)
  1. README.md +36 -1
README.md CHANGED
@@ -1,8 +1,9 @@
 ---
-license: mit
 library_name: transformers
+license: mit
 pipeline_tag: any-to-any
 ---
+
 # MMaDA-8B-MixCoT
 
 We introduce MMaDA, a novel class of multimodal diffusion foundation models designed to achieve superior performance across diverse domains such as textual reasoning, multimodal understanding, and text-to-image generation. MMaDA is distinguished by three key innovations:
@@ -15,6 +16,40 @@ Compared to [MMaDA-8B-Base](https://huggingface.co/Gen-Verse/MMaDA-8B-Base), MMa
 
 [Paper](https://arxiv.org/abs/2505.15809) | [Code](https://github.com/Gen-Verse/MMaDA) | [Demo](https://huggingface.co/spaces/Gen-Verse/MMaDA)
 
+## Sample Usage
+
+You can use the provided `FlexARInferenceSolver` from the [GitHub repository](https://github.com/Gen-Verse/MMaDA) to easily perform various tasks, such as image generation.
+
+First, ensure you have cloned the repository and installed the necessary dependencies as per the GitHub repository's instructions (`pip install -r requirements.txt`).
+
+```python
+from MMaDA.inference_solver import FlexARInferenceSolver
+from PIL import Image
+
+# ******************** Image Generation ********************
+inference_solver = FlexARInferenceSolver(
+    model_path="Gen-Verse/MMaDA-8B-MixCoT",
+    precision="bf16",
+    target_size=768,
+)
+
+q1 = f"Generate an image of 768x768 according to the following prompt:\n" \
+     f"Image of a dog playing water, " \
+     f"and a waterfall is in the background."
+
+# generated: tuple of (generated response, list of generated images)
+generated = inference_solver.generate(
+    images=[],
+    qas=[[q1, None]],
+    max_gen_len=8192,
+    temperature=1.0,
+    logits_processor=inference_solver.create_logits_processor(cfg=4.0, image_top_k=2000),
+)
+
+a1, new_image = generated[0], generated[1][0]
+new_image.show()  # Display the generated image
+```
+
 # Citation
 
 ```