metadata
base_model: THUDM/CogVideoX-5b
datasets: modal-labs/dissolve
library_name: diffusers
license: other
license_link: https://huggingface.co/THUDM/CogVideoX-5b/blob/main/LICENSE
instance_prompt: >-
  PIKA_DISSOLVE A pristine snowglobe featuring a winter scene sits peacefully.
  The globe violently explodes, sending glass, water, and glittering fake snow
  in all directions. The scene is captured with high-speed photography.
widget:
  - text: >-
      PIKA_DISSOLVE A meticulously detailed, tea cup, sits centrally on a dark
      brown circular pedestal. The cup, seemingly made of clay, begins to
      dissolve from the bottom up. The disintegration process is rapid but not
      explosive, with a cloud of fine, light tan dust forming and rising in a
      swirling, almost ethereal column that expands outwards before slowly
      descending. The dust particles are individually visible as they float, and
      the overall effect is one of delicate disintegration rather than
      shattering. Finally, only the empty pedestal and the intricately patterned
      marble floor remain.
    output:
      url: ./assets/output_cup.mp4
  - text: >-
      PIKA_DISSOLVE Resting quietly atop an ancient stone altar, a delicately
      carved wooden mask starts to crumble from its outer edges. The intricate
      patterns crack and give way, releasing a fine, smoke-like plume of
      mahogany-hued particles that dance upwards, then disperse gradually into
      the hushed atmosphere. As the dust descends, the once captivating mask is
      reduced to an outline on the weathered altar.
    output:
      url: ./assets/output_altar.mp4
  - text: >-
      PIKA_DISSOLVE A slender glass vase, brimming with tiny white pebbles,
      stands centered on a polished ebony dais. Without warning, the glass
      begins to dissolve from the edges inward. Wisps of translucent dust swirl
      upward in an elegant spiral, illuminating each pebble as they drop onto
      the dais. The gently drifting dust eventually settles, leaving only the
      scattered stones and faint traces of shimmering powder on the stage.
    output:
      url: ./assets/output_vase.mp4
  - text: >-
      PIKA_DISSOLVE On a narrow marble ledge, a gracefully folded paper crane
      rests, its surface marked by delicate ink lines. It starts to fragment
      from the tail feathers outward, releasing a cloud of feather-light pulp
      fibers. Suspended for a moment in a magical swirl, the fibers drift back
      down, cloaking the ledge in a near-transparent veil of white. Then the
      ledge stands empty, the crane’s faint silhouette lingering in memory.
    output:
      url: ./assets/output_marble.mp4
tags:
- text-to-video
- diffusers-training
- diffusers
- cogvideox
- cogvideox-diffusers
- template:sd-lora
This is a fine-tune of the THUDM/CogVideoX-5b model on the modal-labs/dissolve dataset.
Training code: https://github.com/a-r-r-o-w/finetrainers
Inference code:
from diffusers import CogVideoXTransformer3DModel, DiffusionPipeline
from diffusers.utils import export_to_video
import torch

# Load the fine-tuned transformer from this model repository.
transformer = CogVideoXTransformer3DModel.from_pretrained(
    "sayakpaul/pika-dissolve-v0", torch_dtype=torch.bfloat16
)
# Use it in place of the base transformer in the CogVideoX-5b pipeline.
pipeline = DiffusionPipeline.from_pretrained(
    "THUDM/CogVideoX-5b", transformer=transformer, torch_dtype=torch.bfloat16
).to("cuda")

# The prompt starts with the PIKA_DISSOLVE token, as in the example prompts above.
prompt = """
PIKA_DISSOLVE A slender glass vase, brimming with tiny white pebbles, stands centered on a polished ebony dais. Without warning, the glass begins to dissolve from the edges inward. Wisps of translucent dust swirl upward in an elegant spiral, illuminating each pebble as they drop onto the dais. The gently drifting dust eventually settles, leaving only the scattered stones and faint traces of shimmering powder on the stage.
"""
negative_prompt = "inconsistent motion, blurry motion, worse quality, degenerate outputs, deformed outputs"

# Generate 81 frames at 768x512 and keep the first (only) video in the batch.
video = pipeline(
    prompt=prompt,
    negative_prompt=negative_prompt,
    num_frames=81,
    height=512,
    width=768,
    num_inference_steps=50,
).frames[0]
# Write the frames to an MP4 file at 25 frames per second.
export_to_video(video, "output_vase.mp4", fps=25)
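
Every example prompt in this card begins with the `PIKA_DISSOLVE` token, and the snippet above exports 81 frames at 25 fps. The sketch below shows one way to prepare your own prompts and to work out the resulting clip length. The helper names `with_trigger` and `clip_seconds` are hypothetical and not part of diffusers. That the token has to come first is an assumption inferred from the examples, not a documented requirement.

```python
TRIGGER = "PIKA_DISSOLVE"  # token that prefixes every example prompt in this card


def with_trigger(description: str, trigger: str = TRIGGER) -> str:
    """Return the description with the trigger token as its first word (hypothetical helper)."""
    # Collapse newlines and repeated spaces so triple-quoted prompts become one line.
    text = " ".join(description.split())
    # Leave the prompt alone if the trigger is already its first word.
    if text.split(" ", 1)[0] == trigger:
        return text
    return f"{trigger} {text}"


def clip_seconds(num_frames: int, fps: int) -> float:
    """Playback length in seconds of a clip exported with the given frame count and fps."""
    return num_frames / fps


if __name__ == "__main__":
    print(with_trigger("A clay teacup on a pedestal dissolves into fine dust."))
    print(clip_seconds(81, 25))  # 81 frames at 25 fps, as in the snippet above
```

With the settings used in this card, the exported clip runs for 81 / 25 = 3.24 seconds. The string returned by `with_trigger` can be passed as `prompt` to the pipeline call.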