Skip to content

Configuration Reference

Global settings are stored in config.yaml at the root of your Onset Engine installation. These values serve as defaults — most can be overridden per-project in Studio Mode or via CLI flags.

editor:
audio_crossfade: 2.0 # Crossfade duration between clips (seconds)
ducking_volume: 0.3 # Volume of music during voice-detected segments (0.0–1.0)
fade_in_dur: 1.0 # Fade-in duration at video start (seconds)
fade_out_dur: 2.0 # Fade-out duration at video end (seconds)
fps: 30.0 # Output frame rate
mix_source_audio: false # Mix original clip audio with the music track
outro_padding: 2.0 # Padding after last clip before fade-out (seconds)
source_audio_threshold: 0.005 # Volume threshold for source audio detection
source_volume: 1.5 # Volume multiplier for source audio when mixed
target_res: # Output resolution [width, height]
- 1920
- 1080
watermark_text: "" # Watermark text overlay (leave empty for none)
generator:
batch_size: 64 # Number of frames processed per CLIP batch
clip_len_max: 5.0 # Maximum clip duration during ingest (seconds)
clip_len_min: 2.0 # Minimum clip duration during ingest (seconds)
clip_model: ViT-L-14 # CLIP model architecture
clip_pretrained: laion2b_s32b_b82k # CLIP pretrained weights
sample_fps: 4.0 # Frame sampling rate during CLIP analysis
paths:
music_dir: ./music # Default directory for music files
output_root: ./output_clips # Default directory for rendered output
jamendo:
client_id: "" # Your Jamendo API key (free at developer.jamendo.com)

Duration of the audio crossfade between clips. Higher values create smoother transitions but can muddy fast-paced edits. Set to 0 for hard audio cuts.

Output frame rate. Use 30.0 for standard quality or 60.0 for smooth motion. Higher FPS doubles render time and file size.

Output resolution as [width, height]. Common values:

  • [1920, 1080] — 1080p (default)
  • [3840, 2160] — 4K
  • [1280, 720] — 720p (draft)

When true, the original audio from source clips is mixed with the music track. Voice activity detection auto-ducks the music during dialogue. The ducking_volume and source_volume values control the mix.

Text overlaid on the rendered output. Set to an empty string to disable. The Demo tier forces a watermark regardless of this setting.

Number of frames sent to the CLIP model per inference batch. Higher values use more VRAM but process faster. Reduce to 32 if you hit OOM errors during ingest.

Controls the minimum and maximum clip duration during scene detection. Shorter ranges produce more, shorter clips — better for fast-paced edits. Longer ranges preserve more continuous footage.

How many frames per second are sampled for CLIP analysis. Higher values analyze more frames but increase ingest time. The default of 4.0 provides good coverage for most content.

Default directory scanned for music files. Shown in Studio Mode’s Audio Bin tab.

Where rendered videos and temporary chunks are saved. The chunks/ subdirectory is used for intermediate renders and can be safely deleted after rendering completes.

Your Jamendo API key for free music discovery. Register at developer.jamendo.com (free, 2 minutes). Can also be set via the Jamendo panel or Global Preferences in the GUI.

Most config values can be overridden in Studio Mode:

  • ⚙️ Global Preferences modal for output root, music dir, and watermark
  • Inspector panel for quality tier, FPS, and resolution
  • Per-project job files store project-specific overrides