Configuration Reference

Overview

Global settings are stored in config.yaml at the root of your Onset Engine installation. These values serve as defaults — most can be overridden per-project in Studio Mode or via CLI flags.
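
The layering described above can be pictured as a recursive dictionary merge. This is a minimal sketch, not Onset Engine's own loader: it assumes a precedence of global defaults, then per-project overrides, then CLI flags, which this reference does not spell out. `deep_merge`, `project_overrides`, and `cli_overrides` are hypothetical names used for illustration.

```python
from copy import deepcopy


def deep_merge(base: dict, override: dict) -> dict:
    """Return a copy of base with the values from override layered on top."""
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Merge nested sections (such as "editor") key by key.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Defaults copied from the Full Reference in this document.
global_defaults = {
    "editor": {"fps": 30.0, "target_res": [1920, 1080], "watermark_text": ""}
}
# Hypothetical overrides; the precedence order is an assumption.
project_overrides = {"editor": {"fps": 60.0}}
cli_overrides = {"editor": {"target_res": [1280, 720]}}

effective = deep_merge(deep_merge(global_defaults, project_overrides), cli_overrides)
print(effective)
```

Settings that no layer overrides, such as `watermark_text` here, keep their global default.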

Full Reference

editor:
  audio_crossfade: 2.0          # Crossfade duration between clips (seconds)
  ducking_volume: 0.3           # Volume of music during voice-detected segments (0.0–1.0)
  fade_in_dur: 1.0              # Fade-in duration at video start (seconds)
  fade_out_dur: 2.0             # Fade-out duration at video end (seconds)
  fps: 30.0                     # Output frame rate
  mix_source_audio: false       # Mix original clip audio with the music track
  outro_padding: 2.0            # Padding after last clip before fade-out (seconds)
  source_audio_threshold: 0.005 # Volume threshold for source audio detection
  source_volume: 1.5            # Volume multiplier for source audio when mixed
  target_res:                   # Output resolution [width, height]
    - 1920
    - 1080
  watermark_text: ""            # Watermark text overlay (leave empty for none)

generator:
  batch_size: 64                # Number of frames processed per CLIP batch
  clip_len_max: 5.0             # Maximum clip duration during ingest (seconds)
  clip_len_min: 2.0             # Minimum clip duration during ingest (seconds)
  clip_model: ViT-L-14          # CLIP model architecture
  clip_pretrained: laion2b_s32b_b82k  # CLIP pretrained weights
  sample_fps: 4.0               # Frame sampling rate during CLIP analysis

paths:
  music_dir: ./music            # Default directory for music files
  output_root: ./output_clips   # Default directory for rendered output

jamendo:
  client_id: ""                 # Your Jamendo API key (free at developer.jamendo.com)

Editor Settings

`audio_crossfade`

Duration of the audio crossfade between clips. Higher values create smoother transitions but can muddy fast-paced edits. Set to 0 for hard audio cuts.
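
To make the effect of the duration concrete, the sketch below computes the gain of the outgoing and incoming clips during a crossfade. It assumes a linear fade curve, which this reference does not confirm; `crossfade_gains` is a hypothetical helper, not an Onset Engine function.

```python
def crossfade_gains(t: float, duration: float) -> tuple[float, float]:
    """Return (outgoing_gain, incoming_gain) at t seconds into the crossfade."""
    if duration <= 0:
        # audio_crossfade of 0 is a hard cut: the incoming clip takes over at once.
        return 0.0, 1.0
    # Clamp progress to [0, 1] so times outside the fade window stay valid.
    progress = min(max(t / duration, 0.0), 1.0)
    return 1.0 - progress, progress


# Sample the default 2.0-second crossfade at a few points.
for t in (0.0, 0.5, 1.0, 1.5, 2.0):
    out_gain, in_gain = crossfade_gains(t, duration=2.0)
    print(f"t={t:.1f}s  outgoing={out_gain:.2f}  incoming={in_gain:.2f}")
```

With a longer duration both clips are audible together for longer, which is where the muddiness in fast-paced edits comes from.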

`fps`

Output frame rate. Use 30.0 for standard quality or 60.0 for smooth motion. Rendering at 60.0 produces twice as many frames as 30.0, which roughly doubles render time and file size.

`target_res`

Output resolution as [width, height]. Common values:

[1920, 1080] — 1080p (default)
[3840, 2160] — 4K
[1280, 720] — 720p (draft)
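
As a rough way to compare these options, the sketch below computes pixels rendered per second relative to the default of 1080p at 30 fps. Treating pixel throughput as a proxy for render load is an assumption; actual render time depends on the encoder and hardware. `relative_pixel_rate` is a hypothetical helper.

```python
DEFAULT_RES = (1920, 1080)  # editor.target_res default
DEFAULT_FPS = 30.0          # editor.fps default


def relative_pixel_rate(target_res, fps):
    """Pixels rendered per second, as a multiple of the default settings."""
    width, height = target_res
    default_rate = DEFAULT_RES[0] * DEFAULT_RES[1] * DEFAULT_FPS
    return (width * height * fps) / default_rate


print(relative_pixel_rate((3840, 2160), 30.0))  # 4K at 30 fps -> 4.0
print(relative_pixel_rate((1920, 1080), 60.0))  # 1080p at 60 fps -> 2.0
print(round(relative_pixel_rate((1280, 720), 30.0), 2))  # 720p draft -> 0.44
```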

`mix_source_audio`

When true, the original audio from source clips is mixed with the music track, and voice activity detection automatically ducks the music during dialogue. `ducking_volume` sets the music level during voice-detected segments, and `source_volume` scales the source audio in the mix.
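
The interaction between the two volume settings can be sketched per audio sample. This is an illustration under stated assumptions, not the engine's mixer: it assumes music plays at full volume outside voice-detected segments, that the two signals are simply summed, and that no limiting is applied. `mix_sample` is a hypothetical helper.

```python
DUCKING_VOLUME = 0.3  # editor.ducking_volume default
SOURCE_VOLUME = 1.5   # editor.source_volume default


def mix_sample(music: float, source: float, voice_detected: bool) -> float:
    """Combine one music sample and one source-audio sample."""
    # Assumption: music is only attenuated while a voice is detected.
    music_gain = DUCKING_VOLUME if voice_detected else 1.0
    return music * music_gain + source * SOURCE_VOLUME


print(mix_sample(0.5, 0.2, voice_detected=False))
print(mix_sample(0.5, 0.2, voice_detected=True))
```

Lowering `ducking_volume` makes dialogue stand out more; raising `source_volume` boosts the clip audio at all times.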

`watermark_text`

Text overlaid on the rendered output. Set to an empty string to disable. The Demo tier forces a watermark regardless of this setting.

Generator Settings

`batch_size`

Number of frames sent to the CLIP model per inference batch. Higher values use more VRAM but process faster. Reduce to 32 if you hit out-of-memory (OOM) errors during ingest.

`clip_len_min` / `clip_len_max`

Control the minimum and maximum clip duration during scene detection. Lower values produce more, shorter clips, which suits fast-paced edits. Higher values preserve longer stretches of continuous footage.
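
To get a feel for how these limits affect clip count, the sketch below bounds the number of clips for a given amount of footage. It assumes the footage is split entirely into clips whose lengths fall within the configured range, which may not match how the engine discards or trims material. `clip_count_bounds` is a hypothetical helper.

```python
import math


def clip_count_bounds(footage_seconds, clip_len_min=2.0, clip_len_max=5.0):
    """Return (fewest, most) clips if all footage is cut within the limits."""
    fewest = math.ceil(footage_seconds / clip_len_max)  # every clip at max length
    most = math.floor(footage_seconds / clip_len_min)   # every clip at min length
    return fewest, most


print(clip_count_bounds(60))  # defaults -> (12, 30)
print(clip_count_bounds(60, clip_len_min=1.0, clip_len_max=3.0))  # -> (20, 60)
```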

`sample_fps`

How many frames per second are sampled for CLIP analysis. Higher values analyze more frames but increase ingest time. The default of 4.0 provides good coverage for most content.
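
The combined effect of `sample_fps` and `batch_size` on ingest can be estimated with simple arithmetic. The sketch assumes frames are sampled uniformly across the footage and that every sampled frame goes through CLIP exactly once; `ingest_workload` is a hypothetical helper.

```python
import math


def ingest_workload(footage_seconds, sample_fps=4.0, batch_size=64):
    """Return (frames_sampled, clip_batches) for a given amount of footage."""
    frames = math.ceil(footage_seconds * sample_fps)
    batches = math.ceil(frames / batch_size)
    return frames, batches


# Ten minutes of footage.
print(ingest_workload(600))                  # defaults -> (2400, 38)
print(ingest_workload(600, batch_size=32))   # smaller batches -> (2400, 75)
print(ingest_workload(600, sample_fps=8.0))  # denser sampling -> (4800, 75)
```

Halving `batch_size` roughly doubles the number of inference batches, while doubling `sample_fps` doubles the number of frames analyzed.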

Path Settings

`music_dir`

Default directory scanned for music files. Shown in Studio Mode’s Audio Bin tab.

`output_root`

Where rendered videos and temporary chunks are saved. The chunks/ subdirectory is used for intermediate renders and can be safely deleted after rendering completes.
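
Cleaning up intermediate renders can be scripted. The sketch below assumes `chunks/` sits directly under `output_root` and that no render is in progress; `remove_chunks` is a hypothetical helper, and the demonstration runs against a throwaway directory rather than a real output folder.

```python
import shutil
import tempfile
from pathlib import Path


def remove_chunks(output_root) -> bool:
    """Delete output_root/chunks if it exists; return True if it was removed."""
    chunks = Path(output_root) / "chunks"
    if chunks.is_dir():
        shutil.rmtree(chunks)
        return True
    return False


# Demonstrate on a temporary directory standing in for output_root.
with tempfile.TemporaryDirectory() as demo_root:
    (Path(demo_root) / "chunks").mkdir()
    first = remove_chunks(demo_root)   # chunks/ existed and was deleted
    second = remove_chunks(demo_root)  # nothing left to delete
    print(first, second)
```

For a real installation, pass the configured `output_root` (for example `./output_clips`) only after rendering has completed.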

Jamendo Settings

`client_id`

Your Jamendo API key for free music discovery. Registration at developer.jamendo.com is free and takes about two minutes. The key can also be set via the Jamendo panel or Global Preferences in the GUI.

Overriding via GUI

Most config values can be overridden in Studio Mode:

⚙️ Global Preferences modal for output root, music dir, and watermark
Inspector panel for quality tier, FPS, and resolution
Per-project job files store project-specific overrides
