
Studio Mode

Studio Mode is the primary GUI for Onset Engine — a full-featured desktop application built with CustomTkinter. It provides library management, timeline visualization, real-time previews, and batch rendering with granular control over every parameter.

| Area | Description |
| --- | --- |
| Header | DJ MODE button, 🚀 AUTOPILOT, ⚙️ Preferences, ☀/🌙 theme toggle |
| Left panel | Tabbed library browser (Video Bin / Audio Bin) with thumbnail cards + 📥 INGEST |
| Center | Video player preview with transport controls + Welcome Launcher |
| Right panel | Inspector tabs (General, Advanced, Autopilot) + Action Block |
| Bottom | Timeline with clip blocks, audio waveform, and console logger. The GUI filters DEBUG lines; full logs — with a system-info header (OS, Python, PyTorch, CUDA, GPU, VRAM, FFmpeg, RAM) and per-line timestamps — are saved to `render.log` in the project output directory. |
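The split between the filtered GUI console and the full `render.log` can be sketched with Python's standard `logging` module. This is an illustrative setup, not the engine's actual implementation; the logger name and format strings are assumptions:

```python
import logging
import platform
import sys

def setup_logging(log_path="render.log"):
    """Console shows INFO and above; render.log captures everything with timestamps."""
    logger = logging.getLogger("onset")
    logger.setLevel(logging.DEBUG)

    # GUI/console handler: DEBUG lines are filtered out
    console = logging.StreamHandler(sys.stdout)
    console.setLevel(logging.INFO)
    console.setFormatter(logging.Formatter("%(message)s"))

    # File handler: full log with per-line timestamps
    file_h = logging.FileHandler(log_path, mode="w", encoding="utf-8")
    file_h.setLevel(logging.DEBUG)
    file_h.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger.addHandler(console)
    logger.addHandler(file_h)

    # System-info header written once at the top of the file (abridged here)
    logger.debug("OS: %s | Python: %s", platform.platform(), sys.version.split()[0])
    return logger
```

Because both handlers hang off one logger, a single `logger.debug(...)` call reaches the file but never the GUI.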

Studio Mode uses a deliberate two-step workflow:

The preview step runs clip selection in draft mode (~30 seconds). The engine:

  1. Analyzes your music track (beats, energy curve, drops)
  2. Matches clips to energy tiers using driver scoring
  3. Places clips on the timeline with transitions
  4. Displays the result in the timeline panel
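The tier-matching in steps 1–3 can be sketched as follows. The tier names, thresholds, and keyword-overlap scoring are assumptions for illustration, not the engine's actual scoring function:

```python
from dataclasses import dataclass, field

@dataclass
class Clip:
    name: str
    tags: set = field(default_factory=set)

def energy_tier(e_val: float) -> str:
    """Quantize a 0-1 energy value into four tiers."""
    if e_val < 0.25:
        return "LOW"
    if e_val < 0.50:
        return "MID"
    if e_val < 0.75:
        return "HIGH"
    return "MAX"

def score_clip(clip: Clip, driver: dict, tier: str) -> int:
    """Driver scoring: count how many of the tier's keywords the clip matches."""
    return len(clip.tags & set(driver.get(tier, [])))

def pick_clip(clips, driver, e_val):
    """Choose the best-scoring clip for the current energy level."""
    tier = energy_tier(e_val)
    return max(clips, key=lambda c: score_clip(c, driver, tier))
```

A quiet passage (low `e_val`) thus pulls clips tagged for the LOW tier, while a drop pulls MAX-tier clips.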

Once you’re happy with the timeline, click Render to produce the final video:

  • Full-quality rendering at your chosen quality tier
  • VFX pipeline applies all preset effects (signature VFX, depth, optical flow, grading)
  • Audio mixing with optional ducking
  • Progress tracked via Crate Digger minigame

Render Queue & Blocking:

  • Concurrent render blocking prevents starting multiple renders at the same time.
  • A dialog offers to “Add to Queue” if you attempt to render while another is running.
  • Pending queue items feature a red X remove button for easy cancellation.
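The blocking-plus-queue behavior above can be sketched with a lock and a pending deque. Class and method names are hypothetical:

```python
import threading
from collections import deque

class RenderQueue:
    """Blocks concurrent renders; additional jobs wait in a pending queue."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = None
        self.pending = deque()

    def request_render(self, job: str) -> str:
        with self._lock:
            if self._active is None:
                self._active = job
                return "started"
            # Another render is running: the job is queued instead
            self.pending.append(job)
            return "queued"

    def cancel_pending(self, job: str) -> None:
        """The red X on a pending item removes it from the queue."""
        self.pending.remove(job)

    def finish_active(self) -> None:
        """When a render completes, promote the next queued job, if any."""
        with self._lock:
            self._active = self.pending.popleft() if self.pending else None
```

In the GUI, the "queued" return value is what triggers the "Add to Queue" dialog.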

Studio Mode offers progressive disclosure for clip direction:

Two text fields at the top of the Clip Direction section:

  • “During quiet parts, focus on:” — purple-accented, maps to low energy tiers
  • “On the heavy drops, focus on:” — pink-accented, maps to high energy tiers

These auto-generate a 4-tier driver behind the scenes.
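The expansion from two text fields into a 4-tier driver might look like this sketch. The comma-splitting and the way the middle tiers blend both descriptions are assumptions for illustration:

```python
def driver_from_descriptions(quiet_text: str, drops_text: str) -> dict:
    """Expand the two Clip Direction fields into a 4-tier driver dict."""
    quiet = [w.strip() for w in quiet_text.split(",") if w.strip()]
    heavy = [w.strip() for w in drops_text.split(",") if w.strip()]
    return {
        "LOW":  quiet,          # quiet parts
        "MID":  quiet + heavy,  # transitional sections lean quiet
        "HIGH": heavy + quiet,  # transitional sections lean heavy
        "MAX":  heavy,          # heavy drops
    }
```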

Below a "── OR ──" divider, the advanced Driver section lets you:

  • Browse for a custom .json driver file
  • Open the ✨ Create wizard to build a driver visually
  • Clear the driver with the ✕ button

Priority: Driver JSON overrides text descriptions. When a driver is loaded, the text fields dim.

| Tier | Resolution | Encoder | Speed |
| --- | --- | --- | --- |
| Draft | 720p | FFmpeg direct stitch | ~30 seconds |
| Balanced | 1080p/30fps | NVENC p4/CQ21 | Medium |
| Maximum 30fps | 1080p/30fps | NVENC p6/CQ17 | Slow |
| Maximum 60fps | 1080p/60fps | NVENC p6/CQ17 | Slowest |
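The tiers above map onto FFmpeg argument sets roughly like this sketch (the dict layout is illustrative; treating the draft tier's direct stitch as a stream copy is an assumption, while `h264_nvenc` with `-preset p4/p6` and `-cq` are real FFmpeg NVENC options):

```python
QUALITY_TIERS = {
    # name: (resolution, fps, video-codec options)
    "draft":      ("1280x720",  30, ["-c:v", "copy"]),  # direct stitch, no re-encode
    "balanced":   ("1920x1080", 30, ["-c:v", "h264_nvenc", "-preset", "p4", "-cq", "21"]),
    "maximum_30": ("1920x1080", 30, ["-c:v", "h264_nvenc", "-preset", "p6", "-cq", "17"]),
    "maximum_60": ("1920x1080", 60, ["-c:v", "h264_nvenc", "-preset", "p6", "-cq", "17"]),
}

def encoder_args(tier: str) -> list:
    """Build the FFmpeg output arguments for a quality tier."""
    res, fps, vopts = QUALITY_TIERS[tier]
    return ["-s", res, "-r", str(fps), *vopts]
```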

Control how energy maps across the timeline:

| Mode | Behavior |
| --- | --- |
| Buildup | Forces tier progression from LOW → MAX based on timeline position (like DJ keybinds) |
| Flat | Constant energy (`e_val = 0.50`) — ignores music dynamics |
| None | Follows raw music energy — the default behavior |
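The three modes reduce to a small mapping function. The flat value of 0.50 comes from the table above; the linear ramp for buildup is an assumption (the engine only guarantees LOW → MAX progression by position):

```python
def arc_energy(mode: str, position: float, music_energy: float) -> float:
    """Map timeline position (0-1) and raw music energy (0-1) to e_val."""
    if mode == "buildup":
        return position       # LOW at the start, MAX at the end
    if mode == "flat":
        return 0.50           # constant, ignores music dynamics
    return music_energy       # "none": follow the raw music energy
```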

A live 3×3 thumbnail grid appears during ingest, showing AI analysis tags as each video file completes processing.

Crate Digger is a star-rating minigame that starts when you click Render. Rate unrated clips while you wait — curate your library during downtime.

The Welcome Launcher is a 3-card zero-state shown in the player area when no project is loaded, with cards for Autopilot, DJ Mode, and Studio. It is dismissed on first interaction.

Projects are saved as JSON files in the jobs/ directory:

```
jobs/job_MyProject.json
```

All settings are persisted: preset, driver, quality, text descriptions, audio file, collection filter, narrative arc mode, and VFX options.
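Saving and loading a job file can be sketched with plain `json`. The function names and the example settings keys are illustrative, though the `jobs/job_<name>.json` path pattern comes from the section above:

```python
import json
import os

def save_job(name: str, settings: dict, jobs_dir: str = "jobs") -> str:
    """Persist all project settings as jobs/job_<name>.json."""
    os.makedirs(jobs_dir, exist_ok=True)
    path = os.path.join(jobs_dir, f"job_{name}.json")
    with open(path, "w", encoding="utf-8") as f:
        json.dump(settings, f, indent=2)
    return path

def load_job(path: str) -> dict:
    """Restore a project's settings from its job file."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```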

The Advanced tab includes a VFX OPTIONS section with opt-in effects. All VFX require Core tier or higher — checkboxes are disabled on Demo.

  • ✨ Cutout Pop Transitions — AI-powered subject extraction using rembg. Subject pops out of one scene into the next. Requires a one-time ~170MB model download (auto-installed on first render). Adds ~5% to render time.
  • Face Detection — Powered by face_detect.py using MediaPipe for nose-tip keypoint detection across all render paths, outputting normalized [x, y] coordinates. Falls back to [0.5, 0.4] if no face is found or MediaPipe is unavailable.
  • RIFE Frame Interpolation — Interpolated output uses the same encoder (NVENC/x264) and quality parameters as the main render pipeline.
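The face-detection fallback behavior above can be sketched as follows. The `[0.5, 0.4]` fallback is from the source; using FaceMesh landmark index 1 as the nose tip is a common MediaPipe convention but should be treated as an assumption here:

```python
def detect_nose_tip(image) -> list:
    """Return normalized [x, y] nose-tip coordinates, or the documented
    [0.5, 0.4] fallback when no face is found or MediaPipe is unavailable."""
    if image is None:
        return [0.5, 0.4]
    try:
        import mediapipe as mp
    except ImportError:
        return [0.5, 0.4]  # MediaPipe not installed
    with mp.solutions.face_mesh.FaceMesh(static_image_mode=True) as fm:
        result = fm.process(image)  # expects an RGB numpy array
        if not result.multi_face_landmarks:
            return [0.5, 0.4]  # no face found
        tip = result.multi_face_landmarks[0].landmark[1]
        return [tip.x, tip.y]
```

Returning normalized coordinates keeps the result resolution-independent across all render paths.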