Dynamical Systems Orchestration v4.0 (MPE)

Overview

A lava lamp — a physical, analogue, thermodynamic system — is filmed continuously and translated, in real time, into continuous evolving audio. Lava lamp 'blobs' spawn a voice when they are detected. Each blob's location, coloration, and geometry is tracked throughout the span of its 'life'. To accomplish this, topological data analysis based on persistence, motion tracking using statistical attributes from image samples of the lamp and the physical dynamics of buoyant objects, and histogram matching based on hue are all employed.

Each blob's individual voice and contribution to the total score's orchestration changes depending on the blob's travel — both position and speed — temperature as derived from its coloration, and geometric evolution. Dynamic changes in the closed system of the lava lamp's lifecycle, such as the merging of two or bifurcation of one, correspond to harmonic and dissonant dynamics.

Dynamical Systems Orchestration (DSO) v4.0 is a generative-but-performable musical instrument. A lava lamp — a physical, analogue, thermodynamic system — is filmed continuously by a single USB camera. Persistent-homology blob detection, flow-advected mask tracking, and a physics-informed Kalman filter give every wax blob a stable identity and a stream of continuous control features. Those features no longer go straight to a synth voice as in earlier versions — instead, they drive a routing matrix on a sixteen-slot polyphonic synthesiser whose other input is a human player at a computer keyboard.

v4.0 reframes the system as a duet. The voice is the first-class object; the blob is one of several possible drivers of it. A voice slot can be entirely blob-driven (the v3.2 default), entirely keyboard-driven, or in HYBRID — the player sets pitch and timbre, the blob continues to modulate cutoff, resonance, pan, and amplitude. The closed acoustic feedback loop of the v3 plinth has been retired: the loop now closes through the player's ears and hands, not through the lamp.

AuthorSamuel D. Leventhal

Versionv4.0 · MPE Edition

PlatformmacOS · M-series

Voice Pool16 slots

StackOpenCV · GUDHI · SuperCollider · pygame

Scores and orchestrations with a MPE lava lamp polysynth with midi cc from blob topology/geometry and stochastic changes in a dynamic system.

Media

Short captures from the v4.0 development rig — single-camera capture, voice-slot overlay, and the keyboard controller wired through the pygame input window.

Tracking, Masking, Countours, Merge Tree, and MPE Tool

Voice-slot pipeline in motion

Blob-driven modulation, keyboard-played notes

Methodology

Six stages run as a continuous loop at roughly 15 fps. Stages 1–4 are the v3.2 tracking pipeline. Stage 5 is new in v4.0: a voice manager that decouples synth voices from blob identities, plus a controller layer that funnels player input through the same architecture.

Capture & Calibration

USB camera at 1280×960, 15 fps. A one-time freehand-drawing calibration GUI lets the user paint over typical wax blobs at different temperatures and the lamp's vertical range. From that single pass the system derives HSV wax-colour bounds, MOG2 ROI, lamp top/bottom, and the persistence threshold used downstream. Calibration is stored as lamp_profile.json.

Topological Detection

Every frame. MOG2 background subtraction (with the v3.2 post-warmup learning rate frozen at 1e-4) intersected with the wax-colour HSV mask, morphologically cleaned. The downscaled mask is fed to GUDHI's CubicalComplex for H0 persistence; features above pers_thresh are retained. Each surviving blob is enriched with contour moments (orientation, elongation, Hu invariants) and a wrap-safe circular-mean hue.

Flow Advection & Matching

Every frame. Farneback dense optical flow between previous and current greyscale frames. Every track's binary mask is inverse-warp remapped forward by the flow. IoU between advected masks and detections gives primary matches; ambiguous overlaps are resolved by Wasserstein-inspired distance in the (birth, death) persistence plane. Unmatched detections are checked against the fingerprint buffer for re-identification. Coasting tracks survive up to MAX_COAST = 20 frames (~1.3 s).

Physics Kalman Update

Every frame. Each matched track's 5-state Kalman filter [x, y, vx, vy, ay] is corrected with the detection measurement. Measurement noise adapts based on y-position: near the lamp top/bottom it is high (trust physics, expect turning); mid-column it is low (trust detection). The ay state mean-reverts toward zero and is initialised with a directional prior from thermodynamic direction.

Binding & Voice Management v4.0

Every frame. The BindingManager translates tracker birth/death/merge/split events into VoiceManager lifecycle calls. Blob birth either ties to a pre-armed empty voice, resurrects a zombie, or spawns a fresh BLOB_ONLY slot. Blob death zombifies or detaches depending on whether the player has played the voice recently. Meanwhile the Controller layer polls keyboard events and emits ControllerEvents: SELECT_SLOT, NOTE_ON, SET_PARAM, CREATE_EMPTY, CLAIM_HYBRID, TIE_BLOB, DETACH. Each voice runs its routing matrix — (source → destination, depth, curve) tuples mapping blob CV features onto synth parameters.

Synthesis & Output

SuperCollider's scsynth runs one \mpeVoice instance per active slot — a wavetable oscillator (4 banks × 4 wavetables) through a state-variable filter (LP/BP/HP crossfade) with full ADSR, addressed by per-voice OSC namespace /voice/<slot>/. \blobAngle (orientation-panned sine pad) and \blobStrike (Klank transient on birth) remain available as auxiliary voices. All output routes through a stereo bus to the audio interface.

Pipeline Architecture

v4.0 adds two identity layers above v3.2's flow-advected tracking: the voice slot, which holds full synth state independently of any blob, and the controller event, which lets keyboard and (Phase 2) MIDI inputs share the same downstream as blob CV. The result is three nested time-scales of identity: pixel-level blob mask (frame), voice slot (seconds to minutes), and controller binding (a player's set).

USB Camera (1280×960 @ 15 fps) │ ▼ blob_detection.py [BlobDetector · stateful, owns MOG2] MOG2 (history=500, varThreshold=16) ← learning_rate=1e-4 post-warmup ∩ Wax colour HSV mask (from lamp_profile) → Morphological clean (7×7 close, 5×5 open) → GUDHI CubicalComplex H0 persistence (downscale=4) → Filter by pers_thresh + branch deduplication → Contour moments (orientation, elongation, hu[7]) → Circular mean hue (wrap-safe) → Direction: rising | falling | turning │ ▼ [PersistentBlob] sorted by persistence desc blob_tracker.py [BlobTracker] A. Farneback dense flow B. Advect all track masks (inverse-warp) C. IoU matching (threshold=0.25) D. Wasserstein resolution (topo 0.75, pos 0.15, hue 0.10) E. Mask refinement every 5 frames (50/50 blend) F. Physics Kalman correct [x, y, vx, vy, ay] G. Coast / retire ← max_coast=20 (v3.2) H. Fingerprint re-ID ← hue weight /60 (v3.2) │ ▼ [TrackedBlob] + events: birth | death | merge | split binding_manager.py [BindingManager] ← v4.0 on_birth → tie pre-armed voice / resurrect zombie / spawn IDLE on_death → zombify | detach (based on recent-play window) on_merge / on_split → delegate to v3.x VoiceManager handlers │ ▼ slot allocation calls voice_manager.py [VoiceManager · 16-slot pool] ← v4.0 Slot states: IDLE · BLOB_ONLY · KEYBOARD_ONLY · HYBRID · ZOMBIE Each slot: SynthState (base + smoothed) + routing[] + sc_node ▲ │ │ ControllerEvent │ │ ▼ controller.py [Controller protocol] ← v4.0 /voice/<n>/... KeyboardController (pygame, Phase 1) │ MIDIController (mido, Phase 2 stub) ▼ → NOTE_ON / NOTE_OFF SuperCollider scsynth → SELECT_SLOT \mpeVoice (wavetable + SVF + ADSR) → SET_PARAM (pitch/wave/env/filt) \blobAngle (optional) → CREATE_EMPTY / TIE_BLOB / DETACH \blobStrike (transient on birth) → CLAIM_HYBRID │ → LEARN_ARM (Phase 2) ▼ ┌────────────────────────────────────────┐ Focusrite Scarlett → Speakers │ ZOMBIE POOL (within VoiceManager) │ │ BLOB_ONLY ──death──► ZOMBIE │ │ ZOMBIE ──resurrect──► BLOB_ONLY │ │ 4s window · soft-sustain amp scale │ └────────────────────────────────────────┘

Temporal coherence — four layers

A blob typically takes 90–180 seconds to rise from base to top. Identity must persist across that timescale through detection noise, transparency dips, and turning-point storms. Four mechanisms accumulate:

(1) Flow-advected mask at the tracker — the mask is always present, advected by optical flow even when the detector loses the blob. (2) Fingerprint re-identification — lost tracks are recoverable for 15 seconds by topological fingerprint. (3) Zombie voice pool — sonic identity persists for a further 4 seconds beyond the tracker gap, with soft-sustain amplitude scaling and centroid + persistence-ratio resurrection matching. (4) Voice slot lifetime — the slot itself, independent of any blob, can be authored, edited, and tied to a future blob by the player. Identity exists above the lamp.

Player Controls

v4.0 runs two OS windows simultaneously: an OpenCV video window with overlay graphics, and a pygame window that captures keyboard events. The pygame window must hold keyboard focus for music keys to work; the OpenCV window only receives toggle keys. The keyboard has two coexisting modes — PERFORMANCE (play notes on the selected slot) and EDIT (scroll the synth parameter tree). TAB toggles between them.

Slot selection (both modes)

The number row maps to the first ten of the sixteen voice slots. Selection happens immediately on key press; the selected slot gets a thicker ring in the video overlay and a row highlight in the side HUD. Slots 10–15 are reachable today via an external MIDI controller (Phase 2) or programmatically.

1234567890 → S0 S1 S2 S3 S4 S5 S6 S7 S8 S9

Performance mode

Plays MIDI notes on the currently selected slot. KEYBOARD_ONLY and HYBRID slots respond to notes; BLOB_ONLY slots ignore notes (use B in EDIT mode to claim them as HYBRID first). The lower QWERTY row covers one chromatic octave (Z = C4 by default); the upper row covers the next octave up. [ / ] shift octave. SPACE is all-notes-off (panic).

Z X C V B N M , .	Lower octave white keys (C, D, E, F, G, A, B, C↑, D↑)
S D G H J L	Lower octave sharps (C♯, D♯, F♯, G♯, A♯, C♯↑)
Q W E R T Y U I O P	Upper octave white keys

Edit mode — parameter families & binding

QWERTY keys select which parameter family is exposed for editing on the selected slot. Arrow keys then walk through individual parameters and adjust them; - / = are coarse (10×) versions of ↓ / ↑.

Key	Action
P	Pitch family — `base_note`, `bend`, `scale_id`, `chord_id`
W	Wave family — `wave_pos`, `wave_bank` (4 banks × 4 wavetables)
E	Envelope family — `attack`, `decay`, `sustain`, `release`
F	Filter family — `cutoff`, `res`, `fmode` (0=LP, 0.5=BP, 1=HP)
M	Mod family — walk the slot's routing list with arrow keys, edit depth
B	CLAIM_HYBRID — promote `BLOB_ONLY` → `HYBRID` (player can now play notes)
N	CREATE_EMPTY — next `IDLE` slot becomes `KEYBOARD_ONLY`
T	TIE_BLOB — bind selected `KEYBOARD_ONLY` slot to next blob to be born
D	DETACH — break blob bind; voice continues as `KEYBOARD_ONLY`
R	Reset selected param to its default base value
C	Clear all routings on the selected slot
TAB	Return to PERFORMANCE mode (edit state is saved on the slot)

Voice-state ring legend

In the video overlay, every tracked blob bound to a slot gets a coloured ring whose colour encodes the slot's lifecycle state. The currently selected slot's ring is drawn thicker (3 px vs 2 px) and slightly larger.

BLOB_ONLY — driven entirely by blob CV

KEYBOARD_ONLY — player edits & plays

HYBRID — player notes + blob modulation

ZOMBIE — soft-sustain, ~4 s resurrection window

OpenCV video-window keys

These keys work only when the OpenCV video window has focus. They toggle overlays and adjust the persistence threshold for blob detection — useful during calibration. Music keys do not work here.

Key	Toggle / action
q	Quit application
v	Voice slot HUD on/off (right-side panel)
t	Tracking overlay (rings, IDs, info line)
o	Blob contour outlines
m	Merge tree overlay
f	Optical flow vector field
c	Wax-colour and combined mask windows
[ / ]	Persistence threshold −/+ 2.0 (min 2.0, max 80.0)

Version History

Five architectural iterations, each addressing concrete failure modes observed in the previous version. The progression goes from generic multi-object tracking toward domain-specific methods that exploit the physics of a lava lamp — and, with v4.0, from autonomous translator to performable instrument.

v1.0Initial Implementation

OpenCV SimpleBlobDetector with HSV colour mask. Naive distance-threshold Hungarian assignment. Constant-velocity 4-state Kalman. Hardcoded MIN_BLOB_AREA. OSC bridge to SuperCollider with pitch / amplitude / pan mappings. No identity fingerprinting; spurious birth / death on every transparency fluctuation. No warm-up gate — hundreds of ghost IDs spawned during MOG2 stabilisation.

v2.0Robust Four-Layer Tracking

SimpleBlobDetector replaced by GUDHI CubicalComplex H0 persistence — a topological significance filter in place of an area threshold. MOG2 became the primary foreground filter. Contour moments (orientation, elongation, Hu invariants) added to PersistentBlob. Kalman + Hungarian stack. Coast-and-retire with MAX_COAST=8. Three SuperCollider SynthDefs. MOG2 warm-up gate eliminated ghost IDs during model build.

v3.0 / v3.1Physics-Grounded, Flow-Advected

Optical flow mask advection replaced per-frame assignment as the primary identity carrier — identity is the mask, solved once at birth, not every frame. Wasserstein-inspired topological matching in the birth-death plane (75% weight) resolves ambiguous overlaps. A 5-state physics Kalman [x, y, vx, vy, ay] with buoyancy mean-reversion and directional ay priors handles turning-point dynamics. Birth topology fingerprinting enables re-identification after transparency gaps. Circular mean hue classified into rising / falling / turning direction drives attack / release envelopes and oscillator detuning.

v3.2Stability Against Turning-Point Storms

Four targeted changes to eliminate the cascade of phantom births, deaths, and release-tail stacking that occurred when wax blobs reversed direction: (1) MAX_COAST 8 → 20 frames — tracks survive longer detection gaps, ~1.3 s at 15 fps. (2) MOG2 learning rate throttled to 1e-4 after warm-up, effectively freezing the background model. (3) Hue down-weighted in the fingerprint distance (divisor 25 → 60), so a full thermal reversal no longer blows past the match threshold on its own. (4) Zombie voice slot-pool: when a tracker ID disappears, its voice is held at soft sustain for up to 4 s; the next nearby blob with compatible persistence resurrects it in place.

v4.0MPE Edition — Voice Slots & Player Agencycurrent

v4.0 reframes the relationship between blob and voice. Through v3.2 the system was a one-way translator — blobs produced sound. v4.0 turns the voice into a first-class object that the blob is one of several possible drivers of; the other driver is the human at a computer keyboard.

(1) Voice-slot abstraction. A fixed pool of 16 voice slots replaces the implicit blob_id → voice mapping. Each slot carries its own synth state and a lifecycle state of IDLE, BLOB_ONLY, KEYBOARD_ONLY, HYBRID, or ZOMBIE. The v3.2 zombie pool is preserved as a sub-state of BLOB_ONLY. (2) MPE-friendly SynthDef. The v3.2 \blobVoice is replaced by \mpeVoice — a wavetable oscillator (VOsc across 16 contiguous buffers), state-variable filter with continuous LP↔BP↔HP crossfade, ADSR, and per-voice OSC namespace. \blobAngle and \blobStrike remain as optional auxiliary voices. (3) Computer-keyboard MPE controller. A pygame input layer turns the laptop keyboard into a 16-voice MPE surface with Performance and Edit modes. (4) Bidirectional blob ↔ voice binding. The BindingManager makes blob-driven and player-driven voice creation symmetric: blobs can spawn voices, the player can pre-arm empty voices and tie them to the next-born blob, and any voice can be in HYBRID. (5) Routing matrix. Each voice carries an explicit (source → destination, depth, curve) routing list. Default routings reproduce v3.2 behaviour; the player rewrites them in real time. (6) Controller abstraction. All input is funnelled through a Controller protocol that emits ControllerEvents. Phase 2's external MIDI surface adds a single new adapter — nothing downstream changes.

Removed in v4.0: the plinth, ESP32 feedback speaker, three-camera system, and the closed-loop acoustic feedback concept. The musical interest now comes from the player–blob duet, not from the lamp's sound modulating its own behaviour.

Technical Deep Dives

Persistent homology as a significance filter

Rather than detecting blobs by thresholding intensity or filtering by area — both of which conflate size with significance — persistence measures how prominent a blob is relative to its local surroundings across a continuous range of intensity thresholds. The H0 persistence diagram treats the greyscale image as a scalar function: as a threshold sweeps from high to low intensity, connected components 'born' at blob peaks 'die' when they merge with a brighter neighbour. A blob's persistence is the intensity gap between its peak and the saddle point where it merges. The input is inverted (255.0 - small) so OpenCV's 'bright = foreground' becomes 'low value = foreground' matching GUDHI's sublevel-set sweep; downscaling to ~320×240 keeps CubicalComplex under ~40 ms on an M-series Mac. A sublevel-set branch deduplication pass removes same-branch sub-features using the elder rule.

Flow-advected mask identity

Farneback dense optical flow estimates a displacement vector at every pixel between consecutive frames. Each track owns a binary mask initialised at birth from the blob's detection contour. Every frame the mask is remapped forward using the inverse-warp of the flow field. Every REFINE_INTERVAL = 5 frames the mask is 50/50 blended with the current detection contour, counteracting accumulated flow drift. A blob that briefly becomes semi-transparent and disappears from the persistence diagram does not fire a death event — the mask continues to exist and be advected forward.

Wasserstein-inspired topological matching

When multiple advected masks overlap a single detection, the system resolves the ambiguity using distance in the persistence diagram's birth–death plane rather than pixel space: cost = 0.75·topo_dist + 0.15·pixel_dist + 0.10·hue_cost. This is not full Wasserstein optimal transport — rather, a Wasserstein-inspired cost that prioritises topological feature space over ambient pixel space. Two blobs at similar positions but different thermal states are distinguished correctly where naïve position-based Hungarian would assign them arbitrarily.

Physics-informed buoyancy Kalman

The constant-velocity Kalman model assumes blobs move at fixed velocity between frames. Lava lamp blobs do not: they decelerate to near zero at the top and bottom as buoyancy force reverses, and accelerate through the middle column. The v3.x Kalman adds a fifth state variable ay (vertical acceleration), governed by mean-reverting dynamics and initialised with a directional prior (+0.15 rising, −0.15 falling, 0 turning).

Measurement noise adapts based on y-position. Within 12% of the lamp top/bottom, noise is 2.0 — the filter trusts its physics model more than the noisy detection. In mid-column it is 0.05 — detection dominates. This produces accurate predictions at the physically most important moments: turning points where identity ambiguity is highest.

The zombie voice pool

The zombie pool addresses the failure mode where a turning blob fires death → birth in rapid succession and the release tail of the dead voice stacks against the attack of the new voice — producing the ~9-second phantom polyphony that plagued v3.0 turning events.

Lifecycle. On on_blob_death, a BLOB_ONLY voice's slot transitions to ZOMBIE. The SC node stays alive (gate=1) but its amp is multiplied by 0.35, audibly softening without triggering release. Before spawning a fresh voice for any new blob, the BindingManager asks the VoiceManager to scan zombies for a match within ~15% of lamp height whose persistence is within a factor of 3×. The closest match wins: its slot transitions back to BLOB_ONLY, morphing seamlessly toward the new target values. Zombies older than zombie_window = 4 s release normally.

v4.0 integration. The zombie pool is now a sub-state of the slot lifecycle rather than a side-table. A HYBRID voice whose player has been active within blob_recent_play_window = 5 s detaches to KEYBOARD_ONLY instead of zombifying — the player keeps the voice; the blob bind drops.

Voice-slot architecture v4.0

v4.0's central new abstraction. A Voice is no longer implicitly a 1:1 image of a blob — it is a slot in a fixed pool, holding full synth state, a routing matrix, and a lifecycle state that is independent of any given blob's existence. The state determines what events the slot accepts:

IDLE accepts only spawn_for_blob or create_empty. BLOB_ONLY runs the routing matrix and forwards no keyboard input. KEYBOARD_ONLY accepts notes and edits but ignores blob updates. HYBRID accepts both — the player's notes set base values, the routing matrix adds modulation on top. ZOMBIE is a transient state with a timer. Each slot's SynthState tracks both editable base values and smoothed current/target values for every continuous output parameter; first-order lag on the SC side ensures parameter changes never click.

MPE routing matrix v4.0

Every Voice carries a list of Routing(src, dst, depth, curve) tuples. Sources read blob CV features each frame — y_norm, x_norm, vy, persistence, area, mean_hue, elongation, orientation, age. Destinations write directly to synth parameters — pitch, cutoff, res, amp, pan, wave_pos, attack, release. Curves are pre-tabulated lookup arrays (LIN, LOG, EXP, SCURVE, SQRT) — cheap to evaluate and trivial for the user to extend.

Default routings on BLOB_ONLY voices reproduce v3.2 behaviour: y_norm → pitch (depth +24 semitones, LIN), persistence → amp (SQRT), vy → wave_pos and area → wave_pos (kinematic + geometric), elongation → res (depth +0.6), orientation → pan (depth +1.0), plus an EventRoute firing direction_change → attack/release preset. The mean_hue → wave_pos mapping was removed in v3.2 because camera-firmware variations made blob hue an unreliable continuous source; Source.MEAN_HUE remains available for explicit wiring.

The \mpeVoice SynthDef v4.0

One \mpeVoice instance runs per active voice slot. The oscillator is VOsc reading across 16 contiguous wavetable buffers (4 banks × 4 wavetables, allocated with Buffer.allocConsecutive): bank 0 = analogue shapes (saw / square / pulse / triangle), bank 1 = harmonic-series sweeps, bank 2 = formants (ah / eh / ih / oh), bank 3 = metallic / bell partials. The filter section uses parallel RLPF / BPF / RHPF crossfaded via SelectX.ar on the fmode parameter, giving a continuous LP↔BP↔HP morph. Full ADSR envelope drives amplitude; .lag smoothing on every continuous parameter (50 ms typical, 100 ms on amp, 1 s on attack/release) matches Python's first-order smoothing so cross-frame changes never click.

Controller abstraction v4.0

All player input is funnelled through a Controller protocol that emits ControllerEvent dataclasses. The VoiceManager and BindingManager only see ControllerEvents; they do not know whether the source is a keyboard, an external MIDI controller, an OSC stream from a tablet, or a future voice-to-MIDI bridge. The Phase 1 KeyboardController and Phase 2 MIDIController are interchangeable adapters.

Why this is worth the small upfront cost: the temptation in Phase 1 is to wire pygame keys directly to VoiceManager methods. That works, but Phase 2 then becomes a refactor instead of an add. A 60-line Controller protocol now makes Phase 2 a ~100-line MIDIController plus a small JSON learn-map file. The same protocol can later host a Push 3 adapter, a TouchOSC layout, or a tablet UI without disturbing anything downstream of it.

Build

Phase 1 — Core (single camera + keyboard)

Component	Purpose	Est. (GBP)
MacBook Pro (M-series)	Central compute — vision, GUDHI, SuperCollider, keyboard input	existing or £1,099+
Logitech Brio 4K or C922 Pro	Primary camera (calibrated at 1280×960)	£75–95
Articulated camera arm / desk clamp	Position camera 30–40 cm from lamp	£8–12
Focusrite Scarlett Solo (3rd gen)	Audio interface — stereo SC output	£110–130
Mathmos Astro lava lamp (500 ml)	Primary installation object	£35–50
TRS 3.5 mm to XLR cables (pair)	Scarlett outputs to powered monitors	£8–12
Powered studio monitors (e.g. Yamaha HS5)	Main audio output	£120–300

Phase 1 — Optional

Component	Purpose	Est. (GBP)
Behringer DeepMind 6 Module (used)	Optional analogue voice for one or two slots; driven via Scarlett MIDI out	£130–165
DIN-5 MIDI cable 1 m	Required only if the DeepMind 6 is in the rig	£5–8

Phase 2 — External MIDI controller

Phase 2 swaps the laptop keyboard for an assignable knob/button MIDI surface. The rest of the system does not change — the MIDIController adapter sits behind the same Controller protocol, with a small JSON learn-map persisted to disk.

Candidate	Layout	Approx.
MIDI Fighter Twister	16 endless RGB push-encoders × 4 banks (64 logical) — best overall match; RGB feedback for state recall	£200
Faderfox EC4	16 endless encoders + OLED labels, 4 setups — most engineered controller in this class	£400
Faderfox PC12	12 endless encoders + buttons, 16 setups, OLED — cheaper EC4 alternative	£280
Novation Launch Control XL	24 absolute knobs + 8 faders + 16 buttons + 8 templates — voice-strip mixer feel	£170
Akai MIDImix	24 absolute knobs + 9 faders + 16 buttons — cheapest functional option, no knob LEDs	£100

Software stack

Python 3.11.9 pinned for macOS compatibility with GUDHI and current OpenCV wheels. Core dependencies (requirements.txt): opencv-python ≥ 4.8, numpy ≥ 1.24, scipy ≥ 1.10, python-osc ≥ 1.8, gudhi ≥ 3.7. v4.0 (Phase 1) adds pygame ≥ 2.5 for the keyboard controller; v4.0 (Phase 2 prep) adds mido ≥ 1.3 and python-rtmidi ≥ 1.5 for the MIDI controller stub. SuperCollider 3.13+ for the synthesis server.

Timeline & Milestones

Solo developer, 10–15 hours per week. Phase 1 is a complete, performable instrument on its own. Phase 2 is an additive upgrade — the system runs the entire time, gaining a new input source. The timeline assumes v3.2 tracking is already working.

Week 2

Slot abstraction landed — VoiceManager refactored to 16-slot pool with Voice dataclass; v3.2 zombie behaviour preserved as BLOB_ONLY sub-state.

Week 4

SC layer ready — \mpeVoice SynthDef in SuperCollider (wavetable osc, SVF, ADSR), /voice/<n>/ namespace, manual sanity tests passing.

Week 6

Blob CC modulating SC voices — pygame KeyboardController + routing matrix runtime; default routings reproducing v3.2 behaviour.

Week 7

Phase 1 complete — BindingManager bridges tracker events into slot lifecycle; CREATE_EMPTY / TIE_BLOB / DETACH / CLAIM_HYBRID functional end-to-end.

Week 10

External controller online — MIDIController adapter active; MIDI Learn mode persists bindings to learn_map.json.

Week 12

Performable instrument — live-set workflow tuned: bank/page conventions, slot-select ergonomics, recall.

Guides & Reference

Full build documentation and the keyboard reference card, hosted as PDFs.

PDF · v4.0 (MPE) Complete Build Guide Methodology, pipeline architecture, six-stage walk-through, parts list, software install, twelve-week timeline, full technical deep dives, and appendices. The canonical reference for v4.0. PDF · v4.0 Keyboard Reference Card One-page-per-side player reference: Performance / Edit modes, slot select, note layout, parameter families, binding commands, OpenCV-window toggles, and four common workflows. PDF · v3.0 (archive) v3.0 Build Guide Original physics-grounded, flow-advected tracking guide — flow advection, Wasserstein matching, physics Kalman, birth fingerprinting, calibration GUI. Preserved for reference; superseded by v4.0. PDF · v2.x (archive) Audible Lava — Phase I & II Earlier macOS edition covering the original three-camera + ESP32 + plinth concept. Useful historical context; the v4.0 system has retired the closed acoustic feedback loop.

Novelty

To the best of the author's knowledge, this is the first work to combine: (1) a physical lava lamp as live input, (2) H0 persistent homology for topological blob significance filtering, (3) flow-advected mask identity with Wasserstein-inspired topological matching, (4) physics-informed buoyancy Kalman with directional priors, (5) birth topology fingerprinting for re-identification, (6) thermodynamic hue as a phase indicator driving musical parameters, (7) a zombie voice pool preserving sonic identity across detection gaps, and (8) a bidirectional voice-slot architecture in which blob CV features and player input share the same MPE-style polysynthesiser as continuous-modulation and discrete-event sources respectively.