Usage

Quickstart (simulation mode)

To test the alignment offline, pass a score file and a performance file. No extra dependencies are needed beyond the base install.

from matchmaker import Matchmaker

mm = Matchmaker(
    score_file="path/to/score.musicxml",
    performance_file="path/to/performance.wav",
    input_type="audio",
)
for current_position in mm.run():
    print(current_position)  # beat position in the score

Each yielded value is the current position in the score, expressed in beats as defined by the partitura library’s note array. A position is computed for every input frame and interpolated within the score’s onset_beat array. See the partitura documentation for more information about the onset_beat concept.
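To make the interpolation idea concrete, here is a minimal standard-library sketch. It assumes the follower's state can be expressed as a (possibly fractional) index into the sorted unique onsets; the helper name `index_to_beat` and the sample onsets are illustrative and not part of matchmaker's API.

```python
import math

# Illustrative unique score onsets in beats (the kind of values onset_beat holds).
score_positions = [0.0, 0.25, 0.5, 0.75, 1.0, 2.0]

def index_to_beat(fractional_index, positions):
    """Linearly interpolate a (possibly fractional) onset index into a beat position."""
    last = len(positions) - 1
    idx = min(max(fractional_index, 0), last)   # clamp to the valid index range
    lo = int(math.floor(idx))
    hi = min(lo + 1, last)
    frac = idx - lo
    return positions[lo] + frac * (positions[hi] - positions[lo])

beat_at_onset = index_to_beat(2, score_positions)    # exactly on the third onset
beat_between = index_to_beat(4.5, score_positions)   # halfway between beats 1.0 and 2.0
print(beat_at_onset, beat_between)
```

An integer index returns the onset itself, while a fractional index falls between two neighbouring onsets.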

Live streaming (requires [devices])

To run with live audio or MIDI input, install the devices extra with pip install pymatchmaker[devices].

from matchmaker import Matchmaker

mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="audio",
    device_name_or_index="MacBookPro Microphone",
)
for current_position in mm.run():
    print(current_position)

If no device is specified, the system default is used.

from matchmaker import Matchmaker

# Audio input (system default device)
mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="audio",
)

# MIDI input (system default device)
mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="midi",
)

Streaming from a non-device source (BytesAudioStream / BytesMidiStream)

For input that does not come from a local audio or MIDI device (a WebSocket handler forwarding browser data, a subprocess, an IPC pipe, etc.), use the built-in BytesAudioStream and BytesMidiStream classes. Both pull raw byte chunks from a queue.Queue that you control and feed them through the same processor pipeline as the device-backed streams. Neither requires pyaudio or python-rtmidi to be installed.

Audio. The producer pushes raw float32 PCM bytes (one hop_length chunk per item), followed by None to end the stream:

import queue
from matchmaker import Matchmaker
from matchmaker.io.audio import BytesAudioStream
from matchmaker.features.audio import ChromagramProcessor

data_queue = queue.Queue()

# In a producer thread (e.g. WebSocket handler):
#     data_queue.put(pcm_chunk_bytes)   # float32 PCM, hop_length samples
#     ...
#     data_queue.put(None)              # end of stream

stream = BytesAudioStream(
    processor=ChromagramProcessor(sample_rate=22050, hop_length=441),
    sample_rate=22050,
    hop_length=441,
    data_queue=data_queue,
)
mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="audio",
    stream=stream,
)
for current_position in mm.run():
    print(current_position)
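The producer side described above can be sketched with the standard library alone. The helper name `push_pcm_chunks` is hypothetical, and zero-padding a short final chunk is an assumption of this sketch; check how your source and BytesAudioStream should treat a partial last chunk before relying on it.

```python
import queue
from array import array

HOP_LENGTH = 441  # must match the hop_length given to BytesAudioStream

def push_pcm_chunks(samples, data_queue, hop_length=HOP_LENGTH):
    """Split float samples into hop_length-sized float32 byte chunks and enqueue them."""
    n_chunks = 0
    for start in range(0, len(samples), hop_length):
        # "f" is a 32-bit float in native byte order.
        chunk = array("f", samples[start:start + hop_length])
        # Assumption: zero-pad a short final chunk so every item has hop_length samples.
        chunk.extend([0.0] * (hop_length - len(chunk)))
        data_queue.put(chunk.tobytes())
        n_chunks += 1
    data_queue.put(None)  # end-of-stream marker
    return n_chunks

data_queue = queue.Queue()
samples = [0.0] * 1000  # stand-in for real audio samples
n_chunks = push_pcm_chunks(samples, data_queue)
first_chunk = data_queue.get()
```

In a real application this function would run in the producer thread while mm.run() consumes the queue in another.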

MIDI. The producer pushes raw MIDI bytes (e.g. 3 bytes per note_on / note_off, exactly what the Web MIDI API gives you):

import queue
from matchmaker import Matchmaker
from matchmaker.io.midi import BytesMidiStream
from matchmaker.features.midi import PitchProcessor

data_queue = queue.Queue()

# In a producer thread:
#     data_queue.put(midi_bytes)   # e.g. bytes([0x90, 60, 100])
#     ...
#     data_queue.put(None)

stream = BytesMidiStream(processor=PitchProcessor(), data_queue=data_queue)
mm = Matchmaker(
    score_file="path/to/score.musicxml",
    input_type="midi",
    stream=stream,
)
for current_position in mm.run():
    print(current_position)

The browser side just forwards what the Web MIDI API hands it:

const midiAccess = await navigator.requestMIDIAccess();
midiAccess.inputs.forEach((input) => {
  input.onmidimessage = (event) => {
    // event.data is a Uint8Array (typically 3 bytes for note_on / note_off)
    ws.send(event.data);   // forward as a binary WebSocket frame
  };
});

The Python WebSocket handler reads the binary frame and calls data_queue.put(message_bytes). No JSON / dict / base64 conversion is needed at any layer.
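A minimal sketch of such a handler follows. It is written against any async-iterable connection object that yields incoming frames (connection objects in the `websockets` library can be iterated this way). The names `forward_midi_frames` and `fake_connection` are hypothetical, and ignoring text frames is an assumption of the sketch.

```python
import asyncio
import queue

async def forward_midi_frames(connection, data_queue):
    """Forward every binary frame from the connection into the stream's queue."""
    async for message in connection:
        if isinstance(message, (bytes, bytearray)):
            data_queue.put(bytes(message))  # raw MIDI bytes, no decoding needed
        # Text frames are ignored in this sketch.
    data_queue.put(None)  # connection closed: signal end of stream

async def fake_connection():
    """Stand-in for a WebSocket connection, yielding two MIDI messages."""
    yield bytes([0x90, 60, 100])  # note_on, middle C, velocity 100
    yield bytes([0x80, 60, 0])    # note_off, middle C

data_queue = queue.Queue()
asyncio.run(forward_midi_frames(fake_connection(), data_queue))
forwarded = [data_queue.get() for _ in range(3)]
```

With a real server, the same coroutine would be passed the connection object instead of `fake_connection()`.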

Running Examples

The repository includes a ready-to-use example script that demonstrates the complete workflow:

# Run with an input type (uses the default method for that input type)
python run_examples.py --audio

# Run with specific method
python run_examples.py --midi --method hmm

This script runs a complete example with score following and evaluation, saving results to the results/ directory.

Testing with Different Methods or Features

You can specify the alignment method and feature processor as follows:

from matchmaker import Matchmaker

mm = Matchmaker(
    score_file="path/to/score",
    input_type="audio",
    method="arzt",       # see Alignment Methods section
    processor="chroma",  # see Features section
)
for current_position in mm.run():
    print(current_position)

For options regarding the method, please refer to the Alignment Methods section. For options regarding the processor, please refer to the Features section.

Package Overview

Matchmaker has the following pipeline:

   input source              Stream              Processor              OnlineAlignment
   (audio/MIDI                                   (chroma,               (e.g.,
    file or live)      ─►   AudioStream    ─►    pitch_chord,    ─►    PitchHMM,        ─►   alignment_path
                            MidiStream           ...)                   OLTWArzt, ...)        (2, T) array

Component signatures

  • Stream (AudioStream, MidiStream) reads from a file or live device, hands each frame to its Processor, and pushes the result to a RECVQueue, followed by a STREAM_END sentinel when the source is exhausted.

  • Processor (e.g., ChromagramProcessor, PitchChordProcessor) takes a (data, frame_time) tuple and returns either a (features, perf_time) tuple or None while buffering. data is np.ndarray for audio or List[(mido.Message, m_time)] for MIDI; perf_time is the timestamp the feature corresponds to (most processors pass frame_time through; chord-buffering MIDI processors emit the chord onset).

  • OnlineAlignment (the score follower base class; e.g., OnlineTimeWarpingArzt, PitchIOIHMM) consumes (features, perf_time) observations from the queue (or directly via __call__), updates its score position per step, and yields the current beat. On stream end it returns the final alignment_path — a (2, T) np.ndarray of (score_beat, perf_time) pairs.
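The Processor contract (return a (features, perf_time) tuple, or None while buffering) can be illustrated with a toy class. `ToyChordBuffer` is hypothetical and is not one of matchmaker's processors: it takes plain (pitch, note_time) pairs instead of mido messages, and the 50 ms grouping window is an arbitrary assumption.

```python
class ToyChordBuffer:
    """Groups near-simultaneous notes and emits each chord with its onset time."""

    def __init__(self, window=0.05):
        self.window = window  # seconds; notes closer than this form one chord
        self.pitches = []     # pitches of the chord being collected
        self.onset = None     # onset time of that chord

    def __call__(self, frame):
        data, frame_time = frame  # data: list of (pitch, note_time) pairs
        emitted = None
        for pitch, note_time in data:
            if self.onset is not None and note_time - self.onset > self.window:
                # A later note closes the buffered chord: emit it with its onset.
                # (Toy simplification: at most one chord is emitted per call.)
                emitted = (sorted(self.pitches), self.onset)
                self.pitches, self.onset = [], None
            if self.onset is None:
                self.onset = note_time
            self.pitches.append(pitch)
        return emitted  # None while still buffering

processor = ToyChordBuffer()
first = processor(([(60, 0.00), (64, 0.01)], 0.01))  # chord still open
second = processor(([(67, 0.02)], 0.02))             # same chord, still open
third = processor(([(72, 0.50)], 0.50))              # new onset closes the chord
```

Note that the emitted perf_time is the chord onset (0.0), not the frame_time of the call that released it (0.5), which mirrors the behaviour described for chord-buffering MIDI processors.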

STREAM_END is a module-level sentinel (not a tuple); OnlineAlignment.run() checks for it and exits the read loop.
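The sentinel pattern can be sketched as follows. Everything here is a stand-in: the `STREAM_END` object, the `follow` function, and the rule of advancing one beat per observation are hypothetical and only illustrate the control flow, not matchmaker's actual implementation.

```python
import queue

STREAM_END = object()  # stand-in for a module-level sentinel, compared by identity

def follow(recv_queue):
    """Consume observations until the sentinel; return [score_beats, perf_times]."""
    score_beats, perf_times = [], []
    while True:
        item = recv_queue.get()
        if item is STREAM_END:  # identity check, so no feature value can collide
            break
        features, perf_time = item
        # Toy "alignment": pretend each observation advances the score by one beat.
        score_beats.append(float(len(score_beats)))
        perf_times.append(perf_time)
    return [score_beats, perf_times]  # two rows of length T, like a (2, T) array

recv_queue = queue.Queue()
for t in (0.0, 0.5, 1.1):
    recv_queue.put(([1.0, 0.0], t))  # (features, perf_time)
recv_queue.put(STREAM_END)
alignment_path = follow(recv_queue)
```

Comparing with `is` rather than `==` keeps the check unambiguous even when the queued features are arrays.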

Score representation

The example score matchmaker/assets/simple_mozart_k265_var1.musicxml is used in tests and the contribution guide.

[Figure: the first two measures of the example score]

Beat positions follow the onset_beat field of partitura’s note_array(), whose unit is the beat given by the time signature’s denominator (the quarter note for this 2/4 piece). Notes start at beats 0.00, 0.25, 0.50, 0.75, 1.00, and so on.

import numpy as np
import partitura as pt

score = pt.load_score("matchmaker/assets/simple_mozart_k265_var1.musicxml")
note_array = score[0].note_array()
score_positions = np.unique(note_array["onset_beat"])
# array([0.  , 0.25, 0.5 , 0.75, 1.  , ..., 13.25, 13.5 ])  shape (54,)

If a score follower reaches the third unique onset:

follower.current_index    # 2
follower.current_position # 0.5  (= score_positions[2])

Alignment Methods

Audio (input_type="audio")

Default method: "arzt"

Method        Description
"arzt"        On-line time warping adapted from Brazier and Widmer (2020)
"dixon"       On-line time warping by Dixon (2005)
"outerhmm"    Outer-product HMM score follower by Nakamura (2014)
"skf"         Switching Kalman Filter with hidden tempo by Jiang and Raphael (2020)

MIDI (input_type="midi")

Default method: "pthmm"

Method        Description
"arzt"        On-line time warping adapted from Brazier and Widmer (2020)
"dixon"       On-line time warping by Dixon (2005)
"outerhmm"    Outer-product HMM score follower by Nakamura (2014)
"hmm"         HMM score follower by Cancino-Chacón et al. (2023)
"pthmm"       Pitch-based HMM score follower

Features

Audio (input_type="audio")

Default processor: "chroma"

Processor              Description
"chroma"               Chroma features
"mfcc"                 Mel-frequency cepstral coefficients
"cqt"                  Constant-Q transform
"mel"                  Mel-spectrogram
"lse"                  Log-spectral energy features used in Dixon (2005)
"cqt_spectral_flux"    CQT-based spectral flux used in Nakamura (2014)
"raw_spectrum"         Raw power spectrum used in Jiang and Raphael (2020)

MIDI (input_type="midi")

Default processor: "pitch_chord"

Processor        Description
"pitch_chord"    Pitch features grouped per chord onset
"pitch"          Pitch features per note (no chord grouping)
"pianoroll"      Piano-roll features
"pitchclass"     Pitch class features