Implementations

mzPeak is backed by seven independent, from-scratch implementations — not bindings to a single core. Building on the widely available Apache Parquet and Apache Arrow libraries keeps the on-disk structure language-independent, so each implementation reads and writes the same archives natively.

| Language | Capability | API style |
| --- | --- | --- |
| Rust | read / write | The reference reader and writer. |
| Python | read | Zero-copy Arrow / Pandas. |
| R | read | dplyr-compatible table access. |
| C# | read / write | Includes a Thermo RawFileReader demo. |
| JavaScript / TypeScript | read | Runs in the browser, Node, and Deno. |
| C++ | read | Built on the Apache Arrow C++ libraries. |
| Java | read / write | JVM reader/writer on Arrow / Parquet for Java. |

Rust — reference implementation

HUPO-PSI/mzPeak (mzpeak_prototyping) is the reference reader and writer. As the canonical implementation, it tracks the specification most closely and is the place to confirm intended behaviour when the prose is ambiguous.

Python

A read-only library exposing mzPeak data as zero-copy Apache Arrow tables and Pandas DataFrames, so metadata and signal can be queried with the analytical tooling proteomics and metabolomics users already know.

R

A read-only library offering dplyr-compatible access to the packed metadata and signal tables, developed in coordination with the R for Mass Spectrometry community. The current interface uses S6-style classes; an S4 interface (for full Bioconductor ecosystem compatibility) is planned.

C#

HUPO-PSI/mzPeak.NET is a read/write implementation for the .NET ecosystem. It ships with a demonstration that reads vendor data through Thermo's RawFileReader and writes it to mzPeak.

JavaScript / TypeScript

A read-only implementation that runs in the browser, Node.js, and Deno, making mzPeak archives directly inspectable on the web. An online viewer demonstrates reading local and remote files — including extracted-ion chromatograms, base-peak chromatograms, and metadata inspection — entirely client-side.

C++

A read-only implementation built on the Apache Arrow C++ libraries, bringing native mzPeak access to C++ analysis pipelines and to the many languages that bind to a C/C++ core.

Java

A read/write implementation for the JVM, built on Apache Arrow / Parquet for Java, so mzPeak archives can be produced and consumed across the Java and Scala ecosystems.

A common cross-language API is in progress

A language-agnostic, OpenAPI-style description of the shared reader/writer interface (open, get_spectrum, get_chromatogram, iterate, slice, and related methods) is planned, to keep the implementations aligned as the format matures.
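Since the shared interface is still only planned, the following is a purely illustrative sketch of what such a reader contract could look like in Python. Only the method names (open, get_spectrum, get_chromatogram, iterate, slice) come from the text above; the signatures, the `Spectrum` and `Chromatogram` record types, their fields, and the `InMemoryReader` stand-in are all assumptions made for the example and are not taken from any mzPeak implementation.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Iterator, Protocol, Sequence, runtime_checkable


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical spectrum record; field names are illustrative only."""
    index: int
    mz: Sequence[float]
    intensity: Sequence[float]


@dataclass(frozen=True)
class Chromatogram:
    """Hypothetical chromatogram record; field names are illustrative only."""
    index: int
    time: Sequence[float]
    intensity: Sequence[float]


@runtime_checkable
class MzPeakReader(Protocol):
    """Assumed shape of the shared reader methods available after open()."""

    def get_spectrum(self, index: int) -> Spectrum: ...
    def get_chromatogram(self, index: int) -> Chromatogram: ...
    def iterate(self) -> Iterator[Spectrum]: ...
    def slice(self, start: int, stop: int) -> list[Spectrum]: ...


class InMemoryReader:
    """Toy stand-in that satisfies the protocol without touching any file."""

    def __init__(self, spectra, chromatograms):
        self._spectra = list(spectra)
        self._chromatograms = list(chromatograms)

    @classmethod
    def open(cls, spectra, chromatograms=()):
        # A real reader would take a path or URL here; this one takes records.
        return cls(spectra, chromatograms)

    def get_spectrum(self, index):
        return self._spectra[index]

    def get_chromatogram(self, index):
        return self._chromatograms[index]

    def iterate(self):
        return iter(self._spectra)

    def slice(self, start, stop):
        # Half-open range [start, stop), following Python's own convention.
        return self._spectra[start:stop]


# Usage: code written against MzPeakReader works with any conforming reader.
reader = InMemoryReader.open(
    [Spectrum(0, [100.0, 200.0], [5.0, 9.0]), Spectrum(1, [150.0], [3.0])]
)
first = reader.get_spectrum(0)
```

Expressing the contract as a structural protocol rather than a base class is one possible design choice: each implementation stays independent and only has to match the method shapes, which mirrors the from-scratch approach described above.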

Building your own?

The format is language-independent — start from this specification and validate your output against the conformance validator (see Tools). Contributions and feedback are welcome via the HUPO-PSI repositories.
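While developing a writer, a cheap pre-check can catch gross mistakes before running the conformance validator. The sketch below relies only on the general Apache Parquet file layout (a 4-byte `PAR1` magic at both ends and a 4-byte little-endian footer length before the trailing magic). It assumes you already hold the raw bytes of a single Parquet table produced by your writer; how mzPeak packages its tables into an archive is defined by the specification and is not assumed here. The function name `looks_like_parquet` is hypothetical, and the check is no substitute for the validator.

```python
import struct

PARQUET_MAGIC = b"PAR1"


def looks_like_parquet(data: bytes) -> bool:
    """Return True if the bytes have the outer framing of a Parquet file."""
    # Smallest possible framing: header magic + footer length + trailer magic.
    if len(data) < 12:
        return False
    if data[:4] != PARQUET_MAGIC or data[-4:] != PARQUET_MAGIC:
        return False
    # The footer length is a little-endian uint32 just before the trailer magic.
    (footer_len,) = struct.unpack("<I", data[-8:-4])
    # The footer must fit between the header magic and the length field.
    return footer_len <= len(data) - 12
```

A file that passes this check can still be invalid Parquet or non-conformant mzPeak; it only rules out output that is truncated or not Parquet at all.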