Implementations¶
mzPeak is backed by seven independent, from-scratch implementations — not bindings to a single core. Building on the widely available Apache Parquet and Apache Arrow libraries keeps the on-disk structure language-independent, so each implementation reads and writes the same archives natively.
| Language | Capability | Notes |
|---|---|---|
| Rust | read / write | The reference reader and writer. |
| Python | read | Zero-copy Arrow / Pandas. |
| R | read | dplyr-compatible table access. |
| C# | read / write | Includes a Thermo RawFileReader demo. |
| JavaScript / TypeScript | read | Runs in the browser, Node, and Deno. |
| C++ | read | Built on the Apache Arrow C++ libraries. |
| Java | read / write | JVM reader/writer on Arrow / Parquet for Java. |
Rust — reference implementation¶
HUPO-PSI/mzPeak (mzpeak_prototyping) is
the reference reader and writer. As the canonical implementation, it tracks the
specification most closely and is the place to confirm intended behaviour when
the prose is ambiguous.
Python¶
A read-only library exposing mzPeak data as zero-copy Apache Arrow tables and Pandas DataFrames, so metadata and signal can be queried with the analytical tooling proteomics and metabolomics users already know.
R¶
A read-only library offering dplyr-compatible access to the packed metadata and
signal tables, developed in coordination with the
R for Mass Spectrometry community. The
current interface uses S6-style classes; an S4 interface (for full Bioconductor
ecosystem compatibility) is planned.
C#¶
HUPO-PSI/mzPeak.NET is a read/write
implementation for the .NET ecosystem. It ships with a demonstration that reads
vendor data through Thermo's RawFileReader and writes it to mzPeak.
JavaScript / TypeScript¶
A read-only implementation that runs in the browser, Node.js, and Deno, making mzPeak archives directly inspectable on the web. An online viewer demonstrates reading local and remote files — including extracted-ion chromatograms, base-peak chromatograms, and metadata inspection — entirely client-side.
C++¶
A read-only implementation built on the Apache Arrow C++ libraries, bringing native mzPeak access to C++ analysis pipelines and to the many languages that bind to a C/C++ core.
Java¶
A read/write implementation for the JVM, built on Apache Arrow / Parquet for Java, so mzPeak archives can be produced and consumed across the Java and Scala ecosystems.
A common cross-language API is in progress
A language-agnostic, OpenAPI-style description of the shared reader/writer
interface (open, get_spectrum, get_chromatogram, iterate, slice,
and related methods) is planned, to keep the implementations aligned as the
format matures.
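To make the idea of a shared interface concrete, here is a minimal Python sketch of what such a reader surface might look like. Only the method names (open, get_spectrum, get_chromatogram, iterate, slice) come from the text above. Everything else is a hypothetical illustration: the `Spectrum` and `Chromatogram` records and their fields, the `Reader` protocol, the `InMemoryReader` stand-in, `open_reader`, and `total_ion_current`. It reads no mzPeak files and is not the API of any existing implementation.

```python
from dataclasses import dataclass
from typing import Iterator, Protocol, Sequence


@dataclass(frozen=True)
class Spectrum:
    """Hypothetical minimal spectrum record; field names are illustrative."""
    index: int
    mz: tuple[float, ...]
    intensity: tuple[float, ...]


@dataclass(frozen=True)
class Chromatogram:
    """Hypothetical minimal chromatogram record; field names are illustrative."""
    index: int
    time: tuple[float, ...]
    intensity: tuple[float, ...]


class Reader(Protocol):
    """Assumed shape of the shared reader; only the method names are from the text."""

    def get_spectrum(self, index: int) -> Spectrum: ...
    def get_chromatogram(self, index: int) -> Chromatogram: ...
    def iterate(self) -> Iterator[Spectrum]: ...
    def slice(self, start: int, stop: int) -> Sequence[Spectrum]: ...


class InMemoryReader:
    """Toy stand-in that satisfies Reader without touching any file."""

    def __init__(self, spectra, chromatograms=()):
        self._spectra = list(spectra)
        self._chromatograms = list(chromatograms)

    def get_spectrum(self, index):
        return self._spectra[index]

    def get_chromatogram(self, index):
        return self._chromatograms[index]

    def iterate(self):
        return iter(self._spectra)

    def slice(self, start, stop):
        # Half-open range [start, stop), following Python convention.
        return self._spectra[start:stop]


def open_reader(spectra, chromatograms=()) -> Reader:
    """Stand-in for `open`; a real implementation would take an archive path."""
    return InMemoryReader(spectra, chromatograms)


def total_ion_current(reader: Reader) -> list[float]:
    """Works with any object that satisfies Reader, whatever backs it."""
    return [sum(s.intensity) for s in reader.iterate()]


demo = open_reader([
    Spectrum(0, (100.0, 200.0), (1.0, 2.0)),
    Spectrum(1, (150.0,), (5.0,)),
])
print(total_ion_current(demo))
```

The point of the sketch is that downstream code such as `total_ion_current` depends only on the shared method names, so it would not need to change if the toy reader were swapped for a real one exposing the same surface.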
Building your own?
The format is language-independent — start from this specification and validate your output against the conformance validator (see Tools). Contributions and feedback are welcome via the HUPO-PSI repositories.