Round 2 for ZSTD compression #377

@mobiusklein

Describe the new term or terms you would like to add.

Revisits #167 and #168

Following on the discussion from the PSI Spring Workshop, since mzMLb did not receive uptake from the community for all the previously discussed reasons, we will revisit adding ZSTD compression to mzML.

The conversation proposes:

  1. Provide ZSTD compression with byte shuffling with a "trade name".
  2. Provide delta encoding followed by ZSTD compression with byte shuffling with a "trade name".
  3. Do not propose separate ZSTD without byte shuffling, because it introduces extra complexity.

I ran an experiment: I generated double-precision floats from 100.0 to 2100.0, spaced by 0.01, and compressed them with ZSTD at compression level 9 and zlib at compression level 9:

| | Raw | Zlib | Shuffle + Zstd | Delta + Shuffle + Zstd |
|---|---|---|---|---|
| Bytes | 1600000 | 570704 | 12040 | 560 |
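
For anyone who wants to try this, here is a rough sketch of how such an experiment could be set up. It is not the script behind the numbers above, and several details are assumptions on my part: the points are generated as `100.0 + i * 0.01`, "delta" means the floating-point difference between consecutive values with the first value kept, and the helper names (`shuffle`, `delta_encode`, `experiment`, and so on) are made up for illustration. It uses the third-party `zstandard` package if it is installed and otherwise falls back to zlib as a stand-in codec, so the sizes it reports will not necessarily match the table.

```python
import struct
import zlib

try:
    import zstandard  # optional third-party ZSTD binding

    def compress(data: bytes) -> bytes:
        return zstandard.ZstdCompressor(level=9).compress(data)
except ImportError:
    def compress(data: bytes) -> bytes:
        # Stand-in codec when no ZSTD binding is available (NOT ZSTD).
        return zlib.compress(data, 9)


def pack(values):
    """Serialize floats as little-endian 64-bit doubles."""
    return struct.pack(f"<{len(values)}d", *values)


def shuffle(data: bytes, width: int = 8) -> bytes:
    """Byte shuffle: byte 0 of every value, then byte 1 of every value, ..."""
    return b"".join(data[i::width] for i in range(width))


def unshuffle(data: bytes, width: int = 8) -> bytes:
    """Inverse of shuffle()."""
    n = len(data) // width
    out = bytearray(len(data))
    for i in range(width):
        out[i::width] = data[i * n:(i + 1) * n]
    return bytes(out)


def delta_encode(values):
    """Keep the first value, then store consecutive differences (assumed scheme)."""
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas):
    """Inverse of delta_encode(): running sum of the differences."""
    out = [deltas[0]]
    for d in deltas[1:]:
        out.append(out[-1] + d)
    return out


def experiment(n: int = 200_000):
    """Return the encoded size in bytes for each pipeline."""
    values = [100.0 + i * 0.01 for i in range(n)]  # assumed generation method
    raw = pack(values)
    return {
        "raw": len(raw),
        "zlib": len(zlib.compress(raw, 9)),
        "shuffle + codec": len(compress(shuffle(raw))),
        "delta + shuffle + codec": len(compress(shuffle(pack(delta_encode(values))))),
    }


if __name__ == "__main__":
    for name, size in experiment().items():
        print(f"{name}: {size}")
```

One caveat with this sketch: floating-point deltas happen to round-trip exactly for this evenly spaced series, but that is not guaranteed for arbitrary data, so a real encoding would need to specify the delta step carefully.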

I did not run a timing experiment because the results would depend on the implementation language.

Begin Bikeshedding

We wanted to use a trade name because it seemed to the room that a verbose name would be harder to understand.

Some randomly generated names:

  1. ZSTD-Shuffle - The anti-trade name: just call it what it is
  2. MZSTD - The least creative adaptation of the domain
  3. MZD - I was wrong, I could be even less creative
