Xz Format Inadequate for General Use
canopic jug writes:
Developer Antonio Diaz has updated his report on the Xz compression format and concluded that the Xz format is inadequate for general use.
One of the challenges of digital preservation is the evaluation of data formats. It is important to choose well-designed data formats for general use. This article describes the reasons why the xz compressed data format is inadequate for most uses, including long-term archiving, data sharing, and free software distribution. The relevant weaknesses and design errors in the xz format are analyzed and, where applicable, compared with the corresponding behavior of the bzip2, gzip, and lzip formats. Key findings include: (1) safe interoperability between xz implementations is not guaranteed; (2) xz is vulnerable to unprotected flags and length fields; (3) LZMA2 is unsafe and less efficient than the original LZMA; (4) xz's extensibility is unreasonable and problematic; (5) xz includes useless features that increase the number of false positives for corruption; (6) xz shows inconsistent behavior with respect to trailing data; (7) error detection in xz is less accurate than in bzip2, gzip, and lzip.
Disclosure statement: The author is also author of the lzip format.
Acknowledgements: The author would like to thank Lasse Collin for his detailed and useful comments that helped improve the first version of this article. The author would also like to thank the fellow GNU developers who reviewed this article before publication and the people whose comments have helped to fill in the gaps.
This article was originally published under the title "Xz format inadequate for long-term archiving", but further analysis revealed that xz is significantly less safe than bzip2, gzip, and lzip for most uses.
This article tries to be as objective and complete as possible. Please report any errors or inaccuracies to the author at the lzip mailing list [...] or in private at [...]. As of today, no formal refutation of this article has been reported.
Notably, Xz's integrity checking is optional, and it proves unreliable even when the file provides a check sequence.
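To make the "optional" part of that point concrete, here is a minimal sketch using Python's standard `lzma` module. It only shows that a valid xz stream can be written with no integrity check at all and that the stream header records which check was chosen. It does not attempt to reproduce the report's findings about error-detection accuracy. The helper name `xz_check_id` and the sample payload are assumptions made for illustration.

```python
import lzma


def xz_check_id(blob: bytes) -> int:
    """Return the integrity-check ID recorded in an xz stream (hypothetical helper)."""
    decompressor = lzma.LZMADecompressor(format=lzma.FORMAT_XZ)
    # Decoding the data lets the decompressor read the stream header,
    # after which the .check attribute holds the check ID.
    decompressor.decompress(blob)
    return decompressor.check


# Illustrative payload, not taken from the report.
payload = b"example payload " * 64

# The same data written twice: once with no check, once with CRC32.
unchecked = lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_NONE)
crc32_checked = lzma.compress(payload, format=lzma.FORMAT_XZ, check=lzma.CHECK_CRC32)

print(xz_check_id(unchecked) == lzma.CHECK_NONE)
print(xz_check_id(crc32_checked) == lzma.CHECK_CRC32)
```

Both streams decompress without complaint, so a reader who wants to know whether an xz file carries any check has to inspect the stream rather than assume one is present.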
Previously:
(2024) xz-style Attacks Continue to Target Open-Source Maintainers
(2024) The Mystery of 'Jia Tan,' the XZ Backdoor Mastermind
(2024) xz: Upstream Repository and the xz Tarballs Have Been Backdoored