Issue6715
Created on 2009-08-17 09:47 by devurandom, last changed 2010-08-19 20:14 by jreese.
| Messages (23) | |||
|---|---|---|---|
| msg91657 - (view) | Author: (devurandom) | Date: 2009-08-17 09:47 | |
Python currently supports zlib, gzip and bzip2 compressors. What is missing is support for xz (http://tukaani.org/xz/). It comes with a C library. |
|||
| msg91658 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2009-08-17 11:37 | |
Is xz really a C library? I could find a standalone program, but no shared object. Actually, it seems that xz is a file format based on the lzma algorithm. The plan could be to first implement the lzma module (issue5689), then an xzfile module in pure Python. |
|||
| msg91660 - (view) | Author: Skip Montanaro (skip.montanaro) * | Date: 2009-08-17 11:51 | |
What is xz compression and why is it important? Skip |
|||
| msg91661 - (view) | Author: (devurandom) | Date: 2009-08-17 12:13 | |
Yes, xz-utils contains a C library, though it still carries the name "liblzma.so", probably for historic reasons. You are right that xz is a file format based around the lzma algorithm. It just uses a more advanced container format. (lzma-utils had no container at all.) xz is the successor of lzma, which provides better compression than bzip2, while decompression speed is comparable with gzip. It is used by the GNU project for source tarball compression (replacing bzip2) and supported by GNU tar. See http://en.wikipedia.org/wiki/Xz, http://tukaani.org/xz/ and http://tukaani.org/lzma/ for reference. |
|||
| msg92163 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2009-09-02 11:02 | |
Are xz and lzma formats compatible with each other? If not, which one is the most popular? |
|||
| msg92167 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2009-09-02 14:22 | |
As I understand it from the http://tukaani.org/xz/ page, .lzma and .xz are two file formats based on the lzma compression method.
- .lzma simply stores the compressed stream.
- .xz is more complex |
|||
| msg92174 - (view) | Author: (devurandom) | Date: 2009-09-02 18:10 | |
.lzma is actually not a format. It is just the raw output of the LZMA1 coder. XZ instead is a container format for the LZMA2 coder, which probably means LZMA+some metadata. XZ is the official successor to .lzma, and GNU is using it already (look at coreutils), and GNU tar has officially supported it since 1.22. |
|||
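A minimal sketch of the difference between the two formats discussed above. It assumes the `lzma` module that later shipped in Python's standard library (3.3 and newer), which did not exist when this thread was written; `FORMAT_ALONE` is that module's name for the legacy .lzma stream, and the sample data is made up for illustration.

```python
import lzma

# Illustrative sample data (an assumption, not from the thread).
data = b"hello xz " * 1000

# .xz container: stream header with magic bytes, integrity check, LZMA2 filter.
xz_blob = lzma.compress(data, format=lzma.FORMAT_XZ)

# Legacy .lzma stream: LZMA1 data without the .xz container or its magic bytes.
alone_blob = lzma.compress(data, format=lzma.FORMAT_ALONE)

# Every .xz stream begins with these six bytes.
XZ_MAGIC = b"\xfd7zXZ\x00"
xz_has_magic = xz_blob.startswith(XZ_MAGIC)
alone_has_magic = alone_blob.startswith(XZ_MAGIC)

# decompress() defaults to FORMAT_AUTO, which accepts either format.
roundtrip_xz = lzma.decompress(xz_blob)
roundtrip_alone = lzma.decompress(alone_blob)
```

Only the .xz output carries the container's magic bytes; both round-trip through the same `decompress()` call.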
| msg98774 - (view) | Author: Garen (Garen) | Date: 2010-02-03 06:27 | |
Once Python gets native support for lzma/xz like it does for zlib, bzip2 it could switch to using it for bundles and remote transfers. See: http://mercurial.selenic.com/bts/issue1463 With lzma/xz being able to compress so much better, it'd be really appreciated by users on especially slow links(!!). |
|||
| msg98776 - (view) | Author: Garen (Garen) | Date: 2010-02-03 06:34 | |
Ugh, can't edit previous message. Meant to say: "Once Python gets native support for lzma/xz like it does for zlib and bzip2, Mercurial could switch to using it for bundles and remote transfers." For platforms with native support in-kernel (e.g. Linux) that could be used instead of the bundled version. (Since Python is officially switching to Mercurial, arguably this issue is even more important.) |
|||
| msg98794 - (view) | Author: Arkadiusz Miskiewicz (arekm) | Date: 2010-02-03 19:44 | |
About why xz is important: gnu.org and tug.org have started publishing sources in xz format. A quick grep:
autoconf/autoconf.spec:Source0: http://ftp.gnu.org/gnu/autoconf/%{name}-%{version}.tar.xz
coreutils/coreutils.spec:Source0: http://ftp.gnu.org/gnu/coreutils/%{name}-%{version}.tar.xz
libpng12/libpng12.spec:Source0: http://downloads.sourceforge.net/libpng/libpng-%{version}.tar.xz
libpng/libpng.spec:Source0: http://downloads.sourceforge.net/libpng/%{name}-%{version}.tar.xz
parted/parted.spec:Source0: http://ftp.gnu.org/gnu/parted/%{name}-%{version}.tar.xz
texlive-texmf/texlive-texmf.spec:Source0: ftp://tug.org/texlive/historic/%{year}/texlive-%{version}-texmf.tar.xz
xz is also supported by automake as a dist target. |
|||
| msg98806 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-02-04 00:40 | |
We all agree that lzma/xz is important; what is needed is a patch. |
|||
| msg98899 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-02-05 19:35 | |
There's a Python binding for some lzma lib here: https://launchpad.net/pyliblzma |
|||
| msg101941 - (view) | Author: tdjacr (thedjatclubrock) | Date: 2010-03-30 14:41 | |
Once xz is implemented, xz compatibility should be added to the tarfile library. |
|||
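A minimal sketch of what xz support in tarfile looks like, assuming the `"w:xz"` and `"r:xz"` modes that the standard library's `tarfile` module gained in Python 3.3 (after this thread). The member name and payload are hypothetical.

```python
import io
import tarfile

# Hypothetical file contents used only for this illustration.
payload = b"example file contents\n"

# Write a .tar.xz archive into an in-memory buffer.
buf = io.BytesIO()
with tarfile.open(fileobj=buf, mode="w:xz") as tar:
    info = tarfile.TarInfo(name="hello.txt")
    info.size = len(payload)
    tar.addfile(info, io.BytesIO(payload))

# Read the same archive back and extract the single member.
buf.seek(0)
with tarfile.open(fileobj=buf, mode="r:xz") as tar:
    names = tar.getnames()
    extracted = tar.extractfile("hello.txt").read()
```

The same modes work with a file name instead of `fileobj` when the archive lives on disk.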
| msg106427 - (view) | Author: Per Øyvind Karlsen (proyvind) | Date: 2010-05-25 11:20 | |
Ooops, I kinda should've commented on this issue here instead, rather than in issue5689, so I'll just copy-paste it here as well:
I'm the author of the pyliblzma module, and if desired, I'd be happy to help out adapting pyliblzma for inclusion with python. Most of its code is based on bz2module.c, so it shouldn't be very far away from being good 'nuff.
What I see as required is:
* clean out use of C99 types etc.
* clean up the LZMAOptions class (this is the biggest difference from the bz2 module: as the filter supports a wide range of various options, everything related, such as parsing, api documentation etc., was placed in its own class. I've yet to receive any feedback on this decision or find any remote equivalents out there to draw inspiration from;)
* While most of the liblzma API has been implemented, support for multiple/alternate filters still remains to be implemented. When done it will also cause some breakage with the current pyliblzma API.
I plan on doing these things sooner or later anyways; it's pretty much just a matter of motivation and priorities standing in the way, and actual interest from others would certainly have a positive effect on this. ;)
As for alternatives to the LGPL liblzma, you really don't have any. Keep in mind that LZMA is "merely" the algorithm, while xz (and LZMA_alone, used for '.lzma', now obsolete, but still supported) are the actual formats you want support for. The LZMA SDK does not provide any compatibility for this. |
|||
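To make "multiple/alternate filters" concrete, here is a small sketch of a custom filter chain. It is not pyliblzma's LZMAOptions API; it assumes the filter-specifier dictionaries of the `lzma` module that later shipped in Python 3.3 and newer, and the sample data and filter settings are illustrative choices only.

```python
import lzma

# Illustrative data with a regular 4-byte stride (an assumption for the demo).
data = b"".join(i.to_bytes(4, "little") for i in range(5000))

# A chain of two filters: a delta filter feeding the LZMA2 compressor.
# In the stdlib module each filter is described by a plain dict.
filters = [
    {"id": lzma.FILTER_DELTA, "dist": 4},
    {"id": lzma.FILTER_LZMA2, "preset": 6},
]

packed = lzma.compress(data, format=lzma.FORMAT_XZ, filters=filters)

# The .xz container records the filter chain, so decompression needs no options.
restored = lzma.decompress(packed)
```

Describing each filter as a dict is one way to expose the many liblzma options without a dedicated options class.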
| msg106430 - (view) | Author: Amaury Forgeot d'Arc (amaury.forgeotdarc) * | Date: 2010-05-25 12:05 | |
I will happily review any implementation, and I can help with inclusion into python trunk.
> ...the LGPL liblzma...
Can you check which licences cover the different parts of the module? I think that you will have to contribute your code under the Python Contributor Agreement; and I just grabbed some copy of the "xz-utils" source package, and it states that "liblzma is in the public domain". |
|||
| msg106433 - (view) | Author: Per Øyvind Karlsen (proyvind) | Date: 2010-05-25 13:06 | |
Ah, you're right, I forgot that the license for the library had changed as well (motivated by an attempt at pleasing BSD people IIRC;); in the past the library was LGPL while only the 'xz' util was public domain.
For my code, feel free to use your own/any other license you'd like or even public domain (if the license of bz2module.c that much of it's derived from permits of course)! I guess everyone should be happy now then. :)
Btw. for review, I think the code already available should be pretty much good 'nuff for an initial review. Some feedback on things not derived from bz2module.c would be nice, especially on the LZMAOptions class, as that's where most of the remaining work required for adding support for additional filters lies. Would kinda blow if I did the work using an approach that would be dismissed as utterly rubbish. ;)
Oh well, it's out there available for anyone already, I probably won't(/shouldn't;) have time for it for a month at least, do as you please meanwhile. :) |
|||
| msg106441 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-25 15:34 | |
Hello,
> I'm the author of the pyliblzma module, and if desired, I'd be happy
> to help out adapting pyliblzma for inclusion with python.
> Most of its code is based on bz2module.c, so it shouldn't be very far
> away from being good 'nuff.
Well, I wouldn't say bz2module is the best module out there, but as you say it's probably good enough :) And we can help you fix things if needed.
Is pyliblzma compatible with Python 3.x? It's too late to incorporate any new feature in Python 2.x now.
> * While most of the liblzma API has been implemented, support for
> multiple/alternate filters still remains to be implemented. When done
> it will also cause some breakage with the current pyliblzma API.
Hmm, then perhaps you should first fix the current API so that adding new features doesn't force you to break the API again. There are strict rules for API breakage in the standard library.
By the way, adding a new module to the stdlib probably requires writing a PEP (Python Enhancement Proposal). I wouldn't expect this very proposal to be controversial, but someone has to do it.
Finally, when a module is in the stdlib, it is expected that maintenance primarily happens in the Python SVN (or Mercurial) tree. We have a couple of externally-maintained modules, but they're a source of problems for us. |
|||
| msg106567 - (view) | Author: Per Øyvind Karlsen (proyvind) | Date: 2010-05-26 18:48 | |
Yeah, I guess I can just break the current API right away anyway to make it compatible with future changes; I figured out long ago what it should look like. It's not like I have to implement the actual functionality to ensure compatibility, a no-op works like a charm. ;) |
|||
| msg106572 - (view) | Author: Martin v. Löwis (loewis) | Date: 2010-05-26 19:54 | |
[Replying to msg106566]
> if you're already looking at issue6715, then I don't get why you're
> asking.. ;)
Can you please submit a contributor form?
> Martin: For LGPL (or even GPL for that matter, disregarding linking
> restrictions) libraries you don't have to distribute the sources of
> those libraries at all (they're already made available by others, so
> that would be quite overly redundant, uh?;). LGPL actually doesn't
> even care at all about the license of your software as long as you
> only dynamically link against it.
Of course you do. Quoting from the LGPL
"You may convey a Combined Work ... if you also do each of the following:
...
d) Do one of the following:
0) Convey the Minimal Corresponding Source under the terms of this License, and the Corresponding Application Code in a form suitable for, and under terms that permit, the user to recombine or relink the Application with a modified version of the Linked Version to produce a modified Combined Work, in the manner specified by section 6 of the GNU GPL for conveying Corresponding Source.
1) [not applicable to Windows]
"
> I don't really get what the issue would be even if liblzma were still
> LGPL, it doesn't prohibit you from distributing a dynamically linked
> library along with python either if necessary (which of course would
> be of convenience on win32..)..
Of course I can distribute a copy of an lzma DLL. However, I would have to provide ("convey") a copy of the source code of that DLL as well. |
|||
| msg106578 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-26 20:47 | |
> Of course I can distribute a copy of an lzma DLL. However, I would
> have to provide ("convey") a copy of the source code of that DLL as
> well.
Can you tell me where you are currently providing the source code for
the readline library, or the gdbm library?
Oh, and by the way, you should probably shut down PyPI, since there are
certainly Python wrappers for LGPL'ed libraries there (or even GPL'ed
one), and you aren't offering a link to download those libraries' source
code either.
You seem to have no problem "conveying" copies for the source code of
non-LGPL libraries such as OpenSSL. Why is that?
|
|||
| msg106580 - (view) | Author: Martin v. Löwis (loewis) | Date: 2010-05-26 21:01 | |
> Can you tell me where you are currently providing the source code for
> the readline library, or the gdbm library?
We don't, as they aren't included in the Windows distribution. The readline library doesn't work on Windows, anyway, and instead of gdbm, we had traditionally been distributing bsddb on Windows.
> Oh, and by the way, you should probably shut down PyPI, since there are
> certainly Python wrappers for LGPL'ed libraries there (or even GPL'ed
> one), and you aren't offering a link to download those libraries' source
> code either.
This is off-topic for this bug tracker. Please remain objective and professional if you can manage to.
> You seem to have no problem "conveying" copies for the source code of
> non-LGPL libraries such as OpenSSL. Why is that?
Not sure what you are referring to. We don't provide the sources for the OpenSSL libraries along with the Windows installer, because the license of OpenSSL doesn't require us to. This is very convenient for our users. |
|||
| msg106581 - (view) | Author: Antoine Pitrou (pitrou) | Date: 2010-05-26 21:34 | |
> > Oh, and by the way, you should probably shut down PyPI, since there are
> > certainly Python wrappers for LGPL'ed libraries there (or even GPL'ed
> > one), and you aren't offering a link to download those libraries' source
> > code either.
>
> This is off-topic for this bug tracker. Please remain objective and
> professional if you can manage to.
This whole subthread was already off-topic (since it was pointed out, before your previous message, that the underlying lib is in the public domain).
Actually, I would argue that the whole idea of promoting a rigorous interpretation of a license has no place on the bug tracker. It makes no sense to do this in an ad hoc fashion, especially if you want lawyers to be involved (they are certainly not reading this).
(of course, you will also have understood that I disagree with such a rigorous interpretation)
> Not sure what you are referring to. We don't provide the sources for the
> OpenSSL libraries along with the Windows installer, because the license
> of OpenSSL doesn't require us to. This is very convenient for our users.
This was not about providing the sources together with the installer (which even the GPL or the LGPL don't require to do), but providing them as a separate bundle on the download site. We do have a copy of the OpenSSL source tree somewhere, it is used by the Windows build process. |
|||
| msg106710 - (view) | Author: Per Øyvind Karlsen (proyvind) | Date: 2010-05-29 06:48 | |
I've ported pyliblzma to py3k now and also implemented the missing functionality I mentioned earlier. For anyone interested in my progress, the branch is found at: https://code.launchpad.net/~proyvind/pyliblzma/py3k
I need to fix some memory leaks (a side effect of the new PyUnicode/PyBytes change I'm not 100% with yet;) and various memory errors reported by valgrind etc. though, but things are starting to look quite nice already. :) |
|||
| History | |||
|---|---|---|---|
| Date | User | Action | Args |
| 2010-08-19 20:14:16 | jreese | set | nosy: + jreese |
| 2010-06-22 16:08:27 | rcoyner | set | nosy: + rcoyner |
| 2010-05-29 06:48:48 | proyvind | set | messages: + msg106710 |
| 2010-05-26 21:34:48 | pitrou | set | messages: + msg106581 |
| 2010-05-26 21:15:37 | doko | set | nosy: + doko |
| 2010-05-26 21:01:18 | loewis | set | messages: + msg106580 |
| 2010-05-26 20:47:25 | pitrou | set | messages: + msg106578 |
| 2010-05-26 19:54:04 | loewis | set | nosy: + loewis messages: + msg106572 |
| 2010-05-26 18:48:55 | proyvind | set | messages: + msg106567 |
| 2010-05-26 04:55:06 | lars.gustaebel | set | nosy: + lars.gustaebel |
| 2010-05-25 15:34:23 | pitrou | set | messages: + msg106441 |
| 2010-05-25 13:18:45 | ysj.ray | set | nosy: + ysj.ray |
| 2010-05-25 13:06:35 | proyvind | set | messages: + msg106433 |
| 2010-05-25 12:05:03 | amaury.forgeotdarc | set | messages: + msg106430 |
| 2010-05-25 11:20:28 | proyvind | set | nosy: + proyvind messages: + msg106427 |
| 2010-05-21 20:38:46 | Christophe Simonis | set | nosy: + Christophe Simonis |
| 2010-05-21 15:29:20 | haypo | set | nosy: + haypo |
| 2010-05-20 20:41:00 | skip.montanaro | set | nosy: - skip.montanaro |
| 2010-05-08 14:30:20 | brian.curtin | set | versions: - Python 2.7 |
| 2010-05-08 14:18:54 | ockham-razor | set | nosy: + ockham-razor |
| 2010-04-09 13:07:44 | Nikratio | set | nosy: + Nikratio |
| 2010-04-09 08:06:02 | nicdumz | set | nosy: + nicdumz |
| 2010-03-30 14:41:19 | thedjatclubrock | set | nosy: + thedjatclubrock messages: + msg101941 |
| 2010-02-05 19:39:41 | eric.araujo | set | nosy: + eric.araujo |
| 2010-02-05 19:35:27 | pitrou | set | messages: + msg98899 |
| 2010-02-04 00:40:56 | pitrou | set | messages: + msg98806 |
| 2010-02-03 19:44:25 | arekm | set | nosy: + arekm messages: + msg98794 |
| 2010-02-03 06:34:59 | Garen | set | messages: + msg98776 |
| 2010-02-03 06:27:58 | Garen | set | nosy: + Garen messages: + msg98774 |
| 2010-01-27 15:58:57 | pitrou | link | issue5689 dependencies |
| 2010-01-27 15:58:41 | pitrou | set | priority: high dependencies: - please support lzma compression as an extension and in the tarfile module |
| 2009-09-21 03:17:36 | leonov | set | nosy: + leonov |
| 2009-09-02 18:10:39 | devurandom | set | messages: + msg92174 |
| 2009-09-02 14:22:51 | amaury.forgeotdarc | set | messages: + msg92167 |
| 2009-09-02 11:02:12 | pitrou | set | nosy: + pitrou messages: + msg92163 |
| 2009-08-17 12:13:13 | devurandom | set | messages: + msg91661 title: xz compression support -> xz compressor support |
| 2009-08-17 11:51:44 | skip.montanaro | set | nosy: + skip.montanaro messages: + msg91660 |
| 2009-08-17 11:37:26 | amaury.forgeotdarc | set | versions: - Python 2.6, Python 3.0, Python 3.1 nosy: + amaury.forgeotdarc messages: + msg91658 dependencies: + please support lzma compression as an extension and in the tarfile module stage: needs patch |
| 2009-08-17 09:47:22 | devurandom | create | |