What should be in the Python standard library?
Python has always touted itself as a "batteries included" language; its standard library contains lots of useful modules, often more than enough to solve many types of problems quickly. From time to time, though, some have started to rethink that philosophy, to reduce or restructure the standard library, for a variety of reasons. A discussion at the end of November on the python-dev mailing list revived that debate to some extent.
Jonathan Underwood raised the issue, likely unknowingly, when he asked about possibly adding some LZ4 compression library bindings to the standard library. As the project page indicates, it fits in well with the other compression modules already in the standard library. Responses were generally favorable or neutral, though some, like Brett Cannon, wondered if it made sense to broaden the scope a bit to create something similar to hashlib but for compression algorithms. Gregory P. Smith had a different take, however:
If anything, it'd be nice to standardize on some stdlib namespaces that others could plug their modules into. Create a compress in the stdlib with zlib and bz2 in it, and a way for extension modules to add themselves in a managed manner instead of requiring a top level name? Opening up a designated namespace to third party modules is not something we've done as a project in the past though. It requires care. I haven't thought that through.
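To make the idea of a managed namespace a bit more concrete, here is a minimal sketch of what a hashlib-like front end for compression might look like. Nothing like this exists in the standard library or was proposed in this form; the `register()`, `algorithms_available()`, `compress()`, and `decompress()` names are assumptions made up for illustration, wrapped around the real `zlib` and `bz2` modules.

```python
import bz2
import zlib

# Hypothetical registry: algorithm name -> (compress, decompress) callables.
# Only modules that already ship with Python are registered up front.
_ALGORITHMS = {
    "zlib": (zlib.compress, zlib.decompress),
    "bz2": (bz2.compress, bz2.decompress),
}


def register(name, compress_func, decompress_func):
    """Let an extension module add itself under a managed name."""
    if name in _ALGORITHMS:
        raise ValueError(f"algorithm {name!r} is already registered")
    _ALGORITHMS[name] = (compress_func, decompress_func)


def algorithms_available():
    """Return the sorted names of all registered algorithms."""
    return sorted(_ALGORITHMS)


def compress(name, data):
    """Compress bytes with the named algorithm."""
    return _ALGORITHMS[name][0](data)


def decompress(name, data):
    """Decompress bytes with the named algorithm."""
    return _ALGORITHMS[name][1](data)
```

Under that sketch, a third-party LZ4 binding would call `register()` with its own functions rather than claiming a top-level module name. How such registration would be policed is exactly the part Smith said requires care.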
Steven D'Aprano objected to Smith's assertion about the Python Package Index (PyPI): "PyPI makes getting more algorithms easy for *SOME* people." He noted that in many environments (e.g. schools, companies) users cannot install additional software on the computers they are using, so PyPI is not the panacea it is sometimes characterized as.
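In practice, code that has to run in such locked-down environments often ends up looking something like the sketch below, which prefers a PyPI package but falls back to the standard library when it is absent. It assumes the third-party python-lz4 package and its `lz4.frame` module; the `pack()` and `unpack()` helpers and the one-byte format tag are hypothetical, invented here only to show the pattern.

```python
import zlib

try:
    # python-lz4 is a third-party package from PyPI; it may not be installable.
    import lz4.frame as _lz4
except ImportError:
    _lz4 = None


def pack(data: bytes) -> bytes:
    """Compress with LZ4 when available, otherwise with stdlib zlib."""
    if _lz4 is not None:
        return b"L" + _lz4.compress(data)
    return b"Z" + zlib.compress(data)


def unpack(blob: bytes) -> bytes:
    """Reverse pack(), using the leading tag byte to pick the algorithm."""
    tag, body = blob[:1], blob[1:]
    if tag == b"Z":
        return zlib.decompress(body)
    if tag == b"L":
        if _lz4 is None:
            raise RuntimeError("data was packed with LZ4, which is not installed")
        return _lz4.decompress(body)
    raise ValueError(f"unknown format tag {tag!r}")
```

The fallback keeps the script working everywhere, but data packed on a machine with LZ4 cannot be read on one without it, which is one illustration of why having an algorithm in the standard library matters to users who cannot reach PyPI.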
That led Cannon to suggest discussing the standard library and its role: "We have never really had a discussion about how we want to guide the stdlib going forward (e.g. how much does PyPI influence things, focus/theme, etc.)."
Paul Moore wasn't sure that discussing the matter would really resolve anything, though:
A larger standard library would help those without access to PyPI, Antoine Pitrou argued, while a smaller one does not provide huge benefits: "Python doesn't become magically faster or more powerful by including less in its standard distribution: the best it does is make the distribution slightly smaller." But there are definite downsides to having a large standard library, Benjamin Peterson said:
- The [development] of stdlib modules slows to the rate of the Python release schedule.
- stdlib modules become a permanent maintenance burden to CPython core developers.
- The blessed status of stdlib modules means that users might use a substandard stdlib modules when a better thirdparty alternative exists.
Steve Dower would rather see a smaller standard library with some kind of "standard distribution" of PyPI modules that is curated by the core developers. Later in the thread, he listed numerous different Python distributions as examples of what he meant, but that just highlighted another problem, Moore said: which of those should he recommend to his users? Right now, the standard library provides the base that a Python script can rely on:
Moore acknowledged that maintaining modules in the standard library has a "significant cost" but wondered if moving to the distribution model was simply shifting those costs to users, without users gaining much from it. Nathaniel Smith looked at the list of distributions and came to a different conclusion: the "single-box-of-batteries" model is not really solving the problems it needs to solve.
It's really hard to tell whether specific packages would be good or bad additions to the stdlib, when we don't even know what the stdlib is supposed to be.
But Moore found that to be overstated somewhat. For him (and presumably others), the standard library is what you can expect to find when you have Python installed. That means that various things like StackOverflow answers, tutorials, books, and so on can rely upon those pieces being present, "much like you'd expect every Linux distribution to include grep". In addition, the "batteries included" attribute is likely to have been part of what helped Python grow into one of the most popular languages, D'Aprano said. "The current model for the stdlib seems to be working well, and we mess with it at our peril."
Nathaniel Smith sees some advantages to the "standard distribution" model, though he is not sure that it would really be the best option. "But what I like about it is that it could potentially reduce the conflict between what our different user groups need, instead of playing zero-sum tug-of-war every time this comes up." Others don't see it that way, though; "not every need can be solved by the stdlib", as Pitrou put it. He continued:
Moore concurred: "In exploring alternatives, let's not lose sight of the fact that the stdlib has been a huge success, so we know we *can* deliver an extremely successful distribution based on that model, no matter how much it might trigger regular debates :-)"
In any case, as he pointed out, a more concrete proposal (in the form of a PEP) is going to be needed before any real progress can be made. Dower floated some ideas about what a distribution might look like along the way, but, without something like a PEP to discuss, participants are often talking past each other based on their assumptions.
The topic has come up before on the Python mailing lists and at Python Language Summits. In 2015, there was a discussion at the summit on adding the popular Requests module to the standard library. Participants recognized that there were significant barriers—development pace, certificate handling, no asyncio support—to moving it into the standard library. In the end, it made sense for Requests to stay out. At the 2018 summit, Christian Heimes brought up a number of batteries that should perhaps be removed from the set, though the effort to create a PEP listing them seems to have stalled.
No firm conclusions were drawn in the discussion, but part of the underlying problem seems to be a lack of clarity on what the purpose of the standard library is. At the 2015 summit, Cannon suggested an informational PEP be drafted to solidify that; until that happens, there will be wildly differing views on what role the standard library serves. At the moment, though, there is no process to accept or reject a PEP even if one were on offer; that will have to await the new Python Steering Council, which will be elected in early February. One of the first orders of business for that group will likely be to address the PEP process.
As far as adding LZ4 goes, the overall feeling from the thread is that it would be useful to have it in the standard library, at least for those not looking to change the standard library model. Adding LZ4 also requires a PEP, however, so that process may be stalled by the governance change, as well.