Add Part 3 of the “Conda Is Not PyPI” series #252
✅ Deploy Preview for conda-dot-org ready!
|
Shouldn't this also mention pixi? Especially around the reproducibility part? |
|
@baszalmstra yes, it definitely should ... let's focus on part 2 first ... this PR exists as an outlook and requires a bit more polishing. |
Title: Practical Power: Reproducibility, Automation, and Layering with Conda
Focus:
- Shows how conda’s embedded package metadata (`info/` + rendered recipe) enables auditable, rebuildable provenance.
- Explains reproducibility workflows with `conda-lock` (multi-platform lockfiles) and Renovate-driven automated dependency updates.
- Details performance and footprint benefits (no bundled glibc → faster CI/HPC/env creation).
- Describes layering model: conda base (multi-language distro layer) + pip/npm application layer.
- Highlights real-world advantages for data science, ML, research reproducibility, enterprise automation, and onboarding.
- Extends scope beyond “scientific Python” to DevOps / platform tooling (helm, terraform, opentofu, packer, etc.).
- Concludes the series: Part 1 (conceptual distinction), Part 2 (position on spectrum), Part 3 (practical workflows).
Authored by @dbast and @jezdez to document the practical implications of conda as a user‑space distribution.
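To make the first focus point concrete, here is a minimal sketch of reading the embedded `info/index.json` from a package archive. It assumes the older `.tar.bz2` package format (the newer `.conda` format is a zip container and is not handled here). The helper names `read_index_json` and `build_demo_package`, and the package `demo-pkg`, are hypothetical and exist only for illustration.

```python
import io
import json
import tarfile


def read_index_json(package_bytes: bytes) -> dict:
    """Return the parsed info/index.json from a .tar.bz2 conda package."""
    with tarfile.open(fileobj=io.BytesIO(package_bytes), mode="r:bz2") as tar:
        member = tar.extractfile("info/index.json")
        if member is None:
            raise ValueError("info/index.json is not a regular file")
        return json.load(member)


def build_demo_package() -> bytes:
    """Build a tiny in-memory stand-in for a package (hypothetical content)."""
    index = {
        "name": "demo-pkg",
        "version": "1.0.0",
        "build": "py_0",
        "depends": ["python >=3.9"],
    }
    payload = json.dumps(index).encode()
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:bz2") as tar:
        entry = tarfile.TarInfo("info/index.json")
        entry.size = len(payload)
        tar.addfile(entry, io.BytesIO(payload))
    return buffer.getvalue()


if __name__ == "__main__":
    meta = read_index_json(build_demo_package())
    print(meta["name"], meta["version"], meta["depends"])
```

Because the metadata travels inside the artefact itself, the same lookup works on any downloaded package without consulting an external index.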
|
I congratulated Jannis for the first two blog posts, and mentioned a point that he said I should comment here. While this might have fit better into Part 2, I still think there are aspects that match this part of the series. Feel free to incorporate stuff from this; I might have gotten a tad carried away on the line of thought I was pursuing. If you'd like to reuse the metaphor at the end, you should credit me somehow 😋
One of the key dimensions in which conda provides more than other approaches is that it enables a rolling distribution. Let's look at this by contrasting it with the standard model of distros like Debian, Ubuntu, Fedora etc.: these all have versions with a defined life cycle, and most packages within the distro stay unchanged (up to bugfixes) for the lifetime of that distro version. Longer-lived distros will provide some newer versions of core packages (e.g. opt-in for a newer python or GCC), but that's the VIP treatment for a minority of packages rather than the rule. This affects everything from compression libraries onwards; it's why […]
Of course, the ultimate "clock" in all of this is glibc, but it also affects core components like the C++ standard library on the system. This is for example why Red Hat needs to spend a lot of effort on its devtoolset backports: on the one hand, the RHEL distro lives so long that it can't stay relevant on the original compiler version, but on the other hand, the newer GCC backports absolutely must not change the ABI of libstdcxx (and where upstream GCC does that in any way, Red Hat will patch it out in the backport). In short: ABI compatibility is such a thorny problem that most distros are happy to provide a single consistent set of libraries, which is essentially fully discretized by the version of that distro.
There are other rolling distros, but most of them are considered too unstable for use as a daily driver, much less in production: Debian Sid & OpenSUSE Tumbleweed are explicitly the "HEAD" of distro-development; Arch Linux has a very loyal base, but is still considered "bleeding edge" by most people. Gentoo is rolling too, but explicitly built around falling back to source builds where necessary. In contrast, conda does not just cover linux, but also osx and windows, and does it all with ready-to-install binary artefacts. And even though it doesn't ship glibc, it will gracefully provide the right set of artefacts when you solve the same set of requirements on linux systems with different glibc versions.
The magic sauce to all this is "migration" infrastructure that will take a new, binary-incompatible version of a library, and rebuild all directly dependent packages in the respective channel. This turns what is a sharp cut between distro versions, e.g. Ubuntu 23.10 and 24.04, into a gradual transition. In fact, the major channels (i.e. Anaconda's default channels as well as conda-forge) are constantly being rebuilt as new versions of core libraries get released. It's the ultimate ship-of-Theseus, only that we've got armies of bots constantly replacing related planks across the whole ship, one at a time, but relentlessly (and in the order in which they depend on each other).
All of this is also why conda needs an industrial grade constraint resolver, because every install has to query the enormous heap of (past and present) planks, in order to determine how to build the ship today. This was the source of one of the earliest and most persistent complaints, namely that conda installs take a long time. We believe that this story has substantially improved with solutions like […]
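The "planks replaced in dependency order" idea can be sketched as a topological sort. This is a simplified illustration, not how any channel's migration tooling is actually implemented: the function `rebuild_order`, the package names, and the dependency map are all hypothetical, and only direct dependents of the migrated library are considered.

```python
from graphlib import TopologicalSorter


def rebuild_order(depends_on: dict[str, set[str]], migrated: str) -> list[str]:
    """List the direct dependents of `migrated`, dependencies before dependents."""
    # Packages that directly depend on the migrated library need a rebuild.
    affected = {pkg for pkg, deps in depends_on.items() if migrated in deps}
    # Keep only the edges between affected packages, so that a package is
    # rebuilt after the affected packages it depends on.
    graph = {pkg: depends_on[pkg] & affected for pkg in affected}
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    # Hypothetical channel contents: package -> packages it depends on.
    channel = {
        "pkg-a": {"libexample"},
        "pkg-b": {"libexample", "pkg-a"},
        "pkg-c": {"libexample", "pkg-b"},
        "pkg-d": {"zlib"},  # untouched by a libexample migration
    }
    print(rebuild_order(channel, "libexample"))
```

In this toy channel the three affected packages form a chain, so there is exactly one valid order; in a real channel many packages are independent of each other and can be rebuilt in parallel.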
|
|
As this series documents conda’s conceptual foundation, I am pinging the @conda/steering-council to help review and improve part 3 in this series. Many thanks! |
|
@baszalmstra pixi is now more prominently mentioned especially in context of locking... makes sense? @h-vetinari thanks for your long comment. there is now a dedicated section. Hope that resonates with what you wrote. You are also welcome to review the entire blog post. thanks! |
|
Thanks! Looks good! :) |
Co-authored-by: h-vetinari <[email protected]>
for more information, see https://pre-commit.ci
pavelzw
left a comment
thanks a lot!
|
@pavelzw thanks for your comments. They should all be addressed in the added commits. |
|
looks good to me! |
Co-authored-by: h-vetinari <[email protected]>
|
To simplify my review, I've opened #280 |
* Refine content and structure in "Practical Power" article for clarity and flow
* Add banner image
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fewer semicolons.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
h-vetinari
left a comment
haven't followed all changes, but I noticed two more (very) minor things.
> This creates a gradual transition through the entire dependency graph by replacing one "plank" at a time. As one community member describes it, an "[ultimate Ship of Theseus](https://en.wikipedia.org/wiki/Ship_of_Theseus)," where bots constantly rebuild related packages, one dependency at a time. At scale, conda-forge, the largest community channel, manages 20+ independent migrations across its 26,000+ packages at [any given time](https://conda-forge.org/status/), making this orchestration industrial in scope.
>
> This continuous rebuilding as new versions of core libraries are released enables conda environments to maintain [ABI](https://pypackaging-native.github.io/background/binary_interface) compatibility as dependencies evolve across Linux, macOS, and Windows, something traditional distros can't easily do. For a deeper exploration of the binary compatibility challenges that conda's model solves, see [PyPackaging Native](https://pypackaging-native.github.io/), which contrasts these issues with PyPI's approach.
>
> ### Industrial-grade solvers
There's "industrial in scope" and "Industrial-grade solvers" in pretty short succession; one of the two might want to change adjective...
happy to read that we've reached this level of fine-tuning language :)
Hah! INDUSTRIAL
Co-authored-by: h-vetinari <[email protected]>
jezdez
left a comment
Excellent article, @dbast!
|
Big thanks to everyone who reviewed and shaped this series with insights: thanks to the @conda/steering-council and thanks to @baszalmstra, @beckermr, @cbrueffer, @h-vetinari, @msarahan, @ocefpaf, @pavelzw, @xhochy, @wolfv. |
Description
Part 1: #250
Part 2: #251
`learn/` or `community/`, I have added it to the corresponding `_sidebar.json` file.