Question

I've used various dependency management tools for installing software: Homebrew, cabal, RubyGems, etc. Invariably, despite someone's simple instructions for installing their package, there are times when dependencies don't get worked out by the package manager and the installation fails. This is more common than it ought to be. And frustrating! You end up trying to troubleshoot the installation process, which I feel is something users (even if they are developers) shouldn't have to do.

This is one reason people knock Linux (which I love!) compared to Windows: installations can come with complications, and Windows users aren't expected to compile from source.

Is anyone aware of a dependency management tool (or dependency management strategies) that has been especially successful at eliminating the bumps associated with installs? Clearly some package managers must be better than others. Or does this remain an unsolved problem?


Solution

Package managers fail to install some software because of the intrinsic problem of releasing software into the wild: it can be installed anywhere, and it has to adapt to different machines, different hardware, different OS versions, different configurations, different side-by-side software and different problems.

You test your software on every Debian, Ubuntu, Fedora, Red Hat and SUSE release since 2008; your Continuous Integration server deploys more than 150 VMs covering different versions and variants of OSes with different configurations at every commit.

Then, when your product is finally released, it turns out that it fails miserably on Joe's machine, because Joe has a custom-compiled Debian in Romanian, lacks a few packages that you assumed were always installed, has a hard drive which hangs for 30 seconds from time to time, doesn't have a mouse and has an antivirus which is popular in Romania and only there.

And now he posts a message on a popular forum saying that your software crashes, because, indeed, you had no idea that your CI should have included a Romanian VM running a custom-compiled Debian with an emulated dying hard disk, no mouse and some obscure antivirus.

Dependency management tools such as apt-get for Debian-type distributions, npm for Node.js or pip for Python work great as systems. The fact that a package you install through them sometimes doesn't work is not the fault of those systems, but rather of the developers who made the package (and its installer). Don't blame the dependency management tools for the failures of individual packages.

Could installation be less error-prone? Yes. There are things you can do to make your life easier, and all of the points below matter.

  1. Use officially supported OSes and upgrade often (but don't be the guy who jumps to the newest pre-Alpha). The fact that you can install some software on Windows 2000 doesn't mean you should.

  2. Use recent hardware and up-to-date drivers. The number of issues caused by old motherboards, old GPUs or outdated drivers is huge. When I started using Ubuntu, I hated Chromium: it crashed several times per day, often bringing the whole system down with it. Updating the GPU driver solved the problem.

  3. Don't try to install every package you find onto the same system. I mean, if your corporate server serves as domain controller, DNS and DHCP and also hosts SharePoint Services, IIS (preferably two versions of it hacked side by side, with support for PHP and Python as well), SQL Server (which surprisingly takes all the RAM), SMTP and a dozen other services, you should start by firing your system administrator.

  4. Set up security. Nobody should access your corporate server through Remote Desktop to "tweak a little thing".

  5. Get your configuration under version control. You should be able to create a replica of any machine in an unattended way, from a bunch of scripts, in a matter of minutes (a minimal sketch follows after this list). If you have to install this, then that, then configure this thing and move this file into that folder, you'll make mistakes that will cost a lot (unless you can afford your entire infrastructure being down for a week).

  6. Test third-party software before pushing it in production. The version 3.0 of the proxy server is released? Great. Let's configure it in staging, run automated tests on exact replicas of production servers, and see that everything works as expected. It does? We can now upgrade our production servers.

  7. Virtualize aggressively. A VM is much easier to handle and replicate. A VM can be bootstrapped in less than a minute, which makes it very easy to deploy something for testing purposes, see if it works, experiment with it, or throw the VM away if it fails.

    Real, non-virtual machines are great when you need cutting-edge performance and the VM abstraction becomes the bottleneck. But installing a distribution on a real machine through PXE may take fifteen minutes, sometimes even longer. This makes experimentation much more difficult.

  8. Compartmentalize aggressively. If you have a DNS service and an SMTP server, run two VMs. If you host two Python applications and they are too small to deserve a dedicated VM each, virtualenv is your friend.

  9. Automate aggressively. VMs should be deployed automatically. Software should be pushed to VMs automatically. Software and infrastructure should be tested automatically. Anything done by hand introduces random mistakes that are difficult to track and reproduce, and slows the process down.

  10. Snapshots help. Installing something in production and don't have time to test everything thoroughly? Don't forget to create a snapshot of your disks so you can revert to a working version within seconds if anything goes wrong.
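
As a rough illustration of points 5 and 9, here is a minimal, hypothetical provisioning sketch in Python. The package names and file paths are placeholders, and a real setup would more likely use a configuration management tool, but even a script like this, kept under version control, lets you rebuild a machine unattended.

```python
#!/usr/bin/env python3
"""Hypothetical provisioning sketch (points 5 and 9).

Package names and paths are made up for illustration; the point is that
the whole machine setup lives in a version-controlled, unattended script.
"""
import shutil
import subprocess

PACKAGES = ["nginx", "postgresql"]                 # assumed package names
CONFIG_FILES = {
    "config/nginx.conf": "/etc/nginx/nginx.conf",  # repo path -> target path
}


def run(cmd):
    # Fail loudly: an unattended script must stop on the first error.
    subprocess.run(cmd, check=True)


def provision():
    run(["apt-get", "update"])
    run(["apt-get", "install", "-y", *PACKAGES])
    for src, dst in CONFIG_FILES.items():
        shutil.copy(src, dst)                      # deploy version-controlled config
    run(["systemctl", "restart", "nginx"])


if __name__ == "__main__":
    provision()
```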


Side note unrelated to the question: your second paragraph is more personal opinion than fact. The reason I knock Windows is that it's so much easier to apt-get install something (given that it works most of the time) than to do what I used to on Windows: search for the software's website, hunt for the download button (often hidden for some unknown reason), then try to install the software and finally spend an hour configuring it, because the default configuration is surprisingly unusable for anyone.

OTHER TIPS

Note that there are so many package managers because people keep thinking it's an easy task and reinventing the wheel. The underlying problem is that software can interact in unexpected ways.

Most packaging systems which have versioned dependencies start off with "greater than or equal":

progX v2.8 requires libfoo >= v1.1

Then someone releases libfoo 2.0, which changes the interface and breaks progX. So now the developer of progX decides to use exact dependencies:

progX v2.9 requires libfoo == v2.0

Then a security hole is discovered in libfoo, and everyone needs to upgrade urgently! But the exact dependency prevents this: the package manager either uninstalls progX or refuses to upgrade libfoo.
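
To make the two failure modes concrete in Python terms (pip was mentioned above), here is a small sketch using the third-party packaging library, with the hypothetical package and version numbers from above: the `>=` constraint happily admits the breaking 2.0 release, while the exact `==` pin locks out the 2.1 security fix.

```python
# Sketch of both failure modes using the third-party "packaging" library
# (pip install packaging). libfoo and its versions are hypothetical.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

loose = SpecifierSet(">=1.1")   # progX v2.8: anything newer is accepted...
exact = SpecifierSet("==2.0")   # progX v2.9: exactly one version is accepted

print(Version("2.0") in loose)  # True  -> the breaking 2.0 release slips in
print(Version("2.1") in exact)  # False -> the 2.1 security fix is locked out
```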

You might think you can avoid this by having libfoo1.1, libfoo2.0 and libfoo2.1 installed all at once. This mostly works, provided they can all be kept in distinct directories, you don't mind the disk space consumption, and they don't have to put files in some common location. But if libraries can depend on other libraries, this will fail: very few languages/systems will let you link libfoo1.1 and libfoo2.0 into the same program.
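
The same sketch shows why "just install every version" breaks down once a single program sits on top: if one dependency pins libfoo == 1.1 and another pins libfoo == 2.0, their combined constraint is unsatisfiable, so no version the resolver could pick will ever satisfy both (the names are, again, hypothetical).

```python
# Continuing the sketch: two dependencies of the same program pin different
# exact versions of libfoo, so the combined constraint matches nothing.
from packaging.specifiers import SpecifierSet
from packaging.version import Version

needs_of_dep_a = SpecifierSet("==1.1")
needs_of_dep_b = SpecifierSet("==2.0")
combined = needs_of_dep_a & needs_of_dep_b      # equivalent to "==1.1,==2.0"

for candidate in ("1.1", "2.0", "2.1"):
    print(candidate, Version(candidate) in combined)  # False for every candidate
```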

The other solution is to just bundle all the dependencies into progX. This is the approach taken by Windows, Android and iOS. You don't have external dependencies other than the operating system and its libraries. Hardly perfect ("this app won't work on your phone because it requires newer Android and your manufacturer won't upgrade", plus no automatic security upgrades), but you have basically reduced the number of points of failure to one.

I'm not familiar with the cabal tool you mention, but it seems to be the same basic problem: https://www.fpcomplete.com/user/simonmichael/how-to-cabal-install and https://hackage.haskell.org/package/cabal-nirvana imply that it's due to packages requiring incompatible versions of other packages.

This may require a social solution of coordinating package releases to avoid the problem. Apt works because Debian's coordination and release policies make sure it does, and Ubuntu builds on this with a small team of smart people under a single employer.

(I'm disappointed that the functional programming community has such poor dependency resolution, as it's the sort of mathematical graph problem that FP people and tools ought to excel at. They should look at apt/dpkg and see how it handles this.)

Licensed under: CC-BY-SA with attribution