Why do you not use CPAN modules? [closed]

https://stackoverflow.com/questions/678393

perl
cpan

21-08-2019

Question

ETA: When I ask "Why do you not use CPAN modules?", I am referring to the people who refuse to use any CPAN modules (including high quality ones like DBI). Not all CPAN code is of high quality, and it is fine to stay away from modules that are trivial or are based on experimental code (I got annoyed at a developer the other day for wanting to bring in Time::Format just because he didn't know that strftime was in POSIX).
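For anyone who has not met it, here is a minimal sketch of the strftime that ships with core Perl in the POSIX module. The format string and the fixed epoch value are just illustrative choices, not anything from the original discussion.

```perl
use strict;
use warnings;
use POSIX qw(strftime);

# Format a fixed epoch value in UTC so the result does not depend on
# the local time zone of the machine running the script.
my $stamp = strftime('%Y-%m-%d %H:%M:%S', gmtime(0));

print "$stamp\n";
```

No CPAN install is needed for this, since POSIX is part of the standard distribution.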

Recently on Perl Beginners, someone wanted to know how to do something without resorting to the Perl module commonly suggested for that function. He or she did not want to install the module from CPAN. This made me think about the reasons I have seen people avoid using CPAN and I came up with six reasons for this behaviour and the solution for each one:

1. they scare you (answer, get over it)
2. they scare your sysadmins (answer, work around them by installing in your home directory and using the lib pragma)
3. you are using a hosting service that prevents you from installing modules (answer, get a better service, there are cheap services that don't behave like morons)
4. the target machine doesn't necessarily have the needed module (answer, use PAR or PAR::Packer)
5. the target machine is totally locked down (i.e. you log in to rbash and have to provide code to a third party for inclusion on the box) (answer, a combination of 4 and going through the bureaucracy)
6. you are using an embedded version of Perl that can't load modules (no answer, you are stuck, but this is very rare)

So, if you don't use CPAN, why, and why are the answers above not adequate? Note, I am not asking why you don't install directly from CPAN on production boxen, I am asking why you avoid using the modules from CPAN (installing via packaging systems counts as using CPAN to me).

Solution

You may have the Perl scripting engine embedded in a host application (for example a webserver, or any complex application requiring scripting), and have a whole lot of restrictions in that embedded context, like not being able to load files.

OTHER TIPS

There are a few reasons that I sometimes counsel people not to use certain CPAN modules. Not all of CPAN is high-quality code, and there are varying levels of maintenance for different distributions. Everyone should consider how much work they have to do to use a particular CPAN module and what that module saves them (i.e. total cost of ownership). Using any particular CPAN module is not always a benefit. I don't say that people should not use any of CPAN, but they should consider what they really need from it.

An external module dependency allows someone else to break your application. The CPAN toolchain only ever cares about the latest version of a module and may upgrade your installation when it sees you have an earlier version. I've seen many applications break when the underlying external dependencies introduce new bugs, deprecate needed features, and so on. It's one of the reasons I've been developing my tools for companies to host their own CPAN repositories so they can control that. There are other ways to mitigate that, but not many people are sophisticated enough to have a good process for it.
You work in an environment where all code has to be approved. This seems like a silly requirement to a lot of people, but the risk management people have a job to do too. Sometimes that compliance is mandated by various laws, standards of care, and so on. Unless the module is really going to save a lot of time and energy, the benefit may not be worth the effort to go through that process. Really, how many of you ever seriously inspect the code you get from CPAN? There could be anything in there.
Some CPAN modules implement trivially-coded functionality. Using a module just because it's on CPAN and you don't want to write the three lines of code yourself is a bit silly. You can talk about code reuse all you like, but eventually that's reductio ad absurdum.
Installation of some modules can be quite tricky, fragile, and unpredictable, and sometimes this is due to the long list of dependencies to just build and test the module even though you don't need those dependencies to actually use the module. It takes a lot of work to handle these cases in automated testing environments.
Some CPAN authors are experimental coders, not maintainers. Creating dependencies on their work means you end up with an unsupported module that doesn't get patched and that no one else really cares about. Getting your patches accepted is a really big deal for some important projects, and you can't fix the unresponsive author without resorting to some process for using a locally patched version that isn't overwritten by the CPAN toolchain.
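One lightweight safeguard against the silent-upgrade problem in the first point is to have the application compare the installed module versions with the ones it was tested against. The following is only a sketch under stated assumptions: the helper name version_mismatches and the pinned version numbers are hypothetical, and this is not the private-repository tooling mentioned above.

```perl
use strict;
use warnings;

# Return a list of human-readable problems for modules whose installed
# version differs from the version the application was tested against.
sub version_mismatches {
    my (%pinned) = @_;
    my @problems;
    for my $module (sort keys %pinned) {
        # Turn Some::Module into Some/Module.pm so require can load it.
        (my $file = "$module.pm") =~ s{::}{/}g;
        if (!eval { require $file; 1 }) {
            push @problems, "$module: not installed";
            next;
        }
        my $have = $module->VERSION // 'undef';
        push @problems, "$module: tested with $pinned{$module}, found $have"
            if $have ne $pinned{$module};
    }
    return @problems;
}

# Hypothetical pins; replace them with the versions you actually tested.
my @problems = version_mismatches('List::Util' => '1.50', 'POSIX' => '1.76');
warn "$_\n" for @problems;
```

A check like this only detects drift after the fact. It does not stop the toolchain from upgrading anything, so it complements a controlled repository rather than replacing one.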

You don't escape these reasons with glib answers about using another service, installing in a local directory, and so on. You can't apply your counter arguments to every situation and setup. Anyone telling you otherwise, such as the top post in Top Seven (Bad) Reasons Not To Use Modules that Leon links to, isn't really thinking about anyone's situation, and there are many thoughtful counter-counter arguments.

Don't ever start from the position of thinking anyone should or shouldn't use CPAN. Evaluate the local situation, evaluate the risks and rewards, develop safeguards for the risks, and use modules wisely. That's not any different from any other sort of serious software development or business practice.

You may find this essay (and its comments) interesting.

I'm glad you asked.

I was thinking of asking a question like "Why does CPAN have to suck so much?" but decided it wasn't worth sacrificing my reputation when I (think I) already know the answer. And since this question is marked "subjective" I'll thank you for not moderating me down for giving my personal take on this issue, even if you think I am mistaken or stupid.

First some background: I did quite a lot of Perl coding in the mid-nineties and enjoyed it, but eventually concluded that the language lacked a lot of features that were needed for "real" object-oriented programming. I became a C++ developer for several years, and am now a very technical project manager. I still use Perl for scripts and data crunching and other bits and pieces, and have recently started using Perl scripts to test the web services our coders have developed.

Anyway, I came to Stack Overflow for the project management, but stayed for the Perl. I'm pleased to see the language has grown up and has all sorts of fantastic modules like Moose and MVC and templates and so on, and would like to be using them... and I will. But it is taking time, and I only have a few hours now and then to work with it. Why isn't it easy?

But to answer the question...

First the obvious answer: most Perl programs don't need CPAN modules.
There's more than one way to do it. I don't need modules to do a lot of things that I would use modules for if it was easy to do so. For example, I have been parsing XML documents with split() and regular expressions. I know it's wrong (but the first step to recovery is admitting you have a problem). But I can copy and paste the code to do this in a few seconds, or I could go away and try and make cpan work for another month or so.
Now let's get a little more controversial. CPAN is brittle. Earlier this year I tried to use cpan to install Moose because I had read great things about it and was keen to do proper OO programming in Perl and for it not to be hard/ugly. So I followed the install instructions, and pressed 'Y' hundreds of times (it seemed) before getting dumped on with pages and pages of compiler warnings in the final step. What the hell do I do now? My main dev box has some sort of half-broken Moose module just waiting (I am sure) to bite me in the ass when I least expect it. That was about two months ago, and I have not been back. I speculate that lots of CPAN modules have dependencies on other programming languages and that makes them more brittle (as opposed to languages whose libraries are coded in the same language). I further speculate that experiences like this scare potential users off.
Documentation for CPAN beginners is poor. Where is the authoritative CPAN documentation anyway? Where's the introduction and tutorials for beginners? And how was I supposed to know that? I have been reading CPAN documentation on and off for a few months, and am starting to figure out where things are. (I see that almost all individual Perl modules on CPAN are beautifully documented internally, but it took me a long time to find that documentation.)
The install process is too hard. Four steps and hundreds of prompts may have been okay ten years ago when there were fewer packages and fewer dependencies, but now it is just crappy. Why can't I just type something like 'cpan-install Moose' in my shell and have it be done? This is particularly weird, given that advanced users often claim portability is a virtue, citing things like packages and PAR that I still don't get. And why is installing locally even harder when so many people seem to want to do it?
There are vexing issues, like whether I should install CPAN modules with cpan or with the package management system, where advice is inconsistent. More generally: there is more than one way to do it. And when you start doing advanced Perl, you have to make decisions about how to install modules and what modules to use and where do you start? Remember you're a beginner and the documentation is kind of fragmented and the learning curve is steep. My solution has been to try and work around this by not using cpan while I read a little more.
Finally, advanced Perl has a very steep learning curve. Advanced Perl users apparently do not remember this and cannot see it. IMO there is a world of difference between using Perl as it was originally conceived -- as a practical extraction and reporting language with powerful regular expressions -- and using it as a modern development platform with OO and templates and MVC and all sorts of other goodies. I have yet to find a gentle, incremental path from casual Perl use to advanced Perl use.

So there you go. Apologies for the rant.

Installing Perl modules locally is a tad challenging. Here's my process:

Set up a user-localized CPAN config:

mkdir -p ~/.cpan/CPAN
touch ~/.cpan/CPAN/MyConfig.pm

If CPAN was previously set up for site-wide admin (meaning, you're on your own box and already fired up and configured CPAN), you can change to user-local by running "perl -MCPAN -e mkmyconfig". Then, edit "~/.cpan/CPAN/MyConfig.pm":

'makepl_arg' => q[LIB=/home/your_name/perllib],

Otherwise, you can start up CPAN normally: "perl -MCPAN -e shell" or simply "cpan". You'll be prompted for configuration. At the "Parameters for the 'perl Makefile.PL' command?" prompt, enter: "PREFIX=~ LIB=~/lib/perl5".

To reference locally installed modules in your Perl scripts, you can use the use lib pragma, but I think it's an annoying dependency when you have numerous Perl scripts and modules to update in your app. This is more of a workaround.
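For reference, the use lib approach looks roughly like this. It is a minimal sketch that assumes the modules went into ~/lib/perl5, as in the LIB setting above; adjust the path to whatever you configured.

```perl
use strict;
use warnings;

# Assumption: modules were installed under ~/lib/perl5.
# use lib runs at compile time and puts the directory at the front of @INC,
# so later use/require statements search it first.
use lib "$ENV{HOME}/lib/perl5";

# Show the resulting module search path.
print "$_\n" for @INC;
```

Every script that needs the local modules has to carry that line, which is exactly the maintenance cost described above.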

Instead, I can set the environment var PERL5LIB to the path where the module was installed locally, like "$HOME/lib/perl5". To set PERL5LIB for a CGI environment, figure out how to set environment variables in the server configuration. In Apache, I can do that in httpd.conf or in .htaccess using mod_env. (thanks, brian d foy)
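When PERL5LIB does not seem to take effect (for example under a web server), a quick diagnostic script can help. This is a sketch only: the helper name perl5lib_dirs is made up for illustration, and the script just reports whether each PERL5LIB entry shows up in @INC.

```perl
use strict;
use warnings;
use Config;

# Split a PERL5LIB-style value on the platform's path separator
# (":" on Unix-like systems, ";" on Windows) and drop empty entries.
sub perl5lib_dirs {
    my ($value) = @_;
    return grep { length } split /\Q$Config{path_sep}\E/, $value // '';
}

my %in_inc = map { $_ => 1 } @INC;
my @dirs   = perl5lib_dirs($ENV{PERL5LIB});

print "PERL5LIB is not set\n" unless @dirs;
for my $dir (@dirs) {
    printf "%s: %s\n", $dir, $in_inc{$dir} ? 'in @INC' : 'missing from @INC';
}
```

Note that perl ignores PERL5LIB when taint checks are enabled (the -T switch), so a directory reported as missing may point to that rather than to a server configuration problem.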

If things "scare sysadmins" enough, they don't want you to put them on the machine, regardless of where it is you think you'll put them. Shops have standards for a reason.

There is no distribution of liability with a CPAN module. In the shop I currently work in, we have such a deal with our encapsulated accounting software provider. We call them in the middle of the night if our app is down and we need their expertise. Because if our calculations mess up badly enough, our contract with them ensures that they will be paying part of the bill, depending on their exposure with a given issue.

When you get out into the real world, where Perl scripts can run alongside 40-year-old, established, crusty COBOL, you may understand how much more at ease managers are with running the COBOL than they are with "scripts" depending heavily on ardent hobbyists, however clever.

That said, my current shop is somewhat comfortable with Perl for non-critical scripts and reports, and will install the occasional CPAN module. The approval process is rigorous and the sandbox testing is long (and expensive!), but it makes installation possible. I can only imagine that they could approve one or two new modules, not 50+ new modules, because of how many new situations it would expose them to. So the modules created by the "Let's just use CPAN" crowd are pretty much out if any dependency says "Not recommended for production code" or "experimental".

Here's one valid reason I can think of: you want to figure out how to do it yourself. Which is fine, as long as you realize a production environment isn't for your own personal experiments.

One might say "well, look at how the CPAN module does it", but reading someone else's implementation is a poor substitute for doing it yourself. And honestly a lot of CPAN implementations are kind of terrifying. This might sound disparaging of CPAN code quality, but it's also a success story: CPAN modules are so well encapsulated and tested that for the most part you don't notice.

As to all the answers which are variations on "the CPAN shell is hard to set up", I agree. However, this is an O(1) problem. You solve it once and then you get easy access to CPAN for the rest of your life.

Some of the modules are based on open source libraries and do not compile or behave well across all the nutty environments that you have. Consider for example needing to run on NCR, HP, SUN, Linux, and AIX.

the target machine doesn't necessarily have the needed module

This can be valid in some environments. One of my friends works at a huge mega conglomerate spanning countries and continents. Frequently, he uses perl to make tape drives do things all over the world. The scripts must be deployed on literally thousands of machines and installing modules is a really big deal -- usually involving a committee and multiple sysadmins at each physical location. He tends to avoid them at all costs and I can't say I blame him.

Is there a solution for that? I really don't think there is.

(The above is a cut and paste from my answer to a really similar question on PerlMonks.)

http://perlmonks.org/?node_id=750387

(answer, use PAR or PAR::Packer)

I did suggest PAR to him once, but it wasn't practical at all. None of the machines are similar enough for PAR to really be useful in a general case. His options were: don't use modules or maintain 1300 PAR binaries. PAR is pretty hard to get working really well even when you definitely know the target platform, so he elected to not use modules.

The target host has an awkward operating system that is not well supported by CPAN modules (due in part to the lack of CPAN Testers).

Examples are AIX and HP-UX. They have an old perl, either no C compiler or an old/broken one, and old/broken libraries, so too many XS modules fail to install out of the box. Patching them takes time (especially when the CPAN author has not responded for months), and trying to work around XS modules is not possible in practice (well, if you really want to go that way, you'll too often have to patch pure-Perl modules that rely on XS modules).

This is an answer with the positive approach, i.e. it says how you can fix the restrictions keeping you from using CPAN modules: Yes, even you can use CPAN.

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow