Question

I'm developing a pipeline for processing text that will go into production. The question I keep asking myself is: should I stick to one language for the project when I'm looking for a tool to do a particular task (e.g. NLTK, PDFMiner, CLD, CRFsuite, etc.)?

Or is it OK to mix and match languages on the project? That is, I pick the best tool regardless of what language it's written in (e.g. OpenNLP, ParsCit, poppler, CRF++, etc.) and wrap my code around it?

Note that I am not asking whether a developer should stick to just one language for their whole career.

Solution

In a perfect world, we'd all be using the One True Language™. The reality is somewhat different.

  1. If you insist on a single language, you may be excluding many tools from your toolbelt, regardless of the language you choose.

  2. Some applications are impossible or impractical to write in a single language. Web applications are a good example of this; unless you also run JavaScript on the server (for example with Node.js), you're almost certainly going to be using different programming languages for the client and the server.

  3. By limiting yourself to one language, you are depriving yourself of paradigms, software patterns and other ideas that are present in other languages, some of which you can apply to your language of choice once you learn them.

In practice, however, there's a lot one can do on a single platform. Most of the popular programming ecosystems have a wealth of tools available in their native language; choose language interop only when you must have functionality that you can't get any other way.
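When interop really is the only way to get a piece of functionality, one common approach is to keep the foreign tool behind a single thin wrapper so the rest of the pipeline stays in one language. The following is a minimal sketch of that idea, not a definitive implementation. It assumes the tool is a command-line program that reads text on stdin and writes text to stdout. The names `run_tool`, `ToolError`, `UPPERCASE_TOOL` and `shout` are hypothetical, and a small Python one-liner stands in for the real external tool so the example runs anywhere.

```python
import subprocess
import sys


class ToolError(RuntimeError):
    """Raised when the wrapped external tool cannot be run or fails."""


def run_tool(command, input_text="", timeout=60):
    """Run an external program, send it text on stdin, and return its stdout.

    `command` is a list such as ["some-tool", "--flag"]. All process
    handling lives here, so callers never touch subprocess directly.
    """
    try:
        result = subprocess.run(
            command,
            input=input_text,
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        # Missing executable, permission problem, or a hung tool.
        raise ToolError(f"could not run {command[0]}: {exc}") from exc

    if result.returncode != 0:
        raise ToolError(
            f"{command[0]} exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.stdout


# Stand-in for a real tool written in another language (an assumption for
# illustration): a child process that upper-cases whatever it reads.
UPPERCASE_TOOL = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write(sys.stdin.read().upper())",
]


def shout(text):
    """Pipeline-facing function; hides the fact that another process does the work."""
    return run_tool(UPPERCASE_TOOL, text)
```

A real pipeline would substitute the command line of whichever tool it wraps. The design choice is that only this one module knows an external process is involved, which keeps the cost of the extra language contained and makes the tool easier to replace later.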

OTHER TIPS

I have found that large multi-programmer, multi-year projects are best served with a single language, while small, one-person projects are best served with a "whatever works" policy.

The issue is maintenance and bringing in new programmers. If you have a large project that spans many years, then you have a significant investment in the code base. When you recruit, you can look for people who know the single technology that your project uses, and programmers who don't know it can learn it. If you have a project that uses 10 different technologies, each of which is best at what it does, you'll have a situation where some programmers can't work on some parts, or else you'll only be able to hire people who know all of the core technologies.

If you have a small project, then the only technologies it uses will be those known to the solo developer. A "whatever works" mix like this is a mess to maintain over time. Chances are, though, that you won't need to maintain it.

We had a project that grew from a small one to a big one. At year 4 we realized that we had written code in C++, Java, Python, Perl and SQL. We used every interprocess communication system available in Unix. We found it nearly impossible to hire people and, when we did, they couldn't work on the majority of our code base. Things did not work out well.

Licensed under: CC-BY-SA with attribution