Question

First - this is not meant to be a 'which is better' flame-war thread. Rather, I genuinely need help making an architecture decision / argument to put forward to my boss.

Skipping the details - I simply just would love to know and find the results of anyone who has done some performance comparisons of Shell vs. [insert interpreted general-purpose programming language here], such as C# or Java...

Surprisingly, I have spent some time searching Google and this site without finding any such data. Has anyone ever done these comparisons across different use cases? For example: hitting a database in a loop of some number of iterations, running different types of SQL queries (Oracle preferred, but MSSQL would do) such as any of the CRUD operations; and also, without hitting a database, a plain 50k-iteration loop doing different types of calculations, and things of that nature.

In particular - for right now, I need a comparison of hitting an Oracle DB from a shell script vs., let's say, C# (again, any interpreted GPPL would be fine, even the higher-level ones like Python). But I also need to know about standard programming calculations, instructions, etc.

Before you ask 'why not just write a quick test yourself?', the answer is: I've been a Windows developer my whole life/career and have very limited knowledge of shell scripting, not to mention *nix as a whole. So asking the question here of the more experienced guys would be greatly beneficial, not to mention time-saving, as we are in a near-perpetual deadline crunch as it is ;).

Solution

Once upon a time, ye olde The Great Computer Language Shootout did include some shell scripts.

So, courtesy of the Internet Archive, from 2004 -

Note that there were no shell-script programs for many of the tests.

Language   Score   Missing tests
Java       20      1
Perl       16      0
Python     16      0
gawk       12      6
mawk       10      6
bash       7       12

Note shell scripts can sometimes be small and fast :-)

"Reverse a file"

Language   CPU (sec)   Mem (KB)   Lines of code
bash       0.0670      1464       1
C gcc      0.0810      4064       59
Python     0.3869      13160      6

OTHER TIPS

It is highly dependent on what the script is doing. I've seen poorly written shell scripts sped up by one, two, even three orders of magnitude by making simple changes.

Typically, a shell script is simply some glue logic that runs utilities that are usually compiled C or C++. If that's the case, there may not be much that can be done to speed things up. If the grunt work is being done by a poorly written utility that's compiled, it's just doing a lot of wasted effort really fast.
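As a rough illustration of the kind of simple change meant here, the sketch below counts the even numbers up to N in two ways. The function names and the task are made up for the example; the only point is that the first variant starts an external `expr` process on every iteration, while the second does the same arithmetic inside the shell.

```shell
#!/bin/sh
# Hypothetical example: count the even integers in 1..N.

# Slow variant: launches the external `expr` utility once per iteration.
count_even_slow() {
    local n="$1"
    local i=1
    local count=0
    while [ "$i" -le "$n" ]; do
        if [ "$(expr "$i" % 2)" -eq 0 ]; then
            count=$((count + 1))
        fi
        i=$((i + 1))
    done
    echo "$count"
}

# Faster variant: same logic, using only the shell's built-in arithmetic.
count_even_fast() {
    local n="$1"
    local i=1
    local count=0
    while [ "$i" -le "$n" ]; do
        if [ $((i % 2)) -eq 0 ]; then
            count=$((count + 1))
        fi
        i=$((i + 1))
    done
    echo "$count"
}

count_even_slow 200
count_even_fast 200
```

In Bash, prefixing each call with `time` should show the gap; how large it is depends on the system and on how expensive process creation is there.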

That said, Python or Perl will usually be much faster than a shell script for logic that runs in the script itself, and code running on a VM or compiled to native code will be faster yet.

Since you can't tell us any details, we can't really provide specific help.

If you want to see a simple demonstration for comparison, try my pure-Bash implementation of hexdump and compare it to the real thing:

$ time ./bash-hexdump /bin/bash > /dev/null
real    7m17.577s
user    7m2.570s
sys     0m14.745s
$ time hexdump -C /bin/bash > /dev/null
real    0m2.459s
user    0m2.260s
sys     0m0.176s

One reason the Bash version is slow is that it reads the file character by character, which is necessary to handle null bytes (shells aren't very good at handling binary data), but the primary reason is the shell's speed of execution. For comparison, here is the timing of a Python script I found:

$ time ./hexdump.py /bin/bash > /dev/null
real    0m11.694s
user    0m11.605s
sys     0m0.040s

I simply just would love to know and find the results of anyone who has done some performance comparisons of...

The abiding lesson of such comparisons is that the particular details matter - a lot.

Not only the particular details of the task, but (shouldn't we know this as programmers?) the particular details of how the shell script is written.

So can you find someone who understands that shell language and can check that shell script was written in an efficient way? (Wouldn't it be nice if changing a couple of lines took it from 40 minutes to 5 minutes.)

While this doesn't include "Shell" (aka sh/bash/ksh/PowerShell) languages, it is a relatively large list of "language [implementation] performance" results, packed full of generalities and caveats. In any case, someone may enjoy it.

http://benchmarksgame.alioth.debian.org/

If you are writing code and have concerns about processing speed, you should be writing code that is either compiled to native machine code or compiled for a modern VM.

But... with Moore's Law kicking up processing power every 18 months, I wonder: are the performance requirements really necessary? Even interpreted code runs incredibly fast on most modern systems, and it's only going to get better with time. Do you really need the kind of speed improvements that compiled code would give you?

If the answer is no, then write in whatever makes you happy.

The shell cannot run SQL queries by itself; a script has to launch an external client program (for Oracle, typically sqlplus) to do so. Languages that run on a VM take a little time up front to start the VM, but otherwise the difference should be negligible.

If the goal really is to cut the run time from 40 minutes to 5, then I would first try to find out which piece is taking the majority of the time. If the query itself runs the longest, then switching languages won't help you much.

Again (without much detail in the question) I would start with looking into different components of the system to see which one is the bottleneck.
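One low-effort way to look for the bottleneck is to wrap each stage of the existing script in a small timing helper. The sketch below is only an assumption about how that might look: `timed` and the stage label are hypothetical names, `sleep 1` stands in for a real stage, and it relies on `date +%s`, which GNU and BSD `date` provide but which only has one-second resolution.

```shell
#!/bin/sh
# Hypothetical helper: run a command, then report its wall-clock time
# (in whole seconds) on stderr, and pass the command's exit status through.
timed() {
    local label="$1"
    shift
    local start end status
    start=$(date +%s)
    "$@"
    status=$?
    end=$(date +%s)
    echo "$label: $((end - start))s" >&2
    return "$status"
}

# Placeholder stage; in a real script this would be, for example,
# the database call or the file-processing step.
timed "placeholder-stage" sleep 1
```

If one stage accounts for nearly all of the elapsed time, that stage is the one worth rewriting or tuning, whatever language the rest is in.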

Just did this very simple benchmark on my system, and the results are as expected.

Add up all integers between 1 and 50,000 and output answer at each step

Bash: 3 seconds
C: 0.5 seconds
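The script behind these numbers was not posted, so the following is only a guess at what the shell side of such a benchmark might look like. The function name `running_sum` is invented for the example; it adds the integers from 1 to N and prints the running total at each step.

```shell
#!/bin/sh
# Hypothetical reconstruction: running total of the integers 1..N,
# printing the total after each addition.
running_sum() {
    local n="$1"
    local i=1
    local total=0
    while [ "$i" -le "$n" ]; do
        total=$((total + i))
        echo "$total"
        i=$((i + 1))
    done
}

# Discard the 50,000 lines of output so only the loop itself is exercised.
running_sum 50000 > /dev/null
```

Running it under `time` in Bash gives a figure to set against an equivalent C program; the result will vary with the machine and with where the output is sent.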

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow