Question

I want to get the size of a file on disk in megabytes. Using the -s operator gives me the size in bytes, but I'm going to assume that then dividing this by a magic number is a bad idea:

my $size_in_mb = (-s $fh) / (1024 * 1024);

Should I just use a read-only variable to define 1024, or is there a programmatic way to obtain the number of bytes in a kilobyte?

EDIT: Updated the incorrect calculation.

Solution

If you'd like to avoid magic numbers, try the CPAN module Number::Bytes::Human.

use Number::Bytes::Human qw(format_bytes);
my $size = format_bytes(-s $file); # 4.5M

OTHER TIPS

You could of course create a function for calculating this. That is a better solution than creating constants in this instance.

sub size_in_mb {
    my $size_in_bytes = shift;
    return $size_in_bytes / (1024 * 1024);
}

No need for constants. Changing the 1024 to some kind of variable/constant won't make this code more readable.

Well, there's not 1024 bytes in a meg, there's 1024 bytes in a K, and 1024 K in a meg...

That said, 1024 is a safe "magic" number that will never change in any system you can expect your program to work in.

I would store this in a named variable rather than use a magic number. Even if a magic number is never going to change, like the number of bytes in a megabyte, using a well-named constant is good practice because it makes your code more readable. It makes your intention immediately apparent to everybody else.
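A minimal sketch of that approach, using the core constant pragma. The names BYTES_PER_KB, BYTES_PER_MB and size_in_mb are illustrative and do not come from the question:

```perl
use strict;
use warnings;

# Named constants instead of bare numbers (names are illustrative).
use constant BYTES_PER_KB => 1024;
use constant BYTES_PER_MB => BYTES_PER_KB * BYTES_PER_KB;

# Convert a byte count to megabytes, taking 1 MB as 1024 * 1024 bytes.
sub size_in_mb {
    my ($size_in_bytes) = @_;
    return $size_in_bytes / BYTES_PER_MB;
}

# -s returns undef when the file cannot be found, so check before dividing.
my $file = shift @ARGV;
if ( defined $file ) {
    my $bytes = -s $file;
    if ( defined $bytes ) {
        printf "%s: %.2f MB\n", $file, size_in_mb($bytes);
    }
    else {
        warn "Cannot get the size of $file: $!\n";
    }
}
```

The constants are resolved at compile time, so there should be no runtime cost compared with writing the literal.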

This is an old question that has already been answered correctly, but in case your program is restricted to the core modules and you cannot use Number::Bytes::Human, here are several other options I have collected over time. I have kept them because each one uses a different Perl approach and is a nice example of TIMTOWTDI:

  • example 1: uses state to avoid reinitializing the variable on each call (state needs Perl 5.10 or later and has to be enabled with use feature 'state', use v5.10 or higher, or perl -E)

http://kba49.wordpress.com/2013/02/17/format-file-sizes-human-readable-in-perl/

    sub formatSize {
        my $size = shift;
        my $exp = 0;

        state $units = [qw(B KB MB GB TB PB)];

        for (@$units) {
            # Stop at the last unit so $exp never runs past the end of @$units.
            last if $size < 1024 || $exp == $#$units;
            $size /= 1024;
            $exp++;
        }

        return wantarray ? ($size, $units->[$exp]) : sprintf("%.2f %s", $size, $units->[$exp]);
    }
  • example 2: using sort map

sub scaledbytes {

    # http://www.perlmonks.org/?node_id=378580
    (sort { length $a <=> length $b 
          } map { sprintf '%.3g%s', $_[0]/1024**$_->[1], $_->[0]
                }[" bytes"=>0]
                ,[KB=>1]
                ,[MB=>2]
                ,[GB=>3]
                ,[TB=>4]
                ,[PB=>5]
                ,[EB=>6]
    )[0]
  }
  • example 3: Take advantage of the fact that 1 GB = 1024 MB, 1 MB = 1024 KB and 1024 = 2 ** 10, so a right shift by 10 bits divides by 1024:

# http://www.perlmonks.org/?node_id=378544
my $kb = 1024 * 1024; # 1 GB expressed in KB

my $mb = $kb >> 10;
my $gb = $mb >> 10;

print "$kb kb = $mb mb = $gb gb\n";
__END__
1048576 kb = 1024 mb = 1 gb
  • example 4: use of ++$n and ... until .. to obtain an index for the array

#! perl -slw
# http://www.perlmonks.org/?node_id=378542
use strict;

sub scaleIt {
    my( $size, $n ) =( shift, 0 );
    ++$n and $size /= 1024 until $size < 1024;
    return sprintf "%.2f %s",
           $size, ( qw[ bytes KB MB GB ] )[ $n ];
}

my $size = -s $ARGV[ 0 ];

print "$ARGV[ 0 ]: ", scaleIt $size;  

Even if you can not use Number::Bytes::Human, take a look at the source code to see all the things that you need to be aware of.

1) You don't want 1024. That gives you kilobytes. You want 1024*1024, or 1048576.

2) Why would dividing by a magic number be a bad idea? It's not like the number of bytes in a megabyte will ever change. Don't overthink things too much.

Don't get me wrong, but: I think that declaring 1024 as a Magic Variable goes a bit too far, that's a bit like "$ONE = 1; $TWO = 2;" etc.

A kilobyte has been falsely defined as 1024 bytes for more than 20 years, and I seriously doubt that the operating system vendors will ever correct that bug and change it to 1000.
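For context, the IEC binary prefixes give the 1024-based units their own names (KiB, MiB and so on), while the SI prefixes are 1000-based. A short sketch of how far apart the two readings of "megabyte" are for the same byte count; the constant and sub names are made up for illustration:

```perl
use strict;
use warnings;

use constant BYTES_PER_MIB => 1024 * 1024;   # binary megabyte (mebibyte)
use constant BYTES_PER_MB  => 1000 * 1000;   # decimal (SI) megabyte

# Return the size under both conventions for a given byte count.
sub both_sizes {
    my ($bytes) = @_;
    return ( $bytes / BYTES_PER_MIB, $bytes / BYTES_PER_MB );
}

my ( $mib, $mb ) = both_sizes(5_000_000);
printf "%.2f MiB vs %.2f MB\n", $mib, $mb;   # prints "4.77 MiB vs 5.00 MB"
```

Whichever convention you pick, naming the divisor makes it clear to the reader which one was intended.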

What could make sense though is to declare non-obvious stuff, like "$megabyte = 1024 * 1024" since that is more readable than 1048576.

Since the -s operator returns the file size in bytes you should probably be doing something like

my $size_in_mb = (-s $fh) / (1024 * 1024);

and use int() if you need a round figure. It's not like the dimensions of KB or MB are going to change anytime in the near future :)
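Keep in mind that int() truncates toward zero rather than rounding to the nearest value. A small sketch of the difference, using a hypothetical byte count in place of a real file:

```perl
use strict;
use warnings;

my $bytes      = 4_980_736;                  # hypothetical size: exactly 4.75 * 1024 * 1024
my $size_in_mb = $bytes / ( 1024 * 1024 );   # 4.75

my $truncated = int($size_in_mb);              # drops the fractional part
my $rounded   = sprintf "%.0f", $size_in_mb;   # rounds to the nearest whole number

print "int: $truncated, rounded: $rounded\n";  # prints "int: 4, rounded: 5"
```

If you want to keep a fraction, sprintf "%.2f" works the same way with two decimal places.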

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow