Question

We all know that working with S3 is a pain: deleting a virtual directory requires deleting every object under its path, and so on. At least with the RESTful API this is the case.

I was wondering whether there would be any performance improvement if I used PHP to call GSUtil rather than my own PHP class. Is there anything special about the way GSUtil handles requests, or is it just the same REST wrapper?

The main issues I am having:

  • deleting big folders
  • uploading many small files
  • reading hierarchical data one level at a time (e.g. only the files and folders directly under the /foo path, but not their descendants)

Solution

Fundamentally, your PHP code and gsutil are both using the RESTful interface (gsutil is actually layered atop an open-source Python library called boto, which implements the bulk of the REST interface). However, there are several reasons to consider using gsutil:

  • Gsutil takes care of OAuth 2.0 authentication/authorization for you.
  • Gsutil does wildcard expansion which, for example, lets you remove all objects in a bucket with a single command: 'gsutil rm gs://bucket/*'
  • Gsutil has lots of other features (getting/setting ACLs and the associated XML parsing/building, listing bucket contents, dumping object contents, etc.) which you would have to implement yourself (or find in some other PHP library) if you bypass gsutil.
  • Gsutil has some nice performance capabilities for your "uploading many small files" use case. In particular, the -m option runs your uploads in parallel processes and threads, which provides a substantial performance boost.
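The three problem cases from the question map directly onto gsutil invocations. A minimal sketch follows; the bucket and path names are hypothetical, and a DRY_RUN guard (defaulting to on) just prints each command so the sketch can be exercised without gsutil installed or cloud credentials configured:

```shell
#!/bin/sh
# Sketch with hypothetical bucket/path names. With DRY_RUN=1 (the default)
# each command is only printed; set DRY_RUN=0 to actually execute gsutil.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "would run: $*"; else "$@"; fi; }

# 1. Deleting a big folder: -m parallelizes, -r recurses over the prefix.
run gsutil -m rm -r gs://my-bucket/big-folder

# 2. Uploading many small files: -m runs the copies in parallel
#    processes and threads.
run gsutil -m cp -r ./local-dir gs://my-bucket/uploads

# 3. Reading one level of the hierarchy: ls without -r lists only the
#    immediate children of the path.
run gsutil ls gs://my-bucket/foo/
```

Note that -m is a top-level gsutil option, so it goes before the subcommand (rm, cp), not after it.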

In summary, you can roll your own PHP code, but I think you'll get your job done faster and have access to more functionality if you leverage gsutil.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow