Can HCatalog in Apache Pig just load a specific partition?

Question 1

I see two parts to your question.

Part 1) Referring to https://cwiki.apache.org/confluence/display/Hive/HCatalog+LoadStore: "But I am afraid this first loads the whole data into bag a, and only then filters it in b. Am I correct or not?"

Ans 1) No. When you apply a filter on the partition columns immediately after the load statement, HCatalog is smart enough to load only the partitions that the filter selects.
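As a minimal sketch of that pattern: the table name web_logs and the partition column datestamp are illustrative assumptions, and the loader uses the same org.apache.hcatalog.pig package as the rest of this answer (newer Hive releases ship it under org.apache.hive.hcatalog.pig instead).

```pig
-- Run with: pig -useHCatalog script.pig
-- 'web_logs' and 'datestamp' are assumed names, not from the question.
a = load 'web_logs' using org.apache.hcatalog.pig.HCatLoader();

-- The filter on the partition column comes right after the load,
-- so it can be pushed into the loader and only that partition is read.
b = filter a by datestamp == '20110924';

dump b;
```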

Part 2) LOAD DATA INPATH 'filepath' INTO TABLE tablename [PARTITION (partcol1=val1, partcol2=val2 ...)]

What is the equivalent of this construct in Pig with HCatalog?

Ans 2) Yes, you can use: store a into 'tablename' using org.apache.hcatalog.pig.HCatStorer('partcol1=val1,partcol2=val2');

e.g.: store a into 'tablename' using org.apache.hcatalog.pig.HCatStorer('datestamp=20110924');
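Put together as a rough equivalent of the Hive LOAD DATA statement, a script might look like the sketch below. The input path, field names, and table name are hypothetical, and the fields loaded must match the table's non-partition columns.

```pig
-- Run with: pig -useHCatalog script.pig
-- Hypothetical input file and schema; adjust to the real table definition.
a = load '/data/incoming/web_logs.tsv' using PigStorage('\t')
        as (userid:chararray, url:chararray);

-- Write every record of a into the single partition datestamp=20110924.
store a into 'web_logs'
    using org.apache.hcatalog.pig.HCatStorer('datestamp=20110924');
```

If the partition column is already present in the data, HCatStorer can also be called with no argument so that each record goes to the partition named by its own value.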

Please drop a comment if you have any doubts.

Thanks

Question 2

The documentation states that if the load statement (using HCatLoader()) is immediately followed by a filter on the partition columns, the loader reads only the specified partitions, rather than loading the entire dataset and then filtering out records.

From the book "Programming Pig":

"HCatalog includes the load function HCatLoader. The location string for HCatLoader is the name of the table. It implements LoadMetadata, so you do not need to specify the schema as part of your load statement; Pig will get it from HCatLoader. Also, because it implements this interface, Pig can work with HCatalog’s partitioning. If you place the filter statement that describes which partitions you want to read immediately after the load, Pig will push that into the load so that HCatalog returns only the relevant partitions. "
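The same placement rule applies when several partitions are wanted. A small sketch, assuming a table web_logs partitioned by a string column datestamp (both names are illustrative):

```pig
-- No schema is declared: Pig obtains it from HCatLoader.
logs = load 'web_logs' using org.apache.hcatalog.pig.HCatLoader();

-- A range condition on the partition column, placed directly after the load,
-- restricts the read to the partitions that fall inside the range.
sept = filter logs by datestamp >= '20110901' and datestamp <= '20110930';

describe sept;
```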

The book is very good, and currently offered as open source material here: http://chimera.labs.oreilly.com/books/1234000001811/ch12.html#cassandra