Question

Please help me get this done in Pig.

Input: 
record1: ("Amit", 123, 234, 345)
record2: (map : [
    "123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
    "234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
    "345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
    ])

The data above represents Amit's bank accounts, with the accountNo, bank and branch of each. record1 contains the name followed by three account numbers, which are ordered (i.e. they reflect the order in which Amit opened the accounts). I want the account details from record2 attached to the name, in that same order:

output: ("Amit", 
    "123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
    "234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
    "345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
    )

How do I achieve this?


Solution

I solved it using a UDF, MapToBag, defined here. It turns the map in record2 into a bag, which gave me access to the map's values; I then joined those with the ids from record1.
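The idea behind that answer can be sketched outside Pig. The following is plain Python, not Pig Latin, and not the actual MapToBag UDF (its source is not shown here); `map_to_bag` and `join_in_order` are hypothetical names chosen for illustration, and the sample data is copied from the question. It assumes the map keys are the account numbers as strings, as in the input above.

```python
# Stand-ins for the two records in the question.
record1 = ("Amit", 123, 234, 345)
record2 = {
    "123": {"accountNo": 123, "bank": "ICICI Bank", "branch": "Delhi"},
    "234": {"accountNo": 234, "bank": "HDFC Bank", "branch": "Mumbai"},
    "345": {"accountNo": 345, "bank": "SBI", "branch": "Bangalore"},
}


def map_to_bag(m):
    """Turn a map into a list of (key, value) pairs, similar to a bag of tuples."""
    return [(key, value) for key, value in m.items()]


def join_in_order(record, bag):
    """Join the ordered ids in `record` against the bag, keeping the record's order."""
    name, ids = record[0], record[1:]
    by_key = dict(bag)  # index the bag so each id is a single lookup
    # The map keys are strings while the ids are ints, so convert before lookup.
    matched = [(str(i), by_key[str(i)]) for i in ids if str(i) in by_key]
    return (name, matched)


result = join_in_order(record1, map_to_bag(record2))
print(result[0], [key for key, _ in result[1]])
```

Because the output is driven by the ids in record1 rather than by the map, the accounts come out in the order they were opened regardless of how the map happens to be stored.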

Other tips

You could flatten the map and then use a merge join, which will maintain the order:

https://wiki.apache.org/pig/PigMergeJoin
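To illustrate why a merge join keeps rows in order, here is a minimal Python sketch of the general sort-merge technique; it is not Pig's implementation. `merge_join` is a hypothetical helper, and it assumes both inputs are already sorted by key with unique keys on each side. Note that the output follows the sorted key order, so it matches the account-opening order only when the ids happen to sort that way, as they do in the sample data.

```python
def merge_join(left, right):
    """Join two lists of (key, value) pairs that are both sorted by key.

    Assumes keys are unique on each side, which keeps the sketch short.
    """
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk == rk:
            # Keys match: emit the joined row and advance both sides.
            out.append((lk, left[i][1], right[j][1]))
            i += 1
            j += 1
        elif lk < rk:
            i += 1  # left key has no partner yet, move left forward
        else:
            j += 1  # right key has no partner yet, move right forward
    return out


# Ids from record1 (paired with the name) and the flattened map from record2.
ids = [(123, "Amit"), (234, "Amit"), (345, "Amit")]
accounts = [(123, "ICICI Bank"), (234, "HDFC Bank"), (345, "SBI")]
joined = merge_join(ids, accounts)
print(joined)
```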

Licensed under: CC-BY-SA with attribution