Question

Please help me get this done in Pig.

Input: 
record1: ("Amit", 123, 234, 345)
record2: (map : [
    "123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
    "234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
    "345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
    ])

The data above represents Amit's bank accounts, with the accountNo, bank and branch of each. Record1 contains the name followed by three account numbers, which are ordered (i.e. they represent the order in which Amit opened the accounts).

Output: ("Amit", 
    "123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
    "234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
    "345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
    )

How do I achieve this?

Solution 2

I solved it using a UDF, MapToBag, defined here. That gave me access to the values of the map in record2, which I then joined with the ids from record1.
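
The logic of that approach can be sketched in plain Python (not Pig Latin). This is only an illustration under assumptions: `map_to_bag` and `join_in_order` are hypothetical names, not the UDF from the answer, and the account tuples are modeled as dictionaries.

```python
def map_to_bag(account_map):
    """Turn a map into a list (a "bag") of (key, value) pairs."""
    return [(key, value) for key, value in account_map.items()]


def join_in_order(record1, account_map):
    """Join the ordered ids of record1 with the map entries of record2."""
    name, *account_ids = record1
    # Index the bag by key so each id can be looked up directly.
    by_key = dict(map_to_bag(account_map))
    # Iterate over the ids from record1, so their order is preserved
    # regardless of the order of the entries in the map.
    ordered = [(str(i), by_key[str(i)]) for i in account_ids if str(i) in by_key]
    return (name, ordered)


record1 = ("Amit", 123, 234, 345)
# Map entries deliberately listed out of order.
record2 = {
    "345": {"accountNo": 345, "bank": "SBI", "branch": "Bangalore"},
    "123": {"accountNo": 123, "bank": "ICICI Bank", "branch": "Delhi"},
    "234": {"accountNo": 234, "bank": "HDFC Bank", "branch": "Mumbai"},
}
result = join_in_order(record1, record2)
```

Because the loop is driven by the ids in record1, the result follows the order in which the accounts were opened, whatever order the map happens to be stored in.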

Other tips

You could flatten the map and then use a merge join, which will maintain the order:

https://wiki.apache.org/pig/PigMergeJoin
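
A merge join walks two inputs that are both already sorted on the join key. The plain Python sketch below illustrates why its output follows the key order. It is not Pig code: `merge_join` is a hypothetical helper, keys are assumed to be unique on both sides, and the account ids from record1 are assumed to be in ascending order, as they happen to be in the example.

```python
def merge_join(left, right):
    """Join two lists of (key, value) pairs, each sorted ascending by unique key."""
    out = []
    i = j = 0
    while i < len(left) and j < len(right):
        left_key, left_value = left[i]
        right_key, right_value = right[j]
        if left_key == right_key:
            # Matching keys: emit the joined row and advance both sides.
            out.append((left_key, left_value, right_value))
            i += 1
            j += 1
        elif left_key < right_key:
            i += 1  # No match for this left key; skip it.
        else:
            j += 1  # No match for this right key; skip it.
    return out


# Ids from record1, one row per account, already in ascending order.
ids = [(123, "Amit"), (234, "Amit"), (345, "Amit")]
# Flattened map entries from record2, sorted by account number.
accounts = [
    (123, ("ICICI Bank", "Delhi")),
    (234, ("HDFC Bank", "Mumbai")),
    (345, ("SBI", "Bangalore")),
]
joined = merge_join(ids, accounts)
```

If the ids in record1 were not sorted by account number, this kind of join would not apply directly, and an explicit position column would be needed to restore the opening order.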

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow