I solved it using a UDF, MapToBag, defined here. That gave me access to the values of the map in record2, which I then joined with the ids from record1.
Pig: Multiple join statements in a single statement
16-07-2023
Question
Please help me get this done in Pig.
Input:
record1: ("Amit", 123, 234, 345)
record2: (map : [
"123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
"234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
"345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
])
The data above represents Amit's bank accounts, with the accountNo, bank and branch of each. record1 contains the name followed by three account ids, which are ordered (i.e. they represent the order in which Amit opened the accounts).
output: ("Amit",
"123" : ("accountNo": 123, "bank": "ICICI Bank", "branch" : "Delhi"),
"234" : ("accountNo": 234, "bank": "HDFC Bank", "branch" : "Mumbai"),
"345" : ("accountNo": 345, "bank": "SBI", "branch" : "Bangalore"),
)
How do I achieve this?
Solution 2
Other tips
You could flatten the map; a merge join will then maintain the order.
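To make the intended result concrete, here is a small plain-Python model of the data flow (not Pig code): the map is flattened into rows, and each id from record1 is then looked up in its original order. The function names `flatten_map` and `join_in_order` are hypothetical, and the lookup is an ordinary dictionary lookup, so this only sketches the expected output and does not reproduce Pig's merge join or the MapToBag UDF.

```python
# Illustrative model of the question's data; names below are hypothetical.
record1 = ("Amit", 123, 234, 345)
record2 = {
    "123": {"accountNo": 123, "bank": "ICICI Bank", "branch": "Delhi"},
    "234": {"accountNo": 234, "bank": "HDFC Bank", "branch": "Mumbai"},
    "345": {"accountNo": 345, "bank": "SBI", "branch": "Bangalore"},
}


def flatten_map(account_map):
    """Turn the map into a list of (key, details) rows, one row per entry."""
    return [(key, details) for key, details in account_map.items()]


def join_in_order(record, rows):
    """Match each account id in the record to its row, keeping the record's order."""
    name, *account_ids = record
    by_key = dict(rows)
    # Map keys are strings in the question, so the integer ids are converted.
    matched = [
        (str(acc_id), by_key[str(acc_id)])
        for acc_id in account_ids
        if str(acc_id) in by_key
    ]
    return (name, matched)


result = join_in_order(record1, flatten_map(record2))
print(result)
```

Because the output order is driven by the ids in record1 rather than by the map, the accounts come out in the order in which they were opened, whatever order the map entries are stored in.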