ElasticSearch does not return correct bucket information
15-04-2021
Question
I have two Elasticsearch queries that behave very differently with regard to the price_bucket aggregation.
The first query, with no search term, returns all the expected information for the price bucket:
"price_bucket" : {
"count" : 1190,
"min" : 0.0,
"max" : 239.95,
"avg" : 0.6719663865546218,
"sum" : 799.64,
"sum_of_squares" : 102326.4176,
"variance" : 85.53704739382812,
"variance_population" : 85.53704739382812,
"variance_sampling" : 85.60898771964294,
"std_deviation" : 9.248624081117587,
"std_deviation_population" : 9.248624081117587,
"std_deviation_sampling" : 9.252512508483463,
"std_deviation_bounds" : {
"upper" : 19.169214548789796,
"lower" : -17.82528177568055,
"upper_population" : 19.169214548789796,
"lower_population" : -17.82528177568055,
"upper_sampling" : 19.176991403521548,
"lower_sampling" : -17.833058630412303
}
},
In contrast, when searching for the word 'red', the hits are fine, but the price_bucket in the aggregations node comes back with every statistic except count set to zero:
"price_bucket" : {
"count" : 48,
"min" : 0.0,
"max" : 0.0,
"avg" : 0.0,
"sum" : 0.0,
"sum_of_squares" : 0.0,
"variance" : 0.0,
"variance_population" : 0.0,
"variance_sampling" : 0.0,
"std_deviation" : 0.0,
"std_deviation_population" : 0.0,
"std_deviation_sampling" : 0.0,
"std_deviation_bounds" : {
"upper" : 0.0,
"lower" : 0.0,
"upper_population" : 0.0,
"lower_population" : 0.0,
"upper_sampling" : 0.0,
"lower_sampling" : 0.0
}
},
Query with the 'red' search term:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{"from":0,"size":10000,"stored_fields":["_id","_score"],"sort":[{"_score":{"order":"desc"}}],"query":{"bool":{"must":[{"terms":{"visibility":["3","4"]}}],"should":[{"match":{"sku":{"query":"red","boost":2}}},{"match":{"_search":{"query":"red","boost":2}}},{"match":{"name":{"query":"red","boost":6}}},{"match":{"sku":{"query":"red","boost":7}}},{"match":{"description":{"query":"red","boost":2}}},{"match":{"short_description":{"query":"red","boost":2}}},{"match":{"manufacturer_value":{"query":"red","boost":2}}},{"match":{"status_value":{"query":"red","boost":2}}},{"match":{"url_key":{"query":"red","boost":2}}},{"match":{"tax_class_id_value":{"query":"red","boost":2}}},{"match":{"attr_16gb_premio_micro_sd_card":{"query":"red","boost":2}}},{"match":{"battery":{"query":"red","boost":2}}},{"match":{"bluetooth_speakers":{"query":"red","boost":2}}},{"match":{"brand_value":{"query":"red","boost":2}}},{"match":{"camera":{"query":"red","boost":2}}},{"match":{"cellular_network_value":{"query":"red","boost":2}}},{"match":{"clothing_type_value":{"query":"red","boost":2}}},{"match":{"color_value":{"query":"red","boost":2}}},{"match":{"size_value":{"query":"red","boost":2}}},{"match":{"data_100mb_12mnths":{"query":"red","boost":2}}},{"match":{"data_100mb_6mnths":{"query":"red","boost":2}}},{"match":{"data_200mb_6mnths":{"query":"red","boost":2}}},{"match":{"dual_sim":{"query":"red","boost":2}}},{"match":{"external_memory_slot":{"query":"red","boost":2}}},{"match":{"facebook":{"query":"red","boost":2}}},{"match":{"fm_radio":{"query":"red","boost":2}}},{"match":{"free_airtime":{"query":"red","boost":2}}},{"match":{"free_cover":{"query":"red","boost":2}}},{"match":{"free_data":{"query":"red","boost":2}}},{"match":{"free_sdcard":{"query":"red","boost":2}}},{"match":{"free_starter_pack":{"query":"red","boost":2}}},{"match":{"free_whatsapp_3gb_300mb":{"query":"red","boost":2}}},{"match":{"free_whatsapp_3gb_600mb":{"query":"red","boost":2}}},{"match":{"handset_ram_size":{"query":"red","boost":2}}},{"match":{"headset":{"query":"red","boost":2}}},{"match":{"image_name_label":{"query":"red","boost":2}}},{"match":{"mp3_player":{"query":"red","boost":2}}},{"match":{"os":{"query":"red","boost":2}}},{"match":{"otg_adapter":{"query":"red","boost":2}}},{"match":{"premio_micro_sd_card_16gb":{"query":"red","boost":2}}},{"match":{"product_attribute":{"query":"red","boost":2}}},{"match":{"product_decal":{"query":"red","boost":2}}},{"match":{"protective_cover":{"query":"red","boost":2}}},{"match":{"screen_protector":{"query":"red","boost":2}}},{"match":{"screen":{"query":"red","boost":2}}},{"match":{"tempered_glass":{"query":"red","boost":2}}},{"match":{"torch":{"query":"red","boost":2}}},{"match":{"two_back_covers":{"query":"red","boost":2}}},{"match":{"whatsapp":{"query":"red","boost":2}}},{"match":{"_search":{"query":"red","boost":2}}},{"match_phrase_prefix":{"name":{"query":"red","boost":2}}},{"match_phrase_prefix":{"sku":{"query":"red","boost":2}}}],"minimum_should_match":1}},"aggregations":{"price_bucket":{"extended_stats":{"field":"price_0_1"}},"category_bucket":{"terms":{"field":"category_ids","size":500}},"manufacturer_bucket":{"terms":{"field":"manufacturer","size":500}},"brand_bucket":{"terms":{"field":"brand","size":500}},"cellular_network_bucket":{"terms":{"field":"cellular_network","size":500}},"clothing_type_bucket":{"terms":{"field":"clothing_type","size":500}},"color_bucket":{"terms":{"field":"color","size":500}},"size_bucket":{"terms":{"field":"size","size":500}}}}
'
Query with no search term:
curl -X GET "localhost:9200/_search?pretty" -H 'Content-Type: application/json' -d'
{"from":0,"size":10000,"stored_fields":["_id","_score"],"sort":[{"_id":{"order":"desc"}}],"query":{"bool":{"must":[{"terms":{"visibility":["2","4"]}}]}},"aggregations":{"price_bucket":{"extended_stats":{"field":"price_0_1"}},"category_bucket":{"terms":{"field":"category_ids","size":500}},"manufacturer_bucket":{"terms":{"field":"manufacturer","size":500}},"brand_bucket":{"terms":{"field":"brand","size":500}},"cellular_network_bucket":{"terms":{"field":"cellular_network","size":500}},"clothing_type_bucket":{"terms":{"field":"clothing_type","size":500}},"color_bucket":{"terms":{"field":"color","size":500}},"size_bucket":{"terms":{"field":"size","size":500}}}}
'
The problem occurs with Magento 2.4.1 and Elasticsearch 7. Any insights on resolving this issue would be much appreciated.
Solution
It turned out that the product price data had been imported as text, and this was breaking the aggregation data that Elasticsearch returns to Magento.
Once I imported the catalog data and made sure the price attribute metadata was valid, Elasticsearch started returning the aggregation data I expected.
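One way to confirm this kind of problem is to look at how the price field is mapped. The sketch below is only an illustration: field_types is a hypothetical helper, example_index is a made-up index name, and the sample response is hand-written in the shape of Elasticsearch 7's get-field-mapping output rather than taken from the original cluster. It prints every type found in the response, so a value such as text or keyword instead of a numeric type like double would point to the cause described above.

```shell
#!/bin/sh
# Hypothetical helper (not part of Magento or Elasticsearch): reads a
# get-field-mapping response on stdin and prints each distinct "type" value.
field_types() {
  grep -o '"type" *: *"[^"]*"' | sed 's/.*"\([^"]*\)"$/\1/' | sort -u
}

# Hand-written sample response for a price field that was mapped as text.
# The index name "example_index" is an assumption made for this sketch.
sample='{"example_index":{"mappings":{"price_0_1":{"full_name":"price_0_1","mapping":{"price_0_1":{"type":"text"}}}}}}'

# Prints: text
printf '%s\n' "$sample" | field_types
```

Against a live cluster, the same helper could be fed from the field-mapping endpoint, for example curl -s "localhost:9200/_mapping/field/price_0_1" | field_types, assuming the host and field name used in the question.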