Question

I am working on integrating affiliate sales into a few existing sites. We are using a few merchants who work via different networks (CJ, ShareASale, LinkShare, AvantLink).

My observation is that all these networks provide data feeds in different formats, but that's not a big problem. My main concern is merchants using different titles for the same products. I don't want to run into these situations:

a) two listings of the SAME product from N merchants (if titles are just a bit different)

b) one listing that merges N different products from different merchants (if the comparison algorithm is not strict enough)

We want to automate as much as possible and avoid having operators review questionable listings all the time.

How is this problem typically handled?


Solution

We have a similar issue with trying to collapse products from multiple merchant feeds. What we do is collapse products based on their brand (or manufacturer) + sku combo.

Our data is pretty messy, so we have to do some work to normalize both the brand and the sku so the products collapse nicely. We keep a list of brands that we care about and map the brand names from each merchant feed onto our own. For example, if we have an "ACME" brand in our system, we might map the following to that brand:

A.C.M.E => ACME
ACME Inc. => ACME
Acme Incorporated => ACME

For skus we usually just strip any non-alphanumeric characters for matching purposes. For example, all of the following would map to the same sku:

abc-123 => abc123
abc.123 => abc123
abc 123 => abc123
ab.c1.23 => abc123

So if we see brand "ACME Inc." and sku "abc-123" in one feed, that product will collapse with brand "A.C.M.E" and sku "abc 123" from another feed.
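The normalization described above could be sketched in Python roughly as follows. This is only an illustration, not our actual code: the `BRAND_ALIASES` table and all function names are hypothetical, and a real alias list would be much longer and maintained by hand.

```python
import re

# Hypothetical alias table: squashed merchant spelling -> our canonical brand.
BRAND_ALIASES = {
    "acme": "ACME",
    "acmeinc": "ACME",
    "acmeincorporated": "ACME",
}

def _squash(text):
    """Lowercase the text and drop every non-alphanumeric character."""
    return re.sub(r"[^a-z0-9]", "", text.lower())

def normalize_brand(raw_brand):
    """Return the canonical brand, or None if the brand is not in our list."""
    return BRAND_ALIASES.get(_squash(raw_brand))

def normalize_sku(raw_sku):
    """Strip punctuation and whitespace so 'abc-123' and 'abc 123' match."""
    return _squash(raw_sku)

def collapse_key(raw_brand, raw_sku):
    """Build the (brand, sku) key that products are collapsed on."""
    brand = normalize_brand(raw_brand)
    if brand is None:
        return None  # unknown brand: leave the product uncollapsed
    return (brand, normalize_sku(raw_sku))

# Two feed rows that should collapse into one product.
print(collapse_key("ACME Inc.", "abc-123") == collapse_key("A.C.M.E", "abc 123"))
```

Returning `None` for unknown brands is a design choice in this sketch: it keeps products from being merged on sku alone, which would risk collapsing unrelated products that happen to share a sku.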

As part of the collapsing process we end up with multiple names, images, descriptions, categories, etc. for each collapsed product and need to choose the "best" one to show on the website.
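How to pick the "best" value is a judgment call and depends on your data. As one possible heuristic (an assumption for illustration, not a description of our production rules), you could take the name most merchants agree on, the longest description, and the first available image. The dictionary keys `name`, `description`, and `image_url` are hypothetical field names.

```python
from collections import Counter

def pick_best(listings):
    """Choose display values for one collapsed product.

    `listings` is a list of dicts, one per merchant listing.
    """
    # Most frequently used name; ties go to the name seen first.
    names = Counter(l["name"].strip() for l in listings if l.get("name"))
    best_name = names.most_common(1)[0][0] if names else None

    # Longest description, on the guess that it is the most informative.
    descriptions = [l["description"] for l in listings if l.get("description")]
    best_description = max(descriptions, key=len, default=None)

    # First listing that has an image at all.
    best_image = next((l["image_url"] for l in listings if l.get("image_url")), None)

    return {"name": best_name, "description": best_description, "image_url": best_image}

merged = pick_best([
    {"name": "ACME Anvil 10 lb", "description": "Anvil.", "image_url": ""},
    {"name": "ACME Anvil 10 lb", "description": "A 10 lb steel anvil.", "image_url": "a.jpg"},
    {"name": "Anvil by ACME", "description": "", "image_url": "b.jpg"},
])
print(merged)
```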

That's a very high level overview of how we handle it.

OTHER TIPS

Look for merchants who provide UPCs in their feeds. UPCs are universal product identifiers, so the same product carries the same code across merchants. Also, AvantLink lets you customize your own feed output, which is nice.
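If some feeds do carry UPCs, one option is to match on the UPC when it is present and valid, and fall back to brand + sku otherwise. The sketch below validates a 12-digit UPC-A using its standard check digit. The record keys `upc`, `brand`, and `sku` are hypothetical, and the brand and sku are assumed to be already normalized.

```python
def is_valid_upc_a(code):
    """Check that `code` is a 12-digit UPC-A with a correct check digit."""
    cleaned = code.replace(" ", "").replace("-", "")
    if len(cleaned) != 12 or not (cleaned.isascii() and cleaned.isdigit()):
        return False
    digits = [int(c) for c in cleaned]
    # Digits in positions 1, 3, ..., 11 are weighted 3; positions 2, 4, ..., 10 are weighted 1.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]

def match_key(record):
    """Prefer a valid UPC as the matching key, else fall back to brand + sku."""
    upc = (record.get("upc") or "").replace(" ", "").replace("-", "")
    if is_valid_upc_a(upc):
        return ("upc", upc)
    return ("brand_sku", record["brand"], record["sku"])

print(match_key({"upc": "0 36000 29145 2", "brand": "ACME", "sku": "abc123"}))
print(match_key({"upc": "", "brand": "ACME", "sku": "abc123"}))
```

Tagging the key with its type ("upc" or "brand_sku") keeps the two kinds of keys from colliding by accident.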

I was actually looking at 2 sample data feeds from AvantLink a minute ago. Here's the list of fields they provide (not filtered, so I assume it's everything):

SKU
Manufacturer Id
Brand Name
Product Name
Long Description
Short Description
Category
SubCategory
Product Group
Thumb URL
Image URL
Buy Link
Keywords
Reviews
Retail Price
Sale Price
Brand Page Link
Brand Logo Image
Product Page View Tracking
Product Content Widget

I was thinking that yes, having UPCs would be (almost) ideal, but neither of the stores I was looking at (one of them is REI) provides UPCs.

I also checked a few large merchants on Commission Junction and ShareASale; they don't include UPCs either.

"How is this problem typically handled?"

Such scenarios are typically covered by data warehouse systems such as those provided by Oracle, HP, Microsoft, IBM, Netezza, or Teradata.

Licensed under: CC-BY-SA with attribution