Question

On my future website I am trying to convert tracks given as a string "Artist - TrackTitle" into the corresponding URI "spotify:track:trackCode".

Since I'm faster at programming in PHP than in JavaScript (feel free to sneer), I do the following:

  1. Clear the string of a few things that a Spotify search finds confusing, such as text in brackets and symbols like "`", "/", "-" etc.
  2. Convert spaces to the URL-encoded form "%20".
  3. Retrieve the result of the Spotify XML page "http://ws.spotify.com/search/1/track?q=" with the string appended.
  4. If there was a result, retrieve the FIRST match in that page that matches the regex "(spotify:track:)(.*)(\">)".
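The four steps above can be sketched roughly as follows. This is a minimal illustration in Python rather than the PHP the question mentions, and it is not the asker's actual code: the function names, the exact set of stripped symbols, and the assumption that a track ID is purely alphanumeric are all hypothetical. The endpoint is the one quoted in the question, and no request is actually sent.

```python
import re
from urllib.parse import quote

# Endpoint as quoted in the question; this sketch only builds the URL.
SEARCH_URL = "http://ws.spotify.com/search/1/track?q="


def clean_track_string(raw):
    """Step 1: drop bracketed parts and a few symbols (assumed set: ` / -)."""
    text = re.sub(r"[\(\[][^\)\]]*[\)\]]", " ", raw)  # remove "(...)" and "[...]"
    text = re.sub(r"[`/\-]", " ", text)               # replace symbols with spaces
    return re.sub(r"\s+", " ", text).strip()          # collapse repeated whitespace


def build_search_url(raw):
    """Steps 2-3: percent-encode the whole query, not only the spaces."""
    return SEARCH_URL + quote(clean_track_string(raw), safe="")


def first_track_uri(response_text):
    """Step 4: return the first spotify:track URI found, or None.

    Assumes the ID consists of letters and digits only, which avoids the
    greedy (.*) of the original regex running past the closing quote.
    """
    match = re.search(r"spotify:track:[A-Za-z0-9]+", response_text)
    return match.group(0) if match else None
```

Encoding the complete query with `quote(..., safe="")` also takes care of characters such as "&", which would otherwise be read as a parameter separator.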

Oddly, this works for only about 80% of the strings. Although the strings are rather well standardized (they come from a radio website, the Swedish "Digilistan P3"), some searches return 0 results.

Possible explanations:

  • A) The track is not available on Spotify.
  • B) The track IS available on Spotify, but the search algorithm on ws.spotify.com/search differs from the one in the desktop client.
  • C) The search string is not well prepared for either the URL version or the desktop client.

Four tracks that fall into group B or C (after being stripped of unsuitable characters):

  1. teddybears sthlm - rock´n´roll highschool
  2. bomfunk mc´s - b-boys & flygirls
  3. christina aguilera, mya, pink & lil´ kim - lady marmalade
  4. macklemore & ryan lewis feat. wanz - thrift shop (I mean: REALLY? Are you kidding me? Not even ws.spotify.com/search/1/track?q=macklemore%20&thrift%20shop gives any result!)

Now the question

Can anyone suggest a better conversion or idea that improves my rating of success in finding suitable matches for the tracks?

The current algorithm can be found here.


Solution 2

The search algorithm in the client and the Web API are indeed slightly different, but you may also have found a bug.

The Web API uses global popularity to rank search results (weighted with the actual search query). It also returns things that are available in any country.

The client only returns entities available in the logged-in user's country. It also uses the popularity from that country to rank search results.

Because of this, and because labels very often deliver separate copies of the exact same album for different countries with different rights, the search results differ. We recently saw a bug caused by this in the clients in some countries as well: https://twitter.com/swemoph/status/426260017847623680

So, by design, the results should be slightly different. In your case, however, that should only mean more search results in a slightly different order, never zero.

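Cases 2-4 can probably be explained by the & not being escaped: an unescaped & in a URL separates query parameters, so everything after it is no longer part of q. Encode it as %26 instead.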
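To make the effect of the unescaped & concrete, here is a small Python sketch that parses the query string with standard rules and shows what ends up in q. The helper name `query_seen_by_server` is hypothetical, and how the real Spotify backend parsed its parameters is an assumption; the sketch only shows conventional query-string parsing.

```python
from urllib.parse import parse_qs, quote, urlsplit

BASE = "http://ws.spotify.com/search/1/track?q="  # endpoint from the question


def query_seen_by_server(url):
    """Return the value of q as a standard query-string parser would read it."""
    params = parse_qs(urlsplit(url).query)
    return params.get("q", [""])[0]


# URL from the question: the raw "&" ends the q parameter early.
unescaped = BASE + "macklemore%20&thrift%20shop"

# Same words with every reserved character percent-encoded ("&" becomes "%26").
escaped = BASE + quote("macklemore & thrift shop", safe="")

print(repr(query_seen_by_server(unescaped)))
print(repr(query_seen_by_server(escaped)))
```

In the first URL, "thrift shop" is parsed as a separate, valueless parameter and never reaches the search term.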

Number 1 is more interesting. Looking at the actual URI of the track in the Web API and on the open site, we see that it is misattributed to Teddybears (not Teddybears Sthlm):

$ curl -s 'http://ws.spotify.com/lookup/1/.json?uri=spotify:track:1JdC88rtMAwebQVFOcAg0D' | jq .track.artists
[
  {
    "name": "Teddybears",
    "href": "spotify:artist:3gqv1kgivAc92KnUm4elKv"
  },
  {
    "name": "Thomas Rusiak",
    "href": "spotify:artist:7amcWVAeY8e6YwgV9bXlKH"
  }
]

http://open.spotify.com/track/1JdC88rtMAwebQVFOcAg0D shows Rock 'n' Roll Highschool by Teddybears

This clearly explains why you don't find it in the Web API: by adding the search term sthlm you are excluding this track from the results. The query engine seems to work as intended (although I would have preferred a fuzzier search here, but that is a different problem). You are doing nothing wrong, but we need to figure out why the data looks different.
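Since an extra term such as sthlm can exclude the right track, one possible client-side workaround is to retry with progressively looser queries. The Python sketch below is only an idea under stated assumptions, not something the Web API provides: `fallback_queries` is a hypothetical helper, and dropping artist words from the end is an arbitrary heuristic.

```python
from typing import List


def fallback_queries(artist: str, title: str) -> List[str]:
    """Build queries from strictest to loosest.

    Starts with the full artist name plus title, then drops artist words
    from the end one at a time, and finishes with the title alone.
    """
    words = artist.split()
    queries = []
    for keep in range(len(words), -1, -1):
        query = " ".join(words[:keep] + [title]).strip()
        if query not in queries:  # skip duplicates, keep order
            queries.append(query)
    return queries


for q in fallback_queries("teddybears sthlm", "rock n roll highschool"):
    print(q)
```

Each query would be tried in order until one returns a hit. Looser queries can match the wrong track, so the artist of any result should be compared against the original string before accepting it.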

Licensed under: CC-BY-SA with attribution