Recommender System for mostly unique users and items

https://datascience.stackexchange.com/questions/85979

16-12-2020

Question

I am trying to develop a recommender system for a job-matching problem. My data consists of past matches between candidate profiles and job profiles, along with a label indicating whether the match was a success, meaning that both the candidate and the employer accepted the offer.

Now, what I want is to build a recommendation system for future candidates and jobs that will recommend a fitting candidate for an open job posting based on past successes. The big problem here is that each candidate and each job is unique: once a match is successful, neither the candidate nor the job is available anymore. For recommender systems this is a huge problem, since both collaborative filtering and content-based filtering rely on user ratings/interactions and on similar users who have already rated the items for which a recommendation should be calculated. But these items don't exist anymore. What I have here is basically a cold-start problem for both users and items.

One way I could tackle this problem would be to use a knowledge-based recommender system, since I have candidate and job profiles describing skills, position, experience and so on. I could construct common feature vectors and simply calculate the similarity between them, recommending the top n most similar candidates for a job posting. But I also want to use the knowledge from past job matches, since they contain information about what kind of candidate was suitable for a specific job.
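
For concreteness, the similarity-only baseline described above could look roughly like the sketch below. The feature layout, the candidate ids, and the helper names `cosine` and `top_n_candidates` are made up for illustration; cosine similarity is just one possible choice of similarity measure.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an all-zero profile is treated as similar to nothing
    return dot / (norm_a * norm_b)

def top_n_candidates(job_vec, candidates, n=2):
    """Return the ids of the n candidates most similar to the job vector."""
    ranked = sorted(
        candidates,
        key=lambda cid: cosine(job_vec, candidates[cid]),
        reverse=True,
    )
    return ranked[:n]

# Hypothetical shared feature space:
# [python, sql, management, years_of_experience / 10]
job = [1.0, 1.0, 0.0, 0.5]
candidates = {
    "cand_a": [1.0, 1.0, 0.0, 0.4],
    "cand_b": [0.0, 0.0, 1.0, 0.9],
    "cand_c": [1.0, 0.0, 0.0, 0.2],
}
best = top_n_candidates(job, candidates, n=2)
```

Note that nothing in this baseline looks at whether past matches succeeded, which is exactly the limitation I would like to get around.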

Does anybody have an idea how I could do this or had a similar problem?

Solution

There are a lot of different ways you could approach this problem, and the right one depends on what kind of results you're hoping to get for your particular situation. I'm going to break my answer up into sections, as this will be a longer reply: I want to provide as much context as possible on the problems you will face when building a job/candidate recommender, and they are quite numerous.

Some suggestions on how to deal with your current problem:

Your biggest challenge is data sparsity, so a few suggestions:

If you're not already doing so, use a Factorization Machine (FM) as your core model, at least until you're ready to implement something more complicated such as a neural net. FMs deal with sparse conditions really well and can take in the wide variety of data and embeddings that you're going to be using in this kind of task. I think this alone will get you on track to solving the problem you're talking about.

You also need to define features that make impressions denser. An example could literally be consolidating companies with sparse histories into segments:

['Netflix', 'Facebook', 'ComputerVisionCompaniesUnder10Employees', ...]

Then from now on any company that falls under the category of ComputerVisionCompaniesUnder10Employees will only be represented as that label.
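
A minimal sketch of that consolidation step, assuming you already have some mapping from company name to segment. The `segment_of` table, the `min_count` threshold, and the fallback label `OtherCompany` are hypothetical choices, not something prescribed by FMs:

```python
from collections import Counter

def consolidate_companies(companies, segment_of, min_count=5):
    """Keep well-observed company names; replace sparse ones with a segment label."""
    counts = Counter(companies)
    return [
        name if counts[name] >= min_count else segment_of.get(name, "OtherCompany")
        for name in companies
    ]

# Hypothetical history: one company name per past match.
history = ["Netflix"] * 6 + ["Facebook"] * 5 + ["TinyVisionCo", "PixelStartup"]
segment_of = {
    "TinyVisionCo": "ComputerVisionCompaniesUnder10Employees",
    "PixelStartup": "ComputerVisionCompaniesUnder10Employees",
}
labels = consolidate_companies(history, segment_of, min_count=5)
```

Here the two small companies end up sharing one label, so the model sees two observations for that label instead of one observation each for two labels.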

You then need to one-hot-encode the normalized job title as a feature separate from company and candidate.
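
As a rough illustration of keeping these as separate fields, here is a hand-rolled one-hot encoder over (field, value) pairs. The field names and example rows are invented, and in practice you would probably use whatever encoder your FM library expects:

```python
def build_index(rows):
    """Assign one column per distinct (field, value) pair, in first-seen order."""
    index = {}
    for row in rows:
        for field, value in row.items():
            index.setdefault((field, value), len(index))
    return index

def one_hot(row, index):
    """Encode one row as a 0/1 vector; unseen (field, value) pairs are skipped."""
    vec = [0] * len(index)
    for field, value in row.items():
        col = index.get((field, value))
        if col is not None:
            vec[col] = 1
    return vec

# Hypothetical impressions: company label, normalized job title, candidate id.
rows = [
    {"company": "Netflix", "job_title": "data scientist", "candidate": "cand_1"},
    {"company": "ComputerVisionCompaniesUnder10Employees",
     "job_title": "ml engineer", "candidate": "cand_2"},
]
index = build_index(rows)
encoded = [one_hot(r, index) for r in rows]
```

Because each field gets its own block of columns, the same job title appearing under two different company labels activates the same title column.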

Given how FMs work, you shouldn't strictly need to replace company names with segment labels; just including the segment as an additional feature should accomplish the same thing. But I've noticed that explicitly replacing company labels like that sometimes boosts performance quite considerably.

So now variance in recommendation for companies under the same company label would depend on the job title and contextual features about that particular job. If two different companies under that same label had similar job postings, then they would get recommended similar (if not the same) candidates and that's OK.

As far as candidate abstractions go, you're probably going to want to find some way of normalizing and grouping skills as the skills matrix can get quite sparse on its own.

Additionally, adding in aggregations of features of the companies that a candidate previously worked for could help, for example how many "large companies" the candidate has worked for.
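
A small sketch of both candidate-side abstractions, skill normalization and a company aggregate. The alias table, the head counts, and the 1000-employee threshold for "large" are illustrative assumptions only:

```python
# Hypothetical alias table mapping raw skill strings to a canonical skill group.
SKILL_ALIASES = {
    "py": "python", "python3": "python",
    "postgres": "sql", "postgresql": "sql", "mysql": "sql",
}

def normalize_skills(raw_skills):
    """Lower-case, trim, and map aliases onto a canonical skill name."""
    cleaned = (s.strip().lower() for s in raw_skills)
    return sorted({SKILL_ALIASES.get(s, s) for s in cleaned})

def count_large_companies(past_employers, company_size, threshold=1000):
    """Number of past employers whose head count meets the threshold."""
    return sum(1 for c in past_employers if company_size.get(c, 0) >= threshold)

# Illustrative head counts, not real figures.
company_size = {"Netflix": 12000, "Facebook": 60000, "TinyVisionCo": 8}

skills = normalize_skills(["Python3", " py ", "PostgreSQL", "Docker"])
n_large = count_large_companies(["Netflix", "TinyVisionCo", "Facebook"], company_size)
```

Both outputs can then be fed to the model as their own features: the grouped skills as one-hot columns and the count as a numeric one.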

The more categorizations you can feed to the FM the better, because the FM learns an embedding for every feature and models the interaction between every pair of features through those embeddings. The important bit is to separate the different features of the jobs/candidates and abstract each one separately. Don't calculate abstractions for job postings as a whole; calculate abstractions for companies, job titles, and skill descriptions, and let the FM figure out how all those things relate.
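
To make the "interaction between features" part concrete, here is a sketch of how a standard second-order FM scores a single feature vector: a bias, a linear term, and a dot product of latent vectors for every pair of active features. The parameter values are hand-picked for illustration; in practice they would be learned from your past matches and their success labels, typically with an existing FM library rather than code like this.

```python
def fm_score(x, w0, w, V):
    """Second-order factorization machine score for one feature vector.

    x: feature values, w0: bias, w: linear weights,
    V: one latent vector (list of k floats) per feature.
    Uses the usual linear-time rewrite of the pairwise term.
    """
    linear = sum(wi * xi for wi, xi in zip(w, x))
    k = len(V[0])
    pairwise = 0.0
    for f in range(k):
        s = sum(V[i][f] * x[i] for i in range(len(x)))
        s_sq = sum((V[i][f] * x[i]) ** 2 for i in range(len(x)))
        pairwise += 0.5 * (s * s - s_sq)
    return w0 + linear + pairwise

def fm_score_naive(x, w0, w, V):
    """Same score with the explicit sum over feature pairs, for checking."""
    total = w0 + sum(wi * xi for wi, xi in zip(w, x))
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dot = sum(a * b for a, b in zip(V[i], V[j]))
            total += dot * x[i] * x[j]
    return total

# Hypothetical one-hot style input: company segment, an inactive feature,
# job title, skill group.
x = [1, 0, 1, 1]
w0 = 0.1
w = [0.2, -0.1, 0.05, 0.3]
V = [[0.5, 0.1], [0.0, 0.3], [0.4, -0.2], [0.1, 0.6]]
score = fm_score(x, w0, w, V)
```

Since every feature has its own latent vector, a company segment and a job title that never appeared together in training can still get a meaningful interaction score, which is why FMs cope well with sparse impressions.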

Data issues in Job/Candidate Recommendation:

Data sparsity of impressions.

As you implied in your question, the only knowledge you currently have on candidates' job preferences is their past jobs and content related features. Getting a measure of whether or not the candidates liked their past jobs is also difficult.

Lower than expected information entropy in job-related data points.

Almost all resumes, LinkedIn profiles, and job postings are designed with brevity in mind. It's well known that recruiters spend very little time reading resumes, so it has become standard advice that people should aim to keep their resume under one page. This makes the information content lower than one might think. The same is often true of job postings; in fact, most candidates don't even read job postings due to the prevalence of "1 click" apply features.

Non-trivial normalization of different job related entities

Last time I checked, there weren't really any good open-source standard job title sets out there. Good skill ontologies are also hard to come by. These hurdles are a big pain, but you can overcome them with a lot of hard work. I'd start by picking some standard sets as truth (I personally recommend checking out the data referenced here). There are a couple of ways to try to do this automagically, but I personally haven't found a great way of dodging a lot of manual cleanup.
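
One cheap way to get a first pass at title normalization, purely as an illustrative sketch, is fuzzy string matching against whatever canonical set you picked. The title list and the 0.6 cutoff below are assumptions, and this uses Python's standard `difflib` rather than any job-specific tooling. Expect it to leave plenty of cases for manual cleanup.

```python
import difflib

# Hypothetical canonical title set chosen as "truth".
CANONICAL_TITLES = ["software engineer", "data scientist", "product manager"]

def normalize_title(raw, canonical=CANONICAL_TITLES, cutoff=0.6):
    """Map a raw title to the closest canonical title, or None if nothing is close."""
    cleaned = " ".join(raw.lower().split())  # lower-case and collapse whitespace
    matches = difflib.get_close_matches(cleaned, canonical, n=1, cutoff=cutoff)
    return matches[0] if matches else None

examples = ["Data  Scientist", "Sr. Data Scientist", "zzz"]
normalized = [normalize_title(t) for t in examples]
```

Titles that come back as `None` are the ones to route to a manual review queue or to a broader fallback bucket.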

Notes on candidate recommendation:

Candidate recommendation is really only something that recruiter platforms like LinkedIn have had to think about all that much, as most job platforms recommend jobs to candidates and then candidates hit apply and that's when they show up in the employer's applicant pool.

In my experience, the features that recruiters care most about when being recommended candidates are whether the technical skills match, whether the candidate went to a prestigious university, and years of experience. As mentioned earlier, it's difficult to get any measure of what one might call "personality" from resume/profile data, so recruiters just want to get matching candidates on a call to vet them further. Another note here is that most candidates don't actually read the job description (JD), so surfacing candidates who are genuinely interested in the position beyond the paycheck/title saves recruiters a lot of time.

Notes on job recommendation:

There are actually very few platforms that do true job recommendation in the way that's been described here. Most just do job search. When you go to Indeed and search for a job, Indeed knows very little about the visitor on its site beyond what they're typing in the search box. Indeed's success hinges on its search ranking far more than on its recommender performance. Another note is that, in my experience, senior candidates like to have a lot of control over what jobs get thrown their way, while junior candidates like to shotgun-apply to every job that vaguely resembles what they're looking for as fast as possible. The junior case is at odds with what recruiters want, so the goal is to make sure that candidates are only shown jobs that they are a fit for.

My experience:

I've actually started a recruitment company before and have had to build a job recommender for candidates. That company didn't work out, for reasons unrelated to the recommender's performance. The job-search market is a very saturated one, and surviving/innovating in it given the problem's constraints is very difficult, but that's a whole other subject of its own.

Licensed under: CC-BY-SA with attribution

Not affiliated with datascience.stackexchange