Make a job search site that actually works. Train a statistical model on a bunch of data from resumes and job postings, and try to predict what company a given individual will go to next. Offer it as a service to both individuals and companies -- upload a resume or a job description, and out pops an ordered list of suggestions.
Here's the statistical model I would use. Assume that an individual's career history is generated by a random walk from one company to another, driven by the individual's preferences and the type of each company. This is essentially a hidden Markov model (HMM) with a few tweaks. The states in the HMM are (individual-type, company-type, role-type, location) tuples. Individual-type is inferred from the individual's career history, and possibly also from the text of their resume. Role-type is inferred from the individual-types of the people who filled that role, and possibly also from the job description. Location can be read directly from the raw data, while the other three components are hidden and will have to be estimated using expectation-maximization (EM). To combat sparsity, transition probabilities are generated from a few independent distributions:
... or something like that, anyway.
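A minimal sketch of the random-walk idea, under heavy simplifying assumptions: it treats each state as an observed company name rather than a hidden (individual-type, company-type, role-type, location) tuple, so there is no EM step, and it uses plain add-alpha smoothing to combat sparsity instead of a factored set of distributions. The function names, the `alpha` parameter, and the toy career histories are all made up for illustration.

```python
from collections import Counter, defaultdict


def fit_transitions(histories, alpha=1.0):
    """Estimate P(next | current) from career histories with add-alpha smoothing."""
    counts = defaultdict(Counter)
    states = set()
    for history in histories:
        states.update(history)
        # Count each consecutive (current employer, next employer) pair.
        for current, nxt in zip(history, history[1:]):
            counts[current][nxt] += 1
    states = sorted(states)
    probs = {}
    for current in states:
        total = sum(counts[current].values()) + alpha * len(states)
        probs[current] = {
            nxt: (counts[current][nxt] + alpha) / total for nxt in states
        }
    return probs


def rank_next(probs, current):
    """Return candidate next states, most probable first (ties broken by name)."""
    return sorted(probs[current], key=lambda s: (-probs[current][s], s))


# Hypothetical toy data: each list is one person's sequence of employers.
histories = [
    ["StartupA", "BigCo", "MegaCorp"],
    ["StartupA", "BigCo"],
    ["BigCo", "MegaCorp"],
    ["StartupA", "MegaCorp"],
]
probs = fit_transitions(histories)
ranking = rank_next(probs, "StartupA")
print(ranking)
```

The ordered list that `rank_next` returns is the "out pops an ordered list of suggestions" part; the full model would swap the observed company names for the hidden tuples and fit them with EM.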
Problems with the naive model:
1) Company reputations might vary over time, so the transition probabilities might not be stationary -- that could be accounted for by allowing the "company-type" of a company to change over time. To do this, just let EM determine the company-type independently for each year, and then marginalize to find the distribution over the company's type in a given year. You could use those per-year distributions to predict what type a company will have in the coming year.
2) Startups and small companies will be under-represented. For years when a company didn't exist, just force its company-type to be 0 (that is, a reserved "didn't exist" type). To adjust for company size, just condition on the prior probability that an individual of any type would have worked at the company.
3) When determining transition probabilities, you need to adjust for the probability that a company is hiring for a given role.
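One possible reading of adjustments 2 and 3, sketched in code: divide each candidate's transition probability by the company's overall prior (so large employers don't win just by being large), multiply by the probability that the company is actually hiring for the role, and renormalize. Treating "condition on the prior" as a simple ratio is an assumption, as are the function name and all of the numbers below.

```python
def adjusted_scores(transition, prior, hiring):
    """Re-score candidates: lift over the size prior, weighted by hiring probability."""
    raw = {
        company: transition[company] / prior[company] * hiring[company]
        for company in transition
    }
    total = sum(raw.values())
    if total == 0:
        raise ValueError("no candidate company is hiring")
    # Renormalize so the adjusted scores sum to 1.
    return {company: score / total for company, score in raw.items()}


# Hypothetical inputs for one individual and one role.
transition = {"MegaCorp": 0.6, "StartupA": 0.3, "BigCo": 0.1}  # P(next | current)
prior = {"MegaCorp": 0.8, "StartupA": 0.05, "BigCo": 0.15}     # P(anyone worked there)
hiring = {"MegaCorp": 0.5, "StartupA": 0.5, "BigCo": 0.0}      # P(hiring for the role)

scores = adjusted_scores(transition, prior, hiring)
ranked = sorted(scores, key=scores.get, reverse=True)
print(ranked)
```

With these made-up numbers the small company moves ahead of the large one once size is factored out, and the company that isn't hiring drops to a score of zero.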
Practical problems:
1) Getting the raw data. Perhaps you could partner with monster.com to get a large data set. Colleges might also have information about which grads went to which companies.
2) Getting people to use it. Again, you might partner with an existing job board to provide recommendations.
Posted on November 29, 2006 04:11 AM