 

Online job-searching is tedious. Help me automate it

Many job sites have broken searches that don't let you narrow down jobs by experience level, and even when they do, the results are usually wrong. You end up wading through hundreds of postings you can't apply for before finding a relevant one, which is quite tedious. Since I'd rather focus on writing cover letters etc., I want to write a program that looks through a large number of postings and saves the URLs of just those jobs that don't require years of experience.

I don't need help writing the scraper that fetches the HTML bodies of possibly relevant job posts. The issue is accurately detecting the level of experience required for the job. This should not be too difficult, as job posts are usually very explicit about this ("must have 5 years experience in..."), but overly simple solutions may run into problems.

In my case, I'm looking for entry-level positions. Postings often don't say "entry-level", but when those words do appear, the job should probably be saved.

Next, I can safely exclude a job that says it requires "5 years" of experience in whatever, so a regex like /\d\syears/ seems reasonable for excluding jobs. But then I realized some jobs say they'll take 0-2 years of experience, which matches the exclusion regex but is clearly a job I want to take a look at. Hmmm, I can handle that with another regex. But some say "less than 2 years" or "fewer than 2 years". I can handle that too, but it makes me wonder what other patterns I'm not thinking of, and whether I'm wrongly excluding many jobs. That's what brings me here: to find a better way to do this than regexes, if there is one.

I'd like to minimize the false negative rate and save all the jobs that seem like they might not require many years of experience. Does excluding anything that matches /[3-9]\syears|1\d\syears/ seem reasonable? Or is there a better way? Training a Bayesian filter, maybe?
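One possible way to organize the regex approach, sketched in Python: instead of excluding on any "N years" match, pull out the lower bound of each experience mention and exclude only when that bound is above a cutoff. Everything here is an assumption for illustration (the cutoff `MAX_YEARS`, the function names, and the particular phrasings covered); it is not a complete list of patterns.

```python
import re

# Hypothetical cutoff: keep a job if its stated minimum is at most this.
MAX_YEARS = 2

ENTRY_LEVEL = re.compile(r"\bentry[\s-]?level\b", re.IGNORECASE)

# "less than 2 years", "fewer than 2 years", "up to 2 years": no real minimum.
UPPER_BOUND = re.compile(
    r"\b(?:less than|fewer than|under|up to)\s+\d{1,2}\+?\s*(?:years?|yrs?)\b",
    re.IGNORECASE,
)

# "0-2 years", "3 to 5 years", "5 years", "5+ years"; group 1 is the lower bound.
YEARS = re.compile(
    r"\b(\d{1,2})\+?(?:\s*(?:-|–|to)\s*\d{1,2}\+?)?\s*(?:years?|yrs?)\b",
    re.IGNORECASE,
)


def min_years_required(text):
    """Return the largest lower bound mentioned, or None if none is found."""
    # Drop upper-bound phrases first so "less than 2 years" is not read as 2.
    stripped = UPPER_BOUND.sub(" ", text)
    bounds = [int(m.group(1)) for m in YEARS.finditer(stripped)]
    return max(bounds) if bounds else None


def should_save(text):
    """Keep the posting unless it clearly asks for more than MAX_YEARS."""
    if ENTRY_LEVEL.search(text):
        return True
    years = min_years_required(text)
    return years is None or years <= MAX_YEARS
```

With this shape, "0-2 years" yields a lower bound of 0 and is kept, while "must have 5 years experience" is excluded. Because postings with no recognizable mention are kept by default, unknown phrasings tend to produce extra jobs to read rather than missed ones. It does not check context, so a sentence such as "in business for 20 years" would still cause an exclusion.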

Edit: There's a similar, but harder problem, which would probably be more useful to solve. There are lots of jobs that just require an "engineering degree", as you just have to understand a few technical things. But searching for "engineering" gives you thousands of jobs, mostly irrelevant.

How do I narrow this down to just those jobs that require any engineering degree, rather than particular degrees, without looking at each myself?
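A rough sketch of one way to approach this with the same kind of pattern matching, in Python: look for "engineering degree" or "degree in engineering" and reject matches where a specific discipline comes directly before "engineering". The `SPECIFIC` list and the function name are hypothetical and would need extending against real postings.

```python
import re

# Hypothetical, incomplete list of disciplines that make a degree "particular".
SPECIFIC = {
    "civil", "chemical", "electrical", "mechanical",
    "software", "aerospace", "biomedical", "industrial",
}

# "<word> engineering degree"; group 1 is the word before "engineering", if any.
PRE = re.compile(r"(?:\b([a-z]+)\s+)?\bengineering\s+degree\b", re.IGNORECASE)

# "degree in engineering" / "degree in any engineering ..."
POST = re.compile(r"\bdegree\s+in\s+(?:any\s+)?engineering\b", re.IGNORECASE)


def wants_any_engineering_degree(text):
    """True if the text asks for an engineering degree without naming a field."""
    if POST.search(text):
        return True
    # Accept "an engineering degree", reject "mechanical engineering degree".
    return any(
        (m.group(1) or "").lower() not in SPECIFIC
        for m in PRE.finditer(text)
    )
```

For example, "Requires an engineering degree" passes, while "Mechanical engineering degree required" and "Degree in civil engineering" do not.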

ehsanul asked Jun 15 '10 19:06





2 Answers

Ok, this answer is probably not going to be helpful -- I will say that up front. But, in my opinion, merely thinking about the problem in this way is enough to get you hired at most places I've worked. My suggestion? Contact the hiring manager at any of the postings in which you have interest, tell them this is what you are doing. Tell them generically what you have coded so far, and ask for assistance in learning the patterns they use when writing their adverts.

If I were on the receiving end of this letter, I think I would invite the person in for an interview.

MJB answered Oct 16 '22 03:10



I developed a good parse-and-email routine for a couple of job websites when I was looking for work for myself and a couple of friends. I agree with the other posts: this is a great way to look at the problem. Just to drop a little info, I did it mostly in Ruby, and used Tor proxies and some other methods to make sure that I wouldn't be iced out of the job site. This sort of project is unlike usual scraping, as you really can't afford to get kicked off a job board. In any case, I just have one piece of advice: forget about sorting and fine-tuning this too intensely. Let the HR department do that for you, and get your resume and credentials out everywhere. It's a statistical game, and you want to broadcast yourself and throw that net as widely as possible.

riva answered Oct 16 '22 03:10
