This extension helps you keep track of the job offers you stumble upon on LinkedIn, automagically saving all of them into a human-editable database file.
Here are some of its features:
From the extension's settings, you can choose where the file will be located; let's say /home/me/job_search/. A jobs.rec file will be created in this directory.
From then on, every time you visit a page that contains a job offer, /home/me/job_search/jobs.rec will be updated with data extracted from the page.
E.g. if you visit https://www.linkedin.com/jobs/view/3765452342/, you will get a record about the job offer itself:
first_seen_date: Mon, 19 Feb 2024 13:31:00 +0100
url: https://www.linkedin.com/jobs/view/3765452342/
title: Data Engineer
origin: linked_in
location: Amérique latine
id: linked_in_3765452342
flexibility: full_remote
company: Mentor Talent Acquisition
application_process: regular
... and another with info about the company:
url: https://www.linkedin.com/company/mentor-talent-acquisition/life
name: Mentor Talent Acquisition
domain: Recrutement et placement de personnel
Because the database is just a plain text file, you can then update those records with other information that is harder to extract automatically (e.g. required experience, skills, ...) or with information about a potential application. You can also version it with git. Data integrity can be ensured via recutils utilities.
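To illustrate why a plain text database is easy to work with, here is a minimal Python sketch of reading and writing records in the rec style shown above. It is not the extension's own code: the function names format_record and parse_records are hypothetical, and the parser is deliberately simplified (it keeps only the last value when a field name is repeated within a record, whereas the rec format allows repeated fields).

```python
from typing import Dict, List


def format_record(fields: Dict[str, str]) -> str:
    """Serialize one record as rec-style 'name: value' lines."""
    lines = []
    for name, value in fields.items():
        # Multi-line values use rec continuation lines, which start with '+'.
        first, *rest = str(value).split("\n")
        lines.append(f"{name}: {first}")
        lines.extend(f"+ {extra}" for extra in rest)
    return "\n".join(lines) + "\n"


def parse_records(text: str) -> List[Dict[str, str]]:
    """Parse rec-style text into dicts; blank lines separate records."""
    records: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    last_name = None
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current record, if any.
            if current:
                records.append(current)
            current, last_name = {}, None
        elif line.startswith("#"):
            continue  # comment line
        elif line.startswith("+") and last_name is not None:
            # Continuation of the previous field's value.
            extra = line[2:] if line.startswith("+ ") else line[1:]
            current[last_name] += "\n" + extra
        else:
            name, _, value = line.partition(":")
            last_name = name.strip()
            current[last_name] = value.strip()
    if current:
        records.append(current)
    return records
```

For real-world use, the recutils command-line tools remain the safer way to query and validate the file; this sketch only shows that the format is simple enough to script against.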
native/install.sh
npm install && npm run build
extension/manifest.json (NOT the manifest.json located at root)
Overall, the extension architecture is not too complex, even though the web extension standard mandates a lot of message passing between parts that run in isolation from each other and can only communicate via JSON messages. E.g. only "content scripts" can read/write a web page's content; only a native application launched by the browser is allowed to interact with the local file system; only a background script can interact with the native application.
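As an illustration of the last hop (background script to native application), browsers' native messaging exchanges JSON messages over the application's standard input and output, each one prefixed with its length as a 32-bit unsigned integer in native byte order. The sketch below shows that framing in Python. It is an assumption about how such a host can be written, not the code in native/; the function names and the message fields are hypothetical.

```python
import io
import json
import struct
from typing import Any, BinaryIO, Optional


def encode_message(payload: Any) -> bytes:
    """Frame a JSON-serializable payload: 4-byte length, then UTF-8 JSON."""
    body = json.dumps(payload).encode("utf-8")
    # "=I" is a 4-byte unsigned integer in native byte order.
    return struct.pack("=I", len(body)) + body


def read_message(stream: BinaryIO) -> Optional[Any]:
    """Read one framed message, or return None when the stream has ended."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (length,) = struct.unpack("=I", header)
    return json.loads(stream.read(length).decode("utf-8"))


if __name__ == "__main__":
    # In a real host the streams would be sys.stdin.buffer / sys.stdout.buffer;
    # an in-memory stream keeps this demo self-contained.
    demo = io.BytesIO(encode_message({"kind": "job_offer", "id": "linked_in_3765452342"}))
    print(read_message(demo))
```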
The hardest part was parsing LinkedIn pages to extract meaningful information. The HTML structure is not very semantic (lots of nested div and span elements with few identifiable class names), is quite hard to reverse-engineer reliably, and changes with UI updates.
For this task especially, PureScript's type system proved incredibly useful.
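To give a rough idea of how explicit types help with unreliable scraped data, here is a small sketch written in Python rather than PureScript, so it is only an analogy and not the extension's parsing code. The JobOffer and ParseFailure types, the build_job_offer function, and the choice of which fields are required are all assumptions made for illustration; the field names are taken from the example record above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass(frozen=True)
class JobOffer:
    id: str
    url: str
    title: str
    company: str
    location: Optional[str]  # assumed optional: a page may not show it


@dataclass(frozen=True)
class ParseFailure:
    missing: List[str]  # names of required fields that could not be extracted


def build_job_offer(raw: Dict[str, Optional[str]]) -> Union[JobOffer, ParseFailure]:
    """Turn loosely extracted strings into a typed record, or report what is missing."""
    required = ["id", "url", "title", "company"]
    missing = [name for name in required if not raw.get(name)]
    if missing:
        # Failing explicitly beats silently writing a half-empty record.
        return ParseFailure(missing)
    return JobOffer(
        id=raw["id"],
        url=raw["url"],
        title=raw["title"],
        company=raw["company"],
        location=raw.get("location") or None,
    )
```

The caller has to handle both outcomes, so a LinkedIn UI change that breaks one extractor shows up as a named missing field instead of corrupt data.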
This extension was mostly written to:
Therefore it has the following caveats:
But it works on my machine 🤷!
I will likely improve on this, e.g. by extracting the LinkedIn parsing code into a standalone PureScript/JavaScript library, but don't hold your breath!
Frontend tests can be run with npm run test.
Native application tests can be run with (cd native && pdm run pytest).