Stage 4 — Additional Info

Status: 🔴 Not built


Overview

Stage 4 is a further enrichment pass that augments jobs and companies with data from external sources: LinkedIn, company registries, and salary benchmark sites.


Planned Classes

Class | File | Role
(not yet designed)

Planned Enrichments

Field                                | Source                      | Target Table
Company size, industry, founded year | LinkedIn / company registry | companies
LinkedIn job URL                     | LinkedIn                    | jobs
Salary benchmark                     | External salary sites       | jobs

Planned Architecture

Follows the same polling pattern as Stages 2 and 3:

  1. A new class polls the jobs table for rows with NULL enrichment columns (e.g. linkedin_url IS NULL)
  2. Fetches a batch of records
  3. Fires the appropriate Apify actor (LinkedIn scraper, company info actor, etc.)
  4. Streams results back and writes enriched fields to the database
  5. Repeats until the queue is empty
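
Since the stage is not yet designed, the loop above can only be sketched under assumptions. The Python example below uses an in-process SQLite connection and a caller-supplied enrich callable that stands in for the Apify actor call; the function name enrich_pending_jobs, the id and title columns, and the batch size are all hypothetical. Only the jobs table and the linkedin_url column come from this page.

```python
import sqlite3
from typing import Callable, Iterable, List, Tuple

Row = Tuple[int, str]  # (job id, job title); both column names are assumed
# Hypothetical stand-in for the Apify actor call: takes a batch, yields (job id, linkedin_url).
Enricher = Callable[[List[Row]], Iterable[Tuple[int, str]]]


def enrich_pending_jobs(conn: sqlite3.Connection, enrich: Enricher, batch_size: int = 50) -> int:
    """Poll jobs for NULL linkedin_url in batches; return the number of rows updated."""
    updated = 0
    last_id = 0  # cursor, so rows the enricher cannot resolve are not re-fetched forever
    while True:
        # Steps 1-2: find the next batch of un-enriched rows.
        batch = conn.execute(
            "SELECT id, title FROM jobs "
            "WHERE linkedin_url IS NULL AND id > ? ORDER BY id LIMIT ?",
            (last_id, batch_size),
        ).fetchall()
        if not batch:
            break  # Step 5: nothing left to process.
        last_id = batch[-1][0]
        # Steps 3-4: hand the batch to the enricher and write back whatever it yields.
        for job_id, url in enrich(batch):
            cur = conn.execute(
                "UPDATE jobs SET linkedin_url = ? WHERE id = ?", (url, job_id)
            )
            updated += cur.rowcount
        conn.commit()
    return updated
```

The id cursor is a design choice of this sketch, not something the plan specifies: without it, a row the external source cannot resolve would stay NULL and be selected again on every iteration.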

Dependencies

  • Stage 2 must have populated company_id and location_id before this stage can run
  • Stage 3 should have populated order_number, so that enrichment targets only confirmed active listings
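
One way these dependencies could be enforced is to build them into the query that selects work for the stage. The sketch below is an assumption, not a settled design: it treats a non-NULL order_number as a hard requirement (the text above only says "should"), assumes all four columns live on the jobs table, and uses a hypothetical id primary key and helper name eligible_job_ids.

```python
import sqlite3

# Assumed filter: Stage 2 and Stage 3 outputs present, Stage 4 output still missing.
ELIGIBLE_JOBS_SQL = """
SELECT id FROM jobs
WHERE company_id IS NOT NULL
  AND location_id IS NOT NULL
  AND order_number IS NOT NULL
  AND linkedin_url IS NULL
ORDER BY id
"""


def eligible_job_ids(conn: sqlite3.Connection) -> list:
    """Return ids of jobs that earlier stages have finished and Stage 4 has not enriched."""
    return [row[0] for row in conn.execute(ELIGIBLE_JOBS_SQL)]
```

Filtering in the query means a job that Stage 2 or Stage 3 has not reached yet is skipped on this pass and picked up on a later poll.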