Implementation

Stages

| Stage | Status | Classes | Description |
|---|---|---|---|
| 1a: Real-time scraping | 🔴 Not built | | Incremental scrape every 30 min. Stops when cached URLs are detected. |
| 1b: Full refresh | 🟠 Needs improvements | JobsAustriaETLCache | Full scrape every Monday. Deduplicates via INSERT IGNORE. |
| 2: Payload Sync | 🟠 In progress | JobsAustriaCacheProcess, JobsAustriaCacheSynchronizer | Imports data_payload JSON into jobs, companies, locations. Syncs FKs. |
| 3: Detail Enrichment | 🟠 In progress | JobsAustriaDetailsETL, PortalRouter | Scrapes full AMS detail pages via Apify. Writes to jobs and descriptions. |
| 4: Additional Info | 🔴 Not built | | LinkedIn data, company firmographics, salary benchmarks. |
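
The two scraping modes in stages 1a and 1b can be illustrated with a minimal sketch. This is not the project's code: the `cache` table, its columns, and the function names are hypothetical, and SQLite's `INSERT OR IGNORE` stands in for the `INSERT IGNORE` statement named in the table. The incremental variant also assumes listings arrive newest first, so the first cached URL means everything after it is already known.

```python
import sqlite3
from typing import Iterable, Tuple

Listing = Tuple[str, str]  # (url, data_payload as a JSON string)


def open_cache() -> sqlite3.Connection:
    """Create an in-memory cache keyed by listing URL (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cache (url TEXT PRIMARY KEY, data_payload TEXT NOT NULL)"
    )
    return conn


def full_refresh(conn: sqlite3.Connection, listings: Iterable[Listing]) -> int:
    """Stage 1b style: try to insert every listing, skipping duplicate URLs."""
    before = conn.total_changes
    # INSERT OR IGNORE is SQLite's counterpart to MySQL's INSERT IGNORE:
    # rows that violate the primary key are dropped instead of raising.
    conn.executemany(
        "INSERT OR IGNORE INTO cache (url, data_payload) VALUES (?, ?)",
        listings,
    )
    conn.commit()
    return conn.total_changes - before  # number of rows actually inserted


def incremental_scrape(conn: sqlite3.Connection, listings: Iterable[Listing]) -> int:
    """Stage 1a style: insert new listings and stop at the first cached URL."""
    inserted = 0
    for url, payload in listings:
        cached = conn.execute(
            "SELECT 1 FROM cache WHERE url = ?", (url,)
        ).fetchone()
        if cached:
            break  # assumption: everything older is already in the cache
        conn.execute(
            "INSERT INTO cache (url, data_payload) VALUES (?, ?)", (url, payload)
        )
        inserted += 1
    conn.commit()
    return inserted
```

Under these assumptions, the incremental run stays cheap because it stops early, while the weekly full refresh acts as a safety net that picks up anything the early stop missed.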

Notes

  • 🔴 = not yet built
  • 🟠 = built but has known issues or is actively being extended
  • 🟢 = production-ready (no stages are at this level yet)
  • Stages run in order — each stage depends on the output of the previous one
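
The ordering rule in the last note can be sketched as a small runner that feeds each stage the previous stage's output and stops at the first failure. The `run_pipeline` function and the toy stages are hypothetical and only illustrate the dependency; they are not part of the classes listed above.

```python
from typing import Any, Callable, List, Optional, Sequence, Tuple

Stage = Tuple[str, Callable[[Any], Any]]  # (stage name, step function)


def run_pipeline(
    stages: Sequence[Stage], initial: Any = None
) -> Tuple[List[str], Optional[str]]:
    """Run stages in order; return (completed stage names, failed stage or None)."""
    completed: List[str] = []
    data = initial
    for name, step in stages:
        try:
            # Each stage consumes the output of the one before it.
            data = step(data)
        except Exception:
            # Later stages depend on this output, so they are not attempted.
            return completed, name
        completed.append(name)
    return completed, None


if __name__ == "__main__":
    # Toy stand-ins for the real stages, for illustration only.
    toy_stages: List[Stage] = [
        ("1b", lambda _: ["https://example.invalid/job/1"]),
        ("2", lambda urls: {"jobs": len(urls)}),
        ("3", lambda counts: {**counts, "descriptions": counts["jobs"]}),
    ]
    print(run_pipeline(toy_stages))
```

Stopping at the first failure keeps a later stage from running on stale or partial input from an earlier one.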