JobsAustriaCacheSynchronizer

File: jobs_austria_cache_synchronizer.py
Inherits:

Orchestrates two-step synchronization of scrape_cache rows:

1. synchronize_fk_id – links scrape_cache rows to jobs via url_hash and writes jobs.id back into scrape_cache.fk_job_id
2. synchronize_company_id – extracts the company from the payload, syncs it to the companies table, and updates jobs.company_id

Only rows where fk_job_id IS NULL are fetched, so a row is never processed twice: once fk_job_id is filled, the row leaves the queue.
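The first step and the NULL-based queue can be illustrated with a minimal, self-contained sketch. It uses an in-memory SQLite database rather than the project's real storage layer. The table layouts, the `id` columns, the batch `limit`, and the helper names `fetch_fk_pending` and `link_pending` are assumptions made for illustration; they are not taken from jobs_austria_cache_synchronizer.py.

```python
import sqlite3


def fetch_fk_pending(conn, limit=1000):
    """Return (cache_id, url_hash) for scrape_cache rows not yet linked to a job."""
    return conn.execute(
        "SELECT id, url_hash FROM scrape_cache "
        "WHERE fk_job_id IS NULL ORDER BY id LIMIT ?",
        (limit,),
    ).fetchall()


def link_pending(conn, limit=1000):
    """Write jobs.id into scrape_cache.fk_job_id for pending rows; return rows linked."""
    pending = fetch_fk_pending(conn, limit)
    if not pending:
        return 0
    hashes = [url_hash for _, url_hash in pending]
    placeholders = ",".join("?" * len(hashes))
    # Map url_hash -> jobs.id for every pending hash that has a matching job.
    job_ids = dict(
        conn.execute(
            f"SELECT url_hash, id FROM jobs WHERE url_hash IN ({placeholders})",
            hashes,
        ).fetchall()
    )
    to_update = [(job_ids[h], cache_id) for cache_id, h in pending if h in job_ids]
    conn.executemany("UPDATE scrape_cache SET fk_job_id = ? WHERE id = ?", to_update)
    conn.commit()
    return len(to_update)


# Hypothetical schema and sample data, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, url_hash TEXT UNIQUE);
    CREATE TABLE scrape_cache (
        id INTEGER PRIMARY KEY, url_hash TEXT, fk_job_id INTEGER
    );
    INSERT INTO jobs (id, url_hash) VALUES (10, 'aaa'), (11, 'bbb');
    INSERT INTO scrape_cache (id, url_hash) VALUES (1, 'aaa'), (2, 'bbb'), (3, 'zzz');
    """
)

first = link_pending(conn)   # links the two rows whose url_hash matches a job
second = link_pending(conn)  # linked rows are no longer fetched
print(first, second)
```

In this sketch a cache row whose url_hash has no matching job keeps fk_job_id NULL and is fetched again on the next pass. Whether the real class handles such rows differently is not stated here.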

Class Diagram

classDiagram
    class JobsAustriaCacheSynchronizer {
        +__init__()
        -_fetch_fk_pending_batch()
        -_fetch_payload_pending_batch()
        +synchronize_fk_id()
        -_bulk_update_scrape_cache_fk()
        +synchronize_company_id()
        -_unpack_payload()
        -_sync_companies()
        -_sync_locations()
        -_extract_portal()
        -_parse_date()
        -_str_or_none()
        -_update_jobs()
        +run_cycle() bool
        +async synchronize()
    }

Methods

Method                           Parameters     Returns
__init__()
_fetch_fk_pending_batch()
_fetch_payload_pending_batch()
synchronize_fk_id()              df
_bulk_update_scrape_cache_fk()   to_update
synchronize_company_id()         df_enriched
_unpack_payload()                df
_sync_companies()                df
_sync_locations()                df
_extract_portal()                url_str
_parse_date()                    val
_str_or_none()                   val
_update_jobs()                   df
run_cycle()                                     bool
synchronize()

Attributes

No class-level attributes.