JobsAustriaCacheSynchronizer
File: jobs_austria_cache_synchronizer.py
Inherits: —
Orchestrates a two-step synchronization of scrape_cache rows:
1. synchronize_fk_id – links scrape_cache rows to jobs via url_hash and writes jobs.id back into scrape_cache.fk_job_id.
2. synchronize_company_id – extracts the company from the payload, syncs it to the companies table, and updates jobs.company_id.
Only rows where fk_job_id IS NULL are ever fetched, so a row is never processed twice: once fk_job_id is filled, the row leaves the queue.
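The first step and the queue behaviour can be sketched with the standard-library sqlite3 module. This is a minimal illustration, not the actual implementation: the table layouts, the id columns, the batch_size parameter, and the free-function forms of fetch_fk_pending_batch and synchronize_fk_id are assumptions. Only the names scrape_cache, jobs, url_hash, and fk_job_id come from the description above.

```python
import sqlite3


def fetch_fk_pending_batch(conn, batch_size=100):
    """Return scrape_cache rows that are not linked to a job yet."""
    return conn.execute(
        "SELECT id, url_hash FROM scrape_cache "
        "WHERE fk_job_id IS NULL ORDER BY id LIMIT ?",
        (batch_size,),
    ).fetchall()


def synchronize_fk_id(conn, pending):
    """Match pending rows to jobs by url_hash and write jobs.id back."""
    to_update = []
    for cache_id, url_hash in pending:
        row = conn.execute(
            "SELECT id FROM jobs WHERE url_hash = ?", (url_hash,)
        ).fetchone()
        if row is not None:
            to_update.append((row[0], cache_id))
    # Bulk write-back: one UPDATE per matched row, committed together.
    conn.executemany(
        "UPDATE scrape_cache SET fk_job_id = ? WHERE id = ?", to_update
    )
    conn.commit()
    return len(to_update)


# Assumed minimal schema, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (id INTEGER PRIMARY KEY, url_hash TEXT UNIQUE);
    CREATE TABLE scrape_cache (
        id INTEGER PRIMARY KEY, url_hash TEXT, fk_job_id INTEGER
    );
    INSERT INTO jobs (id, url_hash) VALUES (10, 'aaa'), (11, 'bbb');
    INSERT INTO scrape_cache (id, url_hash)
        VALUES (1, 'aaa'), (2, 'bbb'), (3, 'zzz');
""")

first_pass = synchronize_fk_id(conn, fetch_fk_pending_batch(conn))
second_batch = fetch_fk_pending_batch(conn)
```

In this sketch, linked rows drop out of the next fetch because of the IS NULL filter, while a row whose url_hash has no matching job stays pending and is fetched again on the following pass.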
Class Diagram
classDiagram
class JobsAustriaCacheSynchronizer {
+__init__()
-_fetch_fk_pending_batch()
-_fetch_payload_pending_batch()
+synchronize_fk_id()
-_bulk_update_scrape_cache_fk()
+synchronize_company_id()
-_unpack_payload()
-_sync_companies()
-_sync_locations()
-_extract_portal()
-_parse_date()
-_str_or_none()
-_update_jobs()
+run_cycle() bool
+async synchronize()
}
Methods
| Method | Parameters | Returns |
|---|---|---|
| `__init__()` | — | — |
| `_fetch_fk_pending_batch()` | — | — |
| `_fetch_payload_pending_batch()` | — | — |
| `synchronize_fk_id()` | df | — |
| `_bulk_update_scrape_cache_fk()` | to_update | — |
| `synchronize_company_id()` | df_enriched | — |
| `_unpack_payload()` | df | — |
| `_sync_companies()` | df | — |
| `_sync_locations()` | df | — |
| `_extract_portal()` | url_str | — |
| `_parse_date()` | val | — |
| `_str_or_none()` | val | — |
| `_update_jobs()` | df | — |
| `run_cycle()` | — | bool |
| `synchronize()` | — | — |
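The relationship between run_cycle() and the async synchronize() is not documented here, so the following is only a guess at one plausible control flow. It assumes that the bool returned by run_cycle() means "a batch was processed" and that synchronize() keeps calling it until the queue is empty. CycleRunner, idle_sleep, and max_idle_cycles are hypothetical names introduced for illustration.

```python
import asyncio


class CycleRunner:
    """Hypothetical stand-in for the synchronizer's control flow."""

    def __init__(self, batches):
        self._batches = list(batches)  # pretend queue of pending batches
        self.processed = []

    def run_cycle(self):
        """Process one batch; return True if work was done (assumed meaning)."""
        if not self._batches:
            return False
        self.processed.append(self._batches.pop(0))
        return True

    async def synchronize(self, idle_sleep=0.01, max_idle_cycles=1):
        """Call run_cycle until it reports no work max_idle_cycles times in a row."""
        idle = 0
        while idle < max_idle_cycles:
            # Run the blocking cycle in a worker thread so the event loop stays free.
            did_work = await asyncio.to_thread(self.run_cycle)
            if did_work:
                idle = 0
            else:
                idle += 1
                await asyncio.sleep(idle_sleep)


runner = CycleRunner([["row1", "row2"], ["row3"]])
asyncio.run(runner.synchronize())
```

A long-running service would more likely loop indefinitely and sleep between empty cycles; the max_idle_cycles cutoff exists only so that this example terminates.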
Attributes
No class-level attributes.