Python Developer (Scrapy + Zyte)

TYPE OF WORK

Gig

SALARY

100000

HOURS PER WEEK

8

DATE UPDATED

May 25, 2025

JOB OVERVIEW

We are seeking a Senior Python Developer with expertise in Scrapy and API-based data extraction to build scalable, reliable scrapers deployed on Zyte Scrapy Cloud. You will primarily utilise websites' APIs to ensure efficiency and scalability, while managing data pipelines and ensuring compliance with ethical and legal standards.

Key Responsibilities
- Develop and deploy Scrapy spiders on Zyte Scrapy Cloud with a focus on API-based data extraction.
- Handle API challenges like rate limiting, pagination, and authentication (e.g., API keys, OAuth).
- Optimise spiders for performance, scalability, and minimal API/server load.
- Use Zyte Smart Proxy Manager for IP rotation and session handling as needed.
- Build and maintain pipelines for cleaning, validating, and storing data in cloud databases.
- Monitor scraper performance and resolve issues using Zyte Scrapy Cloud's tools.
- Manage the ongoing ingestion of data into our data pipelines.
- Ensure compliance with terms of service and ethical scraping practices.

Requirements
- 5+ years of Python development experience with 3+ years in Scrapy.
- Proven expertise in API integration (REST, GraphQL), including handling challenges such as rate limiting.
- Experience deploying and monitoring spiders on Zyte Scrapy Cloud.
- Proficiency in working with databases (e.g., PostgreSQL) and optimising Scrapy pipelines.
- Familiarity with Zyte Smart Proxy Manager and handling anti-scraping techniques.
- Strong understanding of HTTP, JSON, and web technologies.

Preferred Skills
- Experience with GraphQL APIs and Zyte AutoExtract API.
- Familiarity with JavaScript-heavy page rendering (e.g., Splash, Playwright).
- Knowledge of distributed scraping or big data tools (e.g., Scrapy-Cluster, Airflow).

Benefits
- Competitive salary and bonuses.
- Flexible working hours and remote-friendly environment.
- Opportunity to work on innovative, large-scale data extraction projects.
