Gig
100000
8
May 25, 2025
We are seeking a Senior Python Developer with expertise in Scrapy and API-based data extraction to build scalable, reliable scrapers deployed on Zyte Scrapy Cloud. You will primarily utilise websites' APIs to ensure efficiency and scalability, while managing data pipelines and ensuring compliance with ethical and legal standards.
Key Responsibilities
- Develop and deploy Scrapy spiders on Zyte Scrapy Cloud with a focus on API-based data extraction.
- Handle API challenges like rate limiting, pagination, and authentication (e.g., API keys, OAuth).
- Optimise spiders for performance, scalability, and minimal API/server load.
- Use Zyte Smart Proxy Manager for IP rotation and session handling as needed.
- Build and maintain pipelines for cleaning, validating, and storing data in cloud databases.
- Monitor scraper performance and resolve issues using Zyte Scrapy Cloud’s tools.
- Manage the ongoing ingestion of data into our data pipelines.
- Ensure compliance with terms of service and ethical scraping practices.
Requirements
- 5+ years of Python development experience with 3+ years in Scrapy.
- Proven expertise in API integration (REST, GraphQL) and in handling challenges like rate limiting.
- Experience deploying and monitoring spiders on Zyte Scrapy Cloud.
- Proficiency in working with databases (e.g., PostgreSQL) and optimising Scrapy pipelines.
- Familiarity with Zyte Smart Proxy Manager and handling anti-scraping techniques.
- Strong understanding of HTTP, JSON, and web technologies.
Preferred Skills
- Experience with GraphQL APIs and the Zyte AutoExtract API.
- Familiarity with JavaScript-heavy page rendering (e.g., Splash, Playwright).
- Knowledge of distributed scraping or big data tools (e.g., Scrapy-Cluster, Airflow).
Benefits
- Competitive salary and bonuses.
- Flexible working hours and remote-friendly environment.
- Opportunity to work on innovative, large-scale data extraction projects.