About the Role
Mantis Analytics is an AI-powered threat intelligence and simulation platform that helps enterprises and public-sector organizations detect, analyze, and forecast real-world risks affecting operations and supply chains.
This role is the engineering backbone of an ML team. The model, analytics, and LLM-ops work sits with others; your job is to make everything around it solid and fast: clean Python, reliable pipelines, well-built services, the news-scraping system, and the data products behind our dashboards.
You’ll integrate model-serving endpoints into real systems, move data correctly between GCP components, and turn prototypes into things that hold up in production. You’ll work closely with infra and data engineering, owning the application-side Python that ties it all together.
Responsibilities
* Write clean, performant, well-structured Python that the rest of the team builds on.
* Build and maintain the news-scraping system: classical scraping first, with clean hooks for an agentic fallback.
* Build ETL and data pipelines feeding topic modelling and dashboards (big-data scale, embedding generation, clustering inputs).
* Integrate model-serving endpoints into services: wire up vLLM / OpenAI-compatible APIs, handle resource lifecycle (start/stop, refcounting, graceful shutdown for self-hosted).
* Build the data layer behind dashboards: reliable, queryable, and fast.
* Run jobs on GCP (Cloud Run, Batch, Workflows) with correct auth (OIDC / OAuth, service accounts).
* Design clean abstractions so scraping, retrieval, and serving components stay decoupled and testable.
What You Need
* Strong Python fundamentals: fast, readable, well-structured code; you know how to lay out a codebase.
* Solid web-scraping experience (requests / Playwright / Puppeteer; handling messy real-world sites).
* Comfortable with cloud infrastructure, ideally GCP (Cloud Run, Batch, GCS, BigQuery).
* Experience moving data at scale: ETL, Parquet / columnar formats, efficient I/O.
* Clean third-party API / SDK integration with proper error handling and auth.
* A pragmatic ship-it mindset balanced with genuine care for code quality.
Nice to Have
* Exposure to LLM serving (vLLM) or any LLMOps-adjacent work.
* Experience with embeddings / clustering pipelines or dashboard data backends.
* Familiarity with orchestration (Airflow or similar) and version control best practices.
* Containers and CI.
What You’ll Learn Here
You’ll work across a real ML production stack (scraping, ETL, embedding pipelines, serving integration, and dashboard data layers) with hands-on knowledge sharing from the team.