Project Description
Are you passionate about improving the way Machine Learning systems are developed, deployed, and scaled in real-world production environments? We are collaborating with a leading European Online Fashion & Beauty Retailer to find a highly capable and self-driven Machine Learning Engineer (MLE/MLOps Focus) to join a fast-moving and impactful team.
This role is centered around building robust ML workflows, streamlining feature creation, and standardizing ML components to ensure scalability, consistency, and speed across the organization. You’ll work at the intersection of engineering and data science, playing a key part in shaping how machine learning is delivered at scale.
Responsibilities
* Build and automate end-to-end ML workflows using Airflow (MWAA), integrating with feature stores, training, fine-tuning, deployment, and monitoring systems
* Implement ML model lifecycle management using MLflow and/or SageMaker (preferred)
* Design and deploy infrastructure for ML workflows using Infrastructure-as-Code tools such as CloudFormation and YAML
* Monitor and maintain ML services, leveraging observability tools (Grafana, custom metric logging, drift detection, and alerting)
* Ensure robust testing of ML pipelines, including data validation, integration/unit tests, and performance monitoring within CI/CD workflows
* Champion MLOps best practices and help enforce security and compliance standards (e.g., secrets management, training data retention)
* Contribute to central feature store development and component standardization efforts
* Integrate and optimize OpenAI APIs in production environments, with a focus on prompt engineering and token handling
* Deploy containerized ML applications using Docker and Kubernetes
Skills Required
* 5+ years of experience in Machine Learning Engineering or MLOps roles
* Solid Python development skills
* Strong hands-on experience with Airflow (MWAA), MLflow, and/or SageMaker
* Familiarity with ML observability tools such as Grafana, custom metric logging, model drift detection, and alerting mechanisms
* Proficiency in building CI/CD pipelines for ML systems with automated testing and validation
* Experience with Infrastructure-as-Code tools (CloudFormation, YAML)
* Understanding of secure and compliant deployment of ML pipelines
* Excellent debugging and problem-solving skills
* Experience with OpenAI API usage in production, containerization, and Kubernetes orchestration is highly valued
Project Age: Ongoing project
Preferred / Allowed work schedule: 9 am to 6 pm CEST
The project’s purpose is to improve the end-to-end machine learning lifecycle across multiple teams by:
* Automating and orchestrating ML workflows (from training to deployment)
* Introducing and maintaining experiment tracking and model versioning
* Ensuring robust observability, testing, and CI/CD for ML systems
* Standardizing and scaling reusable ML components (e.g., centralized feature stores)
* Supporting secure and compliant deployment practices across the platform
* Enabling integration with LLMs (e.g., OpenAI APIs) as part of the production stack
The team is responsible for:
* Designing and maintaining reusable, production-grade ML pipelines using Airflow (MWAA)
* Managing and scaling infrastructure via CloudFormation and Kubernetes
* Implementing robust MLOps practices (versioning, monitoring, automated testing)
* Tracking and managing ML experiments using MLflow and/or SageMaker
* Observing, detecting, and responding to model drift in production
* Integrating OpenAI APIs with optimized prompts and token handling
* Establishing best practices for secrets management and compliance in ML workflows
* Supporting centralized feature management efforts across the organization