Machine Learning Operations Engineer
Interswitch Group
Software Engineering, Operations
Lagos, Nigeria
Posted on Dec 15, 2024
Job Summary
We are seeking a highly skilled AI/MLOps Developer to join our team. The ideal candidate will have a strong background in machine learning operations, model deployment, and managing ML pipelines. This role requires a deep understanding of AI/ML workflows and the ability to optimize and maintain AI models in production.
Key Responsibilities
- Model Development and Deployment: Design, develop, and deploy machine learning models into production environments, ensuring scalability and reliability.
- Pipeline Automation: Build and maintain automated ML pipelines to streamline model training, testing, and deployment processes.
- Monitoring and Optimization: Monitor deployed models for performance and accuracy, implementing retraining and optimization strategies as needed.
- Infrastructure Management: Manage cloud-based and on-premise infrastructure to support the training and deployment of ML models, ensuring cost efficiency and scalability.
- Security and Compliance: Implement and maintain robust security practices to protect sensitive data and ensure compliance with relevant regulations.
- Collaboration: Work closely with data scientists, software engineers, and other stakeholders to integrate ML models into applications and systems.
- Documentation: Maintain comprehensive documentation of ML pipelines, deployment processes, and model performance metrics.
- Continuous Improvement: Stay updated with the latest advancements in AI/ML and DevOps practices, and apply this knowledge to improve existing systems and processes.
- Troubleshooting: Promptly identify and resolve issues related to model deployment, performance, and infrastructure.
Qualifications
- Education: Bachelor’s degree in Computer Science, Engineering, or a related field, or equivalent experience.
- Experience: Proven experience in AI/MLOps or a similar role, with hands-on experience in deploying and managing machine learning models.
- Technical Skills:
  - Proficiency in Python and familiarity with ML frameworks such as TensorFlow, PyTorch, or scikit-learn.
  - Experience with containerization and orchestration tools such as Docker and Kubernetes.
  - Knowledge of cloud platforms (AWS, Azure, GCP) and their ML services.
  - Familiarity with CI/CD tools and processes for ML pipelines.
  - Strong understanding of version control with Git and of hosting platforms such as Bitbucket.
  - Experience with monitoring and logging tools for model performance tracking.
- Experience with advanced topics like distributed training, model explainability, and ethical AI considerations.
- Familiarity with data engineering practices and tools like Apache Spark or Hadoop.
- Contributions to open-source projects related to AI/MLOps.
- Understanding of software development best practices, including Agile methodologies.