Course Outline

Preparing Machine Learning Models for Deployment

  • Packaging models with Docker.
  • Exporting models from TensorFlow and PyTorch.
  • Considerations for versioning and storage.

Model Serving on Kubernetes

  • Overview of inference servers.
  • Deploying TensorFlow Serving and TorchServe.
  • Setting up model endpoints.

Inference Optimization Techniques

  • Batching strategies.
  • Handling concurrent requests.
  • Tuning for latency and throughput.

Autoscaling ML Workloads

  • Horizontal Pod Autoscaler (HPA).
  • Vertical Pod Autoscaler (VPA).
  • Kubernetes Event-Driven Autoscaling (KEDA).

GPU Provisioning and Resource Management

  • Configuring GPU nodes.
  • Overview of the NVIDIA device plugin.
  • Setting resource requests and limits for ML workloads.

Model Rollout and Release Strategies

  • Blue/green deployments.
  • Canary rollout patterns.
  • A/B testing for model evaluation.

Monitoring and Observability for ML in Production

  • Metrics for inference workloads.
  • Logging and tracing practices.
  • Dashboards and alerting.

Security and Reliability Considerations

  • Securing model endpoints.
  • Network policies and access control.
  • Ensuring high availability.

Summary and Next Steps

Requirements

  • A solid understanding of containerized application workflows.
  • Hands-on experience with Python-based machine learning models.
  • Familiarity with the fundamentals of Kubernetes.

Audience

  • ML engineers.
  • DevOps engineers.
  • Platform engineering teams.

Duration

  • 14 hours.
