Deploying AI Smarter - LLM Scalability, ML-Ops, and Cost Efficiency
- 1. Introduction
- 1. Introduction & Welcome
- 2. Getting Started
- 1. Course Structure: How to Get the Most out of This Course
  - 2. Environment Setup: Preparing and Using the Course Resources Correctly
- 3. Pre-Deployment Strategies
- 1. Ensuring Model Correctness: Evaluation Techniques
  - 2. Performance Optimization: Exploring Key Dimensions
  - 3. Balancing Speed and Accuracy: Best Practices
- 4. Advanced Model Management with ML-Ops
- 1. Fundamentals of ML Model Management and ML-Ops
- 2. Overview of Effective ML-Ops Frameworks
- 3. Setting up an ML-Ops Framework: Introduction to MLflow (Practical)
  - 4. Getting Started with MLflow: A Practical Approach (Practical)
  - 5. Training Models with MLflow: A Hands-On Guide (Practical)
  - 6. MLflow for Model Inference: Techniques and Practices (Practical)
  - 7. Advanced Techniques in MLflow: Extending Functionality (Practical)
- 5. Advanced Model Deployment Techniques
- 1. Efficiency through Batching and Dynamic Batches
- 2. Hands-on Application of Batching Techniques (Practical)
- 3. The Role of Sorting in Model Deployment (Practical)
- 4. Leveraging Quantization for Model Efficiency (Practical)
- 5. Inference Strategies: Parallelism, Flash Attention, GPTQ & AWQ
  - 6. Next-Gen Scaling: LoRA, Paged Attention, ZeRO
- 6. The Economics of Machine Learning Inference
- 1. The Broader Context of AI: A Wider Perspective
  - 2. Measuring Performance: Key Metrics for Large AI Projects
- 3. Evaluating Deployment Strategies for Cost & Efficiency
- 4. Real-World Benchmarks for Success: Case Studies and Insights
- 7. Effective Cluster Management for Large-Scale ML Deployments
- 1. Basic Inference - First Levels of Deployment (Practical)
- 2. Entering Optimizations - Advanced Levels of Deployment (Practical)
- 3. Setting Up Data Access in Distributed Environments (Practical)
- 4. Distributing Data Across a Cluster with RabbitMQ (Practical)
- 5. Foundations of Distributed Computing with Ray (Practical)
- 6. Scaling Large Language Models on a Cluster (Practical)
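As a small preview of the dynamic-batching topic covered in Section 5, the sketch below groups incoming requests into batches bounded by both batch size and total input size. It is a minimal, pure-Python illustration, not code from the course: `run_model` is a hypothetical stand-in for a real model call, and character counts stand in for token counts.

```python
# Minimal dynamic-batching sketch (illustrative only, not course code).
from typing import Iterator, List


def run_model(batch: List[str]) -> List[int]:
    # Hypothetical "model": returns the length of each input string.
    return [len(x) for x in batch]


def dynamic_batches(
    requests: List[str], max_batch_size: int, max_tokens: int
) -> Iterator[List[str]]:
    """Group requests into batches bounded by item count and total size."""
    batch: List[str] = []
    batch_tokens = 0
    for req in requests:
        # Flush the current batch if adding this request would exceed a limit.
        if batch and (
            len(batch) >= max_batch_size or batch_tokens + len(req) > max_tokens
        ):
            yield batch
            batch, batch_tokens = [], 0
        batch.append(req)
        batch_tokens += len(req)
    if batch:
        yield batch


# Usage: batch four requests, at most 2 per batch and 10 "tokens" total.
results: List[int] = []
for b in dynamic_batches(["hi", "hello", "hey there", "yo"], 2, 10):
    results.extend(run_model(b))
# results == [2, 5, 9, 2]
```

Batching amortizes per-call overhead across requests; real serving stacks additionally batch *dynamically over time*, waiting a short window for more requests to arrive before dispatching.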