Tag: apache

Inner workings of Apache Spark

Here’s a breakdown of key internal aspects of the inner workings of Apache Spark. : 1. Architecture: 2. Execution Model: 3. Data Partitioning: 4. Shuffle Operations: 5. Memory Management: In essence, Spark’s internal workings involve: Understanding these internal mechanisms is key to writing efficient and scalable Spark applications. Read more
MLOps pipeline

While a full-fledged MLOps pipeline involves integrating various tools and platforms, here are some illustrative code snippets demonstrating key MLOps concepts using popular Python libraries and tools. These examples focus on individual stages and can be combined to build a more comprehensive pipeline. 1. Data Versioning with DVC (Data Version Control): This isn’t Python code,… Read more
Developing and training machine learning models within an MLOps framework

The “MLOps training workflow” specifically focuses on the steps involved in developing and training machine learning models within an MLOps framework. It’s a subset of the broader MLOps lifecycle but emphasizes the automation, reproducibility, and tracking aspects crucial for effective model building. Here’s a typical MLOps training workflow: Phase 1: Data Preparation (MLOps Perspective) Phase… Read more

Inner workings of Apache Spark