Building scalable ML workflows


Source: dev.to

A little while back, I wrote a post introducing Tork, an open-source project I've been developing. In a nutshell, Tork is a general-purpose, distributed workflow engine suitable for various workloads. At my work, we use it primarily for CPU/GPU-heavy tasks such as processing digital assets (3D models, videos, images, etc.), and as the CI/CD tool for our internal PaaS.

Recently, I've been thinking about how Tork could be leveraged to run machine learning workloads. I was particularly inspired by the Ollama project and wanted to see if I could do something similar, but using plain old Docker images rather than Ollama's Modelfile.

Given that ML workloads often consist of distinct, interdependent stages—such as data preprocessing, feature extraction, model training, and inference—it’s crucial to have an engine that can orchestrate these steps. These stages frequently require different types of compute resources (e.g., CPUs for preprocessing, GPUs for training) and can benefit greatly from parallelization.
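To make this concrete, here is a sketch of how a multi-stage pipeline might be expressed as a Tork job. The parallel composite task and the queue property (which routes tasks to special-purpose workers, such as GPU machines) are based on Tork's documentation, though treat the exact syntax as an assumption here; the images and scripts are hypothetical placeholders:

name: hypothetical training pipeline
tasks:
  - name: Preprocess the dataset
    image: example/preprocess:latest   # hypothetical image
    run: python3 preprocess.py
  - name: Extract features from two shards in parallel
    parallel:
      tasks:
        - image: example/features:latest   # hypothetical image
          run: python3 features.py --shard 0
        - image: example/features:latest
          run: python3 features.py --shard 1
  - name: Train the model
    image: example/train:latest   # hypothetical image
    queue: gpu   # handled only by workers subscribed to the "gpu" queue
    run: python3 train.py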

Moreover, resiliency is a critical requirement when running machine learning workflows. Interruptions, whether due to hardware failures, network issues, or resource constraints, can result in significant setbacks, especially for long-running processes like model training.
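Tork covers part of this with task-level retries. A minimal sketch, assuming the retry syntax shown in Tork's documentation:

  - name: Train the model
    image: example/train:latest   # hypothetical image
    run: python3 train.py
    retry:
      limit: 3   # re-attempt the task up to 3 times before failing the job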

These requirements are very similar to those of my other, non-ML workloads, so I decided to put all my theories to the test and see what it would take to execute a simple ML workflow on Tork.

The experiment

For this first experiment, let's try to execute a simple sentiment analysis inference task:

Download the latest Tork binary and untar it.
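Binaries are published on the project's GitHub releases page. For example, assuming the standard GitHub release URL pattern and a v-prefixed tag (adjust the version and OS/architecture to match your machine):

curl -L -O https://github.com/runabol/tork/releases/download/v0.1.109/tork_0.1.109_darwin_arm64.tgz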

tar xvzf tork_0.1.109_darwin_arm64.tgz

Start Tork in standalone mode:

./tork run standalone

If all goes well, you should see something like this:

...
10:36PM INF Coordinator listening on http://localhost:8000
...

Next, we need to build a Docker image that contains the model and the necessary inference script. Tork tasks typically execute within a Docker container.

inference.py

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
import os

MODEL_NAME = os.getenv("MODEL_NAME")  # e.g. "distilbert-base-uncased-finetuned-sst-2-english"

def load_model_and_tokenizer(model_name):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    return tokenizer, model

def predict_sentiment(text, tokenizer, model):
    # Tokenize the input, truncating to the model's 512-token limit
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)

    # Inference only; no gradient tracking needed
    with torch.no_grad():
        outputs = model(**inputs)

    # Convert raw logits to probabilities and pick the most likely class
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_label = torch.argmax(predictions, dim=1).item()
    confidence = predictions[0][predicted_label].item()

    return predicted_label, confidence

if __name__ == "__main__":
    tokenizer, model = load_model_and_tokenizer(MODEL_NAME)
    text = os.getenv("INPUT_TEXT")
    label, confidence = predict_sentiment(text, tokenizer, model)
    sentiment_map = {0: "Negative", 1: "Positive"}
    sentiment = sentiment_map[label]
    print(f"{sentiment}")
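Before building the image, you can sanity-check the script locally, assuming transformers and torch are installed in your Python environment:

MODEL_NAME=distilbert-base-uncased-finetuned-sst-2-english \
INPUT_TEXT="Today is a lovely day" \
python3 inference.py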

Dockerfile

FROM huggingface/transformers-pytorch-cpu:latest

WORKDIR /app

COPY inference.py .

# Pre-load the model during image build
RUN python3 -c "from transformers import AutoTokenizer, AutoModelForSequenceClassification; AutoTokenizer.from_pretrained('distilbert-base-uncased-finetuned-sst-2-english'); AutoModelForSequenceClassification.from_pretrained('distilbert-base-uncased-finetuned-sst-2-english')"
Build the image:

docker build -t sentiment-analysis .
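To verify the image itself before involving Tork, you can run the same script directly in the container; the environment variables mirror what the Tork job passes in below:

docker run --rm \
  -e MODEL_NAME=distilbert-base-uncased-finetuned-sst-2-english \
  -e INPUT_TEXT="Today is a lovely day" \
  sentiment-analysis:latest \
  python3 inference.py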

Next, let's create the Tork job to run the inference.

sentiment.yaml

name: Sentiment analysis example
inputs:
  input_text: Today is a lovely day
  model_name: distilbert-base-uncased-finetuned-sst-2-english
output: "{{trim(tasks.sentimentResult)}}"
tasks:
  - name: Run sentiment analysis
    var: sentimentResult
    # the image we created in the previous step,
    # but this can be any image available on Docker Hub
    # or any other image registry
    image: sentiment-analysis:latest
    run: |
      # anything written to the $TORK_OUTPUT file becomes the task's output
      # and is captured into the sentimentResult variable above
      python3 inference.py > $TORK_OUTPUT
    env:
      INPUT_TEXT: "{{inputs.input_text}}"
      MODEL_NAME: "{{inputs.model_name}}"

Submit the job. Tork jobs execute asynchronously; once a job is submitted, you get back a job ID to track its progress:

JOB_ID=$(curl -s \
  -X POST \
  -H "content-type:text/yaml" \
  --data-binary @sentiment.yaml \
  http://localhost:8000/jobs | jq -r .id)

Poll the job's status and wait for it to complete:

while true; do 
  state=$(curl -s http://localhost:8000/jobs/$JOB_ID | jq -r .state)
  echo "Status: $state"
  if [ "$state" = "COMPLETED" ]; then
     break 
  fi 
  sleep 1
done

Inspect the job results:

curl -s http://localhost:8000/jobs/$JOB_ID | jq -r .result
Positive

Try changing the input_text in sentiment.yaml and re-submit the job for different results.

Next steps

Now that I've got this basic proof of concept working on my machine, the next step is to push the Docker image to a registry so that it is available to the Tork workers on my production cluster. But this seems like a viable approach.
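Pushing the image is standard Docker workflow; a sketch, with registry.example.com as a stand-in for your actual registry host:

docker tag sentiment-analysis:latest registry.example.com/sentiment-analysis:latest
docker push registry.example.com/sentiment-analysis:latest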

The code for this article can be found on GitHub.

If you're interested in learning more about Tork:

Documentation: https://www.tork.run
Backend: https://github.com/runabol/tork
Web UI: https://github.com/runabol/tork-web
