Runtime Fabric enables MuleSoft deployments on your own infrastructure with Kubernetes. This guide covers everything from cluster setup to production deployment strategies.
Runtime Fabric Architecture
Understanding RTF components and deployment model:
# RTF Architecture Overview
# =========================
# Controller Nodes: Manage RTF operations (3 minimum for HA)
# Worker Nodes: Run Mule applications (scale based on workload)
#
# Components:
#   - rtf-agent: Manages communication with Anypoint Platform
#   - rtf-deployer: Handles application deployments
#   - rtf-resource-cache: Caches application resources
#   - rtf-persistence-gateway: Manages persistent storage

# Minimum Production Requirements
---
controller_nodes:
  count: 3
  cpu: "2 cores"
  memory: "8GB"
  disk: "60GB SSD"
worker_nodes:
  count: 3  # Minimum for HA
  cpu: "2 cores"
  memory: "15GB"
  disk: "250GB SSD"
networking:
  pod_cidr: "10.244.0.0/16"
  service_cidr: "10.96.0.0/12"
  required_ports:
    - 443   # HTTPS
    - 5044  # Filebeat
    - 32009  # RTF
Cluster Installation
Prerequisites and Installation
#!/bin/bash
# RTF Installation Script
#
# Required environment variables:
#   ANYPOINT_TOKEN      - Anypoint Platform bearer token (installer download)
#   RTF_ACTIVATION_DATA - activation data generated in Runtime Manager
#   MULE_LICENSE_KEY    - Mule license key
#   JOIN_TOKEN          - cluster join token (used by the join steps)
set -euo pipefail

# Fail fast with a clear message if required inputs are missing.
: "${ANYPOINT_TOKEN:?ANYPOINT_TOKEN must be set}"
: "${RTF_ACTIVATION_DATA:?RTF_ACTIVATION_DATA must be set}"
: "${MULE_LICENSE_KEY:?MULE_LICENSE_KEY must be set}"
: "${JOIN_TOKEN:?JOIN_TOKEN must be set}"

# 1. Download RTF installer
#    -f: fail on HTTP errors instead of saving an error page as the binary
curl -fL https://anypoint.mulesoft.com/runtimefabric/api/download/scripts/latest \
  --header "Authorization: Bearer $ANYPOINT_TOKEN" \
  -o rtfctl
chmod +x rtfctl

# 2. Generate activation data from Anypoint Platform
#    This is done in Runtime Manager > Runtime Fabric > Create

# 3. Initialize cluster (on first controller node)
./rtfctl install \
  --activation-data "$RTF_ACTIVATION_DATA" \
  --mule-license "$MULE_LICENSE_KEY" \
  --controller-ips "10.0.1.10,10.0.1.11,10.0.1.12" \
  --worker-ips "10.0.2.10,10.0.2.11,10.0.2.12" \
  --http-proxy "http://proxy.company.com:8080" \
  --no-proxy "localhost,127.0.0.1,.internal.company.com"

# 4. Join additional nodes
# On each additional controller:
./rtfctl join \
  --role controller \
  --token "$JOIN_TOKEN" \
  --server "10.0.1.10:443"

# On each worker:
./rtfctl join \
  --role worker \
  --token "$JOIN_TOKEN" \
  --server "10.0.1.10:443"

# 5. Verify installation
./rtfctl status
./rtfctl appliance status
RTF Configuration
# rtf-config.yaml
---
apiVersion: rtf.mulesoft.com/v1
kind: RuntimeFabricConfig
metadata:
  name: production-rtf
spec:
  # Ingress configuration
  ingress:
    enabled: true
    # Use internal load balancer for private clusters
    serviceType: LoadBalancer
    annotations:
      service.beta.kubernetes.io/aws-load-balancer-internal: "true"
      service.beta.kubernetes.io/aws-load-balancer-cross-zone-load-balancing-enabled: "true"
  # Resource allocation defaults
  defaults:
    cpu:
      limit: "1000m"
      reserved: "500m"
    memory:
      limit: "2Gi"
      reserved: "1Gi"
  # Persistent storage
  persistence:
    storageClass: "gp3"
    accessMode: ReadWriteOnce
  # Log forwarding
  logForwarding:
    enabled: true
    type: "splunk"
    host: "splunk.company.com"
    port: 8088
    # NOTE(review): keep the HEC token out of VCS — inject at deploy time.
    token: "${SPLUNK_HEC_TOKEN}"
    index: "mulesoft"
    source: "rtf"
    sourcetype: "_json"
  # Monitoring
  monitoring:
    enabled: true
    prometheus:
      enabled: true
      serviceMonitor: true
Application Deployment
Deployment Configuration
# deployment-settings.yaml
---
apiVersion: rtf.mulesoft.com/v1
kind: MuleApplication
metadata:
  name: order-api
  namespace: production
spec:
  # Application artifact
  artifact:
    groupId: com.company
    artifactId: order-api
    # Quoted: version strings must stay strings, never floats.
    version: "1.2.0"
    repository: exchange
  # Runtime configuration
  runtime:
    version: "4.6.0"
    java: "17"
    releaseChannel: "LTS"
  # Replica configuration
  replicas:
    count: 3
    updateStrategy:
      type: RollingUpdate
      maxSurge: 1
      maxUnavailable: 0
  # Resource allocation
  resources:
    cpu:
      reserved: "500m"
      limit: "1500m"
    memory:
      reserved: "1000Mi"
      limit: "2000Mi"
  # Properties and secrets
  properties:
    - name: "env"
      value: "production"
    - name: "api.timeout"
      value: "30000"
  secureProperties:
    - name: "db.password"
      secretRef: "order-api-secrets"
      key: "database-password"
    - name: "api.key"
      secretRef: "order-api-secrets"
      key: "api-key"
  # Clustering
  clustering:
    enabled: true
  # Auto-scaling
  autoscaling:
    enabled: true
    minReplicas: 2
    maxReplicas: 10
    metrics:
      - type: cpu
        targetAverageUtilization: 70
      - type: memory
        targetAverageUtilization: 80
  # Health checks
  healthCheck:
    readinessProbe:
      path: /api/health
      port: 8081
      initialDelaySeconds: 30
      periodSeconds: 10
      timeoutSeconds: 5
      failureThreshold: 3
    livenessProbe:
      path: /api/health
      port: 8081
      initialDelaySeconds: 60
      periodSeconds: 20
      timeoutSeconds: 10
      failureThreshold: 3
  # Networking
  networking:
    publicEndpoint:
      enabled: true
      pathPrefix: "/order"
    internalEndpoint:
      enabled: true
Deployment Script
#!/bin/bash
# deploy-to-rtf.sh
#
# Deploys (or updates) a Mule application onto a Runtime Fabric target via the
# Anypoint Application Manager v2 API, then polls until the deployment reaches
# a terminal state.
#
# Usage: ./deploy-to-rtf.sh [version]   (version defaults to "latest")
# Required environment variables: ANYPOINT_USERNAME, ANYPOINT_PASSWORD
set -e
# Configuration
APP_NAME="order-api"
ENVIRONMENT="production"
RTF_NAME="production-rtf"
VERSION="${1:-latest}"
# Anypoint Platform authentication
# NOTE(review): the login JSON is built by raw string interpolation — a
# password containing quotes or backslashes breaks the payload (and is an
# injection risk). Consider `jq -n --arg ...` to build the body safely.
ANYPOINT_TOKEN=$(curl -s -X POST https://anypoint.mulesoft.com/accounts/login \
-H "Content-Type: application/json" \
-d '{
"username": "'$ANYPOINT_USERNAME'",
"password": "'$ANYPOINT_PASSWORD'"
}' | jq -r '.access_token')
# Get organization and environment IDs
ORG_ID=$(curl -s https://anypoint.mulesoft.com/accounts/api/me \
-H "Authorization: Bearer $ANYPOINT_TOKEN" | jq -r '.user.organizationId')
ENV_ID=$(curl -s "https://anypoint.mulesoft.com/accounts/api/organizations/$ORG_ID/environments" \
-H "Authorization: Bearer $ANYPOINT_TOKEN" | jq -r ".data[] | select(.name==\"$ENVIRONMENT\") | .id")
# Get RTF target ID
RTF_ID=$(curl -s "https://anypoint.mulesoft.com/runtimefabric/api/organizations/$ORG_ID/fabrics" \
-H "Authorization: Bearer $ANYPOINT_TOKEN" | jq -r ".[] | select(.name==\"$RTF_NAME\") | .id")
# Build deployment request
# The heredoc below is the Application Manager v2 deployment payload; shell
# variables inside it are expanded (unquoted EOF delimiter).
DEPLOYMENT_REQUEST=$(cat <<EOF
{
"name": "$APP_NAME",
"labels": ["production", "api"],
"target": {
"provider": "MC",
"targetId": "$RTF_ID",
"deploymentSettings": {
"clustered": true,
"enforceDeployingReplicasAcrossNodes": true,
"http": {
"inbound": {
"publicUrl": "https://api.company.com/order"
}
},
"jvm": {},
"runtime": {
"version": "4.6.0",
"releaseChannel": "LTS",
"java": "17"
},
"autoscaling": {
"enabled": true,
"minReplicas": 2,
"maxReplicas": 10
},
"updateStrategy": "rolling",
"resources": {
"cpu": {
"reserved": "500m",
"limit": "1500m"
},
"memory": {
"reserved": "1000Mi",
"limit": "2000Mi"
}
},
"disableAmLogForwarding": false,
"persistentObjectStore": true
},
"replicas": 3
},
"application": {
"ref": {
"groupId": "com.company",
"artifactId": "$APP_NAME",
"version": "$VERSION",
"packaging": "jar"
},
"desiredState": "STARTED",
"configuration": {
"mule.agent.application.properties.service": {
"applicationName": "$APP_NAME",
"properties": {
"env": "production",
"api.autodiscovery.id": "12345"
},
"secureProperties": {
"db.password": "{{secure::db.password}}",
"api.key": "{{secure::api.key}}"
}
}
}
}
}
EOF
)
# Check if application exists
# Drives create-vs-update: PATCH an existing deployment, POST a new one.
EXISTING_APP=$(curl -s "https://anypoint.mulesoft.com/amc/application-manager/api/v2/organizations/$ORG_ID/environments/$ENV_ID/deployments" \
-H "Authorization: Bearer $ANYPOINT_TOKEN" | jq -r ".items[] | select(.name==\"$APP_NAME\") | .id")
if [ -n "$EXISTING_APP" ]; then
echo "Updating existing deployment: $EXISTING_APP"
HTTP_METHOD="PATCH"
DEPLOYMENT_URL="https://anypoint.mulesoft.com/amc/application-manager/api/v2/organizations/$ORG_ID/environments/$ENV_ID/deployments/$EXISTING_APP"
else
echo "Creating new deployment"
HTTP_METHOD="POST"
DEPLOYMENT_URL="https://anypoint.mulesoft.com/amc/application-manager/api/v2/organizations/$ORG_ID/environments/$ENV_ID/deployments"
fi
# Deploy
RESPONSE=$(curl -s -X $HTTP_METHOD "$DEPLOYMENT_URL" \
-H "Authorization: Bearer $ANYPOINT_TOKEN" \
-H "Content-Type: application/json" \
-d "$DEPLOYMENT_REQUEST")
# NOTE(review): $RESPONSE is unquoted here — subject to word splitting and
# glob expansion; should be: echo "$RESPONSE" | jq -r '.id'
DEPLOYMENT_ID=$(echo $RESPONSE | jq -r '.id')
echo "Deployment initiated: $DEPLOYMENT_ID"
# Wait for deployment to complete
echo "Waiting for deployment to complete..."
# NOTE(review): this poll loop has no timeout — a deployment stuck in a
# non-terminal state hangs the script (and the CI job) forever. Consider a
# max-attempts counter around the loop.
while true; do
STATUS=$(curl -s "https://anypoint.mulesoft.com/amc/application-manager/api/v2/organizations/$ORG_ID/environments/$ENV_ID/deployments/$DEPLOYMENT_ID" \
-H "Authorization: Bearer $ANYPOINT_TOKEN" | jq -r '.application.status')
echo "Current status: $STATUS"
# Exit 0 on RUNNING, exit 1 on failure; any other status keeps polling.
case $STATUS in
"RUNNING")
echo "✅ Deployment successful!"
exit 0
;;
"DEPLOYMENT_FAILED"|"FAILED")
echo "❌ Deployment failed!"
exit 1
;;
esac
sleep 10
done
Resource Management
Node Affinity and Resource Quotas
# resource-quota.yaml
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: mule-apps-quota
  namespace: production
spec:
  hard:
    requests.cpu: "20"
    requests.memory: "40Gi"
    limits.cpu: "40"
    limits.memory: "80Gi"
    pods: "50"
---
# priority-class.yaml
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: mule-critical
value: 1000000
globalDefault: false
description: "Critical Mule applications"
---
# node-affinity.yaml
apiVersion: rtf.mulesoft.com/v1
kind: DeploymentOverride
metadata:
  name: order-api-override
spec:
  selector:
    matchLabels:
      app: order-api
  template:
    spec:
      priorityClassName: mule-critical
      affinity:
        # Hard requirement: only schedule on dedicated production nodes.
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: workload-type
                    operator: In
                    values:
                      - mule-production
        # Soft preference: spread replicas across nodes for HA.
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
            - weight: 100
              podAffinityTerm:
                labelSelector:
                  matchExpressions:
                    - key: app
                      operator: In
                      values:
                        - order-api
                topologyKey: kubernetes.io/hostname
      tolerations:
        - key: "dedicated"
          operator: "Equal"
          value: "mule"
          effect: "NoSchedule"
Monitoring and Alerting
Prometheus Metrics
# prometheus-rules.yaml
---
apiVersion: monitoring.coreos.com/v1
kind: PrometheusRule
metadata:
  name: mule-rtf-alerts
  namespace: monitoring
spec:
  groups:
    - name: mule-application-alerts
      rules:
        - alert: MuleAppHighCPU
          expr: |
            avg(rate(container_cpu_usage_seconds_total{
              namespace=~".*mule.*",
              container!=""
            }[5m])) by (pod) > 0.8
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High CPU usage on Mule app {{ $labels.pod }}"
            description: "CPU usage is above 80% for 5 minutes"
        - alert: MuleAppHighMemory
          expr: |
            container_memory_working_set_bytes{
              namespace=~".*mule.*",
              container!=""
            } / container_spec_memory_limit_bytes > 0.85
          for: 5m
          labels:
            severity: warning
          annotations:
            summary: "High memory usage on Mule app {{ $labels.pod }}"
        - alert: MuleAppReplicasDown
          expr: |
            kube_deployment_status_replicas_available{
              namespace=~".*mule.*"
            } < kube_deployment_spec_replicas
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "Mule app {{ $labels.deployment }} has unavailable replicas"
        - alert: MuleAppRestartLoop
          expr: |
            increase(kube_pod_container_status_restarts_total{
              namespace=~".*mule.*"
            }[1h]) > 5
          labels:
            severity: critical
          annotations:
            summary: "Mule app {{ $labels.pod }} is in restart loop"
    - name: rtf-infrastructure-alerts
      rules:
        - alert: RTFNodeNotReady
          expr: |
            kube_node_status_condition{
              condition="Ready",
              status="true"
            } == 0
          for: 5m
          labels:
            severity: critical
          annotations:
            summary: "RTF node {{ $labels.node }} is not ready"
        - alert: RTFAgentDown
          expr: |
            up{job="rtf-agent"} == 0
          for: 2m
          labels:
            severity: critical
          annotations:
            summary: "RTF agent is down"
Grafana Dashboard
{
"dashboard": {
"title": "MuleSoft RTF Overview",
"panels": [
{
"title": "Application Status",
"type": "stat",
"targets": [
{
"expr": "count(kube_deployment_status_replicas_available{namespace=~\".*mule.*\"} > 0)",
"legendFormat": "Running Apps"
}
]
},
{
"title": "CPU Usage by Application",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(container_cpu_usage_seconds_total{namespace=~\".*mule.*\", container!=\"\"}[5m])) by (pod)",
"legendFormat": "{{ pod }}"
}
]
},
{
"title": "Memory Usage by Application",
"type": "timeseries",
"targets": [
{
"expr": "sum(container_memory_working_set_bytes{namespace=~\".*mule.*\", container!=\"\"}) by (pod) / 1024 / 1024",
"legendFormat": "{{ pod }} (MB)"
}
]
},
{
"title": "Request Rate",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(http_server_requests_seconds_count{namespace=~\".*mule.*\"}[5m])) by (app)",
"legendFormat": "{{ app }}"
}
]
},
{
"title": "Error Rate",
"type": "timeseries",
"targets": [
{
"expr": "sum(rate(http_server_requests_seconds_count{namespace=~\".*mule.*\", status=~\"5..\"}[5m])) by (app) / sum(rate(http_server_requests_seconds_count{namespace=~\".*mule.*\"}[5m])) by (app) * 100",
"legendFormat": "{{ app }} error %"
}
]
}
]
}
}
Production Best Practices
# production-checklist.yaml
---
production_readiness:
  infrastructure:
    - name: "High Availability"
      requirements:
        - "Minimum 3 controller nodes across availability zones"
        - "Minimum 3 worker nodes across availability zones"
        - "External load balancer with health checks"
        - "Persistent storage with replication"
    - name: "Security"
      requirements:
        - "Network policies restricting pod-to-pod traffic"
        - "Secrets stored in external vault (HashiCorp, AWS Secrets)"
        - "TLS termination at ingress"
        - "Pod security policies enabled"
        - "Image scanning in CI/CD pipeline"
    - name: "Monitoring"
      requirements:
        - "Centralized logging (Splunk/ELK)"
        - "Metrics collection (Prometheus)"
        - "Alerting configured and tested"
        - "Dashboard for operations team"
  application:
    - name: "Deployment"
      requirements:
        - "Minimum 2 replicas for HA"
        - "Pod anti-affinity configured"
        - "Resource limits defined"
        - "Health checks configured"
        - "Rolling update strategy"
    - name: "Configuration"
      requirements:
        - "Externalized configuration"
        - "Secure properties in secrets"
        - "Environment-specific configs"
        - "API autodiscovery enabled"
    - name: "Testing"
      requirements:
        - "Integration tests pass"
        - "Load testing completed"
        - "Failover testing completed"
        - "Rollback procedure documented"
Conclusion
Runtime Fabric provides enterprise-grade MuleSoft deployment on your infrastructure. Proper cluster sizing, application configuration, and monitoring are essential for production success. Follow security best practices, implement comprehensive monitoring, and maintain documented procedures for operations. RTF gives you full control while leveraging Anypoint Platform's management capabilities.