Production-Ready AI Applications with SAP AI Core
In our previous posts, we built a Support Ticket System with AI orchestration and RAG capabilities. Now it's time to deploy it to production with enterprise-grade reliability, security, and observability.
This post is part of a series:
- Getting Started with SAP AI Core and the SAP AI SDK in CAP
- Leveraging LLM Models and Deployments in SAP AI Core
- Orchestrating AI Workflows with SAP AI Core
- Document Grounding with RAG in SAP AI Core
- Production-Ready AI Applications with SAP AI Core (this post)
What Production-Ready Means
A production AI application needs:
| Requirement | Why It Matters |
|---|---|
| Security | Protect sensitive data and API keys |
| Monitoring | Track performance, costs, and failures |
| Resilience | Handle errors gracefully and retry transient failures |
| Scalability | Support growing user load |
| Cost Control | Manage token consumption and API costs |
| Observability | Debug issues quickly in production |
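Most of these requirements converge on a single pattern: route every AI call through one wrapper that times it, counts it, and contains its failures. Here is a minimal sketch of that idea, which the rest of this post fleshes out step by step (the names `instrumentedCall` and `metrics` are illustrative, not part of any SDK):

```javascript
// Illustrative sketch: one instrumented path for every AI call.
// Resilience, observability, and cost control all hook in here.
const metrics = { requests: 0, errors: 0, totalMs: 0 };

async function instrumentedCall(operation) {
  const start = Date.now();
  metrics.requests++;
  try {
    return await operation();
  } catch (error) {
    metrics.errors++;
    throw error; // count and report failures, don't swallow them
  } finally {
    metrics.totalMs += Date.now() - start;
  }
}

module.exports = { instrumentedCall, metrics };
```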
Step 1: Secure Configuration Management
Never hardcode credentials. Use SAP BTP services for secrets management.
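To make "no hardcoded credentials" concrete: on Cloud Foundry, the credentials of bound services arrive at runtime in the `VCAP_SERVICES` environment variable. The following sketch parses it by service label just to show what the platform injects; in a real CAP app you would normally let `@sap/xsenv` or the SAP AI SDK resolve bindings for you:

```javascript
// Sketch only: read credentials of a bound service from VCAP_SERVICES.
// Prefer @sap/xsenv in production code; this helper is illustrative.
function getServiceCredentials(label) {
  const vcap = JSON.parse(process.env.VCAP_SERVICES || '{}');
  const [instance] = vcap[label] || [];
  if (!instance) {
    throw new Error(`No bound service instance found for label "${label}"`);
  }
  return instance.credentials;
}

module.exports = { getServiceCredentials };
```

With the bindings defined in the deployment descriptor below, the AI Core credentials appear under the `aicore` label.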
Create mta.yaml for Deployment
_schema-version: '3.3'
ID: support-ticket-ai
version: 1.0.0
description: Support Ticket System with AI

parameters:
  enable-parallel-deployments: true

build-parameters:
  before-all:
    - builder: custom
      commands:
        - npm ci
        - npx cds build --production

modules:
  # CAP Application
  - name: support-ticket-srv
    type: nodejs
    path: gen/srv
    parameters:
      buildpack: nodejs_buildpack
      memory: 512M
      disk-quota: 1024M
    build-parameters:
      builder: npm
    provides:
      - name: srv-api
        properties:
          srv-url: ${default-url}
    requires:
      - name: support-ticket-db
      - name: support-ticket-auth
      - name: support-ticket-destination
      - name: support-ticket-aicore

  # Database Deployer
  - name: support-ticket-db-deployer
    type: hdb
    path: gen/db
    parameters:
      buildpack: nodejs_buildpack
    requires:
      - name: support-ticket-db

resources:
  # HANA Cloud Database
  - name: support-ticket-db
    type: com.sap.xs.hdi-container
    parameters:
      service: hana
      service-plan: hdi-shared
    properties:
      hdi-service-name: ${service-name}

  # XSUAA Authentication
  - name: support-ticket-auth
    type: org.cloudfoundry.managed-service
    parameters:
      service: xsuaa
      service-plan: application
      path: ./xs-security.json
      config:
        xsappname: support-ticket-${org}-${space}
        tenant-mode: dedicated
        scopes:
          - name: '$XSAPPNAME.Admin'
            description: Admin access
          - name: '$XSAPPNAME.User'
            description: User access
        role-templates:
          - name: Admin
            description: Administrator
            scope-references:
              - '$XSAPPNAME.Admin'
          - name: User
            description: Regular user
            scope-references:
              - '$XSAPPNAME.User'

  # Destination Service
  - name: support-ticket-destination
    type: org.cloudfoundry.managed-service
    parameters:
      service: destination
      service-plan: lite

  # AI Core Service
  - name: support-ticket-aicore
    type: org.cloudfoundry.managed-service
    parameters:
      service: aicore
      service-plan: extended

Create xs-security.json
{
  "xsappname": "support-ticket",
  "tenant-mode": "dedicated",
  "description": "Security configuration for Support Ticket AI",
  "scopes": [
    {
      "name": "$XSAPPNAME.Admin",
      "description": "Admin access"
    },
    {
      "name": "$XSAPPNAME.User",
      "description": "User access"
    }
  ],
  "role-templates": [
    {
      "name": "Admin",
      "description": "Administrator",
      "scope-references": [
        "$XSAPPNAME.Admin"
      ]
    },
    {
      "name": "User",
      "description": "Regular User",
      "scope-references": [
        "$XSAPPNAME.User"
      ]
    }
  ]
}

Step 2: Implement Authentication & Authorization
Update srv/server.js to require authentication:
const cds = require('@sap/cds');
const xsenv = require('@sap/xsenv');

// Load environment variables
xsenv.loadEnv();

// Add authentication middleware
cds.on('bootstrap', (app) => {
  const passport = require('passport');
  const { JWTStrategy } = require('@sap/xssec');

  // Configure passport with the JWT strategy of the bound XSUAA instance
  passport.use(new JWTStrategy(xsenv.getServices({ uaa: { tag: 'xsuaa' } }).uaa));
  app.use(passport.initialize());
  app.use(passport.authenticate('JWT', { session: false }));
});

// Add authorization checks once the services are served
cds.on('served', (services) => {
  const srv = services.TicketService;

  // Restrict access based on roles
  srv.before('CREATE', 'Tickets', (req) => {
    if (!req.user.is('User')) {
      return req.reject(403, 'Insufficient privileges');
    }
  });
  srv.before('UPDATE', 'Tickets', (req) => {
    if (!req.user.is('Admin')) {
      return req.reject(403, 'Only admins can update tickets');
    }
  });
});

module.exports = cds.server;

Update package.json to add authentication:
{
  "cds": {
    "requires": {
      "auth": {
        "kind": "xsuaa"
      },
      "db": {
        "kind": "hana"
      }
    }
  }
}

Step 3: Implement Comprehensive Error Handling
Create /srv/lib/error-handler.js:
class AIErrorHandler {
  /**
   * Handle AI Core errors with proper retry logic
   */
  static async handleWithRetry(operation, options = {}) {
    const maxRetries = options.maxRetries || 3;
    const initialDelay = options.initialDelay || 1000;
    const backoffMultiplier = options.backoffMultiplier || 2;
    let lastError;

    for (let attempt = 0; attempt < maxRetries; attempt++) {
      try {
        return await operation();
      } catch (error) {
        lastError = error;

        // Don't retry certain errors
        if (this.isNonRetryableError(error)) {
          throw this.enhanceError(error);
        }

        // Log the error
        console.error(`Attempt ${attempt + 1} failed:`, {
          message: error.message,
          status: error.response?.status,
          code: error.code
        });

        // Wait before retrying (exponential backoff)
        if (attempt < maxRetries - 1) {
          const delay = initialDelay * Math.pow(backoffMultiplier, attempt);
          await this.sleep(delay);
        }
      }
    }
    throw this.enhanceError(lastError);
  }

  /**
   * Check if error should not be retried
   */
  static isNonRetryableError(error) {
    const status = error.response?.status;
    // Don't retry client errors (except 429 rate limit)
    if (status && status >= 400 && status < 500 && status !== 429) {
      return true;
    }
    // Don't retry authentication errors
    if (error.code === 'EAUTH' || error.message?.includes('authentication')) {
      return true;
    }
    return false;
  }

  /**
   * Enhance error with additional context
   */
  static enhanceError(error) {
    const enhanced = new Error(error.message);
    enhanced.name = 'AIOperationError';
    enhanced.originalError = error;
    enhanced.timestamp = new Date().toISOString();

    // Add status code if available
    if (error.response?.status) {
      enhanced.statusCode = error.response.status;
    }

    // Add rate limit info if available
    if (error.response?.headers) {
      const headers = error.response.headers;
      if (headers['x-ratelimit-remaining']) {
        enhanced.rateLimitRemaining = headers['x-ratelimit-remaining'];
        enhanced.rateLimitReset = headers['x-ratelimit-reset'];
      }
    }
    return enhanced;
  }

  /**
   * Sleep helper
   */
  static sleep(ms) {
    return new Promise(resolve => setTimeout(resolve, ms));
  }

  /**
   * Handle content filtering errors
   */
  static handleContentFilterError(error) {
    if (error.response?.status === 400) {
      const data = error.response.data;
      if (data?.error?.message?.includes('content filter')) {
        return {
          filtered: true,
          message: 'Your request was blocked by content safety filters.',
          categories: this.extractFilterCategories(data)
        };
      }
    }
    return null;
  }

  /**
   * Extract filter categories from error
   */
  static extractFilterCategories(errorData) {
    // Parse the error message to find which filters triggered
    const categories = [];
    const message = errorData?.error?.message || '';
    if (message.includes('hate')) categories.push('hate');
    if (message.includes('violence')) categories.push('violence');
    if (message.includes('self-harm')) categories.push('self-harm');
    if (message.includes('sexual')) categories.push('sexual');
    return categories;
  }
}

module.exports = AIErrorHandler;

Use it in your services:
const AIErrorHandler = require('./lib/error-handler');
const { OrchestrationClient } = require('@sap-ai-sdk/orchestration');

class TicketService {
  async processTicket(ticket) {
    return await AIErrorHandler.handleWithRetry(async () => {
      const client = new OrchestrationClient(/* ... */);
      return await client.chatCompletion(/* ... */);
    }, {
      maxRetries: 3,
      initialDelay: 1000
    });
  }
}

Step 4: Implement Monitoring & Observability
Create /srv/lib/ai-telemetry.js:
class AITelemetry {
  constructor() {
    this.metrics = {
      requests: 0,
      errors: 0,
      totalTokens: 0,
      totalCost: 0,
      latencies: []
    };
  }

  /**
   * Track AI request metrics
   */
  trackRequest(operation, result, duration) {
    this.metrics.requests++;
    if (result.usage) {
      this.metrics.totalTokens += result.usage.total_tokens || 0;
      this.metrics.totalCost += this.estimateCost(result.usage).total;
    }
    this.metrics.latencies.push(duration);

    // Log detailed metrics
    console.log('AI Request Completed', {
      operation,
      duration,
      tokens: result.usage,
      cost: this.estimateCost(result.usage)
    });
  }

  /**
   * Track errors
   */
  trackError(operation, error) {
    this.metrics.errors++;
    console.error('AI Request Failed', {
      operation,
      error: error.message,
      statusCode: error.statusCode,
      timestamp: new Date().toISOString()
    });
  }

  /**
   * Estimate cost for a request (adjust the rates to your model's pricing)
   */
  estimateCost(usage) {
    if (!usage) return { input: 0, output: 0, total: 0 };
    const inputCost = (usage.prompt_tokens || 0) * 0.00000112;
    const outputCost = (usage.completion_tokens || 0) * 0.00000320;
    return {
      input: inputCost,
      output: outputCost,
      total: inputCost + outputCost
    };
  }

  /**
   * Get metrics summary
   */
  getMetrics() {
    const avgLatency = this.metrics.latencies.length > 0
      ? this.metrics.latencies.reduce((a, b) => a + b, 0) / this.metrics.latencies.length
      : 0;
    return {
      ...this.metrics,
      avgLatency,
      errorRate: this.metrics.requests > 0
        ? this.metrics.errors / this.metrics.requests
        : 0
    };
  }

  /**
   * Reset metrics (for testing or periodic reports)
   */
  reset() {
    this.metrics = {
      requests: 0,
      errors: 0,
      totalTokens: 0,
      totalCost: 0,
      latencies: []
    };
  }
}

// Singleton instance
const telemetry = new AITelemetry();
module.exports = telemetry;

Integrate telemetry into your service:
const telemetry = require('./lib/ai-telemetry');
const AIErrorHandler = require('./lib/error-handler');

class TicketService {
  async processTicketWithTelemetry(ticket) {
    const startTime = Date.now();
    try {
      const result = await AIErrorHandler.handleWithRetry(async () => {
        return await this.orchestrationClient.chatCompletion(/* ... */);
      });
      const duration = Date.now() - startTime;
      telemetry.trackRequest('processTicket', result, duration);
      return result;
    } catch (error) {
      telemetry.trackError('processTicket', error);
      throw error;
    }
  }
}

Add a metrics endpoint:
// In srv/server.js
cds.on('bootstrap', (app) => {
  const telemetry = require('./lib/ai-telemetry');
  // NOTE: in production, restrict this endpoint to operators
  // (e.g. require the Admin scope)
  app.get('/metrics', (req, res) => {
    res.json(telemetry.getMetrics());
  });
});

Step 5: Cost Optimization Strategies
Implement Response Caching
Create /srv/lib/response-cache.js:
const crypto = require('crypto');
const NodeCache = require('node-cache');

class AIResponseCache {
  constructor(ttlSeconds = 3600) {
    this.cache = new NodeCache({
      stdTTL: ttlSeconds,
      checkperiod: 600
    });
  }

  /**
   * Generate cache key from request
   */
  generateKey(prompt, model, temperature = 0) {
    const data = JSON.stringify({ prompt, model, temperature });
    return crypto.createHash('sha256').update(data).digest('hex');
  }

  /**
   * Get cached response
   */
  get(prompt, model, temperature) {
    const key = this.generateKey(prompt, model, temperature);
    return this.cache.get(key);
  }

  /**
   * Store response in cache
   */
  set(prompt, model, temperature, response) {
    const key = this.generateKey(prompt, model, temperature);
    this.cache.set(key, response);
  }

  /**
   * Clear cache
   */
  clear() {
    this.cache.flushAll();
  }

  /**
   * Get cache statistics
   */
  getStats() {
    return this.cache.getStats();
  }
}

module.exports = AIResponseCache;

Use caching in your service:
const AIResponseCache = require('./lib/response-cache');
const cache = new AIResponseCache(3600); // 1 hour TTL

class TicketService {
  async processTicketWithCache(ticket) {
    const prompt = `${ticket.subject} ${ticket.description}`;
    const model = 'gpt-4o';
    const temperature = 0.3;

    // Check cache first
    const cached = cache.get(prompt, model, temperature);
    if (cached) {
      console.log('Cache hit - saved API call');
      return cached;
    }

    // Generate new response
    const response = await this.processTicket(ticket);

    // Cache the response
    cache.set(prompt, model, temperature, response);
    return response;
  }
}

Token Usage Optimization
class TokenOptimizer {
  /**
   * Truncate prompt to fit within token limit
   */
  static truncatePrompt(text, maxTokens = 4000) {
    // Rough estimate: 1 token ≈ 4 characters for English
    const maxChars = maxTokens * 4;
    if (text.length <= maxChars) {
      return text;
    }
    // Truncate and add ellipsis
    return text.substring(0, maxChars - 3) + '...';
  }

  /**
   * Optimize prompt by removing unnecessary whitespace
   */
  static optimizePrompt(text) {
    return text
      .replace(/[ \t]+/g, ' ')   // Collapse runs of spaces/tabs (keep newlines)
      .replace(/\n\s*\n/g, '\n') // Remove empty lines
      .trim();
  }

  /**
   * Estimate token count (rough approximation)
   */
  static estimateTokens(text) {
    return Math.ceil(text.length / 4);
  }

  /**
   * Split long documents into chunks of roughly chunkSize words
   */
  static chunkDocument(text, chunkSize = 1000) {
    const words = text.split(/\s+/);
    const chunks = [];
    for (let i = 0; i < words.length; i += chunkSize) {
      chunks.push(words.slice(i, i + chunkSize).join(' '));
    }
    return chunks;
  }
}

module.exports = TokenOptimizer;

Model Selection Strategy
class ModelSelector {
  /**
   * Select an appropriate model based on task complexity
   */
  static selectModel(task) {
    const complexity = this.assessComplexity(task);
    if (complexity === 'simple') {
      return 'gpt-4o-mini'; // Cheaper for simple tasks
    }
    return 'gpt-4o'; // More capable for medium and complex tasks
  }

  /**
   * Assess task complexity
   */
  static assessComplexity(task) {
    const text = task.subject + ' ' + task.description;
    // Simple length-based heuristics; refine as needed
    if (text.length < 100) {
      return 'simple';
    } else if (text.length < 500) {
      return 'medium';
    }
    return 'complex';
  }

  /**
   * Get model configuration
   */
  static getModelConfig(modelName) {
    const configs = {
      'gpt-4o-mini': {
        max_tokens: 500,
        temperature: 0.5
      },
      'gpt-4o': {
        max_tokens: 1000,
        temperature: 0.3
      }
    };
    return configs[modelName] || configs['gpt-4o'];
  }
}

module.exports = ModelSelector;

Step 6: CI/CD Pipeline
Create .github/workflows/deploy.yml:
name: Deploy to Cloud Foundry

on:
  push:
    branches:
      - main
  workflow_dispatch:

jobs:
  build-and-deploy:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v3
      - name: Setup Node.js
        uses: actions/setup-node@v3
        with:
          node-version: '18'
          cache: 'npm'
      - name: Install dependencies
        run: npm ci
      - name: Run tests
        run: npm test
      - name: Build MTA
        run: |
          npm install -g mbt
          mbt build
      - name: Deploy to Cloud Foundry
        uses: cloud-foundry/cf-cli-action@v1
        with:
          api_endpoint: ${{ secrets.CF_API }}
          username: ${{ secrets.CF_USERNAME }}
          password: ${{ secrets.CF_PASSWORD }}
          org: ${{ secrets.CF_ORG }}
          space: ${{ secrets.CF_SPACE }}
      - name: Deploy application
        run: cf deploy mta_archives/*.mtar

Step 7: Health Checks & Readiness Probes
Add health check endpoints in srv/server.js:
cds.on('bootstrap', (app) => {
  const telemetry = require('./lib/ai-telemetry');

  // Liveness probe - is the app running?
  app.get('/health/live', (req, res) => {
    res.status(200).json({ status: 'alive' });
  });

  // Readiness probe - is the app ready to serve?
  app.get('/health/ready', async (req, res) => {
    try {
      // Check database connection
      await cds.tx(async () => {
        await SELECT.one.from('sap.capire.tickets.Tickets');
      });

      // Check AI Core connection: just verify the client can be constructed
      // (instantiation alone doesn't consume any tokens)
      const { OrchestrationClient } = require('@sap-ai-sdk/orchestration');
      new OrchestrationClient({
        promptTemplating: { model: { name: 'gpt-4o' } }
      });

      res.status(200).json({
        status: 'ready',
        metrics: telemetry.getMetrics()
      });
    } catch (error) {
      res.status(503).json({
        status: 'not ready',
        error: error.message
      });
    }
  });
});

Step 8: Environment-Specific Configuration
Create environment-specific configurations:
// config/production.js
module.exports = {
  ai: {
    orchestration: {
      maxRetries: 3,
      timeout: 30000,
      cacheEnabled: true,
      cacheTTL: 3600
    },
    models: {
      default: 'gpt-4o',
      simple: 'gpt-4o-mini'
    }
  },
  monitoring: {
    enabled: true,
    logLevel: 'info'
  }
};

// config/development.js
module.exports = {
  ai: {
    orchestration: {
      maxRetries: 1,
      timeout: 10000,
      cacheEnabled: false
    },
    models: {
      default: 'gpt-4o-mini', // Use the cheaper model in dev
      simple: 'gpt-4o-mini'
    }
  },
  monitoring: {
    enabled: true,
    logLevel: 'debug'
  }
};

Load configuration:
// srv/lib/config.js
const path = require('path');

const env = process.env.NODE_ENV || 'development';

let config;
try {
  config = require(path.join(__dirname, '../../config', env));
} catch (error) {
  console.warn(`No config found for ${env}, using defaults`);
  config = {};
}

module.exports = config;

Step 9: Logging Best Practices
Create structured logging utility:
// srv/lib/logger.js
const config = require('./config');

class Logger {
  constructor(component) {
    this.component = component;
    this.level = config.monitoring?.logLevel || 'info';
  }

  log(level, message, data = {}) {
    if (!this.shouldLog(level)) return;
    const logEntry = {
      timestamp: new Date().toISOString(),
      level,
      component: this.component,
      message,
      ...data
    };
    // In production, emit JSON for the application logging service to parse
    if (process.env.NODE_ENV === 'production') {
      console.log(JSON.stringify(logEntry));
    } else {
      console.log(`[${level.toUpperCase()}] ${this.component}:`, message, data);
    }
  }

  shouldLog(level) {
    const levels = ['debug', 'info', 'warn', 'error'];
    return levels.indexOf(level) >= levels.indexOf(this.level);
  }

  debug(message, data) { this.log('debug', message, data); }
  info(message, data) { this.log('info', message, data); }
  warn(message, data) { this.log('warn', message, data); }
  error(message, data) { this.log('error', message, data); }
}

module.exports = Logger;

Use in services:
const Logger = require('./lib/logger');
const logger = new Logger('TicketService');

class TicketService {
  async processTicket(ticket) {
    logger.info('Processing ticket', { ticketId: ticket.ID });
    try {
      const result = await this.generateResponse(ticket);
      logger.info('Ticket processed successfully', {
        ticketId: ticket.ID,
        tokens: result.usage?.total_tokens
      });
      return result;
    } catch (error) {
      logger.error('Failed to process ticket', {
        ticketId: ticket.ID,
        error: error.message
      });
      throw error;
    }
  }
}

Step 10: Performance Optimization
Connection Pooling
// srv/lib/ai-client-pool.js
const { OrchestrationClient } = require('@sap-ai-sdk/orchestration');

class AIClientPool {
  constructor(size = 5) {
    this.size = size;
    this.clients = [];
    this.available = [];
    this.initialize();
  }

  initialize() {
    for (let i = 0; i < this.size; i++) {
      const client = new OrchestrationClient({
        promptTemplating: {
          model: { name: 'gpt-4o' }
        }
      });
      this.clients.push(client);
      this.available.push(client);
    }
  }

  async acquire() {
    if (this.available.length > 0) {
      return this.available.pop();
    }
    // Poll until a client is released (simple, adequate for modest load)
    return new Promise((resolve) => {
      const interval = setInterval(() => {
        if (this.available.length > 0) {
          clearInterval(interval);
          resolve(this.available.pop());
        }
      }, 100);
    });
  }

  release(client) {
    this.available.push(client);
  }
}

module.exports = new AIClientPool();

Batch Processing
class BatchProcessor {
  constructor(batchSize = 10) {
    this.batchSize = batchSize;
    this.queue = [];
  }

  async addToQueue(ticket) {
    this.queue.push(ticket);
    if (this.queue.length >= this.batchSize) {
      await this.processBatch();
    }
  }

  async processBatch() {
    const batch = this.queue.splice(0, this.batchSize);
    // Process tickets in parallel
    const results = await Promise.all(
      batch.map(ticket => this.processTicket(ticket))
    );
    return results;
  }

  // Call this on shutdown or on a timer so a partially filled
  // queue doesn't sit unprocessed forever
  async flush() {
    while (this.queue.length > 0) {
      await this.processBatch();
    }
  }

  async processTicket(ticket) {
    // Your AI processing logic
  }
}

Deployment Checklist
Before deploying to production:
- All secrets stored in BTP services (no hardcoded credentials)
- Authentication and authorization configured
- Error handling with retry logic implemented
- Monitoring and telemetry in place
- Response caching configured
- Cost tracking enabled
- Health check endpoints working
- Logging structured and searchable
- CI/CD pipeline tested
- Performance optimizations applied
- Documentation updated
- Disaster recovery plan documented
Deployment Commands
# Build the MTA
npm install -g mbt
mbt build
# Login to Cloud Foundry
cf login -a <api-endpoint>
# Deploy the application
cf deploy mta_archives/support-ticket-ai_1.0.0.mtar
# Check deployment status
cf apps
# View logs
cf logs support-ticket-srv --recent
# Check service bindings
cf services
# Scale the application
cf scale support-ticket-srv -i 2 -m 1G

Monitoring in Production
Key Metrics to Track
- Request Metrics
  - Total requests per minute
  - Average response time
  - Error rate
- Token Usage
  - Tokens per request
  - Daily/monthly token consumption
  - Cost per request
- Cache Performance
  - Hit rate
  - Miss rate
  - Cache size
- Model Performance
  - Latency by model
  - Success rate by model
  - Cost by model
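For latency in particular, averages hide outliers; percentiles such as p95 tell you what your slow requests actually look like. Here is a small helper you could add alongside the telemetry class above (the nearest-rank method used here is one common convention):

```javascript
// Nearest-rank percentile over an array of latencies in milliseconds.
// percentile(data, 95) returns the value below which ~95% of samples fall.
function percentile(latencies, p) {
  if (!latencies.length) return 0;
  const sorted = [...latencies].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(sorted.length, Math.max(1, rank)) - 1];
}

module.exports = { percentile };
```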
Setting Up Alerts
// srv/lib/alerting.js
class AlertManager {
  checkThresholds(metrics) {
    const alerts = [];

    // High error rate
    if (metrics.errorRate > 0.05) {
      alerts.push({
        severity: 'high',
        message: `Error rate is ${(metrics.errorRate * 100).toFixed(2)}%`,
        metric: 'errorRate',
        value: metrics.errorRate
      });
    }

    // High cost
    if (metrics.totalCost > 100) {
      alerts.push({
        severity: 'medium',
        message: `Daily cost is $${metrics.totalCost.toFixed(2)}`,
        metric: 'cost',
        value: metrics.totalCost
      });
    }

    // High latency
    if (metrics.avgLatency > 5000) {
      alerts.push({
        severity: 'medium',
        message: `Average latency is ${metrics.avgLatency}ms`,
        metric: 'latency',
        value: metrics.avgLatency
      });
    }
    return alerts;
  }

  sendAlert(alert) {
    // Send to your alerting channel (email, Slack, PagerDuty, etc.)
    console.error('ALERT:', alert);
  }
}

module.exports = new AlertManager();

Recap
We've covered essential production requirements:
- Security: XSUAA authentication, service bindings, no hardcoded secrets
- Error Handling: Retry logic, graceful degradation, enhanced error messages
- Monitoring: Telemetry, metrics tracking, structured logging
- Cost Optimization: Response caching, token optimization, smart model selection
- CI/CD: Automated testing and deployment pipeline
- Performance: Connection pooling, batch processing, health checks
- Observability: Comprehensive logging, alerting, health monitoring
Your AI application is now production-ready with enterprise-grade reliability and observability!
Additional Resources
- SAP BTP Cloud Foundry Documentation
- CAP Production Best Practices
- SAP AI Core Monitoring
- XSUAA Security
- MTA Development Guide
- Cloud Foundry Logging
Next Steps
With your production deployment complete, consider:
- Advanced Features: Fine-tuning models, custom embeddings, multi-tenant architecture
- Scaling: Load balancing, auto-scaling policies, database optimization
- Compliance: Data residency, audit logging, compliance certifications
- Innovation: Explore new SAP AI Core capabilities, experiment with new models
Congratulations! You've built and deployed a production-ready AI application on SAP BTP! 🎉
