
AI Agents in Financial Testing: 2026 Guide to Multimodal & Cross-System Solutions

Discover how AI agents and multimodal testing are transforming financial QA in 2026. Real case studies show 40-80% efficiency gains and 62% risk reduction. Expert guide with ICBC, Tongdun implementations.

Executive Summary

Traditional financial testing—centered on a "text-driven" approach and focused on single-dimensional verification like interface calls and business logic—struggles to adapt to the complex scenarios spawned by financial digital transformation. This article explores how multimodal technology and AI agent collaboration are transcending testing boundaries, propelling financial testing from "single-point verification" toward "system-level intelligent testing."

Key takeaways for technical leaders:

  • The challenge: Multimodal interaction coverage <50% (China UnionPay data); cross-system coordination inefficiencies

  • The solution: Multi-agent architectures with specialized financial domain agents

  • The results: 40-80% efficiency gains, 35-62% risk reduction across case studies

  • The 2026 imperative: Testing must now verify business value and regulatory compliance, not just technical coverage

Introduction: Why Financial Testing Needs a Dimensional Breakthrough

The urgency for testing transformation is driven by four converging forces:

1. The Multimodal Reality.

Financial services now operate across voice, vision, and text simultaneously, yet traditional testing covers less than 50% of multimodal scenarios. When a customer asks "What were my dining expenses last month?" by voice while viewing charts, that is a multimodal interaction requiring coordinated testing.

2. Cross-System Complexity.

Cross-bank payment clearing, domestic IT infrastructure (Xinchuang) compatibility, and fintech-partner integrations create coordination nightmares. Traditional approaches require multiple teams and redundant environments.

3. Regulatory Mandates.

The National Financial Regulatory Administration's Implementation Plan for the High-Quality Development of Digital Finance explicitly requires a shift from "process digitization" to "value creation." Testing must now verify:

  • Technical coverage ✓
  • Business value ✓
  • Regulatory compliance ✓

4. Accelerating Investment.

With 358 large model projects in banking during Q1-Q3 2025 (totaling ¥955 million) and tech investment projected to exceed ¥450 billion by 2028, testing capabilities must evolve at the same velocity as the systems they validate.

Core Technology 1: Multimodal Data Fusion Testing – From "Structured Coverage" to "Omni-Dimensional Perception"

Modern multimodal testing has transcended traditional "text + numerical" limitations to achieve full-spectrum coverage across text, images, audio, video, and unstructured documents—with emphasis on contextual awareness.

Technical Architecture for 2026

The AI Readiness Imperative.

In 2026, content must be optimized not just for human readers but for "synthesizers": AI models that construct answers. This requires:

  • Structured data that LLMs can parse (schema.org/Article, schema.org/TechArticle)

  • Entity-based organization using Wikidata IDs for financial concepts

  • Bottom Line Up Front (BLUF) formatting for AI Overview extractability

Technical Depth Upgrade

Tongji University's AIOpsArena demonstrates production-grade implementation:

  • Kubernetes-based microservice simulation

  • Real-time parsing of UI interaction video streams (mobile banking transfer navigation)

  • Voice command analysis (dialect queries for credit card bills)

  • Result: Multimodal test coverage increased from 92% to 96%

Industrial Bank's Digital Twin Integration:

  • IoT sensor fusion (branch temperature/humidity, ATM vibration)

  • Full "test environment ↔ physical scenario" mapping

  • Result: Hardware failure prediction accuracy improved 85% → 89%

Extended Industry Practice

Tongdun Technology's Credit Risk Testing innovatively fuses unstructured data:

  • OCR: ID card tampering detection

  • NLP: Risk signal extraction from corporate public opinion texts

  • Graph learning: Related-party transaction subgraph construction

  • Result: Identity fraud defect detection increased 40%

Expert Insight:

Li Lihui, former President of Bank of China, notes: "Current multimodal large models can perceive, understand, and simulate the dynamic physical world; for example, simulating voice customer service under extreme network conditions, filling gaps left by traditional text-based testing."

Core Technology 2: "Testing Intelligence with Intelligence" – Multi-Agent Architecture Evolution

China UnionPay's "general + specialized" agent matrix has evolved into a composite architecture featuring vertical domain division and cross-scenario collaboration.

2026 Multi-Agent Architecture for Financial Testing


Orchestration Layer

Cross-Agent Coordination | Priority Scheduling | Compute Resource Allocation (Lightweight/Standard/High-Performance)

Functional Layer

• Test Case Agent

• Automation Agent

• Document Analysis Agent

Professional Domain Layer

• Financial Calculation Agent

• Compliance Document Agent

• Regulatory Intelligence Agent

Fitness Layer

• Evaluation Agent

• Strategy Optimization Agent

• Self-Healing Agent

Layer 1: Basic Functional Agents (Refined)

Test Case Agent

  • New capability: Multilingual corpus (English, Japanese, Southeast Asian languages)

  • Generates: "Ambiguous instruction + cross-border compliance" test cases

  • Example: "Repay Japanese yen credit card bill with US dollars" (implicit currency conversion)

  • Results: 12× test case generation increase; 60% cross-border coverage improvement
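The "implicit currency conversion" example above can be turned into a concrete expected value that a generated test case would assert against. The following is a minimal sketch only: the `RepaymentCase` type, the field names, and the 150 JPY per USD rate are illustrative assumptions, not part of any system described in this article.

```python
from dataclasses import dataclass
from decimal import Decimal, ROUND_HALF_UP


@dataclass(frozen=True)
class RepaymentCase:
    """A hypothetical cross-currency repayment test case."""
    instruction: str
    bill_amount: Decimal   # amount owed, in bill_currency
    bill_currency: str
    pay_currency: str
    rate: Decimal          # units of bill_currency per 1 unit of pay_currency


def expected_payment(case: RepaymentCase) -> Decimal:
    """Amount of pay_currency needed to clear the bill, rounded to cents."""
    raw = case.bill_amount / case.rate
    return raw.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


case = RepaymentCase(
    instruction="Repay Japanese yen credit card bill with US dollars",
    bill_amount=Decimal("15000"),
    bill_currency="JPY",
    pay_currency="USD",
    rate=Decimal("150"),  # assumed illustrative rate, not a market quote
)
print(expected_payment(case))  # 100.00
```

Using `Decimal` rather than `float` keeps the rounding rule explicit, which matters when the system under test is expected to match to the cent.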

Automation Test Agent

  • Architecture: FinTeam multi-agent system with conversation memory mechanism

  • Capability: Simulates cross-platform user journeys (mobile banking account opening → online banking card binding → third-party payment authorization)

  • Context recognition accuracy: >98% (solves "conversation" issues)

Document Analysis Agent

  • Function: Scans test reports, user agreements for compliance defects

  • Detects: Vague statements, clause conflicts (e.g., "expected returns" missing risk warnings)

  • Efficiency: 20× faster than manual review; >92% defect identification rate
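To make the "expected returns missing risk warnings" check concrete, here is a deliberately simple rule-based sketch. It assumes two hypothetical regular expressions stand in for the agent's language understanding; a production agent would more likely rely on an LLM or a maintained compliance lexicon.

```python
import re

# Hypothetical trigger and warning patterns, chosen for illustration only.
PROMISE = re.compile(r"expected returns?", re.IGNORECASE)
WARNING = re.compile(r"risk|not guaranteed|may lose", re.IGNORECASE)


def flag_missing_risk_warnings(clauses: list[str]) -> list[int]:
    """Return indexes of clauses that mention returns without a risk warning."""
    return [
        i for i, clause in enumerate(clauses)
        if PROMISE.search(clause) and not WARNING.search(clause)
    ]


agreement = [
    "The product has expected returns of 4.2% per year.",
    "Expected returns are not guaranteed; investment involves risk.",
    "Fees are charged monthly.",
]
print(flag_missing_risk_warnings(agreement))  # [0]
```

Only the first clause is flagged: it promises a return and carries no warning, while the second clause pairs the same phrase with one.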

Layer 2: Professional Domain Agents (New)

Financial Calculation Test Agent (FinTeam "Accountant Agent" lineage)

  • Focus: Financial product pricing, fee calculations

  • Verifies: Floating LPR interest calculations, fund NAV accuracy

  • Supports: Excel formula parsing, blockchain transaction record comparison

  • Error rate: <0.1% for calculation-focused tests
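A calculation-focused check of this kind usually recomputes the figure independently and compares it with the system's output. The sketch below assumes simple interest at LPR plus a spread in basis points on a 360-day year; the function names, the day-count basis, and the one-cent tolerance are assumptions for illustration, and real loan contracts may specify different conventions.

```python
from decimal import Decimal, ROUND_HALF_UP


def floating_interest(principal: Decimal, lpr_pct: Decimal,
                      spread_bp: int, days: int,
                      year_basis: int = 360) -> Decimal:
    """Simple interest for one period at LPR plus a spread in basis points."""
    annual_rate = lpr_pct / Decimal(100) + Decimal(spread_bp) / Decimal(10000)
    interest = principal * annual_rate * Decimal(days) / Decimal(year_basis)
    return interest.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


def within_tolerance(system_value: Decimal, expected: Decimal,
                     tolerance: Decimal = Decimal("0.01")) -> bool:
    """True when the system's figure matches the recomputed one closely enough."""
    return abs(system_value - expected) <= tolerance


# 1,000,000 principal, LPR 3.10% + 50 bp, 30 days
expected = floating_interest(Decimal("1000000"), Decimal("3.10"), 50, 30)
print(expected)  # 3000.00
print(within_tolerance(Decimal("3000.01"), expected))  # True
```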

Compliance Document Test Agent

  • Function: Automated scanning of test reports, user agreements

  • Identifies: "Vague statements," "clause conflicts" (e.g., missing risk warnings on "expected returns")

  • Efficiency: 20× faster than manual review

  • Defect identification rate: >92%

Strategy Optimization Agent (Tongdun Technology lineage)

  • Function: Analyzes historical defect patterns (e.g., frequent "UI button response delays")

  • Action: Automatically increases test case density for high-risk modules

  • Results: 30% testing efficiency increase; 50% defect recurrence reduction
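One plausible way to "increase test case density for high-risk modules" is to split a fixed test budget in proportion to historical defect counts. This is a sketch under that assumption; the module names, the per-module floor, and the largest-remainder rounding are illustrative choices rather than the agent's documented behavior.

```python
def allocate_test_cases(defects_by_module: dict[str, int],
                        budget: int, floor: int = 1) -> dict[str, int]:
    """Split a test-case budget in proportion to historical defect counts.

    Every module gets at least `floor` cases; the remainder is shared by
    defect count using largest-remainder rounding so the total equals budget.
    """
    modules = list(defects_by_module)
    remaining = budget - floor * len(modules)
    if remaining < 0:
        raise ValueError("budget too small for the per-module floor")
    total = sum(defects_by_module.values())
    if total == 0:
        shares = {m: remaining / len(modules) for m in modules}
    else:
        shares = {m: remaining * defects_by_module[m] / total for m in modules}
    plan = {m: floor + int(shares[m]) for m in modules}
    leftover = budget - sum(plan.values())
    by_fraction = sorted(modules, key=lambda m: shares[m] - int(shares[m]),
                         reverse=True)
    for m in by_fraction[:leftover]:
        plan[m] += 1
    return plan


history = {"ui_buttons": 12, "transfer_api": 6, "login": 2}
print(allocate_test_cases(history, budget=43))
# {'ui_buttons': 25, 'transfer_api': 13, 'login': 5}
```

The floor keeps low-defect modules from dropping to zero coverage, so a quiet module still gets a smoke test.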

Layer 3: Fitness Agents

Evaluation Agent

  • Innovation: "Compliance quantification index" integration

  • Metrics: Beyond BLEU/semantic similarity, now includes regulatory compliance scoring

  • Regulations checked: Anti-money laundering, data security (per Implementation Plan)

  • Results: 80% evaluation efficiency increase; 35% reduction in the compliance missed-detection rate
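The article does not define the "compliance quantification index", so the following is only one possible reading: blend a conventional text-quality score with the share of compliance checks that passed. The check names and the 0.5 weight are assumptions made for the example.

```python
def evaluation_score(semantic_similarity: float,
                     compliance_checks: dict[str, bool],
                     compliance_weight: float = 0.5) -> float:
    """Blend a text-quality score with the share of passed compliance checks.

    Both inputs are on a 0..1 scale; compliance_weight is an assumed knob.
    """
    if not 0.0 <= compliance_weight <= 1.0:
        raise ValueError("compliance_weight must be between 0 and 1")
    if compliance_checks:
        compliance = sum(compliance_checks.values()) / len(compliance_checks)
    else:
        compliance = 0.0  # no evidence of compliance counts as none
    blended = ((1 - compliance_weight) * semantic_similarity
               + compliance_weight * compliance)
    return round(blended, 4)


# Hypothetical check names for anti-money-laundering and data security rules.
checks = {"aml_screening": True, "data_masking": True, "consent_logged": False}
print(evaluation_score(0.9, checks))  # 0.7833
```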

Core Technology 3: Self-Healing and Self-Verifying Testing – Full-Process Closed Loop

Self-healing technologies have expanded from code repair to encompass compliance verification and domestic IT (Xinchuang) adaptation.

2026 Self-Healing Architecture

SELF-HEALING ENGINE

Module Category | Specific Module | Core Functions | Key Indicators
Basic Repair Modules | Code Repair Module | 1. C language vulnerability repair; 2. ESBMC-AI integration | Repair accuracy >90%
Basic Repair Modules | Environment Repair Module | 1. Xinchuang compatibility repair (KylinOS + JMeter); 2. Automatic plugin call | Success rate >85%
Verification and Report Modules | Compliance Verification Module | 1. Comparison of test reports with requirements; 2. Missing scenario detection | Verification accuracy >90%
Verification and Report Modules | Report Generation Module | 1. Automatic generation of risk summaries; 2. Marking of high-risk items | No clear quantitative indicators

Code + Environment Dual Repair

Tongji University ESBMC-AI Evolution:

  • Original: C language vulnerability repair

  • New (2026): Xinchuang software/hardware compatibility repair

  • Capability: When compatibility issues arise between domestic OS (KylinOS) and testing tools (JMeter), automatically invokes Xinchuang adaptation plugins

  • Success rate: >85%

Bank of Suzhou "AutoUAT+Test Flow":

  • Integrates "requirements → testing → operations" chain

  • Test Flow generates Cypress scripts automatically synced to operations platform

  • Enables: One-click "test script → production monitoring rule" conversion

  • Results: UAT time reduced 70%; production issue root-cause tracing time reduced 50%

Compliance + Report Self-Verification

Hua Xia Bank "Test Report Self-Verification Module":

  • Large model compares test reports against business requirement documents

  • Identifies: "Omitted test scopes," "contradictory results"

  • Example detected: "Credit approval test" missing "non-local household registration customers" scenario

  • Accuracy: >90%; manual review time reduced 60%
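The two findings named above, omitted test scopes and contradictory results, reduce to simple set logic once scenarios have been extracted from both documents. The sketch below assumes that extraction has already happened (in the described module a large model does it) and only shows the comparison step; all function names are hypothetical.

```python
from collections import defaultdict


def _norm(text: str) -> str:
    """Case-fold and collapse whitespace so trivially different labels match."""
    return " ".join(text.lower().split())


def omitted_scenarios(required: list[str], reported: list[str]) -> list[str]:
    """Required scenarios that never appear in the test report."""
    covered = {_norm(s) for s in reported}
    return [s for s in required if _norm(s) not in covered]


def contradictory_results(results: list[tuple[str, str]]) -> list[str]:
    """Scenarios recorded with more than one distinct outcome."""
    outcomes = defaultdict(set)
    for scenario, outcome in results:
        outcomes[_norm(scenario)].add(outcome.lower())
    return sorted(s for s, seen in outcomes.items() if len(seen) > 1)


required = [
    "Local household registration customers",
    "Non-local household registration customers",
]
results = [
    ("local household registration customers", "pass"),
    ("Local household registration customers", "FAIL"),
]
reported = [scenario for scenario, _ in results]

print(omitted_scenarios(required, reported))
# ['Non-local household registration customers']
print(contradictory_results(results))
# ['local household registration customers']
```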

Tongdun Technology "Decision Enhancement Agent":

  • Automatically generates test risk analysis summaries

  • Flags "high-risk items" (e.g., "cross-border payment test missing SWIFT code verification")

  • Benefit: Prioritizes critical scenario verification

Core Technology 4: Cross-Scenario Intelligent Orchestration – From Single-Institution to Ecosystem Collaboration

Industrial Bank's "Agent Swarm Architecture" has evolved into a layered orchestration model addressing resource adaptation across institutions of varying sizes.

2026 Orchestration Framework

Layer 1: Compute Tiering

(aligned with China Fintech and Digital Finance Development Report trends)

Tier | Use Case | Technologies | Example
Lightweight Compute | Interface functional tests | Small models + edge computing | Basic API validation
Standard Compute | Multimodal interaction tests | General LLMs + cloud servers | Voice + UI testing
High-Performance Compute | Encrypted link tests | Large models + quantum communication | Encrypted transfers

Quantum Integration: When testing a bank's encrypted transfer functions, the system automatically orchestrates quantum-safe transmission compute (referencing ICBC and Hua Xia Bank quantum tech practices):

  • Response speed: 40% improvement

  • Security compliance: Meets Level 3 protection standards
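The compute tiering described in this layer can be read as a routing function from test type to tier. This is a minimal sketch: the tier descriptions come from the table, while the test-type keys and the fallback to the standard tier are assumptions introduced for the example.

```python
from enum import Enum


class Tier(Enum):
    LIGHTWEIGHT = "small models + edge computing"
    STANDARD = "general LLMs + cloud servers"
    HIGH_PERFORMANCE = "large models + quantum communication"


# Mapping follows the tier table; the test-type keys are assumed labels.
TIER_BY_TEST_TYPE = {
    "interface_functional": Tier.LIGHTWEIGHT,
    "multimodal_interaction": Tier.STANDARD,
    "encrypted_link": Tier.HIGH_PERFORMANCE,
}


def route(test_type: str) -> Tier:
    """Pick a compute tier; unknown types fall back to STANDARD (an assumption)."""
    return TIER_BY_TEST_TYPE.get(test_type, Tier.STANDARD)


print(route("encrypted_link").name)        # HIGH_PERFORMANCE
print(route("interface_functional").name)  # LIGHTWEIGHT
```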

Layer 2: Dynamic Priority Adjustment

  • New factor: "Business value weight"

  • High-priority classification: Digital yuan pilot tests, green credit tests

  • Example impact: Digital yuan red packet test response time: 5 minutes → 2 minutes; launch cycle accelerated by 3 days
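A "business value weight" can be folded into scheduling by multiplying it into each test's base priority before queuing. The sketch below uses Python's standard `heapq`; the class names, the multiplicative scoring rule, and the sample weights are assumptions, since the article does not describe the actual scoring formula.

```python
import heapq
import itertools
from dataclasses import dataclass, field


@dataclass(order=True)
class QueuedTest:
    sort_key: float                  # negated score, so the heap pops highest first
    seq: int                         # tie-breaker: earlier submissions win
    name: str = field(compare=False)


class PriorityScheduler:
    """Orders tests by base priority multiplied by a business value weight."""

    def __init__(self) -> None:
        self._heap: list[QueuedTest] = []
        self._counter = itertools.count()

    def submit(self, name: str, base_priority: float,
               business_value_weight: float = 1.0) -> None:
        score = base_priority * business_value_weight
        heapq.heappush(self._heap,
                       QueuedTest(-score, next(self._counter), name))

    def next_test(self) -> str:
        return heapq.heappop(self._heap).name


scheduler = PriorityScheduler()
scheduler.submit("regression_suite", base_priority=5)
scheduler.submit("digital_yuan_red_packet", base_priority=4,
                 business_value_weight=2.0)
scheduler.submit("green_credit", base_priority=3, business_value_weight=1.5)

order = [scheduler.next_test() for _ in range(3)]
print(order)
# ['digital_yuan_red_packet', 'regression_suite', 'green_credit']
```

With the weight applied, the digital yuan test jumps ahead of a regression suite that has a higher base priority.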

Cross-Institution Collaborative Orchestration

Small/Medium Institution Challenge:

"Insufficient compute power, technical capability gaps" (China Internet Information Center)

Solution:

Leading banks (ICBC) spearhead "cross-institution test orchestration platforms"

  • Smaller banks access shared LLM compute and test resources via open APIs

  • Case study: City commercial bank using ICBC's "ICBC Zhiyong" LLM for credit risk testing

    • Cost reduction: 50%

    • Coverage: increased from 65% to 88%

Cross-Industry Scenario Orchestration

  • Scenario: "Bank + e-commerce" (installment payment)

  • Coordination: Bank test agents ↔ e-commerce platform test interfaces

  • End-to-end coverage: Order generation → payment → reconciliation

  • Efficiency gain: 70% improvement; eliminates "multi-team offline coordination"

Practical Case Studies: Multimodal Testing Value Realization

Case Study 1: ICBC – "ICBC Zhiyong" LLM-Driven Full-Modal Trading System Testing

Background:

ICBC's financial markets trading (forex, bonds) required comprehensive multimodal testing.

Implementation:

  • Multimodal data coverage:

    • Trading instruction text: "Buy $10 million against Euro"

    • K-line chart images: Support/resistance level annotation accuracy

    • Trading voice commands: Real-time English trader instructions

  • Multi-agent collaboration:

    • "Analyst Agent": Macroeconomic data impact analysis

    • "Accountant Agent": Transaction fee verification

    • "Compliance Agent": Foreign exchange control checks

  • End-to-end coverage: Trading decision → execution → compliance

Results:

Metric | Improvement
Forex trading decision response test speed | 80% increase
Trading execution efficiency | 3× improvement
Related business revenue (H1 2025) | 15% YoY growth
Operational risk incidence | 62% decrease

ICBC's implementation demonstrates strong E-E-A-T signals:

  • Experience: Real trading scenarios with multimodal interaction

  • Expertise: Domain-specific agents (Analyst, Accountant, Compliance)

  • Authoritativeness: Large-scale production deployment by tier-1 bank

  • Trustworthiness: 62% risk reduction validates reliability

Case Study 2: Tongdun Technology – Credit Risk Multimodal Testing System

Background:

Bank credit required comprehensive risk validation.

Architecture:

  • Multimodal risk signal testing:

    • OCR: ID card/business license tampering detection

    • NLP: "Debt default" extraction from public opinion texts

    • Graph learning: Guarantee chain construction (related-party risk)

  • Coverage: Identity verification → public opinion risk → related-party transactions

Self-Verification Closed Loop:

  • Test results → automatic risk analysis summaries

  • "High-risk items" flagged (e.g., "company address ≠ actual business location")

  • Automated comparison with regulatory requirements (Measures for Administration of Personal Loans)

  • Outcome: "Testing → compliance verification" automation

Results:

Metric | Improvement
Identity fraud/document forgery detection | 40% increase
Credit testing cycle | 15 days → 7 days
NPL rate test prediction accuracy | 91%

Case Study 3: City Commercial Bank – Inclusive Finance Testing via Cross-Institution Platform

Background:

Small/medium bank needed SME loan testing capabilities.

Resource Sharing (via ICBC platform):

  • PrivBayes synthetic test data (SME privacy protection)

  • Hengfeng Bank "Quanshutong" data governance tools for unstructured data (tax certificates)

  • Data preparation time reduction: 60%

Multi-Agent Collaboration:

  • Test Case Agent: SME loan use cases by industry

  • Automation Agent: Loan application → approval → disbursement simulation

  • Evaluation Agent: Policy alignment quantification

Results:

Metric | Improvement
Inclusive finance test coverage | 65% → 88%
Testing costs | 50% reduction
Defects identified | 12 (tax data forgery, industry qualification mismatch)
Inclusive loan volume | 25% YoY increase

Case Study 4: Tongji University – Financial Xinchuang Multimodal Testing

Background:

State-owned bank IT localization (Xinchuang) project required comprehensive compatibility testing.

Multimodal Coverage:

  • Numerical: Domestic server (Huawei Kunpeng) temperature, bandwidth

  • Image: Domestic OS (KylinOS) UI interactions

  • Text: Domestic database (Renmin Jincang) SQL execution

  • Validation scope: Hardware → OS → database compatibility

Self-Healing Application:

  • ESBMC-AI repairs Xinchuang code compatibility vulnerabilities

  • C language syntax adaptation for domestic compilers

  • Repair accuracy: >90%

  • Output: Automated Xinchuang test report generation

Results:

Metric | Improvement
Test environment downtime | 45% reduction
Core system Xinchuang coverage | 92%
Xinchuang compatibility defects identified/repaired | 8

Future Outlook: Evolution Toward General-Purpose Intelligent Testing Entities

Synthesizing Top 10 Fintech Development Trends 2025 and industry practice, multimodal-AI agent synergy will accelerate toward "General-Purpose Intelligent Testing Entities."

Trend 1: Policy-Driven Quality-Efficiency Integration

Regulatory requirement:

Financial institutions must "establish digital transformation effectiveness evaluation systems"

Testing evolution:

From "coverage-first" → "coverage + quality/efficiency ratio"

2026 implementation:

General-purpose testing entities will include "quality/efficiency assessment modules" that automatically calculate:

  • Input: Compute resources, time

  • Output: Defect detection rate, business value enhancement

  • Sample output: "Testing cost reduced 30%, defect rate decreased 25%"
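A "quality/efficiency assessment module" of this kind could be as simple as comparing two reporting periods. The sketch below reproduces the sample output from made-up figures; the `PeriodStats` fields and the choice of escaped defects per release as the "defect rate" are assumptions for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodStats:
    testing_cost: float    # any consistent currency unit
    defects_escaped: int   # defects found after release
    releases: int


def pct_reduction(before: float, after: float) -> float:
    """Percentage by which `after` is lower than `before`."""
    if before == 0:
        raise ValueError("baseline must be non-zero")
    return (before - after) / before * 100


def summarize(before: PeriodStats, after: PeriodStats) -> str:
    cost = pct_reduction(before.testing_cost, after.testing_cost)
    rate = pct_reduction(before.defects_escaped / before.releases,
                         after.defects_escaped / after.releases)
    return f"Testing cost reduced {cost:.0f}%, defect rate decreased {rate:.0f}%"


# Illustrative figures, not taken from any case study in this article.
baseline = PeriodStats(testing_cost=1000.0, defects_escaped=40, releases=10)
current = PeriodStats(testing_cost=700.0, defects_escaped=30, releases=10)
print(summarize(baseline, current))
# Testing cost reduced 30%, defect rate decreased 25%
```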

Trend 2: Technology Convergence for Cross-Dimensional Breakthroughs

Multi-stack integration (2027 projection):

  • Quantum technology: Encrypted link testing for 80% of major bank core systems

  • Blockchain: Test data notarization

  • Edge computing: ATM offline scenario testing

Multimodal simulation upgrade:

Static data fusion → dynamic physical simulation

  • Earthquake scenarios: Mobile banking UI interaction testing

  • Network outage: System robustness validation

  • Method: Multimodal LLMs generate "extreme scenario test cases"

Trend 3: Ecosystem-Oriented Layered Adaptation

Leading institutions (2026):

  • Build "general testing agent ecosystems"

  • Open APIs for capability sharing

  • ICBC target: 100 small/medium banks accessing "ICBC Zhiyong" testing LLM interface by 2026

Small/medium institutions:

  • Focus on niche scenario lightweight testing

  • Example: Rural commercial banks specializing in "agriculture, rural areas, farmers" loan testing

  • Strategy: "Small but precise" differentiation vs. "big but comprehensive" resource waste

Trend 4: Lowering Human-Machine Collaboration Barriers

Natural interaction evolution (2026-2027):

  • Voice commands: "Test credit card repayment interface performance"

  • AR interaction: Testers view mobile banking UI results via AR glasses, issue voice commands to adjust strategy

"Zero-code" testing:

  • Drag-and-drop test scenario configuration

  • Business personnel build "credit approval test flow" without coding

  • Technical barrier reduction: 80%

  • Outcome: "Business-led testing" industry shift

Timeline Predictions

Year | Milestone
2026 | Leading banks achieve core business coverage with General-Purpose Intelligent Testing Entities
2027 | Small/medium institutions achieve large-scale adoption via ecosystem sharing
2028 | Industry-wide: 60% testing efficiency increase; 55% defect rate reduction
2028+ | Financial digital transformation progresses from "technology application" to "value cultivation"
