Traditional financial testing is text-driven and focuses on single-dimension verification such as interface calls and business logic. It struggles to adapt to the complex scenarios created by the digital transformation of finance:
On the one hand, multi-modal interaction has become the norm in financial services (voice customer service, mobile banking UI visual interaction, visual financial report analysis, etc.), and China UnionPay data shows that the coverage rate of such business testing is less than 50%;
On the other hand, complex system collaboration scenarios such as cross-bank payment clearing and Xinchuang (domestic IT innovation) software and hardware compatibility are increasing. Traditional testing requires coordinating multiple teams and building redundant environments, which makes its low efficiency conspicuous.
Current policy and industry trends make the "dimension breakthrough" more urgent. The National Financial Regulatory Administration has released the "Implementation Plan for the High-Quality Development of Digital Finance in the Banking and Insurance Industry", which for the first time proposes "artificial intelligence +" and "data elements" as dual strategic lines. Meanwhile, in the first three quarters of 2025 the banking industry recorded 358 large model projects with a total investment of 955 million yuan ("China Fintech and Digital Finance Development Report"). Technology investment is expected to exceed 450 billion yuan in 2028, and the pace of technology iteration is forcing testing capabilities to upgrade.
In this context, "multimodal technology + agent collaboration" has become the core path to break through the testing boundary, promoting the transition of financial testing from "single point verification" to "system-level intelligent testing" and achieving the triple goals of "technical coverage + value verification + compliance assurance".
Multimodal data fusion testing goes beyond the traditional "text + numerical values" limit to cover all types of data, including text, images, audio, video, and unstructured documents, and places more emphasis on "scenario-based perception" capabilities:
- Technological Depth Upgrade: The AIOpsArena platform of Jiji University simulates a microservice architecture on Kubernetes. In addition to collecting logs (text) and performance indicators (numeric values), it adds real-time analysis of UI interaction video streams (such as the page-navigation trace of a mobile banking transfer) and voice command audio (such as "query credit card bill" spoken in dialect), raising multi-modal test coverage from 92% to 96%.
- Industry Practice Extension: Tongdun Technology integrates multi-modal unstructured data in credit risk control testing. It uses OCR to identify tampering traces in ID card images, NLP to analyze risk signals in corporate public opinion texts, and graph learning to construct related-party transaction subgraphs, covering the full-link test of "identity verification - public opinion risk - transaction compliance" and increasing the identification rate of identity fraud defects by 40% (Sina Finance "Tongdun Technology: Deepening Intelligent Financial Risk Control").
Li Lihui, former president of the Bank of China, pointed out that current multi-modal large models can already "perceive, understand and simulate the dynamic physical world". One example is simulating voice customer service interaction tests under extreme network conditions (disconnection, high latency), which fills a gap left by traditional text-based test scenarios (Tsinghua Financial Review "Li Lihui, former president of the Bank of China: Current status of large model applications in the financial industry").
China UnionPay's "general + dedicated" agent matrix has been further expanded into a composite architecture of "vertical field division of labor + cross-scenario collaboration", combining industry practice to add multiple types of professional agents:
(1) Basic Functional Layer Agents (Continuation and Optimization)
- Test Case Intelligent Agent: Adds a "multilingual corpus" (including English, Japanese and the Southeast Asian languages commonly used in cross-border business) and uses it to generate "ambiguous instructions + cross-border compliance scenarios" test cases (such as the ambiguous instruction "repay Japanese yen credit card bills with US dollars"). The number of generated test cases increases 12-fold over the traditional approach, and cross-border business test coverage increases by 60%.
- Automated Test Agent: Integrating the "session memory mechanism" of the FinTeam multi-agent system (51CTO "FinTeam: Multi-agent Collaborative Intelligent System for Comprehensive Financial Scenarios"), it can simulate users' continuous cross-platform operations (such as mobile banking account opening → online banking card binding → third-party payment authorization). The context recognition accuracy exceeds 98%, solving the "session break" problem of traditional agents.
- Assessment Agent: Introduces "quantitative compliance indicators". In addition to BLEU and semantic similarity, it adds a "risk compliance score", as required by the "Implementation Plan for High-Quality Development of Digital Finance in the Banking and Insurance Industry", to automatically check whether test results comply with anti-money laundering, data security and other regulatory requirements. Assessment efficiency increases by 80%, and the rate of missed compliance issues falls by 35%.
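How the assessment agent's three signals might be combined can be sketched as follows. This is a minimal illustration, not the scoring method of any system named above. The simplified BLEU (clipped 1-gram and 2-gram precision with a brevity penalty), the bag-of-words cosine that stands in for an embedding-based semantic score, the check names, and the weights are all assumptions made for the example.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=2):
    """Simplified BLEU: geometric mean of clipped n-gram precisions times a brevity penalty."""
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_sum = 0.0
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngram_counts(cand, n), ngram_counts(ref, n)
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0
        log_sum += math.log(overlap / total)
    brevity = 1.0 if len(cand) >= len(ref) else math.exp(1 - len(ref) / len(cand))
    return brevity * math.exp(log_sum / max_n)


def bow_cosine(a, b):
    """Bag-of-words cosine similarity; a stand-in for an embedding-based semantic score."""
    ca, cb = Counter(a.split()), Counter(b.split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(sum(v * v for v in cb.values()))
    return dot / norm if norm else 0.0


def compliance_score(checks):
    """Fraction of (hypothetical) regulatory checks that passed."""
    return sum(checks.values()) / len(checks) if checks else 0.0


def assess(candidate, reference, checks, weights=(0.3, 0.3, 0.4)):
    """Weighted blend of the three scores; the weights are illustrative only."""
    w_bleu, w_sem, w_comp = weights
    return (w_bleu * simple_bleu(candidate, reference)
            + w_sem * bow_cosine(candidate, reference)
            + w_comp * compliance_score(checks))
```

For instance, `assess(answer, expected, {"aml": True, "data_security": False})` blends text quality with a compliance score of 0.5. Keeping the compliance term separate makes it possible to fail a result on regulatory grounds even when its wording matches the reference.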
(2) Professional Field Layer Agents (New Expansion)
- Financial Calculation Test Agent: Modeled on FinTeam's "Accountant Agent", it focuses on testing scenarios such as financial product pricing and fee calculation. For example, it automatically verifies the accuracy of LPR-based floating loan interest calculations and fund net asset value calculations, and supports Excel formula parsing and comparison against blockchain transaction records. The calculation test error rate is less than 0.1%.
- Compliance Document Testing Agent: Based on the FinTeam "Document Analyst Agent", it automatically scans test reports, user agreements and other documents to identify compliance defects such as "ambiguous expressions" and "conflicts in terms" (such as "expected returns on financial products" without risk warnings). The document review efficiency is 20 times higher than manual work, and the compliance defect identification rate exceeds 92%.
- Strategy Optimization Agent: Drawing on Tongdun Technology's "Strategy Optimization Agent", it automatically adjusts the test strategy based on historical defect data (for example, a frequently recurring "UI button response delay"), for instance by increasing the test case density of high-risk modules. Test efficiency increases by 30% and the defect recurrence rate falls by 50%.
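The LPR floating interest check mentioned for the financial calculation agent can be illustrated with a small reference calculator. This is a sketch under simplifying assumptions: interest for one month is taken as principal times annual rate divided by 12, with no day-count convention, and the LPR value and spread in the usage note are made-up inputs rather than quoted rates. The function names are hypothetical.

```python
from decimal import Decimal, ROUND_HALF_UP

CENT = Decimal("0.01")


def floating_rate(lpr_percent, spread_bp):
    """Annual rate as a fraction: LPR (in percent) plus a spread in basis points."""
    return Decimal(lpr_percent) / 100 + Decimal(spread_bp) / 10000


def monthly_interest(principal, lpr_percent, spread_bp):
    """Interest for one month on the outstanding principal, rounded to cents."""
    rate = floating_rate(lpr_percent, spread_bp)
    return (Decimal(principal) * rate / 12).quantize(CENT, rounding=ROUND_HALF_UP)


def verify_interest(reported, principal, lpr_percent, spread_bp, tolerance=CENT):
    """Return True when the system-reported interest matches the reference within tolerance."""
    expected = monthly_interest(principal, lpr_percent, spread_bp)
    return abs(Decimal(reported) - expected) <= Decimal(tolerance)
```

With an illustrative LPR of 3.45% and a 20 basis point spread, `monthly_interest("1000000", "3.45", 20)` yields `Decimal("3041.67")`, and `verify_interest("3041.67", "1000000", "3.45", 20)` passes. Amounts are passed as strings so that `Decimal` never sees binary floating-point rounding error.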
Self-healing and self-verification technology has extended from "code repair" and "script generation" to scenarios such as "compliance verification" and "Xinchuang adaptation", forming a more complete closed loop:
- Dual Repair of Code and Environment: In addition to repairing C language vulnerabilities, Tongji University ESBMC-AI has added a "Xinchuang software and hardware compatibility repair" function. When a compatibility issue between a domestic operating system (such as Kylin OS) and a test tool (such as JMeter) is found during testing, the Xinchuang adaptation plug-in is called automatically, and the repair success rate exceeds 85%.
- End-to-end Pipeline Integration: Suzhou Bank's "AutoUAT+TestFlow" system further connects the "requirements - testing - operations" pipeline. Cypress scripts generated by TestFlow can be synchronized automatically to the operations platform, enabling one-click conversion of "test script → production monitoring rules", shortening acceptance test time by 70% and the time needed to trace production problems by 50%.
- Compliance and Report Self-verification:
  - Building on its "automatic generation of due diligence report" technology, Hua Xia Bank developed a "test report self-verification module". It uses a large model to compare test reports with business requirement documents and automatically identifies problems such as "missing test scope" and "contradictory results", for example a "credit approval test" that does not cover the scenario of customers whose registered residence is in another region. Report verification accuracy exceeds 90%, and manual review time is reduced by 60%.
  - Tongdun Technology uses "decision-enhanced intelligence" to automatically generate test risk analysis summaries, mark "high-risk test items" (such as "cross-border payment testing does not cover SWIFT code verification"), and help the testing team prioritize verification of key scenarios.
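The two report problems named above, "missing test scope" and "contradictory results", can be illustrated with a deliberately simple check. The source describes a large model comparing documents; this sketch swaps that for plain set comparison over scenario labels that are assumed to have been extracted already. The function names and labels are hypothetical.

```python
def find_scope_gaps(required_scenarios, tested_scenarios):
    """Scenarios listed in the requirements but absent from the test report."""
    return sorted(set(required_scenarios) - set(tested_scenarios))


def find_contradictions(results):
    """Scenarios that the report records with more than one distinct outcome."""
    outcomes = {}
    for scenario, outcome in results:
        outcomes.setdefault(scenario, set()).add(outcome)
    return sorted(s for s, seen in outcomes.items() if len(seen) > 1)


required = ["local resident applicant", "non-local resident applicant"]
tested = ["local resident applicant"]
gaps = find_scope_gaps(required, tested)
conflicts = find_contradictions([
    ("local resident applicant", "pass"),
    ("local resident applicant", "fail"),
])
```

Here `gaps` flags the uncovered non-local applicant scenario and `conflicts` flags the scenario reported as both passed and failed. In practice the hard part is the extraction step that this sketch takes as given.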
Industrial Bank's "intelligent swarm architecture" has been upgraded to a "layered scheduling + cross-agency collaboration" model based on industry trends to solve the resource adaptation problems of financial institutions of different sizes:
(1) Hierarchical Scheduling to Optimize Computing Power Stratification
- Computing Power Classification: Following the "Intelligent Computing Power Demand Growth" trend in the "China Fintech and Digital Finance Development Report", scheduling is divided into three tiers: "light computing power" (small model + edge computing, supporting interface function testing), "standard computing power" (general large model + cloud server, supporting multi-modal interactive testing), and "high-end computing power" (large model + quantum communication, supporting encrypted link testing). For example, when a bank's encrypted transfer function is tested, quantum secure transmission computing power is dispatched automatically (following the quantum technology practice of ICBC and Hua Xia Bank). The test response speed increases by 40%, and data transmission security meets the Level 3 requirements of China's classified cybersecurity protection scheme.
- Priority Dynamic Adjustment: A new "business value weight" is added. In addition to emergency scenarios (retesting of production vulnerabilities), "high-value businesses" (such as digital renminbi pilot tests and green credit tests) are given high priority and receive resources first. For example, for one bank's digital renminbi red envelope test task, the response time was shortened from 5 minutes to 2 minutes and the business launch was brought forward by 3 days.
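The two scheduling rules above (route each task to the light, standard or high-end tier, and serve emergencies first, then tasks with higher business value) can be sketched with a priority queue. This is a minimal illustration and not Industrial Bank's implementation. The `Task` fields, the task kinds, and the 0 to 1 `business_value` scale are assumptions made for the example.

```python
import heapq
from dataclasses import dataclass
from itertools import count

# Hypothetical mapping from task kind to the computing power tier described above.
TIER_BY_KIND = {
    "interface": "light",
    "multimodal": "standard",
    "encrypted_link": "high_end",
}


@dataclass
class Task:
    name: str
    kind: str
    urgent: bool = False          # e.g. retest of a production vulnerability
    business_value: float = 0.0   # assumed scale: 0 (low) to 1 (high)


class Scheduler:
    def __init__(self):
        self._heap = []
        self._seq = count()  # tie-breaker that preserves submission order

    def submit(self, task):
        # Smaller keys pop first: urgent tasks, then higher business value, then FIFO.
        key = (0 if task.urgent else 1, -task.business_value, next(self._seq))
        heapq.heappush(self._heap, (key, task))

    def dispatch(self):
        """Pop the highest-priority task and return its name and target tier."""
        _, task = heapq.heappop(self._heap)
        return task.name, TIER_BY_KIND[task.kind]
```

Submitting a routine interface regression, a high-value digital renminbi test, and an urgent production retest in that order dispatches them in reverse: the urgent retest first, then the high-value task, then the routine one. The sequence counter keeps the heap from ever comparing two `Task` objects directly.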
(2) Cross-agency Collaborative Scheduling
- Shared Scheduling of Small and Medium-sized Institutions: To address the "insufficient computing power and weak technology" of small and medium-sized financial institutions (China.com's "Top Ten Development Trends in Financial Technology in 2025"), leading banks (such as Industrial and Commercial Bank of China) take the lead in building a "cross-agency test scheduling platform". Small and medium-sized banks connect through open APIs to share large model computing power and testing resources. For example, a city commercial bank called ICBC's "ICBC Smart" large model through the platform to conduct credit risk control testing, reducing testing costs by 50% and increasing coverage to 88%.
- Cross-industry Scenario Scheduling: Supports cross-industry business test scheduling such as finance, e-commerce, and medical care. For example, when testing the "bank + e-commerce" installment payment scenario, it automatically coordinates the bank test agent and the e-commerce platform test interface to realize the "order generation-payment-reconciliation" full-link test. The cross-industry test collaboration efficiency is increased by 70%, and the inefficiency problem of traditional "multi-team offline coordination" is avoided.
Based on the "ICBC Smart" large model, ICBC built a multi-modal testing system covering financial market transactions such as foreign exchange and bonds:
- Multi-modal Data Coverage: Integrates trading instruction text (such as "Buy 10 million US dollars against the euro"), K-line chart images (checking the accuracy of support and resistance level annotations), and trading voice instructions (such as real-time instructions from English-speaking traders), so that the test covers the entire "text input - image analysis - voice interaction" scenario.
- Agent Collaborative Testing: Deploy FinTeam-style "analyst agents" (analyze the impact of macroeconomic data on transactions), "accountant agents" (verify transaction fee calculations), and "compliance agents" (check whether transactions comply with foreign exchange control requirements) to achieve full-process testing of "transaction decision-making-execution-compliance".
- Value Results: The speed of foreign exchange transaction decision response testing increased by 80%, transaction execution efficiency tests showed a 3-fold efficiency increase, related business income increased by 15% year-on-year in the first half of 2025, and the incidence of operational risks decreased by 62%, showing that multi-modal testing directly improves business value.
Tongdun Technology creates a multi-modal testing solution driven by "Credit Risk Verification Agent" for bank credit business:
- Multi-modal Risk Signal Testing: Uses OCR to analyze ID card and business license images (identifying traces of tampering), NLP to analyze corporate public opinion text (extracting risk terms such as "debt default"), and graph learning to build a guarantee chain graph (identifying related-party risks), covering full-dimensional testing of "identity verification - public opinion risk - related transactions".
- Self-verification Closed Loop: The test results automatically generate a risk analysis summary, mark "high-risk items" (such as "the company address does not match the actual place of business"), and compare it with regulatory compliance requirements (such as the "Interim Measures for Personal Loan Management") to realize the automation of "testing-compliance verification".
- Value Results: The identification rate of defects such as identity fraud and data falsification increased by 40%, the credit testing cycle was shortened from 15 days to 7 days, and the accuracy of non-performing loan ratio predictions in testing rose to 91%, providing accurate testing support for credit risk management and control.
A city commercial bank carried out an inclusive finance (small and micro enterprise loan) test through the "cross-institution test scheduling platform" opened by ICBC:
- Resource Sharing: Calling the PrivBayes synthetic test data shared by the platform (protecting the privacy of small and micro enterprises), and using Hengfeng Bank's "Quanshutong" data governance tool to clean unstructured data (such as corporate tax certificate text), reducing data preparation time by 60%.
- Multi-agent Collaboration: Deploy "test case agents" (generating loan use cases for small and micro enterprises in different industries), "automated test agents" (simulating the loan application-approval-loan process), and "evaluation agents" (quantifying the adaptability of test results to inclusive financial policies).
- Value Results: The inclusive finance test coverage increased from 65% to 88%, the test cost was reduced by 50%, and 12 defects such as "false tax data of small and micro enterprises" and "incompatible industry qualifications" were successfully identified, supporting a 25% year-on-year increase in the bank's inclusive loan scale.
Tongji University provides a "multi-modal + self-healing" Xinchuang test plan for a state-owned bank's Xinchuang project (combined with Xinchuang development trends):
- Multimodal Xinchuang Scenario Coverage: Tests temperature and bandwidth data (numeric values) of domestic devices (such as Huawei Kunpeng), UI interaction (image) of the Xinchuang operating system (Kylin OS), and SQL command execution (text) of the Xinchuang database (Kingbase, i.e. Renda Jincang), verifying "hardware-system-database" compatibility.
- Self-healing Ability Application: ESBMC-AI automatically repairs Xinchuang code compatibility vulnerabilities (such as the syntax adaptation problem of C language in domestic compilers), with a repair accuracy rate of over 90%, and simultaneously generates Xinchuang test reports.
- Value Results: Xinchuang test environment interruption time was reduced by 45%, the core system Xinchuang adaptation test coverage increased to 92%, and 8 Xinchuang software and hardware compatibility defects were identified and repaired, providing stable testing guarantee for the bank's Xinchuang transformation.
In the future, combined with the "Top Ten Development Trends in Financial Technology in 2025" and industry practices, multi-modal and intelligent agent collaboration will accelerate the evolution to "universal intelligent test entities". The core trends include:
With regulators explicitly requiring financial institutions to "establish an evaluation system for the effectiveness of digital transformation", testing has shifted from "coverage first" to the dual goals of "coverage + quality-efficiency ratio". As for the direction of evolution, the universal intelligent test entity will add a "quality and efficiency evaluation module" that automatically calculates the ratio of testing investment (computing power, time) to output (defect identification rate, business value improvement). For example, when a financial product is tested, a quality and efficiency report stating "testing cost reduced by 30% and defect rate reduced by 25%" is produced at the same time, supporting the evaluation of the institution's digital transformation.
- Multi-technology Stack Integration: Integration of quantum technology (such as quantum secure transmission testing), blockchain (test data storage), and edge computing (offline scenarios such as edge testing of ATMs). It is expected that by 2027, quantum encryption link testing will cover 80% of the core trading systems of major state-owned banks.
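The arithmetic behind such a quality and efficiency report can be sketched as follows. This is a hypothetical illustration: the source does not specify the module's formulas, so the percentage-reduction definition, the "defects found per compute hour" ratio, and the function names are assumptions.

```python
def reduction_pct(before, after):
    """Percentage reduction from before to after (positive means improvement)."""
    if before <= 0:
        raise ValueError("before must be positive")
    return round((before - after) / before * 100, 1)


def defects_per_compute_hour(defects_found, compute_hours):
    """One possible output/investment ratio: defects identified per compute hour."""
    if compute_hours <= 0:
        raise ValueError("compute_hours must be positive")
    return defects_found / compute_hours


def quality_efficiency_report(cost_before, cost_after, defect_rate_before, defect_rate_after):
    """Format a one-line report in the style quoted above."""
    cost_cut = reduction_pct(cost_before, cost_after)
    defect_cut = reduction_pct(defect_rate_before, defect_rate_after)
    return f"testing cost reduced by {cost_cut:g}% and defect rate reduced by {defect_cut:g}%"


report = quality_efficiency_report(100, 70, 4.0, 3.0)
```

With a cost falling from 100 to 70 units and a defect rate falling from 4.0% to 3.0%, `report` reads "testing cost reduced by 30% and defect rate reduced by 25%", matching the example figures in the text.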
- Multi-modal Simulation Upgrade: From "static data fusion" to "dynamic physical simulation", such as simulating mobile banking UI interactive testing in extreme physical environments such as earthquakes and network interruptions, and generating "extreme scenario test cases" through multi-modal large models to improve the authenticity of system robustness testing.
- Leading Institutions: Build a "universal test agent ecosystem" and open APIs to share capabilities with small and medium-sized institutions - for example, the Industrial and Commercial Bank of China plans to open the "ICBC Smart" test large model interface to 100 small and medium-sized banks in 2026 to promote the overall improvement of the industry's testing capabilities.
- Small and Medium-sized Institutions: Focus on "lightweight testing of segmented scenarios". For example, rural commercial banks focus on "agriculture, rural areas and farmers" loan testing, and call special testing agents through a shared platform to achieve "small but precise" differentiated testing and avoid "large and comprehensive" resource waste.
- Multi-dimensional Interaction: In addition to voice commands (such as "test the performance of the credit card repayment interface"), AR interaction is supported - testers use AR glasses to view the mobile banking UI test results (such as button response delay annotation), and issue voice commands in real time to adjust the test strategy.
- Zero-code Testing: The universal intelligent test entity provides "drag-and-drop" test scenario configuration, so business personnel can build a "credit approval test process" without coding knowledge. The technical barrier to testing is lowered by 80%, enabling an industry shift toward "business personnel-led testing".
It is expected that in 2027, leading banks will achieve core business coverage of "universal intelligent test entities"; in 2028, small and medium-sized institutions will realize the large-scale application of universal testing agents through ecological sharing, increase industry-wide testing efficiency by 60%, reduce defect rates by 55%, and promote the digital transformation of finance from "technology application" to "value cultivation".
(From: TesterHome)