Agentic AI is gaining attention in the financial sector, but the biggest obstacle is no longer whether the models are powerful enough. The harder problem is whether banks, asset managers and treasury desks have the infrastructure to delegate financial tasks to autonomous systems without losing control of money, accountability or compliance.
A Deloitte survey of more than 3,300 finance and accounting professionals showed the gap clearly: 80.5% said AI-powered tools like agents and GenAI chatbots could become standard within five years, but only 13.5% said their organizations were already using agentic AI.
Citi Sky showed why the infrastructure debate matters
Citi on April 22 launched Citi Sky, an AI-powered wealth assistant built with Google Cloud and Google DeepMind technologies. The tool was developed using Google’s Gemini Enterprise Agent Platform and will be rolled out in phases to Citigold customers in the US this summer.
The launch gave the agentic AI debate a live banking example. Citi’s head of wealth technology, Dipendra Malhotra, pointed to memory as a central limitation for high-stakes advisory AI, asking how long a client can keep a conversation going before the system starts to hallucinate.
Most agents rely on retrieval-augmented generation (RAG) to expand memory via external databases, but context windows still determine how much information an agent can hold at once.
In financial advice, treasury management or portfolio execution, that memory ceiling becomes more than a technical issue. It becomes an operational risk.
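The retrieval pattern described above can be sketched in a few lines. This is a minimal, illustrative mock, not any vendor's API: past conversation turns live in an unbounded external store, a simple keyword-overlap score stands in for vector retrieval, and a fixed turn budget stands in for the model's context window.

```python
# Minimal sketch of retrieval-augmented memory; the store and the
# scoring function are illustrative, not a real RAG library.

def tokenize(text):
    return set(text.lower().split())

class ConversationMemory:
    """Keeps every past turn externally; only top-k turns re-enter the prompt."""

    def __init__(self, context_budget_turns=2):
        self.turns = []                      # unbounded external store
        self.budget = context_budget_turns   # stands in for the context window

    def remember(self, turn):
        self.turns.append(turn)

    def retrieve(self, query):
        # Score each stored turn by keyword overlap with the query,
        # then keep only as many turns as the context budget allows.
        q = tokenize(query)
        scored = sorted(self.turns, key=lambda t: len(q & tokenize(t)), reverse=True)
        return scored[: self.budget]

memory = ConversationMemory(context_budget_turns=2)
memory.remember("client holds 40% equities and 60% bonds")
memory.remember("client asked about municipal bond tax treatment")
memory.remember("client prefers quarterly rebalancing")

context = memory.retrieve("should we rebalance the bond allocation?")
print(context)
```

Everything outside the retrieved top-k is invisible to the model on that turn, which is the memory ceiling Malhotra describes: the store can grow forever, but the prompt cannot.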
MihnChi Park, co-founder of CoinFello, said the conditions for trusted delegation are simple: the agent can only act within the user’s instructions, the user can stop it, and the underlying assets never go to a third party.
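Park's three conditions translate naturally into a guard around the agent. The sketch below is hypothetical (the class and method names are illustrative, not a real custody API): actions outside the user's instructions are refused, a kill switch halts everything, and the agent only returns proposals for the user's own wallet to sign rather than holding assets itself.

```python
# Hypothetical sketch of the three delegation conditions; names are
# illustrative, not a real custody or wallet API.

class DelegatedAgent:
    def __init__(self, allowed_actions):
        self.allowed = set(allowed_actions)  # only within the user's instructions
        self.stopped = False                 # the user can stop it at any time

    def stop(self):
        self.stopped = True

    def propose(self, action):
        # The agent never holds assets: it only returns a proposal that
        # the user's own wallet must sign, so funds never reach a third party.
        if self.stopped:
            raise RuntimeError("agent halted by user")
        if action not in self.allowed:
            raise PermissionError(f"{action} exceeds user instructions")
        return {"action": action, "requires_user_signature": True}

agent = DelegatedAgent(allowed_actions={"rebalance", "report"})
proposal = agent.propose("rebalance")
agent.stop()
```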
Ethereum establishes on-chain primitives for agent identity
Ethereum proposal ERC-8004 introduces a framework for agent identity, reputation and validation. The draft standard defines three registries: an identity registry, a reputation registry and a validation registry.
Together they are intended to help autonomous agents prove who they are, build a behavioral record, and support verification by other market participants.
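The division of labor between the three registries can be mirrored in a toy data model. The real standard is implemented as on-chain Solidity contracts, and the field names below are assumptions for illustration only: one table answers who the agent is, one accumulates a behavioral record, and one holds third-party attestations of its work.

```python
# In-memory mirror of the three-registry split; real deployments are
# Solidity contracts, and these field names are illustrative assumptions.

class AgentRegistries:
    def __init__(self):
        self.identity = {}    # agent_id -> metadata proving who the agent is
        self.reputation = {}  # agent_id -> list of feedback entries
        self.validation = {}  # task_id -> validator attestations

    def register_identity(self, agent_id, domain):
        self.identity[agent_id] = {"domain": domain}

    def submit_feedback(self, agent_id, client, score):
        self.reputation.setdefault(agent_id, []).append(
            {"client": client, "score": score}
        )

    def attest(self, task_id, validator, passed):
        self.validation.setdefault(task_id, []).append(
            {"validator": validator, "passed": passed}
        )

reg = AgentRegistries()
reg.register_identity("agent-1", "treasury.example.com")
reg.submit_feedback("agent-1", client="0xabc", score=5)
reg.attest("task-42", validator="0xdef", passed=True)
```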
ERC-8183 takes a narrower route. It proposes a job-assurance standard built around an assessor’s attestation: a client funds a job in escrow, a service provider submits the work and an assessor approves or rejects the outcome.
The proposal does not provide for arbitration or formal dispute resolution, but it gives agent-based markets a framework for escrowed tasks and verifiable completion.
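The fund, submit and assess steps form a small state machine. The sketch below is an assumption-laden illustration of that flow, not the draft's actual contract interface; note there is deliberately no state beyond approved or rejected, matching the absence of an arbitration path.

```python
# Illustrative state machine for the fund -> submit -> assess flow;
# the draft standard defines this on-chain, and names here are assumptions.

class EscrowedJob:
    def __init__(self, client, provider, assessor, amount):
        self.client, self.provider, self.assessor = client, provider, assessor
        self.amount = amount
        self.state = "funded"  # the client funds the job up front

    def submit_work(self, caller, deliverable):
        assert caller == self.provider and self.state == "funded"
        self.deliverable = deliverable
        self.state = "submitted"

    def assess(self, caller, approve):
        # The assessor's attestation settles the job; there is no
        # arbitration path beyond approve or reject.
        assert caller == self.assessor and self.state == "submitted"
        self.state = "approved" if approve else "rejected"
        return self.provider if approve else self.client  # who receives the funds

job = EscrowedJob("client", "provider", "assessor", amount=100)
job.submit_work("provider", "report.pdf")
payee = job.assess("assessor", approve=True)
```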
The arXiv paper “The Agent Economy: A Blockchain-Based Foundation for Autonomous AI Agents” maps out a five-layer architecture for this shift, covering physical infrastructure, on-chain identity, cognitive tools, economic settlement and collective governance.
The reputation layer still has a structural vulnerability. Agents can generate activity at a speed and scale that humans cannot match, making it possible to inflate trust signals for short periods of time.
That leaves financial institutions with a difficult question: If an agent has a good track record, is that evidence of trustworthiness or just evidence of repeated automated activity?
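The inflation problem above comes down to what a reputation system counts. A toy comparison makes the gap visible: raw interaction volume, which an automated agent can mint quickly, versus the number of distinct counterparties, which it cannot. The data here is invented purely for illustration.

```python
# Sketch of why raw activity counts inflate trust signals: one scripted
# counterparty can mint many positive interactions; distinct raters cannot.
# The feedback list is invented example data.

feedback = [("bot-A", 5)] * 50 + [("human-1", 5), ("human-2", 4)]

raw_volume = len(feedback)                            # looks like 52 good interactions
distinct_sources = len({src for src, _ in feedback})  # but only 3 unique raters
print(raw_volume, distinct_sources)
```

A registry that scores by volume sees a strong track record; one that scores by unique counterparties sees almost none, which is the ambiguity institutions face.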
McKinsey places 50% to 60% of banking activities in scope
McKinsey estimates that 50% to 60% of banks’ full-time equivalents are related to business operations. Experts warn of a “pilot purgatory,” where institutions conduct limited proofs of concept without rewiring the operating model.
As Cryptopolitan reported at the Hong Kong Web3 Festival, McKinsey predicted that the agentic AI market would grow from $5.25 billion in 2024 to around $200 billion in 2034.
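Those two endpoints imply a steep compound growth rate, which a quick back-of-envelope calculation makes concrete:

```python
# Back-of-envelope check on the cited market figures.
start, end, years = 5.25, 200.0, 10      # $bn, 2024 -> 2034
cagr = (end / start) ** (1 / years) - 1  # compound annual growth rate
print(f"implied CAGR: {cagr:.1%}")       # roughly 44% per year
```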
Porter Stowell, CEO of W3.io, said: “There’s no way for companies to see, control or audit what autonomous systems are doing with their money. Human oversight isn’t going away. It’s just moving up.”
Four questions remain unresolved: who is responsible if an AI agent causes financial loss, whether its reputation can be trusted, who remains in control once these systems are widely deployed, and which regulatory framework applies if an agent acts outside its mandate.
