From Days to Minutes: How We Built an AI Agent That Acts on Gold Abuse Before the Damage Compounds
The following is a guest editorial courtesy of Jarosław Klamut, PhD, Head of Risk at trading technology provider Match-Trade Technologies, a strategic technology supplier to CySEC-licensed liquidity provider Match-Prime.
Gold is a universally recognized financial instrument, trading as a truly global market influenced less by any single domestic economy and more by cross-market forces such as risk sentiment, inflation expectations, and geopolitical developments. This broad appeal and constant relevance keep it among the world’s most actively traded assets. Within our company, gold stands out as one of the most liquid and consistently traded instruments across our client base. Its high demand and continuous activity not only attract legitimate traders but also make it a prime target for potential abuse.
This is not about isolated incidents or random anomalies. We are seeing structured, recurring patterns in which certain clients exploit market conditions to generate a consistent, unfair advantage at the brokerage’s expense. These cases rarely involve a single account. More often, they surface as clusters of related users forming a trading clique, with similar timing and behavior. The problem is real, the financial impact is material, and the operational challenge lies in speed. Among confirmed abusive accounts, the mean profit extracted per account is measured in thousands of dollars – and when the full clique is taken together, that figure can easily reach tens of thousands in a single night.
By the time a traditional investigation process has identified the case, reviewed the evidence, and approved a response, the damage has already been done.
Why Detection Alone Is Not Enough
In the CFD and FX brokerage environment, abusive client flow is not just a surveillance concern but a direct commercial risk. It can create immediate PnL leakage, distort the broker’s view of client quality and risk, push the dealing desk into reactive decision-making, and expose weaknesses in execution, hedging, and control processes.
For market maker brokers servicing other brokers and their underlying retail flow, the challenge is even greater because visibility is often partial. In many cases, abusive behavior must be inferred from aggregate flow rather than a fully transparent end-client view.
The difficulty, however, is not simply detecting suspicious events. The harder problem is deciding quickly and responsibly which cases warrant action. In noisy markets, false positives are a constant risk. Legitimate trading can resemble abuse, and overreacting reduces liquidity and harms the trading experience we want to preserve. At the same time, relying on manual review creates false negatives in practice. Some cases are more serious than they first appear, yet they remain unrestrained for hours while the impact builds.
That tension is why our solution was built as a layered decision engine, not a simple alerting tool.
What We Needed Was Not More Alerts
Our solution was created for a specific, recurring pattern of abusive gold flow, not for suspicious trading in the abstract. The behavior is structurally consistent and marked by execution characteristics that are difficult to dismiss as random.
What mattered most, however, was not improving detection. HawkEye RMS, our first-line surveillance system, was already identifying suspicious gold activity in real time, supported by the broader infrastructure already in place – risk databases, order history, and price feeds. We did not need to rebuild the surveillance stack.
The real gap came after the alert. Until now, every suspicious case required manual investigation, escalation, decision-making, and eventual restriction – often taking hours, and in complex situations days. In fast-moving abuse scenarios, that delay is exactly where the damage accumulates. So the problem we set out to solve was not how to create more alerts, but how to turn a credible detection into immediate, governed action. In practice, that meant building an AI agent that operates around the clock and protects the broker while the opportunity to intervene still exists, including when human teams are offline.
Three Phases, One Goal
Our main challenge was not a lack of signals or data, but the volume and complexity behind them. To act quickly without sacrificing decision quality, we designed the system as a layered funnel. Each layer filters out what can be safely dismissed, which means recall stays high and genuinely harmful behavior is not lost. Every subsequent phase looks at fewer cases, but with more context and more rigor.
Phase 1: Surveillance identifies suspicious activity
HawkEye RMS flags cases that match the profile of suspicious gold trading. This is pattern detection at scale – monitoring large volumes of trading activity and surfacing candidates that warrant closer examination. The surveillance layer identifies potential abuse, but it does not confirm it.
By tuning HawkEye RMS’s parameters, we filter out around 90% of cases as standard activity, leaving roughly a tenth of the original volume as abnormal cases that deserve closer examination.
Phase 2: Analytics build the case
When a case is flagged as potential abuse, the system pulls together everything relevant about that client’s recent trading activity. It analyses position dynamics and execution patterns over a meaningful time window and calculates deeper quantitative evidence. This layer follows a classical data science approach, relying on statistical relationships between engineered expert metrics calibrated on historical cases. This allows us to set thresholds and decision boundaries that separate structurally suspicious patterns from coincidental market noise.
This phase constructs quantitative evidence grounded in historical data. A positive result from Phase 2 is still not sufficient grounds for a final decision, but it provides the statistical basis for judging whether the case is truly abnormal and suspicious. On this basis, we can again filter out around 90% of the remaining cases while maintaining recall at almost 100%. As a result, only a small number of cases reach the next phase, which makes it practical to apply heavier analytical methods such as AI and supports autonomous, highly reliable decision-making and action.
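To make the calibration idea concrete, here is a minimal Python sketch of how a cut-off could be chosen from historical cases so that recall stays near 100%. It is an illustration under assumptions, not the production logic: the single composite score, the sample values, and the function names `calibrate_threshold` and `passes_phase2` are all hypothetical.

```python
import math

def calibrate_threshold(abusive_scores, target_recall=1.0):
    """Return the highest cut-off that still keeps at least `target_recall`
    of the historical confirmed-abuse scores at or above it."""
    ranked = sorted(abusive_scores)
    n = len(ranked)
    # Number of historical abusive cases that must still pass the cut-off.
    must_keep = max(1, math.ceil(target_recall * n - 1e-9))
    return ranked[n - must_keep]

def passes_phase2(score, threshold):
    """A case moves on to Phase 3 only if its score reaches the cut-off."""
    return score >= threshold

# Hypothetical composite scores (higher = more suspicious).
historical_abusive = [0.62, 0.71, 0.75, 0.80, 0.84, 0.88, 0.90, 0.93, 0.95, 0.99]
flagged_today = [0.12, 0.20, 0.33, 0.41, 0.48, 0.55, 0.58, 0.60, 0.61, 0.97]

threshold = calibrate_threshold(historical_abusive, target_recall=1.0)
survivors = [s for s in flagged_today if passes_phase2(s, threshold)]
print(f"threshold={threshold}, cases passed on: {len(survivors)} of {len(flagged_today)}")
```

Anchoring the cut-off to confirmed historical abuse, rather than to the distribution of normal flow, is one simple way to favor recall: the threshold is only as strict as the weakest abusive case it is required to catch.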
Phase 3: AI evaluates and decides
After quantitative validation, the case reaches the final stage. At this point, we leverage recently developed and now widely adopted solutions built around AI agents. The Risk AI Agent is where the system shifts from measurement to judgment. It is a tuned AI decision layer that reviews a prepared evidence package rather than raw data, with key signals distilled into a structured, interpretable summary of what the behavior represents. It performs the final decision step with human-level precision and consistency, but at machine speed and around the clock. When the criteria are met, the AI triggers the restriction immediately without waiting for manual approval.
At this step, we filter out roughly half of the remaining cases, which is crucial for not disrupting standard trading. The Risk AI Agent is a key stage that, with expert-like general scrutiny, can determine whether all suspicious markers combined indicate merely an atypical trader or a clear sign of abusive flow.
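The cumulative effect of the three phases is easier to see as arithmetic. The sketch below applies the approximate rates quoted above (keep about 10% after Phase 1, about 10% after Phase 2, and, reading the final step as removing about half of what is left, about 50% after Phase 3). The starting volume of 10,000 cases and the function name `run_funnel` are assumptions chosen purely for illustration.

```python
def run_funnel(initial_cases, keep_percent_by_phase):
    """Apply each phase's approximate keep-rate in turn and record what remains."""
    remaining = initial_cases
    report = []
    for phase, keep_percent in keep_percent_by_phase:
        # Integer arithmetic keeps the illustration free of rounding noise.
        remaining = remaining * keep_percent // 100
        report.append((phase, remaining))
    return report

phases = [
    ("Phase 1: surveillance", 10),  # ~90% filtered out
    ("Phase 2: analytics", 10),     # ~90% of the remainder filtered out
    ("Phase 3: AI agent", 50),      # ~half of the remainder filtered out
]

initial = 10_000  # hypothetical number of monitored cases
report = run_funnel(initial, phases)
for phase, remaining in report:
    print(f"{phase}: {remaining} cases remain")

final = report[-1][1]
print(f"Share of cases ending in action: {100 * final / initial}%")
```

Under these round numbers, about 0.5% of the initial cases end in a restriction, and each phase handles an order of magnitude fewer cases than the one before it, which is what makes the heavier analysis in the later phases affordable.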
AI as the Primary Decision-Maker
When all the evidence gates are satisfied, the system sends a trading restriction directly into our risk management infrastructure. The abusive client’s gold trading activity is constrained automatically and immediately, without a human approving the action first. The risk team receives a full notification with charts, quantitative results, and the AI’s reasoning, but by the time they read it, the restriction is already active.
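The act-first, notify-after flow described here can be sketched in a few lines of Python. Everything below is a hypothetical illustration rather than the actual integration: the `CaseEvidence` fields, the verdict labels, the `"GOLD"` instrument label, and the `restrict` and `notify` callbacks are stand-ins for whatever the real risk infrastructure exposes.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class CaseEvidence:
    client_id: str
    surveillance_flag: bool  # Phase 1 result
    stats_abnormal: bool     # Phase 2 result
    ai_verdict: str          # Phase 3 result: "abusive" or "atypical"
    ai_reasoning: str        # explanation attached to the notification

def all_gates_satisfied(case: CaseEvidence) -> bool:
    """Every layer must agree before any action is taken."""
    return (case.surveillance_flag
            and case.stats_abnormal
            and case.ai_verdict == "abusive")

def handle_case(case: CaseEvidence,
                restrict: Callable[[str, str], None],
                notify: Callable[[str], None]) -> bool:
    """Restrict first, then inform the risk team; return True if action was taken."""
    if not all_gates_satisfied(case):
        return False
    restrict(case.client_id, "GOLD")  # the restriction is active before anyone reads the report
    notify(f"Restricted {case.client_id} on GOLD. Reasoning: {case.ai_reasoning}")
    return True

# Usage with in-memory stand-ins that simply record what happened.
log = []
case = CaseEvidence("client-001", True, True, "abusive", "coordinated timing across a clique")
acted = handle_case(case,
                    restrict=lambda cid, sym: log.append(("restrict", cid, sym)),
                    notify=lambda msg: log.append(("notify", msg)))
print(acted, [entry[0] for entry in log])
```

Passing the restriction and notification as callbacks keeps the decision logic separate from the systems it acts on, and the recorded order of calls makes it easy to verify that the restriction always precedes the notification.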
This represents a transformation that shifts humans away from constant monitoring, decision-making, and manual intervention toward a supervisory role focused on overseeing and validating the workflows executed by AI agents.

Core Strengths of the AI Agent
Near-real-time protection
The system operates within a timeframe that is still commercially and operationally relevant. Its role is not simply to analyse suspicious behavior after the fact, but to constrain it while intervention can still make a difference. In practice, it delivers a level of decision speed that is effectively unattainable for humans, even with a 24/7 dealing team monitoring the flow.
Autonomous first action
The AI does more than generate a recommendation. It makes the initial operational decision and triggers the restriction autonomously, with human validation taking place afterward rather than approval being required upfront.
Layered decision quality
The solution combines surveillance triggers, statistical evidence, and AI judgment instead of trusting one layer in isolation. This kind of ensemble of complementary approaches consistently delivers more robust decisions than relying on one model alone.
Explainable autonomy
The system is designed for speed, but not at the expense of control. Each action remains reviewable, traceable, and defensible, and the AI agent can be asked not only to make a decision but also to explain the reasoning behind it.
Business alignment
The design reflects the realities of dealing-desk risk management, where a delayed perfect decision is often less valuable than a fast, well-governed, high-quality one.
Flexibility and rapid iteration
Unlike classical ML models, the AI decision layer can be adapted to new abuse variants or structurally similar use cases without a full retraining cycle. In a fast-paced market where abuse patterns constantly change, relying on a heavy, rigid ML model becomes a liability. An adaptive agentic AI approach remains far more flexible, making the solution both specialized today and extensible as operational needs evolve.
What This Means in Practice
One in three abusers extracts a profit exceeding $25,000. And because abuse almost never surfaces in isolation – coordinated cliques of related accounts are the norm rather than the exception – the combined impact per incident routinely compounds far beyond what any single account suggests. Every hour without an automated response is a measurable, avoidable loss. That is precisely why the system we built represents a fundamental shift in how gold abuse is managed, replacing a multi-day manual process with a near-real-time response that acts before losses have the chance to accumulate.
Rather than relying on a single alert or model output, it brings together upstream surveillance, deeper statistical validation, and AI-led final decisioning in one coordinated process. When the evidence is sufficient, action is taken immediately, without manual approval or delay. The result is a reduction in effective time-to-action from days to minutes.
All this represents a transformation in which humans no longer need to continuously monitor, make decisions, and intervene manually. Instead, they take on a supervisory role, overseeing the process and validating the workflows executed by AI agents.
For a closer look at how our AI agent was designed – from surveillance triggers and evidence construction to the governance model behind safe autonomous action – read the second part of the article (coming soon!).
