Fraud Score
What Is a Fraud Score? Definition and How It Works
Definition
A fraud score is a real-time numerical risk rating assigned to a payment transaction based on signals derived from the transaction data, device, behavioural patterns, and historical data, used to determine whether to approve, decline, or escalate the transaction for review.
How it works
Fraud scoring systems evaluate an incoming transaction against a set of signals and assign a score, typically on a scale of 0 to 100 or 0 to 1000, representing the estimated probability that the transaction is fraudulent. Higher scores indicate higher fraud risk. The score is computed and returned in milliseconds, in parallel with or preceding the authorisation request.
The signals feeding a fraud score fall into several categories. Transaction-level signals include card BIN characteristics, transaction amount, currency, merchant category, and transaction velocity for the card. Device and session signals include IP address, device fingerprint, browser characteristics, and geolocation. Behavioural signals include navigation patterns before checkout, typing speed, and time on page. Historical signals include the card's prior transaction history and any previous declines or disputes.
Machine learning models trained on labelled fraud and non-fraud transaction data are the basis for most modern fraud scoring engines. The model learns which signal combinations are predictive of fraud in the merchant's specific transaction population. Models require periodic retraining as fraud patterns evolve.
Fraud scores are consumed by risk rules. Thresholds set by the merchant determine the action taken at each score level: approve automatically below a low threshold, escalate to manual review between the two thresholds, or decline above a high threshold. Threshold calibration determines the trade-off between fraud loss and false positive rate.
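The threshold logic described here can be sketched in a few lines of Python. This is a minimal illustration rather than any particular engine's implementation: the function name `decide`, the 0 to 1000 scale, the default thresholds of 300 and 700, and the choice to treat a score exactly on a threshold as belonging to the riskier band are all assumptions made for the example.

```python
from enum import Enum


class Action(Enum):
    APPROVE = "approve"
    REVIEW = "review"
    DECLINE = "decline"


def decide(score: int, review_at: int = 300, decline_at: int = 700) -> Action:
    """Map a fraud score (assumed 0-1000, higher = riskier) to an action.

    review_at and decline_at are hypothetical defaults; real values are
    calibrated per merchant from observed fraud and false positive rates.
    """
    if not 0 <= score <= 1000:
        raise ValueError(f"score out of range: {score}")
    if review_at > decline_at:
        raise ValueError("review_at must not exceed decline_at")
    # Assumption: a score exactly on a threshold falls into the riskier band.
    if score >= decline_at:
        return Action.DECLINE
    if score >= review_at:
        return Action.REVIEW
    return Action.APPROVE
```

Keeping the thresholds as parameters rather than constants reflects the point that they are merchant-configurable and expected to change as calibration data accumulates.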
Why it matters
Threshold calibration directly affects both fraud loss and conversion: because higher scores indicate higher risk, a lower (stricter) decline threshold blocks more fraud but also declines more legitimate transactions (false positives), while a higher (more lenient) threshold does the reverse. Merchants must measure both the fraud rate they accept and the false positive rate they generate to find the threshold that optimises total revenue net of fraud costs.
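One way to make this trade-off concrete is to replay labelled historical transactions against several candidate decline thresholds and compare the outcome of each. The sketch below assumes a deliberately simple cost model: the merchant earns a fixed margin on every approved legitimate transaction and loses the full amount on every approved fraudulent one. The `Txn` type, the 10% margin, and the `SAMPLE` data are invented for illustration only.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Txn:
    score: int      # fraud score at decision time (assumed 0-1000)
    amount: float   # transaction value
    is_fraud: bool  # label known after the fact, e.g. from a fraud chargeback


def evaluate(txns: Sequence[Txn], decline_at: int, margin: float = 0.10) -> dict:
    """Replay txns as if every score >= decline_at had been declined."""
    fraud_approved = false_positives = 0
    net = 0.0
    for t in txns:
        declined = t.score >= decline_at
        if t.is_fraud and not declined:
            fraud_approved += 1
            net -= t.amount            # assumed: full amount lost to fraud
        elif not t.is_fraud and declined:
            false_positives += 1       # legitimate sale blocked, no revenue
        elif not t.is_fraud:
            net += t.amount * margin   # assumed: fixed margin per legitimate sale
    return {
        "decline_at": decline_at,
        "fraud_approved": fraud_approved,
        "false_positives": false_positives,
        "net": net,
    }


def best_threshold(txns: Sequence[Txn], candidates: Sequence[int],
                   margin: float = 0.10) -> int:
    """Return the candidate threshold with the highest net revenue."""
    return max(candidates, key=lambda c: evaluate(txns, c, margin)["net"])


# Synthetic toy data, not real transactions.
SAMPLE = [
    Txn(100, 50.0, False), Txn(200, 80.0, False), Txn(650, 120.0, False),
    Txn(400, 60.0, True), Txn(720, 200.0, True), Txn(900, 300.0, True),
]
```

A realistic cost model would also include chargeback fees, manual review costs, and the lifetime value of customers lost to false declines, any of which can move the optimal threshold.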
Single-signal rules produce high false positive rates: rules based on one signal (e.g., block all transactions from country X) are a blunt instrument that blocks legitimate customers at high rates. Multi-signal ML scoring produces fewer false positives for the same level of fraud protection.
Fraud models require maintenance: fraud attack patterns shift over time as attackers adapt to detection. A model trained on data from 12 months ago may perform materially worse on today's transaction population. Model drift monitoring and periodic retraining are part of fraud system maintenance.
Score-level data should inform chargeback root cause analysis: correlating fraud chargeback outcomes against the scores assigned at the time of the transaction reveals whether the scoring model flagged the fraud before the chargeback and whether the threshold was set correctly.
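A rough sketch of this analysis: take the score each fraud chargeback carried at transaction time and bucket it by the thresholds that were in force. The bucket names, the threshold defaults, and the function names are hypothetical. The interpretation attached to each bucket is an assumption about a typical approve, review, decline setup.

```python
from collections import Counter
from typing import Iterable


def classify_chargeback(score: int, review_at: int = 300,
                        decline_at: int = 700) -> str:
    """Label a fraud chargeback by the score it had at transaction time."""
    if score >= decline_at:
        # Should have been declined: suggests a manual override, or that
        # the threshold was different when the transaction was processed.
        return "override_or_threshold_changed"
    if score >= review_at:
        # The model saw elevated risk, but review or threshold let it through.
        return "flagged_but_approved"
    # The model scored the transaction as low risk: a model miss.
    return "missed_by_model"


def root_cause_summary(chargeback_scores: Iterable[int], review_at: int = 300,
                       decline_at: int = 700) -> Counter:
    """Count fraud chargebacks per root-cause bucket."""
    return Counter(
        classify_chargeback(s, review_at, decline_at) for s in chargeback_scores
    )
```

A summary dominated by `missed_by_model` points towards retraining or new signals, whereas one dominated by `flagged_but_approved` points towards tighter thresholds or a stricter review process.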
With PXP
PXP's risk engine assigns real-time fraud scores to transactions using a combination of rule-based checks and ML-assisted scoring. Score thresholds are configurable per merchant and per transaction type. Score data is retained and surfaced in reporting to support threshold tuning and fraud pattern analysis.
Frequently asked questions
What signals have the most predictive value in fraud scoring?
It varies by merchant vertical and fraud type, but high-value signals typically include device fingerprint (a device associated with prior fraud is a strong signal), transaction velocity on the card, IP-to-billing address geolocation mismatch, and BIN country mismatch. No single signal is reliable in isolation; the combination of signals and their interaction is what modern ML models are trained to evaluate.
How do merchants set fraud score thresholds?
Threshold setting requires data on fraud rates and false positive rates at each score band. Merchants typically start with a conservative threshold, measure the false positive rate (legitimate transactions declined), and adjust until the trade-off between fraud loss and blocked revenue is acceptable. Most fraud systems support separate thresholds by transaction type, card type, or geography.
What is model drift in fraud detection?
Model drift occurs when the statistical relationship between input signals and fraud outcomes changes over time, causing a model trained on historical data to perform less accurately on current transactions. Fraud attackers actively adapt their tactics to evade detection, which accelerates drift. The standard mitigations are monitoring model performance metrics, such as fraud capture rate and false positive rate, and retraining on fresh labelled data.
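As a minimal sketch of such monitoring, the two metrics can be computed for a baseline period and a recent period at the same decline threshold, with an alert raised when either degrades beyond a tolerance. The function names, the threshold of 700, and the tolerances (a 10-point drop in capture rate, a 2-point rise in false positive rate) are assumptions for illustration, not recommended values.

```python
from typing import Iterable, Tuple

Record = Tuple[int, bool]  # (score at decision time, confirmed fraud label)


def capture_and_fpr(records: Iterable[Record],
                    decline_at: int = 700) -> Tuple[float, float]:
    """Return (fraud capture rate, false positive rate) at decline_at."""
    caught = fraud = blocked_legit = legit = 0
    for score, is_fraud in records:
        flagged = score >= decline_at
        if is_fraud:
            fraud += 1
            caught += flagged
        else:
            legit += 1
            blocked_legit += flagged
    # NaN when a class is absent; comparisons with NaN are False, so an
    # empty class never triggers an alert on its own.
    capture = caught / fraud if fraud else float("nan")
    fpr = blocked_legit / legit if legit else float("nan")
    return capture, fpr


def drift_alert(baseline: Iterable[Record], current: Iterable[Record],
                decline_at: int = 700, max_capture_drop: float = 0.10,
                max_fpr_rise: float = 0.02) -> bool:
    """True if capture rate fell or FPR rose by more than the tolerances."""
    b_cap, b_fpr = capture_and_fpr(baseline, decline_at)
    c_cap, c_fpr = capture_and_fpr(current, decline_at)
    return (b_cap - c_cap) > max_capture_drop or (c_fpr - b_fpr) > max_fpr_rise
```

Because fraud labels often arrive weeks later through chargebacks, the "current" window in practice usually lags real time by the length of the dispute window.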
How does 3DS interact with fraud scoring?
3DS and fraud scoring are complementary rather than redundant. Fraud scoring identifies high-risk transactions; 3DS provides strong authentication for transactions that proceed to authorisation. A common approach is to use the fraud score as an input to the SCA exemption logic: request a 3DS exemption for low-risk transactions and enforce a 3DS challenge for high-risk ones.
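The routing described in this answer might look like the sketch below. The enum values, the function name `sca_route`, and the thresholds of 200 and 600 are hypothetical. The middle band, where the merchant expresses no preference and leaves the decision to the issuer, is an assumption added to complete the example. Whether an exemption may actually be requested also depends on regulatory conditions that are outside this sketch.

```python
from enum import Enum


class ScaRoute(Enum):
    REQUEST_EXEMPTION = "request_exemption"  # low risk: ask to skip the challenge
    NO_PREFERENCE = "no_preference"          # assumed middle band: issuer decides
    CHALLENGE = "challenge"                  # high risk: enforce a 3DS challenge


def sca_route(score: int, exemption_below: int = 200,
              challenge_at: int = 600) -> ScaRoute:
    """Choose a 3DS strategy from a fraud score (assumed 0-1000)."""
    if exemption_below > challenge_at:
        raise ValueError("exemption_below must not exceed challenge_at")
    if score >= challenge_at:
        return ScaRoute.CHALLENGE
    if score < exemption_below:
        return ScaRoute.REQUEST_EXEMPTION
    return ScaRoute.NO_PREFERENCE
```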