High Availability

What Is High Availability in Payments? Definition and How It Works

Definition

High availability in payments is the practice of designing payment infrastructure to minimise downtime so that transaction processing continues without interruption. The goal is typically expressed as a percentage uptime target, such as 99.99%.

How it works

High availability is achieved through redundancy at every layer of the payment stack: redundant servers, network connections, data centres, and routing paths. No single component failure should take the entire system offline. This is typically implemented through one of two architectures. In an active-active design, several components serve traffic at the same time and the survivors absorb the load when one fails. In an active-passive design, a standby component takes over automatically when the primary fails.
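The difference between the two architectures can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `Node` class, the `is_healthy` probe, and the node names are all hypothetical, and a real system would run health checks continuously rather than at selection time.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class Node:
    """A hypothetical processing node; is_healthy stands in for a real health probe."""
    name: str
    is_healthy: Callable[[], bool]


def pick_active_passive(nodes: Sequence[Node]) -> Node:
    """Active-passive: use the first healthy node in priority order."""
    for node in nodes:
        if node.is_healthy():
            return node
    raise RuntimeError("no healthy node available")


def pick_active_active(nodes: Sequence[Node], request_id: int) -> Node:
    """Active-active: spread requests across every node that is currently healthy."""
    healthy = [n for n in nodes if n.is_healthy()]
    if not healthy:
        raise RuntimeError("no healthy node available")
    return healthy[request_id % len(healthy)]


primary = Node("primary", lambda: False)  # simulate a failed primary
standby = Node("standby", lambda: True)

# With the primary down, the standby is selected without manual intervention.
print(pick_active_passive([primary, standby]).name)
```

In both functions the caller never names a specific node, which is what lets a failure be absorbed without a configuration change.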

A 99.99% uptime target (four nines) equates to approximately 52 minutes of allowable downtime per year. A 99.999% target (five nines) allows approximately 5 minutes of downtime per year. These figures sound similar, but the operational gap is significant: a single incident that takes ten minutes to resolve by hand would exceed the entire five-nines budget for the year, so achieving five nines requires automated failure detection and recovery in seconds, not minutes.
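The downtime budgets above follow directly from the uptime percentage. A short Python sketch of the arithmetic, assuming a 365-day year (the function name is illustrative):

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def downtime_budget_minutes(uptime_percent: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Allowable downtime in minutes for a given uptime target over a period."""
    return period_minutes * (1 - uptime_percent / 100)


for target in (99.9, 99.99, 99.999):
    print(f"{target}% -> {downtime_budget_minutes(target):.1f} minutes per year")
```

For four nines this gives roughly 52.6 minutes per year, and for five nines roughly 5.3 minutes. Passing a different `period_minutes` (for example 30 * 24 * 60) gives the budget for a month.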

Payment infrastructure has more demanding high availability requirements than most web applications because transactions are synchronous, time-sensitive, and financially consequential. A shopper who hits a 30-second outage while browsing can often reload the page and carry on. A payment request that arrives during the same 30 seconds fails outright, and a sale that fails at the point of payment is rarely recovered after the fact.

Geographic redundancy is a component of high availability for global merchants: infrastructure distributed across multiple data centre regions ensures that a regional outage or network event does not take all processing capacity offline. For merchants operating across multiple time zones, geographic distribution also enables lower-latency authorisation responses.
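As a rough illustration of how geographic distribution serves both goals, the sketch below routes each request to the lowest-latency region that is currently healthy. The `Region` class, the region names, and the latency figures are assumptions made for the example, not a description of any provider's routing logic.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Region:
    name: str          # illustrative region label
    latency_ms: float  # round-trip time measured from the caller
    healthy: bool      # result of the most recent health check


def choose_region(regions: Sequence[Region]) -> Region:
    """Pick the healthy region with the lowest latency."""
    healthy = [r for r in regions if r.healthy]
    if not healthy:
        raise RuntimeError("no healthy region available")
    return min(healthy, key=lambda r: r.latency_ms)


regions = [
    Region("eu-west", latency_ms=18.0, healthy=False),  # simulated regional outage
    Region("us-east", latency_ms=95.0, healthy=True),
    Region("ap-south", latency_ms=140.0, healthy=True),
]

# The nearest region is down, so traffic moves to the next-closest healthy one.
print(choose_region(regions).name)
```

When every region is healthy the same function simply returns the nearest one, which is where the latency benefit comes from.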

Why it matters

Uptime percentages translate directly to revenue impact: for a merchant processing $500M annually, each hour of payment downtime represents roughly $57,000 in lost transaction volume, assuming volume is spread evenly across the 8,760 hours in a year. An outage during peak trading hours costs proportionally more. Understanding the revenue cost of downtime in concrete terms makes SLA negotiation with payment providers more grounded.
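The hourly figure can be reproduced with a short calculation. The sketch below assumes volume is spread evenly across the year; `peak_multiplier` is a hypothetical knob for outages that land in busier-than-average periods.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; leap years ignored


def hourly_volume(annual_volume: float) -> float:
    """Average transaction volume per hour, assuming an even spread over the year."""
    return annual_volume / HOURS_PER_YEAR


def downtime_cost(annual_volume: float, downtime_minutes: float,
                  peak_multiplier: float = 1.0) -> float:
    """Transaction volume at risk during an outage of the given length."""
    return hourly_volume(annual_volume) * (downtime_minutes / 60) * peak_multiplier


print(f"${hourly_volume(500_000_000):,.0f} per hour")
print(f"${downtime_cost(500_000_000, 52.56):,.0f} for a full four-nines budget")
```

Under the same assumption, using up the whole 99.99% annual downtime budget puts about $50,000 of volume at risk for this merchant, which is simply 0.01% of $500M.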

SLA terms must be read carefully: providers often define uptime based on their platform health page, excluding scheduled maintenance windows or incidents attributable to third parties (card networks, banks). Merchants should understand what is and is not counted in the SLA before relying on it.
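To see how much exclusions can change the reported number, the sketch below computes uptime for one 30-day month twice: once counting every incident, and once with maintenance and third-party incidents excluded. The formula (excluded minutes removed from both the outage total and the measurement period) is one common convention and is an assumption here; the incident data is invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Sequence


@dataclass(frozen=True)
class Incident:
    minutes: float
    cause: str  # illustrative labels: "platform", "maintenance", "third_party"


def uptime_percent(period_minutes: float, incidents: Sequence[Incident],
                   excluded_causes: Iterable[str] = ()) -> float:
    """Uptime % with excluded incidents removed from both outage time and the period."""
    excluded_set = set(excluded_causes)
    excluded = sum(i.minutes for i in incidents if i.cause in excluded_set)
    counted = sum(i.minutes for i in incidents if i.cause not in excluded_set)
    measured_period = period_minutes - excluded
    return 100 * (measured_period - counted) / measured_period


MONTH_MINUTES = 30 * 24 * 60
incidents = [
    Incident(4, "platform"),
    Incident(60, "maintenance"),
    Incident(20, "third_party"),
]

experienced = uptime_percent(MONTH_MINUTES, incidents)
contractual = uptime_percent(MONTH_MINUTES, incidents, {"maintenance", "third_party"})
print(f"experienced: {experienced:.3f}%  contractual: {contractual:.3f}%")
```

With these invented numbers the merchant experiences about 99.81% uptime while the provider still reports above 99.99%, which is why the exclusion list deserves as much attention as the headline percentage.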

Redundancy at the provider level does not eliminate merchant-side risk: if the merchant's checkout system, order management, or fraud tools have single points of failure, payment infrastructure uptime becomes irrelevant during those outages. End-to-end availability requires resilience across all components in the checkout flow.
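The effect of chaining components can be quantified. When a checkout depends on several components in sequence and their failures are independent (an assumption that real systems only approximate), end-to-end availability is the product of the individual availabilities. The component figures below are invented for illustration.

```python
from math import prod
from typing import Iterable


def serial_availability(availabilities: Iterable[float]) -> float:
    """End-to-end availability of components that must all be up, assuming independence."""
    return prod(availabilities)


checkout_flow = {
    "checkout system": 0.999,    # illustrative figures, expressed as fractions
    "fraud tools": 0.9995,
    "payment provider": 0.9999,
}

end_to_end = serial_availability(checkout_flow.values())
print(f"end-to-end availability: {end_to_end:.4%}")
```

The result is lower than the weakest component on its own, so a four-nines payment provider cannot lift a checkout flow above the availability of the merchant's own systems.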

Testing availability matters as much as the architecture: redundancy that has never been tested under realistic failure conditions may not perform as designed. Merchants should ask providers about their disaster recovery test cadence and failure simulation practices.

With PXP

PXP's cloud-native, microservices architecture is built for high availability, with automatic failover across acquirer connections at the routing layer. PXP provides real-time system status monitoring and SLA-backed uptime commitments for enterprise merchants, with failover handled without merchant-side intervention.

Frequently asked questions

What does 99.99% uptime mean in practice for payment processing?

99.99% uptime equates to approximately 52 minutes of allowable downtime per year, or about 4.4 minutes per month. For a payment processing system, this means all planned maintenance, unplanned failures, and recovery time must fit within that budget. Achieving this requires automated failure detection, rapid failover, and zero-downtime deployment practices.

How is high availability different from disaster recovery?

High availability focuses on preventing downtime through redundancy so failures are handled automatically without service interruption. Disaster recovery addresses how systems are restored after a major failure that exceeds the high availability design, such as a data centre loss or catastrophic infrastructure event. High availability minimises the frequency and duration of outages; disaster recovery defines the response when they occur anyway.

What should merchants ask payment providers about their availability architecture?

Key questions include what the uptime SLA is and how it is calculated; whether maintenance windows are excluded from uptime calculations; how geographic redundancy is architected; what the mean time to recovery is for different failure scenarios; how often disaster recovery is tested; and how merchants are notified during an incident.

How does geographic distribution improve payment availability?

Distributing payment infrastructure across multiple data centre regions means a regional network event, power outage, or data centre failure does not take all processing capacity offline. For global merchants, regional distribution also reduces authorisation latency by routing transactions to the nearest processing node. Many enterprise-grade payment providers operate across at least two geographic regions, but merchants should confirm this with each provider.

Related terms

Failover

Payment Switch

Multi-Acquirer Setup

Smart Routing

Settlement
