Zürcher Nachrichten - Clockwork Launches FleetIQ, the Software Layer That Recasts GPU Economics

Clockwork Launches FleetIQ, the Software Layer That Recasts GPU Economics

Uber accelerates incident detection, DCAI speeds up AI training and cluster efficiency, and Nebius improves MTBF in large-scale distributed AI training, all powered by Clockwork's first-of-its-kind Software-Driven Fabric.

PALO ALTO, CA / ACCESS Newswire / September 10, 2025 / Clockwork, the company redefining how enterprises run large-scale AI infrastructure, today announced FleetIQ, a first-of-its-kind Software-Driven Fabric (SDF) built to maximize GPU utilization, accelerate AI job performance, increase infrastructure reliability, and cut infrastructure waste. The launch marks a strategic expansion that extends Clockwork's cloud capabilities (sub-microsecond visibility and cluster performance acceleration) into the AI and GPU domain, while adding stateful fault tolerance to prevent AI job crashes and slowdowns.

By transforming idle silicon into productive intelligence, Clockwork's FleetIQ empowers enterprises, neoclouds, and hyperscalers to unlock greater performance from the same GPUs, delivering AI that is faster, more reliable, more energy-efficient, and more economically sustainable. This first-of-its-kind technology has attracted strong industry support. Led by existing investor NEA, Clockwork closed a new funding round at four times the valuation of its previous raise just two years ago. The round welcomed distinguished new backers, including Intel CEO Lip-Bu Tan, former Cisco CEO John Chambers, venture pioneer Carl Ledbetter, and e& Capital. Underscoring this momentum, the company appointed industry veteran Suresh Vasudevan as Chief Executive Officer and Joe Tarantino as Vice President of Worldwide Sales.

As AI moves from research to production, the bottleneck has shifted from raw compute to communication: between GPUs, across clusters, and across clouds. Most training jobs today run on NVIDIA or AMD GPUs with NVLink, InfiniBand, and RoCE networks, but the challenge is universal: large GPU fleets must stay perfectly synchronized, and if even one link lags, the entire job pauses. In practice, this leads to what's known as the "AI efficiency gap." Despite massive investment, real-world GPU clusters achieve only about 30-55% of their theoretical performance, with further losses from disruptive faults and link failures occurring multiple times a week in large-scale systems. At scale, the cost of this inefficiency is staggering. Training today's largest foundation models on a 100,000-GPU cluster (a $5-7 billion investment) can waste more than $2.25 billion in unused capacity. Closing this gap is now one of the defining challenges in scaling AI.
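The release does not say how the $2.25 billion figure was derived. As a rough illustration only, the sketch below assumes that the cost of unused capacity scales linearly with the share of theoretical performance a cluster fails to reach; the function name `wasted_capacity` and that linear model are assumptions, not Clockwork's method. Under that assumption, the low end of the investment range at the high end of the utilization range gives $5 billion x (1 - 0.55) = $2.25 billion.

```python
def wasted_capacity(investment_busd: float, utilization: float) -> float:
    """Return the part of an investment (in $ billions) tied up in unused capacity.

    Assumes, purely for illustration, that waste is proportional to the
    fraction of theoretical performance the cluster does not achieve.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return investment_busd * (1.0 - utilization)


# Ranges quoted in the release: a $5-7 billion cluster that reaches
# about 30-55% of its theoretical performance.
for investment in (5.0, 7.0):
    for utilization in (0.30, 0.55):
        waste = wasted_capacity(investment, utilization)
        print(f"${investment:.0f}B at {utilization:.0%} utilization: ${waste:.2f}B unused")
```

Every other combination of the quoted ranges yields a larger number, which fits the release's wording of "more than $2.25 billion".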

FleetIQ, Clockwork's Software-Driven Fabric, directly addresses this challenge. Clockwork was cofounded by Yilong Geng, Deepak Merugu, and their PhD supervisor, Professor Balaji Prabhakar; its foundational technology (highly accurate, software-based Global Clock Sync and Dynamic Traffic Control) was developed at Stanford. FleetIQ builds on this foundation and delivers microsecond-level visibility across fleets and workloads to rapidly pinpoint slowdowns and failures. It adds stateful fault tolerance that keeps jobs running when links fail, avoiding costly AI job restarts, and it boosts throughput with real-time, path-aware routing that eliminates contention and congestion. FleetIQ is hardware-agnostic, running across heterogeneous environments (NVIDIA, AMD, and custom accelerators; NCCL and RCCL; InfiniBand and Ethernet/RoCE), on-prem or in the cloud. The result: faster AI jobs and consistently high cluster utilization.

"AI has become the most distributed and demanding application in human history, and the next decade of AI infrastructure will belong to those who master communication between GPUs, between clusters, and clouds. Communication is the new Moore's Law: the defining constraint to overcome for scale. At Clockwork, we are pioneering a Software-Driven Fabric (SDF), an intelligent abstraction layer between workloads and infrastructure, that observes, predicts, and controls in real time, dynamically aligning application requirements and fabric behavior. This is not just a technical breakthrough. It enables organizations to achieve more with the same infrastructure. FleetIQ will make AI more economically viable for the decade ahead."

- Suresh Vasudevan, CEO, Clockwork

"As AI infrastructure scales to tens of thousands of GPUs for training and inference, the bottleneck has shifted from compute to communication. With accelerators running in lockstep, a single link flap, congestion spike or straggler can stall progress and crater utilization. The operational priority is utilizing real-time fabric visibility for faster fault isolation and recovery to keep workloads moving instead of looping through costly restarts. And as Mixture of Experts (MoE) models with high rank expert parallelism proliferate, the all-to-all exchange intensifies, raising the bar even higher for GPU communication efficiency."

- Dylan Patel, Founder, CEO, and Chief Analyst, SemiAnalysis

FleetIQ improves overall cluster efficiency by extending end-to-end visibility and control across both CPU-driven front-end and GPU-powered back-end networks. This enables teams to run training, inference and user-facing applications concurrently on the same cluster, improving economics, shortening time-to-market, and simplifying operations. FleetIQ works across Ethernet, InfiniBand, and RoCE, supports heterogeneous GPU environments, and requires no proprietary hardware.

Clockwork's impact is already being felt by enterprises like Uber:

"At Uber, we tackle real-time logistics problems where every millisecond matters: latency spikes don't just hurt customer experience, they directly impact driver retention and revenue. In our tests across a hybrid, multi-cloud environment, Clockwork delivered significant coverage and accuracy improvements over networking observability. Their unique innovation can greatly help Uber expedite the detection and fault-localization of networking issues: from hours to minutes, which will greatly improve service tail latency and prevent noisy neighbor impact.

We are in the process of rolling out Clockwork across Uber infrastructure, and look forward to experiencing their full capabilities at Uber's scale. Clockwork's software-driven fabric provides foundational observability for the hybrid, multi-cloud environment, helping us deliver what matters most: improved infrastructure utilization, enhanced resiliency, and ultimately, a better experience for the millions of people who rely on our platform every day."

- Albert Greenberg, Chief Architect Officer, Uber

A vendor-neutral fabric for neocloud and enterprise AI at scale

As countries invest in sovereign, sustainable AI infrastructure, FleetIQ provides a hardware-agnostic control layer that raises cluster reliability and availability, delivering faster service for providers and better experiences for end users, and making Clockwork a strategic design partner for neoclouds and enterprises.

"We have been working with Clockwork to evaluate their software-driven fabric on our AI infrastructure, and are seeing meaningful improvements in reliability. This is exactly what our customers need when running large-scale AI workloads where any disruption can be costly. We like how this approach works across different network configurations without requiring hardware lock-in. As we continue to scale our infrastructure, solutions that focus on the communication layer, which is often a bottleneck, are becoming increasingly important for delivering the performance and reliability that our customers expect."

- Danila Shtan, CTO, Nebius

"At Nscale, we are building the foundation for AI at planetary scale, making it faster, more efficient, and more resilient for the world's most ambitious organizations. To do that, we seek partners who share our vision for redefining what's possible. Clockwork's approach aligns perfectly with ours, and together we're creating an AI infrastructure that is not only powerful and reliable, but ready to support the most demanding innovations of the future."

- David Power, CTO, Nscale

"At WhiteFiber, Clockwork helps us deploy GPU clusters faster and with greater consistency. Their observability and rapid localization of fabric issues not only reduce deployment times but also validate the reliability of our infrastructure, ensuring clients' AI workloads run on clusters built for performance, resilience and scale."

- Tom Sanfillippo, CTO, WhiteFiber

Other European validation includes DCAI, operator of Gefion, Denmark's flagship AI supercomputer:

"Our mission at DCAI is to remove barriers to high-performance AI infrastructure, not only to serve researchers, startups, and enterprises today, but also to build the sovereign foundations of tomorrow's innovation economy. Gefion is a game-changing resource driving breakthroughs in quantum computing, drug discovery, advanced weather forecasting and beyond. To succeed, we must deliver resilience, reliability and efficiency at an unprecedented scale, with performance once reserved for hyperscalers. Partnering with Clockwork enables us to operate Gefion seamlessly and reliably, even as workloads and demands increase.

The result is a compute-efficient, fault-tolerant infrastructure that researchers and industries can trust, lowering costs, eliminating wasted GPU cycles, and helping us deliver a sovereign AI capability second to none."

- Dr. Nadia Carlsten, CEO, DCAI

Notable industry validation

"At Broadcom, our focus has always been on delivering Ethernet-centric infrastructure that scales AI with both performance and efficiency. Clockwork's software-driven fabric adds an essential layer of agility and observability that enhances the power of our silicon. With proactive fleet monitoring and seamless failover, Clockwork enables platforms such as our Tomahawk 6 and Jericho4 to realize their full potential in flexibility, uptime, and AI performance. Together, we're driving open, adaptable fabrics that allow enterprises to build AI infrastructure that is resilient, high-performing, and future-ready."

- Ram Velaga, Senior Vice President and General Manager, Core Switching Group, Broadcom

"MI350X series systems with ROCm software and Pollara NICs provide a strong foundation for performance and reliability in AI training and inference. As deployments expand, ecosystem innovation, such as Clockwork's software-driven approach, adds complementary capabilities that help ensure efficiency and consistency at scale."

- Vamsi Boppana, SVP, AI, AMD

Clockwork's launch of FleetIQ is paired with seasoned leadership to scale the new category. Suresh Vasudevan joins as CEO, a trusted technology leader with a track record of category creation and IPO-scale growth. He built Nimble Storage from $0 to $500M in revenue and an IPO, grew Sysdig from $5M to over $100M in ARR and into a category leader in container and cloud security, and served as Chief Product Officer at NetApp. "I am excited Suresh has joined us as CEO. He brings an exceptional combination of go-to-market leadership and product building experience required to scale entirely new categories, positioning Clockwork to enter its next phase of hypergrowth, delivering enterprise-grade, scalable infrastructure that meets the demands of the next generation of AI workloads," said Balaji Prabhakar.

The company has also appointed Joe Tarantino as Vice President of Worldwide Sales, underscoring Clockwork's strategic value. A proven sales leader instrumental in scaling Cohesity's growth, he most recently served at GMI Cloud, a top 10 NVIDIA neocloud partner. "While at GMI Cloud, I witnessed the explosive growth of enterprise AI and how quickly customers were consuming GPUs," said Tarantino. "Clockwork helps them accelerate those investments by optimizing performance and resilience, and by taking AI initiatives from prototype to production faster."

To learn more about Clockwork and the FleetIQ Platform, visit www.clockwork.io.

About Clockwork
Clockwork is the Software-Driven Fabric company for AI and high-performance workloads. Its FleetIQ intelligence layer addresses AI's core scaling bottleneck: communication between GPUs, clusters, and clouds. Positioned between workloads and infrastructure, FleetIQ observes, predicts, and controls in real time, maximizing GPU utilization and accelerating job completion. Delivered in pure software across heterogeneous networks, it transforms communication from the weakest link into a unified control plane that classifies flows by intent, steers around hot paths, and paces traffic to protect priority work, unlocking dramatically higher utilization. By turning idle silicon into productive intelligence, Clockwork enables enterprises, neoclouds, and hyperscalers to do more with the same GPUs, making AI faster, more reliable, more efficient, and more sustainable. Companies including Uber, Wells Fargo, DCAI, Nebius, Nscale, and WhiteFiber trust Clockwork to power their most demanding workloads. Learn more at www.clockwork.io.

Media Contact
Dana Trismen
[email protected]

SOURCE: Clockwork

O.Pereira--NZN