HostingJournalist
@hostingjournalist.bsky.social
HostingJournalist is your global industry news portal covering the worldwide business of cloud, hosting, and data centers. With HostingJournalist Insider, companies can self-publish press releases, blogs, events, and more.
Xen 4.21 Expands Performance, Security for Cloud and Automotive
The Xen Project has released Xen 4.21, marking one of the hypervisor’s most substantial modernization steps in recent years as it expands its role across cloud, data center, automotive, and emerging embedded workloads. The new release updates core toolchains, improves x86 performance efficiency, strengthens security on Arm-based platforms, and introduces early RISC-V enablement for future architectures. Hosted by the Linux Foundation, the open-source virtualization platform continues to evolve beyond its roots as a cloud hypervisor, aiming to serve as a unified foundation for compute environments ranging from hyperscale servers to safety-critical vehicle systems.

For cloud providers, data center operators, and virtualization vendors, Xen 4.21 brings measurable performance improvements. Enhancements to memory handling, cache management, and PCI capabilities on x86 promise higher VM density and improved performance per watt - an increasingly important metric as operators refine infrastructure for AI, GPU-accelerated workloads, and large-scale multitenant environments. The release introduces a new AMD Collaborative Processor Performance Control (CPPC) driver, allowing finer-grained CPU frequency scaling on AMD platforms. Combined with an updated page-index compression (PDX) algorithm and support for resizable BARs in PVH dom0, the update is designed to extract more capability from modern multi-core CPUs without demanding architectural rewrites from operators.

Xen’s role in the automotive and embedded sectors continues to expand as the industry shifts toward software-defined vehicles powered by heterogeneous SoCs. Xen 4.21 includes expanded support for Arm-based platforms with new security hardening, stack-protection mechanisms, MISRA-C compliance progress, and features designed to meet the stringent requirements of safety-certifiable systems. The release adds support for eSPI ranges on SoCs with GICv3.1+ and introduces advancements to dom0less virtualization - an architecture increasingly used in automotive deployments to isolate workloads such as infotainment, digital instrument clusters, and advanced driver-assistance systems. Demonstrations by AMD and Honda at Xen Summit 2025 showcased the hypervisor running on production-grade automotive hardware, signaling growing industry readiness.

RISC-V support also advances with the addition of UART and external interrupt handling in hypervisor mode. While full guest virtualization is still under development, this early work lays the groundwork for future RISC-V systems that may require secure workload isolation in edge, automotive, or custom compute environments.

Hypervisor Modernization

Cody Zuschlag, Community Manager for the Xen Project, said the 4.21 release reflects a broader modernization strategy. “We’re modernizing the hypervisor from the inside out: updating toolchains, expanding architecture support, and delivering the performance that next-generation hardware deserves. It’s exciting to see Xen powering everything from next-generation cloud servers to real-world automotive systems,” he said.

Toolchain updates represent one of the most significant architectural shifts in the release. Xen 4.21 raises minimum supported versions of GCC, Binutils, and Clang across all architectures - an essential but complex step that reduces technical debt and improves the platform’s long-term security and maintainability.

The update also formalizes support for qemu-xen device models inside Linux stubdomains, an approach favored by security-focused Linux distributions, including QubesOS. The Xen Project remains backed by a wide ecosystem of contributors from AMD, Arm, AWS, EPAM, Ford, Honda, Renesas, Vates, XenServer, and numerous independent maintainers.

Enterprise vendors leveraging Xen for commercial offerings welcomed the update. Citrix, for example, emphasized improvements that translate into better performance and reliability for users of XenServer. “Updates like the newly introduced page index compression algorithm and better memory cache attribute management translate into better performance and improved scalability for all our enterprise XenServer users,” said Jose Augustin, Product Management at Citrix.

Arm echoed the significance of the release for software-defined automotive and edge platforms. “Virtualization is becoming central to how automotive and edge systems deliver safety, performance, and flexibility,” said Andrew Wafaa, Senior Director of Software Communities at Arm. “By expanding support for Arm Cortex-R technology, the latest Xen 4.21 release will help advance more scalable, secure, and safety-critical deployments on Arm-based platforms.”

As cloud and AI workloads accelerate, and automotive manufacturers adopt virtualization for isolation and safety, Xen continues to position itself as a hypervisor built for the next generation of distributed compute environments. Xen 4.21 signals not only modernization, but a strategic expansion into industries where performance, resilience, and safety converge.

Executive Insights FAQ: The Xen 4.21 Release

How does Xen 4.21 improve performance for cloud and data center workloads?
The release enhances memory handling, cache efficiency, PCI performance, and CPU scaling - allowing operators to run more virtual machines with lower overhead and greater performance per watt on modern x86 hardware.

Why is the automotive sector interested in Xen?
Xen’s dom0less architecture, MPU progress, MISRA-C compliance work, and strong isolation capabilities align with automotive safety and reliability requirements for systems such as ADAS, dashboards, and infotainment.

What makes this release significant for Arm-based platforms?
Xen 4.21 adds stack protection, eSPI support, refined Kconfig options, and Cortex-R MPU progress - key elements for building safety-certifiable embedded and automotive deployments.

How far along is RISC-V support?
Xen 4.21 introduces early hypervisor-mode capabilities such as UART and external interrupt handling, laying the foundation for full guest support in future releases.

Why were toolchain upgrades emphasized in this release?
Modern compilers and build tools improve code quality, reduce vulnerabilities, and enable architectural features needed for next-generation hardware - ensuring Xen remains maintainable and secure for long-term industry use.
dlvr.it
November 21, 2025 at 8:25 AM
Palo Alto to Buy Chronosphere for $3.35B to Boost AI Observability
Palo Alto Networks is making one of its most aggressive moves yet in the race to build infrastructure for AI-driven enterprises. The cybersecurity giant announced a definitive agreement to acquire Chronosphere, a fast-growing observability platform engineered to handle the scale, latency, and resilience requirements of modern cloud and AI workloads. The $3.35 billion acquisition signals Palo Alto Networks’ intention to unify telemetry, AI automation, and security into a single data platform capable of supporting the next wave of hyperscale applications.

At the center of this deal is an industrywide shift: AI data centers and cloud-native environments now depend on uninterrupted uptime, deterministic performance, and the ability to detect and remediate failures instantly. Observability - once a domain of dashboards and log aggregation - has become mission-critical infrastructure. For Palo Alto Networks, Chronosphere represents the architectural foundation for this new reality.

Chronosphere’s platform was built for organizations operating at extreme scale, including two leading large language model providers. Its architecture emphasizes cost-optimized data ingestion, real-time visibility across massive cloud environments, and resilience under unpredictable workloads. Chronosphere has also gained industry validation, recently recognized as a Leader in the 2025 Gartner Magic Quadrant for Observability Platforms.

Palo Alto Networks Chairman and CEO Nikesh Arora said Chronosphere’s design aligns with the operational demands of AI-native companies. He emphasized that the acquisition will extend the reach of Palo Alto Networks’ AgentiX, its autonomous security and remediation framework. The combined offering is intended to rethink observability from passive alerting to active, AI-driven remediation. According to Arora, AI agents will be deployed across Chronosphere’s telemetry streams to detect anomalies, investigate root causes, and autonomously implement fixes - turning observability into a real-time automated control plane.

Chronosphere co-founder and CEO Martin Mao described the acquisition as the natural next chapter for the company’s mission to “provide scalable resiliency for the world’s largest digital organizations.” He framed Palo Alto Networks as the right strategic match to expand Chronosphere’s capabilities globally, while deepening integration between security data and operational telemetry. Both companies aim to build a consolidated data layer that can keep pace with the explosion of metrics, traces, logs, and events produced by AI-powered infrastructure.

Managing Observability Costs

Beyond automation, the acquisition reflects rising pressure on enterprises to manage observability costs. Cloud-native architectures generate telemetry at petabyte scale, creating unsustainable ingestion and retention expenses for many organizations. Chronosphere’s optimized pipeline and data transformation technology promise to reduce operational costs by routing, deduplicating, and prioritizing telemetry in ways traditional observability stacks cannot.

Chronosphere brings more than technology. The company reports annual recurring revenue above $160 million as of September 2025, with triple-digit year-over-year growth - an uncommon trajectory in the observability market, which has grown crowded and competitive. Palo Alto Networks expects the acquisition to close in the second half of its fiscal 2026, subject to regulatory approval.
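
The cost mechanics above come down to how much telemetry actually gets ingested and retained. The sketch below is a minimal, generic illustration of the routing, deduplication, and prioritization pattern described in the article; it is not Chronosphere’s pipeline, and the record fields, dedup key, and thresholds are assumptions made for the example.

```python
from collections import OrderedDict

# Illustrative only: a toy telemetry pipeline that drops duplicate events and
# routes the rest by an assumed "priority" label, so cheaper storage tiers can
# absorb low-value data. Field names and limits are invented for the sketch.

class DedupRouter:
    def __init__(self, max_seen: int = 10_000):
        self.seen = OrderedDict()                 # bounded window of recent event keys
        self.max_seen = max_seen
        self.kept = {"critical": [], "standard": []}
        self.dropped = 0

    def _key(self, record: dict) -> tuple:
        # Fields assumed to identify "the same" event for dedup purposes.
        return (record["service"], record["metric"], record["value_bucket"])

    def ingest(self, record: dict) -> None:
        key = self._key(record)
        if key in self.seen:
            self.dropped += 1                     # duplicate: do not pay to store it
            return
        self.seen[key] = True
        if len(self.seen) > self.max_seen:        # keep the dedup window bounded
            self.seen.popitem(last=False)
        tier = "critical" if record.get("priority") == "high" else "standard"
        self.kept[tier].append(record)

router = DedupRouter()
events = [
    {"service": "checkout", "metric": "latency_ms", "value_bucket": 250, "priority": "high"},
    {"service": "checkout", "metric": "latency_ms", "value_bucket": 250, "priority": "high"},  # duplicate
    {"service": "search", "metric": "qps", "value_bucket": 120, "priority": "low"},
]
for event in events:
    router.ingest(event)

print(len(router.kept["critical"]), len(router.kept["standard"]), router.dropped)  # 1 1 1
```

In a real deployment the interesting decisions sit in the key definition and the tiering policy, which is where platforms like Chronosphere claim to differentiate.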

The move positions Palo Alto Networks as an emerging heavyweight in observability, setting it up to compete more directly with Datadog, Dynatrace, and New Relic. But unlike its rivals, Palo Alto Networks aims to merge observability with active AI agents and security telemetry, betting that customers will increasingly prioritize unified control across performance, cost, and cyber risk.

For enterprises navigating the uncertainty of AI-era operations, the promise of a consolidated observability and remediation engine may prove compelling. As workloads become distributed across clouds, GPUs, edge devices, and emerging AI fabrics, the companies argue that the old model of isolated dashboards can no longer keep up with the volume or velocity of operational data. Instead, the future will require autonomous systems capable of interpreting telemetry and responding in real time - exactly the space Palo Alto Networks hopes to define through this acquisition.

Executive Insights FAQ: Palo Alto Networks + Chronosphere

What strategic gap does Chronosphere fill for Palo Alto Networks?
Chronosphere gives Palo Alto Networks a cloud-scale observability platform optimized for high-volume AI and cloud workloads, enabling unified security and performance visibility.

How will AgentiX integrate with Chronosphere’s platform?
AgentiX will use Chronosphere’s telemetry streams to deploy AI agents that detect issues, investigate root causes, and autonomously remediate failures across distributed environments.

Why is observability suddenly mission-critical for AI workloads?
AI data centers require continuous uptime and deterministic performance; observability becomes the real-time sensor layer that ensures reliability and cost-efficient scaling.

What financial impact does Chronosphere bring?
Chronosphere reports more than $160M in ARR with triple-digit annual growth, giving Palo Alto Networks a fast-expanding revenue engine in an increasingly competitive market.

How will customers benefit from the combined offering?
Enterprises would gain deeper visibility across security and observability data at petabyte scale, paired with automated remediation and significant cost reductions in telemetry ingestion.
dlvr.it
November 21, 2025 at 7:44 AM
IONOS Deploys Distributed High-Performance Network with VyOS
VyOS Networks is expanding its footprint in the enterprise cloud ecosystem as IONOS, one of Europe’s largest hosting and infrastructure providers, has completed a broad deployment of the VyOS open-source network operating system across its Bare Metal platform. The rollout marks a significant architectural shift for IONOS, replacing centralized, hardware-dependent networking models with a distributed, software-defined approach designed to support massive scale, improve resilience, and reduce operational costs.

The deployment reflects a growing trend among global cloud providers: leveraging open-source network operating systems to accelerate infrastructure modernization while avoiding vendor lock-in. For IONOS, the move to VyOS enables the company to scale to hundreds of nodes, orchestrate workloads more flexibly across its European data centers, and achieve high-performance throughput without the licensing costs associated with traditional proprietary systems.

According to IONOS, the shift was driven by a need to eliminate architectural bottlenecks and reduce the risk of outages tied to centralized network chokepoints. By distributing VyOS instances across its infrastructure, the company has built a fault-tolerant environment that maintains service continuity even when individual components fail. The redesign also positions IONOS to better support increasingly data-intensive customer workloads spanning bare metal compute, hybrid cloud deployments, and latency-sensitive applications.

“VyOS gave us the freedom to build a resilient, distributed network without sacrificing performance or control,” said Tomás Montero, Head of Hosting Network Services at IONOS. “We can scale to hundreds of nodes efficiently and securely.”

Performance metrics from the deployment indicate that VyOS is delivering high throughput at scale. Across IONOS clusters, aggregate speeds reach into the hundreds of gigabits per second. Individual clusters achieve peak throughput of 20 Gbps and sustain roughly 1.5 million packets per second. These figures position the open-source platform squarely within the performance range of commercial network operating systems traditionally relied upon by large cloud providers.

VyOS Networks emphasized that the collaboration highlights a broader industry shift in favor of open-source networking as a strategic foundation for next-generation infrastructure. “IONOS’s adoption of VyOS demonstrates how open-source networking solutions can rival and even outperform proprietary systems in scalability, reliability, and cost efficiency,” said Santiago Blanquet, Chief Revenue Officer at VyOS Networks. “This collaboration showcases how enterprises can leverage VyOS to build cloud-ready, high-throughput infrastructures that deliver exceptional performance and resilience.”

The move to VyOS has also yielded cost benefits for IONOS. The company reports significant savings tied to the elimination of traditional hardware and licensing expenditures. Instead of renewing contracts with established networking vendors, IONOS is investing in software-defined infrastructure that can scale horizontally and adapt to workload demands without requiring specialized hardware appliances.

Looking ahead, IONOS plans to deepen its integration with the VyOS ecosystem. The company is preparing to adopt Vector Packet Processing (VPP) in VyOS 1.5 to further push throughput and efficiency across its networking layer.
Additional enhancements planned for upcoming phases include expanded orchestration support and advanced load-balancing capabilities to optimize multi-tenant infrastructure performance. Taken together, these investments signal a long-term commitment to open-source networking as the backbone of IONOS’s infrastructure strategy.

VyOS Networks, which has spent more than a decade developing open-source routing, firewall, and VPN technologies, now occupies a growing role in enterprise infrastructure modernization initiatives. Its software is deployed across bare-metal environments, hyperscale clouds, and distributed edge systems, giving organizations a unified networking platform that can be automated and scaled across heterogeneous environments.

With competition in cloud infrastructure intensifying, the collaboration positions IONOS to offer customers more flexible, high-performance network services without the constraints of legacy architectures. For VyOS, it strengthens the company’s presence in the European infrastructure market and highlights the maturing role of open-source networking within mission-critical cloud platforms.

Executive Insights FAQ: What This News Means for Enterprise Networking

How does VyOS improve network scalability for cloud providers?
VyOS enables distributed deployment across hundreds of nodes, allowing cloud operators to scale network capacity horizontally without relying on centralized hardware.

What performance gains did IONOS achieve with VyOS?
Clusters reached peak throughput of 20 Gbps and about 1.5 million PPS, with aggregate speeds in the hundreds of Gbps across the environment.

How does VyOS reduce operational and financial risk?
The distributed design eliminates single points of failure and VyOS’s open-source model removes licensing fees, reducing both downtime risk and recurring cost.

Why is open-source networking gaining traction in hyperscale and cloud environments?
Enterprises want vendor independence, automation-friendly infrastructure, and cost efficiency - areas where open-source NOS platforms increasingly match or surpass proprietary options.

What comes next in the VyOS–IONOS collaboration?
IONOS plans to adopt VPP in VyOS 1.5, enhance orchestration, and expand load-balancing capabilities to further improve throughput and operational efficiency across its bare-metal platform.
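
To make the resilience argument concrete, the short calculation below compares a single network chokepoint with redundant, independently failing VyOS instances. The availability figures are assumptions chosen for illustration, not measurements from the IONOS deployment.

```python
# Illustrative only: availability of one central device vs. N redundant nodes,
# assuming independent failures and an assumed 99% per-node availability.

def redundant_availability(node_availability: float, nodes: int) -> float:
    """Service survives if at least one redundant node is up."""
    return 1.0 - (1.0 - node_availability) ** nodes

PER_NODE = 0.99                                   # assumed, not an IONOS figure
MINUTES_PER_YEAR = 365 * 24 * 60

for n in (1, 2, 3):
    a = redundant_availability(PER_NODE, n)
    downtime = (1.0 - a) * MINUTES_PER_YEAR
    print(f"{n} node(s): availability {a:.6f}, ~{downtime:,.0f} min expected downtime/yr")

# 1 node(s): availability 0.990000, ~5,256 min expected downtime/yr
# 2 node(s): availability 0.999900, ~53 min expected downtime/yr
# 3 node(s): availability 0.999999, ~1 min expected downtime/yr
```

Under these assumptions, each additional independent path cuts expected downtime by roughly two orders of magnitude, which is the arithmetic behind replacing centralized chokepoints with distributed instances.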
dlvr.it
November 20, 2025 at 6:42 PM
Cloudflare Outage Traced to Internal Error, Not Cyberattack
Cloudflare is detailing the root cause of a major global outage that disrupted traffic across a large portion of the Internet on November 18, 2025, marking the company’s most severe service incident since 2019. While early internal investigations briefly raised the possibility of a hyper-scale DDoS attack, Cloudflare cofounder and CEO Matthew Prince confirmed that the outage was entirely self-inflicted.

The Cloudflare disruption, which began at 11:20 UTC, produced spikes of HTTP 5xx errors for users attempting to access websites, APIs, security services, and applications running through Cloudflare’s network - an infrastructure layer relied upon by millions of organizations worldwide. Prince confirmed that the outage was caused by a misconfiguration in a database permissions update that triggered a cascading failure in the company’s Bot Management system and, in turn, caused Cloudflare’s core proxy layer to fail at scale.

The error originated from a ClickHouse database cluster that was in the process of receiving new, more granular permissions. A query designed to generate a ‘feature file’ - a configuration input for Cloudflare’s machine-learning-powered Bot Management classifier - began producing duplicate entries once the permissions change allowed the system to see more metadata than before. The file doubled in size, exceeded the memory pre-allocation limits in Cloudflare’s routing software, and triggered software panics across edge machines globally.

Those feature files are refreshed every five minutes and propagated to all Cloudflare servers worldwide. The intermittent nature of the database rollout meant that some nodes generated a valid file while others created a malformed one, causing the network to oscillate between functional and failing states before collapsing into a persistent failure mode.

The initial symptoms were misleading. Traffic spikes, noisy error logs, intermittent recoveries, and even a coincidental outage of Cloudflare’s independently hosted status page contributed to early suspicion that the company was under attack. Only after correlating file-generation timestamps with error propagation patterns did engineers isolate the issue to the Bot Management configuration file.

By 14:24 UTC, Cloudflare had frozen propagation of new feature files, manually inserted a known-good version into the distribution pipeline, and forced resets of its core proxy service - known internally as FL and FL2. Normal traffic flow began stabilizing around 14:30 UTC, with all downstream services recovering by 17:06 UTC.

The impact was widespread because the faulty configuration hit Cloudflare’s core proxy infrastructure, the traffic-processing layer responsible for TLS termination, request routing, caching, security enforcement, and API calls. When the Bot Management module failed, the proxy returned 5xx errors for all requests relying on that module. On the newer FL2 architecture, this manifested as widespread service errors; on the legacy FL system, Bot scores defaulted to zero, creating potential false positives for customers blocking bot traffic.
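
The failure mode described above - a machine-generated configuration file that silently doubles in size and blows past a fixed memory pre-allocation - is the kind of thing a pre-propagation sanity check can catch. The sketch below is a generic illustration of that idea, not Cloudflare’s actual tooling; the field names, the feature limit, and the error handling are assumptions made for the example.

```python
# Illustrative only: validate a generated "feature file" before propagating it,
# rejecting files with duplicate rows or more entries than the consumer has
# pre-allocated memory for - the two symptoms described in the post-mortem.

MAX_FEATURES = 200                       # assumed hard limit in the consuming proxy

class FeatureFileError(ValueError):
    pass

def validate_feature_file(features: list[dict]) -> list[dict]:
    names = [f["name"] for f in features]
    duplicates = {n for n in names if names.count(n) > 1}
    if duplicates:
        raise FeatureFileError(f"duplicate feature rows: {sorted(duplicates)[:5]}")
    if len(features) > MAX_FEATURES:
        raise FeatureFileError(
            f"{len(features)} features exceed the pre-allocated limit of {MAX_FEATURES}"
        )
    return features

# A permissions change that exposes extra metadata could emit every row twice:
good = [{"name": f"feat_{i}", "weight": 0.1} for i in range(100)]
bad = good + good                        # duplicated rows double the file size

validate_feature_file(good)              # passes: safe to propagate
try:
    validate_feature_file(bad)
except FeatureFileError as err:
    print("rejecting file, keeping last known-good version:", err)
```

The remediation steps Cloudflare describes - hardened validation of internally generated configuration files and falling back to a known-good version - apply the same principle at far larger scale.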

Multiple services either failed outright or degraded, including Turnstile (Cloudflare’s authentication challenge), Workers KV (the distributed key-value store underpinning many customer applications), Access (Cloudflare’s Zero Trust authentication layer), and portions of the company’s dashboard. Internal APIs slowed under heavy retry load as customers attempted to log in or refresh configurations during the disruption. Cloudflare emphasized that email security, DDoS mitigation, and core network connectivity remained operational, although spam-detection accuracy temporarily declined due to the loss of an IP reputation data source.

Prince acknowledged the magnitude of the disruption, noting that Cloudflare’s architecture is intentionally built for fault tolerance and rapid mitigation, and that a failure blocking core proxy traffic is deeply painful to the company’s engineering and operations teams. The outage, he said, violated Cloudflare’s commitment to keeping the Internet reliably accessible for organizations that depend on the company’s global network.

Cloudflare has already begun implementing systemic safeguards. These include hardened validation of internally generated configuration files, global kill switches for key features, more resilient error-handling across proxy modules, and mechanisms to prevent debugging systems or core dumps from consuming excessive CPU or memory during high-failure events.

The full incident timeline reflects a multi-hour race to diagnose symptoms, isolate root causes, contain cascading failures, and bring the network back online. Automated detection triggered alerts within minutes of the first malformed file reaching production, but fluctuating system states and misleading external indicators complicated root-cause analysis. Cloudflare teams deployed incremental mitigations - including bypassing Workers KV’s reliance on the proxy - while working to identify and replace the corrupted feature files. By the time a fix reached all global data centers, Cloudflare’s network had stabilized, customer services were back online, and downstream errors were cleared.

As AI-driven automation and high-frequency configuration pipelines become fundamental to global cloud networks, the Cloudflare outage underscores how a single flawed assumption - in this case, about metadata visibility in ClickHouse queries - can ripple through distributed systems at Internet scale. The incident serves as a high-profile reminder that resilience engineering, configuration hygiene, and robust rollback mechanisms remain mission-critical in an era where edge networks process trillions of requests daily.

Executive Insights FAQ: Understanding the Cloudflare Outage

What triggered the outage in Cloudflare’s global network?
A database permissions update caused a ClickHouse query to return duplicate metadata, generating a Bot Management feature file twice its expected size. This exceeded memory limits in Cloudflare’s proxy software, causing widespread failures.

Why did Cloudflare initially suspect a DDoS attack?
Systems showed traffic spikes, intermittent recoveries, and even Cloudflare’s external status page went down by coincidence - all patterns resembling a coordinated attack, contributing to early misdiagnosis.

Which services were most affected during the disruption?
Core CDN services, Workers KV, Access, and Turnstile all experienced failures or degraded performance because they depend on the same core proxy layer that ingests the Bot Management configuration.

Why did the issue propagate so quickly across Cloudflare’s global infrastructure?
The feature file responsible for the crash is refreshed every five minutes and distributed to all Cloudflare servers worldwide. Once malformed versions began replicating, the failure rapidly cascaded across regions.

What long-term changes is Cloudflare making to prevent future incidents?
The company is hardening configuration ingestion, adding global kill switches, improving proxy error handling, limiting the impact of debugging systems, and reviewing failure modes across all core traffic-processing modules.
dlvr.it
November 20, 2025 at 6:42 PM
VAST Data, Microsoft Unite to Deliver High-Scale Agentic AI on Azure
VAST Data and Microsoft are deepening their alignment around next-generation AI infrastructure, announcing a new collaboration that will bring the VAST Data AI Operating System (AI OS) natively to Microsoft Azure. Unveiled at Microsoft Ignite, the partnership positions VAST Data as a strategic technology layer supporting what both companies describe as the coming wave of agentic AI. These AI systems, composed of autonomous, continuously reasoning software agents, operate on massive, real-time datasets.

For Azure customers, the integration means they will be able to deploy VAST’s full data platform directly within the Microsoft cloud, using the same governance, security, operational tooling, and billing frameworks that define Azure-native services. The VAST AI OS, long known in enterprise AI circles for its performance-oriented architecture and unified data model, will now be available as a cloud service, simplifying deployment for organizations scaling AI workloads across on-premises, hybrid, and multi-cloud environments.

The partnership gives enterprises access to VAST’s unified storage, data cataloging, and database services, designed to support increasingly complex AI pipelines that incorporate vector search, retrieval-augmented generation (RAG), model training, inference, and real-time agentic processing. VAST’s architecture will run on Azure infrastructure, including the new Laos VM Series and Azure Boost accelerated networking, which are optimized for high-bandwidth AI workloads.

Jeff Denworth, co-founder of VAST Data, described the partnership as an inflection point for enterprise AI deployment. “Performance, scale, and simplicity are converging,” he said. “Azure customers will be able to unify their data and AI pipelines across environments with the same power, simplicity, and performance they expect from VAST - now combined with the elasticity and geographic reach of Microsoft’s cloud.”

Microsoft, for its part, sees the integration as a way to streamline the data and storage foundations required for the fast-growing segment of AI model builders working within Azure. “Many of the world’s leading AI developers leverage VAST for its scalability and breakthrough performance,” said Aung Oo, Vice President of Azure Storage. “Running VAST’s AI OS on Azure will help customers accelerate time-to-insight while reducing operational and cost barriers.”

At the center of the offering is a platform designed for agentic AI. VAST’s InsightEngine provides stateless compute and database services optimized for vector search, RAG pipelines, and high-performance data preparation. Its companion AgentEngine coordinates autonomous AI agents working across distributed environments, enabling continuous reasoning over data streams without requiring multi-step orchestration frameworks.

Azure CPU and GPU Clusters

From an infrastructure perspective, the VAST AI OS is engineered to maximize utilization of Azure CPU and GPU clusters. The platform integrates intelligent caching, metadata-aware I/O, and high-throughput data services to ensure predictable performance across training, fine-tuning, and inference cycles. This aligns with Microsoft’s broader strategy of building vertically integrated AI infrastructure - one that increasingly includes custom silicon investments.

A key differentiator of the VAST approach is its exabyte-scale DataSpace, which creates a unified global namespace across on-prem, co-lo, and cloud environments.
The model gives enterprises the ability to burst GPU-intensive workloads into Azure without redesigning pipelines or migrating data - a capability that has traditionally slowed hybrid AI adoption.

VAST Data’s disaggregated, shared-everything (DASE) architecture extends into Azure as well, allowing compute and storage resources to scale independently. With built-in Similarity Reduction technology reducing the storage footprint of large AI datasets, the combined platforms aim to give customers both elasticity and cost containment - critical factors as model development increasingly demands multi-region, multi-petabyte environments.

The collaboration arrives as AI infrastructure requirements evolve rapidly. Autonomous agents, context-rich retrieval systems, and continuous-learning workflows require consistent performance across heterogeneous environments - something neither legacy storage architectures nor siloed cloud services were built to handle. By positioning VAST as a unified data substrate for Azure-based AI, Microsoft is betting on an architecture that can bridge those gaps at cloud scale.

Both companies say they will co-engineer future capabilities as Microsoft advances its next-generation compute programs. The long-term goal, they emphasize, is to ensure that regardless of model architecture or processor design, the underlying data layer can support AI workloads with predictability and scale.

Executive Insights FAQ

What does this partnership enable for Azure customers?
Azure users will be able to deploy the VAST AI Operating System natively in the cloud, giving them unified data services, high-performance storage, and AI-optimized compute pipelines without managing separate infrastructure.

How does the VAST AI OS support agentic AI?
VAST’s InsightEngine and AgentEngine allow organizations to run autonomous AI agents and stateful reasoning systems directly on real-time data streams, enabling continuous decision-making across hybrid and multi-cloud environments.

What advantages does the integration bring for AI model builders?
The platform keeps Azure GPU clusters fully utilized through high-throughput data services, intelligent caching, and metadata-optimized I/O - ensuring predictable performance for training, fine-tuning, and inference at scale.

How does VAST improve hybrid AI workflows?
Its global DataSpace functions as a unified namespace, allowing organizations to burst workloads into Azure without data migration or pipeline redesign, enabling seamless hybrid and multi-cloud operations.

How will the collaboration evolve as Microsoft introduces new AI hardware?
VAST Data and Microsoft will co-engineer future platform requirements so that emerging Azure infrastructure - including custom silicon initiatives - remains fully compatible with VAST’s AI OS, ensuring long-term scalability and performance.
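
To ground the terminology used above, the snippet below shows the bare-bones shape of the retrieval step in a RAG pipeline of the kind InsightEngine is described as accelerating: embed a query, rank stored vectors by cosine similarity, and hand the top matches to a model as context. It is a generic illustration with stand-in vectors, not VAST’s or Azure’s API.

```python
import numpy as np

# Illustrative only: the core retrieval step of a RAG pipeline. Real systems use
# learned embeddings and an indexed vector store; here the "embeddings" are
# random stand-ins so the example stays self-contained.

rng = np.random.default_rng(0)
documents = ["quarterly revenue report", "GPU cluster runbook", "HR onboarding guide"]
doc_vectors = rng.normal(size=(len(documents), 8))       # assumed 8-dim embeddings

def cosine_top_k(query_vec: np.ndarray, matrix: np.ndarray, k: int = 2) -> np.ndarray:
    sims = matrix @ query_vec / (
        np.linalg.norm(matrix, axis=1) * np.linalg.norm(query_vec)
    )
    return np.argsort(sims)[::-1][:k]                    # indices of the best matches

query_vec = doc_vectors[1] + 0.05 * rng.normal(size=8)   # a query "near" the runbook
for idx in cosine_top_k(query_vec, doc_vectors):
    print("retrieved context:", documents[idx])

# The retrieved passages would then be prepended to the prompt sent to the model;
# running this lookup at production scale is the workload the article says the
# unified storage and database layer is optimized for.
```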
dlvr.it
November 20, 2025 at 4:40 PM
AI Demand Reshapes Global Data Center Costs, Says New Report
Global data center construction is entering a period of rapid divergence as AI demand reshapes cost structures, power availability, and infrastructure strategy worldwide, according to a new analysis from Turner & Townsend, a multinational professional services company headquartered in Leeds, United Kingdom. As one of the world’s largest construction consultancies, Turner & Townsend provides program management, cost management, and infrastructure advisory services across property, transportation, and natural resources - and its newly released Data Centre Construction Cost Index 2025 signals a turning point for the industry.

The report reveals a market in which traditional cloud facilities are stabilizing in cost, while next-generation AI data centers are breaking away with higher complexity, greater density requirements, and significantly elevated capital costs. With AI adoption accelerating faster than grid infrastructure and supply chains can adapt, Turner & Townsend warns that markets are entering a phase where delays, price divergence, and regional disparities will shape global competitiveness.

According to the firm’s analysis, construction costs for conventional air-cooled cloud data centers are projected to rise 5.5 percent year-over-year in 2025, a notable cooling from the 9 percent increase reported in the previous year. This moderation reflects broader stabilization in construction markets globally, along with maturing supply chains in newer data center regions. Turner & Townsend’s Global Construction Market Intelligence Report for 2025 shows only 4.2 percent average inflation across the entire construction sector - an indicator that the data center ecosystem is settling after a period of extreme volatility.

Widening Data Center Construction Cost Gap

But the real story lies in the widening gap between traditional builds and facilities designed specifically for AI workloads. Turner & Townsend’s benchmarking - supported by data from its global Hive intelligence platform - shows that liquid-cooled, high-density AI data centers in the United States carry a 7–10 percent construction cost premium over similarly sized air-cooled facilities. These projects not only require more complex mechanical systems but demand higher electrical capacity, more sophisticated rack design, and greater engineering coordination to support the thermal needs of GPU clusters used for training and inference.

In these next-generation environments, mechanical systems comprise 33 percent of total build costs, compared with 22 percent for air-cooled designs, highlighting the shift from traditional fan-based cooling to immersion, rear-door heat exchange, and direct-to-chip liquid cooling technologies. Electrical systems remain the single largest cost driver - accounting for roughly half of total spend - reflecting the industry’s dramatic escalation in power density per rack.

Industry survey data included in the report suggests that these pressures are being felt acutely. Nearly half of respondents (47 percent) said they experienced bid or tender price increases between 6 and 15 percent in the past year, and 21 percent reported increases above 15 percent. Looking ahead, 60 percent expect further construction cost escalation of 5 to 15 percent in 2026.

The geography of cost pressure remains uneven. The world’s most expensive markets for data center construction are unchanged: Tokyo (US$15.2 per watt), Singapore (US$14.5), and Zurich (US$14.2), ranking as the top three globally.
In these highly constrained regions, land scarcity, labor dynamics, and specialized contractor availability are driving persistently high pricing. Tokyo’s dominance is reinforced by the addition of Osaka to the index, signaling Japan’s emergence as a multi-hub data center market.

In Europe, markets such as Paris and Amsterdam have climbed significantly due to maturing supply chains and currency effects from a softer U.S. dollar. Both now sit at US$10.8 per watt, comparable to Portland’s pricing in the United States. Meanwhile, Madrid and Dublin have surpassed major U.S. hubs including Atlanta and Phoenix, reflecting rapidly rising demand in Europe’s expanding cloud and AI ecosystem.

Power Availability

In the United States, a major shift is underway as long-standing power constraints in Northern Virginia push developers southward. Charlotte, North Carolina, newly added to the index at US$9.5 per watt, is experiencing a surge in hyperscale and colocation development. Favorable tax incentives, grid accessibility, and lower electricity prices have drawn new investments from Digital Realty, Microsoft, QTS, Compass, and Apple. Turner & Townsend notes that this marks an inflection point: power strategy is becoming the determining factor for where the next generation of AI centers will be built.

Power availability is now the single greatest barrier to delivery: 48 percent of survey respondents identified power constraints - especially long grid connection timelines - as the primary cause of project delays. Across the U.S., UK, and Europe, utilities face competing demands from housing, manufacturing, and renewable energy deployment, forcing grid operators to prioritize connections. While governments are attempting to modernize planning rules and connection processes, progress remains slow.

In response, Turner & Townsend stresses that clients will increasingly need to consider alternative or supplemental power strategies, including on-site renewable generation, battery energy storage, or grid-independent solutions. Yet only 14 percent of survey respondents have explored such approaches. As AI workloads become dominant, the consultancy warns that dependence on traditional grid connections will present an unsustainable bottleneck.

Water use is emerging as a second major concern. Although many liquid-cooling systems operate in closed-loop designs, public scrutiny and local environmental policies are tightening. Regions facing water scarcity may restrict certain cooling configurations, pushing operators toward more efficient thermal designs that minimize environmental impact and accelerate planning approvals.

Despite these headwinds, the data center sector remains highly optimistic: 75 percent of survey respondents are already involved in AI data center projects, and 47 percent expect AI workloads to represent more than half of total demand within the next two years. The industry has seen rack power density rise by 100x in the past decade, and Turner & Townsend argues that this momentum reflects only the earliest stage of an AI-driven infrastructure revolution.

Executive Insights FAQ: AI Data Center Economics

What is driving the cost premium for AI-optimized data centers?
Higher power density, liquid cooling integration, and advanced electrical and mechanical systems push AI data center construction costs 7–10% above traditional designs.

Why is power availability the biggest factor affecting project timelines?
Grid connection queues and regional power shortages are delaying builds more than any other factor, forcing developers to seek alternative energy models or new markets.

How is liquid cooling changing facility design and cost allocation?
Mechanical system costs rise significantly, environmental considerations become more complex, and operators must integrate new thermal strategies to support GPU racks.

Why are regional cost disparities narrowing across Europe and the U.S.?
Maturing supply chains and currency shifts are balancing costs, while demand in secondary markets is rising due to power constraints in traditional hubs.

What strategic steps should operators take to avoid future delays?
Early procurement, diversified supplier networks, and exploration of on-site or hybrid power models are increasingly essential for AI-driven deployments.
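
As a rough back-of-the-envelope illustration of the figures above, the calculation below applies the reported per-watt construction costs and the 7–10 percent liquid-cooling premium (reported for U.S. builds) to a hypothetical facility. The 100 MW of IT capacity is an assumption for the example, not a figure from the Turner & Townsend report.

```python
# Illustrative only: per-watt construction costs from the index applied to an
# assumed 100 MW of IT capacity, with the reported 7-10% AI (liquid-cooled)
# premium layered onto a U.S. market.

IT_CAPACITY_W = 100_000_000                        # 100 MW, assumed

def build_cost(usd_per_watt: float, premium: float = 0.0) -> float:
    return usd_per_watt * IT_CAPACITY_W * (1.0 + premium)

tokyo_air = build_cost(15.2)                       # most expensive indexed market
charlotte_air = build_cost(9.5)                    # newly indexed U.S. market
charlotte_ai_low = build_cost(9.5, premium=0.07)   # low end of the AI premium
charlotte_ai_high = build_cost(9.5, premium=0.10)  # high end of the AI premium

print(f"Tokyo, air-cooled:       ${tokyo_air / 1e9:.2f}B")
print(f"Charlotte, air-cooled:   ${charlotte_air / 1e9:.2f}B")
print(f"Charlotte, AI-optimized: ${charlotte_ai_low / 1e9:.2f}B to ${charlotte_ai_high / 1e9:.2f}B")

# Tokyo, air-cooled:       $1.52B
# Charlotte, air-cooled:   $0.95B
# Charlotte, AI-optimized: $1.02B to $1.05B
```

The gap between Charlotte and Tokyo alone is several times larger than the AI premium, which is why the report treats siting and power strategy as first-order cost decisions.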
dlvr.it
November 20, 2025 at 4:13 PM
Delta and Siemens Partner on Modular Power Systems for AI Data Centers
Delta and Siemens Smart Infrastructure are deepening their collaboration in the global data center market with a new partnership aimed at accelerating the deployment of high-density, power-intensive AI and cloud workloads. The companies have formalized a global agreement to co-deliver prefabricated, modular power systems that promise faster buildouts, lower capital expenditure, and improved reliability for hyperscale and colocation operators.

The partnership combines Siemens’ expertise in electrical power distribution and engineering services with Delta’s high-efficiency UPS systems, battery technologies, and advanced cooling solutions. The joint offering centers on fully integrated, containerized power modules - SKIDs and eHouses - that can be prefabricated and tested off-site before being shipped to data center locations around the world. This approach is designed to bypass many of the structural delays associated with traditional construction and commissioning processes.

Delta executives emphasized that the agreement reflects the company’s long-term strategy of forming dynamic alliances to meet the rapidly intensifying demands of AI infrastructure. Victor Cheng, CEO of Delta Electronics (Thailand), said the choice to sign the agreement in Thailand highlights the global importance of the firm’s manufacturing and system integration hubs. These facilities, he noted, will play a key role in enabling localized delivery and ensuring predictable project timelines as operators race to deploy new capacity.

Siemens executives underscored that the combined offering could shorten data center deployment schedules by as much as 50 percent while reducing construction and integration risks. Stephan May, CEO of Electrification & Automation at Siemens Smart Infrastructure, said the prefabricated, customizable design introduces a level of predictability and efficiency that is increasingly critical in markets where data center construction pipelines must scale rapidly to meet AI-driven demand. He added that energy efficiency and sustainability are central to the approach, aligning with the long-term goals of major cloud and hyperscale operators.

Delta’s EVP of Global Business Operations, Jimmy Yiin, framed the partnership as a natural extension of the company’s mission to deliver energy-efficient infrastructure “from grid to chip.” Yiin said Delta’s UPS systems, batteries, and thermal technologies are engineered for high-density environments characteristic of AI workloads, where power integrity, redundancy, and thermal design are essential factors in operational cost and uptime. Working with Siemens, he added, allows Delta to deploy these technologies more broadly through a globally coordinated supply chain across APAC and EMEA.

The prefabricated systems at the center of the collaboration allow operators to install containerized power assets that optimize space usage, reduce concrete requirements, and lower both carbon emissions and capital expenses. According to the companies, the design can reduce CAPEX by up to 20 percent and cut embodied carbon by up to 27 percent. For operators navigating grid interconnection backlogs, skilled labor shortages, and rising sustainability pressures, the modular approach is increasingly seen as a strategic alternative to traditional brick-and-mortar construction.

Both companies highlight that the success of next-generation data center deployments will depend on ecosystem-level cooperation.
Siemens and Delta describe their partnership as part of a broader effort to connect key players across the value chain in ways that support interoperability, accelerate innovation, and help the industry address growing challenges tied to power constraints, AI workload density, and global sustainability targets.

Executive Insights FAQ: Prefabricated Power Systems for AI Data Centers

Why are modular, prefabricated power systems gaining traction in AI-focused data centers?
Rapid deployment, reduced construction risk, and predictable integration timelines can make modular systems appealing as operators race to add AI-ready capacity.

How can prefabricated power modules reduce CAPEX and carbon emissions?
Factory-built systems reduce on-site labor, shorten project schedules, and require less concrete and physical footprint - cutting CAPEX by up to 20% and carbon by 27%.

Why are Siemens and Delta partnering on prefabricated power systems?
The agreement merges Siemens’ expertise in electrical distribution with Delta’s strengths in UPS, batteries, and cooling, creating a unified solution optimized for high-density AI loads.

How can SKIDs and eHouses improve reliability for hyperscale operators?
Pre-testing and integrated design reduce failure points, improve commissioning certainty, and help operators deploy consistent, repeatable infrastructure at scale.

What challenges in AI infrastructure do these modular systems address?
They mitigate grid-connection delays, accelerate capacity expansion, and enable operators to meet the extreme power and cooling demands of modern AI clusters.
dlvr.it
November 20, 2025 at 3:59 PM
Xen 4.21 Expands Performance, Security for Cloud and Automotive
The Xen Project has released Xen 4.21, marking one of the hypervisor’s most substantial modernization steps in recent years as it expands its role across cloud, data center, automotive, and emerging embedded workloads. The new release updates core toolchains, improves x86 performance efficiency, strengthens security on Arm-based platforms, and introduces early RISC-V enablement for future architectures. Hosted by the Linux Foundation, the open-source virtualization platform continues to evolve beyond its roots as a cloud hypervisor, aiming to serve as a unified foundation for compute environments ranging from hyperscale servers to safety-critical vehicle systems. For cloud providers, data center operators, and virtualization vendors, Xen 4.21 brings measurable performance improvements. Enhancements to memory handling, cache management, and PCI capabilities on x86 promise higher VM density and improved performance per watt - an increasingly important metric as operators refine infrastructure for AI, GPU-accelerated workloads, and large-scale multitenant environments. The release introduces a new AMD Collaborative Processor Performance Control (CPPC) driver, allowing finer-grained CPU frequency scaling on AMD platforms. Combined with an updated page-index compression (PDX) algorithm and support for resizable BARs in PVH dom0, the update is designed to extract more capability from modern multi-core CPUs without demanding architectural rewrites from operators. Xen’s role in the automotive and embedded sectors continues to expand as the industry shifts toward software-defined vehicles powered by heterogeneous SoCs. Xen 4.21 includes expanded support for Arm-based platforms with new security hardening, stack-protection mechanisms, MISRA-C compliance progress, and features designed to meet the stringent requirements of safety-certifiable systems. The release adds support for eSPI ranges on SoCs with GICv3.1+ and introduces advancements to dom0less virtualization - an architecture increasingly used in automotive deployments to isolate workloads such as infotainment, digital instrument clusters, and advanced driver-assistance systems. Demonstrations by AMD and Honda at Xen Summit 2025 showcased the hypervisor running on production-grade automotive hardware, signaling growing industry readiness. RISC-V support also advances with the addition of UART and external interrupt handling in hypervisor mode. While full guest virtualization is still under development, this early work lays the groundwork for future RISC-V systems that may require secure workload isolation in edge, automotive, or custom compute environments. Hypervisor Modernization Cody Zuschlag, Community Manager for the Xen Project, said the 4.21 release reflects a broader modernization strategy. “We’re modernizing the hypervisor from the inside out: updating toolchains, expanding architecture support, and delivering the performance that next-generation hardware deserves. It’s exciting to see Xen powering everything from next-generation cloud servers to real-world automotive systems,” he said. Toolchain updates represent one of the most significant architectural shifts in the release. Xen 4.21 raises minimum supported versions of GCC, Binutils, and Clang across all architectures - an essential but complex step that reduces technical debt and improves the platform’s long-term security and maintainability. 
The update also formalizes support for qemu-xen device models inside Linux stubdomains, an approach favored by security-focused Linux distributions, including QubesOS. The Xen Project remains backed by a wide ecosystem of contributors from AMD, Arm, AWS, EPAM, Ford, Honda, Renesas, Vates, XenServer, and numerous independent maintainers. Enterprise vendors leveraging Xen for commercial offerings welcomed the update. Citrix, for example, emphasized improvements that translate into better performance and reliability for users of XenServer. “Updates like the newly introduced page index compression algorithm and better memory cache attribute management translate into better performance and improved scalability for all our enterprise XenServer users,” said Jose Augustin, Product Management at Citrix. Arm echoed the significance of the release for software-defined automotive and edge platforms. “Virtualization is becoming central to how automotive and edge systems deliver safety, performance, and flexibility,” said Andrew Wafaa, Senior Director of Software Communities at Arm. “By expanding support for Arm Cortex-R technology, the latest Xen 4.21 release will help advance more scalable, secure, and safety-critical deployments on Arm-based platforms.” As cloud and AI workloads accelerate, and automotive manufacturers adopt virtualization for isolation and safety, Xen continues to position itself as a hypervisor built for the next generation of distributed compute environments. Xen 4.21 signals not only modernization, but a strategic expansion into industries where performance, resilience, and safety converge. Executive Insights FAQ: The Xen 4.21 Release How does Xen 4.21 improve performance for cloud and data center workloads? The release enhances memory handling, cache efficiency, PCI performance, and CPU scaling - allowing operators to run more virtual machines with lower overhead and greater performance per watt on modern x86 hardware. Why is the automotive sector interested in Xen? Xen’s dom0less architecture, MPU progress, MISRA-C compliance work, and strong isolation capabilities align with automotive safety and reliability requirements for systems such as ADAS, dashboards, and infotainment. What makes this release significant for Arm-based platforms? Xen 4.21 adds stack protection, eSPI support, refined Kconfig options, and Cortex-R MPU progress - key elements for building safety-certifiable embedded and automotive deployments. How far along is RISC-V support? Xen 4.21 introduces early hypervisor-mode capabilities such as UART and external interrupt handling, laying the foundation for full guest support in future releases. Why were toolchain upgrades emphasized in this release? Modern compilers and build tools improve code quality, reduce vulnerabilities, and enable architectural features needed for next-generation hardware - ensuring Xen remains maintainable and secure for long-term industry use.
dlvr.it
November 20, 2025 at 8:24 AM
Palo Alto to Buy Chronosphere for $3.35B to Boost AI Observability
Palo Alto Networks is making one of its most aggressive moves yet in the race to build infrastructure for AI-driven enterprises. The cybersecurity giant announced a definitive agreement to acquire Chronosphere, a fast-growing observability platform engineered to handle the scale, latency, and resilience requirements of modern cloud and AI workloads. The $3.35 billion acquisition signals Palo Alto Networks’ intention to unify telemetry, AI automation, and security into a single data platform capable of supporting the next wave of hyperscale applications. At the center of this deal is an industrywide shift: AI data centers and cloud-native environments now depend on uninterrupted uptime, deterministic performance, and the ability to detect and remediate failures instantly. Observability - once a domain of dashboards and log aggregation - has become mission-critical infrastructure. For Palo Alto Networks, Chronosphere represents the architectural foundation for this new reality. Chronosphere’s platform was built for organizations operating at extreme scale, including two leading large language model providers. Its architecture emphasizes cost-optimized data ingestion, real-time visibility across massive cloud environments, and resilience under unpredictable workloads. Chronosphere has also gained industry validation, recently recognized as a Leader in the 2025 Gartner Magic Quadrant for Observability Platforms. Palo Alto Networks Chairman and CEO Nikesh Arora said Chronosphere’s design aligns with the operational demands of AI-native companies. He emphasized that the acquisition will extend the reach of Palo Alto Networks’ AgentiX, its autonomous security and remediation framework. The combined offering is intended to rethink observability from passive alerting to active, AI-driven remediation. According to Arora, AI agents will be deployed across Chronosphere’s telemetry streams to detect anomalies, investigate root causes, and autonomously implement fixes - turning observability into a real-time automated control plane. Chronosphere co-founder and CEO Martin Mao described the acquisition as the natural next chapter for the company’s mission to “provide scalable resiliency for the world’s largest digital organizations.” He framed Palo Alto Networks as the right strategic match to expand Chronosphere’s capabilities globally, while deepening integration between security data and operational telemetry. Both companies aim to build a consolidated data layer that can keep pace with the explosion of metrics, traces, logs, and events produced by AI-powered infrastructure. Managing Observability Costs Beyond automation, the acquisition reflects rising pressure on enterprises to manage observability costs. Cloud-native architectures generate telemetry at petabyte scale, creating unsustainable ingestion and retention expenses for many organizations. Chronosphere’s optimized pipeline and data transformation technology promise to reduce operational costs by routing, deduplicating, and prioritizing telemetry in ways traditional observability stacks cannot. Chronosphere would bring more than technology. The company reports annual recurring revenue above $160 million as of September 2025, with triple-digit year-over-year growth - an uncommon trajectory in the observability market, which has grown crowded and competitive. Palo Alto Networks expects the acquisition to close in the second half of its fiscal 2026, subject to regulatory approval. 
The move positions Palo Alto Networks as an emerging heavyweight in observability, setting it up to compete more directly with Datadog, Dynatrace, and New Relic. But unlike its rivals, Palo Alto Networks aims to merge observability with active AI agents and security telemetry, betting that customers will increasingly prioritize unified control across performance, cost, and cyber risk. For enterprises navigating the uncertainty of AI-era operations, the promise of a consolidated observability and remediation engine may prove compelling. As workloads become distributed across clouds, GPUs, edge devices, and emerging AI fabrics, the companies argue that the old model of isolated dashboards can no longer keep up with the volume or velocity of operational data. Instead, the future will require autonomous systems capable of interpreting telemetry and responding in real time - exactly the space Palo Alto Networks hopes to define through this acquisition. Executive Insights FAQ: Palo Alto Networks + Chronosphere What strategic gap does Chronosphere fill for Palo Alto Networks? Chronosphere gives Palo Alto Networks a cloud-scale observability platform optimized for high-volume AI and cloud workloads, enabling unified security and performance visibility. How will AgentiX integrate with Chronosphere’s platform? AgentiX will use Chronosphere’s telemetry streams to deploy AI agents that detect issues, investigate root causes, and autonomously remediate failures across distributed environments. Why is observability suddenly mission-critical for AI workloads? AI data centers require continuous uptime and deterministic performance; observability becomes the real-time sensor layer that ensures reliability and cost-efficient scaling. What financial impact does Chronosphere bring? Chronosphere reports more than $160M in ARR with triple-digit annual growth, giving Palo Alto Networks a fast-expanding revenue engine in an increasingly competitive market. How will customers benefit from the combined offering? Enterprises would gain deeper visibility across security and observability data at petabyte scale, paired with automated remediation and significant cost reductions in telemetry ingestion.
November 20, 2025 at 7:43 AM
Germany Breaks Ground on €11B AI Mega Data Center in Lübbenau
Germany has taken a high-visibility step toward reclaiming technological independence with the launch of an €11 billion (roughly $13 billion) mega data center project in the small town of Lübbenau, about 100 kilometers southeast of Berlin. The Schwarz Group - the retail conglomerate behind Lidl and Kaufland - has officially broken ground on what could become one of Europe’s largest GPU-accelerated computing sites. The company positions the new data center as a critical pillar for building a domestic AI ecosystem capable of competing with the United States and China. At a rainy inauguration ceremony, executives joined German Minister for Digitalization and Government Modernization Karsten Wildberger to mark the start of construction on the multi-module facility. Mr. Wildberger called the site “the backbone of Germany’s digital sovereignty,” emphasizing that European companies will no longer need to rely on U.S. hyperscalers to store or process sensitive data. Supporters of the project argue that Europe’s strategic dependence on American cloud platforms - combined with lagging semiconductor and AI infrastructure - has constrained its ability to develop competitive AI models, industrial automation platforms, and sovereign cloud services. The stakes were highlighted in Berlin a day later, where German Chancellor Friedrich Merz and French President Emmanuel Macron jointly led a summit on digital sovereignty focused on reducing reliance on U.S. internet giants. European policymakers increasingly view hyperscale data center capability as essential for maintaining control over industrial data and ensuring that AI development aligns with regional privacy and security norms. Rolf Schumann, co-director of Schwarz Digits - the Schwarz Group subsidiary overseeing the Lübbenau program - said the new facility would deliver a “secure and independent infrastructure” that allows European businesses and citizens to shape their digital future without relying on foreign platforms. The design calls for six massive data center modules, each roughly equivalent to four football fields, built on the former site of a decommissioned power plant in the Spreewald region of former East Germany. Although physical construction remains in early stages, the company expects the first three modules to go live by the end of 2027. The scale of the build-out is notable. The data center will be able to house up to 100,000 GPUs - capacity that analysts describe as approaching an AI giga-factory, similar to the high-density GPU clusters under development in the U.S. and China. For comparison, U.S. hyperscalers have already begun constructing multi-million-GPU facilities, while Germany’s total national compute capacity remains roughly one-tenth that of the United States, according to industry association Bitkom. The Spreewald site is expected to support AI model training for the Schwarz Group as well as other private-sector clients and government agencies. One of the anchor customers will be Stackit, Schwarz Group’s cloud provider, which aims to offer an alternative to American platforms such as AWS and Microsoft Azure. Schumann emphasized that Stackit’s mission is to deliver “sovereign digital services with European values” and to give enterprises a compliant environment for hosting sensitive workloads. Despite strong political momentum, Europe’s efforts to build independent data and AI infrastructure have historically faced challenges ranging from regulatory complexity to cost disadvantages compared to U.S. cloud hyperscalers. 
The Lübbenau project will test whether European companies can scale AI infrastructure fast enough to close the gap. Executive Insights FAQ: AI Giga-Factories and EU Sovereign Compute Why are AI mega data centers seen as essential for digital sovereignty? They provide domestic control over sensitive data and model training pipelines, reducing dependence on foreign cloud platforms governed by different regulatory regimes. How does a 100,000-GPU facility change Europe’s AI capabilities? It enables training of frontier-scale models and high-density inference workloads within Europe, something currently available mostly through U.S. hyperscalers. What makes Germany’s Lübbenau project comparable to AI giga-factories? The scale of power, land, and GPU density follows the U.S. and Chinese model of hyperscale AI clusters designed for continuous training and industrial AI workloads. Why has Europe struggled to build independent digital infrastructure? High costs, fragmented regulations, smaller cloud providers, and limited semiconductor capacity have slowed the continent’s ability to compete with U.S. and Chinese firms. How will this data center support Europe’s enterprise ecosystem? It will offer locally governed cloud and AI compute, enabling industries like manufacturing, logistics, healthcare, and defense to keep data and model development inside the EU.
November 20, 2025 at 6:20 AM
IonQ Acquires Skyloom to Advance Quantum-Secure Optical Networking
IonQ is expanding its ambitions in quantum-secure communications with a definitive agreement to acquire Skyloom Global, a U.S. company specializing in high-performance optical communications for both terrestrial and orbital networks. Financial terms of the deal were not disclosed. The acquisition brings Skyloom’s rapidly scaling optical communications terminal (OCT) technology into IonQ’s growing ecosystem, strengthening the company’s effort to build a fully integrated global quantum infrastructure platform. Skyloom, founded in 2017, has become a key supplier of laser-based communications terminals for U.S. government and commercial missions. The company has delivered nearly 90 SDA-qualified OCTs to the Space Development Agency as of 2025, giving it one of the most proven production pipelines for high-bandwidth satellite communications hardware. After the acquisition closes, Skyloom CEO Marc Eisenberg will join IonQ’s leadership structure under Frank Backes, President of Quantum Infrastructure. IonQ says the strategic value of the deal goes far beyond incremental network performance. According to Chairman and CEO Niccolo de Masi, integrating space-based optical links directly into IonQ’s quantum platform accelerates the buildout of secure global quantum networking, enabling the distributed entanglement architectures expected to define next-generation communication systems. Bringing Skyloom into the fold expands IonQ’s addressable market while significantly increasing downlink speeds across its existing product lines. IonQ anticipates immediate performance gains once Skyloom’s terminals are incorporated, including a potential fivefold increase in data throughput and dramatically lower latency. In mission scenarios involving large scientific datasets or time-sensitive intelligence information, transmission windows that once took several hours could drop below one hour. These improvements support IonQ’s broader plan to build a multi-layered quantum-secure network leveraging orbital assets, terrestrial stations, and quantum computing nodes. Skyloom CEO Eisenberg described the acquisition as a milestone that amplifies his company’s founding goal to advance secure communications infrastructure. By pairing Skyloom’s OCT hardware with IonQ’s quantum systems, both companies see an opportunity to shape the early architecture of global quantum-secure networking. This acquisition extends IonQ’s recent pattern of consolidating key technologies across the quantum stack. It follows the company’s purchases of Qubitekk, Capella Space, Lightsynq, and a controlling stake in ID Quantique - moves that collectively give IonQ ownership of capabilities spanning quantum computing, sensing, cryptographic security, and space-based communications. With the addition of Skyloom, IonQ becomes one of the only companies with a vertically integrated approach to distributed quantum entanglement networks. The transaction remains subject to regulatory approval, but once finalized, it will further position IonQ as a leader building the infrastructure layer for future quantum networking and global secure communications. Executive Insights FAQ: Quantum-Secure Optical Networking and Skyloom Why are optical communications terminals important for quantum networking? High-bandwidth laser links enable low-latency data transport between satellites and ground stations, forming the physical backbone required for distributed quantum entanglement and quantum-secure communications. How does Skyloom’s technology enhance IonQ’s quantum platform? 
Skyloom’s OCTs can increase data throughput by up to 500% and reduce transfer times dramatically, enabling faster distribution of quantum keys, entanglement pairs, and mission-critical datasets. What role does a full-stack architecture play in quantum-secure communications? Owning compute, sensing, cryptography, and communications layers allows IonQ to integrate and optimize quantum-secure workflows without relying on external vendors or incompatible systems. How do space-based optical links support next-generation quantum applications? Orbital OCTs enable long-distance, line-of-sight quantum networking - necessary for global entanglement distribution, secure intercontinental communications, and resilient quantum-enabled infrastructure. Why is consolidation increasing in the quantum communications sector? Building global quantum networks requires seamless coordination across hardware domains. Companies are acquiring specialized players to assemble complete, interoperable platforms capable of supporting secure quantum connectivity at scale.
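As a back-of-the-envelope illustration of the throughput claim above, a fivefold increase in link rate is what turns a multi-hour downlink into a sub-hour one. The dataset size and baseline rate below are assumptions chosen for the arithmetic, not disclosed IonQ or Skyloom figures.

```python
# Rough transfer-time comparison under assumed figures (illustrative only).
dataset_bytes = 2e12               # hypothetical 2 TB mission dataset
baseline_gbps = 1.0                # assumed baseline downlink rate
improved_gbps = baseline_gbps * 5  # the ~5x throughput gain cited in the announcement

def transfer_hours(size_bytes, rate_gbps):
    """Time to move size_bytes over a rate_gbps link, ignoring protocol overhead."""
    return (size_bytes * 8) / (rate_gbps * 1e9) / 3600

print(f"baseline link: {transfer_hours(dataset_bytes, baseline_gbps):.1f} h")  # ~4.4 hours
print(f"5x link:       {transfer_hours(dataset_bytes, improved_gbps):.1f} h")  # ~0.9 hours
```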
November 19, 2025 at 9:45 PM
OVHcloud Launches Europe’s First Quantum-as-a-Service Platform
OVHcloud is expanding its role in Europe’s rapidly developing quantum computing landscape with the launch of its Quantum Platform, a new Quantum-as-a-Service environment designed to give enterprises on-demand access to some of the continent’s most advanced quantum processors. Announced at Choose France - France Edition, the platform represents the first European QaaS solution capable of providing cloud-based access to multiple quantum technologies, including Pasqal’s Orion Beta quantum processing unit, a neutral-atom system that now delivers a 100-qubit architecture to customers. The initiative positions OVHcloud at the center of Europe’s efforts to build a sovereign quantum capability that does not rely on overseas cloud or hardware stacks. Quantum computing remains out of reach for most organizations due to the extreme physical constraints of operating qubits at scale, but cloud-based delivery can abstract that complexity and let companies experiment with quantum algorithms long before they deploy them in production environments. By integrating Pasqal's neutral-atom system, OVHcloud is offering users access to a platform that can run early-stage workloads with higher qubit counts than most publicly available devices. This launch follows several years of groundwork. OVHcloud began its quantum roadmap in 2022 with the release of its first emulator and now hosts nine emulators - currently the broadest selection available on any European cloud. Nearly a thousand users have already tested workloads through these environments, exploring gate-based models, annealing-based approaches and neutral-atom simulations. The addition of Pasqal’s live quantum hardware extends this model into full hybrid experimentation, giving customers the ability to prototype algorithms in emulators and then validate them against real qubit behavior. OVHcloud intends to expand the platform significantly. By 2027, the company aims to integrate eight additional QPUs into the Quantum Platform, seven of which will come from European vendors. This strategy reflects a broader push for digital sovereignty, ensuring that quantum hardware, control stacks and cloud infrastructure can remain within European regulatory and operational domains. Company leaders emphasize the importance of preparing industry sectors - finance, logistics, materials science, cybersecurity and pharmaceuticals - for what comes after the current wave of classical AI acceleration. According to OVHcloud, quantum capability will become essential for solving high-dimensional optimization tasks, simulating chemical and physical systems, and enhancing machine learning pipelines through hybrid quantum-classical workflows. Giving enterprises early access to QaaS environments is meant to accelerate the learning curve well before quantum hardware reaches fault-tolerant maturity. Pasqal CEO Loïc Henriet said the collaboration demonstrates that Europe can deliver a fully local quantum stack, from the neutral-atom hardware platform to the cloud delivery layer. OVHcloud anticipates that the ability to access quantum resources alongside traditional compute, storage and AI toolchains will encourage developers to explore new algorithmic models without the deployment obstacles that have slowed broader adoption. The move signals a milestone in Europe’s quantum infrastructure development: a unified cloud entry point for multiple quantum technologies, scalable over the next several years, with both emulators and live systems available under a single commercial model. 
For enterprises navigating their post-AI roadmaps, it marks one of the first practical gateways into quantum computing at an industrial scale. Executive Insights FAQ: Neutral Atom Computing and 100-Qubit Systems How does neutral atom quantum computing differ from superconducting or trapped-ion systems? Neutral atom systems use laser-cooled atoms held in optical tweezers, allowing highly scalable qubit arrays and flexible connectivity patterns, potentially enabling larger system sizes with lower overhead. Why is access to a 100-qubit Pasqal system significant for enterprises? A 100-qubit device allows organizations to move beyond toy problems and test meaningful quantum algorithms for optimization, simulation and machine learning, providing insight into real-world performance. What advantages do neutral atom architectures offer for scaling to larger quantum processors? They can arrange qubits in reconfigurable 2D or 3D geometries, enabling higher qubit counts without the wiring complexity of superconducting designs and offering more natural multi-qubit interactions. How does a cloud-based QaaS model accelerate quantum adoption? It removes the need for specialized physical infrastructure, allowing companies to experiment with quantum workloads immediately and adopt hybrid classical-quantum development workflows within existing cloud pipelines. What role will quantum emulators continue to play as hardware matures? Emulators remain essential for algorithm prototyping, debugging and cost-efficient scaling tests. Combined with access to real QPUs, they give enterprises a full lifecycle environment for preparing quantum-enabled applications.
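To make the emulator-first workflow concrete, the sketch below shows the kind of gate-level statevector simulation a quantum emulator performs classically: preparing a two-qubit Bell state and reading out outcome probabilities. It is a generic illustration only; OVHcloud's hosted emulators and Pasqal's neutral-atom (analog) programming model expose their own interfaces, which are not reproduced here.

```python
import numpy as np

# Start in |00>.
state = np.zeros(4, dtype=complex)
state[0] = 1.0

# Hadamard on qubit 0 and a CNOT (control = qubit 0, target = qubit 1).
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
I = np.eye(2, dtype=complex)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)

# Apply H to qubit 0, then CNOT: the classic Bell-state circuit.
state = CNOT @ np.kron(H, I) @ state

# An emulator reports amplitudes/probabilities directly; real hardware returns shot counts.
probs = np.abs(state) ** 2
for basis, p in zip(["00", "01", "10", "11"], probs):
    print(f"|{basis}>: {p:.2f}")  # expect 0.50 for |00> and |11>, 0.00 otherwise
```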
November 19, 2025 at 9:19 PM
Cloudflare Outage Traced to Internal Error, Not Cyberattack
Cloudflare is detailing the root cause of a major global outage that disrupted traffic across a large portion of the Internet on November 18, 2025, marking the company’s most severe service incident since 2019. While early internal investigations briefly raised the possibility of a hyper-scale DDoS attack, Cloudflare cofounder and CEO Matthew Prince confirmed that the outage was entirely self-inflicted. The Cloudflare disruption, which began at 11:20 UTC, produced spikes of HTTP 5xx errors for users attempting to access websites, APIs, security services, and applications running through Cloudflare’s network - an infrastructure layer relied upon by millions of organizations worldwide. According to Prince, the outage was caused by a misconfiguration in a database permissions update, which triggered a cascading failure in the company’s Bot Management system and, in turn, caused Cloudflare’s core proxy layer to fail at scale. The error originated from a ClickHouse database cluster that was in the process of receiving new, more granular permissions. A query designed to generate a ‘feature file’ - a configuration input for Cloudflare’s machine-learning-powered Bot Management classifier - began producing duplicate entries once the permissions change allowed the system to see more metadata than before. The file doubled in size, exceeded the memory pre-allocation limits in Cloudflare’s routing software, and triggered software panics across edge machines globally. Those feature files are refreshed every five minutes and propagated to all Cloudflare servers worldwide. The intermittent nature of the database rollout meant that some nodes generated a valid file while others created a malformed one, causing the network to oscillate between functional and failing states before collapsing into a persistent failure mode. The initial symptoms were misleading. Traffic spikes, noisy error logs, intermittent recoveries, and even a coincidental outage of Cloudflare’s independently hosted status page contributed to early suspicion that the company was under attack. Only after correlating file-generation timestamps with error propagation patterns did engineers isolate the issue to the Bot Management configuration file. By 14:24 UTC, Cloudflare had frozen propagation of new feature files, manually inserted a known-good version into the distribution pipeline, and forced resets of its core proxy services - known internally as FL and FL2. Normal traffic flow began stabilizing around 14:30 UTC, with all downstream services recovering by 17:06 UTC. The impact was widespread because the faulty configuration hit Cloudflare’s core proxy infrastructure, the traffic-processing layer responsible for TLS termination, request routing, caching, security enforcement, and API calls. When the Bot Management module failed, the proxy returned 5xx errors for all requests relying on that module. On the newer FL2 architecture, this manifested as widespread service errors; on the legacy FL system, bot scores defaulted to zero, creating potential false positives for customers blocking bot traffic. 
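The failure mode described above - an upstream change that silently doubles the size of a frequently propagated configuration artifact until a fixed-capacity consumer gives up - is easy to reproduce in miniature. The sketch below is an illustrative analogy in Python, not Cloudflare's proxy code; the feature names, the 200-entry cap, and the last-known-good fallback are assumptions standing in for the real system.

```python
FEATURE_CAPACITY = 200  # stand-in for a fixed memory pre-allocation limit (hypothetical value)

def generate_feature_file(metadata_rows):
    """Build a 'feature file' from database metadata; duplicate rows inflate it."""
    return [f"feature_{name}" for name, _schema in metadata_rows]

def load_features_unguarded(feature_file):
    """Loader that assumes the file always fits - analogous to the panic path."""
    if len(feature_file) > FEATURE_CAPACITY:
        raise RuntimeError("panic: feature file exceeds pre-allocated capacity")
    return feature_file

def load_features_validated(feature_file, last_known_good):
    """Hardened loader: reject an oversized file and keep serving the previous good one."""
    if len(feature_file) > FEATURE_CAPACITY:
        return last_known_good
    return feature_file

# Before the permissions change: one metadata row per feature.
clean_rows = [(f"f{i}", "default") for i in range(150)]
# After the change, the query sees each feature in a second schema, duplicating every row.
duplicated_rows = clean_rows + [(name, "r0") for name, _ in clean_rows]

good_file = generate_feature_file(clean_rows)      # 150 entries - fits
bad_file = generate_feature_file(duplicated_rows)  # 300 entries - over the cap

print(len(load_features_validated(bad_file, good_file)))  # 150: falls back to the last known good file
try:
    load_features_unguarded(bad_file)
except RuntimeError as err:
    print(err)  # the oversized file triggers the simulated panic
```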
Multiple services either failed outright or degraded, including Turnstile (Cloudflare’s authentication challenge), Workers KV (the distributed key-value store underpinning many customer applications), Access (Cloudflare’s Zero Trust authentication layer), and portions of the company’s dashboard. Internal APIs slowed under heavy retry load as customers attempted to log in or refresh configurations during the disruption. Cloudflare emphasized that email security, DDoS mitigation, and core network connectivity remained operational, although spam-detection accuracy temporarily declined due to the loss of an IP reputation data source. Prince acknowledged the magnitude of the disruption, noting that Cloudflare’s architecture is intentionally built for fault tolerance and rapid mitigation, and that a failure blocking core proxy traffic is deeply painful to the company’s engineering and operations teams. The outage, he said, violated Cloudflare’s commitment to keeping the Internet reliably accessible for organizations that depend on the company’s global network. Cloudflare has already begun implementing systemic safeguards. These include hardened validation of internally generated configuration files, global kill switches for key features, more resilient error-handling across proxy modules, and mechanisms to prevent debugging systems or core dumps from consuming excessive CPU or memory during high-failure events. The full incident timeline reflects a multi-hour race to diagnose symptoms, isolate root causes, contain cascading failures, and bring the network back online. Automated detection triggered alerts within minutes of the first malformed file reaching production, but fluctuating system states and misleading external indicators complicated root-cause analysis. Cloudflare teams deployed incremental mitigations - including bypassing Workers KV’s reliance on the proxy - while working to identify and replace the corrupted feature files. By the time a fix reached all global data centers, Cloudflare’s network had stabilized, customer services were back online, and downstream errors were cleared. As AI-driven automation and high-frequency configuration pipelines become fundamental to global cloud networks, the Cloudflare outage underscores how a single flawed assumption - in this case, about metadata visibility in ClickHouse queries — can ripple through distributed systems at Internet scale. The incident serves as a high-profile reminder that resilience engineering, configuration hygiene, and robust rollback mechanisms remain mission-critical in an era where edge networks process trillions of requests daily. Executive Insights FAQ: Understanding the Cloudflare Outage What triggered the outage in Cloudflare’s global network? A database permissions update caused a ClickHouse query to return duplicate metadata, generating a Bot Management feature file twice its expected size. This exceeded memory limits in Cloudflare’s proxy software, causing widespread failures. Why did Cloudflare initially suspect a DDoS attack? Systems showed traffic spikes, intermittent recoveries, and even Cloudflare’s external status page went down by coincidence - all patterns resembling a coordinated attack, contributing to early misdiagnosis. Which services were most affected during the disruption? Core CDN services, Workers KV, Access, and Turnstile all experienced failures or degraded performance because they depend on the same core proxy layer that ingests the Bot Management configuration. 
Why did the issue propagate so quickly across Cloudflare’s global infrastructure? The feature file responsible for the crash is refreshed every five minutes and distributed to all Cloudflare servers worldwide. Once malformed versions began replicating, the failure rapidly cascaded across regions. What long-term changes is Cloudflare making to prevent future incidents? The company is hardening configuration ingestion, adding global kill switches, improving proxy error handling, limiting the impact of debugging systems, and reviewing failure modes across all core traffic-processing modules.
November 19, 2025 at 6:40 PM
IONOS Deploys Distributed High-Performance Network with VyOS
VyOS Networks is expanding its footprint in the enterprise cloud ecosystem as IONOS, one of Europe’s largest hosting and infrastructure providers, has completed a broad deployment of the VyOS open-source network operating system across its Bare Metal platform. The rollout marks a significant architectural shift for IONOS, replacing centralized, hardware-dependent networking models with a distributed, software-defined approach designed to support massive scale, improve resilience, and reduce operational costs. The deployment reflects a growing trend among global cloud providers: leveraging open-source network operating systems to accelerate infrastructure modernization while avoiding vendor lock-in. For IONOS, the move to VyOS enables the company to scale to hundreds of nodes, orchestrate workloads more flexibly across its European data centers, and achieve high-performance throughput without the licensing costs associated with traditional proprietary systems. According to IONOS, the shift was driven by a need to eliminate architectural bottlenecks and reduce the risk of outages tied to centralized network chokepoints. By distributing VyOS instances across its infrastructure, the company has built a fault-tolerant environment that maintains service continuity even when individual components fail. The redesign also positions IONOS to better support increasingly data-intensive customer workloads spanning bare metal compute, hybrid cloud deployments, and latency-sensitive applications. “VyOS gave us the freedom to build a resilient, distributed network without sacrificing performance or control,” said Tomás Montero, Head of Hosting Network Services at IONOS. “We can scale to hundreds of nodes efficiently and securely.” Performance metrics from the deployment indicate that VyOS is delivering high throughput at scale. Across IONOS clusters, aggregate speeds reach into the hundreds of gigabits per second. Individual clusters achieve peak throughput of 20 Gbps and sustain roughly 1.5 million packets per second. These figures position the open-source platform squarely within the performance range of commercial network operating systems traditionally relied upon by large cloud providers. VyOS Networks emphasized that the collaboration highlights a broader industry shift in favor of open-source networking as a strategic foundation for next-generation infrastructure. “IONOS’s adoption of VyOS demonstrates how open-source networking solutions can rival and even outperform proprietary systems in scalability, reliability, and cost efficiency,” said Santiago Blanquet, Chief Revenue Officer at VyOS Networks. “This collaboration showcases how enterprises can leverage VyOS to build cloud-ready, high-throughput infrastructures that deliver exceptional performance and resilience.” The move to VyOS has also yielded cost benefits for IONOS. The company reports significant savings tied to the elimination of traditional hardware and licensing expenditures. Instead of renewing contracts with established networking vendors, IONOS is investing in software-defined infrastructure that can scale horizontally and adapt to workload demands without requiring specialized hardware appliances. Looking ahead, IONOS plans to deepen its integration with the VyOS ecosystem. The company is preparing to adopt Vector Packet Processing (VPP) in VyOS 1.5 to further push throughput and efficiency across its networking layer. 
Additional enhancements planned for upcoming phases include expanded orchestration support and advanced load-balancing capabilities to optimize multi-tenant infrastructure performance. Taken together, these investments signal a long-term commitment to open-source networking as the backbone of IONOS’s infrastructure strategy. VyOS Networks, which has spent more than a decade developing open-source routing, firewall, and VPN technologies, now occupies a growing role in enterprise infrastructure modernization initiatives. Its software is deployed across bare-metal environments, hyperscale clouds, and distributed edge systems, giving organizations a unified networking platform that can be automated and scaled across heterogeneous environments. With competition in cloud infrastructure intensifying, the collaboration positions IONOS to offer customers more flexible, high-performance network services without the constraints of legacy architectures. For VyOS, it strengthens the company’s presence in the European infrastructure market and highlights the maturing role of open-source networking within mission-critical cloud platforms. Executive Insights FAQ: What This News Means for Enterprise Networking How does VyOS improve network scalability for cloud providers? VyOS enables distributed deployment across hundreds of nodes, allowing cloud operators to scale network capacity horizontally without relying on centralized hardware. What performance gains did IONOS achieve with VyOS? Clusters reached peak throughput of 20 Gbps and about 1.5 million PPS, with aggregate speeds in the hundreds of Gbps across the environment. How does VyOS reduce operational and financial risk? The distributed design eliminates single points of failure and VyOS’s open-source model removes licensing fees, reducing both downtime risk and recurring cost. Why is open-source networking gaining traction in hyperscale and cloud environments? Enterprises want vendor independence, automation-friendly infrastructure, and cost efficiency - areas where open-source NOS platforms increasingly match or surpass proprietary options. What comes next in the VyOS–IONOS collaboration? IONOS plans to adopt VPP in VyOS 1.5, enhance orchestration, and expand load-balancing capabilities to further improve throughput and operational efficiency across its bare-metal platform.
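The published cluster figures invite a quick sanity check. Peak throughput and sustained packet rate are quoted separately, but relating them gives a feel for the traffic profile; the arithmetic below, with a standard 1,500-byte MTU assumed for comparison, is illustrative and not an IONOS measurement.

```python
peak_throughput_bps = 20e9  # 20 Gbps peak, as reported
sustained_pps = 1.5e6       # ~1.5 million packets per second, as reported
mtu_bytes = 1500            # common Ethernet MTU, assumed for comparison

# If the peak rate were carried at the sustained packet rate, average frames would be large:
implied_avg_frame = peak_throughput_bps / 8 / sustained_pps
print(f"implied average frame size: {implied_avg_frame:.0f} bytes")  # ~1,667 bytes

# Conversely, filling 20 Gbps with full-MTU frames needs a slightly higher packet rate:
pps_at_mtu = peak_throughput_bps / 8 / mtu_bytes
print(f"packets/s needed at 1,500-byte frames: {pps_at_mtu / 1e6:.2f} Mpps")  # ~1.67 Mpps
```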
November 19, 2025 at 6:40 PM
GMI Cloud Unveils $500M Taiwan AI Factory Powered by Blackwell
GMI Cloud is deepening Asia’s role in the global AI infrastructure race with a new half-billion-dollar AI Factory in Taiwan, built around NVIDIA’s newest Blackwell architecture. The company, one of the fastest-growing GPU-as-a-Service providers in the region and an official NVIDIA Cloud Partner, positions the facility as a cornerstone for sovereign AI development at a time when demand for high-density compute capacity is escalating across the public and private sectors. The project underscores a shift toward regional AI infrastructure that combines domestic data control with access to U.S.-based accelerated computing technologies, reflecting a broad reconfiguration of the international AI supply chain. At full buildout, the Taiwan AI Factory will house more than 7,000 NVIDIA Blackwell Ultra GPUs integrated across 96 GB300 NVL72 racks, pushing the design envelope for next-generation AI inference and multi-modal processing. The AI system, operating at 16 megawatts, is engineered to reach close to two million tokens per second for large-scale AI workloads, placing it among the most advanced GPU clusters in Asia. Its network backbone incorporates NVIDIA NVLink, NVIDIA Quantum InfiniBand, NVIDIA Spectrum-X Ethernet, and NVIDIA BlueField DPUs, creating a cohesive high-throughput, low-latency data path designed specifically for enterprise AI and generative model fine-tuning. GMI Cloud frames the facility as a blueprint for regional AI modernization, aiming to support sectors that increasingly depend on real-time analytics, autonomous systems, and simulation-driven decision-making. According to CEO Alex Yeh, thousands of synchronized Blackwell GPUs will form the computational core for what the company calls a new model of sovereign AI infrastructure - one capable of absorbing local data, meeting regional regulatory expectations, and powering workloads that previously required cross-border compute resources. The significance of the new AI Factory is reflected in early partnerships that showcase how accelerated infrastructure reshapes workflows across cybersecurity, manufacturing, data operations, and energy systems. Trend Micro, which has been developing digital twin–driven security modelling, will use the platform to simulate and validate cyber risk scenarios without exposing live production environments. This approach combines NVIDIA AI Enterprise software, NVIDIA BlueField, and GMI Cloud’s compute to stress-test complex environments at a fidelity and speed not achievable with conventional security tooling. By partnering with Magna AI, Trend Micro aims to create an adaptive security ecosystem that continuously models evolving threats. Smart Manufacturing Systems Hardware manufacturer Wistron is adopting the AI Factory as a foundation for smart manufacturing systems that integrate computer vision, predictive maintenance, and continuous simulation. GMI Cloud’s infrastructure allows Wistron to train and deploy models directly onto active production lines, reducing downtime and enabling automated quality control loops. This shift toward real-time, in-factory inference reflects a broader industry trend where manufacturers rely on tightly integrated AI stacks to optimize throughput and reduce operational variability. In data infrastructure, VAST Data will provide the high-performance storage substrate necessary to sustain exabyte-scale pipelines. 
Its architecture aligns with the bandwidth and concurrency demands of the Blackwell platform, ensuring rapid retrieval for training and inference workloads that benefit from high-parallel access paths. As model sizes grow and the ratio between compute and data intensity shifts, VAST Data’s role is to eliminate I/O bottlenecks and maintain predictable performance at scale. A third pillar comes from TECO, which is supplying energy optimization systems and modular data center solutions. Leveraging decades of expertise in electrification and industrial equipment, TECO is transforming physical infrastructure - motors, HVAC systems, drives, and power distribution - into digitalized assets that can be orchestrated within the AI Factory. This results in a more responsive energy architecture capable of supporting fluctuating AI loads while enabling new Energy-as-a-Service delivery models for global customers. Together, these deployments highlight how AI factories are evolving from conceptual frameworks into operational assets that fuse compute, data, and energy. The Taiwan facility represents a hybrid of regional autonomy and global technological cooperation, positioning Asia to accelerate its adoption of advanced AI while aligning closely with U.S. innovation cycles. NVIDIA’s regional leadership emphasizes that such factories represent a new stage in AI industrialization, where intelligence is produced much like conventional goods - through scalable, optimized, repeatable infrastructure systems. As AI adoption shifts from experimentation to enterprise-level deployment, the GMI Cloud AI Factory presents a model for how organizations can combine high-performance GPU clusters with sovereign control, sector-specific use cases, and energy-efficient design. It reflects a larger structural change in global AI computing, where supply chains, regulatory environments, and scalability considerations increasingly determine where and how AI capabilities are built. Executive Insights FAQ: NVIDIA Blackwell Architecture How does Blackwell improve performance for large-scale enterprise inference? Blackwell’s architecture is optimized for high-throughput inference with significantly higher tensor processing density, enabling faster token generation at lower cost per watt. Its design supports massive concurrent workloads typical of enterprise generative AI and multi-modal systems. What advantage does NVLink provide in an AI Factory environment? NVLink interconnects GPUs with extremely high bandwidth and low latency, allowing them to function as a unified compute fabric. This is essential for large model parallelism and for training or serving models that exceed the memory of a single GPU. Why are Spectrum-X Ethernet and Quantum InfiniBand both used? Quantum InfiniBand supports ultra-low-latency HPC-style GPU communication, while Spectrum-X Ethernet enables scalable, AI-optimized data center networking. Combined, they provide flexibility across AI, HPC, and enterprise workloads. How does Blackwell architecture support multi-modal AI workloads? With increased memory bandwidth, larger HBM configurations, and efficient tensor core designs, Blackwell GPUs can handle heterogeneous data streams - text, vision, audio, sensor data - without requiring separate hardware subsystems. What operational efficiencies does Blackwell introduce for data centers? Blackwell GPUs emphasize energy-efficient performance scaling, reducing the compute per watt cost. 
This allows data centers to increase AI capacity within the same or smaller power envelope, a critical factor as AI energy demands rise globally.
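Treating the headline figures for the facility - roughly 7,000 GPUs across 96 racks, 16 MW, and close to two million tokens per second - as round numbers, a rough decomposition shows what they imply per accelerator. The split below is arithmetic on the announced totals, not a GMI Cloud specification, and the facility power includes cooling and networking overhead rather than GPU draw alone.

```python
total_gpus = 7_000            # "more than 7,000" Blackwell Ultra GPUs, per the announcement
racks = 96                    # GB300 NVL72 racks
facility_power_w = 16e6       # 16 megawatts of facility power
aggregate_tokens_per_s = 2e6  # "close to two million tokens per second"

print(f"GPUs per rack:            {total_gpus / racks:.0f}")                          # ~73
print(f"facility power per GPU:   {facility_power_w / total_gpus / 1e3:.1f} kW")      # ~2.3 kW, all-in
print(f"tokens/s per GPU:         {aggregate_tokens_per_s / total_gpus:.0f}")         # ~286
print(f"tokens per joule (fleet): {aggregate_tokens_per_s / facility_power_w:.3f}")   # ~0.125
```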
November 19, 2025 at 5:56 PM
Lambda Raises $1.5B to Build Gigawatt-Scale AI Factory Infrastructure
Lambda is accelerating its ambition to build the foundational infrastructure for the era of artificial intelligence with a new $1.5 billion Series E round, one of the largest private capital injections into AI compute this year. The funding is led by TWG Global, the holding company founded by Thomas Tull and Mark Walter, with further participation from Tull’s US Innovative Technology Fund (USIT) and several existing backers. The investment signals intensifying confidence in Lambda’s strategy to create gigawatt-scale AI factories capable of supporting both the training and inference workloads that underpin next-generation AI deployments across industries. Lambda positions itself as the “Superintelligence Cloud,” a framing that reflects the company’s attempt to build infrastructure that operates at a scale similar to national utilities - continuous, resilient and priced for ubiquity. CEO Stephen Balaban describes the goal as making compute as commonplace and accessible as electricity, with the provocative mantra “one person, one GPU.” The funding round, he argues, provides the capital required to expand into multi-gigawatt facilities capable of serving models that power services used by hundreds of millions of people each day. The timing coincides with a major constraint in the AI sector: GPU capacity and data center availability remain structurally limited. Even as demand accelerates from hyperscalers, enterprises, model developers and edge-AI applications, new infrastructure has lagged due to land, power and construction shortages. This mismatch has elevated the value of specialized AI factories - facilities engineered around high-density GPU clusters, low-latency fabrics, liquid cooling and optimized energy-to-compute conversion. Lambda’s existing footprint includes cloud supercomputers favored by researchers, startups and major organizations training advanced models. The company’s specialization is rooted in its origins: Lambda was founded by published machine learning researchers who built infrastructure tailored to the needs they once faced themselves. That perspective has translated into a design philosophy centered on throughput, determinism, and direct usability rather than generic cloud abstractions. Its systems are optimized for high-bandwidth interconnects, efficient cooling, and predictable cluster scheduling - features that have become essential as model sizes, context windows and inference concurrency all continue to grow. Divergence in Cloud Strategies Investors suggest that the ability to reliably convert ‘kilowatts into tokens,’ as USIT managing director Gaetano Crupi described it, will define the competitive landscape of the AI decade. With electrical infrastructure emerging as the primary bottleneck for AI scaling, the value is shifting from raw GPU counts to end-to-end system efficiency: power distribution, thermal design, networking fabric, and the software layers that tie those components into cohesive large-scale clusters. From this perspective, Lambda is part of a new wave of companies aiming to industrialize AI - not as a series of cloud workloads but as a full-stack production pipeline similar to traditional heavy industry. Thomas Tull, who has supported Lambda for several years, emphasized that compute scarcity is becoming one of the dominant macroeconomic issues of the 2020s. 
Delivering enough capacity for model training and real-time inference, he argued, is a generational challenge akin to earlier national-scale infrastructure projects. The funding round positions Lambda to expand aggressively, potentially turning the company into a long-term operator of critical digital infrastructure with national significance. The broader Superintelligence Cloud thesis would align with a shift toward sovereign AI capabilities, edge-to-cloud compute ecosystems, and a tightening linkage between energy infrastructure and cognitive compute. As AI systems move further into mission-critical domains - from autonomous systems to defense, healthcare, finance and industrial automation - the reliability and availability of high-performance clusters are becoming foundational requirements. AI factories, in this context, are not merely data centers but vertically tuned production engines that transform data and energy into usable intelligence at scale. Lambda’s approach also reflects increasing divergence in cloud strategies. While hyperscalers continue expanding general-purpose compute with AI accelerators layered on top, AI specialists are building vertically optimized environments where networking topologies, storage paths, GPU interconnects and compilers are tuned explicitly for AI workloads. This specialization can yield significant gains in training throughput, inference cost-per-token, and cluster utilization - three metrics that now define competitiveness in model development and deployment. The new funding is likely to intensify competition within the GPU cloud and AI infrastructure sectors, particularly among providers pursuing supercomputer-class clusters. It may also influence where developers choose to train and deploy frontier-scale models, especially those requiring predictable performance on dense clusters or multi-node training jobs that can’t tolerate variability. Lambda’s rise captures a pivotal moment in the AI buildout phase: experimentation is giving way to industrialization, and the winners will be the companies capable of building and operating AI factories at unprecedented scale and efficiency. The company’s trajectory will now depend on how fast it can convert capital into operational infrastructure - and how effectively it can navigate the power, supply chain and regulatory realities that govern modern AI compute. Executive Insights FAQ: About AI Factories What makes an AI factory different from a traditional data center? AI factories are purpose-built with dense GPU clusters, high-bandwidth fabrics, and energy-optimized architectures tailored for AI training and inference, unlike general-purpose data centers that serve mixed workloads. Why is power availability emerging as the biggest constraint for AI infrastructure? Large-scale model training requires enormous electrical capacity, and utilities cannot expand quickly enough. The ability to convert power efficiently into usable compute has become a central competitive advantage. How do AI factories improve mission-critical inference workloads? By optimizing interconnects, memory bandwidth, and scheduling, AI factories deliver predictable low-latency inference at scale - crucial for applications in finance, healthcare, logistics, and autonomous systems. Why are investors focusing on ‘kilowatts-to-tokens’ efficiency? As models grow, the cost of both training and inference is increasingly determined by the energy required per token. 
Improving this ratio directly reduces operational cost and increases competitiveness. How will AI factories shape the next decade of AI deployment? They will underpin sovereign AI strategies, enable frontier model development outside hyperscalers, and provide the backbone for mission-critical AI applications that depend on scalable, deterministic performance.
November 19, 2025 at 5:56 PM
VAST Data, Microsoft Unite to Deliver High-Scale Agentic AI on Azure
VAST Data and Microsoft are deepening their alignment around next-generation AI infrastructure, announcing a new collaboration that will bring the VAST Data AI Operating System (AI OS) natively to Microsoft Azure. Unveiled at Microsoft Ignite, the partnership positions VAST Data as a strategic technology layer supporting what both companies describe as the coming wave of agentic AI - systems composed of autonomous, continuously reasoning software agents operating on massive, real-time datasets. For Azure customers, the integration means they will be able to deploy VAST’s full data platform directly within the Microsoft cloud, using the same governance, security, operational tooling, and billing frameworks that define Azure-native services. The VAST AI OS, long known in enterprise AI circles for its performance-oriented architecture and unified data model, will now be available as a cloud service, simplifying deployment for organizations scaling AI workloads across on-premises, hybrid, and multi-cloud environments. The partnership gives enterprises access to VAST’s unified storage, data cataloging, and database services, designed to support increasingly complex AI pipelines that incorporate vector search, retrieval-augmented generation (RAG), model training, inference, and real-time agentic processing. VAST’s architecture will run on Azure infrastructure, including the new Laos VM Series and Azure Boost accelerated networking, which are optimized for high-bandwidth AI workloads. Jeff Denworth, co-founder of VAST Data, described the partnership as an inflection point for enterprise AI deployment. “Performance, scale, and simplicity are converging,” he said. “Azure customers will be able to unify their data and AI pipelines across environments with the same power, simplicity, and performance they expect from VAST - now combined with the elasticity and geographic reach of Microsoft’s cloud.” Microsoft, for its part, sees the integration as a way to streamline the data and storage foundations required for the fast-growing segment of AI model builders working within Azure. “Many of the world’s leading AI developers leverage VAST for its scalability and breakthrough performance,” said Aung Oo, Vice President of Azure Storage. “Running VAST’s AI OS on Azure will help customers accelerate time-to-insight while reducing operational and cost barriers.” At the center of the offering is a platform designed for agentic AI. VAST’s InsightEngine provides stateless compute and database services optimized for vector search, RAG pipelines, and high-performance data preparation. Its companion AgentEngine coordinates autonomous AI agents working across distributed environments, enabling continuous reasoning over data streams without requiring multi-step orchestration frameworks. Azure CPU and GPU Clusters From an infrastructure perspective, the VAST AI OS is engineered to maximize utilization of Azure CPU and GPU clusters. The platform integrates intelligent caching, metadata-aware I/O, and high-throughput data services to ensure predictable performance across training, fine-tuning, and inference cycles. This aligns with Microsoft’s broader strategy of building vertically integrated AI infrastructure - one that increasingly includes custom silicon investments. A key differentiator of the VAST approach is its exabyte-scale DataSpace, which creates a unified global namespace across on-prem, co-lo, and cloud environments. 
The model gives enterprises the ability to burst GPU-intensive workloads into Azure without redesigning pipelines or migrating data - a capability that has traditionally slowed hybrid AI adoption. VAST Data’s disaggregated, shared-everything (DASE) architecture extends into Azure as well, allowing compute and storage resources to scale independently. With built-in Similarity Reduction technology reducing the storage footprint of large AI datasets, the combined platforms aim to give customers both elasticity and cost containment - critical factors as model development increasingly demands multi-region, multi-petabyte environments. The collaboration arrives as AI infrastructure requirements evolve rapidly. Autonomous agents, context-rich retrieval systems, and continuous-learning workflows require consistent performance across heterogeneous environments - something neither legacy storage architectures nor siloed cloud services were built to handle. By positioning VAST as a unified data substrate for Azure-based AI, Microsoft is betting on an architecture that can bridge those gaps at cloud scale. Both companies say they will co-engineer future capabilities as Microsoft advances its next-generation compute programs. The long-term goal, they emphasize, is to ensure that regardless of model architecture or processor design, the underlying data layer can support AI workloads with predictability and scale. Executive Insights FAQ What does this partnership enable for Azure customers? Azure users will be able to deploy the VAST AI Operating System natively in the cloud, giving them unified data services, high-performance storage, and AI-optimized compute pipelines without managing separate infrastructure. How does the VAST AI OS support agentic AI? VAST’s InsightEngine and AgentEngine allow organizations to run autonomous AI agents and stateful reasoning systems directly on real-time data streams, enabling continuous decision-making across hybrid and multi-cloud environments. What advantages does the integration bring for AI model builders? The platform keeps Azure GPU clusters fully utilized through high-throughput data services, intelligent caching, and metadata-optimized I/O - ensuring predictable performance for training, fine-tuning, and inference at scale. How does VAST improve hybrid AI workflows? Its global DataSpace functions as a unified namespace, allowing organizations to burst workloads into Azure without data migration or pipeline redesign, enabling seamless hybrid and multi-cloud operations. How will the collaboration evolve as Microsoft introduces new AI hardware? VAST Data and Microsoft will co-engineer future platform requirements so that emerging Azure infrastructure - including custom silicon initiatives - remains fully compatible with VAST’s AI OS, ensuring long-term scalability and performance.
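The retrieval side of the pipelines described above - vector search feeding retrieval-augmented generation - reduces to a nearest-neighbor lookup over embeddings. The sketch below is a self-contained toy using cosine similarity over hand-made vectors; it illustrates the general mechanism only and does not use VAST's InsightEngine or any Azure API, and the documents and embeddings are invented for the example.

```python
import numpy as np

# Toy "embeddings" for a handful of documents (in practice these come from an embedding model).
docs = {
    "outage postmortem":       np.array([0.9, 0.1, 0.0]),
    "gpu cluster runbook":     np.array([0.2, 0.8, 0.1]),
    "quarterly earnings memo": np.array([0.1, 0.1, 0.9]),
}

def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def retrieve(query_vec, k=2):
    """Rank documents by similarity to the query embedding - the RAG retrieval step."""
    ranked = sorted(docs.items(), key=lambda kv: cosine(query_vec, kv[1]), reverse=True)
    return ranked[:k]

# A query embedding close to the "gpu cluster runbook" topic.
query = np.array([0.3, 0.7, 0.0])
for name, vec in retrieve(query):
    print(name, round(cosine(query, vec), 3))
# The top-k documents would then be passed to the model as grounding context.
```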
November 19, 2025 at 4:39 PM
AI Demand Reshapes Global Data Center Costs, Says New Report
Global data center construction is entering a period of rapid divergence as AI demand reshapes cost structures, power availability, and infrastructure strategy worldwide, according to a new analysis from Turner & Townsend, a multinational professional services company headquartered in Leeds, United Kingdom. As one of the world’s largest construction consultancies, Turner & Townsend provides program management, cost management, and infrastructure advisory services across property, transportation, and natural resources - and its newly released Data Centre Construction Cost Index 2025 signals a turning point for the industry. The report reveals a market in which traditional cloud facilities are stabilizing in cost, while next-generation AI data centers are breaking away with higher complexity, greater density requirements, and significantly elevated capital costs. With AI adoption accelerating faster than grid infrastructure and supply chains can adapt, Turner & Townsend warns that markets are entering a phase where delays, price divergence, and regional disparities will shape global competitiveness. According to the firm’s analysis, construction costs for conventional air-cooled cloud data centers are projected to rise 5.5 percent year-over-year in 2025, a notable cooling from the 9 percent increase reported in the previous year. This moderation reflects broader stabilization in construction markets globally, along with maturing supply chains in newer data center regions. Turner & Townsend’s Global Construction Market Intelligence Report for 2025 shows only 4.2 percent average inflation across the entire construction sector - an indicator that the data center ecosystem is settling after a period of extreme volatility. Widening Data Center Construction Cost Gap But the real story lies in the widening gap between traditional builds and facilities designed specifically for AI workloads. Turner & Townsend's benchmarking - supported by data from its global Hive intelligence platform - shows that liquid-cooled, high-density AI data centers in the United States carry a 7–10 percent construction cost premium over similarly sized air-cooled facilities. These projects not only require more complex mechanical systems but demand higher electrical capacity, more sophisticated rack design, and greater engineering coordination to support the thermal needs of GPU clusters used for training and inference. In these next-generation environments, mechanical systems comprise 33 percent of total build costs, compared with 22 percent for air-cooled designs, highlighting the shift from traditional fan-based cooling to immersion, rear-door heat exchange, and direct-to-chip liquid cooling technologies. Electrical systems remain the single largest cost driver - accounting for roughly half of total spend - reflecting the industry’s dramatic escalation in power density per rack. Industry survey data included in the report suggests that these pressures are being felt acutely. Nearly half of respondents (47 percent) said they experienced bid or tender price increases between 6 and 15 percent in the past year, and 21 percent reported increases above 15 percent. Looking ahead, 60 percent expect further construction cost escalation of 5 to 15 percent in 2026. The geography of cost pressure remains uneven. The world’s most expensive markets for data center construction are unchanged: Tokyo (US$15.2 per watt), Singapore (US$14.5), and Zurich (US$14.2), ranking as the top three globally. 
In these highly constrained regions, land scarcity, labor dynamics, and specialized contractor availability are driving persistently high pricing. Tokyo’s dominance is reinforced by the addition of Osaka to the index, signaling Japan’s emergence as a multi-hub data center market. In Europe, markets such as Paris and Amsterdam have climbed significantly due to maturing supply chains and currency effects from a softer U.S. dollar. Both now sit at US$10.8 per watt, comparable to Portland’s pricing in the United States. Meanwhile, Madrid and Dublin have surpassed major U.S. hubs including Atlanta and Phoenix, reflecting rapidly rising demand in Europe’s expanding cloud and AI ecosystem. Power Availability In the United States, a major shift is underway as long-standing power constraints in Northern Virginia push developers southward. Charlotte, North Carolina, newly added to the index at US$9.5 per watt, is experiencing a surge in hyperscale and colocation development. Favorable tax incentives, grid accessibility, and lower electricity prices have drawn new investments from Digital Realty, Microsoft, QTS, Compass, and Apple. Turner & Townsend notes that this marks an inflection point: power strategy is becoming the determining factor for where the next generation of AI centers will be built. Power availability is now the single greatest barrier to delivery. 48 percent of survey respondents identified power constraints - especially long grid connection timelines - as the primary cause of project delays. Across the U.S., UK, and Europe, utilities face competing demands from housing, manufacturing, and renewable energy deployment, forcing grid operators to prioritize connections. While governments are attempting to modernize planning rules and connection processes, progress remains slow. In response, Turner & Townsend stresses that clients will increasingly need to consider alternative or supplemental power strategies, including on-site renewable generation, battery energy storage, or grid-independent solutions. Yet only 14 percent of survey respondents have explored such approaches. As AI workloads become dominant, the consultancy warns that dependence on traditional grid connections will present an unsustainable bottleneck. Water use is emerging as a second major concern. Although many liquid-cooling systems operate in closed-loop designs, public scrutiny and local environmental policies are tightening. Regions facing water scarcity may restrict certain cooling configurations, pushing operators toward more efficient thermal designs that minimize environmental impact and accelerate planning approvals. Despite these headwinds, the data center sector remains highly optimistic. 75 percent of survey respondents are already involved in AI data center projects, and 47 percent expect AI workloads to represent more than half of total demand within the next two years. The industry has seen rack power density rise by 100x in the past decade, and Turner & Townsend argues that this momentum reflects only the earliest stage of an AI-driven infrastructure revolution. Executive Insights FAQ: AI Data Center Economics What is driving the cost premium for AI-optimized data centers? Higher power density, liquid cooling integration, and advanced electrical and mechanical systems push AI data center construction costs 7–10% above traditional designs. Why is power availability the biggest factor affecting project timelines? 
Grid connection queues and regional power shortages are delaying builds more than any other factor, forcing developers to seek alternative energy models or new markets. How is liquid cooling changing facility design and cost allocation? Mechanical system costs rise significantly, environmental considerations become more complex, and operators must integrate new thermal strategies to support GPU racks. Why are regional cost disparities narrowing across Europe and the U.S.? Maturing supply chains and currency shifts are balancing costs, while demand in secondary markets is rising due to power constraints in traditional hubs. What strategic steps should operators take to avoid future delays? Early procurement, diversified supplier networks, and exploration of on-site or hybrid power models are increasingly essential for AI-driven deployments.
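To see how the report's $/watt benchmarks and the AI premium translate into project budgets, the arithmetic below applies the published figures to a hypothetical facility; the 50 MW size is an assumption chosen purely for illustration, and the 7-10 percent premium is the U.S. figure applied here only as a rough indicator.

```python
facility_mw = 50                 # hypothetical IT load, not from the report
cost_per_watt = 10.8             # e.g., the Paris/Amsterdam benchmark, US$ per watt
ai_premium_range = (0.07, 0.10)  # 7-10% liquid-cooled AI premium cited for U.S. builds

base_cost = facility_mw * 1e6 * cost_per_watt
low = base_cost * (1 + ai_premium_range[0])
high = base_cost * (1 + ai_premium_range[1])

print(f"air-cooled baseline: ${base_cost / 1e6:,.0f}M")                     # $540M
print(f"AI-optimized range:  ${low / 1e6:,.0f}M - ${high / 1e6:,.0f}M")     # $578M - $594M
```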
November 19, 2025 at 4:11 PM