Can Server Makers Escape Thin Margins in the AI Era?


In the rapidly evolving world of artificial intelligence (AI), a significant shift is underway, driven primarily by the emergence of advanced models such as OpenAI's latest release, o1. This new model not only surpasses its predecessor, GPT-4o, in reasoning capability but also ushers in a new era of computing demands. The introduction of chains of thought into o1's reasoning process allows it to break questions down while formulating answers, resulting in more reliable outputs. However, this leap in functionality comes with an increased need for computational power.

The enhancement in AI reasoning capability signals a drastic escalation in computational needs, as the model's architecture integrates reinforcement learning into the training of large language models (LLMs). As a consequence, not only does inference require more computational power; the training stages do as well.

Xu Zhijun, the Vice Chairman and rotating Chairman of Huawei, succinctly summarized the problem: "The greatest difficulty encountered in AI research is the lack of computing power—AI is fundamentally about brute-force calculation."

Reflecting this escalating need, technology companies have ramped up their investments in AI infrastructure in recent years. Notably, Nvidia has seen its stock price rise dramatically, and companies specializing in AI server hardware have reported significant revenue increases as they benefit from the demand for "AI shovels", the essential tools required to dig deeper into the AI landscape.

With the growing demand for AI computation and the decentralization of infrastructure, server manufacturers are poised to profit even more. The surge in these manufacturers' performance can be attributed to the tighter integration of servers with AI systems.

Companies engaged in AI training have adopted various methods to accelerate the training process, turning heterogeneous-computing AI servers into efficient engines for distributing AI workloads. Meanwhile, to cope with hardware shortages, server manufacturers have leveraged their experience operating large server clusters to build platforms for mixed training of large-scale models using GPUs from Nvidia, AMD, Huawei, Intel, and others.

As their understanding of AI deepens, from training to hardware optimization, these server manufacturers are expanding their roles in the AI value chain beyond mere hardware assembly. Many have begun to redesign the hardware infrastructure of their AI server clusters around the requirements of modern AI applications. Moreover, with growing engagement in the domestic chip market, customized solutions designed by local manufacturers are becoming widespread.

Advances are also visible in software, as server companies dig deeper into the productivity gains AI brings to infrastructure.

By launching large AI models and agents, these manufacturers have forged closer connections with AI application clients, enabling them to generate greater revenue from software solutions. In essence, the transformation brought about by the AI era has fundamentally altered the industry logic governing the supply of computing power.

As the industry experiences this "computing power drought," AI server companies are increasingly recognized as critical suppliers, providing the necessary resources to satisfy soaring demands.

The AI Industry: Profiting from "Selling Shovels"

Accelerated investment by major AI firms has favored the "shovel sellers": the AI server manufacturers, who have begun to see profits. Data from IT Juzi indicates that many AI-related publicly traded companies still reported losses as of September 1st: while 15 profitable AI companies recorded a combined net profit of 2.78 billion yuan, 19 companies suffered cumulative losses of 6.24 billion yuan.

AI has yet to yield overall profitability for the industry, as leading AI companies are still in an investment-heavy phase.


This year, reports show that three major AI companies, Baidu, Alibaba, and Tencent (often referred to as BAT), spent a combined 50 billion yuan on AI infrastructure in the first half of the year, more than doubling their 23 billion yuan investment of the previous year. Globally, Amazon has also entered a capital expansion cycle, increasing its fixed capital expenditure by 18% last quarter. Tech giants such as Microsoft, Google, and Meta have likewise reached a consensus on escalating their AI investments.

"The risks associated with insufficient AI investment far exceed those of overinvestment," comments Sundar Pichai, CEO of Google's parent company Alphabet. He has taken a clearly aggressive stance, dismissing concerns about a potential investment bubble. Meanwhile, AI infrastructure providers are capitalizing greatly on this momentum.

Among those profiting handsomely are established server manufacturers like HP and Dell, which are experiencing a revival in the AI era.

HP recently revealed that its server business grew 35.1% year on year. Dell followed with an impressive 80% year-on-year increase in its server and networking division. Lenovo reported that its infrastructure solutions business crossed 3 billion USD in quarterly revenue for the first time on growing AI demand, an increase of 65%. Furthermore, Inspur's interim report indicated a staggering 90.56% rise in net profit over the previous year, while Digital China reported net profit of 510 million yuan, up 17.5%, with its AI server division bringing in 560 million yuan, up 273.3%.

This astounding growth, exceeding 50%, is a testament to the widespread deployment of AI servers. Aside from cloud providers, telecom operators have emerged as major consumers of AI servers; since 2023, their demand has more than doubled. Additionally, the burgeoning requirement for intelligent computing centers has expedited AI server deployment.

According to Mingyang, head of Intel's AI chip division in China, over 50 government-led intelligent computing centers have been established in the past three years, with more than 60 additional projects in planning and development.

The robust demand for AI servers is redefining the growth dynamics of the entire server industry. A recent TrendForce report forecasts that the value of AI server shipments will reach 187 billion USD in 2024, a growth rate of 69%, in contrast to a mere 1.9% anticipated increase in general server shipments.

As cloud service providers (CSPs) gradually complete the construction of intelligent computing centers, demand for AI servers is poised to accelerate further, concurrently broadening the scope of edge computing applications. Subsequently, the sales landscape for AI servers will shift from bulk purchasing by CSPs to smaller-scale acquisitions for enterprise edge computing.

This evolving procurement model grants AI server manufacturers greater bargaining power and profitability, in sharp contrast to the lengthy return-on-investment cycles faced by their customers.

Looking at the computational leasing business model, industry calculations have highlighted the long payback periods stemming from the supporting infrastructure (storage, networking) of intelligent computing centers.

Even without factoring in the annual decline in computing prices, the payback period for an investment in GPUs like Nvidia's H100 can stretch to five years, while even the more cost-effective Nvidia RTX 4090 graphics card yields a payback period of over two years.
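
To make the arithmetic concrete, here is a minimal payback-period sketch. Every figure in it (server price, rental rate, utilization, operating costs) is a hypothetical placeholder, not a number from the article or any vendor price list.

```python
# Illustrative GPU-leasing payback calculator (all numbers hypothetical).

def payback_years(capex: float, hourly_rate: float, utilization: float,
                  annual_opex: float) -> float:
    """Years until cumulative rental income covers the up-front investment."""
    annual_revenue = hourly_rate * 24 * 365 * utilization
    annual_net = annual_revenue - annual_opex
    if annual_net <= 0:
        return float("inf")  # rental income never covers running costs
    return capex / annual_net

# Hypothetical example: an 8-GPU server, rented per GPU-hour, 60% utilized.
years = payback_years(capex=300_000,          # purchase price, USD
                      hourly_rate=2.0 * 8,    # 2 USD per GPU-hour x 8 GPUs
                      utilization=0.60,
                      annual_opex=40_000)     # power, space, staff
print(f"Payback: {years:.1f} years")
```

Lower utilization or falling rental prices push the result out quickly, which is how multi-year horizons like those above arise.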

Thus, aiding clients in maximizing the use of AI servers has become a core competitive focus for the entire server industry.

Acceleration and Stability: AI Server Providers Showcase Their Abilities

The process of deploying large models is intricate, involving advanced technological practices such as distributed parallel computing, computational scheduling, and large-scale networking, as summarized by Feng Liang, a Senior Product Manager at H3C Group. This highlights the dual challenges surrounding AI server deployment: optimization of computational resources and scalability.

A sales representative at Guangrong Zhixin shared insights on common client needs, which include hardware specifications, AI training support, and expansive clustering capabilities.

Optimization of computational power primarily relates to the challenges of heterogeneous computing within AI servers.

Current industry solutions focus on enhancing computational assignment and collaboration among diverse chip sets.

Shifting away from an isolated, CPU-only approach to tasks, contemporary AI servers operate on a collaborative model in which CPUs work alongside specialized hardware such as GPUs, NPUs, and TPUs. This division of labor is akin to Nvidia's CUDA approach, which aims to maximize overall computing throughput by coordinating multiple hardware resources.

(Figure: The Principle Behind Heterogeneous Computing Power Distribution)
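
As a rough illustration of this division of labor, the sketch below routes task types to the device class best suited to them. The device names and the routing rule are assumptions made for illustration, not any vendor's actual scheduler.

```python
# Minimal sketch of heterogeneous compute dispatch (illustrative only).

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    kind: str  # e.g. "matmul", "preprocess", "inference"

# Route each task kind to the accelerator best suited to it, with the CPU
# acting as coordinator and fallback (hypothetical mapping).
ROUTING = {
    "matmul": "GPU",       # dense linear algebra for training
    "inference": "NPU",    # low-latency serving
    "preprocess": "CPU",   # data loading and transformation
}

def dispatch(tasks: list[Task]) -> dict[str, list[str]]:
    """Group tasks by the device class they should run on."""
    plan: dict[str, list[str]] = {}
    for t in tasks:
        device = ROUTING.get(t.kind, "CPU")  # unknown kinds fall back to CPU
        plan.setdefault(device, []).append(t.name)
    return plan

print(dispatch([Task("embed", "matmul"), Task("tokenize", "preprocess"),
                Task("serve", "inference")]))
```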

As a result, the structural evolution of AI servers reflects their capability to stack various hardware components like building blocks. AI servers are now considerably larger, growing from the once-standard 1U configuration to the now-common 4U and 7U formats.

To further optimize computational effectiveness, many manufacturers have put forth innovative solutions.

For example, H3C's Aofei computing platform allows granular control over the distribution of compute and video memory, enabling on-demand scheduling. Lenovo's Wanquan heterogeneous computing platform autonomously identifies AI scenarios, algorithms, and computational clusters based on a knowledge base: clients only need to provide the data and context, and the platform automatically loads optimal algorithms and orchestrates the best cluster configurations.

The second challenge is collaboration among servers built on different hardware. With Nvidia GPUs in short supply, many intelligent computing hubs mix AMD, Huawei, and Intel GPUs for combined training of AI models. Such mixtures create potential complications in communication efficiency, interconnectivity, and collaborative scheduling.

Typically, the process of training AI models across a server cluster is cyclical.

A task is segmented across all computational units, and the results are consolidated for subsequent calculations. If coordination falters, for example when a slower GPU lags behind, the entire cluster is forced to pause until synchronization is restored, which can severely prolong training.
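
The toy simulation below illustrates this straggler effect: in synchronous training, a step takes as long as the slowest worker, so one degraded GPU stalls the whole cluster. The timings are invented for illustration.

```python
# Toy simulation of the synchronous-training straggler effect.

import random

def step_time(worker_times: list[float]) -> float:
    # Every worker must finish before gradients are merged, so a step
    # takes as long as the slowest participant.
    return max(worker_times)

random.seed(0)
healthy = [random.uniform(0.9, 1.1) for _ in range(8)]  # seconds per step
print(f"Healthy cluster step time: {step_time(healthy):.2f}s")

# One degraded GPU running at a third of its normal speed stalls everyone.
degraded = healthy[:-1] + [healthy[-1] * 3]
print(f"With one straggler:        {step_time(degraded):.2f}s")
```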

To address these hurdles, leading solutions now automate the training process through advanced cloud management systems encompassing scheduling, Platform as a Service (PaaS), and Model as a Service (MaaS). H3C offers a heterogeneous resource management platform that implements a unified communication library to mask the discrepancies between chips from different manufacturers, while Baidu's Baige platform enables multi-chip hybrid training by integrating a broad range of chips into large clusters that can support extensive tasks.
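
A unified communication library can be pictured as one collective-operations API with vendor-specific backends behind it. The sketch below is a hypothetical interface of my own construction; real platforms wrap vendor libraries such as NCCL or HCCL, whose APIs are not reproduced here.

```python
# Hypothetical unified communication layer over vendor collectives.

from abc import ABC, abstractmethod

class Collective(ABC):
    """One gradient-exchange API, regardless of the chip underneath."""
    @abstractmethod
    def all_reduce(self, shards: list[list[float]]) -> list[float]: ...

class ReferenceCollective(Collective):
    """Pure-Python stand-in: element-wise sum across worker shards."""
    def all_reduce(self, shards: list[list[float]]) -> list[float]:
        return [sum(values) for values in zip(*shards)]

def collective_for(vendor: str) -> Collective:
    # A production platform would dispatch to NCCL, HCCL, oneCCL, and so on;
    # in this sketch every vendor resolves to the reference implementation.
    return ReferenceCollective()

# Training code calls one API; vendor differences are hidden beneath it.
comm = collective_for("nvidia")
print(comm.all_reduce([[0.1, 0.2], [0.3, 0.4]]))  # element-wise sums
```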

Broadly similar solutions share the goal articulated by Xia Lixue, co-founder and CEO of Wuwen Xinqiong: "Before turning on the tap, we do not need to know from which river the water is sourced."

Resolving heterogeneous computing issues grants immense flexibility in the choice of hardware components in intelligent computing clusters, driving synergy among server manufacturers, computational chip producers, and AI infrastructure firms as they collaboratively sustain the stability of large-scale AI server setups.

Meta's experience with large computational clusters is a useful reference for how fraught the training of large AI models can be.

Studies indicate that during synchronous training on a 16K-GPU H100 cluster, Meta encountered 466 job abnormalities over 54 days. The mainstream route to stability after such issues is to build a "firewall" into the training process.

For instance, Lenovo's approach uses AI modeling to forecast training faults and make adjustments before interruptions occur. Both Super Fusion and Huawei Ascend, on the other hand, have adopted direct countermeasures, promptly isolating faulty nodes and resuming training from the latest checkpoint.
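
The isolate-and-resume pattern can be sketched as a training loop with periodic checkpoints: after a fault, only the steps since the last checkpoint are repeated. The paths, intervals, and simulated fault below are assumptions for illustration, not any vendor's implementation.

```python
# Minimal checkpoint-and-resume loop (illustrative; the fault is simulated).

import json, os, random

CKPT = "checkpoint.json"

def save(step: int, state: dict) -> None:
    with open(CKPT, "w") as f:
        json.dump({"step": step, "state": state}, f)

def load() -> tuple[int, dict]:
    if os.path.exists(CKPT):
        with open(CKPT) as f:
            data = json.load(f)
        return data["step"], data["state"]
    return 0, {"loss": 1.0}  # fresh start

def train(total_steps: int = 100, ckpt_every: int = 10) -> None:
    step, state = load()  # resume from the latest checkpoint, if any
    while step < total_steps:
        if random.random() < 0.02:
            raise RuntimeError(f"simulated node fault at step {step}")
        state["loss"] *= 0.99  # stand-in for one real training step
        step += 1
        if step % ckpt_every == 0:
            save(step, state)

random.seed(1)
for attempt in range(5):
    try:
        train()
        print("training complete")
        break
    except RuntimeError as e:
        print(f"{e}; isolating node and resuming from checkpoint")
```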

Overall, as AI server manufacturers deepen their understanding of AI and keep improving optimization and stability, they enhance their value proposition in the marketplace.

Leveraging the transformative powers of AI, players in the server industry are revitalizing what has traditionally been a classic business-to-business (B2B) domain, generating newfound worth in their offerings.

Does AI Enhance the Value of Server Manufacturers?

Reflecting on historical trajectories, server manufacturers have consistently found themselves positioned in the middle of the "smile curve."

After the Third Industrial Revolution, burgeoning market demand led to an influx of server manufacturers aiming to capitalize on the expanding market.

In the era of personal computing, the x86 architecture of the Wintel Alliance facilitated the rise of major server corporations such as Dell and HP.

As cloud computing demand burgeoned, OEM manufacturers such as Inspur and Foxconn emerged to meet it.

However, despite annual revenues soaring into the hundreds of billions, these manufacturers have wrestled with persistently low net margins, often stuck in single digits. Even under Inspur's Joint Design and Manufacturing (JDM) model, built for total production efficiency, gross margins have run as low as 1-2%.

"The cause behind the smile curve phenomenon is not rooted in production challenges but rather arises from a lack of control over proprietary technologies within the industry itselfWithout critical patents and technologies, they can only standardize production, which diminishes the uniqueness of their offerings," explained an analyst from Guotai Junan's electronics division.

In the age of AI, the value of server manufacturers is being redefined as they adapt to novel applications within the computational landscape.

The ability to vertically integrate AI solutions is at the core of the competition among contemporary manufacturers.

As hardware development intensifies, many server manufacturers are actively involved in the construction of intelligent computing centers.

For example, to improve Power Usage Effectiveness (PUE), companies including H3C, Inspur, Super Fusion, and Lenovo have rolled out liquid cooling solutions. Beyond releasing specialized equipment such as silicon photonic switches to cut data center energy consumption, H3C has also optimized its entire network product suite with AI technology. Similarly, companies like Digital China and Lenovo are pushing the integration of domestic chip technologies, striving for breakthroughs in China's chip industry.
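
PUE is simply total facility energy divided by the energy delivered to IT equipment, so cutting cooling overhead pushes the ratio toward the ideal of 1.0. The numbers in the sketch below are hypothetical.

```python
# PUE arithmetic: total facility power over power reaching IT equipment.

def pue(it_kw: float, cooling_kw: float, other_kw: float) -> float:
    return (it_kw + cooling_kw + other_kw) / it_kw

# Hypothetical air-cooled hall: 1000 kW IT load, 450 kW cooling, 50 kW other.
print(f"Air cooled:    PUE = {pue(1000, 450, 50):.2f}")  # 1.50
# Liquid cooling cuts cooling overhead sharply for the same IT load.
print(f"Liquid cooled: PUE = {pue(1000, 100, 50):.2f}")  # 1.15
```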

Additionally, manufacturers are exploring AI's productivity attributes, moving beyond the mere provision of hardware.

One common approach is the development of AI-enabling platforms: Digital China has integrated model and computing-power management, industry-specific knowledge bases, and AI application frameworks into its Shenzhou Wenshi platform, making server usage increasingly efficient for end users.

Digital China's Vice President, Li Gang, emphasized the need for a platform that can embed verified, industry-specific knowledge structures while continuously evolving to incorporate new frameworks, which illustrates the value embedded in the company's AI application engineering platform.

H3C is leveraging its existing strengths in network products, applying AI-generated insights to processes such as fault detection and prediction. It has also launched its Baiye Lingxi AI model in an effort to spread AI model applications across sectors and expand its original B2B hardware business.

Through relentless technological innovation and ongoing product development, these companies are pursuing fresh breakthroughs amid the AI tide, unveiling new potential within AI infrastructure.
