HBM 4, about to be completed.

Recently, the JEDEC Solid State Technology Association announced that HBM4, the next version of the highly anticipated High Bandwidth Memory (HBM) DRAM standard, is nearing completion.

According to JEDEC, HBM4 is an evolutionary step beyond the current HBM3 standard, designed to further improve data-processing rates while maintaining fundamental features such as higher bandwidth, lower power consumption, and greater capacity per die and/or stack. These advances are crucial for applications that must efficiently process large datasets and complex computations, including generative artificial intelligence (AI), high-performance computing, high-end graphics cards, and servers.

Compared with HBM3, HBM4 plans to double the number of channels per stack, which also gives it a larger physical footprint. To support device compatibility, the standard ensures that a single controller can work with both HBM3 and HBM4 when needed; different configurations will require different interposers to accommodate the differing footprints. HBM4 will specify 24 Gb and 32 Gb die densities, with the option to support 4-high, 8-high, 12-high, and 16-high TSV stacks.
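
Those two parameters, die density and stack height, fix the per-stack capacity. A minimal Python sketch of the arithmetic (our own illustration, not part of the JEDEC announcement):

```python
# Per-stack HBM4 capacity from die density and stack height.
# The 24 Gb / 32 Gb densities and 4/8/12/16-high options come from
# the JEDEC description above; the arithmetic is our own illustration.

GBIT_PER_GB = 8  # 8 gigabits per gigabyte

def stack_capacity_gb(die_density_gbit: int, stack_height: int) -> float:
    """Capacity of one HBM stack in gigabytes."""
    return die_density_gbit * stack_height / GBIT_PER_GB

for density in (24, 32):
    for height in (4, 8, 12, 16):
        print(f"{density} Gb x {height}-high = "
              f"{stack_capacity_gb(density, height):.0f} GB per stack")
```

A 24 Gb die at 12-high yields the 36 GB per stack that TrendForce quotes below, and 32 Gb at 16-high reaches 64 GB, the top of the HBM4E range mentioned later.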

JEDEC noted that the committee has reached preliminary agreement on a speed grade of up to 6.4 Gbps and is currently discussing higher frequencies.

What's new in HBM4?

High Bandwidth Memory has existed for about ten years, and over that time its speed has risen steadily, with data-transfer rates climbing from 1 GT/s for the original HBM to 9 GT/s for HBM3E. That impressive leap in bandwidth in under a decade has made HBM a cornerstone of the HPC accelerators brought to market since.

However, these transfer rates are becoming increasingly difficult to sustain, especially without changes to the basic physical properties of the DRAM cells. For HBM4, therefore, the major memory manufacturers behind the specification are planning more substantial changes to the technology, starting with a wider, 2048-bit memory interface.

HBM4 will widen the memory-stack interface from 1024 bits to 2048 bits, one of the most significant changes to the HBM specification since this type of memory was introduced eight years ago. Doubling the number of I/O pins while keeping a similar physical footprint is extremely challenging for memory manufacturers, SoC developers, foundries, and outsourced assembly and test (OSAT) companies. According to the plan, this will let HBM4 make significant technological leaps on several levels. Within the DRAM stack, the 2048-bit interface will require a substantial increase in the number of through-silicon vias routed through the stack wiring. At the external chip interface, the bump pitch will have to shrink below 55 micrometers while the total number of micro-bumps rises well beyond HBM3's roughly 3,982 bumps.
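
The payoff of the wider interface is simple arithmetic: peak per-stack bandwidth is interface width times per-pin data rate. A quick sketch (our own calculation, using the 9 GT/s HBM3E rate and the preliminary 6.4 Gbps HBM4 speed grade mentioned above):

```python
def stack_bandwidth_tbs(bus_width_bits: int, pin_rate_gbps: float) -> float:
    """Peak per-stack bandwidth in TB/s: width * rate / 8 bits-per-byte / 1000."""
    return bus_width_bits * pin_rate_gbps / 8 / 1000

# HBM3E-class: 1024-bit interface at up to 9 Gb/s per pin
print(f"HBM3E: {stack_bandwidth_tbs(1024, 9.0):.2f} TB/s")  # ~1.15 TB/s
# HBM4 (preliminary): 2048-bit interface at 6.4 Gb/s per pin
print(f"HBM4:  {stack_bandwidth_tbs(2048, 6.4):.2f} TB/s")  # ~1.64 TB/s
```

Even at a lower per-pin rate, the doubled width pushes per-stack bandwidth well past HBM3E.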

Memory manufacturers have also stated that they will stack up to 16 memory dies in a single module, known as a 16-Hi stack, which adds some complexity to the technology. (HBM3 technically supports 16-Hi stacking too, but so far no manufacturer has actually used it.) This will allow memory suppliers to significantly increase the capacity of their HBM stacks, but it brings new challenges, such as bonding a larger number of DRAM dies without defects while keeping the finished stack appropriately and consistently low. All of this, in turn, requires closer cooperation among chipmakers, memory makers, and packaging companies to ensure everything goes smoothly.

However, as the number of stacked DRAM dies grows, some have pointed out that packaging technology is running into its limits.

Existing HBM uses thermocompression (TC) bonding, which creates TSV channels in the DRAM and makes electrical connections through micro-bumps, small solder protrusions. The specific methods of Samsung Electronics and SK Hynix differ, but both rely on bumps.

Initially, customers asked for DRAM stacked up to 16 layers within a final HBM4 package thickness of 720 micrometers, the same as the previous generation. The general view is that a 16-layer DRAM stack is effectively impossible at 720 micrometers with existing bonding. The industry's alternative of choice is therefore hybrid bonding, a technology that directly bonds the copper wiring between chips and wafers. Since no bumps are used between the DRAM dies, the package thickness is easier to reduce.

However, according to a Korean media report in March, the companies involved decided in those discussions to relax the package-thickness standard to 775 micrometers, thicker than the previous generation's 720 micrometers. The main participants in JEDEC also agreed to set the standard for HBM4 products at 775 micrometers. With the limit relaxed to 775 micrometers, a 16-layer DRAM stack becomes entirely feasible even with existing bonding technology, and considering the huge investment cost of hybrid bonding, memory companies are likely to focus on upgrading existing bonding instead.
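
The thickness constraint becomes concrete with a naive split of the package height across the stack (our own illustration; real stacks also contain a base logic die and mold cap, so the true per-layer budget is tighter):

```python
# Naive per-layer thickness budget for a 16-high DRAM stack.
# Ignores the base logic die and mold cap, which only tighten the budget.

def per_layer_budget_um(package_um: float, layers: int) -> float:
    """Average thickness available per DRAM die plus its bond line."""
    return package_um / layers

for pkg_um in (720, 775):
    print(f"{pkg_um} um package / 16 layers = "
          f"{per_layer_budget_um(pkg_um, 16):.1f} um per layer")
```

The relaxation buys roughly 3.4 extra micrometers per layer (45.0 to 48.4 um), which is what makes bump-based bonding plausible again for 16-high stacks.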

According to the roadmap TrendForce shared at the end of last year, the first HBM4 samples are expected to offer capacities of up to 36 GB per stack, with the full specification expected from JEDEC around the second half of 2024-2025. The first customer samples and deliveries are expected in 2026, so there is still a long way to go before the new high-bandwidth memory solutions enter service.

The latest moves of the three giants

At present there are three major players in this market, SK Hynix, Samsung, and Micron, and all three are competing on HBM4 as well. Starting with SK Hynix: at an industry event in May, the company indicated that it may be the first to introduce next-generation HBM4, in 2025. SK Hynix plans to build the HBM4 base die on TSMC's advanced logic process in order to pack additional functions into a limited space, helping SK Hynix customize HBM to meet a wider range of performance and energy-efficiency requirements.

At the same time, SK Hynix said the two companies also plan to work together to optimize the combination of its HBM with TSMC's Chip-on-Wafer-on-Substrate (CoWoS) packaging technology, and to meet customers' HBM needs.

In SK Hynix's view, its HBM products offer the best speed and performance in the industry. In particular, the company says its unique MR-MUF technology provides the most stable heat dissipation at high performance, underpinning what it calls world-leading performance. SK Hynix claims that products built with Mass Reflow Molded Underfill (MR-MUF) are 60% stronger than those made with thermal-compression non-conductive film (TC-NCF). The company adds that it can mass-produce high-quality products quickly and that its responsiveness to customer needs is second to none; this combination of competitive advantages, it says, makes its HBM stand out among the industry leaders.

Specifically on the DRAM side, SK Hynix reportedly plans to use 1b DRAM for HBM4 and to move to 1c DRAM from HBM4E, though it is understood to retain the flexibility to change process nodes according to market conditions.

As for Samsung, the follower in this race, it too is arming itself fully.

Samsung Electronics has established a new "HBM Development Team" within its Device Solutions (DS) division to enhance its competitiveness in high-bandwidth memory (HBM) technology. This strategic move was taken more than a month after Vice Chairman Kyung-Hyun Kyung took office as the head of the DS division, reflecting the company's determination to maintain a leading position in the rapidly developing semiconductor market.

The newly established HBM development team will focus on advancing HBM3, HBM3E, and next-generation HBM4 technology, with the aim of meeting the surging demand for high-performance memory driven by the expanding artificial intelligence (AI) market. Earlier this year, Samsung had already set up a task force (TF) to strengthen its HBM competitiveness; the new team will consolidate and build on those existing efforts.

Samsung Electronics also emphasized that it will strengthen its custom services for the sixth-generation high-bandwidth memory (HBM4) scheduled to be released next year.

Choi Jang-seok, vice president of the new business planning group in the company's memory business division, said: "Compared with HBM3, HBM4's performance has improved significantly," adding: "We are expanding capacity to 48 GB (gigabytes) and developing toward a mass-production target of next year." Samsung Electronics applied the MOSFET process to HBM3E and is actively considering applying the FinFET process from HBM4 onward. Compared with the MOSFET version, it says, HBM4's speed would increase by 200%, its area would shrink by 70%, and its performance would improve by more than 50%. This is the first time Samsung Electronics has publicly disclosed HBM4 specifications.

Vice President Choi said, "There will be significant changes in the HBM architecture. Many customers are targeting customized optimization rather than the existing general-purpose parts." He added, "For example, demand for 3D stacking of HBM DRAM on customized logic dies has grown significantly. Generic HBM's interposer and large number of input/output (I/O) connections constrain performance; removing them could eliminate the barriers to performance scaling," he explained.

He continued, "HBM cannot neglect performance and capacity, nor can it neglect power consumption and thermal efficiency. To that end, 16-layer HBM4 will use not only NCF (non-conductive film) assembly but also various cutting-edge packaging technologies such as HCB (hybrid bonding), as well as new processes. It is crucial to implement these various new technologies correctly, and Samsung is preparing as planned," he added.

There are reports that Samsung Electronics has recently drawn up an internal plan to switch the DRAM in HBM4 from the originally planned 1b node to 1c, and to pull the mass-production target forward from the end of next year to mid-to-late next year; the rumor remains unconfirmed, however, since yields must first be proven.

Micron, the other HBM player, is expected to launch 12-high and 16-high HBM4 with capacities of 36 GB to 48 GB and speeds above 1.5 TB/s between 2025 and 2026. According to Micron, HBM4 will be followed by HBM4E in 2028. The extended version is expected to reach higher clock frequencies, pushing bandwidth to 2+ TB/s and capacity to 48 GB to 64 GB per stack.
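
Working backwards from those bandwidth targets (our own arithmetic, assuming the 2048-bit HBM4 interface discussed earlier), the implied per-pin rates are modest next to HBM3E's 9 GT/s:

```python
def required_pin_rate_gbps(bandwidth_tbs: float, bus_width_bits: int) -> float:
    """Per-pin data rate (Gb/s) needed to reach a per-stack bandwidth target."""
    return bandwidth_tbs * 1000 * 8 / bus_width_bits

print(f"1.5 TB/s needs {required_pin_rate_gbps(1.5, 2048):.2f} Gb/s per pin")  # ~5.86
print(f"2.0 TB/s needs {required_pin_rate_gbps(2.0, 2048):.2f} Gb/s per pin")  # ~7.81
```

In other words, the doubled interface width does most of the work; the pin speeds required sit comfortably within rates already proven on HBM3E.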

Accelerating high-bandwidth memory to the speed of light

HBM emerged to give GPUs and other processors more memory than the standard x86 socket interface can support. But GPUs keep growing more powerful and need to pull data from memory ever faster to shorten application processing times. Large language models (LLMs), for example, may involve repeatedly accessing tens of billions or even trillions of parameters during machine-learning training runs that take hours or days to complete.
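
A rough sense of the scale (purely hypothetical numbers of our own, using the per-stack bandwidths sketched earlier):

```python
# Hypothetical sizing: stream a large model's weights once from HBM.
params = 1e12           # assumed 1-trillion-parameter model
bytes_per_param = 2     # FP16 weights
model_bytes = params * bytes_per_param  # 2 TB of weights

for bw_tbs in (1.15, 1.64):  # roughly HBM3E-class vs HBM4-class per stack
    seconds = model_bytes / (bw_tbs * 1e12)
    print(f"At {bw_tbs} TB/s per stack: {seconds:.2f} s per full weight pass")
```

Training repeats such passes constantly, so per-stack bandwidth compounds directly into end-to-end training time, which is exactly the pressure driving the efforts below.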

Current HBM follows a fairly standard design: the HBM memory stack connects through microbumps to an interposer sitting on the base package layer, and the microbumps link to the through-silicon vias (TSVs, or connection holes) in the HBM stack. The processor is mounted on the same interposer, which provides the connection from HBM to processor. HBM suppliers and the HBM standards body are researching technologies such as photonics, or mounting HBM directly on the processor die, to speed up access from HBM to the processor. Suppliers are pushing HBM bandwidth and capacity forward seemingly faster than the JEDEC standards organization can keep up.

Samsung is researching the use of photonics in the interposer, where photons travel over the link faster than bits encoded as electrons and consume less power. Photonic links can operate on femtosecond timescales; a femtosecond is 10⁻¹⁵ seconds, one quadrillionth (a millionth of a billionth) of a second.

According to South Korean media reports, SK Hynix is also researching a direct HBM-logic connection. The concept involves manufacturing GPU dies together with HBM dies as a mixed-purpose semiconductor. The memory maker views this as an HBM4 technology and is negotiating with Nvidia and other logic-chip suppliers. The idea is for memory and logic manufacturers to design the chip jointly, with a foundry such as TSMC then manufacturing it.

This is somewhat similar to the concept of Processing-in-Memory (PIM), which would be proprietary and have the prospect of vendor lock-in unless protected by industry standards.

Unlike Samsung and SK Hynix, Micron has not talked about integrating HBM and logic into a single die. Combined HBM-GPU chips would promise GPU suppliers (AMD, Intel, and Nvidia) faster memory access, but those suppliers will be keenly aware of the dangers of proprietary lock-in and single-sourcing.

As ML training models grow larger and training runs longer, the pressure to cut run times by accelerating memory access and increasing per-GPU memory capacity will grow in step. Abandoning the competitive supply advantage of standardized DRAM in favor of a locked-in combined HBM-GPU chip design, however much better its speed and capacity, may not be the right way forward.
