Russia’s Elbrus 8CB Microarchitecture: 8-core VLIW on TSMC 28nm
by Dr. Ian Cutress on June 1, 2020 8:00 AM EST
All of the world’s major superpowers have a vested interest in building their own custom silicon processors. Doing so allows a superpower to wean itself off US-based processors, guarantee there are no supplemental backdoors, and, if needed, add its own. As we have seen with China, custom chip designs, x86-based joint ventures, or Arm derivatives seem to be the order of the day. So in comes Russia, with its custom Elbrus VLIW design that seems to have its roots in SPARC.
Russia has been creating processors called Elbrus for a number of years now. For those of us outside Russia, what is actually under the hood has mostly been a big question mark: these chips are built for custom servers and office PCs, often at the direction of the Russian government and its requirements. We have had glimpses of the design, thanks to documents from Russian supercomputing events, but these are a few years old now. If you are not in Russia, you are unlikely to ever get your hands on one at any rate. However, a new programming guide for the latest Elbrus-8CB processor designs, listed online, recently came to our attention.
The latest Elbrus-8CB chip, as detailed in the new online programming guide published this week, is built on TSMC’s 28nm process and is a 333 mm² design featuring 8 cores at 1.5 GHz. According to the documents, peak throughput is 576 GFLOPs of single precision, with the chip offering four channels of DDR4-2400, good for 68.3 GB/s. The L1 and L2 caches are private, with a 64 kB L1-D cache, a 128 kB L1-I cache, and a 512 kB L2 cache. The L3 cache is shared between the cores, at 2 MB/core for a total of 16 MB. The processor also supports 4-way server multiprocessor combinations, although the guide does not say over what protocol or at what bandwidth.
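As a rough sanity check, the headline figures can be broken down per core. The short Python sketch below is illustrative only: the constant and function names are ours, not the programming guide's, and the bandwidth calculation assumes a standard 64-bit (8-byte) data bus per DDR4 channel, which the figures quoted here do not confirm.

```python
# Figures quoted in the article for the Elbrus-8CB.
CORES = 8
CLOCK_GHZ = 1.5
PEAK_SP_GFLOPS = 576.0
L3_PER_CORE_MB = 2
DDR4_CHANNELS = 4
DDR4_MTS = 2400            # DDR4-2400: mega-transfers per second per channel
BYTES_PER_TRANSFER = 8     # ASSUMPTION: 64-bit data bus per channel


def flops_per_cycle_per_core(peak_gflops, cores, clock_ghz):
    """Peak floating-point operations each core must retire per clock cycle."""
    return peak_gflops / (cores * clock_ghz)


def theoretical_bandwidth_gbs(channels, mts, bytes_per_transfer):
    """Theoretical peak DRAM bandwidth in GB/s (1 GB = 10^9 bytes)."""
    return channels * mts * bytes_per_transfer / 1000.0


sp_per_cycle = flops_per_cycle_per_core(PEAK_SP_GFLOPS, CORES, CLOCK_GHZ)
l3_total_mb = CORES * L3_PER_CORE_MB
bandwidth = theoretical_bandwidth_gbs(DDR4_CHANNELS, DDR4_MTS, BYTES_PER_TRANSFER)

print(f"Single-precision FLOPs per core per cycle: {sp_per_cycle:.0f}")
print(f"Total L3 cache: {l3_total_mb} MB")
print(f"Theoretical DDR4 bandwidth: {bandwidth:.1f} GB/s")
```

Under these assumptions, 576 GFLOPs works out to 48 single-precision operations per core per cycle, and the per-core L3 slices add up to the stated 16 MB. The theoretical 76.8 GB/s is higher than the 68.3 GB/s quoted, so the quoted number presumably reflects a different accounting that the source does not spell out.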
It is a compiler-focused design, much like some other complex chips, in that most of the optimizations happen at the compiler level. Judging by compiler-first designs of the past, that typically does not make for a successful product. Documents from 2015 state that a continuing goal of the Elbrus design is x86 and x86-64 binary translation with only a 20% overhead, allowing full support for x86 code as well as x86 operating systems, including Windows 7 (this may have been updated since 2015).
The core has six execution ports, with many ports being multi-capable. For example, four of the ports can be load ports, and two of the ports can be store ports, but all of them can do integer operations and most can do floating point operations. Four of the ports can do comparison operations, and those four ports can also do vector compute.
This short news post is not meant to be a complete breakdown of the Elbrus capabilities. We have joked internally about the frequency at which a Cortex X1 with x86 translation would match the capabilities of the 8-core Elbrus. However, users who want to get to grips with the design can open and read the documentation at the following address:
http://ftp.altlinux.org/pub/people/mike/elbrus/docs/elbrus_prog/html/index.html
The bigger question is how likely any of these state-funded processor development projects are to succeed at scale. State-funded groups should, theoretically, be the best funded; however, even with all the money in the world, engineers are still required to get things done. Even if there ends up being a new super-CPU for a given superpower, there will always be vested interests in a degree of security through obscurity, especially if the hardware is designed specifically to cater to state-secret levels of compute. There is also the added complication of the US government tightening the screws on TSMC and ASML so that they do not accept orders from specific companies. Those restrictions could be expanded, depending on how good the products are or how threatened some of the nations involved feel.
Source: Blu (Twitter)
93 Comments
mode_13h - Tuesday, June 2, 2020
You're not going to change any minds. Please, just let it be.
bagamut - Tuesday, June 2, 2020
You all keep falling for the same old Cold War propaganda tricks. Previously that was "Evil Communism". There is no communism now, but like an old dog you know only old tricks.
Eventually, the walls collapsing now in US...
mode_13h - Tuesday, June 2, 2020
Leaving geopolitics aside, thanks for your posts. Congrats on a cool chip!
We're all geeks, here; not politicians. If we just keep that in mind, I think we'll be fine.
: )
bagamut - Tuesday, June 2, 2020
Would be better if you write an article instead of this one. :)
Actually this one is crap.
EntityFX - Monday, June 1, 2020
Maybe you will be interested in this: https://translate.google.ru/translate?sl=ru&tl...
Warning: autotranslated text: RUS -> ENG.
AlB80 - Monday, June 1, 2020
> Peak throughput according to the documents states 576 GFLOPs of double precision.
288 or single precision.
> Elbrus 8CB Core
It's not a core scheme. It's an instruction format. Very-VLIW.
bagamut - Tuesday, June 2, 2020
Article is quite misleading.
This Elbrus has nothing to do with SPARC. Core has 20+ execution ports, not 6. It is not "new" CPU, it's CPU with long history. It is state funded cuz previously that was military project. That is mostly number crunching datacenter CPU, but it is more or less good enough to run desktop PC. TSMC produced a big enough batch to cover all needs for secure state computing. It's not planned to install this CPU in every smartphone.
Looks like that was written in 5 min without any clue.
abufrejoval - Tuesday, June 2, 2020
Honestly, I am quite impressed by the cleverness of their approach.
What people here may fail to appreciate is the fact that Elbrus was designed to solve a specific problem, not to kick Intel (or AMD) out of Amazon’s data centers. That’s a goal (and one of many intermediate others) that China is also covering, but Elbrus was never designed for commercial success. It’s a military asset not designed for fuel economy or to win at Le Mans.
Russia needs the ability to *functionally* run any software, be it Western or their own, preferably even if it was ‘government class’ malicious. It’s about foreign closed source binaries they really need to run, but where they need to protect against that being infected right from the source (Snowden showed what Russia knew). In that sense it resembles China’s Jintide use case (HotChips 2019).
Russia also needs the ability to run their own mission critical software, with very little fear of any foreign malware infestation. It’s similar to Google’s trusted root investments, but with a very limited number of users and a slightly smaller budget.
The underlying assumption is that Russia will usually be able to source sufficient Western IT for their own civil consumption and since they aren’t producing billions of mobile phones and other IT for export, they don’t have Huawei problems. They just need these to protect the very same domestic assets they have (tried or succeeded) to cleverly weaponize with potential enemies abroad.
And they solve these two distinct use cases with the same hardware (economy!), which is costing them several orders of magnitude more than any high-volume ARM or x86 product (even z/Arch is probably cheap by comparison) per given unit of compute. But they are more than willing to pay for a fleet of IT tanks, than be caught with their pants, Internet, government, power- and water-supplies and military down.
It’s just very naïve to judge an architecture without appreciating the constraints and conditions it was designed for. And then you should ask if your country or region actually has those same abilities and facilities.
abufrejoval - Tuesday, June 2, 2020
so actually it's three use cases:
1. forensics
2. executing critical foreign and potentially infected code in a hardware sandbox, that eliminates or avoids the infection vector (think SCADA/Stuxnet)
3. Trusted compute for their own mission critical code
And if you then see that they achieve 100% software compatibility with x86 code with hardware armoring and speeds much better than any FPGA or emulator, that sure impresses me (do they microprogram AVXn code 360 style, I wonder).
I grew up in Berlin during the Cold War and I remember looking into those tank guns at Allied parades: Made me appreciate certain things.
mode_13h - Tuesday, June 2, 2020
Interesting perspectives. Thanks for sharing.