Apple S1 Analysis

One of the biggest issues with the smartwatch trend is that, because most companies entered the market with smartphone backgrounds, we tend to see a lot of OEMs trying to shove smartphone parts into a smartwatch form factor. There have been a lot of different Android Wear watches, but for the most part everything seems to use Qualcomm’s Snapdragon 400 without the modem. Even though the Snapdragon 400’s Cortex A7 cores are relatively low power for a smartphone, they’re probably close to the edge of what is acceptable in terms of TDP for a smartwatch. Given that pretty much every Android Wear watch pairs a relatively large PCB with a roughly 400 mAh battery at a 3.8 or 3.85 volt chemistry to reach 1-2 days of battery life, the end result is that these smartwatches are simply too big for a significant segment of the market. In order to make a smartwatch that can scale down to sizes small enough to cover most of the market, it’s necessary to design an SoC specifically targeted at the smartwatch form factor.


Capped Apple S1 SoC (Image Courtesy iFixit)

The real question here is what Apple has done. As alluded to in the introduction, it turns out the answer is quite a bit. However, this SoC is basically a complete mystery; there’s not much in the way of proper benchmarking tools or anything else that can be run on the Watch to dig deeper. Based on teardowns, this SoC is fabricated on Samsung’s 28nm LP process, although it’s not clear which flavor of LP is used. It’s pretty easy to eliminate the high power processes, so it’s really just a toss-up between an HKMG and a poly/SiON gate stack. For those unfamiliar with these terms, the main difference that results from this choice is power efficiency, as an HKMG process has less leakage power. Given how little cost is involved in this choice of process compared to a move to a 20/14nm process, it’s probably a safe bet that Apple is using an HKMG process here, especially when we look at how dramatically the move from 28LP to 28HPm at TSMC affected battery life in the case of SoCs like Snapdragon 600 and 800.


Decapped & Labeled S1 SoC (Image Courtesy ABI Research)

We also know that binaries compiled for the watch target ARMv7k, which is effectively an undocumented ISA. We know that watchOS is built on iOS/Darwin, which means a memory management unit (MMU) is necessary to provide memory protection and key abstractions like virtual memory. This rules out MCU ISAs like ARMv7-M, even if adding an MMU to such an architecture were possible, so it’s likely that we’re looking at some derivative of ARMv7-A, possibly with some unnecessary instructions stripped out to try and improve power consumption.

The GPU isn’t nearly as much of a mystery here. Given the PowerVR drivers present in the Apple Watch, it’s fairly conclusive that the S1 uses some kind of PowerVR Series 5 GPU. However, exactly which Series 5 GPU is up for debate. There are reasons to believe it may be a PowerVR SGX543MP1, but I suspect it is in fact PowerVR's GX5300, a specialized wearables GPU from the same family as the SGX543 that would use a very similar driver. Most likely, dedicated competitive intelligence firms (e.g. Chipworks) know the answer, though it's admittedly also the kind of information we'd expect them to hold on to and sell to clients as part of their day-to-day business.

In any case, given that native applications won’t arrive until watchOS 2 is released, I don’t think we’ll be able to do much in the way of extensive digging into what’s going on here; I suspect graphics benchmarks will be rare even after watchOS 2 launches.

Meanwhile, after a lot of work and even more research, we're finally able to start shining a light on the CPU architecture in this first iteration of Apple's latest device. One of the first things we can look at is the memory hierarchy, information crucial to any application that must be optimized so that its code has enough spatial and/or temporal locality to stay performant.

In the latency data, there’s a pretty dramatic fall-off between 28KB and 64KB of test depth as accesses begin to miss the L1 data cache, so we can safely bet that the L1 data cache size is 32KB, given that current shipping products tend to fall somewhere between 32KB and 64KB of L1 data cache. Given the dramatic fall-off that begins around 224KB, we can also safely bet that we’re looking at a 256KB combined L2 cache, which is fairly small compared to the 1-2MB shared caches we might be used to from today’s large smartphone CPUs, but about right compared to something like an A5 or A7.

If Apple had just implemented the Cortex A7 as their CPU of choice, the obvious question at this point is whether they’ve really made anything “original” here. To dive deeper, we can look past the memory hierarchy and at the machine itself. One of the first things that becomes obvious is that we’re looking at a CPU with a maximum frequency of 520 MHz, which is telling of the kind of maximum power that Apple is targeting here.

Apple S1 CPU Latency and Throughput
Instruction                               Throughput (cycles/result)   Latency (cycles)
Loads (ldr reg, [reg])                    1                            N/A
Stores (str reg, [reg])                   1                            N/A
Move (mov reg, reg)                       1/2                          -
Integer Add (add reg, reg, imm8)          1/2                          -
Integer Add (add reg, reg, reg)           1                            1
Integer Multiply (mul reg, reg, reg)      1                            3
Bitwise Shift (lsl reg, reg)              1                            2
Float Add (vadd.f32 reg, reg, reg)        1                            4
Double Add (vadd.f64 reg, reg, reg)       1                            4
Float Multiply (vmul.f32 reg, reg, reg)   1                            4
Double Multiply (vmul.f64 reg, reg, reg)  4                            7
Double Divide (vdiv.f64 reg, reg, reg)    29                           32

Obviously, talking about the cache hierarchy isn’t enough, so let’s get into the actual architecture. On the integer side of things, add latency is a single cycle, while multiply latency is three cycles. However, thanks to pipelining, integer multiplication can still produce a result every clock cycle. Similarly, bit shifts take two cycles to complete, but throughput is one per clock. Attempting to interleave multiplies and adds achieves only half the combined throughput. One might guess that this is because the integer adder and the integer multiplier are the same block, but that doesn’t really make sense given just how different addition and multiplication are at the logic level; it’s more consistent with the two operations sharing an issue port.

Integers are just half of the equation when it comes to data types. We may have Booleans, characters, strings, and integers of varying widths, but when we need to represent fractional values we have to use floating point, which enables a whole host of applications. In low power CPUs like this one, floating point will also often be far slower than integer math because the rules involved in floating point arithmetic are complex. At any rate, a float (32-bit) can be added with a throughput of one result per cycle and a latency of four cycles. The same is true of adding a double or multiplying a float. However, multiplying or dividing doubles is definitely not a good idea here: peak throughput of multiplying doubles is one result per four clock cycles with a latency of 7 cycles, and dividing doubles has a peak throughput of one result every 29 clock cycles with a latency of 32 cycles.

If you happen to have a webpage open with the latency and throughput timings for the Cortex A7, you’d probably guess that this is a Cortex A7, and you’d probably be right as well. Attempting to do a load and a store together produces timings indicating these are mutually exclusive operations which cannot be executed in parallel. The same is true of multiplication and addition, even though the two operations shouldn’t have any shared logic. Conveniently, the Cortex A7 has a two-wide pipeline with similar limitations. The Cortex A5 is purely single-issue, so despite some similarity it can't explain why adding an immediate value to a register can happen twice per clock.

Given the overwhelming amount of evidence at the timing level of all these instructions, it’s almost guaranteed that we’re looking at a single Cortex A7 core, or a derivative of it, at 520 MHz. Even if this is just a Cortex A7, targeting a far lower maximum clock speed means that logic design can prioritize power efficiency over performance. Standard cell techniques and styles that would unacceptably compromise performance in a 2+ GHz chip can be freely used in a 520 MHz chip, such as device stacking, sleepy-stack layouts, and higher-Vt cell selection with reverse body biasing, allowing either lower voltage at the same frequency or reduced switching capacitance and static leakage. Given that the Cortex A7 has generally been a winning design in perf/W metrics, I suspect that key points of differentiation will come from implementation rather than architecture for the near future. Although I was hoping to see the Apple Watch on a more leading-edge process like 14LPP/16FF+, I suspect this will be deferred until Apple Watch 2 or 3.


270 Comments


  • JoshHo - Tuesday, July 21, 2015 - link

    It would be great to get specific instances of overly wordy areas, and information that you have learned elsewhere that is redundant in the review to improve our wearable reviews going forward.
  • Blairh - Monday, July 20, 2015 - link

    As an iPhone user I think the notifications aspect of the AW would be very appealing, but Apple is asking for too much money for such a luxury. And I'm talking about the Sport models. The SS models are ridiculously expensive. It's no surprise that roughly 3/4 of all AW sales have been the Sport models. Seriously you are nuts IMO to buy the SS model unless you have money to burn. Plus I think the Sport models are just nicer looking in general. And lighter to boot.

    Anyways, this review highlights a current glaring weakness, which is the inability to respond to 3rd party IM apps directly on the AW. If you use WhatsApp or Facebook Messenger often as I do, you are SOL if you want to respond with your AW right now. Perhaps this will change with the 2.0 update come fall, but still, right now this is really only ideal if your main communication is the Messages app. Email is another story, as there are several 3rd party email clients that offer voice dictation.

    I'm waffling between an AW and the Vivosmart. The Vivosmart won't let me reply to any notifications from my wrist however it's a third of the price of the 38mm AW and feels awesome on your wrist.

    I do believe in the future of the AW, but right now its got a lot of glaring holes to fill.
  • nrencoret - Monday, July 20, 2015 - link

    The worst article I've ever read on this site by miles. Too many words for nothing insightful. What I find here is a desperate struggle to justify what cannot be justified. As a person who loves the site's content, I'm stumped by the horrible mess I have just read. Just a few points:

    - Apple has "solved" how a watch has to fit like no other company, traditional (ie. Rolex) or tech focused. That is a simply mindboggling statement.

    - The UI/UX is great. The Apple mouse and the iPhone have just one primary button for interacting. The crown, side button and force touch trilogy are the work of a committee which couldn't settle on a simple means of interacting with a piece of technology. What Apple is best known for is how great they are at removing complexity -"just works" and "boom" come to mind- and the reviewers were far too forgiving of all the usability issues (ie. force touch discoverability). These would have been major issues on any other piece of technology.

    - Understanding what it is you get for your money: If you own a jewel like a watch or ring, it's timeless and has an intangible value. The watch can cost a pretty penny for something that has no better hardware than what's out there. There is no inherent intangible value in the watch because, as has been stated in the review, there will be future iterations of it, killing the timeless argument. As such, this watch is a piece of technology, not jewelry, and thus it's way overpriced. Let's just see how many dads give their sons Apple Watches, and how those sons give them to their own.

    - Battery life of a single day for a timepiece is not even remotely acceptable. The Basis Peak, Fitbits and Pebbles may not be as smart, but they nail the basic concept of what a timekeeping device must do.

    - Nowhere was there a real argument about how the current incarnation of the watch is mostly useless without being tethered. The Basis Peak comes to mind as an example of how useful a device can be with or without a tether.

    I could go on, given the sheer amount of nonsense in this review. I'm really disappointed that this came from AnandTech.
  • alanpgh1 - Monday, July 20, 2015 - link

    Awesome Review... and right on target.
    I've had an Apple Watch for 2 months, and it continues to be an important and non-intrusive assistant in my life. I seem to learn something new that is helpful all the time.

    The only thing I ask the author to consider are these words from your review:
    "Finally, "Hey Siri" works well in terms of activation, but it's really kind of disappointing that the hotword detection doesn't work with the display off. I suspect this is due to power requirements as I haven't seen any other wearable have screen-off hotword detection, but it would definitely be great to see such a feature in the future."

    It is actually a feature to have the watch only listen for the "Hey Siri" hotword when the arm is lifted.
    Otherwise, if listening all the time, the system would have false triggers. Think about it; this way of operation is by design.

    Thanks for an excellent and thorough review!
  • TheRealArdrid - Tuesday, July 21, 2015 - link

    Gotta admit: I didn't get past the second page of this review. This is dripping with the feel of an Apple shill piece. Am I really to believe that no other watch in history, including recent smartwatches, properly fit the author's wrist but the Apple Watch, with its amazing Milanese band, magically did? Statements like that completely destroy legitimacy and credibility. Come on man...
  • zodiacfml - Tuesday, July 21, 2015 - link

    Their failure is sticking to the old, physical idea of a watch.
  • FunBunny2 - Tuesday, July 21, 2015 - link

    -- Their failure is sticking to the old, physical idea of a watch.

    Yeah, and what would GUIs be without radio buttons, menus, and all of the other analog clones they're built on? Face it: it's just pixels made to look "physical".
  • Oxford Guy - Tuesday, July 21, 2015 - link

    Honestly, I love that Apple is successful. The sound of PC-worshiping heads exploding all over the Internet is amusing. It lifts my spirits on a regular basis.

    Seriously, people... Apple didn't run over your mother, kill your dog, or beat your sister.

    The level of nerd rage over Apple's success really is misplaced. There are far worse things to cry over than yet another big tech firm that dodges taxes and overprices stuff. It's not like Apple is the only one and it's not like society in general doesn't reward that behavior.

    I've seen the anti-Apple zealotry for decades. It never changes. It always comes down to whinging about how much Apple charges, along with accusations that only gays, girls, and social-climbing superficial people use the products. In reality, despite their flaws, Apple products have been dependable workhorses for people for a long time, and some of them have been pretty innovative.

    The Lisa was a thousand times more innovative than the IBM PC. Apple didn't execute because of some poor management and the sudden spike in DRAM cost (caused by Japanese firms pushing US firms out of the market with price dumping and then colluding to raise prices, as far as I have read). Yes, it was expensive, but the platform was a very solid foundation for a line of machines. Apple had an office suite, multitasking, protected memory, tool-less design, a bootloader that made it easy to boot from multiple operating systems, and a plethora of other modern features back in '83.

    Unfortunately, the Mac was botched because it was turned from what was envisioned to be a $500 computer into a $1000 computer and then into a $2400 computer -- without making the underlying OS robust enough to justify that price or the hardware expandable enough. But, despite that, it had a very efficient GUI and people were willing to put up with bombs and freezes because that GUI was miles nicer to work with than Windows (up until 95 when things almost became as good on Windows, but not quite).

    If you think Apple is so fraudulent then start your own company or get a job running one already out there and out-compete them. Then let us know about your success. Until then, find something more productive to do with your time than rant ineffectually on Internet forums.
  • Oxford Guy - Tuesday, July 21, 2015 - link

    As for this product specifically, my advice is to wait for the next iteration that comes with a shrunken process. Apple's first iPad had a relatively short lifespan, rapidly orphaned. I wouldn't want to be stuck with this device if the same thing were to happen. It has generally been the same advice for quite some time: when Apple comes out with a new form factor, wait until version 2.
  • Oxford Guy - Tuesday, July 21, 2015 - link

    This even applied to the Mac, come to think of it. Jobs demoed (without telling the audience or the press, of course) a 512k prototype in order to run speech synthesis when he was unveiling the first Mac (128K, not expandable) to the press.
