Apple S1 Analysis

One of the biggest issues I’ve seen with the smartwatch trend is that, because most companies are entering the market from a smartphone background, many OEMs try to shove smartphone parts into a smartwatch form factor. There have been a lot of different Android Wear watches, but for the most part everything seems to use Qualcomm’s Snapdragon 400 without the modem. Even though the Cortex A7 is relatively low power for a smartphone, it’s probably close to the edge of what is acceptable in terms of TDP for a smartwatch. Pretty much every Android Wear watch pairs a battery of around 400 mAh (at a 3.8 or 3.85 volt chemistry) with a relatively large PCB in an attempt to reach 1-2 days of battery life, and the end result is that these smartwatches are really just too big for a significant segment of the market. In order to make a smartwatch that can scale down to sizes small enough to cover most of the market, it’s necessary to make an SoC specifically targeted at the smartwatch form factor.
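To put the battery figure in perspective, a back-of-the-envelope calculation helps. The sketch below assumes the 400 mAh and 3.8 V figures quoted above and a flat discharge over the whole runtime; the function names are purely illustrative.

```python
def battery_energy_wh(capacity_mah: float, nominal_voltage: float) -> float:
    """Stored energy in watt-hours: (mAh / 1000) * V."""
    return capacity_mah / 1000.0 * nominal_voltage


def average_power_mw(energy_wh: float, runtime_hours: float) -> float:
    """Average whole-system draw (mW) that empties the battery in runtime_hours."""
    return energy_wh / runtime_hours * 1000.0


energy = battery_energy_wh(400, 3.8)       # energy stored in the battery
one_day = average_power_mw(energy, 24)     # power budget for 1 day of life
two_days = average_power_mw(energy, 48)    # power budget for 2 days of life
print(f"{energy:.2f} Wh, {one_day:.0f} mW (1 day), {two_days:.0f} mW (2 days)")
```

Under these assumptions the entire watch, display and radios included, has to average a few tens of milliwatts, which is why a smartphone-class SoC sits so close to the limit.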


Capped Apple S1 SoC (Image Courtesy iFixit)

The real question here is what Apple has done. As alluded to in the introduction, it turns out the answer is quite a bit. However, this SoC is basically a complete mystery. There’s really not much in the way of proper benchmarking tools or anything that can be run on the Watch to dig deeper here. Based on teardowns, this SoC is fabricated on Samsung’s 28nm LP process, although it’s not clear which flavor of LP is used. It’s pretty easy to eliminate the high power processes, so it’s really just a toss-up between HKMG and poly SiON gate structure. For those that are unfamiliar with what these terms mean, the main difference that results from this choice is a difference in power efficiency, as an HKMG process has less leakage power. Given how little cost is involved in this difference in process compared to a move to 20/14nm processes, it’s probably a safe bet that Apple is using an HKMG process here especially when we look at how the move from 28LP to 28HPm at TSMC dramatically affected battery life in the case of SoCs like Snapdragon 600 and 800.


Decapped & Labeled S1 SoC (Image Courtesy ABI Research)

We also know that binaries compiled for the watch target ARMv7k. Unfortunately, this is effectively an undocumented ISA. We know that WatchOS is built on iOS/Darwin, so a memory management unit (MMU) is necessary in order to provide memory protection and key abstractions like virtual memory. This rules out MCU ISAs like ARMv7-M even if it's possible to add an MMU to such an architecture, so it’s likely that we’re looking at some derivative of ARMv7-A, possibly with some unnecessary instructions stripped out to try and improve power consumption.

The GPU isn’t nearly as much of a mystery here. Given the PowerVR drivers present in the Apple Watch, it’s fairly conclusive that the S1 uses some kind of PowerVR Series 5 GPU. However, which Series 5 GPU is up for debate. There are reasons to believe it may be a PowerVR SGX543MP1, but I suspect that it is in fact PowerVR's GX5300, a specialized wearables GPU that comes from the same family as the SGX543 and would use a very similar driver. Most likely, dedicated competitive intelligence firms (e.g. Chipworks) know the answer, though it's admittedly also the kind of information we expect they would hold on to in order to sell it to clients as part of their day-to-day business activities.

In any case, given that native applications won’t arrive until WatchOS 2 is released, I don’t think we’ll be able to do much in the way of extensive digging here, and I suspect that graphics benchmarks will be rare even after the launch of WatchOS 2.

Meanwhile, after a lot of work and even more research, we're finally able to start shining a light on the CPU architecture in this first iteration of Apple's latest device. One of the first things we can look at is the memory hierarchy, which is crucial information for applications that need optimization to ensure that code has enough spatial and/or temporal locality to perform well.

As one can see, there’s a pretty dramatic fall-off that happens between 28 and 64KB of “DRAM” as we exit the local maximum of L1 data cache. Given that current shipping products tend to fall somewhere between 32 and 64KB of L1 data cache, we can safely bet that the L1 data cache size is 32KB. Given the dramatic fall-off that begins to happen around 224KB, we can also safely bet that we’re looking at a 256KB L2 combined cache. That is fairly small compared to the 1-2MB shared cache that we might be used to from today’s large smartphone CPUs, but compared to something like a Cortex A5 or A7 it’s about right.
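Curves like this are commonly produced with a pointer-chasing test: a buffer of a given size is linked into one random cycle, and the time per dependent load is measured as the buffer grows. The review does not say which tool was used, so the following is only a sketch of the general technique. `build_chase_ring` and `chase` are hypothetical names, the 4-byte slot size is an assumption, and Python is far too slow to expose cache latency, so a real measurement loop would be written in C or assembly.

```python
import random


def build_chase_ring(n_slots: int, seed: int = 0) -> list[int]:
    """Return ring where ring[i] is the next slot to visit.

    Sattolo's algorithm yields a single cycle covering every slot, so a
    walk touches the whole working set in an order that hardware
    prefetchers cannot easily predict.
    """
    rng = random.Random(seed)
    ring = list(range(n_slots))
    for i in range(n_slots - 1, 0, -1):
        j = rng.randrange(i)              # j < i, never i itself
        ring[i], ring[j] = ring[j], ring[i]
    return ring


def chase(ring: list[int], steps: int) -> int:
    """Follow the ring; each load depends on the result of the previous one."""
    idx = 0
    for _ in range(steps):
        idx = ring[idx]
    return idx


# Working-set sizes to sweep, assuming 4-byte slots (an assumption).
for size_kb in (16, 32, 64, 256, 512):
    ring = build_chase_ring(size_kb * 1024 // 4)
    assert chase(ring, len(ring)) == 0    # one full lap returns to the start
```

Because every load depends on the one before it, the time per step approximates the latency of whichever level of the hierarchy the working set fits in, which is what makes the cache boundaries show up as steps in the curve.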

If Apple had just implemented the Cortex A7 as their CPU of choice, the obvious question at this point is whether they’ve really made anything “original” here. To try and dive deeper here, we can start looking past the memory hierarchy and looking closer at the machine itself. One of the first things that is obvious is that we’re looking at a CPU with a maximum frequency of 520 MHz, which is telling of the kind of maximum power that Apple is targeting here.

Apple S1 CPU Latency and Throughput

| Instruction | Throughput (Cycles/Result) | Latency (Cycles/Result) |
|---|---|---|
| Loads (ldr reg,[reg]) | 1 | N/A |
| Stores (str reg,[reg]) | 1 | N/A |
| Move (mov reg, reg) | 1/2 | - |
| Integer Add (add reg, reg, imm8) | 1/2 | - |
| Integer Add (add reg,reg,reg) | 1 | 1 |
| Integer Multiply (mul reg,reg,reg) | 1 | 3 |
| Bitwise Shift (lsl reg,reg) | 1 | 2 |
| Float Add (vadd.f32 reg,reg,reg) | 1 | 4 |
| Double Add (vadd.f64 reg,reg,reg) | 1 | 4 |
| Float Multiply (vmul.f32 reg,reg,reg) | 1 | 4 |
| Double Multiply (vmul.f64 reg,reg,reg) | 4 | 7 |
| Double Divide (vdiv.f64 reg,reg,reg) | 29 | 32 |

Obviously, talking about the cache hierarchy isn’t enough, so let’s get into the actual architecture. On the integer side of things, integer add latency is a single cycle, but integer multiplication latency is three cycles. However, due to pipelining, integer multiplication can produce a result every clock cycle. Similarly, bit shifts take two cycles to complete, but throughput can be one per clock. Attempting to interleave multiplies and adds results in only half the throughput. One guess would be that the integer add block and the integer multiply block are the same block, but that doesn’t really make sense because of just how different addition and multiplication are at the logic level.

Integers are just half of the equation when it comes to data types. We may have Booleans, characters, strings, and varying bit sizes of integers, but when we need to represent decimal values we have to use floating point to enable a whole host of applications. In the case of low power CPUs like this one, floating point will also often be far slower than integers because the rules involved in doing floating point math are complex. At any rate, a float (32-bit) can be added with a throughput of one result per cycle and a latency of four cycles. The same is true of adding a double or multiplying a float. However, multiplying or dividing doubles is definitely not a good idea here, because peak throughput of multiplying doubles is one result per four clock cycles, with a latency of 7 clock cycles. Dividing doubles has a peak throughput of a result every 29 clock cycles, with a latency of 32 clock cycles.
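The practical difference between the throughput and latency columns shows up when operations depend on each other. As a rough illustration using only the double-precision figures from the table above and the 520 MHz clock, the sketch below estimates the time for a run of operations when each result feeds the next (latency-bound) versus when they are all independent (throughput-bound). It is a simplified first-order model that ignores loads, stores, and loop overhead, and the helper names are illustrative.

```python
CLOCK_HZ = 520e6  # maximum S1 CPU frequency reported in the article

# (throughput, latency) in cycles per result, taken from the table above
TIMINGS = {
    "vadd.f64": (1, 4),
    "vmul.f64": (4, 7),
    "vdiv.f64": (29, 32),
}


def cycles(op: str, count: int, dependent: bool) -> int:
    """Estimated cycles to produce `count` results of `op`.

    dependent=True: each op waits for the previous result (latency-bound).
    dependent=False: ops are independent, so after the first result a
    new one completes every `throughput` cycles.
    """
    throughput, latency = TIMINGS[op]
    if count == 0:
        return 0
    if dependent:
        return count * latency
    return latency + (count - 1) * throughput


def microseconds(n_cycles: int) -> float:
    """Convert a cycle count to microseconds at CLOCK_HZ."""
    return n_cycles / CLOCK_HZ * 1e6


for op in TIMINGS:
    chained = cycles(op, 1000, dependent=True)
    independent = cycles(op, 1000, dependent=False)
    print(f"{op}: {microseconds(chained):.1f} us chained, "
          f"{microseconds(independent):.1f} us independent")
```

In this model, double adds benefit most from independent work, since their latency is four times their throughput cost, while double divides are slow either way.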

If you happen to have a webpage open with the latency and throughput timings for Cortex A7, you’d probably guess that this is a Cortex A7, and you’d probably be right as well. Attempting to do a load and a store together produces timings which indicate that the two are mutually exclusive operations that cannot be executed in parallel. The same is true of multiplication and addition, even though the two operations shouldn’t have any shared logic. Conveniently, the Cortex A7 has a two-wide pipeline that has similar limitations. Cortex A5 is purely single-issue, so despite some similarity it can't explain why an add of a register and an immediate/constant value can happen twice per clock.

Given the overwhelming amount of evidence at the timing level of all these instructions, it’s almost guaranteed that we’re looking at a single core Cortex A7 or a derivative of it at 520 MHz. Even if this is just a Cortex A7, targeting a far lower maximum clock speed means that logic design can prioritize power efficiency over performance. Standard cell choices and design styles that would unacceptably compromise performance in a 2+ GHz chip can easily be used in a 520 MHz chip, such as device stacking, sleepy stack layout, higher Vt selection with negative active body biasing, and other techniques that would allow for either lower voltage at the same frequency, or reduced capacitance in dynamic power and reduced static leakage. Given that Cortex A7 has generally been a winning design for perf/W metrics, I suspect that key points of differentiation will come from implementation rather than architecture for the near future. Although I was hoping to see Apple Watch on a more leading-edge process like 14LPP/16FF+, I suspect this will be deferred until Apple Watch 2 or 3.
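The voltage and capacitance argument follows from the usual first-order CMOS approximation that dynamic power scales with switched capacitance, the square of supply voltage, and frequency. The sketch below only compares ratios against a baseline; the 10% voltage reduction is a made-up figure for illustration, not a measurement of the S1.

```python
def relative_dynamic_power(c_ratio: float = 1.0,
                           v_ratio: float = 1.0,
                           f_ratio: float = 1.0) -> float:
    """Dynamic power relative to a baseline, using P ~ C * V^2 * f."""
    return c_ratio * v_ratio ** 2 * f_ratio


# Hypothetical: same frequency and capacitance, supply voltage lowered by 10%.
saving = 1.0 - relative_dynamic_power(v_ratio=0.9)
print(f"dynamic power reduced by about {saving:.0%}")
```

Because voltage enters as a square, even a modest reduction at a fixed clock pays off more than an equal proportional cut in capacitance or frequency.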

270 Comments


  • TedKord - Monday, July 27, 2015 - link

    Holy crap. That post was longer than the review.
  • Figaro56 - Saturday, August 1, 2015 - link

    Holy crap you sound exactly like a manic depressive friend of mine. You lost me at the gazillionth POS comment.
  • michellepennie - Wednesday, August 26, 2015 - link

    Boohoo you sound the jealous type and i bet you couldn't afford one :P lol
  • dsumanik - Monday, July 20, 2015 - link

    You know what Ryan, I'll give you the benefit of the doubt.

    Whether it was apple PR or not is irrelevant. I'll even admit I didn't even finish the entire article because it read like a kid getting a new toy for christmas explaining how magical it was.

    Can you explain to me why the author(s) felt the need to photograph and post not one, not two, not three, not four but FIVE neatly arranged unboxing shots... on the very first page of the article.

    The shots were deliberately arranged on a cleaned, attractive, ironed cloth that tied into the watch's color scheme.

    Some questions about the opening sequence of photos :

    - Do you think that these shots reveal any info to your readers? Tech specs, warranty info, durability?
    - Why does the EXIF info read Adobe Lightroom? Like gimme a break. They were enhanced.
    - If we remove all verbiage, does the watch look attractive, or unattractive in any way shape or form?
    - Do people generally wear a timepiece nicely draped over their fingers in front of a sunny picturesque tree?
    - Is it just a coincidence that not only I but others, thought the photo's looked 'funny'?

    The author(s) deliberately took time and significant effort to make the product look as attractive as possible. The opening page, its photographs and presentation instantly clue the reader that this piece is obviously written with heavy marketing bias and the overall tone and conclusion will be a positive one.

    Is my original post inflammatory? Sure. Beligerent? yes.

    100% True?

    YES.

    You know why this watch isn't selling? Apple's customers are thinking this:

    "Cool! New apple watch! What does it do?"
    "Hmm, it doesnt really do that much. I was kind of expecting more."
    "You know what, it's kinda chunky and why does it stutter?...OMG, 400 bucks? pffff totally not worth it."

    I know this because I am an apple customer, and this thing pretty much just sucks.

    Some more questions:

    - Do you think it would be good for your publication to write a scathing review of an apple product that went viral? Isn't that kind of sad?
    - Would you recommend this product for a single mother, your grandma, or anyone else close to you?
    - Have you thought about purchasing this product for ANYONE as a gift?
    - Had this review not taken place would you have gone out and purchased this item for yourself? LOL!!!!
    - Can you link me an article written on anandtech that portrayed any apple product in a negative light, ever?

    I'm sorry Ryan, I know you are just doing your job but the 'general consumer' is getting smarter and the internet is getting clogged up with this kind of editorial crap.

    The only way to stop it is to speak up, LOUD, and be heard.

    Didn't Ellen Pao just say it best?

    "The trolls are winning"

    By trolls, she meant the general public that is sick of being lied to and manipulated.

    Lied to by presidents, company reps, journalists, law enforcement, intelligence agencies....right down to silly little amazon reviews.

    2 weeks later he files a patent to provide an advertisement based on your bank account balance!

    Call a turd a turd.

    Dont photoshop it then, sprinkle whipcream and cherries on top.

    Just sayin.
  • Ryan Smith - Monday, July 20, 2015 - link

    "You know what Ryan, I'll give you the benefit of the doubt."

    Thank you. I appreciate it.

    "Whether it was apple PR or not is irrelevant. I'll even admit I didn't even finish the entire article because it read like a kid getting a new toy for christmas explaining how magical it was.
    Can you explain to me why the author(s) felt the need to photograph and post not one, not two, not three, not four but FIVE neatly arranged unboxing shots... on the very first page of the article."

    The short answer is that our Apple reviews have a wider reach than our standard technical articles. The range of readers that will show up to AnandTech for a MacBook or iPhone review has a much more distinctive consumer shift than say an SSD or CPU review. And while we still have a large number of technical readers (who are our heart and soul), it's also good for us to be visible to less technical consumers, as it helps them learn that we exist and, hopefully, come back to learn things that no other site can offer.

    In any case, when you're working to reach a broader audience, you need to focus on more than just words. Less technical consumers aren't going to care about the S1 analysis for example, and that's okay, because we reach these users in other ways. And one of the ways we do that is in photography. Broader audiences like pictures - they like good pictures - and that means we step up our game on photography for these reviews in order to accommodate those users. There are a number of other sites out there reviewing the Watch, and there is a segment of the broader audience that will write us off in favor of another review if we show up with poor photography, so we need to make sure that not only is our analysis top notch, but our prose and imagery is competitive as well.

    At the end of the day we won't make any compromises on the technical side for our regular technical readers, but if we can also bring wider consumers into the fold through materials such as improved photography, we will do that as well. This way both techies and non-techies alike can enjoy our articles and learn something from them.
  • mapesdhs - Tuesday, July 21, 2015 - link

    Irony is, if the photos were 'poor', someone would be complaining about that instead. Ya can't please all of the people all of the time...

    Ian.
  • Schickenipple - Tuesday, July 21, 2015 - link

    It's sad, Ryan, that you actually had to explain this to someone. I thought your core readers would understand that an Apple Watch review isn't in the same category as a NVMe PCIe SSD. Guess not.
  • bo3bber - Saturday, July 25, 2015 - link

    Ryan, just wanted to observe that this approach has had the opposite effect on me. I used to come to AnandTech as my absolute goto first tech site, and these Apple puff pieces made me question your other reviews. So instead of improving your reach, at least for me, you reduced your reach because I feel that I cannot trust you as much as I did.

    The fact that Anand himself also left to go to Apple would strongly suggest you be wary of running Apple stuff that is fluffy.

    I only read the summary, because the out-of-box first page showed me it was going to be a puff piece not a technical review.

    I think you do yourself a disservice and have damaged your brand by trying to reach a larger market.
  • Samus - Tuesday, July 21, 2015 - link

    They clearly haven't paid attention to the high production value of ALL Anandtech articles over the past decade. You guys use top notch photography and lightboxes all the time. These comments are ridiculous.

    The Reddit fallout must have sent trolls to every corner of the internet.
  • victorson - Wednesday, July 22, 2015 - link

    Ryan, I do respect Anandtech coverage a lot and Josh has done some great research articles that I'm still digesting. However, I have to agree with others: this review just reads strangely lacking in perspective. It is chock-full of weird claims about watchOS being the iOS in the watch world, and about all that first-gen BS that gets thrown around. Why is it that every tech reviewer would gladly slam a device for its poor functionality, but once we start talking about Apple, suddenly you guys chicken out and rather than saying that it's shit, you say that 'well, it lags like hell, but that's okay, because it's a first gen product.' And how about commenting on the lack of any actual useful functionality on the watch that would make users spend a ludicrous $700 for a single-core 500MHz processor running a 1.5" display? Don't get me started on forgetting to mention that other competitors have always-on screen (the WatchOS is a sore disappointment) AND come with two days of battery life. AND half the price! But no, rather than giving us some insightful comments on that, we get the 'I'm definitely convinced in the smartwatch now'. Thanks, very useful! /s
