iPhone Apple XS Max? (Page 3)
Waragainstsleep
Posting Junkie
Join Date: Mar 2004
Location: UK
Status: Offline
Reply With Quote
Sep 22, 2018, 06:10 AM
 
Does anyone ever really require that on a phone?
I have plenty of more important things to do, if only I could bring myself to do them....
     
OreoCookie
Moderator
Join Date: May 2001
Location: Hilbert space
Status: Offline
Reply With Quote
Sep 22, 2018, 12:16 PM
 
@P
You bring up a lot of points, and instead of going point-by-point, let me try and zoom out a little.

I have never heard that out-of-order designs are more efficient. Can you link to any evidence, not a back-of-the-envelope calculation (which is more complicated anyway, because we would have to talk about implementations using fast vs. slow transistor types among many other factors)?

ARM seems to agree (we can quibble about how ARM measures performance and what obvious and less obvious things they take into account), for example Peter Greenhalgh (the chief architect of the Cortex R4, A8, A5 and A53 as well as big.LITTLE) answered some comments on Anandtech (including one of mine):
Originally Posted by wrkingclasshero
What is ARM's most power efficient processing core? I don't mean using the least power, I mean work per watt. [...]

Originally Posted by Peter Greenhalgh
In the traditional applications class, Cortex-A5, Cortex-A7 and Cortex-A53 have very similar energy efficiency. Once a micro-architecture moves to Out-of-Order and increases the ILP/MLP speculation window and frequency there is a trade-off of power against performance which reduces energy efficiency. There’s no real way around this as higher performance requires more speculative transistors. This is why we believe in big.LITTLE as we have simple (relatively) in-order processors that minimise wasted energy through speculation and higher-performance out-of-order cores which push single-thread performance.

Across the entire portfolio of ARM processors a good case could be made for Cortex-M0+ being the more energy efficient processor depending on the workload and the power in the system around the Cortex-M0+ processor.
There are more details in point 5 of this Q&A on ARM's website. Of course, you could now say that this guy didn't just drink the Kool-Aid; he thought of the recipe before he drank it! But I think it explicitly supports my claim.
Originally Posted by P View Post
It isn't a contest because they're running on different tracks. More efficient would mean that a dotted line (in-order core) would sit higher than a solid line (OoOE cores) at the same X coordinate, and that doesn't happen very often, because they don't exist at the same performance level.
No, the curve need not be higher for the little core here. The graph I have finally found makes that more explicit:

It shows that near the crossover point, the efficiency of the big and the LITTLE core is comparable, although I think I have seen actual measurements where the big core even had a very slight edge. But the other point is that the line of the big processor does not extend down to very low frequencies, nor does the LITTLE core reach the very high performance region. Put succinctly: because you have a narrower optimization window for each core, i. e. you can optimize the big core for high frequencies (in terms of pipeline length, transistor types, etc.) and the LITTLE core for energy efficiency, you get a SoC that is more energy efficient than if you used just one core to span the whole gamut.
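To make that picture concrete, here is a toy numeric sketch (Python) of the two power/performance curves; the coefficients are invented purely to illustrate the crossover and are not taken from any real SoC:

```python
# Toy sketch of the big.LITTLE power/performance curves discussed above.
# The coefficients are made up: each core type is only attractive inside its
# own window, and the SoC uses whichever core delivers the requested
# performance at lower power.
def little_power(perf):
    return 0.05 + 0.8 * perf ** 2    # cheap at low performance, ramps up fast

def big_power(perf):
    return 0.30 + 0.25 * perf ** 2   # higher floor, scales better up high

for perf in (0.2, 0.4, 0.6, 0.8, 1.0):
    lo, hi = little_power(perf), big_power(perf)
    best = "LITTLE" if lo < hi else "big"
    print(f"perf={perf:.1f}  LITTLE={lo:.2f}W  big={hi:.2f}W  -> use {best}")
```

Below the crossover the LITTLE core wins, above it the big core does, and in a real SoC neither curve would extend across the whole range anyway.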

Now you could ask yourself whether you could do a big.BIG core SoC (just like the 2 x 4 Cortex A53 SoCs that were around for a while, with 4 energy-efficient A53s and 4 higher-clocked A53s) and get something that is more efficient. One counterargument here would be die area, as it'd be quite wasteful, but strictly speaking that wasn't my claim. (Even though it is certainly a consideration in SoC design, even if money is no object.)
Originally Posted by P View Post
Not because of their design, because of their clockspeed and voltage target. If you need performance X and you have an in-order and an OoOE core that can both deliver performance X, the most efficient way to do it is to use the OoOE core and clock it down to the same performance. That will use less power than the in-order core, unless you have to clock it so low that you're past the peak efficiency.
Oh, yes, design has something to do with it: you tend to need to lengthen pipelines if you want designs to hit high frequencies, which makes bubbles more costly. This is the problem that OoO solves: keep the pipeline (“more”) filled.
Originally Posted by P View Post
But how does the scheduler know that? Are the threads marked, somehow? If they are, that just further reinforces my point that the in-order cores aren't cut out for general purpose computing.
I don't really know how the schedulers figure this stuff out, but here is the relevant info straight from the horse's mouth.
Originally Posted by ARM
In big.LITTLE Global Task Scheduling, the same mechanism is in operation, but the OS keeps track of the load history of each thread and uses that history plus real-time performance sampling to balance threads appropriately among big and LITTLE cores.
Apple did something similar when it explained where some of its iOS 12 performance benefits come from. Among other things, the OS would ramp up CPU frequencies much more aggressively and ramp them down more quickly.
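For what it's worth, here is a hypothetical sketch of the mechanism ARM describes: a decaying per-thread load history plus the latest sample, with thresholds deciding which cluster the thread should run on. The thresholds, decay factor and names are illustrative only, not ARM's actual scheduler code:

```python
# Hypothetical sketch of Global Task Scheduling's core idea: keep an
# exponentially weighted load history per thread and migrate it between the
# big and LITTLE clusters based on that history. All constants are invented.
UP_THRESHOLD = 0.60    # sustained load above this -> move to a big core
DOWN_THRESHOLD = 0.30  # sustained load below this -> move to a LITTLE core
DECAY = 0.5            # weight given to the existing history

def update_history(history, sample):
    return DECAY * history + (1.0 - DECAY) * sample

def pick_cluster(history, current_cluster):
    if history > UP_THRESHOLD:
        return "big"
    if history < DOWN_THRESHOLD:
        return "LITTLE"
    return current_cluster  # hysteresis: stay put in the middle band

# Example: a thread that ramps up, runs hot for a while, then goes quiet.
history, cluster = 0.0, "LITTLE"
for sample in (0.2, 0.5, 0.9, 0.95, 0.9, 0.1, 0.05):
    history = update_history(history, sample)
    cluster = pick_cluster(history, cluster)
    print(f"sample={sample:.2f}  history={history:.2f}  -> {cluster}")
```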
I don't suffer from insanity, I enjoy every minute of it.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 22, 2018, 02:21 PM
 
Originally Posted by Brien View Post
Are you sure it isn't shooting actual HDR video (à la Dolby Vision) and not just using some image magic to suss out details? The entire concept of HDR photos was to show more dynamic range than monitor tech allowed - now that we have wider gamuts and thousands of nits of brightness, that could start to change.
Not nits, bits.

An 8-bit display has 256 steps between black and white. The highest contrast 8-bit monitor in the world still has only 256 steps between black and white.

What would give this display more dynamic range is increasing the bit-depth. If it were 10-bit, there would be 1,024 steps between black and white. That's actual high dynamic range.
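A minimal sketch of that arithmetic, assuming the number of steps per channel is simply 2 to the power of the bit depth:

```python
# Number of tonal steps per channel for a given bit depth:
# 8-bit -> 256, 10-bit -> 1024, 12-bit -> 4096.
def tonal_steps(bits_per_channel):
    return 2 ** bits_per_channel

for bits in (8, 10, 12):
    print(f"{bits}-bit: {tonal_steps(bits)} steps between black and white")
```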
     
Brien
Professional Poster
Join Date: Jun 2002
Location: Southern California
Status: Offline
Reply With Quote
Sep 22, 2018, 06:11 PM
 
You still need actual screen brightness for HDR though.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 22, 2018, 07:44 PM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:00 AM. )
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 23, 2018, 12:53 PM
 
Originally Posted by Brien View Post
You still need actual screen brightness for HDR though.
Ehhh...

This is technically true, but there’s a vast difference in relative importance between brightness/contrast and bit-depth.

Like I said, no amount of brightness/contrast makes an 8-bit display HDR.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 23, 2018, 12:55 PM
 
Originally Posted by And.reg View Post
So, I just finished reading all of these posts on HDR, and how it applies to still pictures and video.

I guess from what I am reading, 4K at 30fps does some real-time image reduction (but not as much as true Smart HDR) to give something like an HDR effect, whereas 4K at 60fps has no dynamic range capabilities, and so... videos will look as if they were taken on the iPhone X?
Yup. 4K 60 fps XS footage will be equivalent to 4K 60 fps X footage, changes in lens or sensor quality notwithstanding.

My assumption is the HDR effect at 30 fps will be lesser in quality and/or intelligence than Smart HDR, otherwise Apple would have branded it “Smart HDR video” and told everyone at the keynote.

Instead, they called it something else and buried it on the camera page.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 23, 2018, 08:02 PM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:00 AM. )
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 23, 2018, 08:15 PM
 
I’m not understanding the math.

If one frame takes one trillion, then 30 takes 30 trillion.

Compensate for smaller frame (0.69x) and that’s ~ 20 trillion.
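Spelling the same arithmetic out, using only the numbers above:

```python
# Per-frame cost, scaled by frame rate and then by the smaller-frame factor.
per_frame = 1_000_000_000_000        # "one trillion" per frame
fps = 30
smaller_frame_factor = 0.69
total = per_frame * fps * smaller_frame_factor
print(f"~{total / 1e12:.1f} trillion")   # ~20.7 trillion
```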
( Last edited by subego; Sep 23, 2018 at 08:51 PM. )
     
ort888
Addicted to MacNN
Join Date: Feb 2001
Location: Your Anus
Status: Offline
Reply With Quote
Sep 24, 2018, 10:06 AM
 
Originally Posted by And.reg View Post
So they have new "Live" animations for the wallpapers.

How can I get BOTH the old X and the new XS Live wallpapers onto the new XS?
I dunno, but I was annoyed when the 8 didn't get the same live wallpapers the X did. Apple did that for no reason other than to differentiate the product tiers. The phones are exactly the same technically.

I was also annoyed that the 8 didn't have a flashlight button on the lock screen. Again, why? So stupid and frustrating.

My sig is 1 pixel too big.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 24, 2018, 11:12 AM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:00 AM. )
     
Brien
Professional Poster
Join Date: Jun 2002
Location: Southern California
Status: Offline
Reply With Quote
Sep 24, 2018, 04:19 PM
 
Am I the only one that thinks iOS 12 suggestions are the dumbest thing ever? I don’t need Siri to remind me to call someone or edit a photo.
     
andi*pandi  (op)
Moderator
Join Date: Jun 2000
Location: inside 128, north of 90
Status: Offline
Reply With Quote
Sep 24, 2018, 04:29 PM
 
I'd file that under annoying. I have enough things nagging me.

Now, an app that reminds me to send birthday cards in time to mail them... well that would be useful.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 24, 2018, 05:44 PM
 
Is a week enough lead-in?

If so, Birthday Calendar + does it.

Free to try, buck to ungimp.
     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Reply With Quote
Sep 24, 2018, 10:49 PM
 
Originally Posted by andi*pandi View Post
I'd file that under annoying. I have enough things nagging me.

Now, an app that reminds me to send birthday cards in time to mail them... well that would be useful.
This, and many other things.

Due:

https://itunes.apple.com/us/app/due-...rs/id390017969

-t
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 25, 2018, 10:21 AM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:00 AM. )
     
driven
Addicted to MacNN
Join Date: May 2001
Location: Atlanta, GA
Status: Offline
Reply With Quote
Sep 26, 2018, 05:23 PM
 
Originally Posted by reader50 View Post
I routinely spend over $1000 for a later computer. But I get a long life out of each upgrade. 5+ years every single time, with RAM and storage upgrades along the way. Graphics upgrades on the desktops, and sometimes even CPU upgrades.

But a smartphone does not behave like a pocket computer purchase. No internal upgrades, very limited external upgrades. And you're expected to pitch it every 1-3 years. Under those conditions, I'm having trouble justifying even a $500 purchase price.

If they were upgradeable even a little (storage and/or RAM) to extend their lifespan, the premium price would make a lot more sense.
SD slots on Android. You can expand the Note 9 up to 1TB of storage. (But it is also expensive as heck when you buy it.)
- MacBook Air M2 16GB / 512GB
- MacBook Pro 16" i9 2.4Ghz 32GB / 1TB
- MacBook Pro 15" i7 2.9Ghz 16GB / 512GB
- iMac i5 3.2Ghz 1TB
- G4 Cube 500Mhz / Shelf display unit / Museum display
     
Brien
Professional Poster
Join Date: Jun 2002
Location: Southern California
Status: Offline
Reply With Quote
Sep 26, 2018, 06:23 PM
 
I would never need 512GB. I could fill 64 if you gave me a couple years.
     
Thorzdad
Moderator
Join Date: Aug 2001
Location: Nobletucky
Status: Offline
Reply With Quote
Sep 26, 2018, 08:09 PM
 
Originally Posted by Brien View Post
I would never need 512GB. I could fill 64 if you gave me a couple years.
You obviously aren’t a grandparent. New grandchild pics will eat 64GB by the end of the week.
     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Reply With Quote
Sep 26, 2018, 08:39 PM
 
Originally Posted by Thorzdad View Post
You obviously aren’t a grandparent. New grandchild pics will eat 64GB by the end of the week.
Pron ?

-t
     
OreoCookie
Moderator
Join Date: May 2001
Location: Hilbert space
Status: Offline
Reply With Quote
Sep 26, 2018, 10:02 PM
 
Originally Posted by Thorzdad View Post
You obviously aren’t a grandparent. New grandchild pics will eat 64GB by the end of the week.
Or a parent for that matter
I don't suffer from insanity, I enjoy every minute of it.
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Sep 27, 2018, 04:37 AM
 
Originally Posted by OreoCookie View Post
@P
You bring up a lot of points, and instead of going point-by-point, let me try and zoom out a little.

I have never heard that out-of-order designs are more efficient. Can you link to any evidence, not a back-of-the-envelope calculation (which is more complicated anyway, because we would have to talk about implementations using fast vs. slow transistor types among many other factors)?
https://www.amazon.com/Inside-Machin.../dp/1593276680

Which was unfair, because that is a decently thick book and I don't remember page numbers or anything, but I will check it when I have time (this weekend?).

But more directly pertaining to the topic: Because Intel says so, and they have done more simulation on actual code than anyone.

https://newsroom.intel.com/news-rele...oarchitecture/

(Silvermont is the OoOE Atom core, and the previous ones were in-order. Silvermont even ditched HT, which is an efficiency boost that was on the in-order cores.)

I have more links on that, but I will have to dig them up this weekend.

ARM seems to agree (we can quibble about how ARM measures performance and what obvious and less obvious things they take into account), for example Peter Greenhalgh (the chief architect of the Cortex R4, A8, A5 and A53 as well as big.LITTLE) answered some comments on Anandtech (including one of mine):


There are more details in point 5 of this Q&A on ARM's website. Of course, you could now say that this guy didn't just drink the kool aid, he thought of the recipe before he drank it! But I think it explicitly supports my claim.
He did brew the Kool-Aid; those are all in-order cores.

I think that the key here is that Intel is good at making OoOE cores, and ARM is terrible at it. Their core efficiency has gone down since the A9. Just look at how Apple's cores from the Apple A7 onward absolutely crush everything ARM puts out. The A7 doesn't do that by using more power - hence, it has higher performance per watt.

No, the curve need not be higher for the little core here. The graph I have finally found makes that more explicit:

It shows that near the cross over point, the efficiency of the big and the LITTLE core are comparable, although I think I have seen actual measurements where the big core even had a very slight edge. But the other point is that the line of the big processor does not extend down to the very low frequencies nor does the LITTLE core reach the very high performance region. Put succinctly, because you have a narrower optimization window for each of the cores, i. e. you can optimize the big core for high frequencies (in terms of pipeline length, transistor types, etc. etc.) and the LITTLE core for energy efficiency, you get a SoC that is more energy efficient than if you used just one core to span the whole gamut.
But the whole thing falls on the fact that the Cortex A15 was a terribly inefficient core. It was designed for microservers, not mobile phones. Apple A7 was easily twice as efficient if not more. ARM invented big.LITTLE to save their terrible core design.

Let me take another example: The iPhone 4 used a single Cortex A8 core. The iPhone 4S used two Cortex A9 cores. The clockspeeds were the same (800 MHz), the process was the same, and the cooling capacity and battery life were the same. The performance and power consumption?

https://www.anandtech.com/show/4971/...-att-verizon/4
https://www.anandtech.com/show/4971/...att-verizon/15

So the Cortex A9 - with everything else being equal - had better performance per watt than the Cortex A8. The Cortex A7 is essentially identical to a Cortex A8 (the A7 is optimized for core size, and loses some of the dual-issue capability. This makes it slightly less efficient, but it doesn't matter much in practice). Therefore, if you put an A9 in your graph below, it would be better than both the A7 and the A15.

I think that this is a fair comparison - the A8 is a reasonable in-order design that was widely used and forms the basis for everything ARM did after. The A15, by comparison, is an abandoned failure of a design. It's like bringing up Itanium (Merced) as an example for how in-order cores are terribly inefficient - even Pentium 4 Prescott is a wonder of efficiency compared to that one.

Now you could ask yourself whether you could do a big.BIG core SoC (just like the 2 x 4 x Cortex A53 SoCs that were around for a while with 4 energy efficient A53s and 4 higher clocked A53s) and get something that is more efficient? One counterargument here would be die area, it'd be quite wasteful, but strictly speaking that wasn't my claim. (Even though it is certainly a consideration in SoC design even if money is no object.)
nVidia did this with multiple A9 cores at one point. Performed quite decently, I believe.

I don't really know how the schedulers figure this stuff out, but here is the relevant info straight from the horse's mouth.

Apple did something similar when it explained where some of its iOS 12 performance benefits come from. Among other things, the OS would ramp up CPU frequencies much more aggressively and ramp them down more quickly.
But that is not the same thing. If you base the decision on load, it is a way to eke out maximum performance, but not maximum efficiency. If you want maximum efficiency, you need to look at the code coming in and decide whether it runs efficiently on a (fairly narrow) in-order core. I don't see how you can determine that on the fly.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 27, 2018, 09:27 AM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:01 AM. )
     
andi*pandi  (op)
Moderator
Join Date: Jun 2000
Location: inside 128, north of 90
Status: Offline
Reply With Quote
Sep 27, 2018, 12:51 PM
 
all the $%^&*( live action photos take up space too. I wish I could shut that off.
     
turtle777
Clinically Insane
Join Date: Jun 2001
Location: planning a comeback !
Status: Offline
Reply With Quote
Sep 27, 2018, 01:10 PM
 
Originally Posted by andi*pandi View Post
all the $%^&*( live action photos take up space too. I wish I could shut that off.
I think you can.

Go in the Camera Settings - Preserve Settings - turn off Live Photos.

Now turn it off in the camera app, and it should remember that you turned it off.

-t
     
Laminar
Posting Junkie
Join Date: Apr 2007
Location: Iowa, how long can this be? Does it really ruin the left column spacing?
Status: Offline
Reply With Quote
Sep 27, 2018, 04:57 PM
 
Originally Posted by andi*pandi View Post
all the $%^&*( live action photos take up space too. I wish I could shut that off.
I straight up love live photos of the kids. It's a mini video of what they're doing, and it's fun seeing them wiggle or pose for the actual shot. For everything else, live is worthless.
     
sek929
Posting Junkie
Join Date: Nov 1999
Location: Cape Cod, MA
Status: Offline
Reply With Quote
Sep 27, 2018, 05:35 PM
 
I mean yeah, taking pictures and videos of the little squirt is gonna pile the data on your phone. There are like, a billion easy/cheap/free* storage solutions to free up space, do we really need every picture and video of our kids directly on the phone at all times? I can find an old photo on Prime Photos pretty easily, it's no issue.
     
Brien
Professional Poster
Join Date: Jun 2002
Location: Southern California
Status: Offline
Reply With Quote
Sep 27, 2018, 08:02 PM
 
I had 90GB free on my 7 Plus (of 128), so we shall see if I regret saving the $150 down the road.
     
OreoCookie
Moderator
Join Date: May 2001
Location: Hilbert space
Status: Offline
Reply With Quote
Sep 27, 2018, 08:18 PM
 
Originally Posted by P View Post
https://www.amazon.com/Inside-Machin.../dp/1593276680

Which was unfair, because that is a decently thick book and I don't remember page numbers or anything, but I will check it when I have time (this weekend?).
Please do, because I'd like to understand your argument better.
Originally Posted by P View Post
But more directly pertaining to the topic: Because Intel says so, and they have done more simulation on actual code that anyone.

https://newsroom.intel.com/news-rele...oarchitecture/

(Silvermount is the OoOE Atom core, and the previous ones were in-order. Silvermont even ditched HT, which is an efficiency boost that was on the in-order cores)

I think that they key here is that Intel is good at making OoOE cores, and ARM is terrible at it.
You can easily turn that around: perhaps Intel is terrible at making in-order cores, because it hadn't made a modern in-order core since the original Pentium until it brought its abysmal Atom processors to market? ARM, on the other hand, had had decades of experience wringing every bit of performance from in-order designs. Secondly, and more importantly, even if you compensate for the process disadvantage, Intel does not make the most energy efficient cores at the moment, and I wouldn't use them as the ultimate benchmark here.

In fact, I reckon that both of these statements are true to some degree, ARM has more expertise designing in-order cores and Intel has (had?) more expertise designing OoO cores. Perhaps this is why Intel's (“good”) OoO designs beat Intel's (“bad”) in-order designs? (Of course, you could turn the argument around for ARM.) But the argument that OoO cores need more logic and are therefore less energy efficient is relatively simple and straightforward.

To me the best argument, though, is that the company that arguably makes the best mobile CPU designs, Apple, adopted big.LITTLE. Of course, we do not know whether their small cores are in-order or OoO, but given the large difference in die size, it seems more plausible to me that these are indeed simpler, in-order designs.
Originally Posted by P View Post
Their core efficiency has gone down since the A9. Just look at how Apple's cores from the Apple A7 on absolutely crushes everything ARM puts out. The A7 doesn't do that by using more power - hence, it has higher performance per watt.
But Apple adopted their own version of big.LITTLE as well in later designs. And judging by the size, it seems quite plausible that the smaller, more energy efficient cores are in-order designs.
Originally Posted by P View Post
But the whole thing falls on the fact that the Cortex A15 was a terribly inefficient core. It was designed for microservers, not mobile phones.
No, it does not: big.LITTLE has nothing to do with those two specific cores, it is a technology that is implemented for all of ARM's subsequent cores — and still works the same way. The underlying principle is exactly the same, no matter if we are talking about Apple's big and small cores, Cortex A76 and Cortex A55 or Cortex A15 and A7. I only included a graph on the A15 because that is what I found. If you'd like, I can remove the labels and replace them with whatever core names you'd like. I could even give the big core an efficiency boost at low work loads compared to the small cores doing the same work. It wouldn't change the overall story.
Originally Posted by P View Post
Apple A7 was easily twice as efficient if not more. ARM invented big.LITTLE to save their terrible core design.
Yet Apple thought it was a good idea to adopt big.LITTLE starting with the A10, after sticking to fewer, big-core designs for years. Why? How does that jibe with your argument?
Originally Posted by P View Post
But that is not the same thing. If you base the count on load, it is a way to eke out maximum performance, but not maximum efficiency. If you want to have maximum efficiency, you need to look at the code coming in and decide if it runs efficiently on a (fairly narrow) in-order core. I don't see how you can determine that on the fly.
I'm saying kernel schedulers for heterogeneous multiprocessing systems exist and indeed run on hundreds of millions of devices today. This is not fictional technology we are talking about. And it seems to me the end result is indeed more energy efficient.


By the way, if I were to try to counter my own arguments here, I would start with the following points:

(1) big.LITTLE as a concept (ARM's latest iteration, DynamIQ, is much more flexible) does not require you to pair a small in-order core with a big out-of-order core. You could use two in-order cores (which has in fact been done already) or two OoO cores (I am not sure whether this exists). In fact, ARM's DynamIQ is more flexible still, as you could have more than two types of cores.

(2) If you want to improve the efficiency of a chip with the help of heterogeneous multiprocessing, you don't need to use the most efficient design for each type of core. Indeed, you can optimize the big core (which has to be OoO these days) for higher performance, and other factors (e. g. die area) may influence whether you go in-order or OoO for the small cores. The result is nevertheless more energy efficient than a single (necessarily OoO) core, because you can optimize each of the two types of cores for a narrower power and frequency target window. But that does not say that in-order is more efficient than OoO, rather that big.LITTLE (where the little core may coincidentally be an in-order core) is more efficient than using a single core design (which is necessarily OoO).

That would make some of my arguments equivocal. How is that?
( Last edited by OreoCookie; Sep 27, 2018 at 08:43 PM. )
I don't suffer from insanity, I enjoy every minute of it.
     
reader50
Administrator
Join Date: Jun 2000
Location: California
Status: Offline
Reply With Quote
Sep 27, 2018, 09:08 PM
 
A thought for the Oreo / P discussion. It's interesting of course, but unless I've missed it, the discussion has been almost entirely about efficiency per watt. This is a means to an end - longer battery life. And/or lower phone weight or thickness. Efficiency is critically important to the BIG core, but actual power draw may become the dominant consideration for a little core.

Say a BIG core can achieve 100% of performance at 2 watts, and scales down to 10% performance at 0.15W. But the power draw stops going down with still-lower loads. While a small core can do 1% of max phone performance at 0.05W, and lower still at negligible load.

In this made-up example, the small core is perhaps 33% the efficiency of the BIG core (in its normal operating range), and substantially worse at near-zero load.

However, you would switch to the small core anyway when the phone is sleeping or just waiting for user input, because it reduces CPU power draw by 0.1W when idle and may buy you an extra 10-60 minutes of battery life.
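A quick pass over those made-up numbers, treating efficiency as nothing more than performance per watt:

```python
# reader50's invented operating points: (fraction of max performance, watts).
big_full  = (1.00, 2.00)
big_low   = (0.10, 0.15)
small_low = (0.01, 0.05)

for name, (perf, watts) in (("BIG @ full", big_full),
                            ("BIG @ low", big_low),
                            ("small @ low", small_low)):
    print(f"{name}: {perf / watts:.2f} performance units per watt")

# What the small core actually buys you: lower absolute draw when idling.
print(f"idle savings: {big_low[1] - small_low[1]:.2f} W")
```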
     
OreoCookie
Moderator
Join Date: May 2001
Location: Hilbert space
Status: Offline
Reply With Quote
Sep 27, 2018, 09:45 PM
 
Originally Posted by reader50 View Post
Say a BIG core can achieve 100% of performance at 2 watts, and scales down to 10% performance at 0.15W. But the power draw stops going down with still-lower loads. While a small core can do 1% of max phone performance at 0.05W, and lower still at negligible load.
Yes, and I think this is ARM's argument (that at least Apple also subscribes to) and is encapsulated in this graph:


ARM claims that if you plot power-to-performance, you are better off with two different cores rather than just one because two cores allow you to have lower power at a given performance. The optimization point here is not full load of either the small or the big cores, but you look at the curve as a whole — especially at points of partial load.

And I would add that at partial load where you have lots of “bubbles” of no instructions in your instruction stream, you benefit much less from OoO execution.

But reviewing this discussion, I think I should have more carefully distinguished between the benefits of heterogeneous multiprocessing and the claim that it is better to have in-order small cores in a heterogeneous multiprocessing chip.
Originally Posted by reader50 View Post
In this made-up example, the small core is perhaps 33% the efficiency of the BIG core (in its normal operating range), and substantially worse at near-zero load.

However, you would switch to the small core anyway. When the phone is sleeping, or just waiting for user input. Because it reduces CPU power draw by .1W when idle, and may buy you an extra 10-60 minutes of battery life.
I think that's right, and the simpler a design is, the fewer units need to be powered up. In the present context that'd be all the logic that decides how to re-order instructions (e. g. branch predictors and the associated buffers).
I don't suffer from insanity, I enjoy every minute of it.
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Sep 28, 2018, 05:04 AM
 
Originally Posted by OreoCookie View Post
You can easily turn that around: perhaps Intel is terrible at making in-order cores, because it hadn't made a modern in-order core since the original Pentium until it brought its abysmal Atom processors to market?
It has made in-order designs. All of the Itanium cores are in-order, and they've changed quite a bit over the years. They have also made Larrabee in several generations (Knights Landing etc). Also, Atom has been a going thing since 2009. They didn't just drop one version and leave.

And Atom wasn't terrible - in fact, it was faster than anything ARM put out in 2009. It just wasn't enough faster for anyone to bother making a phone that used it when all of Android runs on ARM code.

ARM on the other hand had had decades of experiences, wringing every bit of performance from in-order designs?
Yet their first big OoOE design beats it on efficiency.

Secondly, and more importantly, even if you compensate for the process disadvantage, Intel does not make the most energy efficient cores at the moment and I wouldn't use them as the ultimate benchmark here.
At extremely low power states, the x86 ISA hurts them, but they're still more efficient than anything ARM puts out.

But the reason I'm using them for an argument is that Intel is well known for simulating real world code for every change they do, and basing their designs on it. Their simulations are second to none.

In fact, I reckon that both of these statements are true to some degree, ARM has more expertise designing in-order cores and Intel has (had?) more expertise designing OoO cores. Perhaps this is why Intel's (“good”) OoO designs beat Intel's (“bad”) in-order designs? (Of course, you could turn the argument around for ARM.) But the argument that OoO cores need more logic and are therefore less energy efficient is relatively simple and straightforward.
I think that the key is that if you design an OoOE core for efficiency, it will beat an in-order core designed for efficiency - as shown by the Cortex A9 beating the Cortex A8. What has happened with ARM is that they thought the A9 was good enough for phones for a while, and designed the A15 to try to beat their way into low-end servers. Then Apple came along with the Apple A6 and A7, and the Cortex A9 couldn't compete, so mobile phone manufacturers put the A15 there and damn the battery life. big.LITTLE was an emergency fix for that (if you look at the frantic pace of ARM's Linux patches to support it, you can see just how much of a last-minute fix it was), and then ARM just ran with it. The A57 was in progress before - 64-bit was supposed to be for servers as well, of course - and the A53 was designed to work with it. It is only later, with the A73, that ARM finally went back to the A9 design (it had been lightly updated as the A12 and A17, but received little marketing and no 64-bit support).

And why is it more efficient? Because the OoOE core will finish the task quicker, dropping into sleep while the in-order core is still working.
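A hedged sketch of that race-to-sleep argument, with invented numbers: energy is power times time, so the hungrier core can still come out ahead if it finishes early enough and sleeps for the rest of the accounting window:

```python
# Race to sleep: compare total energy over a fixed window for a fast, hungry
# core versus a slower, leaner one. All numbers are invented for illustration.
WORK = 100.0         # arbitrary units of work to get done
SLEEP_POWER = 0.01   # watts while sleeping
WINDOW = 50.0        # seconds we account over

def energy(perf, active_power):
    active_time = WORK / perf
    return active_time * active_power + (WINDOW - active_time) * SLEEP_POWER

ooo_energy      = energy(perf=4.0, active_power=1.0)   # fast OoOE core
in_order_energy = energy(perf=2.0, active_power=0.6)   # slower in-order core

print(f"OoOE core:     {ooo_energy:.2f} J over the window")
print(f"in-order core: {in_order_energy:.2f} J over the window")
```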

To me the best argument, though, is that the company that arguably makes the best mobile CPU designs, Apple, adopted big.LITTLE. Of course, we do not know whether their small cores are in-order or OoO, but given the large difference in die size, it seems more plausible to me that these are indeed simpler, in-order designs.
I think it is fairly clear that the small cores of A10 are dual-issue in-order cores, like the Cortex A8/A7 and original Atom.

But Apple adopted their own version of big.LITTLE as well in later designs. And judging by the size, it seems quite plausible that the smaller, more energy efficient cores are in-order designs.
They did, but long after anyone else, and the Apple A9 - which isn't big.LITTLE - still smashes designs based on multiple in-order cores on efficiency.

No, it does not: big.LITTLE has nothing to do with those two specific cores, it is a technology that is implemented for all of ARM's subsequent cores — and still works the same way. The underlying principle is exactly the same, no matter if we are talking about Apple's big and small cores, Cortex A76 and Cortex A55 or Cortex A15 and A7. I only included a graph on the A15 because that is what I found. If you'd like, I can remove the labels and replace them with whatever core names you'd like. I could even give the big core an efficiency boost at low work loads compared to the small cores doing the same work. It wouldn't change the overall story.
ARM tries to make it out as if big.LITTLE was always the plan, and it wasn't. It was a reaction to Apple's rise pushing Android manufacturers to use chips meant for microservers. The reason the A57 was used was that it was the only 64-bit design of decent performance that ARM had, and you needed 64-bit if you were going to pretend to compete with the Apple A7.

Yet Apple thought it was a good idea to adopt big.LITTLE starting with the A10 after sticking to fewer, big core designs for years. Why? How does that jive with your argument?
This is exactly the core of my argument, where we started: the in-order cores are used for specific tasks, such as machine learning, the photo manipulation tasks, and AR. They don't affect the performance or battery life one iota if I don't have or use those features. They are halfway between the big OoOE cores and the GPU shaders. This means that the improvements to the cores after the Apple A9 are irrelevant to the task of making a webpage render.

(They are also used for background tasks where the energy cost of actually waking the core is a significant part of the cost because the task is over so quickly. That is a separate thing and indeed it is a gain for general workloads, but my battery life is fine if I leave my phone doing background tasks. It is when I'm using it too much that the battery drops, and the in-order cores don't affect that.)

I'm saying kernel schedulers for heterogeneous multiprocessing systems exist today and indeed run on hundreds of millions of devices today. This is not fictional technology we are talking about. And it seems to me the end result is indeed more energy efficient.
Than only using a big A15 or A57 core for everything? I'm sure it is. Those aren't efficient cores. I don't buy that it is more efficient than having only a good OoOE core.

But this line of reasoning isn't fruitful on its own. My argument is that in-order cores are only more efficient on specialized tasks. If they are, heterogeneous schedulers need to look at the task to see if they're more efficient. Your argument is that they're more efficient period unless loads get very high, in which case of course a system based on load balancing can work.

By the way, if I were to try to counter my own arguments here, I would add start with the following points:

(1) big.LITTLE as a concept (ARM's latest iteration DynamIQ is much more flexible) does not require you to pair a small in- with a big out-of-order core. You could use two in-order cores (which has in fact been done already) or two OoO cores (I am not sure whether this exists). In fact, ARM's DynamIQ is more flexible as you could have more than two types of cores.

(2) If you want to improve the efficiency of a chip with the help of heterogeneous multiprocessing, you don't need to use the most efficient designs for each type of core. Indeed, you can optimize the big core (which has to be OoO these days) for higher performance and other factors (e. g. die area) may influence whether you go in-order or OoO for the small cores. The result is nevertheless more energy efficient than a single (necessarily OoO core) because you can optimize each of the two types of cores for a narrower power and frequency target window. But that does not say that in-order is more efficient than OoO, rather that big.LITTLE (where the little core may be coincidentally an in-order core) is more efficient than using a single core design (that is necessarily OoO).

That would make some of my arguments equivocal. How is that?
There is an efficiency argument for heterogeneous multiprocessing in the optimization targets. If you have more than one core design, you can optimize them for different tasks, and potentially be more efficient. I'm not arguing against that. I can even see that if you already have big.LITTLE and in-order cores there anyway, it might make sense to make the OoOE core wider and higher performing and less efficient and let the decent in-order core work a little further up the stack. I'm arguing against the idea that in-order cores are more efficient for general computing tasks. They aren't. They're more efficient on specialized tasks, and they can be more efficient for short-term tasks that aren't latency critical. They're less efficient on general computing tasks.

I see in-order cores as halfway markers between OoOE cores and GPUs, and in fact, that was what Larrabee was supposed to be - in-order x86 cores used for a GPU. My vision of future heterogeneous designs is general-purpose OoOE cores combined with a large array of in-order vector cores that also handle graphics. Having three types of cores with unequal access to caches, like we do right now, is a step on the way to that.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Sep 28, 2018, 06:03 AM
 
Originally Posted by reader50 View Post
A thought for the Oreo / P discussion. It's interesting of course, but unless I've missed it, the discussion has been almost entirely about efficiency per watt. This is a means to an end - longer battery life. And/or lower phone weight or thickness. Efficiency is critically important to the BIG core, but actual power draw may become the dominant consideration for a little core.

Say a BIG core can achieve 100% of performance at 2 watts, and scales down to 10% performance at 0.15W. But the power draw stops going down with still-lower loads. While a small core can do 1% of max phone performance at 0.05W, and lower still at negligible load.

In this made-up example, the small core is perhaps 33% the efficiency of the BIG core (in its normal operating range), and substantially worse at near-zero load.

However, you would switch to the small core anyway. When the phone is sleeping, or just waiting for user input. Because it reduces CPU power draw by .1W when idle, and may buy you an extra 10-60 minutes of battery life.
Yes, this is the background processing scenario. It absolutely happens, and Apple has done something like that for a long time with the M-series helper chips (M7 in the iPhone 5s). The in-order cores are a slightly more generic version of the same thing.
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
OreoCookie
Moderator
Join Date: May 2001
Location: Hilbert space
Status: Offline
Reply With Quote
Sep 28, 2018, 10:15 AM
 
Originally Posted by P View Post
It has made in-order designs. All of the Itanium cores are in-order, and they've changed quite a bit over the years. They have also made Larrabee in several generations (Knights Landing etc). Also, Atom has been a going thing since 2009. They didn't just drop one version and leave.
I know Intel has made in-order designs, but none of them were very successful and none of them were particularly energy efficient.

- The Itanium was an implementation of the VLIW concept, and while technically in-order, it is a realization of a very different CPU architecture than what we are talking about here. And the Itanic has sunk.
- The Atom was never very powerful nor very energy efficient. We can argue why, but I don't think those were good designs. For a long time, the Atom was produced in older, sometimes ancient process nodes and Intel didn't give the core a lot of love.
- Intel's GPU/many-core CPU products were initially based on the P54 cores (an incarnation of the original Pentium, but equipped with modern AVX units if memory serves) and later used Atom cores, I think. Again, not power efficient nor clearly better than comparable ARM-based server SoCs. As far as I know this product line is on life support (Knight's Hill was canceled and Intel apparently plans to release the next version in 3-4 years).
- Intel also made the Quark core, which was supposed to be a step below the Atom. I think Intel sold dev boards for a while and then killed the product.

So while you are right that Intel has made a few in-order cores, arguably none of them were any good compared to the competition.
Originally Posted by P View Post
Yet their first big OoOE design beats it on efficiency.
Can you provide evidence for this? Something other than a back-of-the-envelope computation, and something that takes into account that we are also specifically including scenarios where the system is loaded only partially?
Originally Posted by P View Post
But the reason I'm using them for an argument is that Intel is well known for simulating real world code for every change they do, and basing their designs on it. Their simulations are second to none.
Even if Intel's simulations are better than those of ARM (I can't judge), I'd also say that the simulated workloads and optimization priorities are likely very different. For several generations Intel prioritized absolute performance over efficiency (i. e. for a gain of 1 % in performance, they accepted a >1 % increase in energy consumption), something they reversed a few years ago. Moreover, the simulated workloads of something that was initially supposed to be an embedded CPU are probably very different from workloads typical of desktop OS usage, server workloads and scientific computations.

So even if Intel and ARM were equally good at simulating real world code, they'd still end up in different places. Their mediocre Cortex A9 was a clear product of that: compared to the cores Apple had in the A6, that thing was slow.
Originally Posted by P View Post
I think it is fairly clear that the small cores of A10 are dual-issue in-order cores, like the Cortex A8/A7 and original Atom.
I agree that this is likely what is going on, I just wanted to be extra cautious here for the purpose of our discussion. Because unfortunately, Anand and Brian Klug now work for Apple, and I don't have independent verification.

So let's assume we are right. Then why do you think Apple went for in-order small cores if you think that as a matter of principle OoO cores are more efficient?
Originally Posted by P View Post
ARM tries to make it out as if big.LITTLE was always the plan, and it wasn't. It was a reaction to Apple's rise pushing Android manufacturers to use chips meant for microservers.
It took ARM several years to properly understand the different demands placed on CPU cores by “general purpose computer workloads”, i. e. workloads that resemble those of regular PCs more and more, and to adapt their road map accordingly. We are not privy to ARM's internal discussions, so big.LITTLE was probably part “do the best with what we have” and part a conviction that heterogeneous multiprocessing is inherently more energy efficient.
Originally Posted by P View Post
This is exactly the core of my argument, where we started: The in-order cores are used for specific tasks, such as machine learning, the photo manipulation tasks and AR. They don't affect the performance or battery life one iota if I don't have or use those features. They are half way between the big OoOE cores and the GPU shaders. This means that the improvements to the cores after the Apple A9 are irrelevant to the task of making a webpage render.
First of all, I don't think iOS reserves the small cores for special tasks. Perhaps you can pin certain processes to certain core types in iOS (at least indirectly by CPU core numbering convention). In any case, Android certainly does not work this way.
Originally Posted by P View Post
(They are also used for background tasks where the energy cost of actually waking the core is a significant part of the cost because the task is over so quickly. That is a separate thing and indeed it is a gain for general workloads, […])
That is why I think in-order cores are more energy efficient: you have fewer functional units you need to wake up or keep powered permanently, and you gain less from OoO because your pipelines are not constantly filled anyway.
Originally Posted by P View Post
Than only using a big A15 or A57 core for everything? I'm sure it is. Those aren't efficient cores. I don't buy that it is more efficient than having only a good OoOE core.
iOS has done the same since at least the A11*, and those are damn good, fast OoO cores. So again, the story is not about ARM's lackluster OoO designs: Apple does the same, and Apple is arguably leading the world in mobile CPU development at the moment, ahead of Intel and ahead of ARM.

* I am not sure whether process migration in the A10 worked on the kernel or SoC level. So perhaps we should include the A10 as well.
Originally Posted by P View Post
But this line of reasoning isn't fruitful on its own. My argument is that in-order cores are only more efficient on specialized tasks. If they are, heterogeneous schedulers need to look at the task to see if they're more efficient. Your argument is that they're more efficient period unless loads get very high, in which case of course a system based on load balancing can work.
I don't think schedulers need to distribute processes to cores according to what core would be more suited to a particular workload. It'd be nice if you could, but I haven't heard that this is what iOS, Windows and Linux currently do (we mustn't forget about WoA!) nor that this is even feasible. You are certainly not only benefitting from the smaller cores with special workloads.

I think kernel-level heterogeneous multiprocessing is more energy efficient than just having one type of core, though.
Originally Posted by P View Post
My vision of future heterogeneous designs is general purpose OoOE cores combined with a large array on in-order vector cores that also handle graphics. Having three types of cores with unequal access to caches, like we do right now, is a step on the way to that.
There are already >6 types of cores on modern smartphone SoCs: two types of CPU cores, GPU cores, a DSP, an ISP and an accelerator for neural networks. If you use CoreML, then the API will dispatch things to the CPUs, the GPU and the Neural Engine (if you have one). We can argue whether things like the secure enclave and the Mx-type cores also count. Clearly the future is already here
I don't suffer from insanity, I enjoy every minute of it.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 28, 2018, 01:38 PM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:01 AM. )
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 28, 2018, 02:18 PM
 
First time I ever set up a phone over iCloud rather than plugging it in.

Holy shit... 10 minutes versus 2 hours.

Lightning is such a dog.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 28, 2018, 04:59 PM
 
My review after using it for a couple hours is it’s a refined X. Feels better in the hand, FaceID seems snappier, UI feels a touch smoother, etc., etc.
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 29, 2018, 02:17 PM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:01 AM. )
     
Ham Sandwich
Guest
Status:
Reply With Quote
Sep 29, 2018, 06:09 PM
 
[...deleted...]
( Last edited by Ham Sandwich; Apr 23, 2020 at 10:01 AM. )
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Sep 29, 2018, 07:33 PM
 
It is, but I ultimately prefer the brightness of the 4K.
     
Brien
Professional Poster
Join Date: Jun 2002
Location: Southern California
Status: Offline
Reply With Quote
Sep 30, 2018, 01:02 PM
 
Originally Posted by And.reg View Post
So...
Rumor has it that iPhone XS supports 10W wireless charging.
Is that already enabled by default in iOS 12, or do we need to wait for a software update before buying a 10W charger to get all 10 watts from it?

Was thinking of getting this one:

https://www.amazon.com/Baseus-Wirele...dp/B07CV9C5Z1/
I think you are good to go.
     
mindwaves
Registered User
Join Date: Sep 2000
Location: Irvine, CA
Status: Offline
Reply With Quote
Oct 1, 2018, 12:39 AM
 
I've been using the XS for about two weeks now and absolutely love it. It is leaps and bounds better than my 7 for almost everything I need it to do. Switching apps is smoother than water, Memoji is a lot of fun, and the camera is OK also.

About Apple Pay: with Face ID, it is actually faster than Touch ID when making a payment, partly because with Touch ID, it has to detect the NFC signal first before doing anything. Even though this is a fast process, it is not needed with Face ID. However, Touch ID requires no external input to launch, as opposed to Face ID, which requires a double click of the side button.

Battery life seems good also. I use a black background for my home screen because OLED.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Oct 1, 2018, 01:55 AM
 
Oh, good!

I was worried you may not be as thrilled as one should be after dropping that kind of coin.
     
mindwaves
Registered User
Join Date: Sep 2000
Location: Irvine, CA
Status: Offline
Reply With Quote
Oct 1, 2018, 10:53 AM
 
Well, not sure it is worth the $999 + tax I paid for it, but I'm sure happy with it.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Oct 1, 2018, 11:37 AM
 
I can’t really see how anyone wouldn’t like it, except for FaceID, which is polarizing.
     
Chongo
Addicted to MacNN
Join Date: Aug 2007
Location: Phoenix, Arizona
Status: Offline
Reply With Quote
Oct 1, 2018, 11:48 AM
 
I am going to get the XS Max after I finish paying off a vet bill. My 6S Plus is the 64GB model; I don't know if I will stay with that or move to the half-terabyte XS Max.
45/47
     
Laminar
Posting Junkie
Join Date: Apr 2007
Location: Iowa, how long can this be? Does it really ruin the left column spacing?
Status: Offline
Reply With Quote
Oct 1, 2018, 02:28 PM
 
Memoji is fun, made one for myself and the wife. She didn't like the sideswept bangs so I had to change that.

     
P
Moderator
Join Date: Apr 2000
Location: Gothenburg, Sweden
Status: Offline
Reply With Quote
Oct 1, 2018, 04:12 PM
 
Originally Posted by OreoCookie View Post
- The Atom was never very powerful nor very energy efficient. We can argue why, but I don't think those were good designs. For a long time, the Atom was produced in older, sometimes ancient process nodes and Intel didn't give the core a lot of love.
But it was powerful when it launched:

https://www.anandtech.com/show/5365/...or-smartphones

And while it was indeed one notch behind the bleeding edge at 32nm, it wasn't crazy behind. The issue is that it wasn't enough better to displace ARM.

- Intel's GPU/mass CPU product were initially based on the P54 cores (an incarnation of the original Pentium, but equipped with modern AVX units if memory serves) and later used Atom cores, I think. Again, not power efficient nor clearly better than comparable ARM-based server SoCs. As far as I know this product line is on life support (Knight's Hill was canceled and Intel apparently plans to release the next version in 3-4 years).
P54C is the Pentium with MMX, and yes, the later ones were based on Atom, but by then Atom had gone OoOE.

- Intel also made the Quark core, that was supposed to be a step below the Atom. I think Intel sold dev boards for a while and then killed the product.
It was a 486 core, more or less unchanged; it wasn't even dual-issue. I think they meant it as a hedge against the Raspberry Pi becoming a monster platform.

So while you are right that Intel has made a few in-order cores, arguably none of them were any good compared to the competition.
The original Atom for phones (Medfield) was better than anything ARM had at the time, but at some point, making an efficient in-order core is a solved problem. Remember - according to your bench early on, the Cortex A7 was more efficient than what came later. Intel moved on to making an efficient out-of-order core in Silvermont, while ARM moved to making a slightly less efficient (than A7), but more powerful, in-order core in the A53.

Can you provide evidence for this? Something other than a back-of-the-envelope computation, and something that takes into account that we are also specifically including scenarios where the system is loaded only partially?
I was referring to the Cortex-A9 versus Cortex-A8 scenario. It's pretty clear in that case that the A9 was more efficient. The Apple A5 was way more than twice as powerful as the A4 (partly, of course, because it went dual-core), on the same process, and it didn't cut battery life in half. I linked to the power consumption stats further up.

What is a power consumption test that includes periods of low activity - isn't that just a battery life test with mixed usage? Because that is the only case where the in-order core can gain something back - short bursts of non-latency-critical load that require less than the performance available at the lowest clock.

And if that low-load scenario is the key, why include so many cores? If it is a low-load scenario, include one - or two, if you want to implement them as power states of the main cores - but why more?

Even if Intel's simulations are better than those of ARM (I can't judge), I'd also say that the simulated workloads and optimization priorities are likely very different. For several generations Intel prioritized absolute performance higher than efficiency (i. e. for a gain of 1 % in performance, they accepted >1 % increase in energy consumption), something they have reversed a few years ago.
2004, so more than a few at this point. (2004 was the time when Intel introduced the "1% higher power consumption requires at least 2% higher performance" as a design guideline).
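That guideline is easy to state as a rule of thumb; a tiny sketch with invented numbers, just to make the ratio explicit:

```python
# Sketch of the design rule P describes: a feature is worth adding only if it
# buys at least 2% more performance for every 1% more power it costs.
def accept_feature(perf_gain_pct, power_cost_pct):
    return perf_gain_pct >= 2.0 * power_cost_pct

print(accept_feature(perf_gain_pct=3.0, power_cost_pct=1.0))  # True: worth it
print(accept_feature(perf_gain_pct=3.0, power_cost_pct=2.5))  # False: rejected
```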

Moreover, the simulated workloads of something that was initially supposed to be an embedded CPU are probably very different from workloads typical of everything from desktop OS usage to server workloads and scientific computations.
Well, then Intel's workloads match the actual case better, don't they? All phones today run modified desktop OSes.

So even if Intel and ARM were equally good at simulating real-world code, they'd still end up in different places. Their mediocre Cortex A9 was a clear product of that; compared to the cores Apple had in the A6, that thing was slow.
That mediocre Cortex A9 was still more efficient than anything they had made before - and, I would argue, more efficient than anything that came after. It was also a massive performance improvement over the A8 it replaced. It just didn't have the raw performance of the much wider designs that came after. Adjusted for process node and clock speed, I don't know that it was less efficient than the Apple A6. If we take Apple's word for it that the A6 was twice as fast as their own A5, the Cortex-A9 would need to clock at about 1.6 GHz to keep up with the 1.3 GHz Apple A6. That it can easily do, if run on the same process. Will it use more power doing so? I don't know, but I would say that it is probably fairly even - and the A9 would certainly use less power when it wasn't clocking as high.
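
Spelling that arithmetic out (assuming the roughly 800 MHz Cortex-A9 of the phone-class A5 as the baseline and perfect scaling of performance with clock - both simplifications):
$$f_{\text{A9}} \approx 2 \times 0.8\ \text{GHz} = 1.6\ \text{GHz}.$$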

So let's assume you are right. Then why do you think Apple went for in-order small cores if you think that as a matter of principle OoO cores are more efficient?
My point is that a well-designed in-order core with a focus on efficiency to the exclusion of all else will be less efficient than a well-designed OoOE core with a focus on efficiency to the exclusion of all else, when running general computing tasks. Just want to make that clear. You can certainly make an OoOE core that is inefficient but powerful.

And I think that Apple included those in-order cores for specific tasks that are run on programmable GPUs on desktops - like their image manipulation stuff. Apple's GPUs aren't very programmable right now (or very powerful).

First of all, I don't think iOS reserves the small cores for special tasks. Perhaps you can pin certain processes to certain core types in iOS (at least indirectly by CPU core numbering convention). In any case, Android certainly does not work this way.
I don't think you can send your task to a specific core at all, can you? I think that if you just send a task to be executed, it goes to the OoOE cores if they're awake and maybe to one of the weak cores if they're not. I think that tasks get sent to the array of in-order cores when you call a specific API that is programmed to make use of them.
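
For what it's worth, the closest thing to steering work that I know of on iOS is the quality-of-service hint on GCD queues. The scheduler, not the developer, decides where things actually run, and Apple doesn't document the mapping from QoS to core clusters, so treat this as a minimal sketch rather than a guarantee:
Code:
import Dispatch

// Background QoS: a hint that this work is not latency-critical.
// On big.LITTLE-style SoCs the scheduler may keep it on the small cores,
// but the API does not promise any particular placement.
DispatchQueue.global(qos: .background).async {
    let checksum = (0..<1_000_000).reduce(0, &+)   // throwaway bulk work
    print("done, checksum = \(checksum)")
}

// User-interactive QoS: a hint that the result is needed right now,
// which in practice tends to end up on the fast cores.
DispatchQueue.global(qos: .userInteractive).async {
    print("latency-critical work")
}
Whether the A12's little cores actually pick up the low-QoS work is exactly the open question - Apple documents the QoS semantics, not the core assignment.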

That is why I think in-order cores are more energy efficient: you have fewer functional units you need to wake up or keep powered permanently, and you gain less from OoO because your pipelines are not constantly filled anyway.
This isn't actually the advantage of an in-order core. In any core, you can turn off most units completely - you can turn off decoders, execution units, load/store units, etc. You can make it as wide as you like and it doesn't matter, because all those extra execution units are just turned off. All you need to do to keep it functional is to keep power to the registers. The problem for an OoOE core is that you have lots of registers in your physical register file (PRF), and a translation table to keep track of which is which, and a lot of state to keep track of which instruction should go in which order. If you turn it off, you need to save all of this data, and then restore it when you wake it up. This means that waking a core requires an investment of power.

Power consumption while running is effectively a case of big structures using power. A big cache uses power, and a big PRF uses power. A big PRF gives a better return on that power than a big cache. You can drop power usage by making the L1 cache, the execution window and the PRF smaller, but if you follow CPU news for a while, you will notice that nobody ever does that. They always grow them, because the power investment is worth it when performance goes up.
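
One way to put the wake-up argument in numbers (entirely my own back-of-the-envelope framing): power-gating the OoOE core for some period $t_{\text{off}}$ only pays off when
$$P_{\text{idle}}\;t_{\text{off}} \;>\; E_{\text{save}} + E_{\text{restore}},$$
where $P_{\text{idle}}$ is what the core would have burned if left powered, and the right-hand side is the cost of dumping and reloading the PRF, rename tables and the rest of the state. Very short idle periods don't justify the round trip.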

Also, iOS has done the same since at least the A11* - and those are damn good, fast OoO cores. So again, the story is not about ARM's lackluster OoO designs; Apple does the same and arguably is currently leading the world in terms of mobile CPU development, ahead of Intel and ahead of ARM.

* I am not sure whether process migration in the A10 worked on the kernel or SoC level. So perhaps we should include the A10 as well.
Do you have details on any of this, that iOS will move things between the small and big cores over time, other than a very simple case of moving something from the small to the big core when the phone wakes up and dropping it back down when it is locked?

I don't think schedulers need to distribute processes to cores according to what core would be more suited to a particular workload. It'd be nice if you could, but I haven't heard that this is what iOS, Windows and Linux currently do (we mustn't forget about WoA!) nor that this is even feasible. You are certainly not only benefitting from the smaller cores with special workloads.

I think kernel-level heterogeneous multiprocessing is more energy efficient than just having one type of core, though.
But that can be the case either way. You could have multiple cores with different specialities, and we don't know what the small cores in the A12 are. The fact that Apple claims them to be "50% more efficient" made me think - because if it is a simple in-order core, the old one must have been terribly inefficient for that to even be possible - but they're probably just counting the process improvements. The problem with Apple's setup - from the perspective of a general workload - is that they have more of the low-power cores than the powerful ones, which doesn't make sense if they're just there to offload the big cores when they're not working hard.

There are already >6 types of cores on modern smartphone SoCs: two types of CPU cores, GPU cores, a DSP, an ISP and an accelerator for neural networks. If you use CoreML, then the API will dispatch things to the CPUs, the GPU and the Neural Engine (if you have one). We can argue whether things like the secure enclave and the Mx-type cores also count. Clearly the future is already here.
I can find more if you want to count them. There is clearly a core in the flash controller, for one. What I'm after are things the application developer can send code to.
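
On the CoreML point above: the developer does get a coarse knob for which silicon a model may run on, via MLModelConfiguration's computeUnits. The API is real (Core ML 2 / iOS 12), but the model name below is a made-up placeholder:
Code:
import Foundation
import CoreML

// Coarse-grained dispatch control: Core ML decides the exact placement,
// but you can restrict it to the CPU, CPU+GPU, or allow everything
// (which includes the Neural Engine on devices that have one).
let config = MLModelConfiguration()
config.computeUnits = .all          // other options: .cpuOnly, .cpuAndGPU

// "Classifier.mlmodelc" stands in for whatever compiled model you bundle.
if let modelURL = Bundle.main.url(forResource: "Classifier", withExtension: "mlmodelc"),
   let model = try? MLModel(contentsOf: modelURL, configuration: config) {
    print("Loaded: \(model.modelDescription)")
}
Below that level the per-layer placement (CPU cluster, GPU or Neural Engine) is Core ML's call, not the developer's, which fits the point that these are things you reach through an API rather than schedule yourself.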

Do we know what the Neural Engine is, by the way? Because it seems like a complete buzzword right now. Is it some sort of tensor manipulation engine?
The new Mac Pro has up to 30 MB of cache inside the processor itself. That's more than the HD in my first Mac. Somehow I'm still running out of space.
     
reader50
Administrator
Join Date: Jun 2000
Location: California
Status: Offline
Reply With Quote
Oct 1, 2018, 04:44 PM
 
FaceID has had legal and privacy concerns for a while. Namely, whether you can refuse to unlock, the way you presumably can with a passcode. Those concerns are no longer theoretical. The Feds have forced a suspect to unlock their iPhone X. By looking at it.
It happened on August 10, when the FBI searched the house of 28-year-old Grant Michalski, a Columbus, Ohio, resident who would later that month be charged with receiving and possessing child pornography. With a search warrant in hand, a federal investigator told Michalski to put his face in front of the phone, which he duly did. That allowed the agent to pick through the suspect's online chats, photos and whatever else he deemed worthy of investigation.
Activating the SOS mode disables FaceID, but only if you recognize the threat in advance. And the list of people who might be targeted for intrusive searches is growing.

• Anyone crossing a US border. Warrantless electronics searches are way up.
• I seem to recall DHS arguing any international airport is a "border", allowing border exceptions to the 4th amendment.
• Anyone who looks "middle-eastern", or wears a turban.
• Anyone walking around, who happens to be black.
• Walking around while brown might work too.
• If you look too poor to own an iPhone X.
• If your car looks too expensive vs your clothes.
• If you might be a drug dealer. Or look similar to a drug dealer's face. Or if a drug dealer ever parked in your driveway.

During suspicionless car stops, police have argued that everything is suspicious. Car interior too clean. Car too messy. Driving on an interstate. Out-of-state plates. Air freshener hanging from mirror. Possessing cash. Being too calm talking to officers. Being too nervous talking to armed people (officers).

If I get a phone with TouchID or FaceID, I expect to turn them both off. Passcode-only, to protect my rights. I don't think I'm in any of the suspicious groups yet, but that "group" is getting so large, it's only a matter of time before we're all in it. Only whites with a million or more seem definitely immune from warrantless searches.
     
subego
Clinically Insane
Join Date: Jun 2001
Location: Chicago, Bang! Bang!
Status: Online
Reply With Quote
Oct 1, 2018, 04:55 PM
 
As a note, turning the phone off also works. FaceID won't work again without the passcode.

I’m concerned by the problems, but like Alexa, the convenience is too seductive.

Stupid sexy Dot.
     
 