AMD’s Manju Hegde is one of the rare folks I get to interact with who has an extensive background working at both AMD and NVIDIA. He was one of the co-founders and CEO of Ageia, a company that originally tried to bring higher quality physics simulation to desktop PCs in the mid-2000s. In 2008, NVIDIA acquired Ageia and Manju went along, becoming NVIDIA’s VP of CUDA Technical Marketing. The CUDA fit was a natural one for Manju as he spent the previous three years working on non-graphics workloads for highly parallel processors. Two years later, Manju made his way to AMD to continue his vision for heterogeneous compute work on GPUs. His current role is as the Corporate VP of Heterogeneous Applications and Developer Solutions at AMD.
Given what we know about the new AMD and its goal of building a Heterogeneous Systems Architecture (HSA), Manju’s position is quite important. For those of you who don’t remember back to AMD’s 2012 Financial Analyst Day, the formalized AMD strategy is to exploit its GPU advantages on the APU front in as many markets as possible. AMD has a significant GPU performance advantage compared to Intel, but in order to capitalize on that it needs developer support for heterogeneous compute. A major struggle everyone in the GPGPU space faced was enabling applications that took advantage of the incredible horsepower these processors offered. With AMD’s strategy closely married to doing more (but not all, hence the heterogeneous prefix) compute on the GPU, it needs to succeed where others have failed.
The hardware strategy is clear: don’t just build discrete CPUs and GPUs, but instead transition to APUs. This is nothing new, as both AMD and Intel have been headed in this direction for years. Where AMD sets itself apart is that it is willing to dedicate more transistors to the GPU than Intel. The CPU and GPU are treated almost as equal-class citizens on AMD APUs, at least when it comes to die area.
The software strategy is what AMD is working on now. AMD’s Fusion12 Developer Summit (AFDS), now in its second year, is where developers can go to learn more about AMD’s heterogeneous compute platform and strategy. Why would a developer attend? AMD argues that the speedups offered by heterogeneous compute can be substantial enough that they could enable new features, usage models or experiences that wouldn’t otherwise be possible. In other words, taking advantage of heterogeneous compute can enable differentiation for a developer.
That brings us to today. In advance of this year’s AFDS, Manju has agreed to directly answer your questions about heterogeneous compute, where the industry is headed and anything else AMD will be covering at AFDS. Manju has a BS in Electrical Engineering (IIT, Bombay) and a PhD in Computer Information and Control Engineering (UMich, Ann Arbor) so make the questions as tough as you can. He'll be answering them on May 21st so keep the submissions coming.
I finally made the transition to a notebook as my desktop last year, a move many had made years prior. Quad-core mobile Sandy Bridge and good SSDs made the move simple for me, but Thunderbolt eventually made it near perfect. With only two drive bays in my notebook (I ditched my optical drive so I could have another SSD, something Brian Klug did back in 2010), there wasn't any room for good, high-performance, mass storage. Thunderbolt solved this problem for me.
Co-developed by Apple and Intel, Thunderbolt is a tunnel that carries both PCIe and DisplayPort traffic to the tune of 20Gbps per channel (10Gbps up and down). In the past, whenever you wanted to add a PCIe device (LAN, audio, high-speed storage, etc...) you needed to physically install that device in your system either via an ExpressCard slot on a notebook or via a PCIe slot on your desktop. Thunderbolt acts as a decoupler for PCIe devices, allowing you to put controllers that would traditionally lie inside your system outside of it, or even inside another device like a display. That's where the DisplayPort support comes in.
Apple's Thunderbolt Display is the perfect example of what Thunderbolt can be used to do. Take a DisplayPort panel, integrate Gigabit Ethernet, FireWire 800, audio and USB controllers and you've got Apple's Thunderbolt Display. In theory, you could connect a system that had none of these things, and the functionality would be provided exclusively by the display. Decoupling hardware like this allows OEMs to build thinner and/or smaller form factor machines (think Ultrabooks/MacBook Air), while allowing for full functionality when connected to a display. By carrying DisplayPort over the same cable, you can have a single cable that both extends functionality and connects your small form factor machine to a larger monitor. Thunderbolt enables the modern day dock for notebooks.
For all of last year, Thunderbolt was an Apple exclusive. This year, starting with the launch of Ivy Bridge, Thunderbolt is coming to PCs. We'll see it on notebooks as well as some desktop motherboards. Today we have the very first desktop motherboard with Thunderbolt support: MSI's Z77A-GD80.
Read on for our full preview of the first Thunderbolt PC motherboard.
ARM has been making waves over the past two years with plenty of processor and graphics IP announcements, but they are not alone in the game. MIPS Technologies, almost as old as ARM itself, also licenses RISC processors. With licensees like Broadcom and Sigma Designs, they have undoubtedly held the ...
In the past, overclocking a processor for ‘free’ performance involved taking a cheap model and pushing it past the top end model. In the land of Intel, overclocking by any significant margin has been limited to the more expensive processors – with Sandy Bridge it was common to see a 3.4GHz processor overclocked to 4.6GHz with very little ‘effort’ for those with overclocking experience.
However, Ivy Bridge is now released and behaves differently from Sandy Bridge, in a couple of perhaps alarming ways that we think you should know about. We always want to be thorough here at AnandTech with our analysis, so this article is all about our results from Ivy Bridge overclocking, especially in terms of what to look out for. Ivy Bridge overclocking is a different beast to Sandy Bridge, so we want to make sure several clear correlations are fixed in a user's mind when it comes to a stable Ivy Bridge overclock. For our other readers, we also have some notes on undervolting results on Ivy Bridge.
The times, they are changing. In fact, the times have already changed, we're just waiting for the results. I remember the first time Intel brought me into a hotel room to show me their answer to AMD's Athlon 64 FX—the Pentium 4 Extreme Edition. Back then the desktop race was hotly contested. Pushing the absolute limits of what could be done without a concern for power consumption was the name of the game. In the mid-2000s, the notebook started to take over. Just like the famous day when Apple announced that it was no longer a manufacturer of personal computers but a manufacturer of mobile devices, Intel came to a similar realization years prior when these slides were first shown at an IDF in 2005:
Intel is preparing for another major transition, similar to the one it brought to light seven years ago. The move will once again be motivated by mobility, and the transition will be away from the giant CPUs that currently power high-end desktops and notebooks to lower power, more integrated SoCs that find their way into tablets and smartphones. Intel won't leave the high-end market behind, but the trend towards mobility didn't stop with notebooks.
The fact of the matter is that everything Charlie has said on the big H is correct. Haswell will be a significant step forward in graphics performance over Ivy Bridge, and will likely mark Intel's biggest generational leap in GPU technology of all time. Internally Haswell is viewed as the solution to the ARM problem. Build a chip that can deliver extremely low idle power, to the point where you can't tell the difference between an ARM tablet running in standby and one with a Haswell inside. At the same time, give it the performance we've come to expect from Intel. Haswell is the future, and this is the bridge to take us there.
In our Ivy Bridge preview I applauded Intel for executing so well over the past few years. By limiting major architectural shifts to known process technologies, and keeping design simple when transitioning to a new manufacturing process, Intel took what once was a five year design cycle for microprocessor architectures and condensed it into two. Sure the nature of the changes every 2 years was simpler than what we used to see every 5, but like most things in life—smaller but frequent progress often works better than putting big changes off for a long time.
It's Intel's tick-tock philosophy that kept it from having a Bulldozer, and the lack of such structure that left AMD in the situation it is today (on the CPU side at least). Ironically what we saw happen between AMD and Intel over the past ten years is really just a matter of the same mistake being made by both companies, just at different times. Intel's complacency and lack of an aggressive execution model led to AMD's ability to outshine it in the late K7/K8 days. AMD's similar lack of an execution model and executive complacency allowed the tides to turn once more.
Ivy Bridge is a tick+, as we've already established. Intel took a design risk and went for greater performance all while transitioning to the most significant process technology it has ever seen. The end result is a reasonable increase in CPU performance (for a tick), a big step in GPU performance, and a decrease in power consumption.
Today is the day that Ivy Bridge gets official. Its name truly embodies its purpose. While Sandy Bridge was a bridge to a new architecture, Ivy connects a different set of things. It's a bridge to 22nm, warming the seat before Haswell arrives. It's a bridge to a new world of notebooks that are significantly thinner and more power efficient than what we have today. It's a means to the next chapter in the evolution of the PC.
Let's get to it.
Intel officially launched the Z77 platform earlier this week, and later this month we'll see the official launch of Ivy Bridge, Intel's 3rd generation Core processors. ASUS has agreed to cart nearly everything it makes (including a handful of unreleased products we saw at CES) over to me in NC for a hands-on look on video. More importantly - we're going to be doing a Q&A with you all.
ASUS and I will both be answering your questions on camera. If you have any questions you'd like to see us answer or topics you'd like us to address, respond to the comments here or mention @anandtech with the hashtag #asusivy on Twitter along with your question/topic. We won't be able to get to all of them but we'll pick the most interesting/relevant questions and answer them on camera. The topic is obviously going to be Ivy Bridge and the 7-series platform. Simple questions are fine but what I'd really like to see are topics we can have a good discussion about.
When the video goes live, ASUS is also going to let us give away some new Z77 boards. We'll have more details on the giveaway closer to the Ivy Bridge launch.
Make the questions good and I look forward to answering them on camera.
I still remember hearing about Intel's tick-tock cadence and not having much faith that the company could pull it off. Granted Intel hasn't given us a new chip every 12 months on the dot, but more or less there's something new every year. Every year we either get a new architecture on an established process node (tock), or a derivative architecture on a new process node (tick). The table below summarizes what we've seen since Intel adopted the strategy:
Intel's Tick-Tock Cadence

| Microarchitecture | Process Node | Tick or Tock | Release Year |
| --- | --- | --- | --- |
| Conroe/Merom | 65nm | Tock | 2006 |
| Penryn | 45nm | Tick | 2007 |
| Nehalem | 45nm | Tock | 2008 |
| Westmere | 32nm | Tick | 2010 |
| Sandy Bridge | 32nm | Tock | 2011 |
| Ivy Bridge | 22nm | Tick | 2012 |
| Haswell | 22nm | Tock | 2013 |
Last year was a big one. Sandy Bridge brought a Conroe-like increase in performance across the board thanks to a massive re-plumbing of Intel's out-of-order execution engine and other significant changes to the microarchitecture. If you remember Conroe (the first Core 2 architecture), what followed it was a relatively mild upgrade called Penryn that gave you a little bit in the way of performance and dropped power consumption at the same time.
Ivy Bridge, the follow-on to Sandy Bridge, should be a tick, but because of significant improvements on the GPU side Intel is calling it a tick+. We managed to get our hands on an early Ivy Bridge system and ran it through some tests to determine exactly how much of an improvement is coming our way in a couple of months.
Read on!
When we first looked at the Opteron 6276, our time was limited and we were only able to run our virtualization, compression, encryption, and rendering benchmarks. Most servers capable of running 20 or more cores/threads target the virtualization market, so that's a logical area to benchmark. The other benchmarks either test a small part of the server workload (compression and encryption) or represent a niche (e.g. rendering), but we included them for a few simple reasons: they gave us additional insight into the performance profile of the Interlagos Opteron, they were easy to run, and last but not least, those readers who use such applications still benefit.
Back in 2008, however, we discussed the elements of a thorough server review. Our list of important areas to test included ERP, OLTP, OLAP, Web, and Collaborative/E-mail applications. Looking at our initial Interlagos review, several of these are missing in action, but much has changed since 2008. The exploding core counts have made other bottlenecks (memory, I/O) much harder to overcome, the web application that we used back in 2009 stopped scaling beyond 12 cores due to lock contention problems, the Exchange benchmark turned out to be an absolute nightmare to scale beyond 8 threads, and the only manageable OLTP test—Swingbench Calling Circle—needed an increasing number of SSDs to scale.
The ballooning core counts have steadily made it harder and even next to impossible to benchmark applications on native Linux or Windows. Thus, we reacted the same way most companies have reacted: we virtualized our benchmark applications. It's only with a hypervisor that these multi-core monsters make sense in most enterprises, but there are always exceptions. Since quite a few of our readers still like seeing "native" Linux and Windows benchmarks, not to mention quite a few ERP, OLTP, and OLAP servers are still running without any form of virtualization, we took the time to complete our previous review and give the Opteron Interlagos another chance.
We've been providing live coverage of AMD's 2012 Financial Analyst Day from Santa Clara today, but if you want a summary of the company's strategy under new CEO Rory Read you've come to the right place. Below you'll find links to everything we've published from AMD's FAD 2012:
AMD's Rory Read Outlines AMD's Future Strategy
AMD Outlines HSA Roadmap: Unified Memory for CPU/GPU in 2013, HSA GPUs in 2014
AMD is Open to Integrating 3rd Party IP in Future SoCs
AMD's Financial Analyst Day 2012 - Mark Papermaster, SVP & CTO Presentation
AMD: Flexible Around ISA
AMD Nods at Shorter Design Cycles, More Synthesized Designs
What AMD Views as Important: Tablets, Servers, Notebooks & GPUs
AMD & Compal Show Off 18mm Trinity Notebook
AMD's 2012 - 2013 Client CPU/GPU/APU Roadmap Revealed
AMD's 2012 - 2013 Server Roadmap: Abu Dhabi, Seoul & Delhi CPUs
AMD is Ambidextrous, Not Married to Any One Architecture, ARM in the Datacenter?
AMD's Tablet Architectures: Hondo at 4.5W, Future Sub-2W SoC
Read on for our summary and analysis of AMD's new strategy.