The Intel Quad-Core Dominance Arrives on the Desktop Platform

 

 
Intel® Core™2 Extreme Quad-Core Processor

The world's best desktop processor for multimedia applications
and first with quad-core technology.

  • Up to 80% faster performance for highly-threaded apps

  • Four processing cores to handle massive throughput

  • Based on leading Intel® Core™ microarchitecture, industry-first 8MB total cache

Just when you thought a CPU with two cores was enough processing power for you, here comes the Intel® Core™2 Extreme quad-core processor - the world's first quad-core desktop processor, delivering the latest in cutting-edge processor technology. This processor is aimed primarily at PC enthusiasts and early adopters, since it carries a hefty ~$1000 price tag. If you're fortunate enough to get one of these babies in your Xmas stocking, you will experience performance second to none on highly-threaded applications and enjoy extreme multi-tasking capabilities.

The Intel internal code name for the Core 2 Extreme QX6700 quad-core is "Kentsfield", and it is literally built by putting two dual-core Core 2 Duo E6700 2.66GHz CPUs into a single multi-chip module or package. You've seen this same technique before in the Intel Pentium D 900 series of processors, code-named "Presler", which debuted earlier this year. You can clearly see the two individual processor dies in the picture of the CPU without the heat spreader below. This gives the QX6700 an effective die size of 286 mm², double the die size of a single Core 2 Duo CPU. Having all cores in a single package has another benefit: it looks like a single processor as far as Microsoft operating system licensing is concerned, since Microsoft charges by the socket, not by the number of cores.

When we go under the QX6700's hood, we find two "Conroe" dies in a single package, with no new tweaks to the individual cores and very inefficient power management between the two dies, as evidenced by the 130W thermal specification, 2x that of Conroe processors. Some CPU architects would argue that this approach isn't "true quad-core" technology, and consider it a bit of a cheat. They would tell you that a true quad-core would consist of four cores on a single processor die. So why didn't Intel take that approach? An analogy I like comes from Maximum PC: it is easier to engineer a way to combine two 2-leaf clovers and produce them in abundance than to hunt through a field and come up with only a few naturally occurring 4-leaf clovers. To put it into more tech lingo, Intel had several reasons for producing their first quad-core this way:

  • Processor yield is better for a pair of 143mm² dies than one 286mm² die (this will change when Intel moves to 45nm technology)

  • It's easier to bin-sort the CPUs to get matched pairs, whereas a die with two mismatched cores would need to ship at the frequency of the lower core.

  • Wafer starts are the same, since the dies are identical, which means manufacturing lines don't need to differ

  • And the most likely key reason: a faster time to market with quad-core, beating AMD to the punch
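The yield argument in the first bullet can be illustrated with a simple Poisson defect model, in which the chance that a die is defect-free falls exponentially with its area. This is only a sketch: the model is a common textbook approximation, not something Intel has published for this part, and the defect density below is a hypothetical number picked purely for illustration.

```python
import math

def poisson_yield(die_area_mm2: float, defects_per_cm2: float) -> float:
    """Fraction of dies expected to be defect-free under a simple Poisson model."""
    area_cm2 = die_area_mm2 / 100.0  # 100 mm^2 per cm^2
    return math.exp(-area_cm2 * defects_per_cm2)

# HYPOTHETICAL defect density, not a published Intel figure.
DEFECTS_PER_CM2 = 0.5

small_die = poisson_yield(143, DEFECTS_PER_CM2)  # one dual-core die
large_die = poisson_yield(286, DEFECTS_PER_CM2)  # one monolithic quad-core die

print(f"143 mm^2 die yield: {small_die:.1%}")
print(f"286 mm^2 die yield: {large_die:.1%}")
```

In this model the yield of the large die is the square of the small die's yield, so it is always lower. Because any two good 143mm² dies can be paired in a package, the two-die approach wastes less silicon per defect.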

Technically, the QX6700 has a total of 8MB of cache among the four cores, but since there are two separate dies in the processor package, each die's 4MB of shared L2 cache is dedicated to the two cores on that particular die. The cache is still "smart" within each die and can be shared dynamically between the two cores on that die. If one core is idle, the other core can use all 4MB of L2 cache. If data needs to be passed back and forth between the two dual-core dies, it must travel over the 1066MHz (effective) shared front side bus (FSB). Intel suggests in its technical product specification that the FSB has plenty of bandwidth to handle the kind of traffic generated by a desktop CPU, but in the future they will move to a 1333MHz FSB, just like the Xeon 5100 series.
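To put the FSB figures in perspective, here is a rough peak-bandwidth calculation. It assumes the usual 64-bit (8-byte) front side bus data path and ignores protocol overhead, so treat the results as theoretical ceilings rather than measured throughput; the function name is just an illustrative helper.

```python
FSB_WIDTH_BYTES = 8  # assumes a 64-bit wide front side bus data path

def fsb_peak_mb_per_s(effective_mhz: float) -> float:
    """Peak theoretical rate: million transfers per second times bytes per transfer."""
    return effective_mhz * FSB_WIDTH_BYTES

for mhz in (1066, 1333):
    gb_per_s = fsb_peak_mb_per_s(mhz) / 1000
    print(f"{mhz} MHz FSB: about {gb_per_s:.1f} GB/s peak")
```

Keep in mind that this ceiling is shared: memory accesses from all four cores and any traffic between the two dies compete for the same bus.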

Enough of the "it isn't a real quad-core processor" talk; let's get into what matters most to users when purchasing an expensive and powerful processor - application support. Even though dual-core processors have been around for almost two years now, multi-threaded application software is only now starting to emerge from development. Next month, with the arrival of Windows Vista and applications like Office 2007 optimized for it, Intel says Quad-Core users will benefit from enhanced multitasking capabilities. In fact, Intel mentions that even Windows XP users may benefit somewhat from Quad-Core, just from having two additional cores to run background-intensive tasks like anti-virus and other security-related programs. Also remember that Intel and AMD are both banking heavily on developers writing multi-threaded code to help drive the need for more CPU performance in the coming years; without it, the demand for more processor cores, and the performance gains they bring, would effectively stagnate.

Intel® Core™2 Extreme Quad-Core processor features
Features | Benefits
Quad-Core Processing | Provides four independent cores in a single package with 8 MB of L2 cache and a 1066 MHz Front Side Bus. Four dedicated, physical threads help operating systems and applications deliver additional performance, so end users can experience better multi-tasking and multi-threaded performance across many types of applications and workloads.
Intel® Wide Dynamic Execution | Improves execution speed and efficiency, delivering more instructions per clock cycle. Each core can complete up to four full instructions simultaneously.
Intel® Smart Memory Access | Optimizes the use of the data bandwidth from the memory subsystem to accelerate out-of-order execution. A newly designed prediction mechanism reduces the time in-flight instructions have to wait for data. New pre-fetch algorithms move data from system memory into fast L2 cache in advance of execution. These functions keep the pipeline full, improving instruction throughput and performance.
Intel® Advanced Smart Cache | Dynamically allocates the shared L2 cache to each processor core based on workload. This efficient, dual-core optimized implementation increases the probability that each core can access data from fast L2 cache, significantly reducing latency to frequently used data and improving performance.
Intel® Advanced Digital Media Boost | Accelerates the execution of Streaming SIMD Extension (SSE) instructions to significantly improve performance on a broad range of applications, including video, audio, image and photo processing, multimedia, encryption, financial, engineering and scientific applications. The 128-bit SSE instructions are now issued at a throughput rate of one per clock cycle, effectively doubling their speed of execution on a per-clock basis over previous-generation processors.
Intel® Virtualization Technology | Allows one hardware platform to function as multiple "virtual" platforms. Intel VT improves manageability, limiting downtime and maintaining worker productivity by isolating computing activities into separate partitions.
Intel® 64 | Allows the processor to access larger amounts of memory. With appropriate 64-bit hardware and software, platforms based on an Intel processor supporting Intel 64 can allow the use of extended virtual and physical memory.
Execute Disable Bit | Provides enhanced virus protection when deployed with a supported operating system. Memory can be marked as executable or non-executable, allowing the processor to raise an error to the operating system if malicious code attempts to run in non-executable memory. This prevents the code from infecting the system.
Intel Designed Thermal Solution for Boxed Processors | Includes a 4-pin connector for fan speed control to help minimize the acoustic noise generated by running the fan at higher speeds for thermal performance. Fan speed control technology is based on actual CPU temperature and power usage.
 

Core™2 Extreme Processor Lineup:
 

The Core 2 Extreme processor family just grew by one member when the new Core 2 Extreme QX6700 Quad-Core was added last month. The new QX6700 Quad-Core runs roughly 270MHz slower than the X6800 Dual-Core Extreme, but since the clock multiplier isn't locked in this family, you can very easily overclock the 2.66GHz QX6700 by 10% to 2.93GHz to match the raw speed of the X6800. Besides being able to run 4 concurrent software threads, the QX6700 Quad processor differentiates itself by having an 8MB L2 cache (4MB x 2) and a 128KB L1 cache (64KB x 2). It still uses Intel's 65nm manufacturing technology and the stellar Core™ micro-architecture, which we covered in the September ASI Technical Newsletter.
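The 10% overclock mentioned above comes down to simple multiplier arithmetic. The sketch below assumes the base clock is one quarter of the 1066MHz effective (quad-pumped) FSB speed; the multipliers 10 and 11 are inferred by dividing the quoted clock speeds by that base clock, not taken from an Intel document.

```python
BASE_CLOCK_MHZ = 1066 / 4  # quad-pumped FSB: base clock is a quarter of the effective rate

def core_clock_ghz(multiplier: int, base_mhz: float = BASE_CLOCK_MHZ) -> float:
    """Core frequency is the base clock times the CPU multiplier."""
    return multiplier * base_mhz / 1000

stock = core_clock_ghz(10)   # assumed QX6700 stock multiplier
raised = core_clock_ghz(11)  # one multiplier step up, matching the X6800's speed

print(f"x10: {stock:.2f} GHz, x11: {raised:.2f} GHz, gain: {raised / stock - 1:.0%}")
```

Because only the multiplier changes, the FSB and memory keep running at their stock speeds, which is why this kind of overclock is the easy one.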

The QX6700 packs 582 million transistors into a combined die area of approximately 286mm², double that of the Core 2 Duo E6000 "Conroe" series. Like the previous Core 2 Extreme X6800 processor, the QX6700 uses a 1066 MHz front-side bus, comes in the LGA775 package and supports DDR2-800 memory. The QX6700's core voltage ranges from 1.100V to 1.372V, and its thermal design power (TDP) is 130W, which is the amount of heat the cooling system must be able to dissipate and double the 65W rating of the dual-core "Conroe" parts.

ASI SKU | Processor Number | Clock Speed | L2 Cache Size | Front Side Bus | Quad Core | Intel® VT | Enhanced Intel SpeedStep® Technology | Intel® 64 | Execute Disable Bit | sSpec#
54454 | QX6700 | 2.66 GHz | 8MB | 1066 MHz | Yes | Yes | Yes | Yes | Yes | SL9UL
50918 | X6800 | 2.93 GHz | 4MB | 1066 MHz | No | Yes | Yes | Yes | Yes | SL9S5

 
Multimedia Performance:

The Core 2 Extreme QX6700 Quad-Core processor improves overall system performance by spreading tasks or threads across its four cores to get more done in less time. Today's multimedia applications, such as video and audio editing, graphics rendering, and 3D modeling, take advantage of multi-threading, and some even demonstrate significant scalability with these Quad-Core processors. If you're in the business of professional content creation, purchasing one of these processors is almost a no-brainer. In the test results shown below and reported at other review web sites, the primary difference between two and four cores is in the sheer amount of work that got done. 3D renderers like Autodesk's 3ds Max absolutely love more processing cores, as do popular applications such as Photoshop CS2 and Lightwave 9.

The Cinebench 9.5 test is also a multi-threaded 3D rendering benchmark that takes advantage of any and all available processing cores, and numbers posted by AnandTech show performance gains of over 60% when moving from two to four cores. One of the more interesting findings reported is that the QX6700 is a more efficient CPU overall, with higher performance-per-watt numbers than any "Conroe" processor, even though Kentsfield consumes twice the amount of power to operate.

Courtesy of AnandTech, the graph below illustrates how having more cores does increase efficiency if the software is designed to take advantage of those additional cores. The point of diminishing returns hasn't been reached with adding more cores, but the two downward-trending curves for Quicktime H.264 encoding (purple line) and iTunes MP3 encoding (yellow line) show the current problem with scaling from two to four cores. Very few desktop applications can actually take advantage of a dual-core CPU, and even fewer are geared for Quad-Core processors; in applications that are not, Quad-Core processors actually take a step backwards in terms of overall efficiency. That's not the fault of the processor, but rather of software that hasn't been optimized to run multiple threads.
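One way to reason about why some applications scale from two to four cores and others don't is Amdahl's law, which caps speedup by the fraction of the work that can run in parallel. The sketch below is an idealized model brought in for illustration, not something taken from the AnandTech data; the parallel fractions in the loop are hypothetical and it ignores bus, cache and power effects.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Idealized speedup over one core when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cores)

def gain_two_to_four(parallel_fraction: float) -> float:
    """How much faster four cores are than two for the same workload."""
    return amdahl_speedup(parallel_fraction, 4) / amdahl_speedup(parallel_fraction, 2)

# HYPOTHETICAL parallel fractions, chosen only to show the trend.
for p in (0.50, 0.85, 0.95):
    print(f"{p:.0%} parallel: {gain_two_to_four(p) - 1:.0%} faster on four cores than on two")
```

Under this model, a gain of around 60% from two to four cores implies a workload that is roughly 85% parallel, while a half-serial encoder gains only about 20% from the two extra cores.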
 

 
Exceptional Multi-media Performance

Gaming Performance:
 
It seems all the new multi-core game consoles have made a big impact on the way PC game developers are programming for multi-core desktop processors. Numerous gaming companies are now working on completely new game engines that can take advantage of four processor cores and potentially any number of cores down the road (a desktop Octa-Core is possible in the 2008 timeframe with Intel's 45nm "Nehalem" microarchitecture). Judging from numerous reviews covering multi-core CPUs and gaming, there is no real advantage to multi-core processors right now beyond the moderate speed gains in a couple of games that can take advantage of 2 software threads. Even the multi-threaded Quake 4 benchmark doesn't show a performance increase when going from two to four cores, and it's one of very few games that actually takes advantage of multiple cores. Without significant rewrites of today's games, you just won't see the sort of benchmark improvements needed to drive gaming performance forward.

2007 will mark the beginning of multi-threaded games making their impact on the PC gaming market segment. Upcoming titles with Quad-Core (or better) support include Supreme Commander (Gas Powered Games / THQ), Epic's Unreal Tournament 2007 (and all Unreal Engine 3 titles), Valve's Half-Life 2: Episode 2 and Ubisoft's just-released Splinter Cell: Double Agent. At the Fall 2006 Intel Developer Forum, Intel showed off Remedy's Alan Wake, which will support a staggering 5 independent execution threads: one each for rendering, audio, streaming, physics and terrain bit-mapping. In this game, the rendering and physics threads alone are apparently enough to max out processor utilization on a dual-core CPU, and the additional 3 threads are what may improve the gaming experience on a Quad-Core CPU.
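The one-thread-per-subsystem design described for Alan Wake can be pictured with the minimal sketch below. It is purely illustrative: the function names and the empty work loop are hypothetical stand-ins, not Remedy's engine code, and because of the interpreter lock in standard Python these threads show the structure rather than real parallel speedup.

```python
import queue
import threading

# The five subsystems named in the Intel Developer Forum demo.
SUBSYSTEMS = ["rendering", "audio", "streaming", "physics", "terrain"]

def run_subsystem(name: str, frames: int, done: queue.Queue) -> None:
    """Hypothetical stand-in for one engine subsystem running on its own thread."""
    for _ in range(frames):
        pass  # a real engine would do this subsystem's per-frame work here
    done.put(name)  # report completion in a thread-safe way

def run_engine(frames: int = 1000) -> list:
    """Start one thread per subsystem and wait for all of them to finish."""
    done = queue.Queue()
    threads = [
        threading.Thread(target=run_subsystem, args=(name, frames, done), name=name)
        for name in SUBSYSTEMS
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return sorted(done.get() for _ in threads)

finished = run_engine()
print("Subsystems completed:", ", ".join(finished))
```

With five independent threads and only two cores, the operating system has to time-slice them; with four cores, most of them can run at the same moment, which is the benefit the article is pointing at.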

Below are some screenshots from the upcoming PC game Alan Wake:



From Tim Sweeney, Founder and President of Epic Games about Unreal Tournament 2007's thread usage: "Currently Unreal Engine 3 runs two heavyweight threads all the time: one for gameplay and one for rendering. In addition, there are several helper threads to which we offload all of the physics (using Ageia's multithreaded PhysX library), streaming, and several other tasks. We plan to extend the threading support further in time for the release of Unreal Tournament 2007 next year, to further exploit multi-core PC CPUs. Major opportunities for multithreaded optimization include particle systems, animation, and terrain. Also, since UT2007 uses very extensive vehicle and ragdoll physics, we expect that at peak times during gameplay that we'll have no trouble fully exploiting 4 threads at the maximum detail settings."

Comparative Performance:

I think this quote from The Inquirer web site says it all: "...taking a look into media encoding shows that the AMD Quad FX (FX-74) just gets crushed by Kentsfield in raw performance, especially in MPEG-2 8Mbit reproduction. But the worst result for AMD is a look into power consumption and performance per watt. The AMD system consumes far more power than Intel, sometimes even double that of the Kentsfield setup."

All benchmarks found here => http://www.intel.com/performance/desktop/extreme/index.htm


Platform Support for Core™2 Extreme Quad-Core processors:
 
Motherboard Support:

Quite a few motherboard companies have announced support for the new Intel QX6700, but not all motherboards will support this quad-core CPU. The Intel® 975X Express Chipset, for example, supports the Intel Core™2 Quad-Core processor only if the motherboard supports the new input voltage range and has a very recent BIOS update that adds the new CPU microcode signature for this processor. Intel's own D975XBX board doesn't support the QX6700, so Intel is now shipping a new version, the D975XBX2 "Bad Axe 2" (ASI SKU: 53844), with full Quad-Core support as well as changes to allow for DDR2-800 memory. Most of ASUS's current high-end desktop offerings based on the 975X, P965, and nVidia 680i / 650i chipsets will support the new Quad-Core processor => Click here to see full list.
 
Chassis Support:

Intel thermal specifications require the use of a Thermally Advantaged Chassis (TAC) version 1.1 when integrating an Intel® Core™2 Extreme processor into your system. A TAC version 1.1 chassis is defined by the presence of an 80mm side-panel air duct, a 92mm rear chassis fan and side-panel venting holes above the graphics and add-in card slots to provide additional cooling for high-end PCI Express graphics and other peripherals. Some chassis even have super-quiet 200mm fans as found in the new Antec "Nine Hundred" ultimate gamer chassis (ASI SKU: 52553).

     To view Intel's Thermally Advantaged Chassis list => Click Here
   
Power Supply Support:
Intel requires an ATX12V version 2.2 power supply for use with the Core™2 Extreme QX6700 processor. Please check www.intel.com/go/powersupplies for the appropriate support and validated power supplies. Use this chart only as a guideline, though, since your particular system configuration will dictate the total wattage needed to run your system reliably. For a system with a discrete PCI Express x16 video card, Intel recommends a power supply in the 450-600W range, and if you employ two high-end nVidia 8800GTX cards in SLI with multiple hard drives, you might even consider going with a 700W to 1kW power supply.