
Intel® 64 and IA-32 Architectures Software Developer's Manual

Documentation Changes

August 2012

Notice: The Intel 64 and IA-32 architectures may contain design defects or errors known as errata that may cause the product to deviate from published specifications. Current characterized errata are documented in the specification updates.

Document Number: 252046-037

INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.

A "Mission Critical Application" is any application in which failure of the Intel Product could result, directly or indirectly, in personal injury or death. SHOULD YOU PURCHASE OR USE INTEL'S PRODUCTS FOR ANY SUCH MISSION CRITICAL APPLICATION, YOU SHALL INDEMNIFY AND HOLD INTEL AND ITS SUBSIDIARIES, SUBCONTRACTORS AND AFFILIATES, AND THE DIRECTORS, OFFICERS, AND EMPLOYEES OF EACH, HARMLESS AGAINST ALL CLAIMS COSTS, DAMAGES, AND EXPENSES AND REASONABLE ATTORNEYS' FEES ARISING OUT OF, DIRECTLY OR INDIRECTLY, ANY CLAIM OF PRODUCT LIABILITY, PERSONAL INJURY, OR DEATH ARISING IN ANY WAY OUT OF SUCH MISSION CRITICAL APPLICATION, WHETHER OR NOT INTEL OR ITS SUBCONTRACTOR WAS NEGLIGENT IN THE DESIGN, MANUFACTURE, OR WARNING OF THE INTEL PRODUCT OR ANY OF ITS PARTS.

Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined". Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.

Intel, the Intel logo, Pentium, Xeon, Intel NetBurst, Intel Core, Intel Core Solo, Intel Core Duo, Intel Core 2 Duo, Intel Core 2 Extreme, Intel Pentium D, Itanium, Intel SpeedStep, MMX, Intel Atom, and VTune are trademarks of Intel Corporation in the U.S. and/or other countries.

*Other names and brands may be claimed as the property of others.

Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.

Copyright 1997-2012 Intel Corporation. All rights reserved.

Intel® 64 and IA-32 Architectures Software Developer's Manual Documentation Changes

Contents
Revision History
Preface
Summary Tables of Changes
Documentation Changes

Revision History

Revision  Date             Description
-001      November 2002    Initial release.
-002      December 2002    Added 1-10 Documentation Changes. Removed old Documentation Changes items that already have been incorporated in the published Software Developer's manual.
-003      February 2003    Added 9-17 Documentation Changes. Removed Documentation Change #6 - References to bits Gen and Len Deleted. Removed Documentation Change #4 - VIF Information Added to CLI Discussion.
-004      June 2003        Removed Documentation Changes 1-17. Added Documentation Changes 1-24.
-005      September 2003   Removed Documentation Changes 1-24. Added Documentation Changes 1-15.
-006      November 2003    Added Documentation Changes 16-34.
-007      January 2004     Updated Documentation Changes 14, 16, 17, and 28. Added Documentation Changes 35-45.
-008      March 2004       Removed Documentation Changes 1-45. Added Documentation Changes 1-5.
-009      May 2004         Added Documentation Changes 7-27.
-010      August 2004      Removed Documentation Changes 1-27. Added Documentation Changes 1.
-011      November 2004    Added Documentation Changes 2-28.
-012      March 2005       Removed Documentation Changes 1-28. Added Documentation Changes 1-16.
-013      July 2005        Updated title. There are no Documentation Changes for this revision of the document.
-014      September 2005   Added Documentation Changes 1-21.
-015      March 9, 2006    Removed Documentation Changes 1-21. Added Documentation Changes 1-20.
-016      March 27, 2006   Added Documentation Changes 21-23.
-017      September 2006   Removed Documentation Changes 1-23. Added Documentation Changes 1-36.
-018      October 2006     Added Documentation Changes 37-42.
-019      March 2007       Removed Documentation Changes 1-42. Added Documentation Changes 1-19.
-020      May 2007         Added Documentation Changes 20-27.
-021      November 2007    Removed Documentation Changes 1-27. Added Documentation Changes 1-6.
-022      August 2008      Removed Documentation Changes 1-6. Added Documentation Changes 1-6.
-023      March 2009       Removed Documentation Changes 1-6. Added Documentation Changes 1-21.
-024      June 2009        Removed Documentation Changes 1-21. Added Documentation Changes 1-16.
-025      September 2009   Removed Documentation Changes 1-16. Added Documentation Changes 1-18.
-026      December 2009    Removed Documentation Changes 1-18. Added Documentation Changes 1-15.
-027      March 2010       Removed Documentation Changes 1-15. Added Documentation Changes 1-24.
-028      June 2010        Removed Documentation Changes 1-24. Added Documentation Changes 1-29.
-029      September 2010   Removed Documentation Changes 1-29. Added Documentation Changes 1-29.
-030      January 2011     Removed Documentation Changes 1-29. Added Documentation Changes 1-29.
-031      April 2011       Removed Documentation Changes 1-29. Added Documentation Changes 1-29.
-032      May 2011         Removed Documentation Changes 1-29. Added Documentation Changes 1-14.
-033      October 2011     Removed Documentation Changes 1-14. Added Documentation Changes 1-38.
-034      December 2011    Removed Documentation Changes 1-38. Added Documentation Changes 1-16.
-035      March 2012       Removed Documentation Changes 1-16. Added Documentation Changes 1-18.
-036      May 2012         Removed Documentation Changes 1-18. Added Documentation Changes 1-17.
-037      August 2012      Removed Documentation Changes 1-17. Added Documentation Changes 1-28.

Preface
This document is an update to the specifications contained in the Affected Documents table below. This document is a compilation of device and documentation errata, specification clarifications and changes. It is intended for hardware system manufacturers and software developers of applications, operating systems, or tools.

Affected Documents

Document Number/Location  Document Title
253665                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture
253666                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, A-M
253667                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B: Instruction Set Reference, N-Z
326018                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2C: Instruction Set Reference
253668                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1
253669                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2
326019                    Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3

Nomenclature
Documentation Changes include typos, errors, or omissions from the current published specifications. These will be incorporated in any new release of the specification.


Summary Tables of Changes


The following table indicates documentation changes which apply to the Intel 64 and IA-32 architectures. This table uses the following notations:

Codes Used in Summary Tables


Change bar to left of table row indicates this erratum is either new or modified from the previous version of the document.

Documentation Changes

No.  DOCUMENTATION CHANGES
1    Updates to Chapter 1, Volume 1
2    Updates to Chapter 6, Volume 1
3    Updates to Chapter 12, Volume 1
4    Updates to Chapter 1, Volume 2A
5    Updates to Chapter 2, Volume 2A
6    Updates to Chapter 3, Volume 2A
7    Updates to Chapter 4, Volume 2B
8    Updates to Chapter 5, Volume 2C
9    Updates to Appendix A, Volume 2C
10   Updates to Chapter 1, Volume 3A
11   Updates to Chapter 2, Volume 3A
12   Updates to Chapter 4, Volume 3A
13   Updates to Chapter 5, Volume 3A
14   Updates to Chapter 11, Volume 3A
15   Updates to Chapter 16, Volume 3B
16   Updates to Chapter 17, Volume 3B
17   Updates to Chapter 18, Volume 3B
18   Updates to Chapter 19, Volume 3B
19   Updates to Chapter 24, Volume 3C
20   Updates to Chapter 25, Volume 3C
21   Updates to Chapter 26, Volume 3C
22   Updates to Chapter 27, Volume 3C
23   Updates to Chapter 28, Volume 3C
24   Updates to Chapter 29, Volume 3C
25   Updates to Chapter 34, Volume 3C
26   Updates to Chapter 35, Volume 3C
27   Updates to Appendix B, Volume 3C
28   Updates to Appendix C, Volume 3C

Documentation Changes
1. Updates to Chapter 1, Volume 1
Change bars show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture.
------------------------------------------------------------------------------------------
...

1.1 INTEL 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL

This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include:
  Pentium processors
  P6 family processors
  Pentium 4 processors
  Pentium M processors
  Intel Xeon processors
  Pentium D processors
  Pentium processor Extreme Editions
  64-bit Intel Xeon processors
  Intel Core™ Duo processor
  Intel Core™ Solo processor
  Dual-Core Intel Xeon processor LV
  Intel Core™2 Duo processor
  Intel Core™2 Quad processor Q6000 series
  Intel Xeon processor 3000, 3200 series
  Intel Xeon processor 5000 series
  Intel Xeon processor 5100, 5300 series
  Intel Core™2 Extreme processor X7000 and X6800 series
  Intel Core™2 Extreme processor QX6000 series
  Intel Xeon processor 7100 series
  Intel Pentium Dual-Core processor
  Intel Xeon processor 7200, 7300 series
  Intel Xeon processor 5200, 5400, 7400 series
  Intel Core™2 Extreme processor QX9000 and X9000 series
  Intel Core™2 Quad processor Q9000 series
  Intel Core™2 Duo processor E8000, T9000 series
  Intel Atom™ processor family
  Intel Core™ i7 processor
  Intel Core™ i5 processor
  Intel Xeon processor E7-8800/4800/2800 product families
  Intel Xeon processor E5 family
  Intel Xeon processor E3-1200 family
  Intel Core™ i7-3930K processor
  2nd generation Intel Core™ i7-2xxx, Intel Core™ i5-2xxx, Intel Core™ i3-2xxx processor series
  Intel Xeon processor E3-1200 v2 product family
  3rd generation Intel Core™ processors
  Next generation Intel Core™ processors

P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium Pro, Pentium II, Pentium III, and Pentium III Xeon processors.

The Pentium 4, Pentium D, and Pentium processor Extreme Editions are based on the Intel NetBurst microarchitecture. Most early Intel Xeon processors are based on the Intel NetBurst microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst microarchitecture.

The Intel Core™ Duo, Intel Core™ Solo and dual-core Intel Xeon processor LV are based on an improved Pentium M processor microarchitecture.

The Intel Xeon processor 3000, 3200, 5100, 5300, 7200 and 7300 series, Intel Pentium dual-core, Intel Core™2 Duo, Intel Core™2 Quad, and Intel Core™2 Extreme processors are based on Intel Core™ microarchitecture.

The Intel Xeon processor 5200, 5400, 7400 series, Intel Core™2 Quad processor Q9000 series, Intel Core™2 Extreme processor QX9000, X9000 series, and Intel Core™2 processor E8000 series are based on Enhanced Intel Core™ microarchitecture.

The Intel Atom™ processor family is based on the Intel Atom™ microarchitecture and supports Intel 64 architecture.

The Intel Core™ i7 processor and the Intel Core™ i5 processor are based on the Intel microarchitecture code name Nehalem and support Intel 64 architecture.

Processors based on Intel microarchitecture code name Westmere support Intel 64 architecture.

The Intel Xeon processor E5 family, Intel Xeon processor E3-1200 family, Intel Xeon processor E7-8800/4800/2800 product families, Intel Core™ i7-3930K processor, and 2nd generation Intel Core™ i7-2xxx, Intel Core™ i5-2xxx, Intel Core™ i3-2xxx processor series are based on the Intel microarchitecture code name Sandy Bridge and support Intel 64 architecture.

The Intel Xeon processor E3-1200 v2 product family and 3rd generation Intel Core™ processors are based on the Intel microarchitecture code name Ivy Bridge and support Intel 64 architecture.

The Next Generation Intel Core™ processors are based on the Intel microarchitecture code name Haswell and support Intel 64 architecture.

P6 family, Pentium M, Intel Core™ Solo, Intel Core™ Duo processors, dual-core Intel Xeon processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel Atom™ processor Z5xx series support IA-32 architecture.

The Intel Xeon processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel Core™2 Duo, Intel Core™2 Extreme processors, Intel Core™2 Quad processors, Pentium D processors, Pentium Dual-Core processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel 64 architecture.

IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel 64 architecture is the instruction set architecture and programming environment which is the superset of Intel's 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture.

1.2 OVERVIEW OF VOLUME 1: BASIC ARCHITECTURE

A description of this manual's content follows:

Chapter 1: About This Manual. Gives an overview of all five volumes of the Intel® 64 and IA-32 Architectures Software Developer's Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.

Chapter 2: Intel 64 and IA-32 Architectures. Introduces the Intel 64 and IA-32 architectures along with the families of Intel processors that are based on these architectures. It also gives an overview of the common features found in these processors and a brief history of the Intel 64 and IA-32 architectures.

Chapter 3: Basic Execution Environment. Introduces the models of memory organization and describes the register set used by applications.

Chapter 4: Data Types. Describes the data types and addressing modes recognized by the processor; provides an overview of real numbers and floating-point formats and of floating-point exceptions.

Chapter 5: Instruction Set Summary. Lists all Intel 64 and IA-32 instructions, divided into technology groups.

Chapter 6: Procedure Calls, Interrupts, and Exceptions. Describes the procedure stack and mechanisms provided for making procedure calls and for servicing interrupts and exceptions.

Chapter 7: Programming with General-Purpose Instructions. Describes basic load and store, program control, arithmetic, and string instructions that operate on basic data types, general-purpose and segment registers; also describes system instructions that are executed in protected mode.

Chapter 8: Programming with the x87 FPU. Describes the x87 floating-point unit (FPU), including floating-point registers and data types; gives an overview of the floating-point instruction set and describes the processor's floating-point exception conditions.

Chapter 9: Programming with Intel MMX Technology. Describes Intel MMX technology, including MMX registers and data types; also provides an overview of the MMX instruction set.

Chapter 10: Programming with Streaming SIMD Extensions (SSE). Describes SSE extensions, including XMM registers, the MXCSR register, and packed single-precision floating-point data types; provides an overview of the SSE instruction set and gives guidelines for writing code that accesses the SSE extensions.

Chapter 11: Programming with Streaming SIMD Extensions 2 (SSE2). Describes SSE2 extensions, including XMM registers and packed double-precision floating-point data types; provides an overview of the SSE2 instruction set and gives guidelines for writing code that accesses SSE2 extensions. This chapter also describes SIMD floating-point exceptions that can be generated with SSE and SSE2 instructions. It also provides general guidelines for incorporating support for SSE and SSE2 extensions into operating system and applications code.

Chapter 12: Programming with SSE3, SSSE3 and SSE4. Provides an overview of the SSE3 instruction set, Supplemental SSE3, SSE4, and guidelines for writing code that accesses these extensions.

Chapter 13: Programming with AVX. Provides an overview of the Intel AVX instruction set and gives guidelines for writing code that accesses the AVX extensions.

Chapter 14: Input/Output. Describes the processor's I/O mechanism, including I/O port addressing, I/O instructions, and I/O protection mechanisms.

Chapter 15: Processor Identification and Feature Determination. Describes how to determine the CPU type and features available in the processor.

Appendix A: EFLAGS Cross-Reference. Summarizes how the IA-32 instructions affect the flags in the EFLAGS register.

Appendix B: EFLAGS Condition Codes. Summarizes how conditional jump, move, and byte set on condition code instructions use condition code flags (OF, CF, ZF, SF, and PF) in the EFLAGS register.

Appendix C: Floating-Point Exceptions Summary. Summarizes exceptions raised by the x87 FPU floating-point and SSE/SSE2/SSE3 floating-point instructions.

Appendix D: Guidelines for Writing x87 FPU Exception Handlers. Describes how to design and write MS-DOS* compatible exception handling facilities for FPU exceptions (includes software and hardware requirements and assembly-language code examples). This appendix also describes general techniques for writing robust FPU exception handlers.

Appendix E: Guidelines for Writing SIMD Floating-Point Exception Handlers. Gives guidelines for writing exception handlers for exceptions generated by SSE/SSE2/SSE3 floating-point instructions.
...

2. Updates to Chapter 6, Volume 1


Change bars show changes to Chapter 6 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture.
------------------------------------------------------------------------------------------
...

6.3.7 Branch Functions in 64-Bit Mode

The 64-bit extensions expand branching mechanisms to accommodate branches in 64-bit linear-address space. These are:
  Near-branch semantics are redefined in 64-bit mode
  In 64-bit mode and compatibility mode, 64-bit call-gate descriptors for far calls are available

In 64-bit mode, the operand size for all near branches (CALL, RET, JCC, JCXZ, JMP, and LOOP) is forced to 64 bits. These instructions update the 64-bit RIP without the need for a REX operand-size prefix. The following aspects of near branches are controlled by the effective operand size:
  Truncation of the size of the instruction pointer
  Size of a stack pop or push, due to a CALL or RET
  Size of a stack-pointer increment or decrement, due to a CALL or RET
  Indirect-branch operand size

In 64-bit mode, all of the above actions are forced to 64 bits regardless of operand size prefixes (operand size prefixes are silently ignored). However, the displacement field for relative branches is still limited to 32 bits and the address size for near branches is not forced in 64-bit mode.

Address sizes affect the size of RCX used for JCXZ and LOOP; they also impact the address calculation for memory indirect branches. Such addresses are 64 bits by default, but they can be overridden to 32 bits by an address size prefix.

Software typically uses far branches to change privilege levels. The legacy IA-32 architecture provides the call-gate mechanism to allow software to branch from one privilege level to another, although call gates can also be used for branches that do not change privilege levels. When call gates are used, the selector portion of the direct or indirect pointer references a gate descriptor (the offset in the instruction is ignored). The offset to the destination's code segment is taken from the call-gate descriptor.

64-bit mode redefines the type value of a 32-bit call-gate descriptor type to a 64-bit call gate descriptor and expands the size of the 64-bit descriptor to hold a 64-bit offset. The 64-bit mode call-gate descriptor allows far branches that reference any location in the supported linear-address space. These call gates also hold the target code selector (CS), allowing changes to privilege level and default size as a result of the gate transition.
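
As an illustration of how a call-gate descriptor can hold both a 64-bit offset and the target code selector, the following Python sketch packs and unpacks a 16-byte gate. The field layout (offset split into three pieces, type value 0CH, reserved fields set to zero) is an assumption taken from the call-gate figure in Volume 3A, not from this section, and the function names are hypothetical.

```python
import struct

# Type value shared by the legacy 32-bit call gate and the 64-bit call gate
# (assumed here to be 0CH, per the Volume 3A descriptor-type table).
CALL_GATE_TYPE = 0xC


def pack_call_gate64(offset, selector, dpl, present=True):
    """Pack a 16-byte, little-endian 64-bit call-gate descriptor (illustrative)."""
    if not 0 <= offset < (1 << 64):
        raise ValueError("offset must fit in 64 bits")
    # P in bit 7, DPL in bits 6:5, S = 0 in bit 4, type in bits 3:0.
    access = (int(present) << 7) | ((dpl & 0x3) << 5) | CALL_GATE_TYPE
    return struct.pack(
        "<HHBBHII",
        offset & 0xFFFF,              # offset[15:0]
        selector & 0xFFFF,            # target code-segment selector
        0,                            # reserved
        access,                       # P, DPL, S, type
        (offset >> 16) & 0xFFFF,      # offset[31:16]
        (offset >> 32) & 0xFFFFFFFF,  # offset[63:32]
        0,                            # reserved, kept at zero
    )


def unpack_call_gate64(raw):
    """Recover the fields of a descriptor produced by pack_call_gate64."""
    lo, selector, _res, access, mid, hi, _res2 = struct.unpack("<HHBBHII", raw)
    return {
        "offset": lo | (mid << 16) | (hi << 32),
        "selector": selector,
        "dpl": (access >> 5) & 0x3,
        "present": bool(access >> 7),
        "type": access & 0xF,
    }
```

The sketch only models the data layout; it does not model the privilege checks the processor performs during a gate transition.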


Because immediates are generally specified up to 32 bits, the only way to specify a full 64-bit absolute RIP in 64-bit mode is with an indirect branch. For this reason, direct far branches are eliminated from the instruction set in 64-bit mode.

64-bit mode also expands the semantics of the SYSENTER and SYSEXIT instructions so that the instructions operate within a 64-bit memory space. The mode also introduces two new instructions: SYSCALL and SYSRET (which are valid only in 64-bit mode). For details, see "SYSENTER-Fast System Call", "SYSEXIT-Fast Return from Fast System Call", "SYSCALL-Fast System Call", and "SYSRET-Return From Fast System Call" in Chapter 4, "Instruction Set Reference, M-Z", of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2B.
...
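
The 32-bit limit on displacements discussed in this section can be made concrete with a short Python sketch. It assumes the usual rule that a relative near branch adds the sign-extended 32-bit displacement to the RIP of the next instruction; the helper names are hypothetical, and the reachability check deliberately ignores 64-bit wraparound and canonical-address rules.

```python
MASK64 = (1 << 64) - 1


def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed two's complement number."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def near_branch_target(next_rip, disp32):
    """Target RIP of a relative near branch: next RIP plus sign-extended disp32."""
    return (next_rip + sign_extend32(disp32)) & MASK64


def reachable_with_rel32(next_rip, target):
    """True if target lies within the signed 32-bit displacement range."""
    delta = target - next_rip
    return -(1 << 31) <= delta < (1 << 31)
```

A target for which reachable_with_rel32 returns False cannot be encoded as a direct relative branch, which is the case the text covers by requiring an indirect branch.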

3. Updates to Chapter 12, Volume 1


Change bars show changes to Chapter 12 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture.
------------------------------------------------------------------------------------------
...

12.5 OVERVIEW OF SSSE3 INSTRUCTIONS

SSSE3 provides 32 instructions to accelerate a variety of multimedia and signal processing applications employing SIMD integer data. See:
  Section 12.6, "SSSE3 Instructions", provides an introduction to individual SSSE3 instructions.
  Intel® 64 and IA-32 Architectures Software Developer's Manual, Volumes 2A & 2B, provide detailed information on individual instructions.
  Chapter 13, "System Programming for Instruction Set Extensions and Processor Extended States", in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, gives guidelines for integrating SSE/SSE2/SSE3/SSSE3 extensions into an operating-system environment.

...

12.10.4 Packed Blending Instructions


SSE4.1 adds 6 instructions used for blending (BLENDPS, BLENDPD, BLENDVPS, BLENDVPD, PBLENDVB, PBLENDW). Blending conditionally copies a data element in a source operand to the same element in the destination. SSE4.1 instructions improve blending operations for most field sizes. A single new SSE4.1 instruction can generally replace a sequence of 2 to 4 operations using previous architectures.

The variable blend instructions (BLENDVPS, BLENDVPD, PBLENDVB) introduce the use of control bits stored in an implicit XMM register (XMM0). The most significant bit in each field (the sign bit, for 2's complement integer or floating-point) is used as a selector. See Table 12-3.
...
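
The selection rule described above can be sketched in Python on lists of integer fields. This is a simplified model, not the instruction definition: blend_variable mimics a variable blend such as BLENDVPS, where the most significant bit of each mask field (the role played by XMM0) picks the source element, and blend_immediate mimics an immediate blend such as BLENDPS or PBLENDW, where bit i of the immediate picks the source for field i. Both function names are hypothetical.

```python
def blend_variable(dest, src, mask, field_bits=32):
    """Per field, take src where the mask field's most significant bit is 1."""
    msb = 1 << (field_bits - 1)
    return [s if m & msb else d for d, s, m in zip(dest, src, mask)]


def blend_immediate(dest, src, imm8):
    """Per field i, take src where bit i of the immediate is 1."""
    return [s if (imm8 >> i) & 1 else d
            for i, (d, s) in enumerate(zip(dest, src))]
```

For example, with four 32-bit fields, a mask of [0x80000000, 0, 0xFFFFFFFF, 0x7FFFFFFF] copies source fields 0 and 2 and leaves destination fields 1 and 3 unchanged, because only the sign bit of each mask field is examined.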


4. Updates to Chapter 1, Volume 2A


Change bars show changes to Chapter 1 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, A-L.
------------------------------------------------------------------------------------------
...

1.1 INTEL 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL

This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include:
  Pentium processors
  P6 family processors
  Pentium 4 processors
  Pentium M processors
  Intel Xeon processors
  Pentium D processors
  Pentium processor Extreme Editions
  64-bit Intel Xeon processors
  Intel Core™ Duo processor
  Intel Core™ Solo processor
  Dual-Core Intel Xeon processor LV
  Intel Core™2 Duo processor
  Intel Core™2 Quad processor Q6000 series
  Intel Xeon processor 3000, 3200 series
  Intel Xeon processor 5000 series
  Intel Xeon processor 5100, 5300 series
  Intel Core™2 Extreme processor X7000 and X6800 series
  Intel Core™2 Extreme QX6000 series
  Intel Xeon processor 7100 series
  Intel Pentium Dual-Core processor
  Intel Xeon processor 7200, 7300 series
  Intel Xeon processor 5200, 5400, 7400 series
  Intel Core™2 Extreme processor QX9000 and X9000 series
  Intel Core™2 Quad processor Q9000 series
  Intel Core™2 Duo processor E8000, T9000 series
  Intel Atom™ processor family
  Intel Core™ i7 processor
  Intel Core™ i5 processor
  Intel Xeon processor E7-8800/4800/2800 product families
  Intel Xeon processor E5 family
  Intel Xeon processor E3-1200 family
  Intel Core™ i7-3930K processor
  2nd generation Intel Core™ i7-2xxx, Intel Core™ i5-2xxx, Intel Core™ i3-2xxx processor series
  Intel Xeon processor E3-1200 v2 product family
  3rd generation Intel Core™ processors
  Next generation Intel Core™ processors

P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium Pro, Pentium II, Pentium III, and Pentium III Xeon processors.

The Pentium 4, Pentium D, and Pentium processor Extreme Editions are based on the Intel NetBurst microarchitecture. Most early Intel Xeon processors are based on the Intel NetBurst microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst microarchitecture.

The Intel Core™ Duo, Intel Core™ Solo and dual-core Intel Xeon processor LV are based on an improved Pentium M processor microarchitecture.

The Intel Xeon processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel Pentium dual-core, Intel Core™2 Duo, Intel Core™2 Quad, and Intel Core™2 Extreme processors are based on Intel Core™ microarchitecture.

The Intel Xeon processor 5200, 5400, 7400 series, Intel Core™2 Quad processor Q9000 series, Intel Core™2 Extreme processors QX9000, X9000 series, and Intel Core™2 processor E8000 series are based on Enhanced Intel Core™ microarchitecture.

The Intel Atom™ processor family is based on the Intel Atom™ microarchitecture and supports Intel 64 architecture.

The Intel Core™ i7 processor and the Intel Core™ i5 processor are based on the Intel microarchitecture code name Nehalem and support Intel 64 architecture.

Processors based on Intel microarchitecture code name Westmere support Intel 64 architecture.

The Intel Xeon processor E5 family, Intel Xeon processor E3-1200 family, Intel Xeon processor E7-8800/4800/2800 product families, Intel Core™ i7-3930K processor, and 2nd generation Intel Core™ i7-2xxx, Intel Core™ i5-2xxx, Intel Core™ i3-2xxx processor series are based on the Intel microarchitecture code name Sandy Bridge and support Intel 64 architecture.

The Intel Xeon processor E3-1200 v2 product family and 3rd generation Intel Core™ processors are based on the Intel microarchitecture code name Ivy Bridge and support Intel 64 architecture.

The Next Generation Intel Core™ processors are based on the Intel microarchitecture code name Haswell and support Intel 64 architecture.

P6 family, Pentium M, Intel Core™ Solo, Intel Core™ Duo processors, dual-core Intel Xeon processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel Atom™ processor Z5xx series support IA-32 architecture.

The Intel Xeon processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel Core™2 Duo, Intel Core™2 Extreme, Intel Core™2 Quad processors, Pentium D processors, Pentium Dual-Core processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel 64 architecture.

IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel 64 architecture is the instruction set architecture and programming environment which is the superset of Intel's 32-bit and 64-bit architectures. It is compatible with the IA-32 architecture.

...

5. Updates to Chapter 2, Volume 2A


Change bars show changes to Chapter 2 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, A-L.
------------------------------------------------------------------------------------------
...

Table 2-14 Instructions in each Exception Class


Exception Class: Instruction

Type 1: (V)MOVAPD, (V)MOVAPS, (V)MOVDQA, (V)MOVNTDQ, (V)MOVNTDQA, (V)MOVNTPD, (V)MOVNTPS

Type 2: (V)ADDPD, (V)ADDPS, (V)ADDSUBPD, (V)ADDSUBPS, (V)CMPPD, (V)CMPPS, (V)CVTDQ2PS, (V)CVTPD2DQ, (V)CVTPD2PS, (V)CVTPS2DQ, (V)CVTTPD2DQ, (V)CVTTPS2DQ, (V)DIVPD, (V)DIVPS, (V)DPPD*, (V)DPPS*, (V)HADDPD, (V)HADDPS, (V)HSUBPD, (V)HSUBPS, (V)MAXPD, (V)MAXPS, (V)MINPD, (V)MINPS, (V)MULPD, (V)MULPS, (V)ROUNDPD, (V)ROUNDPS, (V)SQRTPD, (V)SQRTPS, (V)SUBPD, (V)SUBPS

Type 3: (V)ADDSD, (V)ADDSS, (V)CMPSD, (V)CMPSS, (V)COMISD, (V)COMISS, (V)CVTPS2PD, (V)CVTSD2SI, (V)CVTSD2SS, (V)CVTSI2SD, (V)CVTSI2SS, (V)CVTSS2SD, (V)CVTSS2SI, (V)CVTTSD2SI, (V)CVTTSS2SI, (V)DIVSD, (V)DIVSS, (V)MAXSD, (V)MAXSS, (V)MINSD, (V)MINSS, (V)MULSD, (V)MULSS, (V)ROUNDSD, (V)ROUNDSS, (V)SQRTSD, (V)SQRTSS, (V)SUBSD, (V)SUBSS, (V)UCOMISD, (V)UCOMISS

Type 4: (V)AESDEC, (V)AESDECLAST, (V)AESENC, (V)AESENCLAST, (V)AESIMC, (V)AESKEYGENASSIST, (V)ANDPD, (V)ANDPS, (V)ANDNPD, (V)ANDNPS, (V)BLENDPD, (V)BLENDPS, VBLENDVPD, VBLENDVPS, (V)LDDQU, (V)MASKMOVDQU, (V)PTEST, VTESTPS, VTESTPD, (V)MOVDQU*, (V)MOVSHDUP, (V)MOVSLDUP, (V)MOVUPD*, (V)MOVUPS*, (V)MPSADBW, (V)ORPD, (V)ORPS, (V)PABSB, (V)PABSW, (V)PABSD, (V)PACKSSWB, (V)PACKSSDW, (V)PACKUSWB, (V)PACKUSDW, (V)PADDB, (V)PADDW, (V)PADDD, (V)PADDQ, (V)PADDSB, (V)PADDSW, (V)PADDUSB, (V)PADDUSW, (V)PALIGNR, (V)PAND, (V)PANDN, (V)PAVGB, (V)PAVGW, (V)PBLENDVB, (V)PBLENDW, (V)PCMP(E/I)STRI/M***, (V)PCMPEQB, (V)PCMPEQW, (V)PCMPEQD, (V)PCMPEQQ, (V)PCMPGTB, (V)PCMPGTW, (V)PCMPGTD, (V)PCMPGTQ, (V)PCLMULQDQ, (V)PHADDW, (V)PHADDD, (V)PHADDSW, (V)PHMINPOSUW, (V)PHSUBD, (V)PHSUBW, (V)PHSUBSW, (V)PMADDWD, (V)PMADDUBSW, (V)PMAXSB, (V)PMAXSW, (V)PMAXSD, (V)PMAXUB, (V)PMAXUW, (V)PMAXUD, (V)PMINSB, (V)PMINSW, (V)PMINSD, (V)PMINUB, (V)PMINUW, (V)PMINUD, (V)PMULHUW, (V)PMULHRSW, (V)PMULHW, (V)PMULLW, (V)PMULLD, (V)PMULUDQ, (V)PMULDQ, (V)POR, (V)PSADBW, (V)PSHUFB, (V)PSHUFD, (V)PSHUFHW, (V)PSHUFLW, (V)PSIGNB, (V)PSIGNW, (V)PSIGND, (V)PSLLW, (V)PSLLD, (V)PSLLQ, (V)PSRAW, (V)PSRAD, (V)PSRLW, (V)PSRLD, (V)PSRLQ, (V)PSUBB, (V)PSUBW, (V)PSUBD, (V)PSUBQ, (V)PSUBSB, (V)PSUBSW, (V)PUNPCKHBW, (V)PUNPCKHWD, (V)PUNPCKHDQ, (V)PUNPCKHQDQ, (V)PUNPCKLBW, (V)PUNPCKLWD, (V)PUNPCKLDQ, (V)PUNPCKLQDQ, (V)PXOR, (V)RCPPS, (V)RSQRTPS, (V)SHUFPD, (V)SHUFPS, (V)UNPCKHPD, (V)UNPCKHPS, (V)UNPCKLPD, (V)UNPCKLPS, (V)XORPD, (V)XORPS

Type 5: (V)CVTDQ2PD, (V)EXTRACTPS, (V)INSERTPS, (V)MOVD, (V)MOVQ, (V)MOVDDUP, (V)MOVLPD, (V)MOVLPS, (V)MOVHPD, (V)MOVHPS, (V)MOVSD, (V)MOVSS, (V)PEXTRB, (V)PEXTRD, (V)PEXTRW, (V)PEXTRQ, (V)PINSRB, (V)PINSRD, (V)PINSRW, (V)PINSRQ, (V)RCPSS, (V)RSQRTSS, (V)PMOVSX/ZX, VLDMXCSR*, VSTMXCSR

Type 6: VEXTRACTF128, VPERMILPD, VPERMILPS, VPERM2F128, VBROADCASTSS, VBROADCASTSD, VBROADCASTF128, VINSERTF128, VMASKMOVPS**, VMASKMOVPD**

Type 7: (V)MOVLHPS, (V)MOVHLPS, (V)MOVMSKPD, (V)MOVMSKPS, (V)PMOVMSKB, (V)PSLLDQ, (V)PSRLDQ, (V)PSLLW, (V)PSLLD, (V)PSLLQ, (V)PSRAW, (V)PSRAD, (V)PSRLW, (V)PSRLD, (V)PSRLQ

Type 8: VZEROALL, VZEROUPPER

(*) - Additional exception restrictions are present; see the instruction description for details.
(**) - Instruction behavior on alignment check reporting with mask bits of less than all 1s is the same as with mask bits of all 1s, i.e., no alignment checks are performed.
(***) - PCMPESTRI, PCMPESTRM, PCMPISTRI, and PCMPISTRM instructions do not cause #GP if the memory operand is not aligned to a 16-byte boundary.
...

2.4.4 Exceptions Type 4 (>=16 Byte mem arg no alignment, no floating-point exceptions)


Table 2-20 Type 4 Class Exception Conditions

Column headings: Exception; Real; Virtual 80x86; Protected and Compatibility; 64-bit; Cause of Exception.

Invalid Opcode, #UD
  VEX prefix.
  VEX prefix: If XCR0[2:1] != 11b. If CR4.OSXSAVE[bit 18]=0.
  Legacy SSE instruction: If CR0.EM[bit 2] = 1. If CR4.OSFXSR[bit 9] = 0.
  If preceded by a LOCK prefix (F0H).
  If any REX, F2, F3, or 66 prefixes precede a VEX prefix.
  If any corresponding CPUID feature flag is 0.

Device Not Available, #NM
  If CR0.TS[bit 3]=1.

Stack, #SS(0)
  For an illegal address in the SS segment.
  If a memory address referencing the SS segment is in a non-canonical form.

General Protection, #GP(0)
  Legacy SSE: Memory operand is not 16-byte aligned. (See note 1.)
  For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
  If the memory address is in a non-canonical form.
  If any part of the operand lies outside the effective address space from 0 to FFFFH.

Page Fault, #PF(fault-code)
  For a page fault.

NOTES:
1. PCMPESTRI, PCMPESTRM, PCMPISTRI, and PCMPISTRM instructions do not cause #GP if the memory operand is not aligned to a 16-byte boundary.
...
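The state-dependent #UD causes listed for Type 4 can be sketched as two small predicates in C. This is only an illustration: the function names and the idea of passing raw CR0, CR4, and XCR0 values as integers are assumptions made for the example, and the sketch does not model prefix checks, CPUID feature flags, or which processor modes each cause applies to.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit positions as quoted in the #UD causes of the Type 4 table. */
#define CR0_EM_BIT       2   /* CR0.EM[bit 2]       */
#define CR4_OSFXSR_BIT   9   /* CR4.OSFXSR[bit 9]   */
#define CR4_OSXSAVE_BIT  18  /* CR4.OSXSAVE[bit 18] */

/* Hypothetical helper: true if the VEX-prefix state conditions
   (CR4.OSXSAVE = 0, or XCR0[2:1] != 11b) would lead to #UD. */
static bool vex_state_raises_ud(uint64_t cr4, uint64_t xcr0)
{
    bool osxsave = ((cr4 >> CR4_OSXSAVE_BIT) & 1u) != 0;
    bool sse_and_avx = ((xcr0 >> 1) & 0x3u) == 0x3u; /* XCR0[2:1] == 11b */
    return !osxsave || !sse_and_avx;
}

/* Hypothetical helper: true if the legacy SSE state conditions
   (CR0.EM = 1, or CR4.OSFXSR = 0) would lead to #UD. */
static bool legacy_sse_state_raises_ud(uint64_t cr0, uint64_t cr4)
{
    bool em = ((cr0 >> CR0_EM_BIT) & 1u) != 0;
    bool osfxsr = ((cr4 >> CR4_OSFXSR_BIT) & 1u) != 0;
    return em || !osfxsr;
}
```

Control registers are only readable at privilege level 0, so a user-mode program would rely on CPUID and XGETBV rather than on values like these.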


6. Updates to Chapter 3, Volume 2A


Change bars show changes to Chapter 3 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2A: Instruction Set Reference, A-L.
------------------------------------------------------------------------------------------
...

ADC - Add with Carry


Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
14 ib | ADC AL, imm8 | I | Valid | Valid | Add with carry imm8 to AL.
15 iw | ADC AX, imm16 | I | Valid | Valid | Add with carry imm16 to AX.
15 id | ADC EAX, imm32 | I | Valid | Valid | Add with carry imm32 to EAX.
REX.W + 15 id | ADC RAX, imm32 | I | Valid | N.E. | Add with carry imm32 sign extended to 64-bits to RAX.
80 /2 ib | ADC r/m8, imm8 | MI | Valid | Valid | Add with carry imm8 to r/m8.
REX + 80 /2 ib | ADC r/m8*, imm8 | MI | Valid | N.E. | Add with carry imm8 to r/m8.
81 /2 iw | ADC r/m16, imm16 | MI | Valid | Valid | Add with carry imm16 to r/m16.
81 /2 id | ADC r/m32, imm32 | MI | Valid | Valid | Add with CF imm32 to r/m32.
REX.W + 81 /2 id | ADC r/m64, imm32 | MI | Valid | N.E. | Add with CF imm32 sign extended to 64-bits to r/m64.
83 /2 ib | ADC r/m16, imm8 | MI | Valid | Valid | Add with CF sign-extended imm8 to r/m16.
83 /2 ib | ADC r/m32, imm8 | MI | Valid | Valid | Add with CF sign-extended imm8 into r/m32.
REX.W + 83 /2 ib | ADC r/m64, imm8 | MI | Valid | N.E. | Add with CF sign-extended imm8 into r/m64.
10 /r | ADC r/m8, r8 | MR | Valid | Valid | Add with carry byte register to r/m8.
REX + 10 /r | ADC r/m8*, r8* | MR | Valid | N.E. | Add with carry byte register to r/m8.
11 /r | ADC r/m16, r16 | MR | Valid | Valid | Add with carry r16 to r/m16.
11 /r | ADC r/m32, r32 | MR | Valid | Valid | Add with CF r32 to r/m32.
REX.W + 11 /r | ADC r/m64, r64 | MR | Valid | N.E. | Add with CF r64 to r/m64.
12 /r | ADC r8, r/m8 | RM | Valid | Valid | Add with carry r/m8 to byte register.
REX + 12 /r | ADC r8*, r/m8* | RM | Valid | N.E. | Add with carry r/m8 to byte register.
13 /r | ADC r16, r/m16 | RM | Valid | Valid | Add with carry r/m16 to r16.
13 /r | ADC r32, r/m32 | RM | Valid | Valid | Add with CF r/m32 to r32.
REX.W + 13 /r | ADC r64, r/m64 | RM | Valid | N.E. | Add with CF r/m64 to r64.

NOTES:
* In 64-bit mode, r/m8 cannot be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.


Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
RM | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA
MR | ModRM:r/m (r, w) | ModRM:reg (r) | NA | NA
MI | ModRM:r/m (r, w) | imm8 | NA | NA
I | AL/AX/EAX/RAX | imm8 | NA | NA

Description
Adds the destination operand (first operand), the source operand (second operand), and the carry (CF) flag and stores the result in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a carry from a previous addition. When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.

The ADC instruction does not distinguish between signed and unsigned operands. Instead, the processor evaluates the result for both data types and sets the OF and CF flags to indicate a carry in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.

The ADC instruction is usually executed as part of a multibyte or multiword addition in which an ADD instruction is followed by an ADC instruction.

This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.

In 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.

Operation
DEST ← DEST + SRC + CF;

Intel C/C++ Compiler Intrinsic Equivalent


ADC: extern unsigned char _addcarry_u8(unsigned char c_in, unsigned char src1, unsigned char src2, unsigned char *sum_out);
ADC: extern unsigned char _addcarry_u16(unsigned char c_in, unsigned short src1, unsigned short src2, unsigned short *sum_out);
ADC: extern unsigned char _addcarry_u32(unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *sum_out);
ADC: extern unsigned char _addcarry_u64(unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *sum_out);
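The multiword addition pattern from the Description (an ADD on the lowest word, then ADC on each higher word) can be sketched in portable C without the intrinsics. The helper names `adc_u64` and `add_multiword` are hypothetical and exist only for this illustration; the sketch models just the sum and the carry out, not the other flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of one ADC step on 64-bit words:
   *sum = a + b + carry_in (mod 2^64); returns the carry out (0 or 1). */
static unsigned char adc_u64(unsigned char carry_in, uint64_t a, uint64_t b,
                             uint64_t *sum)
{
    assert(carry_in <= 1);
    uint64_t partial = a + b;                 /* wraps modulo 2^64 */
    unsigned char c1 = partial < a;           /* carry out of a + b */
    uint64_t total = partial + carry_in;
    unsigned char c2 = total < partial;       /* carry out of adding carry_in */
    *sum = total;
    return (unsigned char)(c1 | c2);          /* at most one of c1, c2 is set */
}

/* Adds two n-word integers stored least significant word first.
   Returns the carry out of the most significant word. */
static unsigned char add_multiword(const uint64_t *a, const uint64_t *b,
                                   uint64_t *out, size_t n)
{
    unsigned char carry = 0;  /* a zero carry-in makes the first step act like ADD */
    for (size_t i = 0; i < n; i++)
        carry = adc_u64(carry, a[i], b[i], &out[i]);
    return carry;
}
```

Starting the chain with a zero carry keeps the loop uniform; a compiler targeting this architecture may or may not turn it into an ADD/ADC sequence.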

Flags Affected
The OF, SF, ZF, AF, CF, and PF flags are set according to the result.

Protected Mode Exceptions


#GP(0)
  If the destination is located in a non-writable segment.
  If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
  If the DS, ES, FS, or GS register is used to access memory and it contains a NULL segment selector.
#SS(0)
  If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)
  If a page fault occurs.
#AC(0)
  If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD
  If the LOCK prefix is used but the destination is not a memory operand.

Real-Address Mode Exceptions


#GP
  If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS
  If a memory operand effective address is outside the SS segment limit.
#UD
  If the LOCK prefix is used but the destination is not a memory operand.

Virtual-8086 Mode Exceptions


#GP(0)
  If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS(0)
  If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)
  If a page fault occurs.
#AC(0)
  If alignment checking is enabled and an unaligned memory reference is made.
#UD
  If the LOCK prefix is used but the destination is not a memory operand.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


#SS(0)
  If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)
  If the memory address is in a non-canonical form.
#PF(fault-code)
  If a page fault occurs.
#AC(0)
  If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD
  If the LOCK prefix is used but the destination is not a memory operand.
...

CLFLUSH - Flush Cache Line


Opcode | Instruction | Op/En | 64-bit Mode | Compat/Leg Mode | Description
0F AE /7 | CLFLUSH m8 | M | Valid | Valid | Flushes cache line containing m8.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
M | ModRM:r/m (w) | NA | NA | NA

Description
Invalidates the cache line that contains the linear address specified with the source operand from all levels of the processor cache hierarchy (data and instruction). The invalidation is broadcast throughout the cache coherence domain. If, at any level of the cache hierarchy, the line is inconsistent with memory (dirty) it is written to memory before invalidation. The source operand is a byte memory location.

The availability of CLFLUSH is indicated by the presence of the CPUID feature flag CLFSH (bit 19 of the EDX register, see "CPUID - CPU Identification" in this chapter). The aligned cache line size affected is also indicated with the CPUID instruction (bits 8 through 15 of the EBX register when the initial value in the EAX register is 1).

The memory attribute of the page containing the affected line has no effect on the behavior of this instruction. Processors are free to speculatively fetch and cache data from system memory regions assigned a memory-type allowing for speculative reads (such as the WB, WC, and WT memory types). PREFETCHh instructions can be used to provide the processor with hints for this speculative behavior. Because this speculative fetching can occur at any time and is not tied to instruction execution, the CLFLUSH instruction is not ordered with respect to PREFETCHh instructions or any of the speculative fetching mechanisms (that is, data can be speculatively loaded into a cache line just before, during, or after the execution of a CLFLUSH instruction that references the cache line).

CLFLUSH is only ordered by the MFENCE instruction. It is not guaranteed to be ordered by any other fencing or serializing instructions or by another CLFLUSH instruction. For example, software can use an MFENCE instruction to ensure that previous stores are included in the write-back.

The CLFLUSH instruction can be used at all privilege levels and is subject to all permission checking and faults associated with a byte load (and in addition, a CLFLUSH instruction is allowed to flush a linear address in an execute-only segment). Like a load, the CLFLUSH instruction sets the A bit but not the D bit in the page tables.

The CLFLUSH instruction was introduced with the SSE2 extensions; however, because it has its own CPUID feature flag, it can be implemented in IA-32 processors that do not include the SSE2 extensions. Also, detecting the presence of the SSE2 extensions with the CPUID instruction does not guarantee that the CLFLUSH instruction is implemented in the processor.

CLFLUSH operation is the same in non-64-bit modes and 64-bit mode.
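The CPUID fields mentioned in the Description can be decoded as sketched below. The helper names are hypothetical, and the register values are passed in as plain integers so the example stays portable; obtaining them from the CPUID instruction is compiler-specific and is left out. The multiplication by 8 follows the CPUID table's definition of the CLFLUSH line size field (EBX bits 15-08).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* CPUID.01H:EDX.CLFSH[bit 19] indicates that CLFLUSH is available. */
static bool clflush_supported(uint32_t cpuid1_edx)
{
    return ((cpuid1_edx >> 19) & 1u) != 0;
}

/* CPUID.01H:EBX[15:8] holds the CLFLUSH line size in units of 8 bytes. */
static unsigned clflush_line_bytes(uint32_t cpuid1_ebx)
{
    return ((cpuid1_ebx >> 8) & 0xFFu) * 8u;
}

/* Hypothetical helper: base address of the aligned line containing addr.
   line_bytes must be a nonzero power of two. */
static uintptr_t line_base(uintptr_t addr, unsigned line_bytes)
{
    assert(line_bytes != 0 && (line_bytes & (line_bytes - 1u)) == 0);
    return addr & ~(uintptr_t)(line_bytes - 1u);
}
```

A flush loop over a buffer would typically step from `line_base(start, n)` in increments of `n` bytes and end with an MFENCE, given the ordering rules above.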

Operation
Flush_Cache_Line(SRC);

Intel C/C++ Compiler Intrinsic Equivalents


CLFLUSH: void _mm_clflush(void const *p)

Protected Mode Exceptions


#GP(0)
  For an illegal memory operand effective address in the CS, DS, ES, FS or GS segments.
#SS(0)
  For an illegal address in the SS segment.
#PF(fault-code)
  For a page fault.
#UD
  If CPUID.01H:EDX.CLFSH[bit 19] = 0.
  If the LOCK prefix is used.
  If instruction prefix is 66H, F2H or F3H.

Real-Address Mode Exceptions


#GP
  If any part of the operand lies outside the effective address space from 0 to FFFFH.
#UD
  If CPUID.01H:EDX.CLFSH[bit 19] = 0.
  If the LOCK prefix is used.
  If instruction prefix is 66H, F2H or F3H.

Virtual-8086 Mode Exceptions


Same exceptions as in real address mode.
#PF(fault-code)
  For a page fault.


Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


#SS(0)
  If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)
  If the memory address is in a non-canonical form.
#PF(fault-code)
  For a page fault.
#UD
  If CPUID.01H:EDX.CLFSH[bit 19] = 0.
  If the LOCK prefix is used.
  If instruction prefix is 66H, F2H or F3H.
...

Table 3-17 Information Returned by CPUID Instruction


Layout: Initial EAX Value, then each output register with the Information Provided about the Processor.

Basic CPUID Information

0H
  EAX: Maximum Input Value for Basic CPUID Information (see Table 3-18)
  EBX: "Genu"
  ECX: "ntel"
  EDX: "ineI"

01H
  EAX: Version Information: Type, Family, Model, and Stepping ID (see Figure 3-5)
  EBX: Bits 07-00: Brand Index
       Bits 15-08: CLFLUSH line size (Value * 8 = cache line size in bytes)
       Bits 23-16: Maximum number of addressable IDs for logical processors in this physical package*.
       Bits 31-24: Initial APIC ID
  ECX: Feature Information (see Figure and Table 3-20)
  EDX: Feature Information (see Figure 3-7 and Table 3-21)
  NOTES:
  * The nearest power-of-2 integer that is not smaller than EBX[23:16] is the number of unique initial APIC IDs reserved for addressing different logical processors in a physical package. This field is only valid if CPUID.1.EDX.HTT[bit 28]= 1.

02H
  EAX: Cache and TLB Information (see Table 3-22)
  EBX: Cache and TLB Information
  ECX: Cache and TLB Information
  EDX: Cache and TLB Information

03H
  EAX: Reserved.
  EBX: Reserved.
  ECX: Bits 00-31 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the value in this register is reserved.)
  EDX: Bits 32-63 of 96 bit processor serial number. (Available in Pentium III processor only; otherwise, the value in this register is reserved.)


  NOTES:
  Processor serial number (PSN) is not supported in the Pentium 4 processor or later. On all models, use the PSN flag (returned using CPUID) to check for PSN support before accessing the feature.
  See AP-485, Intel Processor Identification and the CPUID Instruction (Order Number 241618) for more information on PSN.

CPUID leaves > 3 < 80000000 are visible only when IA32_MISC_ENABLE.BOOT_NT4[bit 22] = 0 (default).

Deterministic Cache Parameters Leaf

04H
  NOTES:
  Leaf 04H output depends on the initial value in ECX.*
  See also: "INPUT EAX = 4: Returns Deterministic Cache Parameters for each level" on page 3-166.
  EAX: Bits 04-00: Cache Type Field
         0 = Null - No more caches
         1 = Data Cache
         2 = Instruction Cache
         3 = Unified Cache
         4-31 = Reserved
       Bits 07-05: Cache Level (starts at 1)
       Bit 08: Self Initializing cache level (does not need SW initialization)
       Bit 09: Fully Associative cache
       Bits 13-10: Reserved
       Bits 25-14: Maximum number of addressable IDs for logical processors sharing this cache**, ***
       Bits 31-26: Maximum number of addressable IDs for processor cores in the physical package**, ****, *****
  EBX: Bits 11-00: L = System Coherency Line Size**
       Bits 21-12: P = Physical Line partitions**
       Bits 31-22: W = Ways of associativity**
  ECX: Bits 31-00: S = Number of Sets**
  EDX: Bit 0: Write-Back Invalidate/Invalidate
         0 = WBINVD/INVD from threads sharing this cache acts upon lower level caches for threads sharing this cache.
         1 = WBINVD/INVD is not guaranteed to act upon lower level caches of non-originating threads sharing this cache.
       Bit 1: Cache Inclusiveness
         0 = Cache is not inclusive of lower cache levels.
         1 = Cache is inclusive of lower cache levels.
       Bit 2: Complex Cache Indexing
         0 = Direct mapped cache.
         1 = A complex function is used to index the cache, potentially using all address bits.
       Bits 31-03: Reserved = 0


  NOTES:
  * If ECX contains an invalid sub leaf index, EAX/EBX/ECX/EDX return 0. Invalid sub-leaves of EAX = 04H: ECX = n, n > 3.
  ** Add one to the return value to get the result.
  *** The nearest power-of-2 integer that is not smaller than (1 + EAX[25:14]) is the number of unique initial APIC IDs reserved for addressing different logical processors sharing this cache.
  **** The nearest power-of-2 integer that is not smaller than (1 + EAX[31:26]) is the number of unique Core_IDs reserved for addressing different processor cores in a physical package. Core ID is a subset of bits of the initial APIC ID.
  ***** The returned value is constant for valid initial values in ECX. Valid ECX values start from 0.

MONITOR/MWAIT Leaf

05H
  EAX: Bits 15-00: Smallest monitor-line size in bytes (default is processor's monitor granularity)
       Bits 31-16: Reserved = 0
  EBX: Bits 15-00: Largest monitor-line size in bytes (default is processor's monitor granularity)
       Bits 31-16: Reserved = 0
  ECX: Bit 00: Enumeration of Monitor-Mwait extensions (beyond EAX and EBX registers) supported
       Bit 01: Supports treating interrupts as break-event for MWAIT, even when interrupts disabled
       Bits 31 - 02: Reserved
  EDX: Bits 03 - 00: Number of C0* sub C-states supported using MWAIT
       Bits 07 - 04: Number of C1* sub C-states supported using MWAIT
       Bits 11 - 08: Number of C2* sub C-states supported using MWAIT
       Bits 15 - 12: Number of C3* sub C-states supported using MWAIT
       Bits 19 - 16: Number of C4* sub C-states supported using MWAIT
       Bits 31 - 20: Reserved = 0
  NOTE:
  * The definition of C0 through C4 states for MWAIT extension are processor-specific C-states, not ACPI C-states.

Thermal and Power Management Leaf

06H
  EAX: Bit 00: Digital temperature sensor is supported if set
       Bit 01: Intel Turbo Boost Technology Available (see description of IA32_MISC_ENABLE[38]).
       Bit 02: ARAT. APIC-Timer-always-running feature is supported if set.
       Bit 03: Reserved
       Bit 04: PLN. Power limit notification controls are supported if set.
       Bit 05: ECMD. Clock modulation duty cycle extension is supported if set.
       Bit 06: PTM. Package thermal management is supported if set.
       Bits 31 - 07: Reserved
  EBX: Bits 03 - 00: Number of Interrupt Thresholds in Digital Thermal Sensor
       Bits 31 - 04: Reserved


  ECX: Bit 00: Hardware Coordination Feedback Capability (Presence of IA32_MPERF and IA32_APERF). The capability to provide a measure of delivered processor performance (since last reset of the counters), as a percentage of expected processor performance at frequency specified in CPUID Brand String
       Bits 02 - 01: Reserved = 0
       Bit 03: The processor supports performance-energy bias preference if CPUID.06H:ECX.SETBH[bit 3] is set and it also implies the presence of a new architectural MSR called IA32_ENERGY_PERF_BIAS (1B0H)
       Bits 31 - 04: Reserved = 0
  EDX: Reserved = 0

Structured Extended Feature Flags Enumeration Leaf (Output depends on ECX input value)

07H
  Sub-leaf 0 (Input ECX = 0). *
  EAX: Bits 31-00: Reports the maximum input value for supported leaf 7 sub-leaves.
  EBX: Bit 00: FSGSBASE. Supports RDFSBASE/RDGSBASE/WRFSBASE/WRGSBASE if 1.
       Bit 01: IA32_TSC_ADJUST MSR is supported if 1.
       Bit 06: Reserved
       Bit 07: SMEP. Supports Supervisor Mode Execution Protection if 1.
       Bit 08: Reserved
       Bit 09: Supports Enhanced REP MOVSB/STOSB if 1.
       Bit 10: INVPCID. If 1, supports INVPCID instruction for system software that manages process-context identifiers.
       Bits 31-11: Reserved
  ECX: Reserved
  EDX: Reserved
  NOTE:
  * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Invalid sub-leaves of EAX = 07H: ECX = n, n > 0.

Direct Cache Access Information Leaf

09H
  EAX: Value of bits [31:0] of IA32_PLATFORM_DCA_CAP MSR (address 1F8H)
  EBX: Reserved
  ECX: Reserved
  EDX: Reserved

Architectural Performance Monitoring Leaf

0AH
  EAX: Bits 07 - 00: Version ID of architectural performance monitoring
       Bits 15 - 08: Number of general-purpose performance monitoring counter per logical processor
       Bits 23 - 16: Bit width of general-purpose, performance monitoring counter
       Bits 31 - 24: Length of EBX bit vector to enumerate architectural performance monitoring events


  EBX: Bit 00: Core cycle event not available if 1
       Bit 01: Instruction retired event not available if 1
       Bit 02: Reference cycles event not available if 1
       Bit 03: Last-level cache reference event not available if 1
       Bit 04: Last-level cache misses event not available if 1
       Bit 05: Branch instruction retired event not available if 1
       Bit 06: Branch mispredict retired event not available if 1
       Bits 31 - 07: Reserved = 0
  ECX: Reserved = 0
  EDX: Bits 04 - 00: Number of fixed-function performance counters (if Version ID > 1)
       Bits 12 - 05: Bit width of fixed-function performance counters (if Version ID > 1)
       Reserved = 0

Extended Topology Enumeration Leaf

0BH
  NOTES:
  Most of Leaf 0BH output depends on the initial value in ECX.
  EDX output does not vary with the initial value in ECX.
  ECX[7:0] output always reflects the initial value in ECX.
  If ECX contains an invalid sub-leaf index, EAX/EBX/EDX return 0; ECX returns same ECX input. Invalid sub-leaves of EAX = 0BH: ECX = n, n > 1.
  Leaf 0BH exists if EBX[15:0] is not zero.
  EAX: Bits 04-00: Number of bits to shift right on x2APIC ID to get a unique topology ID of the next level type*. All logical processors with the same next level ID share current level.
       Bits 31-05: Reserved.
  EBX: Bits 15 - 00: Number of logical processors at this level type. The number reflects configuration as shipped by Intel**.
       Bits 31 - 16: Reserved.
  ECX: Bits 07 - 00: Level number. Same value in ECX input
       Bits 15 - 08: Level type***.
       Bits 31 - 16: Reserved.
  EDX: Bits 31 - 00: x2APIC ID of the current logical processor.
  NOTES:
  * Software should use this field (EAX[4:0]) to enumerate processor topology of the system.
  ** Software must not use EBX[15:0] to enumerate processor topology of the system. The value in this field (EBX[15:0]) is only intended for display/diagnostic purposes. The actual number of logical processors available to BIOS/OS/Applications may be different from the value of EBX[15:0], depending on software and platform hardware configurations.
  *** The value of the level type field is not related to level numbers in any way; higher level type values do not mean higher levels. Level type field has the following encoding:
      0 : invalid
      1 : SMT
      2 : Core
      3-255 : Reserved


Processor Extended State Enumeration Main Leaf (EAX = 0DH, ECX = 0)

0DH
  NOTES:
  Leaf 0DH main leaf (ECX = 0).
  EAX: Bits 31-00: Reports the valid bit fields of the lower 32 bits of XCR0. If a bit is 0, the corresponding bit field in XCR0 is reserved.
       Bit 00: legacy x87
       Bit 01: 128-bit SSE
       Bit 02: 256-bit AVX
       Bits 31 - 03: Reserved
  EBX: Bits 31-00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) required by enabled features in XCR0. May be different than ECX if some features at the end of the XSAVE save area are not enabled.
  ECX: Bits 31-00: Maximum size (bytes, from the beginning of the XSAVE/XRSTOR save area) of the XSAVE/XRSTOR save area required by all supported features in the processor, i.e., all the valid bit fields in XCR0.
  EDX: Bits 31-00: Reports the valid bit fields of the upper 32 bits of XCR0. If a bit is 0, the corresponding bit field in XCR0 is reserved.

Processor Extended State Enumeration Sub-leaf (EAX = 0DH, ECX = 1)

0DH
  EAX: Bits 31-01: Reserved
       Bit 00: XSAVEOPT is available
  EBX: Reserved
  ECX: Reserved
  EDX: Reserved

Processor Extended State Enumeration Sub-leaves (EAX = 0DH, ECX = n, n > 1)

0DH
  NOTES:
  Leaf 0DH output depends on the initial value in ECX.
  Each valid sub-leaf index maps to a valid bit in the XCR0 register starting at bit position 2.
  * If ECX contains an invalid sub-leaf index, EAX/EBX/ECX/EDX return 0. Invalid sub-leaves of EAX = 0DH: ECX = n, n > 2.
  EAX: Bits 31-0: The size in bytes (from the offset specified in EBX) of the save area for an extended state feature associated with a valid sub-leaf index, n. This field reports 0 if the sub-leaf index, n, is invalid*.
  EBX: Bits 31-0: The offset in bytes of this extended state component's save area from the beginning of the XSAVE/XRSTOR area. This field reports 0 if the sub-leaf index, n, is invalid*.
  ECX: This field reports 0 if the sub-leaf index, n, is invalid*; otherwise it is reserved.
  EDX: This field reports 0 if the sub-leaf index, n, is invalid*; otherwise it is reserved.

Unimplemented CPUID Leaf Functions

40000000H - 4FFFFFFFH
  Invalid. No existing or future CPU will return processor identification or feature information if the initial EAX value is in the range 40000000H to 4FFFFFFFH.

Extended Function CPUID Information


80000000H
  EAX: Maximum Input Value for Extended Function CPUID Information (see Table 3-18).
  EBX: Reserved
  ECX: Reserved
  EDX: Reserved

80000001H
  EAX: Extended Processor Signature and Feature Bits.
  EBX: Reserved
  ECX: Bit 00: LAHF/SAHF available in 64-bit mode
       Bits 31-01: Reserved
  EDX: Bits 10-00: Reserved
       Bit 11: SYSCALL/SYSRET available in 64-bit mode
       Bits 19-12: Reserved = 0
       Bit 20: Execute Disable Bit available
       Bits 25-21: Reserved = 0
       Bit 26: 1-GByte pages are available if 1
       Bit 27: RDTSCP and IA32_TSC_AUX are available if 1
       Bit 28: Reserved = 0
       Bit 29: Intel 64 Architecture available if 1
       Bits 31-30: Reserved = 0

80000002H
  EAX: Processor Brand String
  EBX: Processor Brand String Continued
  ECX: Processor Brand String Continued
  EDX: Processor Brand String Continued

80000003H
  EAX: Processor Brand String Continued
  EBX: Processor Brand String Continued
  ECX: Processor Brand String Continued
  EDX: Processor Brand String Continued

80000004H
  EAX: Processor Brand String Continued
  EBX: Processor Brand String Continued
  ECX: Processor Brand String Continued
  EDX: Processor Brand String Continued

80000005H
  EAX: Reserved = 0
  EBX: Reserved = 0
  ECX: Reserved = 0
  EDX: Reserved = 0

80000006H
  EAX: Reserved = 0
  EBX: Reserved = 0
  ECX: Bits 07-00: Cache Line size in bytes
       Bits 11-08: Reserved
       Bits 15-12: L2 Associativity field *
       Bits 31-16: Cache size in 1K units
  EDX: Reserved = 0


  NOTES:
  * L2 associativity field encodings:
      00H - Disabled
      01H - Direct mapped
      02H - 2-way
      04H - 4-way
      06H - 8-way
      08H - 16-way
      0FH - Fully associative

80000007H
  EAX: Reserved = 0
  EBX: Reserved = 0
  ECX: Reserved = 0
  EDX: Bits 07-00: Reserved = 0
       Bit 08: Invariant TSC available if 1
       Bits 31-09: Reserved = 0

80000008H
  EAX: Linear/Physical Address size
       Bits 07-00: #Physical Address Bits*
       Bits 15-8: #Linear Address Bits
       Bits 31-16: Reserved = 0
  EBX: Reserved = 0
  ECX: Reserved = 0
  EDX: Reserved = 0
  NOTES:
  * If CPUID.80000008H:EAX[7:0] is supported, the maximum physical address number supported should come from this field.
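A few of the fields in Table 3-17 can be decoded as in the following sketch. The function names are hypothetical, and register values are passed as plain integers rather than read with the CPUID instruction, which keeps the example portable. The cache size product for leaf 04H, (W + 1) * (P + 1) * (L + 1) * (S + 1), follows from the "add one to the return value" note together with the usual definition of cache size as ways times partitions times line size times sets; treat it as an assumption of this sketch rather than text from the table.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Leaf 0H: the vendor string is stored in EBX, EDX, ECX (in that order),
   four ASCII characters per register, lowest byte first. */
static void cpuid_vendor(uint32_t ebx, uint32_t ecx, uint32_t edx, char out[13])
{
    const uint32_t regs[3] = { ebx, edx, ecx };
    for (int r = 0; r < 3; r++)
        for (int i = 0; i < 4; i++)
            out[4 * r + i] = (char)((regs[r] >> (8 * i)) & 0xFFu);
    out[12] = '\0';
}

/* Leaf 04H: each of L, P, W (EBX) and S (ECX) is reported minus one. */
static uint64_t cache_size_bytes(uint32_t ebx, uint32_t ecx)
{
    uint64_t line  = (ebx & 0xFFFu) + 1u;          /* EBX[11:0]  */
    uint64_t parts = ((ebx >> 12) & 0x3FFu) + 1u;  /* EBX[21:12] */
    uint64_t ways  = ((ebx >> 22) & 0x3FFu) + 1u;  /* EBX[31:22] */
    uint64_t sets  = (uint64_t)ecx + 1u;           /* ECX[31:0]  */
    return ways * parts * line * sets;
}

/* Nearest power of 2 that is not smaller than x, as used by the
   APIC ID notes for leaves 01H and 04H. Requires 1 <= x <= 2^31. */
static uint32_t ceil_pow2(uint32_t x)
{
    assert(x >= 1u && x <= 0x80000000u);
    uint32_t p = 1u;
    while (p < x)
        p <<= 1;
    return p;
}

/* Leaf 80000008H: EAX[7:0] physical and EAX[15:8] linear address bits. */
static unsigned phys_addr_bits(uint32_t eax) { return eax & 0xFFu; }
static unsigned lin_addr_bits(uint32_t eax)  { return (eax >> 8) & 0xFFu; }
```

For instance, a leaf 04H sub-leaf reporting 8 ways, 1 partition, 64-byte lines, and 64 sets describes a 32-KByte cache.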

...


CRC32 - Accumulate CRC32 Value


Opcode/Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F2 0F 38 F0 /r CRC32 r32, r/m8 | RM | Valid | Valid | Accumulate CRC32 on r/m8.
F2 REX 0F 38 F0 /r CRC32 r32, r/m8* | RM | Valid | N.E. | Accumulate CRC32 on r/m8.
F2 0F 38 F1 /r CRC32 r32, r/m16 | RM | Valid | Valid | Accumulate CRC32 on r/m16.
F2 0F 38 F1 /r CRC32 r32, r/m32 | RM | Valid | Valid | Accumulate CRC32 on r/m32.
F2 REX.W 0F 38 F0 /r CRC32 r64, r/m8 | RM | Valid | N.E. | Accumulate CRC32 on r/m8.
F2 REX.W 0F 38 F1 /r CRC32 r64, r/m64 | RM | Valid | N.E. | Accumulate CRC32 on r/m64.

NOTES:
* In 64-bit mode, r/m8 cannot be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
RM | ModRM:reg (r, w) | ModRM:r/m (r) | NA | NA

Description
Starting with an initial value in the first operand (destination operand), accumulates a CRC32 (polynomial 0x11EDC6F41) value for the second operand (source operand) and stores the result in the destination operand. The source operand can be a register or a memory location. The destination operand must be an r32 or r64 register. If the destination is an r64 register, then the 32-bit result is stored in the least significant double word and 00000000H is stored in the most significant double word of the r64 register.

The initial value supplied in the destination operand is a double word integer stored in the r32 register or the least significant double word of the r64 register. To incrementally accumulate a CRC32 value, software retains the result of the previous CRC32 operation in the destination operand, then executes the CRC32 instruction again with new input data in the source operand.

Data contained in the source operand is processed in reflected bit order. This means that the most significant bit of the source operand is treated as the least significant bit of the quotient, and so on, for all the bits of the source operand. Likewise, the result of the CRC operation is stored in the destination operand in reflected bit order. This means that the most significant bit of the resulting CRC (bit 31) is stored in the least significant bit of the destination operand (bit 0), and so on, for all the bits of the CRC.

Operation

Notes:
BIT_REFLECT64: DST[63-0] = SRC[0-63]
BIT_REFLECT32: DST[31-0] = SRC[0-31]
BIT_REFLECT16: DST[15-0] = SRC[0-15]
BIT_REFLECT8: DST[7-0] = SRC[0-7]
MOD2: Remainder from Polynomial division modulus 2

CRC32 instruction for 64-bit source operand and 64-bit destination operand:
TEMP1[63-0] ← BIT_REFLECT64 (SRC[63-0])
TEMP2[31-0] ← BIT_REFLECT32 (DEST[31-0])
TEMP3[95-0] ← TEMP1[63-0] << 32
TEMP4[95-0] ← TEMP2[31-0] << 64
TEMP5[95-0] ← TEMP3[95-0] XOR TEMP4[95-0]
TEMP6[31-0] ← TEMP5[95-0] MOD2 11EDC6F41H
DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])
DEST[63-32] ← 00000000H

CRC32 instruction for 32-bit source operand and 32-bit destination operand:
TEMP1[31-0] ← BIT_REFLECT32 (SRC[31-0])
TEMP2[31-0] ← BIT_REFLECT32 (DEST[31-0])
TEMP3[63-0] ← TEMP1[31-0] << 32
TEMP4[63-0] ← TEMP2[31-0] << 32
TEMP5[63-0] ← TEMP3[63-0] XOR TEMP4[63-0]
TEMP6[31-0] ← TEMP5[63-0] MOD2 11EDC6F41H
DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])

CRC32 instruction for 16-bit source operand and 32-bit destination operand:
TEMP1[15-0] ← BIT_REFLECT16 (SRC[15-0])
TEMP2[31-0] ← BIT_REFLECT32 (DEST[31-0])
TEMP3[47-0] ← TEMP1[15-0] << 32
TEMP4[47-0] ← TEMP2[31-0] << 16
TEMP5[47-0] ← TEMP3[47-0] XOR TEMP4[47-0]
TEMP6[31-0] ← TEMP5[47-0] MOD2 11EDC6F41H
DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])

CRC32 instruction for 8-bit source operand and 64-bit destination operand:
TEMP1[7-0] ← BIT_REFLECT8 (SRC[7-0])
TEMP2[31-0] ← BIT_REFLECT32 (DEST[31-0])
TEMP3[39-0] ← TEMP1[7-0] << 32
TEMP4[39-0] ← TEMP2[31-0] << 8
TEMP5[39-0] ← TEMP3[39-0] XOR TEMP4[39-0]
TEMP6[31-0] ← TEMP5[39-0] MOD2 11EDC6F41H
DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])
DEST[63-32] ← 00000000H

CRC32 instruction for 8-bit source operand and 32-bit destination operand:
TEMP1[7-0] ← BIT_REFLECT8 (SRC[7-0])
TEMP2[31-0] ← BIT_REFLECT32 (DEST[31-0])
TEMP3[39-0] ← TEMP1[7-0] << 32
TEMP4[39-0] ← TEMP2[31-0] << 8
TEMP5[39-0] ← TEMP3[39-0] XOR TEMP4[39-0]
TEMP6[31-0] ← TEMP5[39-0] MOD2 11EDC6F41H
DEST[31-0] ← BIT_REFLECT (TEMP6[31-0])
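The 8-bit form of the operation can be modeled in software with the familiar shift-and-XOR loop on the bit-reflected polynomial. This is a sketch, not the processor's implementation: the names `crc32c_u8` and `crc32c_accumulate` are hypothetical, and 82F63B78H is the bit-reversed form of the low 32 bits of 11EDC6F41H, which lets the loop work directly on reflected values without explicit BIT_REFLECT steps.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Low 32 bits of polynomial 11EDC6F41H, in reflected bit order. */
#define CRC32C_POLY_REFLECTED 0x82F63B78u

/* Models the effect of CRC32 r32, r/m8: folds one source byte into
   the running value held in crc and returns the new value. */
static uint32_t crc32c_u8(uint32_t crc, uint8_t data)
{
    crc ^= data;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc >> 1) ^ ((crc & 1u) ? CRC32C_POLY_REFLECTED : 0u);
    return crc;
}

/* Incremental accumulation: the previous result is fed back as the
   initial value for the next byte, as the Description explains. */
static uint32_t crc32c_accumulate(uint32_t crc, const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        crc = crc32c_u8(crc, buf[i]);
    return crc;
}
```

The instruction itself applies no initial or final inversion. Software that wants the conventional CRC-32C checksum usually starts from FFFFFFFFH and inverts the final value; that convention is an assumption of the caller, not part of the operation above.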


Flags Affected
None

Intel C/C++ Compiler Intrinsic Equivalent


unsigned int _mm_crc32_u8( unsigned int crc, unsigned char data )
unsigned int _mm_crc32_u16( unsigned int crc, unsigned short data )
unsigned int _mm_crc32_u32( unsigned int crc, unsigned int data )
unsigned __int64 _mm_crc32_u64( unsigned __int64 crc, unsigned __int64 data )

SIMD Floating Point Exceptions


None

Protected Mode Exceptions


#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)   For a page fault.
#AC(0)            If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD               If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.
                  If LOCK prefix is used.

Real-Address Mode Exceptions


#GP(0)            If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#UD               If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.
                  If LOCK prefix is used.

Virtual 8086 Mode Exceptions


#GP(0)            If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)   For a page fault.
#AC(0)            If alignment checking is enabled and an unaligned memory reference is made.
#UD               If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.
                  If LOCK prefix is used.

Compatibility Mode Exceptions


Same exceptions as in Protected Mode.

64-Bit Mode Exceptions


#GP(0)            If the memory address is in a non-canonical form.
#SS(0)            If a memory address referencing the SS segment is in a non-canonical form.
#PF(fault-code)   For a page fault.
#AC(0)            If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD               If CPUID.01H:ECX.SSE4_2 [Bit 20] = 0.
                  If LOCK prefix is used.
...

DPPD Dot Product of Packed Double Precision Floating-Point Values


Opcode/Instruction: 66 0F 3A 41 /r ib DPPD xmm1, xmm2/m128, imm8
Op/En: RMI
64/32-bit Mode: V/V
CPUID Feature Flag: SSE4_1
Description: Selectively multiply packed DP floating-point values from xmm1 with packed DP floating-point values from xmm2, add and selectively store the packed DP floating-point values to xmm1.

Opcode/Instruction: VEX.NDS.128.66.0F3A.WIG 41 /r ib VDPPD xmm1, xmm2, xmm3/m128, imm8
Op/En: RVMI
64/32-bit Mode: V/V
CPUID Feature Flag: AVX
Description: Selectively multiply packed DP floating-point values from xmm2 with packed DP floating-point values from xmm3, add and selectively store the packed DP floating-point values to xmm1.

Instruction Operand Encoding


Op/En   Operand 1          Operand 2       Operand 3       Operand 4
RMI     ModRM:reg (r, w)   ModRM:r/m (r)   imm8            NA
RVMI    ModRM:reg (w)      VEX.vvvv (r)    ModRM:r/m (r)   imm8

Description
Conditionally multiplies the packed double-precision floating-point values in the destination operand (first operand) with the packed double-precision floating-point values in the source (second operand) depending on a mask extracted from bits [5:4] of the immediate operand (third operand). If a condition mask bit is zero, the corresponding multiplication is replaced by a value of 0.0.

The two resulting double-precision values are summed into an intermediate result. The intermediate result is conditionally broadcast to the destination using a broadcast mask specified by bits [1:0] of the immediate byte. If a broadcast mask bit is "1", the intermediate result is copied to the corresponding qword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.

DPPD follows the NaN forwarding rules stated in the Software Developer's Manual, vol. 1, table 4.7. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.

128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.

VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or a 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.

An attempt to execute VDPPD encoded with VEX.L = 1 causes a #UD exception.
...
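The interaction of the two immediate fields can be illustrated with a small software model. This is a sketch only: the function name dppd is hypothetical, operands are modeled as two-element Python lists (element 0 is the low qword), and NaN propagation and SIMD exceptions are not modeled.

```python
def dppd(dest, src, imm8):
    """Model DPPD on two-element lists of Python floats (doubles)."""
    # imm8[5:4] is the condition mask: a clear bit replaces the product by 0.0.
    products = [dest[i] * src[i] if (imm8 >> (4 + i)) & 1 else 0.0
                for i in range(2)]
    total = products[0] + products[1]
    # imm8[1:0] is the broadcast mask: a clear bit zeroes that destination element.
    return [total if (imm8 >> i) & 1 else 0.0 for i in range(2)]
```

For example, with imm8 = 31H both products are summed and the result is written only to the low element; with imm8 = 13H only the low product is used and it is written to both elements.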


DPPS Dot Product of Packed Single Precision Floating-Point Values


Opcode/Instruction: 66 0F 3A 40 /r ib DPPS xmm1, xmm2/m128, imm8
Op/En: RMI
64/32-bit Mode: V/V
CPUID Feature Flag: SSE4_1
Description: Selectively multiply packed SP floating-point values from xmm1 with packed SP floating-point values from xmm2, add and selectively store the packed SP floating-point values or zero values to xmm1.

Opcode/Instruction: VEX.NDS.128.66.0F3A.WIG 40 /r ib VDPPS xmm1, xmm2, xmm3/m128, imm8
Op/En: RVMI
64/32-bit Mode: V/V
CPUID Feature Flag: AVX
Description: Multiply packed SP floating point values from xmm1 with packed SP floating point values from xmm2/mem selectively add and store to xmm1.

Opcode/Instruction: VEX.NDS.256.66.0F3A.WIG 40 /r ib VDPPS ymm1, ymm2, ymm3/m256, imm8
Op/En: RVMI
64/32-bit Mode: V/V
CPUID Feature Flag: AVX
Description: Multiply packed single-precision floating-point values from ymm2 with packed SP floating point values from ymm3/mem, selectively add pairs of elements and store to ymm1.

Instruction Operand Encoding


Op/En   Operand 1          Operand 2       Operand 3       Operand 4
RMI     ModRM:reg (r, w)   ModRM:r/m (r)   imm8            NA
RVMI    ModRM:reg (w)      VEX.vvvv (r)    ModRM:r/m (r)   imm8

Description
Conditionally multiplies the packed single-precision floating-point values in the destination operand (first operand) with the packed single-precision floating-point values in the source (second operand) depending on a mask extracted from the high 4 bits of the immediate byte (third operand). If a condition mask bit in Imm8[7:4] is zero, the corresponding multiplication is replaced by a value of 0.0.

The four resulting single-precision values are summed into an intermediate result. The intermediate result is conditionally broadcast to the destination using a broadcast mask specified by bits [3:0] of the immediate byte. If a broadcast mask bit is "1", the intermediate result is copied to the corresponding dword element in the destination operand. If a broadcast mask bit is zero, the corresponding element in the destination is set to zero.

DPPS follows the NaN forwarding rules stated in the Software Developer's Manual, vol. 1, table 4.7. These rules do not cover horizontal prioritization of NaNs. Horizontal propagation of NaNs to the destination and the positioning of those NaNs in the destination is implementation dependent. NaNs on the input sources or computationally generated NaNs will have at least one NaN propagated to the destination.

128-bit Legacy SSE version: The second source can be an XMM register or a 128-bit memory location. The destination is not distinct from the first source XMM register and the upper bits (VLMAX-1:128) of the corresponding YMM register destination are unmodified.

VEX.128 encoded version: The first source operand is an XMM register. The second source operand is an XMM register or a 128-bit memory location. The destination operand is an XMM register. The upper bits (VLMAX-1:128) of the corresponding YMM register destination are zeroed.

VEX.256 encoded version: The first source operand is a YMM register. The second source operand can be a YMM register or a 256-bit memory location. The destination operand is a YMM register.


Operation
DP_primitive (SRC1, SRC2)
IF (imm8[4] = 1)
    THEN Temp1[31:0] ← DEST[31:0] * SRC[31:0];
    ELSE Temp1[31:0] ← +0.0; FI;
IF (imm8[5] = 1)
    THEN Temp1[63:32] ← DEST[63:32] * SRC[63:32];
    ELSE Temp1[63:32] ← +0.0; FI;
IF (imm8[6] = 1)
    THEN Temp1[95:64] ← DEST[95:64] * SRC[95:64];
    ELSE Temp1[95:64] ← +0.0; FI;
IF (imm8[7] = 1)
    THEN Temp1[127:96] ← DEST[127:96] * SRC[127:96];
    ELSE Temp1[127:96] ← +0.0; FI;
Temp2[31:0] ← Temp1[31:0] + Temp1[63:32];
Temp3[31:0] ← Temp1[95:64] + Temp1[127:96];
Temp4[31:0] ← Temp2[31:0] + Temp3[31:0];
IF (imm8[0] = 1)
    THEN DEST[31:0] ← Temp4[31:0];
    ELSE DEST[31:0] ← +0.0; FI;
IF (imm8[1] = 1)
    THEN DEST[63:32] ← Temp4[31:0];
    ELSE DEST[63:32] ← +0.0; FI;
IF (imm8[2] = 1)
    THEN DEST[95:64] ← Temp4[31:0];
    ELSE DEST[95:64] ← +0.0; FI;
IF (imm8[3] = 1)
    THEN DEST[127:96] ← Temp4[31:0];
    ELSE DEST[127:96] ← +0.0; FI;

DPPS (128-bit Legacy SSE version)
DEST[127:0] ← DP_Primitive(SRC1[127:0], SRC2[127:0]);
DEST[VLMAX-1:128] (Unmodified)

VDPPS (VEX.128 encoded version)
DEST[127:0] ← DP_Primitive(SRC1[127:0], SRC2[127:0]);
DEST[VLMAX-1:128] ← 0

VDPPS (VEX.256 encoded version)
DEST[127:0] ← DP_Primitive(SRC1[127:0], SRC2[127:0]);
DEST[255:128] ← DP_Primitive(SRC1[255:128], SRC2[255:128]);
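The summation order in DP_primitive (two pairwise sums, then their sum) and the per-lane behavior of the VEX.256 form can be sketched as follows. The names f32, dp_primitive and vdpps_256 are hypothetical. The sketch assumes round-to-nearest, inputs that are representable in single precision, and no overflow; NaN handling and SIMD exceptions are not modeled.

```python
import struct


def f32(x):
    """Round a Python float (double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dp_primitive(src1, src2, imm8):
    """Model DP_primitive on four-element lists (element 0 is bits 31:0)."""
    # imm8[7:4] selects which products take part in the sum.
    temp1 = [f32(src1[i] * src2[i]) if (imm8 >> (4 + i)) & 1 else 0.0
             for i in range(4)]
    temp2 = f32(temp1[0] + temp1[1])   # low pair
    temp3 = f32(temp1[2] + temp1[3])   # high pair
    temp4 = f32(temp2 + temp3)
    # imm8[3:0] selects which destination elements receive the sum.
    return [temp4 if (imm8 >> i) & 1 else 0.0 for i in range(4)]


def vdpps_256(src1, src2, imm8):
    """Model the VEX.256 form: each 128-bit lane is processed independently."""
    return (dp_primitive(src1[:4], src2[:4], imm8)
            + dp_primitive(src1[4:], src2[4:], imm8))
```

Note that the same imm8 masks apply to both 128-bit lanes, and no sum crosses the lane boundary.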

Flags Affected
None

Intel C/C++ Compiler Intrinsic Equivalent


(V)DPPS: __m128 _mm_dp_ps ( __m128 a, __m128 b, const int mask);
VDPPS: __m256 _mm256_dp_ps ( __m256 a, __m256 b, const int mask);

SIMD Floating-Point Exceptions


Overflow, Underflow, Invalid, Precision, Denormal.
Exceptions are determined separately for each add and multiply operation, in the order of their execution. Unmasked exceptions will leave the destination operands unchanged.

Other Exceptions
See Exceptions Type 2. ...

FSIN Sine
Opcode   Instruction   64-Bit Mode   Compat/Leg Mode   Description
D9 FE    FSIN          Valid         Valid             Replace ST(0) with its sine.

Description
Computes the sine of the source operand in register ST(0) and stores the result in ST(0). The source operand must be given in radians and must be within the range -2^63 to +2^63. The following table shows the results obtained when taking the sine of various classes of numbers, assuming that underflow does not occur.

Table 3-45 FSIN Results


SRC (ST(0))   DEST (ST(0))
-∞            *
-F            -1 to +1
-0            -0
+0            +0
+F            -1 to +1
+∞            *
NaN           NaN

NOTES:
F Means finite floating-point value.
* Indicates floating-point invalid-arithmetic-operand (#IA) exception.

If the source operand is outside the acceptable range, the C2 flag in the FPU status word is set, and the value in register ST(0) remains unchanged. The instruction does not raise an exception when the source operand is out of range. It is up to the program to check the C2 flag for out-of-range conditions. Source values outside the range -2^63 to +2^63 can be reduced to the range of the instruction by subtracting an appropriate integer multiple of 2π or by using the FPREM instruction with a divisor of 2π. See the section titled "Pi" in Chapter 8 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1, for a discussion of the proper value to use for π in performing such reductions.

This instruction's operation is the same in non-64-bit modes and 64-bit mode.


Operation
IF -2^63 < ST(0) < 2^63
    THEN
        C2 ← 0;
        ST(0) ← sin(ST(0));
    ELSE (* Source operand out of range *)
        C2 ← 1;
FI;
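The range check and the C2-based fallback can be sketched in software as follows. This is an illustration under stated assumptions, not a model of x87 precision: the names fsin and fsin_with_reduction are hypothetical, inputs are assumed to be finite, math.sin stands in for the hardware sine, and math.fmod with a double-precision 2π stands in for an FPREM-based reduction. Reducing with a rounded value of 2π loses accuracy for large arguments.

```python
import math

LIMIT = 2.0 ** 63


def fsin(st0):
    """Return (new ST(0), C2) for a finite input, following the Operation above."""
    if -LIMIT < st0 < LIMIT:
        return math.sin(st0), 0
    return st0, 1   # out of range: ST(0) is left unchanged and C2 is set


def fsin_with_reduction(x):
    """Check C2 after fsin and, if set, reduce the argument and retry."""
    value, c2 = fsin(x)
    if c2:
        # Assumed reduction step: remainder of x divided by (rounded) 2*pi.
        value, c2 = fsin(math.fmod(x, 2.0 * math.pi))
    return value
```

The point of the sketch is the control flow: no exception signals an out-of-range operand, so the caller has to test C2 itself before using ST(0).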

FPU Flags Affected


C1       Set to 0 if stack underflow occurred.
         Set if result was rounded up; cleared otherwise.
C2       Set to 1 if the source operand is outside the range (-2^63 < source operand < +2^63); otherwise, set to 0.
C0, C3   Undefined.

Floating-Point Exceptions
#IS   Stack underflow occurred.
#IA   Source operand is an SNaN value, ∞, or unsupported format.
#D    Source operand is a denormal value.
#P    Value cannot be represented exactly in destination format.

Protected Mode Exceptions


#NM   CR0.EM[bit 2] or CR0.TS[bit 3] = 1.
#MF   If there is a pending x87 FPU exception.
#UD   If the LOCK prefix is used.

Real-Address Mode Exceptions


Same exceptions as in protected mode.

Virtual-8086 Mode Exceptions


Same exceptions as in protected mode.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


Same exceptions as in protected mode. ...


INVD Invalidate Internal Caches


Opcode   Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
0F 08    INVD          NP      Valid         Valid             Flush internal caches; initiate flushing of external caches.

NOTES: * See the IA-32 Architecture Compatibility section below.

Instruction Operand Encoding


Op/En   Operand 1   Operand 2   Operand 3   Operand 4
NP      NA          NA          NA          NA

Description
Invalidates (flushes) the processor's internal caches and issues a special-function bus cycle that directs external caches to also flush themselves. Data held in internal caches is not written back to main memory.

After executing this instruction, the processor does not wait for the external caches to complete their flushing operation before proceeding with instruction execution. It is the responsibility of hardware to respond to the cache flush signal.

The INVD instruction is a privileged instruction. When the processor is running in protected mode, the CPL of a program or procedure must be 0 to execute this instruction.

The INVD instruction may be used when the cache is used as temporary memory and the cache contents need to be invalidated rather than written back to memory. When the cache is used as temporary memory, no external device should be actively writing data to main memory.

Use this instruction with care. Data cached internally and not written back to main memory will be lost. Note that any data from an external device to main memory (for example, via a PCIWrite) can be temporarily stored in the caches; these data can be lost when an INVD instruction is executed. Unless there is a specific requirement or benefit to flushing caches without writing back modified cache lines (for example, temporary memory, testing, or fault recovery where cache coherency with main memory is not a concern), software should instead use the WBINVD instruction.

This instruction's operation is the same in non-64-bit modes and 64-bit mode.

IA-32 Architecture Compatibility


The INVD instruction is implementation dependent; it may be implemented differently on different families of Intel 64 or IA-32 processors. This instruction is not supported on IA-32 processors earlier than the Intel486 processor.

Operation
Flush(InternalCaches);
SignalFlush(ExternalCaches);
Continue (* Continue execution *)
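The difference between discarding and writing back modified data can be illustrated with a toy write-back cache. This is a conceptual sketch only: the class ToyWriteBackCache and its methods are hypothetical, memory is a plain dictionary, and nothing here reflects how any processor implements its caches.

```python
class ToyWriteBackCache:
    """Toy model: writes stay in the cache until they are written back."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value ("main memory")
        self.lines = {}        # dict: address -> cached value

    def write(self, addr, value):
        self.lines[addr] = value           # main memory is not updated here

    def read(self, addr):
        if addr not in self.lines:         # miss: fill from main memory
            self.lines[addr] = self.memory.get(addr, 0)
        return self.lines[addr]

    def invd(self):
        self.lines.clear()                 # cached data is discarded

    def wbinvd(self):
        self.memory.update(self.lines)     # copy cached data to memory first
        self.lines.clear()


memory = {0x1000: 1}
cache = ToyWriteBackCache(memory)
cache.write(0x1000, 2)
cache.invd()
lost = cache.read(0x1000)     # the write of 2 never reached memory
cache.write(0x1000, 3)
cache.wbinvd()
kept = memory[0x1000]         # the write of 3 was written back
```

In the model, the value written before invd is lost, while the value written before wbinvd reaches memory, which is why WBINVD is the safer default.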

Flags Affected
None.


Protected Mode Exceptions


#GP(0)   If the current privilege level is not 0.
#UD      If the LOCK prefix is used.

Real-Address Mode Exceptions


#UD If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0) The INVD instruction cannot be executed in virtual-8086 mode.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


Same exceptions as in protected mode. ...

INVLPG Invalidate TLB Entry


Opcode    Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
0F 01/7   INVLPG m      M       Valid         Valid             Invalidate TLB Entry for page that contains m.

NOTES: * See the IA-32 Architecture Compatibility section below.

Instruction Operand Encoding


Op/En   Operand 1       Operand 2   Operand 3   Operand 4
M       ModRM:r/m (r)   NA          NA          NA

Description
Invalidates (flushes) the translation lookaside buffer (TLB) entry specified with the source operand. The source operand is a memory address. The processor determines the page that contains that address and flushes the TLB entry for that page.

The INVLPG instruction is a privileged instruction. When the processor is running in protected mode, the CPL must be 0 to execute this instruction.

The INVLPG instruction normally flushes the TLB entry only for the specified page; however, in some cases, it may flush more entries, even the entire TLB. The instruction is guaranteed to invalidate only TLB entries associated with the current PCID. (If PCIDs are disabled, that is, CR4.PCIDE = 0, the current PCID is 000H.) The instruction also invalidates any global TLB entries for the specified page, regardless of PCID.

For more details on operations that flush the TLB, see "MOV Move to/from Control Registers" in Chapter 4 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B, and Section 4.10.4.1, "Operations that Invalidate TLBs and Paging-Structure Caches," of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.

This instruction's operation is the same in all non-64-bit modes. It also operates the same in 64-bit mode, except if the memory address is in non-canonical form. In this case, INVLPG is the same as a NOP.
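The page selection and the non-canonical case can be sketched as follows. The names is_canonical and invlpg_target are hypothetical, and two assumptions are made that the text above does not state: 4-KByte pages and 48-bit linear addresses (so a canonical address has bits 63:47 all equal).

```python
PAGE_SIZE = 4096           # assumption: 4-KByte pages
LINEAR_ADDRESS_BITS = 48   # assumption: 48-bit linear addresses
MASK64 = 0xFFFFFFFFFFFFFFFF


def is_canonical(addr, bits=LINEAR_ADDRESS_BITS):
    """True if bits 63 down to (bits - 1) of addr are all 0 or all 1."""
    upper = (addr & MASK64) >> (bits - 1)
    return upper == 0 or upper == (1 << (64 - bits + 1)) - 1


def invlpg_target(addr, in_64bit_mode=True):
    """Return the base address of the page whose TLB entry would be flushed,
    or None when the instruction behaves as a NOP."""
    addr &= MASK64
    if in_64bit_mode and not is_canonical(addr):
        return None
    return addr & ~(PAGE_SIZE - 1)
```

The model returns a single page base; as noted above, an implementation may flush more entries than the one requested.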


IA-32 Architecture Compatibility


The INVLPG instruction is implementation dependent, and its function may be implemented differently on different families of Intel 64 or IA-32 processors. This instruction is not supported on IA-32 processors earlier than the Intel486 processor.

Operation
Flush(RelevantTLBEntries);
Continue; (* Continue execution *)

Flags Affected
None.

Protected Mode Exceptions


#GP(0)   If the current privilege level is not 0.
#UD      Operand is a register.
         If the LOCK prefix is used.

Real-Address Mode Exceptions


#UD      Operand is a register.
         If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0)   The INVLPG instruction cannot be executed in virtual-8086 mode.

64-Bit Mode Exceptions


#GP(0)   If the current privilege level is not 0.
#UD      Operand is a register.
         If the LOCK prefix is used.
...

IRET/IRETD Interrupt Return
Opcode       Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
CF           IRET          NP      Valid         Valid             Interrupt return (16-bit operand size).
CF           IRETD         NP      Valid         Valid             Interrupt return (32-bit operand size).
REX.W + CF   IRETQ         NP      Valid         N.E.              Interrupt return (64-bit operand size).

Instruction Operand Encoding


Op/En   Operand 1   Operand 2   Operand 3   Operand 4
NP      NA          NA          NA          NA


Description
Returns program control from an exception or interrupt handler to a program or procedure that was interrupted by an exception, an external interrupt, or a software-generated interrupt. These instructions are also used to perform a return from a nested task. (A nested task is created when a CALL instruction is used to initiate a task switch or when an interrupt or exception causes a task switch to an interrupt or exception handler.) See the section titled "Task Linking" in Chapter 7 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.

IRET and IRETD are mnemonics for the same opcode. The IRETD mnemonic (interrupt return double) is intended for use when returning from an interrupt when using the 32-bit operand size; however, most assemblers use the IRET mnemonic interchangeably for both operand sizes.

In Real-Address Mode, the IRET instruction performs a far return to the interrupted program or procedure. During this operation, the processor pops the return instruction pointer, return code segment selector, and EFLAGS image from the stack to the EIP, CS, and EFLAGS registers, respectively, and then resumes execution of the interrupted program or procedure.

In Protected Mode, the action of the IRET instruction depends on the settings of the NT (nested task) and VM flags in the EFLAGS register and the VM flag in the EFLAGS image stored on the current stack. Depending on the setting of these flags, the processor performs the following types of interrupt returns:
• Return from virtual-8086 mode.
• Return to virtual-8086 mode.
• Intra-privilege level return.
• Inter-privilege level return.
• Return from nested task (task switch).

If the NT flag (EFLAGS register) is cleared, the IRET instruction performs a far return from the interrupt procedure, without a task switch. The code segment being returned to must be equally or less privileged than the interrupt handler routine (as indicated by the RPL field of the code segment selector popped from the stack).

As with a real-address mode interrupt return, the IRET instruction pops the return instruction pointer, return code segment selector, and EFLAGS image from the stack to the EIP, CS, and EFLAGS registers, respectively, and then resumes execution of the interrupted program or procedure. If the return is to another privilege level, the IRET instruction also pops the stack pointer and SS from the stack, before resuming program execution. If the return is to virtual-8086 mode, the processor also pops the data segment registers from the stack.

If the NT flag is set, the IRET instruction performs a task switch (return) from a nested task (a task called with a CALL instruction, an interrupt, or an exception) back to the calling or interrupted task. The updated state of the task executing the IRET instruction is saved in its TSS. If the task is re-entered later, the code that follows the IRET instruction is executed.

If the NT flag is set and the processor is in IA-32e mode, the IRET instruction causes a general protection exception.

In 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.W prefix promotes operation to 64 bits (IRETQ). See the summary chart at the beginning of this section for encoding data and limits.

See "Changes to Instruction Behavior in VMX Non-Root Operation" in Chapter 25 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C, for more information about the behavior of this instruction in VMX non-root operation.
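The way the flags select the type of return can be summarized as a small decision function. This is a simplified sketch: the name classify_iret and its parameters are hypothetical, image_vm stands for the VM flag in the EFLAGS image on the stack, rpl is the RPL of the return code segment selector, and all other checks (stack limits, descriptor checks, the IOPL test in virtual-8086 mode) are omitted.

```python
def classify_iret(pe, lma, vm, nt, image_vm, cpl, rpl):
    """Return a label for the kind of IRET performed (simplified)."""
    if not pe:
        return "real-address mode return"
    if lma:                                   # IA-32e mode
        if nt:
            return "#GP(0)"                   # nested-task return not allowed
    else:                                     # protected mode
        if vm:
            return "return from virtual-8086 mode"
        if nt:
            return "return from nested task"
        if image_vm and cpl == 0:
            return "return to virtual-8086 mode"
    # Common privilege-level decision for the remaining cases.
    if rpl < cpl:
        return "#GP(selector)"                # cannot return to more privilege
    if rpl > cpl:
        return "inter-privilege level return"
    return "intra-privilege level return"
```

For example, an interrupt handler running at CPL 0 that returns to ring-3 code (RPL 3) with NT and VM clear performs an inter-privilege level return, which is the case in which the stack pointer and SS are also popped.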

Operation
IF PE = 0
    THEN GOTO REAL-ADDRESS-MODE;
    ELSE
        IF (IA32_EFER.LMA = 0)
            THEN (* Protected mode *)
                GOTO PROTECTED-MODE;
            ELSE (* IA-32e mode *)
                GOTO IA-32e-MODE;
        FI;
FI;

REAL-ADDRESS-MODE:
    IF OperandSize = 32
        THEN
            IF top 12 bytes of stack not within stack limits
                THEN #SS; FI;
            tempEIP ← 4 bytes at end of stack
            IF tempEIP[31:16] is not zero THEN #GP(0); FI;
            EIP ← Pop();
            CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
            tempEFLAGS ← Pop();
            EFLAGS ← (tempEFLAGS AND 257FD5H) OR (EFLAGS AND 1A0000H);
        ELSE (* OperandSize = 16 *)
            IF top 6 bytes of stack are not within stack limits
                THEN #SS; FI;
            EIP ← Pop(); (* 16-bit pop; clear upper 16 bits *)
            CS ← Pop(); (* 16-bit pop *)
            EFLAGS[15:0] ← Pop();
    FI;
    END;

PROTECTED-MODE:
    IF VM = 1 (* Virtual-8086 mode: PE = 1, VM = 1 *)
        THEN GOTO RETURN-FROM-VIRTUAL-8086-MODE; (* PE = 1, VM = 1 *)
    FI;
    IF NT = 1
        THEN GOTO TASK-RETURN; (* PE = 1, VM = 0, NT = 1 *)
    FI;
    IF OperandSize = 32
        THEN
            IF top 12 bytes of stack not within stack limits
                THEN #SS(0); FI;
            tempEIP ← Pop();
            tempCS ← Pop();
            tempEFLAGS ← Pop();
        ELSE (* OperandSize = 16 *)
            IF top 6 bytes of stack are not within stack limits
                THEN #SS(0); FI;
            tempEIP ← Pop();
            tempCS ← Pop();
            tempEFLAGS ← Pop();
            tempEIP ← tempEIP AND FFFFH;
            tempEFLAGS ← tempEFLAGS AND FFFFH;


    FI;
    IF tempEFLAGS(VM) = 1 and CPL = 0
        THEN GOTO RETURN-TO-VIRTUAL-8086-MODE;
        ELSE GOTO PROTECTED-MODE-RETURN;
    FI;

IA-32e-MODE:
    IF NT = 1
        THEN #GP(0);
    ELSE IF OperandSize = 32
        THEN
            IF top 12 bytes of stack not within stack limits
                THEN #SS(0); FI;
            tempEIP ← Pop();
            tempCS ← Pop();
            tempEFLAGS ← Pop();
    ELSE IF OperandSize = 16
        THEN
            IF top 6 bytes of stack are not within stack limits
                THEN #SS(0); FI;
            tempEIP ← Pop();
            tempCS ← Pop();
            tempEFLAGS ← Pop();
            tempEIP ← tempEIP AND FFFFH;
            tempEFLAGS ← tempEFLAGS AND FFFFH;
        FI;
    ELSE (* OperandSize = 64 *)
        THEN
            tempRIP ← Pop();
            tempCS ← Pop();
            tempEFLAGS ← Pop();
            tempRSP ← Pop();
            tempSS ← Pop();
    FI;
    GOTO IA-32e-MODE-RETURN;

RETURN-FROM-VIRTUAL-8086-MODE:
(* Processor is in virtual-8086 mode when IRET is executed and stays in virtual-8086 mode *)
    IF IOPL = 3 (* Virtual mode: PE = 1, VM = 1, IOPL = 3 *)
        THEN IF OperandSize = 32
            THEN
                IF top 12 bytes of stack not within stack limits
                    THEN #SS(0); FI;
                IF instruction pointer not within code segment limits
                    THEN #GP(0); FI;
                EIP ← Pop();
                CS ← Pop(); (* 32-bit pop, high-order 16 bits discarded *)
                EFLAGS ← Pop();
                (* VM, IOPL, VIP and VIF EFLAG bits not modified by pop *)
            ELSE (* OperandSize = 16 *)
                IF top 6 bytes of stack are not within stack limits
                    THEN #SS(0); FI;
                IF instruction pointer not within code segment limits
                    THEN #GP(0); FI;
                EIP ← Pop();
                EIP ← EIP AND 0000FFFFH;
                CS ← Pop(); (* 16-bit pop *)
                EFLAGS[15:0] ← Pop(); (* IOPL in EFLAGS not modified by pop *)
            FI;
        ELSE
            #GP(0); (* Trap to virtual-8086 monitor: PE = 1, VM = 1, IOPL < 3 *)
    FI;
    END;

RETURN-TO-VIRTUAL-8086-MODE:
(* Interrupted procedure was in virtual-8086 mode: PE = 1, CPL = 0, VM = 1 in flag image *)
    IF top 24 bytes of stack are not within stack segment limits
        THEN #SS(0); FI;
    IF instruction pointer not within code segment limits
        THEN #GP(0); FI;
    CS ← tempCS;
    EIP ← tempEIP & FFFFH;
    EFLAGS ← tempEFLAGS;
    TempESP ← Pop();
    TempSS ← Pop();
    ES ← Pop(); (* Pop 2 words; throw away high-order word *)
    DS ← Pop(); (* Pop 2 words; throw away high-order word *)
    FS ← Pop(); (* Pop 2 words; throw away high-order word *)
    GS ← Pop(); (* Pop 2 words; throw away high-order word *)
    SS:ESP ← TempSS:TempESP;
    CPL ← 3;
    (* Resume execution in Virtual-8086 mode *)
    END;

TASK-RETURN: (* PE = 1, VM = 0, NT = 1 *)
    Read segment selector in link field of current TSS;
    IF local/global bit is set to local or index not within GDT limits
        THEN #TS (TSS selector); FI;
    Access TSS for task specified in link field of current TSS;
    IF TSS descriptor type is not TSS or if the TSS is marked not busy
        THEN #TS (TSS selector); FI;
    IF TSS not present
        THEN #NP(TSS selector); FI;
    SWITCH-TASKS (without nesting) to TSS specified in link field of current TSS;
    Mark the task just abandoned as NOT BUSY;
    IF EIP is not within code segment limit
        THEN #GP(0); FI;
    END;


PROTECTED-MODE-RETURN: (* PE = 1 *)
    IF return code segment selector is NULL
        THEN #GP(0); FI;
    IF return code segment selector addresses descriptor beyond descriptor table limit
        THEN #GP(selector); FI;
    Read segment descriptor pointed to by the return code segment selector;
    IF return code segment descriptor is not a code segment
        THEN #GP(selector); FI;
    IF return code segment selector RPL < CPL
        THEN #GP(selector); FI;
    IF return code segment descriptor is conforming
    and return code segment DPL > return code segment selector RPL
        THEN #GP(selector); FI;
    IF return code segment descriptor is not present
        THEN #NP(selector); FI;
    IF return code segment selector RPL > CPL
        THEN GOTO RETURN-TO-OUTER-PRIVILEGE-LEVEL;
        ELSE GOTO RETURN-TO-SAME-PRIVILEGE-LEVEL; FI;
    END;

RETURN-TO-SAME-PRIVILEGE-LEVEL: (* PE = 1, RPL = CPL *)
    IF new mode ≠ 64-Bit Mode
        THEN
            IF tempEIP is not within code segment limits
                THEN #GP(0); FI;
            EIP ← tempEIP;
        ELSE (* new mode = 64-bit mode *)
            IF tempRIP is non-canonical
                THEN #GP(0); FI;
            RIP ← tempRIP;
    FI;
    CS ← tempCS; (* Segment descriptor information also loaded *)
    EFLAGS (CF, PF, AF, ZF, SF, TF, DF, OF, NT) ← tempEFLAGS;
    IF OperandSize = 32 or OperandSize = 64
        THEN EFLAGS(RF, AC, ID) ← tempEFLAGS; FI;
    IF CPL ≤ IOPL
        THEN EFLAGS(IF) ← tempEFLAGS; FI;
    IF CPL = 0
        THEN (* VM = 0 in flags image *)
            EFLAGS(IOPL) ← tempEFLAGS;
            IF OperandSize = 32 or OperandSize = 64
                THEN EFLAGS(VIF, VIP) ← tempEFLAGS; FI;
    FI;
    END;

RETURN-TO-OUTER-PRIVILEGE-LEVEL:
    IF OperandSize = 32
        THEN
            IF top 8 bytes on stack are not within limits
                THEN #SS(0); FI;
        ELSE (* OperandSize = 16 *)
            IF top 4 bytes on stack are not within limits
                THEN #SS(0); FI;
    FI;
    Read return segment selector;
    IF stack segment selector is NULL
        THEN #GP(0); FI;
    IF return stack segment selector index is not within its descriptor table limits
        THEN #GP(SS selector); FI;
    Read segment descriptor pointed to by return segment selector;
    IF stack segment selector RPL ≠ RPL of the return code segment selector
    or the stack segment descriptor does not indicate a writable data segment;
    or the stack segment DPL ≠ RPL of the return code segment selector
        THEN #GP(SS selector); FI;
    IF stack segment is not present
        THEN #SS(SS selector); FI;
    IF new mode ≠ 64-Bit Mode
        THEN
            IF tempEIP is not within code segment limits
                THEN #GP(0); FI;
            EIP ← tempEIP;
        ELSE (* new mode = 64-bit mode *)
            IF tempRIP is non-canonical
                THEN #GP(0); FI;
            RIP ← tempRIP;
    FI;
    CS ← tempCS;
    EFLAGS (CF, PF, AF, ZF, SF, TF, DF, OF, NT) ← tempEFLAGS;
    IF OperandSize = 32
        THEN EFLAGS(RF, AC, ID) ← tempEFLAGS; FI;
    IF CPL ≤ IOPL
        THEN EFLAGS(IF) ← tempEFLAGS; FI;
    IF CPL = 0
        THEN
            EFLAGS(IOPL) ← tempEFLAGS;
            IF OperandSize = 32
                THEN EFLAGS(VM, VIF, VIP) ← tempEFLAGS; FI;
            IF OperandSize = 64
                THEN EFLAGS(VIF, VIP) ← tempEFLAGS; FI;
    FI;
    CPL ← RPL of the return code segment selector;
    FOR each of segment register (ES, FS, GS, and DS)
        DO
            IF segment register points to data or non-conforming code segment
            and CPL > segment descriptor DPL (* Stored in hidden part of segment register *)
                THEN (* Segment register invalid *)
                    SegmentSelector ← 0; (* NULL segment selector *)
            FI;
        OD;


    END;

IA-32e-MODE-RETURN: (* IA32_EFER.LMA = 1, PE = 1 *)
    IF ( (return code segment selector is NULL) or
    (return RIP is non-canonical) or
    (SS selector is NULL going back to compatibility mode) or
    (SS selector is NULL going back to CPL3 64-bit mode) or
    (RPL <> CPL going back to non-CPL3 64-bit mode for a NULL SS selector) )
        THEN #GP(0); FI;
    IF return code segment selector addresses descriptor beyond descriptor table limit
        THEN #GP(selector); FI;
    Read segment descriptor pointed to by the return code segment selector;
    IF return code segment descriptor is not a code segment
        THEN #GP(selector); FI;
    IF return code segment selector RPL < CPL
        THEN #GP(selector); FI;
    IF return code segment descriptor is conforming
    and return code segment DPL > return code segment selector RPL
        THEN #GP(selector); FI;
    IF return code segment descriptor is not present
        THEN #NP(selector); FI;
    IF return code segment selector RPL > CPL
        THEN GOTO RETURN-TO-OUTER-PRIVILEGE-LEVEL;
        ELSE GOTO RETURN-TO-SAME-PRIVILEGE-LEVEL; FI;
    END;
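The EFLAGS handling in the pseudocode above can be expressed with bit masks. The sketch below covers two cases only: the real-address mode merge with a 32-bit operand size, and the RETURN-TO-SAME-PRIVILEGE-LEVEL update. The function names are hypothetical; the bit positions are the architectural EFLAGS positions, and the IOPL compared against CPL is taken from the current EFLAGS, since IF is updated before IOPL in the pseudocode.

```python
CF, PF, AF, ZF, SF, TF, IF, DF, OF = (1 << b for b in (0, 2, 4, 6, 7, 8, 9, 10, 11))
IOPL = 3 << 12                      # two-bit field, bits 13:12
NT = 1 << 14
RF, VM, AC, VIF, VIP, ID = (1 << b for b in (16, 17, 18, 19, 20, 21))


def real_mode_eflags(old, popped):
    """EFLAGS <- (tempEFLAGS AND 257FD5H) OR (EFLAGS AND 1A0000H)."""
    # 257FD5H covers the flags taken from the stack image;
    # 1A0000H (VM, VIF, VIP) is kept from the current EFLAGS.
    return (popped & 0x257FD5) | (old & 0x1A0000)


def same_privilege_eflags(old, image, cpl, operand_size):
    """Flag update performed by RETURN-TO-SAME-PRIVILEGE-LEVEL."""
    mask = CF | PF | AF | ZF | SF | TF | DF | OF | NT
    if operand_size in (32, 64):
        mask |= RF | AC | ID
    if cpl <= (old >> 12) & 3:      # CPL <= current IOPL
        mask |= IF
    if cpl == 0:
        mask |= IOPL
        if operand_size in (32, 64):
            mask |= VIF | VIP
    # Bits in mask come from the stack image; all others are preserved.
    return (old & ~mask) | (image & mask)
```

For example, code at CPL 3 with IOPL 0 cannot change IF or IOPL through the popped image, while code at CPL 0 can change both.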

Flags Affected
All the flags and fields in the EFLAGS register are potentially modified, depending on the mode of operation of the processor. If performing a return from a nested task to a previous task, the EFLAGS register will be modified according to the EFLAGS image stored in the previous task's TSS.

Protected Mode Exceptions


#GP(0)            If the return code or stack segment selector is NULL.
                  If the return instruction pointer is not within the return code segment limit.
#GP(selector)     If a segment selector index is outside its descriptor table limits.
                  If the return code segment selector RPL is less than the CPL.
                  If the DPL of a conforming-code segment is greater than the return code segment selector RPL.
                  If the DPL for a nonconforming-code segment is not equal to the RPL of the code segment selector.
                  If the stack segment descriptor DPL is not equal to the RPL of the return code segment selector.
                  If the stack segment is not a writable data segment.
                  If the stack segment selector RPL is not equal to the RPL of the return code segment selector.
                  If the segment descriptor for a code segment does not indicate it is a code segment.
                  If the segment selector for a TSS has its local/global bit set for local.
                  If a TSS segment descriptor specifies that the TSS is not busy.
                  If a TSS segment descriptor specifies that the TSS is not available.
#SS(0)            If the top bytes of stack are not within stack limits.
#NP(selector)     If the return code or stack segment is not present.
#PF(fault-code)   If a page fault occurs.
#AC(0)            If an unaligned memory reference occurs when the CPL is 3 and alignment checking is enabled.
#UD               If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP               If the return instruction pointer is not within the return code segment limit.
#SS               If the top bytes of stack are not within stack limits.

Virtual-8086 Mode Exceptions


#GP(0)            If the return instruction pointer is not within the return code segment limit.
                  If IOPL is not equal to 3.
#PF(fault-code)   If a page fault occurs.
#SS(0)            If the top bytes of stack are not within stack limits.
#AC(0)            If an unaligned memory reference occurs and alignment checking is enabled.
#UD               If the LOCK prefix is used.

Compatibility Mode Exceptions


#GP(0)            If EFLAGS.NT[bit 14] = 1.
Other exceptions same as in Protected Mode.

64-Bit Mode Exceptions


#GP(0)            If EFLAGS.NT[bit 14] = 1.
                  If the return code segment selector is NULL.
                  If the stack segment selector is NULL going back to compatibility mode.
                  If the stack segment selector is NULL going back to CPL3 64-bit mode.
                  If a NULL stack segment selector RPL is not equal to CPL going back to non-CPL3 64-bit mode.
                  If the return instruction pointer is not within the return code segment limit.
                  If the return instruction pointer is non-canonical.
#GP(selector)     If a segment selector index is outside its descriptor table limits.
                  If a segment descriptor memory address is non-canonical.
                  If the segment descriptor for a code segment does not indicate it is a code segment.
                  If the proposed new code segment descriptor has both the D-bit and L-bit set.
                  If the DPL for a nonconforming-code segment is not equal to the RPL of the code segment selector.
                  If CPL is greater than the RPL of the code segment selector.
                  If the DPL of a conforming-code segment is greater than the return code segment selector RPL.
                  If the stack segment is not a writable data segment.
                  If the stack segment descriptor DPL is not equal to the RPL of the return code segment selector.
                  If the stack segment selector RPL is not equal to the RPL of the return code segment selector.
#SS(0)            If an attempt to pop a value off the stack violates the SS limit.
                  If an attempt to pop a value off the stack causes a non-canonical address to be referenced.
#NP(selector)     If the return code or stack segment is not present.
#PF(fault-code)   If a page fault occurs.
#AC(0)            If an unaligned memory reference occurs when the CPL is 3 and alignment checking is enabled.
#UD               If the LOCK prefix is used.
...

7. Updates to Chapter 4, Volume 2B


Change bars show changes to Chapter 4 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B: Instruction Set Reference, M-Z.
------------------------------------------------------------------------------------------
...

MOV - Move

Opcode          Instruction         Op/En  64-Bit Mode  Compat/Leg Mode  Description
88 /r           MOV r/m8,r8         MR     Valid        Valid            Move r8 to r/m8.
REX + 88 /r     MOV r/m8***,r8***   MR     Valid        N.E.             Move r8 to r/m8.
89 /r           MOV r/m16,r16       MR     Valid        Valid            Move r16 to r/m16.
89 /r           MOV r/m32,r32       MR     Valid        Valid            Move r32 to r/m32.
REX.W + 89 /r   MOV r/m64,r64       MR     Valid        N.E.             Move r64 to r/m64.
8A /r           MOV r8,r/m8         RM     Valid        Valid            Move r/m8 to r8.
REX + 8A /r     MOV r8***,r/m8***   RM     Valid        N.E.             Move r/m8 to r8.
8B /r           MOV r16,r/m16       RM     Valid        Valid            Move r/m16 to r16.
8B /r           MOV r32,r/m32       RM     Valid        Valid            Move r/m32 to r32.
REX.W + 8B /r   MOV r64,r/m64       RM     Valid        N.E.             Move r/m64 to r64.
8C /r           MOV r/m16,Sreg**    MR     Valid        Valid            Move segment register to r/m16.
REX.W + 8C /r   MOV r/m64,Sreg**    MR     Valid        Valid            Move zero extended 16-bit segment register to r/m64.
8E /r           MOV Sreg,r/m16**    RM     Valid        Valid            Move r/m16 to segment register.
REX.W + 8E /r   MOV Sreg,r/m64**    RM     Valid        Valid            Move lower 16 bits of r/m64 to segment register.
A0              MOV AL,moffs8*      FD     Valid        Valid            Move byte at (seg:offset) to AL.
REX.W + A0      MOV AL,moffs8*      FD     Valid        N.E.             Move byte at (offset) to AL.
A1              MOV AX,moffs16*     FD     Valid        Valid            Move word at (seg:offset) to AX.
A1              MOV EAX,moffs32*    FD     Valid        Valid            Move doubleword at (seg:offset) to EAX.
REX.W + A1      MOV RAX,moffs64*    FD     Valid        N.E.             Move quadword at (offset) to RAX.
A2              MOV moffs8,AL       TD     Valid        Valid            Move AL to (seg:offset).
REX.W + A2      MOV moffs8***,AL    TD     Valid        N.E.             Move AL to (offset).
A3              MOV moffs16*,AX     TD     Valid        Valid            Move AX to (seg:offset).
A3              MOV moffs32*,EAX    TD     Valid        Valid            Move EAX to (seg:offset).
REX.W + A3      MOV moffs64*,RAX    TD     Valid        N.E.             Move RAX to (offset).
B0+ rb          MOV r8, imm8        OI     Valid        Valid            Move imm8 to r8.
REX + B0+ rb    MOV r8***, imm8     OI     Valid        N.E.             Move imm8 to r8.
B8+ rw          MOV r16, imm16      OI     Valid        Valid            Move imm16 to r16.
B8+ rd          MOV r32, imm32      OI     Valid        Valid            Move imm32 to r32.
REX.W + B8+ rd  MOV r64, imm64      OI     Valid        N.E.             Move imm64 to r64.
C6 /0           MOV r/m8, imm8      MI     Valid        Valid            Move imm8 to r/m8.
REX + C6 /0     MOV r/m8***, imm8   MI     Valid        N.E.             Move imm8 to r/m8.
C7 /0           MOV r/m16, imm16    MI     Valid        Valid            Move imm16 to r/m16.
C7 /0           MOV r/m32, imm32    MI     Valid        Valid            Move imm32 to r/m32.
REX.W + C7 /0   MOV r/m64, imm32    MI     Valid        N.E.             Move imm32 sign extended to 64-bits to r/m64.

NOTES:
*   The moffs8, moffs16, moffs32 and moffs64 operands specify a simple offset relative to the segment base, where 8, 16, 32 and 64 refer to the size of the data. The address-size attribute of the instruction determines the size of the offset, either 16, 32 or 64 bits.
**  In 32-bit mode, the assembler may insert the 16-bit operand-size prefix with this instruction (see the following Description section for further information).
*** In 64-bit mode, r/m8 cannot be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.
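The third note can be illustrated with a small lookup model. This is a sketch under stated assumptions: the register numbering and the names SPL, BPL, SIL, DIL and R8L-R15L come from the architecture's general register-encoding conventions rather than from this section, and the function name `byte_register` and its parameters are hypothetical.

```python
# Byte registers selected by a 3-bit register field when no REX prefix is present.
LEGACY_BYTE_REGS = ["AL", "CL", "DL", "BL", "AH", "CH", "DH", "BH"]

# Byte registers selected when any REX prefix is present (field extended to 4 bits).
REX_BYTE_REGS = ["AL", "CL", "DL", "BL", "SPL", "BPL", "SIL", "DIL"] + \
    ["R%dL" % n for n in range(8, 16)]


def byte_register(reg_field, rex_present, rex_ext=0):
    """Name the 8-bit register selected by a 3-bit register field.

    reg_field   -- 3-bit register number (0..7)
    rex_present -- True if the instruction carries any REX prefix
    rex_ext     -- the REX extension bit that applies to this field (0 or 1)
    """
    if not rex_present:
        return LEGACY_BYTE_REGS[reg_field]
    return REX_BYTE_REGS[(rex_ext << 3) | reg_field]
```

In this model, field values 4 through 7 name AH, CH, DH and BH only when no REX prefix is present; with a REX prefix the same values name other registers, which is why AH, BH, CH and DH become unreachable.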

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
MR     ModRM:r/m (w)    ModRM:reg (r)    NA         NA
RM     ModRM:reg (w)    ModRM:r/m (r)    NA         NA
FD     AL/AX/EAX/RAX    Moffs            NA         NA
TD     Moffs (w)        AL/AX/EAX/RAX    NA         NA
OI     opcode + rd (w)  imm8/16/32/64    NA         NA
MI     ModRM:r/m (w)    imm8/16/32/64    NA         NA

Description
Copies the second operand (source operand) to the first operand (destination operand). The source operand can be an immediate value, general-purpose register, segment register, or memory location; the destination operand can be a general-purpose register, segment register, or memory location. Both operands must be the same size, which can be a byte, a word, a doubleword, or a quadword.

The MOV instruction cannot be used to load the CS register. Attempting to do so results in an invalid opcode exception (#UD). To load the CS register, use the far JMP, CALL, or RET instruction.

If the destination operand is a segment register (DS, ES, FS, GS, or SS), the source operand must be a valid segment selector. In protected mode, moving a segment selector into a segment register automatically causes the segment descriptor information associated with that segment selector to be loaded into the hidden (shadow) part of the segment register. While loading this information, the segment selector and segment descriptor information is validated (see the Operation algorithm below). The segment descriptor data is obtained from the GDT or LDT entry for the specified segment selector.

A NULL segment selector (values 0000-0003) can be loaded into the DS, ES, FS, and GS registers without causing a protection exception. However, any subsequent attempt to reference a segment whose corresponding segment register is loaded with a NULL value causes a general protection exception (#GP) and no memory reference occurs.

Loading the SS register with a MOV instruction inhibits all interrupts until after the execution of the next instruction. This operation allows a stack pointer to be loaded into the ESP register with the next instruction (MOV ESP, stack-pointer value) before an interrupt occurs (see footnote 1). Be aware that the LSS instruction offers a more efficient method of loading the SS and ESP registers.

When operating in 32-bit mode and moving data between a segment register and a general-purpose register, the 32-bit IA-32 processors do not require the use of the 16-bit operand-size prefix (a byte with the value 66H) with this instruction, but most assemblers will insert it if the standard form of the instruction is used (for example, MOV DS, AX). The processor will execute this instruction correctly, but it will usually require an extra clock. With most assemblers, using the instruction form MOV DS, EAX will avoid this unneeded 66H prefix. When the processor executes the instruction with a 32-bit general-purpose register, it assumes that the 16 least-significant bits of the general-purpose register are the destination or source operand. If the register is a destination operand, the resulting value in the two high-order bytes of the register is implementation dependent. For the Pentium 4, Intel Xeon, and P6 family processors, the two high-order bytes are filled with zeros; for earlier 32-bit IA-32 processors, the two high-order bytes are undefined.

In 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.
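The NULL selector range quoted above (0000-0003) follows from the layout of a segment selector. A minimal sketch, assuming the standard selector layout (RPL in bits 1:0, table indicator in bit 2, descriptor index in bits 15:3), which is not restated in this section; the function names are hypothetical.

```python
def decode_selector(selector):
    """Split a 16-bit segment selector into (index, table_indicator, rpl)."""
    selector &= 0xFFFF
    rpl = selector & 0x3          # bits 1:0, requested privilege level
    ti = (selector >> 2) & 0x1    # bit 2, 0 = GDT, 1 = LDT
    index = selector >> 3         # bits 15:3, descriptor index
    return index, ti, rpl


def is_null_selector(selector):
    """True for selector values 0000H through 0003H (index 0 in the GDT)."""
    return (selector & 0xFFFC) == 0
```

The four NULL values differ only in their RPL bits, so the test masks those two bits off before comparing with zero.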

Operation
DEST ← SRC;

Loading a segment register while in protected mode results in special checks and actions, as described in the following listing. These checks are performed on the segment selector and the segment descriptor to which it points.

IF SS is loaded
    THEN
        IF segment selector is NULL
            THEN #GP(0); FI;
        IF segment selector index is outside descriptor table limits
        or segment selector's RPL ≠ CPL
        or segment is not a writable data segment
        or DPL ≠ CPL
            THEN #GP(selector); FI;
        IF segment not marked present
            THEN #SS(selector);
            ELSE
                SS ← segment selector;
                SS ← segment descriptor; FI;
FI;
IF DS, ES, FS, or GS is loaded with non-NULL selector
    THEN
        IF segment selector index is outside descriptor table limits
        or segment is not a data or readable code segment
        or ((segment is a data or nonconforming code segment)
            and ((RPL > DPL) and (CPL > DPL)))
            THEN #GP(selector); FI;
        IF segment not marked present
            THEN #NP(selector);
            ELSE
                SegmentRegister ← segment selector;
                SegmentRegister ← segment descriptor; FI;
FI;
IF DS, ES, FS, or GS is loaded with NULL selector
    THEN
        SegmentRegister ← segment selector;
        SegmentRegister ← segment descriptor;
FI;

1. If a code instruction breakpoint (for debug) is placed on an instruction located immediately after a MOV SS instruction, the breakpoint may not be triggered. However, in a sequence of instructions that load the SS register, only the first instruction in the sequence is guaranteed to delay an interrupt. In the following sequence, interrupts may be recognized before MOV ESP, EBP executes:
    MOV SS, EDX
    MOV SS, EAX
    MOV ESP, EBP
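The SS branch of the listing can be read as a fixed sequence of checks. The following Python model is a sketch of that ordering only; the function name `check_ss_load` and its parameters are hypothetical, and each parameter stands for a condition the processor would evaluate from the selector and descriptor.

```python
def check_ss_load(selector_is_null, index_in_limits, rpl, cpl,
                  writable_data, dpl, present):
    """Model the protected-mode checks made when MOV loads SS.

    Returns the exception name, or "loaded" when all checks pass.
    """
    if selector_is_null:
        return "#GP(0)"
    if (not index_in_limits or rpl != cpl
            or not writable_data or dpl != cpl):
        return "#GP(selector)"
    if not present:
        return "#SS(selector)"      # not-present stack segment reports #SS, not #NP
    return "loaded"
```

Note that in this model a not-present segment reports #SS(selector) for SS, whereas the DS, ES, FS, and GS branch of the listing reports #NP(selector).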

Flags Affected
None.

Protected Mode Exceptions


#GP(0)            If attempt is made to load SS register with NULL segment selector.
                  If the destination operand is in a non-writable segment.
                  If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                  If the DS, ES, FS, or GS register contains a NULL segment selector.
#GP(selector)     If segment selector index is outside descriptor table limits.
                  If the SS register is being loaded and the segment selector's RPL and the segment descriptor's DPL are not equal to the CPL.
                  If the SS register is being loaded and the segment pointed to is a non-writable data segment.
                  If the DS, ES, FS, or GS register is being loaded and the segment pointed to is not a data or readable code segment.
                  If the DS, ES, FS, or GS register is being loaded and the segment pointed to is a data or nonconforming code segment, but both the RPL and the CPL are greater than the DPL.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#SS(selector)     If the SS register is being loaded and the segment pointed to is marked not present.
#NP               If the DS, ES, FS, or GS register is being loaded and the segment pointed to is marked not present.
#PF(fault-code)   If a page fault occurs.
#AC(0)            If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD               If attempt is made to load the CS register.
                  If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP               If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS               If a memory operand effective address is outside the SS segment limit.
#UD               If attempt is made to load the CS register.
                  If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0)            If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS(0)            If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)   If a page fault occurs.
#AC(0)            If alignment checking is enabled and an unaligned memory reference is made.
#UD               If attempt is made to load the CS register.
                  If the LOCK prefix is used.

Compatibility Mode Exceptions


Same exceptions as in protected mode. ...

MOVD/MOVQ - Move Doubleword/Move Quadword

Opcode/Instruction                    Op/En  64/32-bit Mode  CPUID Feature Flag  Description
0F 6E /r                              RM     V/V             MMX                 Move doubleword from r/m32 to mm.
MOVD mm, r/m32
REX.W + 0F 6E /r                      RM     V/N.E.          MMX                 Move quadword from r/m64 to mm.
MOVQ mm, r/m64
0F 7E /r                              MR     V/V             MMX                 Move doubleword from mm to r/m32.
MOVD r/m32, mm
REX.W + 0F 7E /r                      MR     V/N.E.          MMX                 Move quadword from mm to r/m64.
MOVQ r/m64, mm
VEX.128.66.0F.W0 6E /r                RM     V/V             AVX                 Move doubleword from r/m32 to xmm1.
VMOVD xmm1, r32/m32
VEX.128.66.0F.W1 6E /r                RM     V/N.E.          AVX                 Move quadword from r/m64 to xmm1.
VMOVQ xmm1, r64/m64
66 0F 6E /r                           RM     V/V             SSE2                Move doubleword from r/m32 to xmm.
MOVD xmm, r/m32
66 REX.W 0F 6E /r                     RM     V/N.E.          SSE2                Move quadword from r/m64 to xmm.
MOVQ xmm, r/m64
66 0F 7E /r                           MR     V/V             SSE2                Move doubleword from xmm register to r/m32.
MOVD r/m32, xmm
66 REX.W 0F 7E /r                     MR     V/N.E.          SSE2                Move quadword from xmm register to r/m64.
MOVQ r/m64, xmm
VEX.128.66.0F.W0 7E /r                MR     V/V             AVX                 Move doubleword from xmm1 register to r/m32.
VMOVD r32/m32, xmm1
VEX.128.66.0F.W1 7E /r                MR     V/N.E.          AVX                 Move quadword from xmm1 register to r/m64.
VMOVQ r64/m64, xmm1

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RM     ModRM:reg (w)    ModRM:r/m (r)    NA         NA
MR     ModRM:r/m (w)    ModRM:reg (r)    NA         NA

Description
Copies a doubleword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be general-purpose registers, MMX technology registers, XMM registers, or 32-bit memory locations. This instruction can be used to move a doubleword to and from the low doubleword of an MMX technology register and a general-purpose register or a 32-bit memory location, or to and from the low doubleword of an XMM register and a general-purpose register or a 32-bit memory location. The instruction cannot be used to transfer data between MMX technology registers, between XMM registers, between general-purpose registers, or between memory locations.

When the destination operand is an MMX technology register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 64 bits. When the destination operand is an XMM register, the source operand is written to the low doubleword of the register, and the register is zero-extended to 128 bits.

In 64-bit mode, the instruction's default operation size is 32 bits. Use of the REX.R prefix permits access to additional registers (R8-R15). Use of the REX.W prefix promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.

Operation
MOVD (when destination operand is MMX technology register)
    DEST[31:0] ← SRC;
    DEST[63:32] ← 00000000H;
MOVD (when destination operand is XMM register)
    DEST[31:0] ← SRC;
    DEST[127:32] ← 000000000000000000000000H;
    DEST[VLMAX-1:128] (Unmodified)
MOVD (when source operand is MMX technology or XMM register)
    DEST ← SRC[31:0];
VMOVD (VEX-encoded version when destination is an XMM register)
    DEST[31:0] ← SRC[31:0]
    DEST[VLMAX-1:32] ← 0
MOVQ (when destination operand is XMM register)
    DEST[63:0] ← SRC[63:0];
    DEST[127:64] ← 0000000000000000H;
    DEST[VLMAX-1:128] (Unmodified)
MOVQ (when destination operand is r/m64)
    DEST[63:0] ← SRC[63:0];
MOVQ (when source operand is XMM register or r/m64)
    DEST ← SRC[63:0];
VMOVQ (VEX-encoded version when destination is an XMM register)
    DEST[63:0] ← SRC[63:0]
    DEST[VLMAX-1:64] ← 0
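The difference between the legacy and VEX-encoded forms lies in the bits above 127. A minimal Python model of the XMM-destination doubleword case, treating the full vector register as one integer; the value VLMAX = 256 and the function name `movd_to_xmm` are assumptions made for illustration.

```python
VLMAX = 256  # assumed maximum vector length in bits, for illustration only


def movd_to_xmm(dest, src, vex=False):
    """Model MOVD/VMOVD writing a 32-bit source into an XMM register.

    dest -- current value of the full vector register (VLMAX bits) as an int
    src  -- source value; only bits 31:0 are used
    vex  -- True for the VEX-encoded form (VMOVD)
    """
    low = src & 0xFFFFFFFF
    if vex:
        return low                        # DEST[VLMAX-1:32] <- 0
    upper = (dest >> 128) << 128          # DEST[VLMAX-1:128] left unmodified
    return upper | low                    # DEST[127:32] <- 0
```

With the same inputs, the legacy form preserves whatever was in bits VLMAX-1:128 of the destination, while the VEX form clears them.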

Intel C/C++ Compiler Intrinsic Equivalent


MOVD: __m64 _mm_cvtsi32_si64 (int i)
MOVD: int _mm_cvtsi64_si32 (__m64 m)
MOVD: __m128i _mm_cvtsi32_si128 (int a)
MOVD: int _mm_cvtsi128_si32 (__m128i a)
MOVQ: __int64 _mm_cvtsi128_si64(__m128i);
MOVQ: __m128i _mm_cvtsi64_si128(__int64);

Flags Affected
None.

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Exceptions Type 5; additionally
#UD               If VEX.L = 1.
                  If VEX.vvvv != 1111B.
...

MOVQ - Move Quadword

Opcode/Instruction                    Op/En  64/32-bit Mode  CPUID Feature Flag  Description
0F 6F /r                              RM     V/V             MMX                 Move quadword from mm/m64 to mm.
MOVQ mm, mm/m64
0F 7F /r                              MR     V/V             MMX                 Move quadword from mm to mm/m64.
MOVQ mm/m64, mm
F3 0F 7E                              RM     V/V             SSE2                Move quadword from xmm2/mem64 to xmm1.
MOVQ xmm1, xmm2/m64
VEX.128.F3.0F.WIG 7E /r               RM     V/V             AVX                 Move quadword from xmm2 to xmm1.
VMOVQ xmm1, xmm2
VEX.128.F3.0F.WIG 7E /r               RM     V/V             AVX                 Load quadword from m64 to xmm1.
VMOVQ xmm1, m64
66 0F D6                              MR     V/V             SSE2                Move quadword from xmm1 to xmm2/mem64.
MOVQ xmm2/m64, xmm1
VEX.128.66.0F.WIG D6 /r               MR     V/V             AVX                 Move quadword from xmm2 register to xmm1/m64.
VMOVQ xmm1/m64, xmm2

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RM     ModRM:reg (w)    ModRM:r/m (r)    NA         NA
MR     ModRM:r/m (w)    ModRM:reg (r)    NA         NA

Description
Copies a quadword from the source operand (second operand) to the destination operand (first operand). The source and destination operands can be MMX technology registers, XMM registers, or 64-bit memory locations. This instruction can be used to move a quadword between two MMX technology registers or between an MMX technology register and a 64-bit memory location, or to move data between two XMM registers or between an XMM register and a 64-bit memory location. The instruction cannot be used to transfer data between memory locations.

When the source operand is an XMM register, the low quadword is moved; when the destination operand is an XMM register, the quadword is stored to the low quadword of the register, and the high quadword is cleared to all 0s.

In 64-bit mode, use of the REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Note: In the VEX.128.66.0F D6 instruction version, VEX.vvvv and VEX.L=1 are reserved, and the former must be 1111b; otherwise the instruction will #UD.
Note: In the VEX.128.F3.0F 7E version, VEX.vvvv and VEX.L=1 are reserved, and the former must be 1111b; otherwise the instruction will #UD.

Operation
MOVQ instruction when operating on MMX technology registers and memory locations:
    DEST ← SRC;
MOVQ instruction when source and destination operands are XMM registers:
    DEST[63:0] ← SRC[63:0];
    DEST[127:64] ← 0000000000000000H;
MOVQ instruction when source operand is XMM register and destination operand is memory location:
    DEST ← SRC[63:0];
MOVQ instruction when source operand is memory location and destination operand is XMM register:
    DEST[63:0] ← SRC;
    DEST[127:64] ← 0000000000000000H;
VMOVQ (VEX.NDS.128.F3.0F 7E) with XMM register source and destination:
    DEST[63:0] ← SRC[63:0]
    DEST[VLMAX-1:64] ← 0
VMOVQ (VEX.128.66.0F D6) with XMM register source and destination:
    DEST[63:0] ← SRC[63:0]
    DEST[VLMAX-1:64] ← 0
VMOVQ (7E) with memory source:
    DEST[63:0] ← SRC[63:0]
    DEST[VLMAX-1:64] ← 0
VMOVQ (D6) with memory dest:
    DEST[63:0] ← SRC2[63:0]

Flags Affected
None.

Intel C/C++ Compiler Intrinsic Equivalent


MOVQ: __m128i _mm_move_epi64(__m128i a)

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Table 22-8, Exception Conditions for Legacy SIMD/MMX Instructions without FP Exception, in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B.
...

PCMPESTRI - Packed Compare Explicit Length Strings, Return Index

Opcode/Instruction                    Op/En  64/32 bit Mode Support  CPUID Feature Flag  Description
66 0F 3A 61 /r imm8                   RMI    V/V                     SSE4_2              Perform a packed comparison of string data with explicit lengths, generating an index, and storing the result in ECX.
PCMPESTRI xmm1, xmm2/m128, imm8
VEX.128.66.0F3A.WIG 61 /r ib          RMI    V/V                     AVX                 Perform a packed comparison of string data with explicit lengths, generating an index, and storing the result in ECX.
VPCMPESTRI xmm1, xmm2/m128, imm8

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RMI    ModRM:reg (r)    ModRM:r/m (r)    imm8       NA


Description
The instruction compares and processes data from two string fragments based on the encoded value in the Imm8 Control Byte (see Section 4.1, Imm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM), and generates an index stored to the count register (ECX/RCX).

Each string fragment is represented by two values. The first value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data). The second value is stored in an input length register. The input length register is EAX/RAX (for xmm1) or EDX/RDX (for xmm2/m128). The length represents the number of bytes/words which are valid for the respective xmm/m128 data.

The length of each input is interpreted as being the absolute-value of the value in the length register. The absolute-value computation saturates to 16 (for bytes) and 8 (for words), based on the value of imm8[bit3] when the value in the length register is greater than 16 (8) or less than -16 (-8).

The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). The index of the first (or last, according to imm8[6]) set bit of IntRes2 (see Section 4.1.4) is returned in ECX. If no bits are set in IntRes2, ECX is set to 16 (8).

Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant information:
    CFlag   Reset if IntRes2 is equal to zero, set otherwise
    ZFlag   Set if absolute-value of EDX is < 16 (8), reset otherwise
    SFlag   Set if absolute-value of EAX is < 16 (8), reset otherwise
    OFlag   IntRes2[0]
    AFlag   Reset
    PFlag   Reset
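The length saturation and the index selection described above can be sketched in Python. This is a model of the stated rules only, not of the comparison itself: the function names are hypothetical, the element count (16 for bytes, 8 for words) is passed in directly rather than decoded from imm8, and IntRes2 is taken as an already computed integer.

```python
def explicit_length(reg_value, max_elems):
    """Absolute value of a signed length register, saturated to max_elems.

    max_elems is 16 for byte data and 8 for word data.
    """
    return min(abs(reg_value), max_elems)


def index_result(int_res2, max_elems, last=False):
    """Index returned in ECX: the first (or last) set bit of IntRes2.

    Returns max_elems (16 or 8) when no bit is set.
    """
    int_res2 &= (1 << max_elems) - 1
    if int_res2 == 0:
        return max_elems
    if last:
        return int_res2.bit_length() - 1            # most significant set bit
    return (int_res2 & -int_res2).bit_length() - 1  # least significant set bit
```

A length register holding -20 with byte data therefore describes a full 16-element fragment, and an all-zero IntRes2 yields an index equal to the element count, which is one past the last valid element index.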

Effective Operand Size


Operating mode/size  Operand 1  Operand 2  Length 1  Length 2  Result
16 bit               xmm        xmm/m128   EAX       EDX       ECX
32 bit               xmm        xmm/m128   EAX       EDX       ECX
64 bit               xmm        xmm/m128   EAX       EDX       ECX
64 bit + REX.W       xmm        xmm/m128   RAX       RDX       RCX

Intel C/C++ Compiler Intrinsic Equivalent For Returning Index


int _mm_cmpestri (__m128i a, int la, __m128i b, int lb, const int mode);

Intel C/C++ Compiler Intrinsics For Reading EFlag Results


int _mm_cmpestra (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrc (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestro (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrs (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrz (__m128i a, int la, __m128i b, int lb, const int mode);

SIMD Floating-Point Exceptions


None.


Other Exceptions
See Exceptions Type 4; additionally, this instruction does not cause #GP if the memory operand is not aligned to a 16-byte boundary, and
#UD               If VEX.L = 1.
                  If VEX.vvvv != 1111B.
...

PCMPESTRM - Packed Compare Explicit Length Strings, Return Mask

Opcode/Instruction                    Op/En  64/32 bit Mode Support  CPUID Feature Flag  Description
66 0F 3A 60 /r imm8                   RMI    V/V                     SSE4_2              Perform a packed comparison of string data with explicit lengths, generating a mask, and storing the result in XMM0.
PCMPESTRM xmm1, xmm2/m128, imm8
VEX.128.66.0F3A.WIG 60 /r ib          RMI    V/V                     AVX                 Perform a packed comparison of string data with explicit lengths, generating a mask, and storing the result in XMM0.
VPCMPESTRM xmm1, xmm2/m128, imm8

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RMI    ModRM:reg (r)    ModRM:r/m (r)    imm8       NA

Description
The instruction compares data from two string fragments based on the encoded value in the imm8 control byte (see Section 4.1, Imm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM), and generates a mask stored to XMM0.

Each string fragment is represented by two values. The first value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data). The second value is stored in an input length register. The input length register is EAX/RAX (for xmm1) or EDX/RDX (for xmm2/m128). The length represents the number of bytes/words which are valid for the respective xmm/m128 data.

The length of each input is interpreted as being the absolute-value of the value in the length register. The absolute-value computation saturates to 16 (for bytes) and 8 (for words), based on the value of imm8[bit3] when the value in the length register is greater than 16 (8) or less than -16 (-8).

The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). As defined by imm8[6], IntRes2 is then either stored to the least significant bits of XMM0 (zero extended to 128 bits) or expanded into a byte/word-mask and then stored to XMM0.

Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant information:
    CFlag   Reset if IntRes2 is equal to zero, set otherwise
    ZFlag   Set if absolute-value of EDX is < 16 (8), reset otherwise
    SFlag   Set if absolute-value of EAX is < 16 (8), reset otherwise
    OFlag   IntRes2[0]
    AFlag   Reset
    PFlag   Reset

Note: In VEX.128 encoded versions, bits (VLMAX-1:128) of XMM0 are zeroed. VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.
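The two result formats selected by imm8[6] can be modeled as follows. This is a sketch under stated assumptions: the function name `result_mask` is hypothetical, IntRes2 is supplied as an integer, and the choice between the two formats is passed as a boolean because this section does not say which value of imm8[6] selects which format.

```python
def result_mask(int_res2, elem_bits, expand):
    """Value written to the low 128 bits of XMM0.

    int_res2  -- one result bit per element, as an integer
    elem_bits -- 8 for byte data, 16 for word data
    expand    -- False: IntRes2 zero-extended; True: byte/word mask
    """
    elems = 128 // elem_bits
    int_res2 &= (1 << elems) - 1
    if not expand:
        return int_res2                       # zero extended to 128 bits
    ones = (1 << elem_bits) - 1
    mask = 0
    for i in range(elems):
        if (int_res2 >> i) & 1:
            mask |= ones << (i * elem_bits)   # element i becomes all ones
    return mask
```

For byte data, an IntRes2 of 101b gives 5 in the unexpanded format and 00FF00FFH in the expanded format.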

Effective Operand Size


Operating mode/size  Operand 1  Operand 2  Length 1  Length 2  Result
16 bit               xmm        xmm/m128   EAX       EDX       XMM0
32 bit               xmm        xmm/m128   EAX       EDX       XMM0
64 bit               xmm        xmm/m128   EAX       EDX       XMM0
64 bit + REX.W       xmm        xmm/m128   RAX       RDX       XMM0

Intel C/C++ Compiler Intrinsic Equivalent For Returning Mask


__m128i _mm_cmpestrm (__m128i a, int la, __m128i b, int lb, const int mode);

Intel C/C++ Compiler Intrinsics For Reading EFlag Results


int _mm_cmpestra (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrc (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestro (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrs (__m128i a, int la, __m128i b, int lb, const int mode);
int _mm_cmpestrz (__m128i a, int la, __m128i b, int lb, const int mode);

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Exceptions Type 4; additionally, this instruction does not cause #GP if the memory operand is not aligned to a 16-byte boundary, and
#UD               If VEX.L = 1.
                  If VEX.vvvv != 1111B.
...

PCMPISTRI - Packed Compare Implicit Length Strings, Return Index

Opcode/Instruction                    Op/En  64/32 bit Mode Support  CPUID Feature Flag  Description
66 0F 3A 63 /r imm8                   RM     V/V                     SSE4_2              Perform a packed comparison of string data with implicit lengths, generating an index, and storing the result in ECX.
PCMPISTRI xmm1, xmm2/m128, imm8
VEX.128.66.0F3A.WIG 63 /r ib          RM     V/V                     AVX                 Perform a packed comparison of string data with implicit lengths, generating an index, and storing the result in ECX.
VPCMPISTRI xmm1, xmm2/m128, imm8


Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RM     ModRM:reg (r)    ModRM:r/m (r)    imm8       NA

Description
The instruction compares data from two strings based on the encoded value in the Imm8 Control Byte (see Section 4.1, Imm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM), and generates an index stored to ECX.

Each string is represented by a single value. The value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data). Each input byte/word is augmented with a valid/invalid tag. A byte/word is considered valid only if it has a lower index than the least significant null byte/word. (The least significant null byte/word is also considered invalid.)

The comparison and aggregation operations are performed according to the encoded value of Imm8 bit fields (see Section 4.1). The index of the first (or last, according to imm8[6]) set bit of IntRes2 is returned in ECX. If no bits are set in IntRes2, ECX is set to 16 (8).

Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant information:
    CFlag   Reset if IntRes2 is equal to zero, set otherwise
    ZFlag   Set if any byte/word of xmm2/mem128 is null, reset otherwise
    SFlag   Set if any byte/word of xmm1 is null, reset otherwise
    OFlag   IntRes2[0]
    AFlag   Reset
    PFlag   Reset

Note: In VEX.128 encoded version, VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.
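The valid/invalid tagging and the flag rules for the implicit-length forms can be sketched in Python. The function names are hypothetical, each operand is modeled as a list of 16 byte values or 8 word values ordered from least significant element upward, and IntRes2 is supplied as an already computed integer.

```python
def implicit_valid_count(elements):
    """Number of valid elements: those below the least significant null."""
    for i, value in enumerate(elements):
        if value == 0:
            return i            # the null element itself is invalid
    return len(elements)        # no null: every element is valid


def implicit_flags(xmm1_elems, xmm2_elems, int_res2):
    """CFlag, ZFlag, SFlag and OFlag as listed for the implicit-length forms."""
    return {
        "CF": int(int_res2 != 0),
        "ZF": int(0 in xmm2_elems),   # any null byte/word in xmm2/mem128
        "SF": int(0 in xmm1_elems),   # any null byte/word in xmm1
        "OF": int_res2 & 1,           # IntRes2[0]
    }
```

A 16-byte operand holding "abc" followed by zero bytes therefore has three valid elements, and an operand with no null element is valid in full.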

Effective Operand Size


Operating mode/size  Operand 1  Operand 2  Result
16 bit               xmm        xmm/m128   ECX
32 bit               xmm        xmm/m128   ECX
64 bit               xmm        xmm/m128   ECX
64 bit + REX.W       xmm        xmm/m128   RCX

Intel C/C++ Compiler Intrinsic Equivalent For Returning Index


int _mm_cmpistri (__m128i a, __m128i b, const int mode);

Intel C/C++ Compiler Intrinsics For Reading EFlag Results


int _mm_cmpistra (__m128i a, __m128i b, const int mode);
int _mm_cmpistrc (__m128i a, __m128i b, const int mode);
int _mm_cmpistro (__m128i a, __m128i b, const int mode);
int _mm_cmpistrs (__m128i a, __m128i b, const int mode);
int _mm_cmpistrz (__m128i a, __m128i b, const int mode);

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Exceptions Type 4; additionally, this instruction does not cause #GP if the memory operand is not aligned to a 16-byte boundary, and
#UD               If VEX.L = 1.
                  If VEX.vvvv != 1111B.
...

PCMPISTRM - Packed Compare Implicit Length Strings, Return Mask

Opcode/Instruction                    Op/En  64/32 bit Mode Support  CPUID Feature Flag  Description
66 0F 3A 62 /r imm8                   RM     V/V                     SSE4_2              Perform a packed comparison of string data with implicit lengths, generating a mask, and storing the result in XMM0.
PCMPISTRM xmm1, xmm2/m128, imm8
VEX.128.66.0F3A.WIG 62 /r ib          RM     V/V                     AVX                 Perform a packed comparison of string data with implicit lengths, generating a mask, and storing the result in XMM0.
VPCMPISTRM xmm1, xmm2/m128, imm8

Instruction Operand Encoding


Op/En  Operand 1        Operand 2        Operand 3  Operand 4
RM     ModRM:reg (r)    ModRM:r/m (r)    imm8       NA

Description
The instruction compares data from two strings based on the encoded value in the imm8 byte (see Section 4.1, "Imm8 Control Byte Operation for PCMPESTRI / PCMPESTRM / PCMPISTRI / PCMPISTRM") generating a mask stored to XMM0.

Each string is represented by a single value. The value is an xmm (or possibly m128 for the second operand) which contains the data elements of the string (byte or word data). Each input byte/word is augmented with a valid/invalid tag. A byte/word is considered valid only if it has a lower index than the least significant null byte/word. (The least significant null byte/word is also considered invalid.)

The comparison and aggregation operations are performed according to the encoded value of the imm8 bit fields (see Section 4.1). As defined by imm8[6], IntRes2 is then either stored to the least significant bits of XMM0 (zero extended to 128 bits) or expanded into a byte/word-mask and then stored to XMM0.

Note that the Arithmetic Flags are written in a non-standard manner in order to supply the most relevant information:
CFlag    Reset if IntRes2 is equal to zero, set otherwise
ZFlag    Set if any byte/word of xmm2/mem128 is null, reset otherwise
SFlag    Set if any byte/word of xmm1 is null, reset otherwise
OFlag    IntRes2[0]
AFlag    Reset
PFlag    Reset

Note: In VEX.128 encoded versions, bits (VLMAX-1:128) of XMM0 are zeroed. VEX.vvvv is reserved and must be 1111b, VEX.L must be 0, otherwise the instruction will #UD.
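The valid/invalid tagging rule for implicit-length byte strings can be sketched in portable C. This is only a model of the tagging step described above, not of the instruction as a whole; the helper name implicit_valid_mask is hypothetical and the sketch assumes byte (not word) elements.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bit i of the result is set when byte i of a
   16-byte operand is valid, i.e. it has a lower index than the least
   significant null byte. The null byte itself is tagged invalid. */
uint16_t implicit_valid_mask(const uint8_t bytes[16])
{
    assert(bytes != NULL);
    uint16_t mask = 0;
    for (int i = 0; i < 16; i++) {
        if (bytes[i] == 0)
            break;                      /* this byte and all higher ones are invalid */
        mask |= (uint16_t)(1u << i);
    }
    return mask;
}
```

A string with no null byte in its 16 elements yields an all-ones mask, which matches the case where neither ZFlag nor SFlag would report a null.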

Effective Operand Size


Operating mode/size | Operand 1 | Operand 2 | Result
16 bit | xmm | xmm/m128 | XMM0
32 bit | xmm | xmm/m128 | XMM0
64 bit | xmm | xmm/m128 | XMM0
64 bit + REX.W | xmm | xmm/m128 | XMM0

Intel C/C++ Compiler Intrinsic Equivalent For Returning Mask


__m128i _mm_cmpistrm (__m128i a, __m128i b, const int mode);

Intel C/C++ Compiler Intrinsics For Reading EFlag Results


int _mm_cmpistra (__m128i a, __m128i b, const int mode);
int _mm_cmpistrc (__m128i a, __m128i b, const int mode);
int _mm_cmpistro (__m128i a, __m128i b, const int mode);
int _mm_cmpistrs (__m128i a, __m128i b, const int mode);
int _mm_cmpistrz (__m128i a, __m128i b, const int mode);

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Exceptions Type 4; additionally, this instruction does not cause #GP if the memory operand is not aligned to 16 Byte boundary, and
#UD              If VEX.L = 1.
                 If VEX.vvvv != 1111B.
...

PINSRB/PINSRD/PINSRQ Insert Byte/Dword/Qword


Opcode/Instruction | Op/En | 64/32 bit Mode Support | CPUID Feature Flag | Description
66 0F 3A 20 /r ib PINSRB xmm1, r32/m8, imm8 | RMI | V/V | SSE4_1 | Insert a byte integer value from r32/m8 into xmm1 at the destination element in xmm1 specified by imm8.
66 0F 3A 22 /r ib PINSRD xmm1, r/m32, imm8 | RMI | V/V | SSE4_1 | Insert a dword integer value from r/m32 into the xmm1 at the destination element specified by imm8.
66 REX.W 0F 3A 22 /r ib PINSRQ xmm1, r/m64, imm8 | RMI | N.E./V | SSE4_1 | Insert a qword integer value from r/m64 into the xmm1 at the destination element specified by imm8.
VEX.NDS.128.66.0F3A.W0 20 /r ib VPINSRB xmm1, xmm2, r32/m8, imm8 | RVMI | V(1)/V | AVX | Merge a byte integer value from r32/m8 and rest from xmm2 into xmm1 at the byte offset in imm8.
VEX.NDS.128.66.0F3A.W0 22 /r ib VPINSRD xmm1, xmm2, r32/m32, imm8 | RVMI | V/V | AVX | Insert a dword integer value from r32/m32 and rest from xmm2 into xmm1 at the dword offset in imm8.
VEX.NDS.128.66.0F3A.W1 22 /r ib VPINSRQ xmm1, xmm2, r64/m64, imm8 | RVMI | V/I | AVX | Insert a qword integer value from r64/m64 and rest from xmm2 into xmm1 at the qword offset in imm8.

NOTES:
1. In 64-bit mode, VEX.W1 is ignored for VPINSRB (similar to legacy REX.W=1 prefix with PINSRB).

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
RMI | ModRM:reg (w) | ModRM:r/m (r) | imm8 | NA
RVMI | ModRM:reg (w) | VEX.vvvv (r) | ModRM:r/m (r) | imm8

Description
Copies a byte/dword/qword from the source operand (second operand) and inserts it in the destination operand (first operand) at the location specified with the count operand (third operand). (The other elements in the destination register are left untouched.) The source operand can be a general-purpose register or a memory location. (When the source operand is a general-purpose register, PINSRB copies the low byte of the register.) The destination operand is an XMM register. The count operand is an 8-bit immediate. When specifying a qword [dword, byte] location in an XMM register, the 1 [2, 4] least-significant bit(s) of the count operand specify the location.

In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15, R8-15). Use of REX.W permits the use of 64-bit general-purpose registers.

128-bit Legacy SSE version: Bits (VLMAX-1:128) of the corresponding YMM destination register remain unchanged.

VEX.128 encoded version: Bits (VLMAX-1:128) of the destination YMM register are zeroed. VEX.L must be 0, otherwise the instruction will #UD. An attempt to execute VPINSRQ in non-64-bit mode will cause #UD.

Operation
CASE OF
    PINSRB: SEL ← COUNT[3:0];
            MASK ← (0FFH << (SEL * 8));
            TEMP ← ((SRC[7:0] << (SEL * 8)) AND MASK);
    PINSRD: SEL ← COUNT[1:0];
            MASK ← (0FFFFFFFFH << (SEL * 32));
            TEMP ← ((SRC << (SEL * 32)) AND MASK);
    PINSRQ: SEL ← COUNT[0];
            MASK ← (0FFFFFFFFFFFFFFFFH << (SEL * 64));
            TEMP ← ((SRC << (SEL * 64)) AND MASK);
ESAC;
DEST ← ((DEST AND NOT MASK) OR TEMP);


VPINSRB (VEX.128 encoded version)
SEL ← imm8[3:0]
DEST[127:0] ← write_b_element(SEL, SRC2, SRC1)
DEST[VLMAX-1:128] ← 0

VPINSRD (VEX.128 encoded version)
SEL ← imm8[1:0]
DEST[127:0] ← write_d_element(SEL, SRC2, SRC1)
DEST[VLMAX-1:128] ← 0

VPINSRQ (VEX.128 encoded version)
SEL ← imm8[0]
DEST[127:0] ← write_q_element(SEL, SRC2, SRC1)
DEST[VLMAX-1:128] ← 0
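The element selection in the legacy forms can be illustrated with a small portable C model. This is a sketch under stated assumptions, not the compiler intrinsics: the 128-bit register is modeled as 16 bytes with the least significant byte first, and the type xmm_model and the three function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of a 128-bit register: 16 bytes, least significant first. */
typedef struct { uint8_t b[16]; } xmm_model;

/* Byte insert: only COUNT[3:0] selects the element. */
xmm_model pinsrb_model(xmm_model dest, uint32_t src, uint8_t count)
{
    unsigned sel = count & 0x0Fu;
    assert(sel < 16u);
    dest.b[sel] = (uint8_t)src;            /* low byte of the source register */
    return dest;
}

/* Dword insert: only COUNT[1:0] selects the element. */
xmm_model pinsrd_model(xmm_model dest, uint32_t src, uint8_t count)
{
    unsigned sel = count & 0x03u;
    for (unsigned i = 0; i < 4; i++)
        dest.b[sel * 4 + i] = (uint8_t)(src >> (8 * i));
    return dest;
}

/* Qword insert: only COUNT[0] selects the element. */
xmm_model pinsrq_model(xmm_model dest, uint64_t src, uint8_t count)
{
    unsigned sel = count & 0x01u;
    for (unsigned i = 0; i < 8; i++)
        dest.b[sel * 8 + i] = (uint8_t)(src >> (8 * i));
    return dest;
}
```

Each function overwrites only the selected element, so the remaining bytes of the destination keep their previous values, as the Description states for the legacy encodings.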

Intel C/C++ Compiler Intrinsic Equivalent


PINSRB: __m128i _mm_insert_epi8 (__m128i s1, int s2, const int ndx);
PINSRD: __m128i _mm_insert_epi32 (__m128i s2, int s, const int ndx);
PINSRQ: __m128i _mm_insert_epi64(__m128i s2, __int64 s, const int ndx);

Flags Affected
None.

SIMD Floating-Point Exceptions


None.

Other Exceptions
See Exceptions Type 5; additionally
#UD              If VEX.L = 1.
                 If VPINSRQ in non-64-bit mode with VEX.W=1.
...

POPCNT Return the Count of Number of Bits Set to 1


Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
F3 0F B8 /r | POPCNT r16, r/m16 | RM | Valid | Valid | POPCNT on r/m16
F3 0F B8 /r | POPCNT r32, r/m32 | RM | Valid | Valid | POPCNT on r/m32
F3 REX.W 0F B8 /r | POPCNT r64, r/m64 | RM | Valid | N.E. | POPCNT on r/m64

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
RM | ModRM:reg (w) | ModRM:r/m (r) | NA | NA


Description
This instruction calculates the number of bits set to 1 in the second operand (source) and returns the count in the first operand (a destination register).

Operation
Count = 0;
For (i=0; i < OperandSize; i++)
{   IF (SRC[i] = 1) // i'th bit
        THEN Count++; FI;
}
DEST ← Count;

Flags Affected
OF, SF, AF, CF, PF are all cleared. ZF is set if SRC = 0, otherwise ZF is cleared.
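As an illustration of the Operation and the ZF rule, the bit-counting loop can be written in portable C. This is a reference model only (real code would normally use the intrinsic or a compiler builtin); the names popcnt_result and popcnt_model are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical result type: the destination value plus the ZF outcome. */
typedef struct {
    unsigned count;   /* value written to the destination register */
    bool zf;          /* set when the source (within operand size) is zero */
} popcnt_result;

popcnt_result popcnt_model(uint64_t src, unsigned operand_size)
{
    assert(operand_size == 16 || operand_size == 32 || operand_size == 64);
    popcnt_result r = { 0, false };
    for (unsigned i = 0; i < operand_size; i++) {
        if ((src >> i) & 1u)          /* i'th bit of the source */
            r.count++;
    }
    r.zf = (r.count == 0);            /* a zero count means the source was zero */
    return r;
}
```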

Intel C/C++ Compiler Intrinsic Equivalent


POPCNT: int _mm_popcnt_u32(unsigned int a);
POPCNT: int64_t _mm_popcnt_u64(unsigned __int64 a);

Protected Mode Exceptions


#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS or GS segments.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  For a page fault.
#AC(0)           If an unaligned memory reference is made while the current privilege level is 3 and alignment checking is enabled.
#UD              If CPUID.01H:ECX.POPCNT [Bit 23] = 0.
                 If LOCK prefix is used.
                 Either the prefix REP (F3h) or REPN (F2H) is used.

Real-Address Mode Exceptions


#GP(0)           If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#UD              If CPUID.01H:ECX.POPCNT [Bit 23] = 0.
                 If LOCK prefix is used.
                 Either the prefix REP (F3h) or REPN (F2H) is used.

Virtual 8086 Mode Exceptions


#GP(0)           If any part of the operand lies outside of the effective address space from 0 to 0FFFFH.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  For a page fault.
#AC(0)           If an unaligned memory reference is made while alignment checking is enabled.
#UD              If CPUID.01H:ECX.POPCNT [Bit 23] = 0.
                 If LOCK prefix is used.
                 Either the prefix REP (F3h) or REPN (F2H) is used.

Compatibility Mode Exceptions


Same exceptions as in Protected Mode.

64-Bit Mode Exceptions


#GP(0)           If the memory address is in a non-canonical form.
#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#PF(fault-code)  For a page fault.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD              If CPUID.01H:ECX.POPCNT [Bit 23] = 0.
                 If LOCK prefix is used.
                 Either the prefix REP (F3h) or REPN (F2H) is used.
...

SBB Integer Subtraction with Borrow


Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
1C ib | SBB AL, imm8 | I | Valid | Valid | Subtract with borrow imm8 from AL.
1D iw | SBB AX, imm16 | I | Valid | Valid | Subtract with borrow imm16 from AX.
1D id | SBB EAX, imm32 | I | Valid | Valid | Subtract with borrow imm32 from EAX.
REX.W + 1D id | SBB RAX, imm32 | I | Valid | N.E. | Subtract with borrow sign-extended imm32 to 64-bits from RAX.
80 /3 ib | SBB r/m8, imm8 | MI | Valid | Valid | Subtract with borrow imm8 from r/m8.
REX + 80 /3 ib | SBB r/m8*, imm8 | MI | Valid | N.E. | Subtract with borrow imm8 from r/m8.
81 /3 iw | SBB r/m16, imm16 | MI | Valid | Valid | Subtract with borrow imm16 from r/m16.
81 /3 id | SBB r/m32, imm32 | MI | Valid | Valid | Subtract with borrow imm32 from r/m32.
REX.W + 81 /3 id | SBB r/m64, imm32 | MI | Valid | N.E. | Subtract with borrow sign-extended imm32 to 64-bits from r/m64.
83 /3 ib | SBB r/m16, imm8 | MI | Valid | Valid | Subtract with borrow sign-extended imm8 from r/m16.
83 /3 ib | SBB r/m32, imm8 | MI | Valid | Valid | Subtract with borrow sign-extended imm8 from r/m32.
REX.W + 83 /3 ib | SBB r/m64, imm8 | MI | Valid | N.E. | Subtract with borrow sign-extended imm8 from r/m64.
18 /r | SBB r/m8, r8 | MR | Valid | Valid | Subtract with borrow r8 from r/m8.
REX + 18 /r | SBB r/m8*, r8 | MR | Valid | N.E. | Subtract with borrow r8 from r/m8.
19 /r | SBB r/m16, r16 | MR | Valid | Valid | Subtract with borrow r16 from r/m16.
19 /r | SBB r/m32, r32 | MR | Valid | Valid | Subtract with borrow r32 from r/m32.
REX.W + 19 /r | SBB r/m64, r64 | MR | Valid | N.E. | Subtract with borrow r64 from r/m64.
1A /r | SBB r8, r/m8 | RM | Valid | Valid | Subtract with borrow r/m8 from r8.
REX + 1A /r | SBB r8*, r/m8* | RM | Valid | N.E. | Subtract with borrow r/m8 from r8.
1B /r | SBB r16, r/m16 | RM | Valid | Valid | Subtract with borrow r/m16 from r16.
1B /r | SBB r32, r/m32 | RM | Valid | Valid | Subtract with borrow r/m32 from r32.
REX.W + 1B /r | SBB r64, r/m64 | RM | Valid | N.E. | Subtract with borrow r/m64 from r64.

NOTES: * In 64-bit mode, r/m8 can not be encoded to access the following byte registers if a REX prefix is used: AH, BH, CH, DH.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
I | AL/AX/EAX/RAX | imm8/16/32 | NA | NA
MI | ModRM:r/m (w) | imm8/16/32 | NA | NA
MR | ModRM:r/m (w) | ModRM:reg (r) | NA | NA
RM | ModRM:reg (w) | ModRM:r/m (r) | NA | NA

Description
Adds the source operand (second operand) and the carry (CF) flag, and subtracts the result from the destination operand (first operand). The result of the subtraction is stored in the destination operand. The destination operand can be a register or a memory location; the source operand can be an immediate, a register, or a memory location. (However, two memory operands cannot be used in one instruction.) The state of the CF flag represents a borrow from a previous subtraction.

When an immediate value is used as an operand, it is sign-extended to the length of the destination operand format.

The SBB instruction does not distinguish between signed or unsigned operands. Instead, the processor evaluates the result for both data types and sets the OF and CF flags to indicate a borrow in the signed or unsigned result, respectively. The SF flag indicates the sign of the signed result.

The SBB instruction is usually executed as part of a multibyte or multiword subtraction in which a SUB instruction is followed by an SBB instruction.

This instruction can be used with a LOCK prefix to allow the instruction to be executed atomically.

In 64-bit mode, the instruction's default operation size is 32 bits. Using a REX prefix in the form of REX.R permits access to additional registers (R8-R15). Using a REX prefix in the form of REX.W promotes operation to 64 bits. See the summary chart at the beginning of this section for encoding data and limits.

Operation
DEST ← (DEST - (SRC + CF));
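The multiword use described above (a SUB on the lowest word followed by SBB on each higher word) can be sketched in portable C. This models the borrow chain only and does not use the compiler intrinsics; the function names sbb32_model and sub_multiword are hypothetical, and words are assumed to be stored least significant first.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One step on 32-bit words: *diff = a - (b + borrow_in).
   Returns the borrow out, i.e. the CF value after the step. */
unsigned sbb32_model(unsigned borrow_in, uint32_t a, uint32_t b, uint32_t *diff)
{
    assert(borrow_in <= 1u && diff != NULL);
    uint64_t wide = (uint64_t)a - (uint64_t)b - borrow_in;
    *diff = (uint32_t)wide;
    return (unsigned)((wide >> 32) & 1u);   /* 1 when the subtraction wrapped */
}

/* out = x - y over n words; returns the final borrow. */
unsigned sub_multiword(const uint32_t *x, const uint32_t *y, uint32_t *out, size_t n)
{
    unsigned borrow = 0;                    /* first step behaves like SUB */
    for (size_t i = 0; i < n; i++)
        borrow = sbb32_model(borrow, x[i], y[i], &out[i]);
    return borrow;
}
```

A final borrow of 1 indicates that, read as unsigned integers, y was larger than x.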

Intel C/C++ Compiler Intrinsic Equivalent


SBB: extern unsigned char _subborrow_u8(unsigned char c_in, unsigned char src1, unsigned char src2, unsigned char *diff_out);
SBB: extern unsigned char _subborrow_u16(unsigned char c_in, unsigned short src1, unsigned short src2, unsigned short *diff_out);
SBB: extern unsigned char _subborrow_u32(unsigned char c_in, unsigned int src1, unsigned int src2, unsigned int *diff_out);
SBB: extern unsigned char _subborrow_u64(unsigned char c_in, unsigned __int64 src1, unsigned __int64 src2, unsigned __int64 *diff_out);


Flags Affected
The OF, SF, ZF, AF, PF, and CF flags are set according to the result.

Protected Mode Exceptions


#GP(0)           If the destination is located in a non-writable segment.
                 If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
                 If the DS, ES, FS, or GS register contains a NULL segment selector.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD              If the LOCK prefix is used but the destination is not a memory operand.

Real-Address Mode Exceptions


#GP              If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS              If a memory operand effective address is outside the SS segment limit.
#UD              If the LOCK prefix is used but the destination is not a memory operand.

Virtual-8086 Mode Exceptions


#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made.
#UD              If the LOCK prefix is used but the destination is not a memory operand.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


#SS(0)           If a memory address referencing the SS segment is in a non-canonical form.
#GP(0)           If the memory address is in a non-canonical form.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD              If the LOCK prefix is used but the destination is not a memory operand.
...


SCAS/SCASB/SCASW/SCASD Scan String

Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
AE | SCAS m8 | NP | Valid | Valid | Compare AL with byte at ES:(E)DI or RDI, then set status flags.*
AF | SCAS m16 | NP | Valid | Valid | Compare AX with word at ES:(E)DI or RDI, then set status flags.*
AF | SCAS m32 | NP | Valid | Valid | Compare EAX with doubleword at ES:(E)DI or RDI, then set status flags.*
REX.W + AF | SCAS m64 | NP | Valid | N.E. | Compare RAX with quadword at RDI or EDI, then set status flags.
AE | SCASB | NP | Valid | Valid | Compare AL with byte at ES:(E)DI or RDI, then set status flags.*
AF | SCASW | NP | Valid | Valid | Compare AX with word at ES:(E)DI or RDI, then set status flags.*
AF | SCASD | NP | Valid | Valid | Compare EAX with doubleword at ES:(E)DI or RDI, then set status flags.*
REX.W + AF | SCASQ | NP | Valid | N.E. | Compare RAX with quadword at RDI or EDI, then set status flags.

NOTES: * In 64-bit mode, only 64-bit (RDI) and 32-bit (EDI) address sizes are supported. In non-64-bit mode, only 32-bit (EDI) and 16-bit (DI) address sizes are supported.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
NP | NA | NA | NA | NA

Description
In non-64-bit modes and in default 64-bit mode: this instruction compares a byte, word, doubleword or quadword specified using a memory operand with the value in AL, AX, or EAX. It then sets status flags in EFLAGS recording the results. The memory operand address is read from the ES:(E)DI register (depending on the address-size attribute of the instruction and the current operational mode). Note that ES cannot be overridden with a segment override prefix.

At the assembly-code level, two forms of this instruction are allowed: the explicit-operand form and the no-operands form. The explicit-operand form (specified using the SCAS mnemonic) allows a memory operand to be specified explicitly. The memory operand must be a symbol that indicates the size and location of the operand value. The register operand is then automatically selected to match the size of the memory operand (AL register for byte comparisons, AX for word comparisons, EAX for doubleword comparisons). The explicit-operand form is provided to allow documentation. Note that the documentation provided by this form can be misleading. That is, the memory operand symbol must specify the correct type (size) of the operand (byte, word, or doubleword) but it does not have to specify the correct location. The location is always specified by ES:(E)DI.

The no-operands form of the instruction uses a short form of SCAS. Again, ES:(E)DI is assumed to be the memory operand and AL, AX, or EAX is assumed to be the register operand. The size of operands is selected by the mnemonic: SCASB (byte comparison), SCASW (word comparison), or SCASD (doubleword comparison).

After the comparison, the (E)DI register is incremented or decremented automatically according to the setting of the DF flag in the EFLAGS register. If the DF flag is 0, the (E)DI register is incremented; if the DF flag is 1, the (E)DI register is decremented. The register is incremented or decremented by 1 for byte operations, by 2 for word operations, and by 4 for doubleword operations.

SCAS, SCASB, SCASW, SCASD, and SCASQ can be preceded by the REP prefix for block comparisons of ECX bytes, words, doublewords, or quadwords. Often, however, these instructions will be used in a LOOP construct that takes some action based on the setting of status flags. See "REP/REPE/REPZ/REPNE/REPNZ Repeat String Operation Prefix" in this chapter for a description of the REP prefix.

In 64-bit mode, the instruction's default address size is 64 bits; 32-bit address size is supported using the prefix 67H. Using a REX prefix in the form of REX.W promotes operation on doubleword operand to 64 bits. The 64-bit no-operand mnemonic is SCASQ. Address of the memory operand is specified in either RDI or EDI, and AL/AX/EAX/RAX may be used as the register operand. After a comparison, the destination register is incremented or decremented by the current operand size (depending on the value of the DF flag). See the summary chart at the beginning of this section for encoding data and limits.

Operation
Non-64-bit Mode:
IF (Byte comparison)
    THEN
        temp ← AL - SRC;
        SetStatusFlags(temp);
        IF DF = 0
            THEN (E)DI ← (E)DI + 1;
            ELSE (E)DI ← (E)DI - 1; FI;
    ELSE IF (Word comparison)
        THEN
            temp ← AX - SRC;
            SetStatusFlags(temp);
            IF DF = 0
                THEN (E)DI ← (E)DI + 2;
                ELSE (E)DI ← (E)DI - 2; FI;
        ELSE IF (Doubleword comparison)
            THEN
                temp ← EAX - SRC;
                SetStatusFlags(temp);
                IF DF = 0
                    THEN (E)DI ← (E)DI + 4;
                    ELSE (E)DI ← (E)DI - 4; FI;
        FI;
    FI;
FI;

64-bit Mode:
IF (Byte comparison)
    THEN
        temp ← AL - SRC;
        SetStatusFlags(temp);
        IF DF = 0
            THEN (R|E)DI ← (R|E)DI + 1;
            ELSE (R|E)DI ← (R|E)DI - 1; FI;
    ELSE IF (Word comparison)
        THEN
            temp ← AX - SRC;
            SetStatusFlags(temp);
            IF DF = 0
                THEN (R|E)DI ← (R|E)DI + 2;
                ELSE (R|E)DI ← (R|E)DI - 2; FI;
        ELSE IF (Doubleword comparison)
            THEN
                temp ← EAX - SRC;
                SetStatusFlags(temp);
                IF DF = 0
                    THEN (R|E)DI ← (R|E)DI + 4;
                    ELSE (R|E)DI ← (R|E)DI - 4; FI;
            ELSE IF (Quadword comparison using REX.W)
                THEN
                    temp ← RAX - SRC;
                    SetStatusFlags(temp);
                    IF DF = 0
                        THEN (R|E)DI ← (R|E)DI + 8;
                        ELSE (R|E)DI ← (R|E)DI - 8; FI;
            FI;
        FI;
    FI;
FI;
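A byte scan can be modeled in portable C to show how the comparison, the ZF result and the DF-controlled pointer update fit together. This is a sketch, not the instruction: (E)DI is modeled as an index into a buffer, the names scas_state, scasb_step and repne_scasb are hypothetical, and the repeat loop assumes the REPNE termination rule (stop when the count is exhausted or ZF becomes 1), which is described under the REP prefix rather than in this section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical state: di models (E)DI as a buffer index, zf models ZF. */
typedef struct {
    size_t di;
    bool zf;
} scas_state;

/* One SCASB step: compare AL with the byte at DI, then step DI by 1
   in the direction selected by DF. */
scas_state scasb_step(const uint8_t *mem, scas_state s, uint8_t al, bool df)
{
    assert(mem != NULL);
    s.zf = ((uint8_t)(al - mem[s.di]) == 0);   /* temp = AL - SRC */
    s.di = df ? s.di - 1 : s.di + 1;
    return s;
}

/* Repeated scan with DF = 0; returns the number of steps executed. */
size_t repne_scasb(const uint8_t *mem, size_t count, uint8_t al, scas_state *s)
{
    assert(s != NULL);
    size_t steps = 0;
    while (count != 0) {
        *s = scasb_step(mem, *s, al, false);
        count--;
        steps++;
        if (s->zf)
            break;                             /* match found */
    }
    return steps;
}
```

Scanning a null-terminated buffer for AL = 0 in this model executes one step more than the string length, because the terminating byte is itself compared.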

Flags Affected
The OF, SF, ZF, AF, PF, and CF flags are set according to the temporary result of the comparison.

Protected Mode Exceptions


#GP(0)           If a memory operand effective address is outside the limit of the ES segment.
                 If the ES register contains a NULL segment selector.
                 If an illegal memory operand effective address in the ES segment is given.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD              If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP              If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS              If a memory operand effective address is outside the SS segment limit.
#UD              If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0)           If a memory operand effective address is outside the CS, DS, ES, FS, or GS segment limit.
#SS(0)           If a memory operand effective address is outside the SS segment limit.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made.
#UD              If the LOCK prefix is used.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


#GP(0)           If the memory address is in a non-canonical form.
#PF(fault-code)  If a page fault occurs.
#AC(0)           If alignment checking is enabled and an unaligned memory reference is made while the current privilege level is 3.
#UD              If the LOCK prefix is used.
...

SWAPGS Swap GS Base Register


Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 01 F8 | SWAPGS | NP | Valid | Invalid | Exchanges the current GS base register value with the value contained in MSR address C0000102H.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
NP | NA | NA | NA | NA

Description
SWAPGS exchanges the current GS base register value with the value contained in MSR address C0000102H (IA32_KERNEL_GS_BASE). The SWAPGS instruction is a privileged instruction intended for use by system software.

When using SYSCALL to implement system calls, there is no kernel stack at the OS entry point. Neither is there a straightforward method to obtain a pointer to kernel structures from which the kernel stack pointer could be read. Thus, the kernel cannot save general purpose registers or reference memory.

By design, SWAPGS does not require any general purpose registers or memory operands. No registers need to be saved before using the instruction. SWAPGS exchanges the CPL 0 data pointer from the IA32_KERNEL_GS_BASE MSR with the GS base register. The kernel can then use the GS prefix on normal memory references to access kernel data structures. Similarly, when the OS kernel is entered using an interrupt or exception (where the kernel stack is already set up), SWAPGS can be used to quickly get a pointer to the kernel data structures.

The IA32_KERNEL_GS_BASE MSR itself is only accessible using RDMSR/WRMSR instructions. Those instructions are only accessible at privilege level 0. The WRMSR instruction ensures that the IA32_KERNEL_GS_BASE MSR contains a canonical address.


Operation
IF CS.L ≠ 1 (* Not in 64-Bit Mode *)
    THEN #UD; FI;
IF CPL ≠ 0
    THEN #GP(0); FI;
tmp ← GS.base;
GS.base ← IA32_KERNEL_GS_BASE;
IA32_KERNEL_GS_BASE ← tmp;

Flags Affected
None

Protected Mode Exceptions


#UD              If Mode ≠ 64-Bit.

Real-Address Mode Exceptions

#UD              If Mode ≠ 64-Bit.

Virtual-8086 Mode Exceptions

#UD              If Mode ≠ 64-Bit.

Compatibility Mode Exceptions

#UD              If Mode ≠ 64-Bit.

64-Bit Mode Exceptions

#GP(0)           If CPL ≠ 0.
                 If the LOCK prefix is used.
...

SYSCALL Fast System Call


Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 05 | SYSCALL | NP | Valid | Invalid | Fast call to privilege level 0 system procedures.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
NP | NA | NA | NA | NA


Description
SYSCALL invokes an OS system-call handler at privilege level 0. It does so by loading RIP from the IA32_LSTAR MSR (after saving the address of the instruction following SYSCALL into RCX). (The WRMSR instruction ensures that the IA32_LSTAR MSR always contains a canonical address.)

SYSCALL also saves RFLAGS into R11 and then masks RFLAGS using the IA32_FMASK MSR (MSR address C0000084H); specifically, the processor clears in RFLAGS every bit corresponding to a bit that is set in the IA32_FMASK MSR.

SYSCALL loads the CS and SS selectors with values derived from bits 47:32 of the IA32_STAR MSR. However, the CS and SS descriptor caches are not loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSCALL instruction does not ensure this correspondence.

The SYSCALL instruction does not save the stack pointer (RSP). If the OS system-call handler will change the stack pointer, it is the responsibility of software to save the previous value of the stack pointer. This might be done prior to executing SYSCALL, with software restoring the stack pointer with the instruction following SYSCALL (which will be executed after SYSRET). Alternatively, the OS system-call handler may save the stack pointer and restore it before executing SYSRET.

Operation
IF (CS.L ≠ 1) or (IA32_EFER.LMA ≠ 1) or (IA32_EFER.SCE ≠ 1)
(* Not in 64-Bit Mode or SYSCALL/SYSRET not enabled in IA32_EFER *)
    THEN #UD;
FI;
RCX ← RIP; (* Will contain address of next instruction *)
RIP ← IA32_LSTAR;
R11 ← RFLAGS;
RFLAGS ← RFLAGS AND NOT(IA32_FMASK);
CS.Selector ← IA32_STAR[47:32] AND FFFCH (* Operating system provides CS; RPL forced to 0 *)
(* Set rest of CS to a fixed value *)
CS.Base ← 0; (* Flat segment *)
CS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
CS.Type ← 11; (* Execute/read code, accessed *)
CS.S ← 1;
CS.DPL ← 0;
CS.P ← 1;
CS.L ← 1; (* Entry is to 64-bit mode *)
CS.D ← 0; (* Required if CS.L = 1 *)
CS.G ← 1; (* 4-KByte granularity *)
CPL ← 0;
SS.Selector ← IA32_STAR[47:32] + 8; (* SS just above CS *)
(* Set rest of SS to a fixed value *)
SS.Base ← 0; (* Flat segment *)
SS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
SS.Type ← 3; (* Read/write data, accessed *)
SS.S ← 1;
SS.DPL ← 0;
SS.P ← 1;
SS.B ← 1; (* 32-bit stack segment *)
SS.G ← 1; (* 4-KByte granularity *)
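The register and selector updates in the Operation section can be traced with a small C model. This is a sketch of the arithmetic only (no privilege transition takes place); the struct syscall_entry and the function syscall_entry_model are hypothetical names, and the MSR values are passed in as plain integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical snapshot of the values SYSCALL derives on entry. */
typedef struct {
    uint16_t cs_selector;
    uint16_t ss_selector;
    uint64_t rflags;        /* RFLAGS after masking with IA32_FMASK */
    uint64_t r11;           /* saved copy of the original RFLAGS */
} syscall_entry;

syscall_entry syscall_entry_model(uint64_t ia32_star, uint64_t ia32_fmask,
                                  uint64_t rflags)
{
    syscall_entry e;
    uint16_t star_47_32 = (uint16_t)(ia32_star >> 32);

    e.cs_selector = (uint16_t)(star_47_32 & 0xFFFCu);   /* RPL forced to 0 */
    e.ss_selector = (uint16_t)(star_47_32 + 8u);        /* SS just above CS */
    e.r11 = rflags;                                      /* R11 gets RFLAGS */
    e.rflags = rflags & ~ia32_fmask;                     /* clear masked bits */

    assert((e.cs_selector & 3u) == 0);
    return e;
}
```

For example, with IA32_FMASK = 200H the IF bit is cleared on entry while R11 still holds the caller's original flags.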

Flags Affected
All.

Protected Mode Exceptions


#UD The SYSCALL instruction is not recognized in protected mode.

Real-Address Mode Exceptions


#UD The SYSCALL instruction is not recognized in real-address mode.

Virtual-8086 Mode Exceptions


#UD The SYSCALL instruction is not recognized in virtual-8086 mode.

Compatibility Mode Exceptions


#UD The SYSCALL instruction is not recognized in compatibility mode.

64-Bit Mode Exceptions


#UD              If IA32_EFER.SCE = 0.
                 If the LOCK prefix is used.
...

SYSENTER Fast System Call


Opcode | Instruction | Op/En | 64-Bit Mode | Compat/Leg Mode | Description
0F 34 | SYSENTER | NP | Valid | Valid | Fast call to privilege level 0 system procedures.

Instruction Operand Encoding


Op/En | Operand 1 | Operand 2 | Operand 3 | Operand 4
NP | NA | NA | NA | NA

Description
Executes a fast call to a level 0 system procedure or routine. SYSENTER is a companion instruction to SYSEXIT. The instruction is optimized to provide the maximum performance for system calls from user code running at privilege level 3 to operating system or executive procedures running at privilege level 0. When executed in IA-32e mode, the SYSENTER instruction transitions the logical processor to 64-bit mode; otherwise, the logical processor remains in protected mode. Prior to executing the SYSENTER instruction, software must specify the privilege level 0 code segment and code entry point, and the privilege level 0 stack segment and stack pointer by writing values to the following MSRs:


• IA32_SYSENTER_CS (MSR address 174H): The lower 16 bits of this MSR are the segment selector for the privilege level 0 code segment. This value is also used to determine the segment selector of the privilege level 0 stack segment (see the Operation section). This value cannot indicate a null selector.
• IA32_SYSENTER_EIP (MSR address 175H): The value of this MSR is loaded into RIP (thus, this value references the first instruction of the selected operating procedure or routine). In protected mode, only bits 31:0 are loaded.
• IA32_SYSENTER_ESP (MSR address 176H): The value of this MSR is loaded into RSP (thus, this value contains the stack pointer for the privilege level 0 stack). This value cannot represent a non-canonical address. In protected mode, only bits 31:0 are loaded.

These MSRs can be read from and written to using RDMSR/WRMSR. The WRMSR instruction ensures that the IA32_SYSENTER_EIP and IA32_SYSENTER_ESP MSRs always contain canonical addresses.

While SYSENTER loads the CS and SS selectors with values derived from the IA32_SYSENTER_CS MSR, the CS and SS descriptor caches are not loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSENTER instruction does not ensure this correspondence.

The SYSENTER instruction can be invoked from all operating modes except real-address mode.

The SYSENTER and SYSEXIT instructions are companion instructions, but they do not constitute a call/return pair. When executing a SYSENTER instruction, the processor does not save state information for the user code (e.g., the instruction pointer), and neither the SYSENTER nor the SYSEXIT instruction supports passing parameters on the stack.

To use the SYSENTER and SYSEXIT instructions as companion instructions for transitions between privilege level 3 code and privilege level 0 operating system procedures, the following conventions must be followed:
• The segment descriptors for the privilege level 0 code and stack segments and for the privilege level 3 code and stack segments must be contiguous in a descriptor table. This convention allows the processor to compute the segment selectors from the value entered in the IA32_SYSENTER_CS MSR.
• The fast system call stub routines executed by user code (typically in shared libraries or DLLs) must save the required return IP and processor state information if a return to the calling procedure is required. Likewise, the operating system or executive procedures called with SYSENTER instructions must have access to and use this saved return and state information when returning to the user code.

The SYSENTER and SYSEXIT instructions were introduced into the IA-32 architecture in the Pentium II processor. The availability of these instructions on a processor is indicated with the SYSENTER/SYSEXIT present (SEP) feature flag returned to the EDX register by the CPUID instruction. An operating system that qualifies the SEP flag must also qualify the processor family and model to ensure that the SYSENTER/SYSEXIT instructions are actually present. For example:

IF CPUID SEP bit is set
    THEN IF (Family = 6) and (Model < 3) and (Stepping < 3)
        THEN SYSENTER/SYSEXIT_Not_Supported;
        ELSE SYSENTER/SYSEXIT_Supported;
    FI;
FI;

When the CPUID instruction is executed on the Pentium Pro processor (model 1), the processor returns the SEP flag as set, but does not support the SYSENTER/SYSEXIT instructions.
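The qualification check above can be expressed as a small C predicate. This is a sketch that takes already-decoded CPUID fields as arguments rather than executing CPUID; the function name sysenter_supported is hypothetical, and treating a clear SEP flag as "not supported" is an assumption that the pseudo-code leaves implicit.

```c
#include <assert.h>
#include <stdbool.h>

/* sep: CPUID SEP feature flag; family, model, stepping: decoded
   CPUID signature fields (hypothetical parameter names). */
bool sysenter_supported(bool sep, unsigned family, unsigned model,
                        unsigned stepping)
{
    assert(family <= 0xFFFu);
    if (!sep)
        return false;                 /* assumption: no SEP flag, no support */
    if (family == 6 && model < 3 && stepping < 3)
        return false;                 /* SEP reported but instructions absent */
    return true;
}
```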


Operation
IF CR0.PE = 0 OR IA32_SYSENTER_CS[15:2] = 0 THEN #GP(0); FI;
RFLAGS.VM ← 0; (* Ensures protected mode execution *)
RFLAGS.IF ← 0; (* Mask interrupts *)
IF in IA-32e mode
    THEN
        RSP ← IA32_SYSENTER_ESP;
        RIP ← IA32_SYSENTER_EIP;
    ELSE
        ESP ← IA32_SYSENTER_ESP[31:0];
        EIP ← IA32_SYSENTER_EIP[31:0];
FI;
CS.Selector ← IA32_SYSENTER_CS[15:0] AND FFFCH; (* Operating system provides CS; RPL forced to 0 *)
(* Set rest of CS to a fixed value *)
CS.Base ← 0; (* Flat segment *)
CS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
CS.Type ← 11; (* Execute/read code, accessed *)
CS.S ← 1;
CS.DPL ← 0;
CS.P ← 1;
IF in IA-32e mode
    THEN
        CS.L ← 1; (* Entry is to 64-bit mode *)
        CS.D ← 0; (* Required if CS.L = 1 *)
    ELSE
        CS.L ← 0;
        CS.D ← 1; (* 32-bit code segment *)
FI;
CS.G ← 1; (* 4-KByte granularity *)
CPL ← 0;
SS.Selector ← CS.Selector + 8; (* SS just above CS *)
(* Set rest of SS to a fixed value *)
SS.Base ← 0; (* Flat segment *)
SS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
SS.Type ← 3; (* Read/write data, accessed *)
SS.S ← 1;
SS.DPL ← 0;
SS.P ← 1;
SS.B ← 1; (* 32-bit stack segment *)
SS.G ← 1; (* 4-KByte granularity *)
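As a rough illustration of the selector arithmetic in the SYSENTER operation, the following Python sketch derives the CS and SS selectors from a given IA32_SYSENTER_CS value. The function name and the use of a Python exception to stand in for #GP(0) are assumptions made for the example; the CR0.PE check is omitted.

```python
def sysenter_selectors(sysenter_cs):
    """Model the CS/SS selectors loaded by SYSENTER (hypothetical helper)."""
    low16 = sysenter_cs & 0xFFFF
    if (low16 >> 2) == 0:
        # IA32_SYSENTER_CS[15:2] = 0 raises #GP(0) in the real instruction.
        raise ValueError("#GP(0): IA32_SYSENTER_CS[15:2] = 0")
    cs = low16 & 0xFFFC  # RPL forced to 0
    ss = cs + 8          # SS just above CS
    return cs, ss
```

For example, an MSR value of 10H yields CS = 10H and SS = 18H, so the operating system would place its ring 0 code and stack descriptors in consecutive GDT slots.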

Flags Affected
VM, IF (see Operation above)

Protected Mode Exceptions


#GP(0) If IA32_SYSENTER_CS[15:2] = 0.


#UD If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP The SYSENTER instruction is not recognized in real-address mode.
#UD If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


Same exceptions as in protected mode.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


Same exceptions as in protected mode. ...

SYSEXIT - Fast Return from Fast System Call


Opcode          Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
0F 35           SYSEXIT       NP      Valid         Valid             Fast return to privilege level 3 user code.
REX.W + 0F 35   SYSEXIT       NP      Valid         Valid             Fast return to 64-bit mode privilege level 3 user code.

Instruction Operand Encoding


Op/En   Operand 1   Operand 2   Operand 3   Operand 4
NP      NA          NA          NA          NA

Description
Executes a fast return to privilege level 3 user code. SYSEXIT is a companion instruction to the SYSENTER instruction. The instruction is optimized to provide the maximum performance for returns from system procedures executing at protection level 0 to user procedures executing at protection level 3. It must be executed from code executing at privilege level 0.

With a 64-bit operand size, SYSEXIT remains in 64-bit mode; otherwise, it either enters compatibility mode (if the logical processor is in IA-32e mode) or remains in protected mode (if it is not).

Prior to executing SYSEXIT, software must specify the privilege level 3 code segment and code entry point, and the privilege level 3 stack segment and stack pointer by writing values into the following MSR and general-purpose registers:

• IA32_SYSENTER_CS (MSR address 174H): Contains a 32-bit value that is used to determine the segment selectors for the privilege level 3 code and stack segments (see the Operation section).
• RDX: The canonical address in this register is loaded into RIP (thus, this value references the first instruction to be executed in the user code). If the return is not to 64-bit mode, only bits 31:0 are loaded.
• RCX: The canonical address in this register is loaded into RSP (thus, this value contains the stack pointer for the privilege level 3 stack). If the return is not to 64-bit mode, only bits 31:0 are loaded.

The IA32_SYSENTER_CS MSR can be read from and written to using RDMSR and WRMSR.


While SYSEXIT loads the CS and SS selectors with values derived from the IA32_SYSENTER_CS MSR, the CS and SS descriptor caches are not loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSEXIT instruction does not ensure this correspondence.

The SYSEXIT instruction can be invoked from all operating modes except real-address mode and virtual-8086 mode.

The SYSENTER and SYSEXIT instructions were introduced into the IA-32 architecture in the Pentium II processor. The availability of these instructions on a processor is indicated with the SYSENTER/SYSEXIT present (SEP) feature flag returned to the EDX register by the CPUID instruction. An operating system that qualifies the SEP flag must also qualify the processor family and model to ensure that the SYSENTER/SYSEXIT instructions are actually present. For example:

IF CPUID SEP bit is set
    THEN IF (Family = 6) and (Model < 3) and (Stepping < 3)
        THEN
            SYSENTER/SYSEXIT_Not_Supported; FI;
        ELSE
            SYSENTER/SYSEXIT_Supported; FI;
FI;

When the CPUID instruction is executed on the Pentium Pro processor (model 1), the processor returns the SEP flag as set, but does not support the SYSENTER/SYSEXIT instructions.

Operation
IF IA32_SYSENTER_CS[15:2] = 0 OR CR0.PE = 0 OR CPL ≠ 0 THEN #GP(0); FI;
IF operand size is 64-bit
    THEN (* Return to 64-bit mode *)
        RSP ← RCX;
        RIP ← RDX;
    ELSE (* Return to protected mode or compatibility mode *)
        RSP ← ECX;
        RIP ← EDX;
FI;
IF operand size is 64-bit (* Operating system provides CS; RPL forced to 3 *)
    THEN CS.Selector ← IA32_SYSENTER_CS[15:0] + 32;
    ELSE CS.Selector ← IA32_SYSENTER_CS[15:0] + 16;
FI;
CS.Selector ← CS.Selector OR 3; (* RPL forced to 3 *)
(* Set rest of CS to a fixed value *)
CS.Base ← 0; (* Flat segment *)
CS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
CS.Type ← 11; (* Execute/read code, accessed *)
CS.S ← 1;
CS.DPL ← 3;
CS.P ← 1;
IF operand size is 64-bit
    THEN (* return to 64-bit mode *)
        CS.L ← 1; (* 64-bit code segment *)
        CS.D ← 0; (* Required if CS.L = 1 *)
    ELSE (* return to protected mode or compatibility mode *)
        CS.L ← 0;
        CS.D ← 1; (* 32-bit code segment *)
FI;
CS.G ← 1; (* 4-KByte granularity *)
CPL ← 3;
SS.Selector ← CS.Selector + 8; (* SS just above CS *)
(* Set rest of SS to a fixed value *)
SS.Base ← 0; (* Flat segment *)
SS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
SS.Type ← 3; (* Read/write data, accessed *)
SS.S ← 1;
SS.DPL ← 3;
SS.P ← 1;
SS.B ← 1; (* 32-bit stack segment *)
SS.G ← 1; (* 4-KByte granularity *)
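The selector computation in the SYSEXIT operation can be sketched the same way. The helper below is hypothetical and models only the arithmetic on IA32_SYSENTER_CS; the CPL and CR0.PE checks and the descriptor-cache loads are left out.

```python
def sysexit_selectors(sysenter_cs, operand_size_64):
    """Model the CS/SS selectors loaded by SYSEXIT (hypothetical helper)."""
    low16 = sysenter_cs & 0xFFFF
    if (low16 >> 2) == 0:
        # IA32_SYSENTER_CS[15:2] = 0 raises #GP(0) in the real instruction.
        raise ValueError("#GP(0): IA32_SYSENTER_CS[15:2] = 0")
    offset = 32 if operand_size_64 else 16
    cs = (low16 + offset) | 3  # RPL forced to 3
    ss = cs + 8                # SS just above CS
    return cs, ss
```

With IA32_SYSENTER_CS = 10H, a 32-bit return uses CS = 23H and SS = 2BH, while a 64-bit return uses CS = 33H and SS = 3BH. The GDT therefore needs the user descriptors laid out at those fixed offsets from the kernel code segment.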

Flags Affected
None.

Protected Mode Exceptions


#GP(0) If IA32_SYSENTER_CS[15:2] = 0.
       If CPL ≠ 0.
#UD    If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP The SYSEXIT instruction is not recognized in real-address mode.
#UD If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0) The SYSEXIT instruction is not recognized in virtual-8086 mode.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


#GP(0) If IA32_SYSENTER_CS = 0.
       If CPL ≠ 0.
       If RCX or RDX contains a non-canonical address.
#UD    If the LOCK prefix is used.
...


SYSRET - Return From Fast System Call


Opcode          Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
0F 07           SYSRET        NP      Valid         Invalid           Return to compatibility mode from fast system call
REX.W + 0F 07   SYSRET        NP      Valid         Invalid           Return to 64-bit mode from fast system call

Instruction Operand Encoding


Op/En   Operand 1   Operand 2   Operand 3   Operand 4
NP      NA          NA          NA          NA

Description
SYSRET is a companion instruction to the SYSCALL instruction. It returns from an OS system-call handler to user code at privilege level 3. It does so by loading RIP from RCX and loading RFLAGS from R11 (see footnote 1). With a 64-bit operand size, SYSRET remains in 64-bit mode; otherwise, it enters compatibility mode and only the low 32 bits of the registers are loaded.

SYSRET loads the CS and SS selectors with values derived from bits 63:48 of the IA32_STAR MSR. However, the CS and SS descriptor caches are not loaded from the descriptors (in GDT or LDT) referenced by those selectors. Instead, the descriptor caches are loaded with fixed values. See the Operation section for details. It is the responsibility of OS software to ensure that the descriptors (in GDT or LDT) referenced by those selector values correspond to the fixed values loaded into the descriptor caches; the SYSRET instruction does not ensure this correspondence.

The SYSRET instruction does not modify the stack pointer (ESP or RSP). For that reason, it is necessary for software to switch to the user stack. The OS may load the user stack pointer (if it was saved after SYSCALL) before executing SYSRET; alternatively, user code may load the stack pointer (if it was saved before SYSCALL) after receiving control from SYSRET.

If the OS loads the stack pointer before executing SYSRET, it must ensure that the handler of any interrupt or exception delivered between restoring the stack pointer and successful execution of SYSRET is not invoked with the user stack. It can do so using approaches such as the following:

• External interrupts. The OS can prevent an external interrupt from being delivered by clearing EFLAGS.IF before loading the user stack pointer.
• Nonmaskable interrupts (NMIs). The OS can ensure that the NMI handler is invoked with the correct stack by using the interrupt stack table (IST) mechanism for gate 2 (NMI) in the IDT (see Section 6.14.5, Interrupt Stack Table, in Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A).
• General-protection exceptions (#GP). The SYSRET instruction generates #GP(0) if the value of RCX is not canonical. The OS can address this possibility using one or more of the following approaches:
    - Confirming that the value of RCX is canonical before executing SYSRET.
    - Using paging to ensure that the SYSCALL instruction will never save a non-canonical value into RCX.
    - Using the IST mechanism for gate 13 (#GP) in the IDT.

1. Regardless of the value of R11, the RF and VM flags are always 0 in RFLAGS after execution of SYSRET. In addition, all reserved bits in RFLAGS retain the fixed values.
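One of the approaches listed above is to confirm that RCX is canonical before executing SYSRET. A minimal Python sketch of such a check follows. It assumes a 48-bit linear-address width, which is implementation-specific, and the function name is hypothetical.

```python
def is_canonical(addr, va_bits=48):
    """Check that bits 63:(va_bits-1) of a 64-bit address are all equal.

    va_bits=48 is an assumption; the implemented linear-address width
    varies by processor.
    """
    addr &= (1 << 64) - 1
    upper = addr >> (va_bits - 1)               # bits 63:(va_bits-1)
    all_ones = (1 << (64 - va_bits + 1)) - 1
    return upper == 0 or upper == all_ones
```

An OS would run the equivalent test on the saved user RIP and fall back to a slower return path (for example, IRET) when the test fails, so that the #GP(0) is never raised on the user stack.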


Operation
IF (CS.L ≠ 1) or (IA32_EFER.LMA ≠ 1) or (IA32_EFER.SCE ≠ 1)
(* Not in 64-Bit Mode or SYSCALL/SYSRET not enabled in IA32_EFER *)
    THEN #UD; FI;
IF (CPL ≠ 0) OR (RCX is not canonical) THEN #GP(0); FI;
IF (operand size is 64-bit)
    THEN (* Return to 64-Bit Mode *)
        RIP ← RCX;
    ELSE (* Return to Compatibility Mode *)
        RIP ← ECX;
FI;
RFLAGS ← (R11 & 3C7FD7H) | 2; (* Clear RF, VM, reserved bits; set bit 1 *)
IF (operand size is 64-bit)
    THEN CS.Selector ← IA32_STAR[63:48]+16;
    ELSE CS.Selector ← IA32_STAR[63:48];
FI;
CS.Selector ← CS.Selector OR 3; (* RPL forced to 3 *)
(* Set rest of CS to a fixed value *)
CS.Base ← 0; (* Flat segment *)
CS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
CS.Type ← 11; (* Execute/read code, accessed *)
CS.S ← 1;
CS.DPL ← 3;
CS.P ← 1;
IF (operand size is 64-bit)
    THEN (* Return to 64-Bit Mode *)
        CS.L ← 1; (* 64-bit code segment *)
        CS.D ← 0; (* Required if CS.L = 1 *)
    ELSE (* Return to Compatibility Mode *)
        CS.L ← 0; (* Compatibility mode *)
        CS.D ← 1; (* 32-bit code segment *)
FI;
CS.G ← 1; (* 4-KByte granularity *)
CPL ← 3;
SS.Selector ← (IA32_STAR[63:48]+8) OR 3; (* RPL forced to 3 *)
(* Set rest of SS to a fixed value *)
SS.Base ← 0; (* Flat segment *)
SS.Limit ← FFFFFH; (* With 4-KByte granularity, implies a 4-GByte limit *)
SS.Type ← 3; (* Read/write data, accessed *)
SS.S ← 1;
SS.DPL ← 3;
SS.P ← 1;
SS.B ← 1; (* 32-bit stack segment *)
SS.G ← 1; (* 4-KByte granularity *)
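Two pieces of the SYSRET operation are plain bit arithmetic and can be sketched in Python: the RFLAGS value derived from R11 and the selectors derived from IA32_STAR[63:48]. Both function names are hypothetical, and the sketch models only the values, not the privilege checks.

```python
RFLAGS_MASK = 0x3C7FD7  # mask applied to R11 by the operation pseudocode


def sysret_rflags(r11):
    """RFLAGS value after SYSRET: RF, VM and reserved bits cleared."""
    # OR with 2 keeps the always-one reserved bit (bit 1) set.
    return (r11 & RFLAGS_MASK) | 2


def sysret_selectors(ia32_star, operand_size_64):
    """CS/SS selectors loaded by SYSRET from IA32_STAR[63:48]."""
    base = (ia32_star >> 48) & 0xFFFF
    cs = (base + 16 if operand_size_64 else base) | 3  # RPL forced to 3
    ss = (base + 8) | 3                                # RPL forced to 3
    return cs, ss
```

For instance, with IA32_STAR[63:48] = 23H the 64-bit return uses CS = 33H and SS = 2BH, and the compatibility-mode return uses CS = 23H with the same SS.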

Flags Affected
All.


Protected Mode Exceptions


#UD The SYSRET instruction is not recognized in protected mode.

Real-Address Mode Exceptions


#UD The SYSRET instruction is not recognized in real-address mode.

Virtual-8086 Mode Exceptions


#UD The SYSRET instruction is not recognized in virtual-8086 mode.

Compatibility Mode Exceptions


#UD The SYSRET instruction is not recognized in compatibility mode.

64-Bit Mode Exceptions


#UD    If IA32_EFER.SCE = 0.
       If the LOCK prefix is used.
#GP(0) If CPL ≠ 0.
       If RCX contains a non-canonical address.
...

UCOMISD - Unordered Compare Scalar Double-Precision Floating-Point Values and Set EFLAGS


Opcode/Instruction                                Op/En   64/32 bit Mode Support   CPUID Feature Flag   Description
66 0F 2E /r UCOMISD xmm1, xmm2/m64                RM      V/V                      SSE2                 Compares (unordered) the low double-precision floating-point values in xmm1 and xmm2/m64 and set the EFLAGS accordingly.
VEX.LIG.66.0F.WIG 2E /r VUCOMISD xmm1, xmm2/m64   RM      V/V                      AVX                  Compare low double precision floating-point values in xmm1 and xmm2/mem64 and set the EFLAGS flags accordingly.

Instruction Operand Encoding


Op/En   Operand 1       Operand 2       Operand 3   Operand 4
RM      ModRM:reg (r)   ModRM:r/m (r)   NA          NA

Description
Performs an unordered compare of the double-precision floating-point values in the low quadwords of source operand 1 (first operand) and source operand 2 (second operand), and sets the ZF, PF, and CF flags in the EFLAGS register according to the result (unordered, greater than, less than, or equal). The OF, SF and AF flags in the EFLAGS register are set to 0. The unordered result is returned if either source operand is a NaN (QNaN or SNaN).

Source operand 1 is an XMM register; source operand 2 can be an XMM register or a 64-bit memory location.

The UCOMISD instruction differs from the COMISD instruction in that it signals a SIMD floating-point invalid operation exception (#I) only when a source operand is an SNaN. The COMISD instruction signals an invalid operation exception if a source operand is either a QNaN or an SNaN.

The EFLAGS register is not updated if an unmasked SIMD floating-point exception is generated.


In 64-bit mode, using a REX prefix in the form of REX.R permits this instruction to access additional registers (XMM8-XMM15).

Note: In VEX-encoded versions, VEX.vvvv is reserved and must be 1111b; otherwise, the instruction will #UD.

Operation
RESULT ← UnorderedCompare(SRC1[63:0] <> SRC2[63:0]) {
(* Set EFLAGS *)
CASE (RESULT) OF
    UNORDERED:    ZF, PF, CF ← 111;
    GREATER_THAN: ZF, PF, CF ← 000;
    LESS_THAN:    ZF, PF, CF ← 001;
    EQUAL:        ZF, PF, CF ← 100;
ESAC;
OF, AF, SF ← 0; }
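The flag encoding in the CASE statement above can be modeled with ordinary Python floats, which are IEEE 754 double-precision values. This is a sketch of the resulting (ZF, PF, CF) triple only; it does not model exception signaling for SNaN operands, and the function name is hypothetical.

```python
import math


def ucomisd_flags(src1, src2):
    """Return the (ZF, PF, CF) triple UCOMISD would produce."""
    if math.isnan(src1) or math.isnan(src2):
        return (1, 1, 1)  # UNORDERED
    if src1 > src2:
        return (0, 0, 0)  # GREATER_THAN
    if src1 < src2:
        return (0, 0, 1)  # LESS_THAN
    return (1, 0, 0)      # EQUAL
```

Because the unordered case sets all three flags, code that tests only ZF or CF after the compare will treat NaN operands as "equal" or "below" unless it also checks PF.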

Intel C/C++ Compiler Intrinsic Equivalent


int _mm_ucomieq_sd(__m128d a, __m128d b)
int _mm_ucomilt_sd(__m128d a, __m128d b)
int _mm_ucomile_sd(__m128d a, __m128d b)
int _mm_ucomigt_sd(__m128d a, __m128d b)
int _mm_ucomige_sd(__m128d a, __m128d b)
int _mm_ucomineq_sd(__m128d a, __m128d b)

SIMD Floating-Point Exceptions


Invalid (if SNaN operands), Denormal.

Other Exceptions
See Exceptions Type 3; additionally
#UD If VEX.vvvv != 1111B.
...

WRMSR - Write to Model Specific Register


Opcode   Instruction   Op/En   64-Bit Mode   Compat/Leg Mode   Description
0F 30    WRMSR         NP      Valid         Valid             Write the value in EDX:EAX to MSR specified by ECX.

Instruction Operand Encoding


Op/En   Operand 1   Operand 2   Operand 3   Operand 4
NP      NA          NA          NA          NA


Description
Writes the contents of registers EDX:EAX into the 64-bit model specific register (MSR) specified in the ECX register. (On processors that support the Intel 64 architecture, the high-order 32 bits of RCX are ignored.) The contents of the EDX register are copied to the high-order 32 bits of the selected MSR and the contents of the EAX register are copied to the low-order 32 bits of the MSR. (On processors that support the Intel 64 architecture, the high-order 32 bits of each of RAX and RDX are ignored.) Undefined or reserved bits in an MSR should be set to values previously read.

This instruction must be executed at privilege level 0 or in real-address mode; otherwise, a general protection exception #GP(0) is generated. Specifying a reserved or unimplemented MSR address in ECX will also cause a general protection exception. The processor will also generate a general protection exception if software attempts to write to bits in a reserved MSR.

When the WRMSR instruction is used to write to an MTRR, the TLBs are invalidated. This includes global entries (see Translation Lookaside Buffers (TLBs) in Chapter 3 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A).

MSRs control functions for testability, execution tracing, performance-monitoring and machine check errors. Chapter 35, Model-Specific Registers (MSRs), in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C, lists all MSRs that can be written with this instruction and their addresses. Note that each processor family has its own set of MSRs.

The WRMSR instruction is a serializing instruction (see Serializing Instructions in Chapter 8 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A). Note that WRMSR to the IA32_TSC_DEADLINE MSR (MSR index 6E0H) and the X2APIC MSRs (MSR indices 802H to 83FH) are not serializing.

The CPUID instruction should be used to determine whether MSRs are supported (CPUID.01H:EDX[5] = 1) before using this instruction.

IA-32 Architecture Compatibility


The MSRs and the ability to write them with the WRMSR instruction were introduced into the IA-32 architecture with the Pentium processor. Execution of this instruction by an IA-32 processor earlier than the Pentium processor results in an invalid opcode exception #UD.

Operation
MSR[ECX] ← EDX:EAX;
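The EDX:EAX pairing in the operation above amounts to splitting a 64-bit value into two 32-bit halves. A small Python sketch of that split and its inverse is shown below; the helper names are hypothetical, and no MSR is actually accessed.

```python
MASK32 = 0xFFFFFFFF


def split_msr_value(value):
    """Split a 64-bit MSR value into the (EDX, EAX) pair WRMSR expects."""
    value &= (1 << 64) - 1
    return (value >> 32) & MASK32, value & MASK32


def join_msr_value(edx, eax):
    """Recombine EDX:EAX into a 64-bit value; upper register bits are ignored."""
    return ((edx & MASK32) << 32) | (eax & MASK32)
```

Masking each half to 32 bits mirrors the statement that the high-order 32 bits of RAX and RDX are ignored on processors that support the Intel 64 architecture.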

Flags Affected
None.

Protected Mode Exceptions


#GP(0) If the current privilege level is not 0.
       If the value in ECX specifies a reserved or unimplemented MSR address.
       If the value in EDX:EAX sets bits that are reserved in the MSR specified by ECX.
       If the source register contains a non-canonical address and ECX specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP.
#UD    If the LOCK prefix is used.

Real-Address Mode Exceptions


#GP If the value in ECX specifies a reserved or unimplemented MSR address.
    If the value in EDX:EAX sets bits that are reserved in the MSR specified by ECX.
    If the source register contains a non-canonical address and ECX specifies one of the following MSRs: IA32_DS_AREA, IA32_FS_BASE, IA32_GS_BASE, IA32_KERNEL_GS_BASE, IA32_LSTAR, IA32_SYSENTER_EIP, IA32_SYSENTER_ESP.
#UD If the LOCK prefix is used.

Virtual-8086 Mode Exceptions


#GP(0) The WRMSR instruction is not recognized in virtual-8086 mode.

Compatibility Mode Exceptions


Same exceptions as in protected mode.

64-Bit Mode Exceptions


Same exceptions as in protected mode. ...

8. Updates to Chapter 5, Volume 2C


Change bars show changes to Chapter 5 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2C: Instruction Set Reference.
------------------------------------------------------------------------------------------
...

GETSEC[ENTERACCS] - Execute Authenticated Chipset Code


Opcode            Instruction         Description
0F 37 (EAX = 2)   GETSEC[ENTERACCS]   Enter authenticated code execution mode. EBX holds the authenticated code module physical base address. ECX holds the authenticated code module size (bytes).

Description
The GETSEC[ENTERACCS] function loads, authenticates and executes an authenticated code module using an Intel TXT platform chipset's public key. The ENTERACCS leaf of GETSEC is selected with EAX set to 2 at entry.

There are certain restrictions enforced by the processor for the execution of the GETSEC[ENTERACCS] instruction:

• Execution is not allowed unless the processor is in protected mode or IA-32e mode with CPL = 0 and EFLAGS.VM = 0.
• Processor cache must be available and not disabled, that is, CR0.CD and CR0.NW bits must be 0.
• For processor packages containing more than one logical processor, CR0.CD is checked to ensure consistency between enabled logical processors.
• For enforcing consistency of operation with numeric exception reporting using Interrupt 16, CR0.NE must be set.


• An Intel TXT-capable chipset must be present as communicated to the processor by sampling of the power-on configuration capability field after reset.
• The processor cannot already be in authenticated code execution mode as launched by a previous GETSEC[ENTERACCS] or GETSEC[SENTER] instruction without a subsequent exit using GETSEC[EXITAC].
• To avoid potential operability conflicts between modes, the processor is not allowed to execute this instruction if it currently is in SMM or VMX operation.
• To ensure consistent handling of SIPI messages, the processor executing the GETSEC[ENTERACCS] instruction must also be designated the BSP (boot-strap processor) as defined by IA32_APIC_BASE.BSP (Bit 8).

Failure to conform to the above conditions results in the processor signaling a general protection exception.

Prior to execution of the ENTERACCS leaf, other logical processors, i.e., RLPs, in the platform must be:

• idle in a wait-for-SIPI state (as initiated by an INIT assertion or through reset for non-BSP designated processors), or
• in the SENTER sleep state as initiated by a GETSEC[SENTER] from the initiating logical processor (ILP).

If other logical processor(s) in the same package are not idle in one of these states, execution of ENTERACCS signals a general protection exception. The same requirement and action applies if the other logical processor(s) of the same package do not have CR0.CD = 0.

A successful execution of ENTERACCS results in the ILP entering an authenticated code execution mode. Prior to reaching this point, the processor performs several checks. These include:

• Establish and check the location and size of the specified authenticated code module to be executed by the processor.
• Inhibit the ILP's response to the external events: INIT, A20M, NMI and SMI.
• Broadcast a message to enable protection of memory and I/O from other processor agents.
• Load the designated code module into an authenticated code execution area.
• Isolate the contents of the authenticated code execution area from further state modification by external agents.
• Authenticate the authenticated code module.
• Initialize the initiating logical processor state based on information contained in the authenticated code module header.
• Unlock the Intel TXT-capable chipset private configuration space and TPM locality 3 space.
• Begin execution in the authenticated code module at the defined entry point.

The GETSEC[ENTERACCS] function requires two additional input parameters in the general purpose registers EBX and ECX. EBX holds the authenticated code (AC) module physical base address (the AC module must reside below 4 GBytes in physical address space) and ECX holds the AC module size (in bytes). The physical base address and size are used to retrieve the code module from system memory and load it into the internal authenticated code execution area. The base physical address is checked to verify it is on a modulo-4096 byte boundary. The size is verified to be a multiple of 64, that it does not exceed the internal authenticated code execution area capacity (as reported by GETSEC[CAPABILITIES]), and that the top address of the AC module does not exceed 32 bits. An error condition results in an abort of the authenticated code execution launch and the signaling of a general protection exception. As an integrity check for proper processor hardware operation, execution of GETSEC[ENTERACCS] will also check the contents of all the machine check status registers (as reported by the MSRs IA32_MCi_STATUS) for any valid uncorrectable error condition. In addition, the global machine check status register IA32_MCG_STATUS MCIP bit must be cleared and the IERR processor package pin (or its equivalent) must not be asserted, indicating that no machine check exception processing is currently in progress. These checks are performed prior to initiating the load of the authenticated code module. Any outstanding valid uncorrectable machine check error condition present in these status registers at this point will result in the processor signaling a general protection violation.
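The base-address and size checks described above can be sketched as a Python predicate. This is an illustrative reading of the text, not the processor's actual algorithm: the function and parameter names are hypothetical, `acram_capacity` stands for the internal authenticated code execution area capacity reported by the processor, and "top address does not exceed 32 bits" is interpreted as the last byte of the module lying below 4 GBytes.

```python
def acm_params_valid(base, size, acram_capacity):
    """Return True if EBX (base) and ECX (size) pass the described checks."""
    if base % 4096 != 0:
        return False            # base must be on a modulo-4096 byte boundary
    if size % 64 != 0:
        return False            # size must be a multiple of 64
    if size > acram_capacity:
        return False            # must fit in the execution area
    if base + size > (1 << 32):
        return False            # top address must not exceed 32 bits
    return True
```

Software preparing a launch could run a check like this before loading EBX and ECX, since a failed check on the processor results in a general protection exception.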


The ILP masks the response to the assertion of the external signals INIT#, A20M, NMI#, and SMI#. This masking remains active until optionally unmasked by GETSEC[EXITAC] (this defined unmasking behavior assumes GETSEC[ENTERACCS] was not executed by a prior GETSEC[SENTER]). The purpose of this masking control is to prevent exposure to existing external event handlers that may not be under the control of the authenticated code module. The ILP sets an internal flag to indicate it has entered authenticated code execution mode. The state of the A20M pin is likewise masked and forced internally to a de-asserted state so that any external assertion is not recognized during authenticated code execution mode.

To prevent other (logical) processors from interfering with the ILP operating in authenticated code execution mode, memory (excluding implicit write-back transactions) access and I/O originating from other processor agents are blocked. This protection starts when the ILP enters into authenticated code execution mode. Only memory and I/O transactions initiated from the ILP are allowed to proceed. Exiting authenticated code execution mode is done by executing GETSEC[EXITAC]. The protection of memory and I/O activities remains in effect until the ILP executes GETSEC[EXITAC].

Prior to launching the authenticated execution module using GETSEC[ENTERACCS] or GETSEC[SENTER], the processor's MTRRs (Memory Type Range Registers) must first be initialized to map out the authenticated RAM addresses as WB (writeback). Failure to do so may affect the ability of the processor to maintain isolation of the loaded authenticated code module. If the processor detects that this requirement is not met, it will signal an Intel TXT reset condition with an error code during the loading of the authenticated code module. While physical addresses within the load module must be mapped as WB, the memory type for locations outside of the module boundaries must be mapped to one of the supported memory types as returned by GETSEC[PARAMETERS] (or UC as default).

To conform to the minimum granularity of MTRR MSRs for specifying the memory type, authenticated code RAM (ACRAM) is allocated to the processor in 4096 byte granular blocks. If an AC module size as specified in ECX is not a multiple of 4096 then the processor will allocate up to the next 4096 byte boundary for mapping as ACRAM with indeterminate data. This pad area will not be visible to the authenticated code module as external memory, nor can the module depend on the value of the data used to fill the pad area.

At the successful completion of GETSEC[ENTERACCS], the architectural state of the processor is partially initialized from contents held in the header of the authenticated code module. The processor GDTR, CS, and DS selectors are initialized from fields within the authenticated code module. Since the authenticated code module must be relocatable, all address references must be relative to the authenticated code module base address in EBX. The processor GDTR base value is initialized to the AC module header field GDTBasePtr + module base address held in EBX and the GDTR limit is set to the value in the GDTLimit field. The CS selector is initialized to the AC module header SegSel field, while the DS selector is initialized to CS + 8. The segment descriptor fields are implicitly initialized to BASE=0, LIMIT=FFFFFh, G=1, D=1, P=1, S=1, read/write access for DS, and execute/read access for CS. The processor begins the authenticated code module execution with the EIP set to the AC module header EntryPoint field + module base address (EBX). The AC module based fields used for initializing the processor state are checked for consistency and any failure results in a shutdown condition. A summary of the register state initialization after successful completion of GETSEC[ENTERACCS] is given for the processor in Table 5-4.

Paging is disabled upon entry into authenticated code execution mode. The authenticated code module is loaded and initially executed using physical addresses. It is up to the system software after execution of GETSEC[ENTERACCS] to establish a new (or restore its previous) paging environment with an appropriate mapping to meet new protection requirements. EBP is initialized to the authenticated code module base physical address for initial execution in the authenticated environment. As a result, the authenticated code can reference EBP for relative address based references, given that the authenticated code module must be position independent.
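The address arithmetic in the preceding paragraphs (rounding the ACRAM allocation up to a 4096-byte boundary, and deriving GDTR, CS, DS, EIP and EBP from the module header fields and the base address in EBX) can be sketched as follows. The function names and the dictionary layout are hypothetical; the header field names GDTBasePtr, GDTLimit, SegSel and EntryPoint come from the text.

```python
def acram_allocation(size):
    """Round the AC module size up to the next 4096-byte boundary."""
    return (size + 4095) & ~4095


def initial_register_state(ac_base, gdt_base_ptr, gdt_limit, seg_sel, entry_point):
    """Register values derived from the AC module header and base address."""
    return {
        "GDTR.base": ac_base + gdt_base_ptr,   # header offset made absolute
        "GDTR.limit": gdt_limit,
        "CS": seg_sel,
        "DS": seg_sel + 8,
        "EIP": ac_base + entry_point,
        "EBP": ac_base,
    }
```

Because every header field is an offset from the module base, the same module image can be loaded at any suitably aligned physical address without modification.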


Table 5-4 Register State Initialization after GETSEC[ENTERACCS]


Register State    Initialization Status                                         Comment
CR0               PG←0, AM←0, WP←0: Others unchanged                            Paging, Alignment Check, Write-protection are disabled
CR4               MCE←0: Others unchanged                                       Machine Check Exceptions Disabled
EFLAGS            00000002H
IA32_EFER         0H                                                            IA-32e mode disabled
EIP               AC.base + EntryPoint                                          AC.base is in EBX as input to GETSEC[ENTERACCS]
[E|R]BX           Pre-ENTERACCS state: Next [E|R]IP prior to GETSEC[ENTERACCS]  Carry forward 64-bit processor state across GETSEC[ENTERACCS]
ECX               Pre-ENTERACCS state: [31:16]=GDTR.limit; [15:0]=CS.sel        Carry forward processor state across GETSEC[ENTERACCS]
[E|R]DX           Pre-ENTERACCS state: GDTR base                                Carry forward 64-bit processor state across GETSEC[ENTERACCS]
EBP               AC.base
CS                Sel=[SegSel], base=0, limit=FFFFFh, G=1, D=1, AR=9BH
DS                Sel=[SegSel] +8, base=0, limit=FFFFFh, G=1, D=1, AR=93H
GDTR              Base= AC.base (EBX) + [GDTBasePtr], Limit=[GDTLimit]
DR7               00000400H
IA32_DEBUGCTL     0H
IA32_MISC_ENABLE  see Table 5-5 for example                                     The number of initialized fields may change due to processor implementation

The segmentation related processor state that has not been initialized by GETSEC[ENTERACCS] requires appropriate initialization before use. Since a new GDT context has been established, the previous state of the segment selector values held in ES, SS, FS, GS, TR, and LDTR might not be valid.

The MSR IA32_EFER is also unconditionally cleared as part of the processor state initialized by ENTERACCS. Since paging is disabled upon entering authenticated code execution mode, a new paging environment will have to be reestablished in order to establish IA-32e mode while operating in authenticated code execution mode.

Debug exception and trap related signaling is also disabled as part of GETSEC[ENTERACCS]. This is achieved by resetting DR7, TF in EFLAGS, and the MSR IA32_DEBUGCTL. These debug functions are free to be re-enabled once supporting exception handler(s), descriptor tables, and debug registers have been properly initialized following entry into authenticated code execution mode. Also, any pending single-step trap condition will have been cleared upon entry into this mode.

The IA32_MISC_ENABLE MSR is initialized upon entry into authenticated execution mode. Certain bits of this MSR are preserved because preserving these bits may be important to maintain previously established platform settings (see the footnote for Table 5-5). The remaining bits are cleared for the purpose of establishing a more consistent environment for the execution of authenticated code modules. One of the impacts of initializing this MSR is that any previous condition established by the MONITOR instruction will be cleared.


To support the possible return to the processor architectural state prior to execution of GETSEC[ENTERACCS], certain critical processor state is captured and stored in the general-purpose registers at instruction completion. [E|R]BX holds the effective address ([E|R]IP) of the instruction that would execute next after GETSEC[ENTERACCS], ECX[15:0] holds the CS selector value, ECX[31:16] holds the GDTR limit field, and [E|R]DX holds the GDTR base field. The subsequent authenticated code can preserve the contents of these registers so that this state can be manually restored if needed, prior to exiting authenticated code execution mode with GETSEC[EXITAC]. For the processor state after exiting authenticated code execution mode, see the description of GETSEC[SEXIT].
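As a small illustration of the ECX packing described above, the following Python sketch extracts the saved CS selector and GDTR limit from a captured ECX value. The function name is hypothetical.

```python
def unpack_saved_ecx(ecx):
    """Return (cs_selector, gdtr_limit) from the ECX value left by ENTERACCS."""
    cs_selector = ecx & 0xFFFF          # ECX[15:0]
    gdtr_limit = (ecx >> 16) & 0xFFFF   # ECX[31:16]
    return cs_selector, gdtr_limit
```

Authenticated code that intends to restore the previous environment would save these two fields, together with the values in [E|R]BX and [E|R]DX, before reusing the registers.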

Table 5-5. IA32_MISC_ENABLE MSR Initialization (1) by ENTERACCS and SENTER

Field                                    Bit position   Description
Fast strings enable                      0              Clear to 0
FOPCODE compatibility mode enable        2              Clear to 0
Thermal monitor enable                   3              Set to 1 if other thermal monitor capability is not enabled. (2)
Split-lock disable                       4              Clear to 0
Bus lock on cache line splits disable    8              Clear to 0
Hardware prefetch disable                9              Clear to 0
GV1/2 legacy enable                      15             Clear to 0
MONITOR/MWAIT s/m enable                 18             Clear to 0
Adjacent sector prefetch disable         19             Clear to 0

NOTES:
1. The number of IA32_MISC_ENABLE fields that are initialized may vary due to processor implementations.
2. ENTERACCS (and SENTER) initialize the state of processor thermal throttling such that at least a minimum level is enabled. If thermal throttling is already enabled when executing one of these GETSEC leaves, then no change in the thermal throttling control settings will occur. If thermal throttling is disabled, then it will be enabled via setting of the thermal throttle control bit 3 as a result of executing these GETSEC leaves.

The IDTR will also require reloading with a new IDT context after entering authenticated code execution mode, before any exceptions or the external interrupts INTR and NMI can be handled. Since external interrupts are reenabled at the completion of authenticated code execution mode (as terminated with EXITAC), it is recommended that a new IDT context be established before this point. Until such a new IDT context is established, the programmer must take care in not executing an INT n instruction or any other operation that would result in an exception or trap signaling.

Prior to completion of the GETSEC[ENTERACCS] instruction and after successful authentication of the AC module, the private configuration space of the Intel TXT chipset is unlocked. The authenticated code module alone can gain access to this normally restricted chipset state for the purpose of securing the platform.

Once the authenticated code module is launched at the completion of GETSEC[ENTERACCS], it is free to enable interrupts by setting EFLAGS.IF and enable NMI by execution of IRET. This presumes that it has re-established interrupt handling support through initialization of the IDT, GDT, and corresponding interrupt handling code.
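As a rough illustration of Table 5-5, the following C sketch models the effect of the listed fields on a 64-bit IA32_MISC_ENABLE value. It is a simplification under stated assumptions: only the bits named in the table are touched, the macro and function names are hypothetical, and real processors apply an implementation-specific mask that may cover more or fewer bits.

```c
#include <stdint.h>

/* Bit positions taken from Table 5-5; macro names are illustrative. */
#define MISC_FAST_STRINGS        (1ull << 0)
#define MISC_FOPCODE_COMPAT      (1ull << 2)
#define MISC_THERMAL_MONITOR     (1ull << 3)
#define MISC_SPLIT_LOCK_DISABLE  (1ull << 4)
#define MISC_BUS_LOCK_DISABLE    (1ull << 8)
#define MISC_HW_PREFETCH_DISABLE (1ull << 9)
#define MISC_GV_LEGACY           (1ull << 15)
#define MISC_MONITOR_MWAIT       (1ull << 18)
#define MISC_ADJ_SECTOR_DISABLE  (1ull << 19)

/* Model of the table only: clear the "Clear to 0" fields and set the
 * thermal monitor bit when no other thermal capability is enabled.
 * Bits not listed in the table are left untouched in this sketch. */
static uint64_t model_misc_enable_init(uint64_t msr, int other_thermal_enabled)
{
    const uint64_t cleared = MISC_FAST_STRINGS | MISC_FOPCODE_COMPAT |
                             MISC_SPLIT_LOCK_DISABLE | MISC_BUS_LOCK_DISABLE |
                             MISC_HW_PREFETCH_DISABLE | MISC_GV_LEGACY |
                             MISC_MONITOR_MWAIT | MISC_ADJ_SECTOR_DISABLE;
    msr &= ~cleared;
    if (!other_thermal_enabled)
        msr |= MISC_THERMAL_MONITOR;
    return msr;
}
```

Clearing bit 18 in this model corresponds to the loss of any condition previously established by the MONITOR instruction.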

Operation in a Uni-Processor Platform


(* The state of the internal flag ACMODEFLAG persists across instruction boundary *)
IF (CR4.SMXE=0)
    THEN #UD;
ELSIF (in VMX non-root operation)
    THEN VM Exit (reason=GETSEC instruction);
ELSIF (GETSEC leaf unsupported)
    THEN #UD;
ELSIF ((in VMX operation) or (CR0.PE=0) or (CR0.CD=1) or (CR0.NW=1) or (CR0.NE=0) or
    (CPL>0) or (EFLAGS.VM=1) or (IA32_APIC_BASE.BSP=0) or (TXT chipset not present) or
    (ACMODEFLAG=1) or (IN_SMM=1))
    THEN #GP(0);
IF (GETSEC[PARAMETERS].Parameter_Type = 5, MCA_Handling (bit 6) = 0)
    FOR I = 0 to IA32_MCG_CAP.COUNT-1 DO
        IF (IA32_MC[I]_STATUS = uncorrectable error)
            THEN #GP(0);
    OD;
FI;
IF (IA32_MCG_STATUS.MCIP=1) or (IERR pin is asserted)
    THEN #GP(0);
ACBASE ← EBX;
ACSIZE ← ECX;
IF (((ACBASE MOD 4096) != 0) or ((ACSIZE MOD 64) != 0) or (ACSIZE < minimum module size) OR
    (ACSIZE > authenticated RAM capacity)) or ((ACBASE+ACSIZE) > (2^32 -1)))
    THEN #GP(0);
IF (secondary thread(s) CR0.CD = 1) or ((secondary thread(s) NOT(wait-for-SIPI)) and
    (secondary thread(s) not in SENTER sleep state)
    THEN #GP(0);
Mask SMI, INIT, A20M, and NMI external pin events;
IA32_MISC_ENABLE ← (IA32_MISC_ENABLE & MASK_CONST*)
(* The hexadecimal value of MASK_CONST may vary due to processor implementations *)
A20M ← 0;
IA32_DEBUGCTL ← 0;
Invalidate processor TLB(s);
Drain Outgoing Transactions;
ACMODEFLAG ← 1;
SignalTXTMessage(ProcessorHold);
Load the internal ACRAM based on the AC module size;
(* Ensure that all ACRAM loads hit Write Back memory space *)
IF (ACRAM memory type != WB)
    THEN TXT-SHUTDOWN(#BadACMMType);
IF (AC module header version is not supported) OR (ACRAM[ModuleType] <> 2)
    THEN TXT-SHUTDOWN(#UnsupportedACM);
(* Authenticate the AC Module and shutdown with an error if it fails *)
KEY ← GETKEY(ACRAM, ACBASE);
KEYHASH ← HASH(KEY);
CSKEYHASH ← READ(TXT.PUBLIC.KEY);
IF (KEYHASH <> CSKEYHASH)
    THEN TXT-SHUTDOWN(#AuthenticateFail);
SIGNATURE ← DECRYPT(ACRAM, ACBASE, KEY);
(* The value of SIGNATURE_LEN_CONST is implementation-specific *)
FOR I=0 to SIGNATURE_LEN_CONST - 1 DO
    ACRAM[SCRATCH.I] ← SIGNATURE[I];
COMPUTEDSIGNATURE ← HASH(ACRAM, ACBASE, ACSIZE);
FOR I=0 to SIGNATURE_LEN_CONST - 1 DO
    ACRAM[SCRATCH.SIGNATURE_LEN_CONST+I] ← COMPUTEDSIGNATURE[I];
IF (SIGNATURE<>COMPUTEDSIGNATURE)
    THEN TXT-SHUTDOWN(#AuthenticateFail);
ACMCONTROL ← ACRAM[CodeControl];
IF ((ACMCONTROL.0 = 0) and (ACMCONTROL.1 = 1) and (snoop hit to modified line detected on ACRAM load))
    THEN TXT-SHUTDOWN(#UnexpectedHITM);
IF (ACMCONTROL reserved bits are set)
    THEN TXT-SHUTDOWN(#BadACMFormat);
IF ((ACRAM[GDTBasePtr] < (ACRAM[HeaderLen] * 4 + Scratch_size)) OR
    ((ACRAM[GDTBasePtr] + ACRAM[GDTLimit]) >= ACSIZE))
    THEN TXT-SHUTDOWN(#BadACMFormat);
IF ((ACMCONTROL.0 = 1) and (ACMCONTROL.1 = 1) and (snoop hit to modified line detected on ACRAM load))
    THEN ACEntryPoint ← ACBASE+ACRAM[ErrorEntryPoint];
ELSE
    ACEntryPoint ← ACBASE+ACRAM[EntryPoint];
IF ((ACEntryPoint >= ACSIZE) OR (ACEntryPoint < (ACRAM[HeaderLen] * 4 + Scratch_size)))
    THEN TXT-SHUTDOWN(#BadACMFormat);
IF (ACRAM[GDTLimit] & FFFF0000h)
    THEN TXT-SHUTDOWN(#BadACMFormat);
IF ((ACRAM[SegSel] > (ACRAM[GDTLimit] - 15)) OR (ACRAM[SegSel] < 8))
    THEN TXT-SHUTDOWN(#BadACMFormat);
IF ((ACRAM[SegSel].TI=1) OR (ACRAM[SegSel].RPL!=0))
    THEN TXT-SHUTDOWN(#BadACMFormat);
CR0.[PG.AM.WP] ← 0;
CR4.MCE ← 0;
EFLAGS ← 00000002h;
IA32_EFER ← 0h;
[E|R]BX ← [E|R]IP of the instruction after GETSEC[ENTERACCS];
ECX ← Pre-GETSEC[ENTERACCS] GDT.limit:CS.sel;
[E|R]DX ← Pre-GETSEC[ENTERACCS] GDT.base;
EBP ← ACBASE;
GDTR.BASE ← ACBASE+ACRAM[GDTBasePtr];
GDTR.LIMIT ← ACRAM[GDTLimit];
CS.SEL ← ACRAM[SegSel];
CS.BASE ← 0;
CS.LIMIT ← FFFFFh;
CS.G ← 1;
CS.D ← 1;
CS.AR ← 9Bh;
DS.SEL ← ACRAM[SegSel]+8;
DS.BASE ← 0;
DS.LIMIT ← FFFFFh;
DS.G ← 1;
DS.D ← 1;
DS.AR ← 93h;
DR7 ← 00000400h;
IA32_DEBUGCTL ← 0;
SignalTXTMsg(OpenPrivate);
SignalTXTMsg(OpenLocality3);
EIP ← ACEntryPoint;
END;
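The GDT and segment selector checks in the operation above that lead to TXT-SHUTDOWN(#BadACMFormat) can be sketched in C as a pure validation function. This is an illustrative model, not processor behavior: the structure, its field names, and the assumption that Scratch_size is expressed in bytes are all hypothetical, and the entry-point check is omitted.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical view of the AC module header fields used by the checks. */
struct acm_fields {
    uint32_t header_len;         /* HeaderLen, in 4-byte units */
    uint32_t scratch_size_bytes; /* Scratch_size (assumed to be bytes) */
    uint32_t gdt_base_ptr;       /* GDTBasePtr, offset from ACBASE */
    uint32_t gdt_limit;          /* GDTLimit */
    uint32_t seg_sel;            /* SegSel */
};

/* Returns true when none of the modeled #BadACMFormat conditions hold. */
static bool acm_format_ok(const struct acm_fields *f, uint32_t acsize)
{
    uint64_t code_start = (uint64_t)f->header_len * 4 + f->scratch_size_bytes;

    if (f->gdt_base_ptr < code_start)
        return false;                       /* GDT overlaps header/scratch */
    if ((uint64_t)f->gdt_base_ptr + f->gdt_limit >= acsize)
        return false;                       /* GDT extends past the module */
    if (f->gdt_limit & 0xFFFF0000u)
        return false;                       /* limit must fit in 16 bits */
    if (f->seg_sel < 8)
        return false;                       /* null selector not allowed */
    if ((int64_t)f->seg_sel > (int64_t)f->gdt_limit - 15)
        return false;                       /* CS and DS descriptors must fit */
    if (f->seg_sel & 0x7u)
        return false;                       /* TI (bit 2) and RPL (bits 1:0) must be 0 */
    return true;
}
```

The "- 15" bound reflects that DS.SEL is loaded with SegSel+8, so two consecutive 8-byte descriptors must lie within the GDT limit.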

Flags Affected
All flags are cleared.

Use of Prefixes
LOCK                 Causes #UD
REP*                 Cause #UD (includes REPNE/REPNZ and REP/REPE/REPZ)
Operand size         Causes #UD
Segment overrides    Ignored
Address size         Ignored
REX                  Ignored

Protected Mode Exceptions


#UD       If CR4.SMXE = 0.
          If GETSEC[ENTERACCS] is not reported as supported by GETSEC[CAPABILITIES].
#GP(0)    If CR0.CD = 1 or CR0.NW = 1 or CR0.NE = 0 or CR0.PE = 0 or CPL > 0 or EFLAGS.VM = 1.
          If an Intel TXT-capable chipset is not present.
          If in VMX root operation.
          If the initiating processor is not designated as the bootstrap processor via the MSR bit IA32_APIC_BASE.BSP.
          If the processor is already in authenticated code execution mode.
          If the processor is in SMM.
          If a valid uncorrectable machine check error is logged in IA32_MC[I]_STATUS.
          If the authenticated code base is not on a 4096 byte boundary.
          If the authenticated code size > processor internal authenticated code area capacity.
          If the authenticated code size is not modulo 64.
          If other enabled logical processor(s) of the same package CR0.CD = 1.
          If other enabled logical processor(s) of the same package are not in the wait-for-SIPI or SENTER sleep state.
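Several of the #GP(0) conditions listed above are simple bit and alignment tests, which the following C sketch models. The CR0 bit positions (PE = bit 0, NE = bit 5, NW = bit 29, CD = bit 30) are architectural; the function names and the acram_capacity parameter are hypothetical, and the remaining conditions (CPL, SMM, chipset presence, and so on) are not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

#define CR0_PE (1u << 0)   /* protection enable */
#define CR0_NE (1u << 5)   /* numeric error */
#define CR0_NW (1u << 29)  /* not write-through */
#define CR0_CD (1u << 30)  /* cache disable */

/* True when CR0 satisfies PE=1, NE=1, NW=0, CD=0. */
static bool cr0_allows_enteraccs(uint32_t cr0)
{
    return (cr0 & CR0_PE) && (cr0 & CR0_NE) &&
           !(cr0 & CR0_NW) && !(cr0 & CR0_CD);
}

/* True when the AC module base (EBX) and size (ECX) pass the modeled
 * checks. acram_capacity is a caller-supplied, implementation-specific
 * value (hypothetical parameter). */
static bool ac_region_ok(uint32_t acbase, uint32_t acsize, uint32_t acram_capacity)
{
    if (acbase % 4096 != 0)
        return false;                   /* base must be 4096-byte aligned */
    if (acsize % 64 != 0)
        return false;                   /* size must be a multiple of 64 */
    if (acsize > acram_capacity)
        return false;                   /* must fit in the internal AC area */
    if ((uint64_t)acbase + acsize > 0xFFFFFFFFull)
        return false;                   /* must reside below 2^32 - 1 */
    return true;
}
```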

Real-Address Mode Exceptions


#UD       If CR4.SMXE = 0.
          If GETSEC[ENTERACCS] is not reported as supported by GETSEC[CAPABILITIES].
#GP(0)    GETSEC[ENTERACCS] is not recognized in real-address mode.

Virtual-8086 Mode Exceptions


#UD       If CR4.SMXE = 0.
          If GETSEC[ENTERACCS] is not reported as supported by GETSEC[CAPABILITIES].
#GP(0)    GETSEC[ENTERACCS] is not recognized in virtual-8086 mode.

Compatibility Mode Exceptions


All protected mode exceptions apply.
#GP       If AC code module does not reside in physical address below 2^32 -1.

64-Bit Mode Exceptions


All protected mode exceptions apply.
#GP       If AC code module does not reside in physical address below 2^32 -1.

VM-exit Condition
Reason (GETSEC) ... IF in VMX non-root operation.

9. Updates to Appendix A, Volume 2C


Change bars show changes to Appendix A of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2C: Instruction Set Reference.
------------------------------------------------------------------------------------------

Table A-3. Two-byte Opcode Map: 00H-77H (First Byte is 0FH) *

pfx 0

0 Grp 61A

1 Grp 71A

2 LAR Gv, Ew vmovlps Vq, Hq, Mq vmovhlps Vq, Hq, Uq vmovlpd Vq, Hq, Mq vmovsldup Vx, Wx vmovddup Vx, Wx MOV Cd, Rd

3 LSL Gv, Ew vmovlps Mq, Vq

5 SYSCALLo64

6 CLTS

7 SYSRETo64

vmovups Vps, Wps

vmovups Wps, Vps

vunpcklps Vx, Hx, Wx

vunpckhps Vx, Hx, Wx

vmovhpsv1 Vdq, Hq, Mq vmovlhps Vdq, Hq, Uq vmovhpdv1 Vdq, Hq, Mq vmovshdup Vx, Wx

vmovhpsv1 Mq, Vq

66 F3 F2

vmovupd Vpd, Wpd vmovss Vx, Hx, Wss vmovsd Vx, Hx, Wsd MOV Rd, Cd

vmovupd Wpd,Vpd vmovss Wss, Hx, Vss vmovsd Wsd, Hx, Vsd MOV Rd, Dd

vmovlpd Mq, Vq

vunpcklpd Vx,Hx,Wx

vunpckhpd Vx,Hx,Wx

vmovhpdv1 Mq, Vq

MOV Dd, Rd

WRMSR

RDTSC

RDMSR

RDPMC

SYSENTER

SYSEXIT

GETSEC

CMOVcc, (Gv, Ev) - Conditional Move 4 O NO B/C/NAE AE/NB/NC E/Z NE/NZ BE/NA A/NBE


pfx

0 vmovmskps Gy, Ups

1 vsqrtps Vps, Wps vsqrtpd Vpd, Wpd vsqrtss Vss, Hss, Wss vsqrtsd Vsd, Hsd, Wsd

2 vrsqrtps Vps, Wps

3 vrcpps Vps, Wps

4 vandps Vps, Hps, Wps vandpd Vpd, Hpd, Wpd

5 vandnps Vps, Hps, Wps vandnpd Vpd, Hpd, Wpd

6 vorps Vps, Hps, Wps vorpd Vpd, Hpd, Wpd

7 vxorps Vps, Hps, Wps vxorpd Vpd, Hpd, Wpd

66 F3 F2

vmovmskpd Gy,Upd

vrsqrtss Vss, Hss, Wss

vrcpss Vss, Hss, Wss

punpcklbw Pq, Qd 6 66 F3 pshufw Pq, Qq, Ib vpshufd Vx, Wx, Ib vpshufhw Vx, Wx, Ib vpshuflw Vx, Wx, Ib vpunpcklbw Vx, Hx, Wx

punpcklwd Pq, Qd vpunpcklwd Vx, Hx, Wx

punpckldq Pq, Qd vpunpckldq Vx, Hx, Wx

packsswb Pq, Qq vpacksswb Vx, Hx, Wx

pcmpgtb Pq, Qq vpcmpgtb Vx, Hx, Wx

pcmpgtw Pq, Qq vpcmpgtw Vx, Hx, Wx

pcmpgtd Pq, Qq vpcmpgtd Vx, Hx, Wx

packuswb Pq, Qq vpackuswb Vx, Hx, Wx

(Grp 121A)

(Grp 131A)

(Grp 141A)

pcmpeqb Pq, Qq vpcmpeqb Vx, Hx, Wx

pcmpeqw Pq, Qq vpcmpeqw Vx, Hx, Wx

pcmpeqd Pq, Qq vpcmpeqd Vx, Hx, Wx

emms vzeroupperv vzeroallv

66 F3 F2

...
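The pfx column in these opcode maps selects among up to four instructions that share one opcode, depending on whether a 66H, F3H, or F2H prefix (or none) is present. The C sketch below shows that selection for a single cell, opcode 0F 10H, using the four entries that appear in the map above. The enum, the table name, and the convention that a prefix byte of 0 means "no prefix" are assumptions made for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Row order used in the opcode maps: none, 66, F3, F2. */
enum pfx_row { PFX_NONE = 0, PFX_66, PFX_F3, PFX_F2, PFX_INVALID };

/* Map a prefix byte to its row; 0 stands for "no prefix" (assumption). */
static enum pfx_row pfx_row_for(uint8_t prefix)
{
    switch (prefix) {
    case 0x00: return PFX_NONE;
    case 0x66: return PFX_66;
    case 0xF3: return PFX_F3;
    case 0xF2: return PFX_F2;
    default:   return PFX_INVALID;
    }
}

/* Entries of the 0F 10H cell, one per prefix row. */
static const char *const opcode_0f10[4] = {
    "vmovups Vps, Wps",
    "vmovupd Vpd, Wpd",
    "vmovss Vx, Hx, Wss",
    "vmovsd Vx, Hx, Wsd",
};

/* Returns the map entry for 0F 10H under the given prefix, or NULL. */
static const char *lookup_0f10(uint8_t prefix)
{
    enum pfx_row row = pfx_row_for(prefix);
    return row == PFX_INVALID ? NULL : opcode_0f10[row];
}
```

A full decoder would index a two-dimensional table by opcode byte and prefix row; blank cells, which the maps mark as reserved, would map to an invalid-opcode result.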

Table A-3. Two-byte Opcode Map: 08H-7FH (First Byte is 0FH) *
pfx 0 Prefetch1C (Grp 161A) 8 INVD 9 WBINVD A B 2-byte Illegal Opcodes UD21B C D prefetchw(/1) Ev NOP /0 Ev E F

vmovaps Vps, Wps 66 2 F3 F2 3 3-byte escape (Table A-4) vmovapd Vpd, Wpd

vmovaps Wps, Vps vmovapd Wpd,Vpd

cvtpi2ps Vps, Qpi cvtpi2pd Vpd, Qpi vcvtsi2ss Vss, Hss, Ey vcvtsi2sd Vsd, Hsd, Ey 3-byte escape (Table A-5)

vmovntps Mps, Vps vmovntpd Mpd, Vpd

cvttps2pi Ppi, Wps cvttpd2pi Ppi, Wpd vcvttss2si Gy, Wss vcvttsd2si Gy, Wsd

cvtps2pi Ppi, Wps cvtpd2pi Qpi, Wpd vcvtss2si Gy, Wss vcvtsd2si Gy, Wsd

vucomiss Vss, Wss vucomisd Vsd, Wsd

vcomiss Vss, Wss vcomisd Vsd, Wsd

CMOVcc(Gv, Ev) - Conditional Move 4 S NS P/PE NP/PO L/NGE NL/GE LE/NG NLE/G


pfx

8 vaddps Vps, Hps, Wps

9 vmulps Vps, Hps, Wps vmulpd Vpd, Hpd, Wpd vmulss Vss, Hss, Wss vmulsd Vsd, Hsd, Wsd punpckhwd Pq, Qd vpunpckhwd Vx, Hx, Wx

A vcvtps2pd Vpd, Wps vcvtpd2ps Vps, Wpd vcvtss2sd Vsd, Hx, Wss vcvtsd2ss Vss, Hx, Wsd punpckhdq Pq, Qd vpunpckhdq Vx, Hx, Wx

B vcvtdq2ps Vps, Wdq vcvtps2dq Vdq, Wps vcvttps2dq Vdq, Wps

C vsubps Vps, Hps, Wps vsubpd Vpd, Hpd, Wpd vsubss Vss, Hss, Wss vsubsd Vsd, Hsd, Wsd

D vminps Vps, Hps, Wps vminpd Vpd, Hpd, Wpd vminss Vss, Hss, Wss vminsd Vsd, Hsd, Wsd

E vdivps Vps, Hps, Wps vdivpd Vpd, Hpd, Wpd vdivss Vss, Hss, Wss vdivsd Vsd, Hsd, Wsd movd/q Pd, Ey

F vmaxps Vps, Hps, Wps vmaxpd Vpd, Hpd, Wpd vmaxss Vss, Hss, Wss vmaxsd Vsd, Hsd, Wsd movq Pq, Qq vmovdqa Vx, Wx vmovdqu Vx, Wx

66 5 F3 F2

vaddpd Vpd, Hpd, Wpd vaddss Vss, Hss, Wss vaddsd Vsd, Hsd, Wsd punpckhbw Pq, Qd

packssdw Pq, Qd vpackssdw Vx, Hx, Wx vpunpcklqdq Vx, Hx, Wx vpunpckhqdq Vx, Hx, Wx

66 F3

vpunpckhbw Vx, Hx, Wx

vmovd/q Vy, Ey

VMREAD Ey, Gy 66 7 F3

VMWRITE Gy, Ey vhaddpd Vpd, Hpd, Wpd vhsubpd Vpd, Hpd, Wpd

movd/q Ey, Pd vmovd/q Ey, Vy vmovq Vq, Wq vhaddps Vps, Hps, Wps vhsubps Vps, Hps, Wps

movq Qq, Pq vmovdqa Wx,Vx vmovdqu Wx,Vx

F2

...

Table A-3. Two-byte Opcode Map: 80H-F7H (First Byte is 0FH) *
pfx 8 0 1 2 3 4 5 6 7 Jccf64, Jz - Long-displacement jump on condition O NO B/CNAE AE/NB/NC E/Z NE/NZ BE/NA A/NBE

SETcc, Eb - Byte Set on condition 9 O PUSHd64 FS CMPXCHG B Eb, Gb XADD Eb, Gb 66 C F3 F2 psrlw Pq, Qq 66 D F3 F2 vaddsubps Vps, Hps, Wps vaddsubpd Vpd, Hpd, Wpd vpsrlw Vx, Hx, Wx Ev, Gv XADD Ev, Gv NO POPd64 FS B/C/NAE CPUID AE/NB/NC BT Ev, Gv BTR Ev, Gv E/Z SHLD Ev, Gv, Ib LFS Gv, Mp NE/NZ SHLD Ev, Gv, CL LGS Gv, Mp MOVZX Gv, Eb vshufps Vps,Hps,Wps,Ib vshufpd Vpd,Hpd,Wpd,Ib Gv, Ew Grp 91A BE/NA A/NBE

LSS Gv, Mp

vcmpps Vps,Hps,Wps,Ib vcmppd Vpd,Hpd,Wpd,Ib vcmpss Vss,Hss,Wss,Ib vcmpsd Vsd,Hsd,Wsd,Ib psrld Pq, Qq vpsrld Vx, Hx, Wx

movnti My, Gy

pinsrw Pq,Ry/Mw,Ib vpinsrw Vdq,Hdq,Ry/Mw,Ib

pextrw Gd, Nq, Ib vpextrw Gd, Udq, Ib

psrlq Pq, Qq vpsrlq Vx, Hx, Wx

paddq Pq, Qq vpaddq Vx, Hx, Wx

pmullw Pq, Qq vpmullw Vx, Hx, Wx vmovq Wq, Vq movq2dq Vdq, Nq movdq2q Pq, Uq

pmovmskb Gd, Nq vpmovmskb Gd, Ux


pfx

0 pavgb Pq, Qq

1 psraw Pq, Qq vpsraw Vx, Hx, Wx

2 psrad Pq, Qq vpsrad Vx, Hx, Wx

3 pavgw Pq, Qq vpavgw Vx, Hx, Wx

4 pmulhuw Pq, Qq vpmulhuw Vx, Hx, Wx

5 pmulhw Pq, Qq vpmulhw Vx, Hx, Wx

7 movntq Mq, Pq

66 E F3 F2

vpavgb Vx, Hx, Wx

vcvttpd2dq Vx, Wpd vcvtdq2pd Vx, Wpd vcvtpd2dq Vx, Wpd

vmovntdq Mx, Vx

psllw Pq, Qq F 66 F2 vlddqu Vx, Mx vpsllw Vx, Hx, Wx

pslld Pq, Qq vpslld Vx, Hx, Wx

psllq Pq, Qq vpsllq Vx, Hx, Wx

pmuludq Pq, Qq vpmuludq Vx, Hx, Wx

pmaddwd Pq, Qq vpmaddwd Vx, Hx, Wx

psadbw Pq, Qq vpsadbw Vx, Hx, Wx

maskmovq Pq, Nq vmaskmovdqu Vdq, Udq

...

Table A-4. Three-byte Opcode Map: 00H-F7H (First Two Bytes are 0F 38H) *
pfx 0 pshufb Pq, Qq
vpshufb Vx, Hx, Wx pblendvb Vdq, Wdq

1 phaddw Pq, Qq
vphaddw Vx, Hx, Wx

2 phaddd Pq, Qq
vphaddd Vx, Hx, Wx

3 phaddsw Pq, Qq
vphaddsw Vx, Hx, Wx vcvtph2psv Vx, Wx, Ib

4 pmaddubsw Pq, Qq
vpmaddubsw Vx, Hx, Wx blendvps Vdq, Wdq

5 phsubw Pq, Qq
vphsubw Vx, Hx, Wx blendvpd Vdq, Wdq

6 phsubd Pq, Qq
vphsubd Vx, Hx, Wx vpermpsv Vqq, Hqq, Wqq

7 phsubsw Pq, Qq
vphsubsw Vx, Hx, Wx vptest Vx, Wx

0 66

66

2 3 4 5 6 7

66 66 66

vpmovsxbw Vx, Ux/Mq vpmovzxbw Vx, Ux/Mq vpmulld Vx, Hx, Wx

vpmovsxbd Vx, Ux/Md vpmovzxbd Vx, Ux/Md vphminposuw Vdq, Wdq

vpmovsxbq Vx, Ux/Mw vpmovzxbq Vx, Ux/Mw

vpmovsxwd Vx, Ux/Mq vpmovzxwd Vx, Ux/Mq

vpmovsxwq Vx, Ux/Md vpmovzxwq Vx, Ux/Md

vpmovsxdq Vx, Ux/Mq vpmovzxdq Vx, Ux/Mq vpsrlvd/qv Vx, Hx, Wx

vpermdv Vqq, Hqq, Wqq vpsravdv Vx, Hx, Wx

vpcmpgtq Vx, Hx, Wx vpsllvd/qv Vx, Hx, Wx

66

INVEPT Gy, Mdq

INVVPID Gy, Mdq

INVPCID Gy, Mdq

9 A B C D E

66 66 66

vgatherdd/qv Vx,Hx,Wx

vgatherqd/qv Vx,Hx,Wx

vgatherdps/dv Vx,Hx,Wx

vgatherqps/dv Vx,Hx,Wx

vfmaddsub132ps/dv Vx,Hx,Wx vfmaddsub213ps/dv Vx,Hx,Wx vfmaddsub231ps/dv Vx,Hx,Wx

vfmsubadd132ps/dv Vx,Hx,Wx vfmsubadd213ps/dv Vx,Hx,Wx vfmsubadd231ps/dv Vx,Hx,Wx

66 F F3 F2 66 & F2

MOVBE Gy, My MOVBE Gw, Mw

MOVBE My, Gy MOVBE Mw, Gw

ANDNv Gy, By, Ey

BZHIv Gy, Ey, By ADCX Gy, Ey ADOX Gy, Ey MULXv By,Gy,rDX,Ey

Grp 171A CRC32 Gd, Eb CRC32 Gd, Eb CRC32 Gd, Ey CRC32 Gd, Ew

PEXTv Gy, By, Ey PDEPv Gy, By, Ey

BEXTRv Gy, Ey, By SHLXv Gy, Ey, By SARXv Gy, Ey, By SHRXv Gy, Ey, By

...


Table A-4. Three-byte Opcode Map: 08H-FFH (First Two Bytes are 0F 38H) *
pfx 8
psignb Pq, Qq 0 66 vpsignb Vx, Hx, Wx

9
psignw Pq, Qq vpsignw Vx, Hx, Wx

A
psignd Pq, Qq vpsignd Vx, Hx, Wx

B
pmulhrsw Pq, Qq vpmulhrsw Vx, Hx, Wx

vpermilpsv Vx,Hx,Wx

vpermilpdv Vx,Hx,Wx

vtestpsv Vx, Wx

vtestpdv Vx, Wx

1 66 2 3 4 5 6 7 8 9 A B C D E 66 F F3 F2 66 & F2 66 66 66 66 66 66 66 66 66

pabsb Pq, Qq vbroadcastssv Vx, Wd vpmuldq Vx, Hx, Wx vpminsb Vx, Hx, Wx vbroadcastsdv Vqq, vbroadcastf128v Vqq, Mdq Wq vpcmpeqq Vx, Hx, Wx vpminsd Vx, Hx, Wx vmovntdqa Vx, Mx vpminuw Vx, Hx, Wx vpackusdw Vx, Hx, Wx vpminud Vx, Hx, Wx vpabsb Vx, Wx vmaskmovps Vx,Hx,Mx vpmaxsb Vx, Hx, Wx
v

pabsw Pq, Qq vpabsw Vx, Wx vmaskmovpd Vx,Hx,Mx vpmaxsd Vx, Hx, Wx


v

pabsd Pq, Qq vpabsd Vx, Wx vmaskmovpsv Mx,Hx,Vx vpmaxuw Vx, Hx, Wx vmaskmovpdv Mx,Hx,Vx vpmaxud Vx, Hx, Wx

vpbroadcastdv Vx, Wx vpbroadcastbv Vx, Wx

vpbroadcastqv Vx, Wx vpbroadcastwv Vx, Wx

vbroadcasti128v

Vqq, Mdq

vpmaskmovd/qv Vx,Hx,Mx

vpmaskmovd/qv Mx,Vx,Hx
vfnmadd132ss/dv vfnmsub132ps/dv vfnmsub132ss/dv

vfmadd132ps/dv Vx, Hx, Wx vfmadd213ps/dv Vx, Hx, Wx vfmadd231ps/dv Vx, Hx, Wx

vfmadd132ss/dv Vx, Hx, Wx vfmadd213ss/dv Vx, Hx, Wx vfmadd231ss/dv Vx, Hx, Wx

vfmsub132ps/dv Vx, Hx, Wx vfmsub213ps/dv Vx, Hx, Wx vfmsub231ps/dv Vx, Hx, Wx

vfmsub132ss/dv Vx, Hx, Wx vfmsub213ss/dv Vx, Hx, Wx vfmsub231ss/dv Vx, Hx, Wx


VAESIMC

vfnmadd132ps/dv

Vx, Hx, Wx
vfnmadd213ps/dv

Vx, Hx, Wx
vfnmadd213ss/dv

Vx, Hx, Wx
vfnmsub213ps/dv

Vx, Hx, Wx
vfnmsub213ss/dv

Vx, Hx, Wx
vfnmadd231ps/dv

Vx, Hx, Wx
vfnmadd231ss/dv

Vx, Hx, Wx
vfnmsub231ps/dv

Vx, Hx, Wx
vfnmsub231ss/dv

Vx, Hx, Wx
VAESENC

Vx, Hx, Wx
VAESENCLAST

Vx, Hx, Wx
VAESDEC

Vx, Hx, Wx
VAESDECLAST

Vdq, Wdq

Vdq,Hdq,Wdq

Vdq,Hdq,Wdq

Vdq,Hdq,Wdq

Vdq,Hdq,Wdq

NOTES:

* All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations. ...


Table A-5. Three-byte Opcode Map: 00H-F7H (First Two Bytes are 0F 3AH) *
pfx 0 vpermqv Vqq, Wqq, Ib 0 66 1 vpermpdv Vqq, Wqq, Ib 2 vpblenddv Vx,Hx,Wx,Ib 3 4 vpermilpsv Vx, Wx, Ib 5 vpermilpdv Vx, Wx, Ib 6 vperm2f128v Vqq,Hqq,Wqq,Ib 7

1 2 3 4 5 6 7 8 9 A B C D E F

66 66 vpinsrb vinsertps Vdq,Hdq,Ry/Mb,Ib Vdq,Hdq,Udq/Md,Ib vpinsrd/q Vdq,Hdq,Ey,Ib

vpextrb Rd/Mb, Vdq, Ib

vpextrw Rd/Mw, Vdq, Ib

vpextrd/q Ey, Vdq, Ib

vextractps Ed, Vdq, Ib

66

vdpps Vx,Hx,Wx,Ib vpcmpestrm Vdq, Wdq, Ib

vdppd Vdq,Hdq,Wdq,Ib vpcmpestri Vdq, Wdq, Ib

vmpsadbw Vx,Hx,Wx,Ib vpcmpistrm Vdq, Wdq, Ib vpcmpistri Vdq, Wdq, Ib

vpclmulqdq Vdq,Hdq,Wdq,Ib

vperm2i128v Vqq,Hqq,Wqq,Ib

66

F2

RORXv Gy, Ey, Ib

...


Table A-5. Three-byte Opcode Map: 08H-FFH (First Two Bytes are 0F 3AH) *
pfx 0 66 1 2 3 4 5 6 7 8 9 A B C D E F 66 VAESKEYGEN Vdq, Wdq, Ib 66 66 vinserti128v Vqq,Hqq,Wqq,Ib vextracti128v Wdq,Vqq,Ib vblendvpsv Vx,Hx,Wx,Lx vblendvpdv Vx,Hx,Wx,Lx vpblendvbv Vx,Hx,Wx,Lx 66 vroundps Vx,Wx,Ib vinsertf128v Vqq,Hqq,Wqq,Ib vroundpd Vx,Wx,Ib vextractf128v Wdq,Vqq,Ib vroundss Vss,Wss,Ib vroundsd Vsd,Wsd,Ib vblendps Vx,Hx,Wx,Ib vblendpd Vx,Hx,Wx,Ib vcvtps2phv Wx, Vx, Ib vpblendw Vx,Hx,Wx,Ib 8 9 A B C D E F palignr Pq, Qq, Ib vpalignr Vx,Hx,Wx,Ib

NOTES:

* All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations. ...


Table A-6. Opcode Extensions for One- and Two-byte Opcodes by Group Number *
Encoding of Bits 5,4,3 of the ModR/M Byte (bits 2,1,0 in parenthesis)
Opcode
80-83 8F C0,C1 reg, imm D0, D1 reg, 1 D2, D3 reg, CL F6, F7 FE FF 0F 00

Group
1 1A 2

Mod 7,6
mem, 11B mem, 11B mem, 11B

pfx

000
ADD POP ROL

001
OR

010
ADC

011
SBB

100
AND

101
SUB

110
XOR

111
CMP

ROR

RCL

RCR

SHL/SAL

SHR

SAR

3 4 5 6

mem, 11B mem, 11B mem, 11B mem, 11B mem 11B

TEST Ib/Iz INC Eb INC Ev SLDT Rv/Mw SGDT Ms DEC Eb DEC Ev STR Rv/Mw SIDT Ms

NOT

NEG

MUL AL/rAX

IMUL AL/rAX

DIV AL/rAX

IDIV AL/rAX

CALLNf64 Ev LLDT Ew LGDT Ms

CALLF Ep LTR Ew LIDT Ms

JMPNf64 Ev VERR Ew SMSW Mw/Rv

JMPF Mp VERW Ew

PUSHd64 Ev

LMSW Ew

INVLPG Mb SWAPGS o64(000) RDTSCP (001)

0F 01

VMCALL (001) MONITOR XGETBV (000) XSETBV (001) VMLAUNCH (000) (010) MWAIT (001) VMFUNC VMRESUME CLAC (010) (100) (011) VMXOFF STAC (011) XEND (101) (100) XTEST (110) BT CMPXCH8B Mq CMPXCHG16B Mdq BTS BTR VMPTRLD Mq VMCLEAR Mq VMXON Mq RDRAND Rv

0F BA

mem, 11B

BTC VMPTRST Mq

mem 0F C7 9

66 F3

VMPTRST Mq RDSEED Rv

11B 0F B9 10 mem 11B mem C6 11 C7 11B mem 11B mem 0F 71 12 psrlw Nq, Ib 66 vpsrlw Hx,Ux,Ib psraw Nq, Ib vpsraw Hx,Ux,Ib MOV Eb, Ib MOV Ev, Iz

XABORT (000) Ib

XBEGIN (000) Jz

11B

psllw Nq, Ib vpsllw Hx,Ux,Ib

mem 0F 72 13 psrld Nq, Ib 66 vpsrld Hx,Ux,Ib psrad Nq, Ib vpsrad Hx,Ux,Ib pslld Nq, Ib vpslld Hx,Ux,Ib

11B

mem 0F 73 14 psrlq Nq, Ib 66 vpsrlq Hx,Ux,Ib vpsrldq Hx,Ux,Ib psllq Nq, Ib vpsllq Hx,Ux,Ib vpslldq Hx,Ux,Ib

11B


Encoding of Bits 5,4,3 of the ModR/M Byte (bits 2,1,0 in parenthesis)


Opcode Group Mod 7,6
mem 0F AE 15 F3 11B 0F 18 16 mem 11B VEX.0F38 F3 17 mem 11B BLSRv By, Ey BLSMSKv By, Ey BLSIv By, Ey RDFSBASE Ry prefetch NTA RDGSBASE Ry prefetch T0 WRFSBASE Ry prefetch T1 WRGSBASE Ry prefetch T2

pfx

000
fxsave

001
fxrstor

010
ldmxcsr

011
stmxcsr

100
XSAVE

101
XRSTOR lfence

110
XSAVEOPT mfence

111
clflush sfence

NOTES:

* All blanks in all opcode maps are reserved and must not be used. Do not depend on the operation of undefined or reserved locations.
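Table A-6 is indexed by bits 5,4,3 of the ModR/M byte, with bits 7,6 (Mod) distinguishing memory forms from the 11B register form. The C sketch below shows the field extraction and a lookup for Group 1 (opcodes 80H-83H), whose eight entries appear in the table. The structure and function names are hypothetical.

```c
#include <stdint.h>

/* The three fields of a ModR/M byte. */
struct modrm {
    uint8_t mod;  /* bits 7,6: 11B selects a register operand */
    uint8_t reg;  /* bits 5,4,3: opcode extension for group opcodes */
    uint8_t rm;   /* bits 2,1,0 */
};

static struct modrm decode_modrm(uint8_t b)
{
    struct modrm m;
    m.mod = (uint8_t)(b >> 6);
    m.reg = (uint8_t)((b >> 3) & 7);
    m.rm  = (uint8_t)(b & 7);
    return m;
}

/* Group 1 row of Table A-6, indexed by bits 5,4,3 (000 through 111). */
static const char *const group1_ops[8] = {
    "ADD", "OR", "ADC", "SBB", "AND", "SUB", "XOR", "CMP"
};

/* Mnemonic selected by the ModR/M byte following an 80H-83H opcode. */
static const char *group1_mnemonic(uint8_t modrm_byte)
{
    return group1_ops[decode_modrm(modrm_byte).reg];
}
```

For example, the byte F8H decodes to Mod = 11B, bits 5,4,3 = 111B, and bits 2,1,0 = 000B, which selects CMP with a register operand.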

10. Updates to Chapter 1, Volume 3A


Change bars show changes to Chapter 1 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A: System Programming Guide, Part 1.
------------------------------------------------------------------------------------------
...

1.1 INTEL 64 AND IA-32 PROCESSORS COVERED IN THIS MANUAL

This manual set includes information pertaining primarily to the most recent Intel 64 and IA-32 processors, which include:
• Pentium processors
• P6 family processors
• Pentium 4 processors
• Pentium M processors
• Intel Xeon processors
• Pentium D processors
• Pentium processor Extreme Editions
• 64-bit Intel Xeon processors
• Intel Core Duo processor
• Intel Core Solo processor
• Dual-Core Intel Xeon processor LV
• Intel Core2 Duo processor
• Intel Core2 Quad processor Q6000 series
• Intel Xeon processor 3000, 3200 series
• Intel Xeon processor 5000 series
• Intel Xeon processor 5100, 5300 series
• Intel Core2 Extreme processor X7000 and X6800 series
• Intel Core2 Extreme QX6000 series
• Intel Xeon processor 7100 series
• Intel Pentium Dual-Core processor
• Intel Xeon processor 7200, 7300 series
• Intel Core2 Extreme QX9000 series
• Intel Xeon processor 5200, 5400, 7400 series
• Intel Core2 Extreme processor QX9000 and X9000 series
• Intel Core2 Quad processor Q9000 series
• Intel Core2 Duo processor E8000, T9000 series
• Intel Atom processor family
• Intel Core i7 processor
• Intel Core i5 processor
• Intel Xeon processor E7-8800/4800/2800 product families
• Intel Xeon processor E5 family
• Intel Xeon processor E3-1200 family
• Intel Core i7-3930K processor
• 2nd generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx processor series
• Intel Xeon processor E3-1200 v2 product family
• 3rd generation Intel Core processors
• Next generation Intel Core processors

P6 family processors are IA-32 processors based on the P6 family microarchitecture. This includes the Pentium Pro, Pentium II, Pentium III, and Pentium III Xeon processors.

The Pentium 4, Pentium D, and Pentium processor Extreme Editions are based on the Intel NetBurst microarchitecture. Most early Intel Xeon processors are based on the Intel NetBurst microarchitecture. Intel Xeon processor 5000, 7100 series are based on the Intel NetBurst microarchitecture.

The Intel Core Duo, Intel Core Solo and dual-core Intel Xeon processor LV are based on an improved Pentium M processor microarchitecture.

The Intel Xeon processor 3000, 3200, 5100, 5300, 7200, and 7300 series, Intel Pentium dual-core, Intel Core2 Duo, Intel Core2 Quad and Intel Core2 Extreme processors are based on Intel Core microarchitecture.

The Intel Xeon processor 5200, 5400, 7400 series, Intel Core2 Quad processor Q9000 series, and Intel Core2 Extreme processors QX9000, X9000 series, Intel Core2 processor E8000 series are based on Enhanced Intel Core microarchitecture.

The Intel Atom processor family is based on the Intel Atom microarchitecture and supports Intel 64 architecture.

The Intel Core i7 processor and the Intel Core i5 processor are based on the Intel microarchitecture code name Nehalem and support Intel 64 architecture.

Processors based on Intel microarchitecture code name Westmere support Intel 64 architecture.

The Intel Xeon processor E5 family, Intel Xeon processor E3-1200 family, Intel Xeon processor E7-8800/4800/2800 product families, Intel Core i7-3930K processor, 2nd generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx processor series are based on the Intel microarchitecture code name Sandy Bridge and support Intel 64 architecture.

The Intel Xeon processor E3-1200 v2 product family and 3rd generation Intel Core processors are based on the Intel microarchitecture code name Ivy Bridge and support Intel 64 architecture.

The Next Generation Intel Core processors are based on the Intel microarchitecture code name Haswell and support Intel 64 architecture.

P6 family, Pentium M, Intel Core Solo, Intel Core Duo processors, dual-core Intel Xeon processor LV, and early generations of Pentium 4 and Intel Xeon processors support IA-32 architecture. The Intel Atom processor Z5xx series support IA-32 architecture.

The Intel Xeon processor 3000, 3200, 5000, 5100, 5200, 5300, 5400, 7100, 7200, 7300, 7400 series, Intel Core2 Duo, Intel Core2 Extreme processors, Intel Core 2 Quad processors, Pentium D processors, Pentium Dual-Core processor, newer generations of Pentium 4 and Intel Xeon processor family support Intel 64 architecture.

IA-32 architecture is the instruction set architecture and programming environment for Intel's 32-bit microprocessors. Intel 64 architecture is the instruction set architecture and programming environment which is a superset of and compatible with IA-32 architecture.

1.2 OVERVIEW OF THE SYSTEM PROGRAMMING GUIDE

A description of this manual's content follows:

Chapter 1 - About This Manual. Gives an overview of all seven volumes of the Intel 64 and IA-32 Architectures Software Developer's Manual. It also describes the notational conventions in these manuals and lists related Intel manuals and documentation of interest to programmers and hardware designers.

Chapter 2 - System Architecture Overview. Describes the modes of operation used by Intel 64 and IA-32 processors and the mechanisms provided by the architectures to support operating systems and executives, including the system-oriented registers and data structures and the system-oriented instructions. The steps necessary for switching between real-address and protected modes are also identified.

Chapter 3 - Protected-Mode Memory Management. Describes the data structures, registers, and instructions that support segmentation and paging. The chapter explains how they can be used to implement a flat (unsegmented) memory model or a segmented memory model.

Chapter 4 - Paging. Describes the paging modes supported by Intel 64 and IA-32 processors.

Chapter 5 - Protection. Describes the support for page and segment protection provided in the Intel 64 and IA-32 architectures. This chapter also explains the implementation of privilege rules, stack switching, pointer validation, user and supervisor modes.

Chapter 6 - Interrupt and Exception Handling. Describes the basic interrupt mechanisms defined in the Intel 64 and IA-32 architectures, shows how interrupts and exceptions relate to protection, and describes how the architecture handles each exception type. Reference information for each exception is given in this chapter. Includes programming the LINT0 and LINT1 inputs and gives an example of how to program the LINT0 and LINT1 pins for specific interrupt vectors.

Chapter 7 - Task Management. Describes mechanisms the Intel 64 and IA-32 architectures provide to support multitasking and inter-task protection.

Chapter 8 - Multiple-Processor Management. Describes the instructions and flags that support multiple processors with shared memory, memory ordering, and Intel Hyper-Threading Technology. Includes MP initialization for P6 family processors and gives an example of how to use the MP protocol to boot P6 family processors in an MP system.

Chapter 9 - Processor Management and Initialization. Defines the state of an Intel 64 or IA-32 processor after reset initialization. This chapter also explains how to set up an Intel 64 or IA-32 processor for real-address mode operation and protected-mode operation, and how to switch between modes.

Chapter 10 - Advanced Programmable Interrupt Controller (APIC). Describes the programming interface to the local APIC and gives an overview of the interface between the local APIC and the I/O APIC. Includes APIC bus message formats and describes the message formats for messages transmitted on the APIC bus for P6 family and Pentium processors.

Chapter 11 - Memory Cache Control. Describes the general concept of caching and the caching mechanisms supported by the Intel 64 or IA-32 architectures. This chapter also describes the memory type range registers (MTRRs) and how they can be used to map memory types of physical memory. Information on using the new cache control and memory streaming instructions introduced with the Pentium III, Pentium 4, and Intel Xeon processors is also given.

Chapter 12 - Intel MMX Technology System Programming. Describes those aspects of the Intel MMX technology that must be handled and considered at the system programming level, including: task switching, exception handling, and compatibility with existing system environments.

Chapter 13 - System Programming For Instruction Set Extensions And Processor Extended States. Describes the operating system requirements to support SSE/SSE2/SSE3/SSSE3/SSE4 extensions, including task switching, exception handling, and compatibility with existing system environments. The latter part of this chapter describes the extensible framework of operating system requirements to support processor extended states. Processor extended state may be required by instruction set extensions beyond those of SSE/SSE2/SSE3/SSSE3/SSE4 extensions.

Chapter 14 - Power and Thermal Management. Describes facilities of Intel 64 and IA-32 architecture used for power management and thermal monitoring.

Chapter 15 - Machine-Check Architecture. Describes the machine-check architecture and machine-check exception mechanism found in the Pentium 4, Intel Xeon, and P6 family processors. Additionally, a signaling mechanism for software to respond to hardware corrected machine check error is covered.

Chapter 16 - Interpreting Machine-Check Error Codes. Gives an example of how to interpret the error codes for a machine-check error that occurred on a P6 family processor.

Chapter 17 - Debugging, Branch Profiles and Time-Stamp Counter. Describes the debugging registers and other debug mechanism provided in Intel 64 or IA-32 processors. This chapter also describes the time-stamp counter.

Chapter 18 - Performance Monitoring. Describes the Intel 64 and IA-32 architectures' facilities for monitoring performance.

Chapter 19 - Performance-Monitoring Events. Lists architectural performance events. Non-architectural performance events (i.e., model-specific events) are listed for each generation of microarchitecture.

Chapter 20 - 8086 Emulation. Describes the real-address and virtual-8086 modes of the IA-32 architecture.

Chapter 21 - Mixing 16-Bit and 32-Bit Code. Describes how to mix 16-bit and 32-bit code modules within the same program or task.

Chapter 22 - IA-32 Architecture Compatibility. Describes architectural compatibility among IA-32 processors.

Chapter 23 - Introduction to Virtual-Machine Extensions. Describes the basic elements of virtual machine architecture and the virtual-machine extensions for Intel 64 and IA-32 Architectures.

Chapter 24 - Virtual-Machine Control Structures. Describes components that manage VMX operation. These include the working-VMCS pointer and the controlling-VMCS pointer.

Chapter 25 - VMX Non-Root Operation. Describes the operation of a VMX non-root operation. Processor operation in VMX non-root mode can be restricted programmatically such that certain operations, events or conditions can cause the processor to transfer control from the guest (running in VMX non-root mode) to the monitor software (running in VMX root mode).

Chapter 26 - VM Entries. Describes VM entries. VM entry transitions the processor from the VMM running in VMX root-mode to a VM running in VMX non-root mode. VM-Entry is performed by the execution of VMLAUNCH or VMRESUME instructions.


Chapter 27 VM Exits. Describes VM exits. Certain events, operations or situations while the processor is in VMX non-root operation may cause VM-exit transitions. In addition, VM exits can also occur on failed VM entries.
Chapter 28 VMX Support for Address Translation. Describes virtual-machine extensions that support address translation and the virtualization of physical memory.
Chapter 29 APIC Virtualization and Virtual Interrupts. Describes the VMCS including controls that enable the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC).
Chapter 30 VMX Instruction Reference. Describes the virtual-machine extensions (VMX). VMX is intended for a system executive to support virtualization of processor hardware and a system software layer acting as a host to multiple guest software environments.
Chapter 31 Virtual-Machine Monitoring Programming Considerations. Describes programming considerations for VMMs. VMMs manage virtual machines (VMs).
Chapter 32 Virtualization of System Resources. Describes the virtualization of the system resources. These include: debugging facilities, address translation, physical memory, and microcode update facilities.
Chapter 33 Handling Boundary Conditions in a Virtual Machine Monitor. Describes what a VMM must consider when handling exceptions, interrupts, error conditions, and transitions between activity states.
Chapter 34 System Management Mode. Describes Intel 64 and IA-32 architectures system management mode (SMM) facilities.
Chapter 35 Model-Specific Registers (MSRs). Lists the MSRs available in the Pentium processors, the P6 family processors, the Pentium 4, Intel Xeon, Intel Core Solo, Intel Core Duo processors, and Intel Core 2 processor family and describes their functions.
Appendix A VMX Capability Reporting Facility. Describes the VMX capability MSRs. Support for specific VMX features is determined by reading capability MSRs.
Appendix B Field Encoding in VMCS. Enumerates all fields in the VMCS and their encodings. Fields are grouped by width (16-bit, 32-bit, etc.) and type (guest-state, host-state, etc.).
Appendix C VM Basic Exit Reasons. Describes the 32-bit fields that encode reasons for a VM exit. Examples of exit reasons include, but are not limited to: software interrupts, processor exceptions, software traps, NMIs, external interrupts, and triple faults.
...

11. Updates to Chapter 2, Volume 3A


Change bars show changes to Chapter 2 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------...

2.2.1

Extended Feature Enable Register

The IA32_EFER MSR provides several fields related to IA-32e mode enabling and operation. It also provides one field that relates to page-access right modification (see Section 4.6, Access Rights). The layout of the IA32_EFER MSR is shown in Figure 2-4.


Figure 2-4 IA32_EFER MSR Layout (bit 0: SYSCALL Enable; bit 8: IA-32e Mode Enable; bit 10: IA-32e Mode Active; bit 11: Execute Disable Bit Enable; all other bits reserved)

Table 2-1 IA32_EFER MSR Information


Bit      Description
0        SYSCALL Enable (R/W). Enables SYSCALL/SYSRET instructions in 64-bit mode.
7:1      Reserved.
8        IA-32e Mode Enable (R/W). Enables IA-32e mode operation.
9        Reserved.
10       IA-32e Mode Active (R). Indicates IA-32e mode is active when set.
11       Execute Disable Bit Enable (R/W). Enables page access restriction by preventing instruction fetches from PAE pages with the XD bit set (see Section 4.6).
63:12    Reserved.

...
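
As an illustration only, the bit assignments in Table 2-1 can be decoded from a raw 64-bit register value as sketched below. The constant and function names (EFER_SCE, decode_efer, and so on) are hypothetical and chosen for this example; the sketch operates on a plain integer and does not read the MSR itself.

```python
# Bit positions follow Table 2-1; all names here are illustrative assumptions.
EFER_SCE = 1 << 0    # SYSCALL Enable (R/W)
EFER_LME = 1 << 8    # IA-32e Mode Enable (R/W)
EFER_LMA = 1 << 10   # IA-32e Mode Active (R)
EFER_NXE = 1 << 11   # Execute Disable Bit Enable (R/W)

# Every bit not listed above (7:1, 9, 63:12) is reserved.
EFER_RESERVED = ~(EFER_SCE | EFER_LME | EFER_LMA | EFER_NXE) & (2**64 - 1)


def decode_efer(value):
    """Return the defined IA32_EFER flags found in a raw 64-bit value."""
    if value & EFER_RESERVED:
        raise ValueError("reserved IA32_EFER bits are set")
    return {
        "SCE": bool(value & EFER_SCE),
        "LME": bool(value & EFER_LME),
        "LMA": bool(value & EFER_LMA),
        "NXE": bool(value & EFER_NXE),
    }
```

A value such as 0xD01 has bits 0, 8, 10, and 11 set, so all four flags decode as set.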

12. Updates to Chapter 4, Volume 3A


Change bars show changes to Chapter 4 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------...


4.1.1

Three Paging Modes

If CR0.PG = 0, paging is not used. The logical processor treats all linear addresses as if they were physical addresses. CR4.PAE and IA32_EFER.LME are ignored by the processor, as are CR0.WP, CR4.PSE, CR4.PGE, CR4.SMEP, and IA32_EFER.NXE.

Paging is enabled if CR0.PG = 1. Paging can be enabled only if protection is enabled (CR0.PE = 1). If paging is enabled, one of three paging modes is used. The values of CR4.PAE and IA32_EFER.LME determine which paging mode is used:

If CR0.PG = 1 and CR4.PAE = 0, 32-bit paging is used. 32-bit paging is detailed in Section 4.3. 32-bit paging uses CR0.WP, CR4.PSE, CR4.PGE, and CR4.SMEP as described in Section 4.1.3.
If CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 0, PAE paging is used. PAE paging is detailed in Section 4.4. PAE paging uses CR0.WP, CR4.PGE, CR4.SMEP, and IA32_EFER.NXE as described in Section 4.1.3.
If CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 1, IA-32e paging is used.¹ IA-32e paging is detailed in Section 4.5. IA-32e paging uses CR0.WP, CR4.PGE, CR4.PCIDE, CR4.SMEP, and IA32_EFER.NXE as described in Section 4.1.3. IA-32e paging is available only on processors that support the Intel 64 architecture.

The three paging modes differ with regard to the following details:

Linear-address width. The size of the linear addresses that can be translated.
Physical-address width. The size of the physical addresses produced by paging.
Page size. The granularity at which linear addresses are translated. Linear addresses on the same page are translated to corresponding physical addresses on the same page.
Support for execute-disable access rights. In some paging modes, software can be prevented from fetching instructions from pages that are otherwise readable.
Support for PCIDs. In some paging modes, software can enable a facility by which a logical processor caches information for multiple linear-address spaces. The processor may retain cached information when software switches between different linear-address spaces.

Table 4-1 illustrates the key differences between the three paging modes.

Table 4-1 Properties of Different Paging Modes


Paging Mode  PG in CR0  PAE in CR4  LME in IA32_EFER  Lin.-Addr. Width  Phys.-Addr. Width¹  Page Sizes           Supports Execute-Disable?  Supports PCIDs?
None         0          N/A         N/A               32                32                  N/A                  No                         No
32-bit       1          0           0²                32                Up to 40³           4 KB, 4 MB⁴          No                         No
PAE          1          1           0                 32                Up to 52            4 KB, 2 MB           Yes⁵                       No
IA-32e       1          1           1                 48                Up to 52            4 KB, 2 MB, 1 GB⁶    Yes⁵                       Yes⁷

NOTES:
1. The physical-address width is always bounded by MAXPHYADDR; see Section 4.1.4.
2. The processor ensures that IA32_EFER.LME must be 0 if CR0.PG = 1 and CR4.PAE = 0.
3. 32-bit paging supports physical-address widths of more than 32 bits only for 4-MByte pages and only if the PSE-36 mechanism is supported; see Section 4.1.4 and Section 4.3.
4. 4-MByte pages are used with 32-bit paging only if CR4.PSE = 1; see Section 4.3.
5. Execute-disable access rights are applied only if IA32_EFER.NXE = 1; see Section 4.6.
6. Not all processors that support IA-32e paging support 1-GByte pages; see Section 4.1.4.
7. PCIDs are used only if CR4.PCIDE = 1; see Section 4.10.1.

1. The LMA flag in the IA32_EFER MSR (bit 10) is a status bit that indicates whether the logical processor is in IA-32e mode (and thus using IA-32e paging). The processor always sets IA32_EFER.LMA to CR0.PG & IA32_EFER.LME. Software cannot directly modify IA32_EFER.LMA; an execution of WRMSR to the IA32_EFER MSR ignores bit 10 of its source operand.

Because they are used only if IA32_EFER.LME = 0, 32-bit paging and PAE paging are used only in legacy protected mode. Because legacy protected mode cannot produce linear addresses larger than 32 bits, 32-bit paging and PAE paging translate 32-bit linear addresses.

Because it is used only if IA32_EFER.LME = 1, IA-32e paging is used only in IA-32e mode. (In fact, it is the use of IA-32e paging that defines IA-32e mode.) IA-32e mode has two sub-modes:
...
Compatibility mode. This mode uses only 32-bit linear addresses. IA-32e paging treats bits 47:32 of such an address as all 0.
64-bit mode. While this mode produces 64-bit linear addresses, the processor ensures that bits 63:47 of such an address are identical.¹ IA-32e paging does not use bits 63:48 of such addresses.
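
The selection of a paging mode from CR0.PG, CR4.PAE, and IA32_EFER.LME described in this section can be summarized in a few lines. The sketch below is illustrative only; the function names are hypothetical, and the inputs are plain flag values rather than live register reads.

```python
def paging_mode(cr0_pg, cr4_pae, efer_lme):
    """Return the paging mode implied by the three control bits."""
    if not cr0_pg:
        return "none"       # linear addresses are treated as physical addresses
    if not cr4_pae:
        return "32-bit"     # the processor ensures LME = 0 in this case
    return "IA-32e" if efer_lme else "PAE"


def efer_lma(cr0_pg, efer_lme):
    """IA32_EFER.LMA is always CR0.PG & IA32_EFER.LME."""
    return bool(cr0_pg) and bool(efer_lme)
```

For example, CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LME = 0 selects PAE paging, and IA32_EFER.LMA reads as 0.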

13. Updates to Chapter 5, Volume 3A


Change bars show changes to Chapter 5 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------...

5.8

PRIVILEGE LEVEL CHECKING WHEN TRANSFERRING PROGRAM CONTROL BETWEEN CODE SEGMENTS

To transfer program control from one code segment to another, the segment selector for the destination code segment must be loaded into the code-segment register (CS). As part of this loading process, the processor examines the segment descriptor for the destination code segment and performs various limit, type, and privilege checks. If these checks are successful, the CS register is loaded, program control is transferred to the new code segment, and program execution begins at the instruction pointed to by the EIP register.

Program control transfers are carried out with the JMP, CALL, RET, SYSENTER, SYSEXIT, SYSCALL, SYSRET, INT n, and IRET instructions, as well as by the exception and interrupt mechanisms. Exceptions, interrupts, and the IRET instruction are special cases discussed in Chapter 6, Interrupt and Exception Handling. This chapter discusses only the JMP, CALL, RET, SYSENTER, SYSEXIT, SYSCALL, and SYSRET instructions.

A JMP or CALL instruction can reference another code segment in any of four ways:
The target operand contains the segment selector for the target code segment.
The target operand points to a call-gate descriptor, which contains the segment selector for the target code segment.
The target operand points to a TSS, which contains the segment selector for the target code segment.

1. Such an address is called canonical. Use of a non-canonical linear address in 64-bit mode produces a general-protection exception (#GP(0)); the processor does not attempt to translate non-canonical linear addresses using IA-32e paging.


The target operand points to a task gate, which points to a TSS, which in turn contains the segment selector for the target code segment.

The following sections describe the first two types of references. See Section 7.3, Task Switching, for information on transferring program control through a task gate and/or TSS.

The SYSENTER and SYSEXIT instructions are special instructions for making fast calls to and returns from operating system or executive procedures. These instructions are discussed in Section 5.8.7, Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.

The SYSCALL and SYSRET instructions are special instructions for making fast calls to and returns from operating system or executive procedures in 64-bit mode. These instructions are discussed in Section 5.8.8, Fast System Calls in 64-Bit Mode.
...

5.8.7.1

SYSENTER and SYSEXIT Instructions in IA-32e Mode

For Intel 64 processors, the SYSENTER and SYSEXIT instructions are enhanced to allow fast system calls from user code running at privilege level 3 (in compatibility mode or 64-bit mode) to 64-bit executive procedures running at privilege level 0. IA32_SYSENTER_EIP MSR and IA32_SYSENTER_ESP MSR are expanded to hold 64-bit addresses. If IA-32e mode is inactive, only the lower 32-bit addresses stored in these MSRs are used. The WRMSR instruction ensures that the addresses stored in these MSRs are canonical. Note that, in 64-bit mode, IA32_SYSENTER_CS must not contain a NULL selector.

When SYSENTER transfers control, the following fields are generated and bits set:
Target code segment: Reads non-NULL selector from IA32_SYSENTER_CS.
New CS attributes: CS base = 0, CS limit = FFFFFFFFH.
Target instruction: Reads 64-bit canonical address from IA32_SYSENTER_EIP.
Stack segment: Computed by adding 8 to the value from IA32_SYSENTER_CS.
Stack pointer: Reads 64-bit canonical address from IA32_SYSENTER_ESP.
New SS attributes: SS base = 0, SS limit = FFFFFFFFH.

When the SYSEXIT instruction transfers control to 64-bit mode user code using REX.W, the following fields are generated and bits set:
Target code segment: Computed by adding 32 to the value in IA32_SYSENTER_CS.
New CS attributes: L-bit = 1 (go to 64-bit mode).
Target instruction: Reads 64-bit canonical address in RDX.
Stack segment: Computed by adding 40 to the value of IA32_SYSENTER_CS.
Stack pointer: Update RSP using 64-bit canonical address in RCX.

When SYSEXIT transfers control to compatibility mode user code when the operand size attribute is 32 bits, the following fields are generated and bits set:
Target code segment: Computed by adding 16 to the value in IA32_SYSENTER_CS.
New CS attributes: L-bit = 0 (go to compatibility mode).
Target instruction: Fetch the target instruction from 32-bit address in EDX.
Stack segment: Computed by adding 24 to the value in IA32_SYSENTER_CS.
Stack pointer: Update ESP from 32-bit address in ECX.
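
The selector arithmetic in the three cases above can be sketched as follows. This is an illustration of the offsets stated in the text only: the function names are hypothetical, the input is a plain integer standing in for the IA32_SYSENTER_CS value, and the NULL-selector test (index 0 in the GDT, ignoring the RPL bits) is an assumption of this sketch.

```python
def sysenter_selectors(sysenter_cs):
    """Selectors used when SYSENTER transfers control."""
    if (sysenter_cs & 0xFFFC) == 0:
        # Assumption of this sketch: values 0-3 are treated as NULL selectors.
        raise ValueError("IA32_SYSENTER_CS must not contain a NULL selector")
    return {"cs": sysenter_cs, "ss": sysenter_cs + 8}


def sysexit_selectors(sysenter_cs, to_64bit_mode):
    """Selectors used when SYSEXIT returns to privilege-level-3 code."""
    if to_64bit_mode:
        # SYSEXIT with REX.W: return to 64-bit mode user code.
        return {"cs": sysenter_cs + 32, "ss": sysenter_cs + 40}
    # 32-bit operand size: return to compatibility mode user code.
    return {"cs": sysenter_cs + 16, "ss": sysenter_cs + 24}
```

With IA32_SYSENTER_CS = 10H, for example, SYSENTER uses CS = 10H and SS = 18H, and a 64-bit SYSEXIT uses CS = 30H and SS = 38H.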


5.8.8

Fast System Calls in 64-Bit Mode

The SYSCALL and SYSRET instructions are designed for operating systems that use a flat memory model (segmentation is not used). The instructions, along with SYSENTER and SYSEXIT, are suited for IA-32e mode operation. SYSCALL and SYSRET, however, are not supported in compatibility mode (or in protected mode). Use CPUID to check if SYSCALL and SYSRET are available (CPUID.80000001H.EDX[bit 11] = 1).

SYSCALL is intended for use by user code running at privilege level 3 to access operating system or executive procedures running at privilege level 0. SYSRET is intended for use by privilege level 0 operating system or executive procedures for fast returns to privilege level 3 user code.

Stack pointers for SYSCALL/SYSRET are not specified through model specific registers. The clearing of bits in RFLAGS is programmable rather than fixed. SYSCALL/SYSRET save and restore the RFLAGS register.

For SYSCALL, the processor saves RFLAGS into R11 and the RIP of the next instruction into RCX; it then gets the privilege-level 0 target code segment, instruction pointer, stack segment, and flags as follows:
Target code segment: Reads a non-NULL selector from IA32_STAR[47:32].
Target instruction pointer: Reads a 64-bit address from IA32_LSTAR. (The WRMSR instruction ensures that the value of the IA32_LSTAR MSR is canonical.)
Stack segment: Computed by adding 8 to the value in IA32_STAR[47:32].
Flags: The processor sets RFLAGS to the logical-AND of its current value with the complement of the value in the IA32_FMASK MSR.

When SYSRET transfers control to 64-bit mode user code using REX.W, the processor gets the privilege level 3 target code segment, instruction pointer, stack segment, and flags as follows:
Target code segment: Reads a non-NULL selector from IA32_STAR[63:48] + 16.
Target instruction pointer: Copies the value in RCX into RIP.
Stack segment: IA32_STAR[63:48] + 8.
EFLAGS: Loaded from R11.

When SYSRET transfers control to 32-bit mode user code using a 32-bit operand size, the processor gets the privilege level 3 target code segment, instruction pointer, stack segment, and flags as follows:
Target code segment: Reads a non-NULL selector from IA32_STAR[63:48].
Target instruction pointer: Copies the value in ECX into EIP.
Stack segment: IA32_STAR[63:48] + 8.
EFLAGS: Loaded from R11.

It is the responsibility of the OS to ensure the descriptors in the GDT/LDT correspond to the selectors loaded by SYSCALL/SYSRET (consistent with the base, limit, and attribute values forced by the instructions). See Figure 5-14 for the layout of IA32_STAR, IA32_LSTAR and IA32_FMASK.
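
The way SYSCALL and SYSRET derive their selectors and flags from IA32_STAR, IA32_LSTAR, and IA32_FMASK can be sketched as below. The function names and the example MSR values are hypothetical; the sketch models only the field extraction and arithmetic stated in this section, using plain integers in place of the MSRs.

```python
MASK64 = 2**64 - 1


def syscall_targets(star, lstar, fmask, rflags):
    """Privilege-level-0 state that SYSCALL derives from the MSRs."""
    selector = (star >> 32) & 0xFFFF           # IA32_STAR[47:32]
    return {
        "cs": selector,
        "ss": selector + 8,
        "rip": lstar,                          # target RIP from IA32_LSTAR
        "rflags": rflags & ~fmask & MASK64,    # current RFLAGS AND NOT IA32_FMASK
    }


def sysret_selectors(star, to_64bit_mode):
    """Privilege-level-3 selectors that SYSRET derives from IA32_STAR."""
    selector = (star >> 48) & 0xFFFF           # IA32_STAR[63:48]
    cs = selector + 16 if to_64bit_mode else selector
    return {"cs": cs, "ss": selector + 8}
```

As a usage example under assumed values, IA32_STAR[63:48] = 23H gives CS = 33H and SS = 2BH for a 64-bit SYSRET, and an IA32_FMASK value of 200H clears bit 9 of RFLAGS on SYSCALL.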


Figure 5-14 MSRs Used by SYSCALL and SYSRET (IA32_FMASK: bits 63:32 reserved, bits 31:0 SYSCALL EFLAGS Mask; IA32_LSTAR: bits 63:0 Target RIP for 64-bit Mode Calling Program; IA32_STAR: bits 63:48 SYSRET CS and SS, bits 47:32 SYSCALL CS and SS, bits 31:0 reserved)


The SYSCALL instruction does not save the stack pointer, and the SYSRET instruction does not restore it. It is likely that the OS system-call handler will change the stack pointer from the user stack to the OS stack. If so, it is the responsibility of software first to save the user stack pointer. This might be done by user code, prior to executing SYSCALL, or by the OS system-call handler after SYSCALL.

Because the SYSRET instruction does not modify the stack pointer, it is necessary for software to switch back to the user stack. The OS may load the user stack pointer (if it was saved after SYSCALL) before executing SYSRET; alternatively, user code may load the stack pointer (if it was saved before SYSCALL) after receiving control from SYSRET.

If the OS loads the stack pointer before executing SYSRET, it must ensure that the handler of any interrupt or exception delivered between restoring the stack pointer and successful execution of SYSRET is not invoked with the user stack. It can do so using approaches such as the following:
External interrupts. The OS can prevent an external interrupt from being delivered by clearing EFLAGS.IF before loading the user stack pointer.
Nonmaskable interrupts (NMIs). The OS can ensure that the NMI handler is invoked with the correct stack by using the interrupt stack table (IST) mechanism for gate 2 (NMI) in the IDT (see Section 6.14.5, Interrupt Stack Table).
General-protection exceptions (#GP). The SYSRET instruction generates #GP(0) if the value of RCX is not canonical. The OS can address this possibility using one or more of the following approaches:
  Confirming that the value of RCX is canonical before executing SYSRET.
  Using paging to ensure that the SYSCALL instruction will never save a non-canonical value into RCX.
  Using the IST mechanism for gate 13 (#GP) in the IDT.
...
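
The first of the #GP approaches, confirming that RCX is canonical, amounts to checking that bits 63:47 of the address are identical. A minimal sketch of that check, assuming 48-bit linear addresses and using a hypothetical function name, is:

```python
def is_canonical(address):
    """True if bits 63:47 of a 64-bit linear address are all 0 or all 1."""
    top = (address & (2**64 - 1)) >> 47   # the 17 bits 63:47
    return top == 0 or top == 0x1FFFF
```

An OS following this approach would apply such a test to the saved return address before executing SYSRET and take a slower return path when the test fails; that policy is an assumption of this note, not something the text prescribes.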


14. Updates to Chapter 11, Volume 3A


Change bars show changes to Chapter 11 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3A: System Programming Guide, Part 1. -----------------------------------------------------------------------------------------...

11.11.2.4 System-Management Range Register Interface


If IA32_MTRRCAP[bit 11] is set, the processor supports the SMRR interface to restrict access to a specified memory address range used by system-management mode (SMM) software (see Section 34.4.2.1). If the SMRR interface is supported, SMM software is strongly encouraged to use it to protect the SMI code and data stored by the SMI handler in the SMRAM region.

The system-management range registers consist of a pair of MSRs (see Figure 11-8). The IA32_SMRR_PHYSBASE MSR defines the base address for the SMRAM memory range and the memory type used to access it in SMM. The IA32_SMRR_PHYSMASK MSR contains a valid bit and a mask that determines the SMRAM address range protected by the SMRR interface. These MSRs may be written only in SMM; an attempt to write them outside of SMM causes a general-protection exception.1

Figure 11-8 shows flags and fields in these registers. The functions of these flags and fields are the following:
Type field, bits 0 through 7: Specifies the memory type for the range (see Table 11-8 for the encoding of this field).
PhysBase field, bits 12 through 31: Specifies the base address of the address range. The address must be less than 4 GBytes and is automatically aligned on a 4-KByte boundary.
PhysMask field, bits 12 through 31: Specifies a mask that determines the range of the region being mapped, according to the following relationship:
  Address_Within_Range AND PhysMask = PhysBase AND PhysMask
This value is extended by 12 bits at the low end to form the mask value. For more information, see Section 11.11.3, Example Base and Mask Calculations.
V (valid) flag, bit 11: Enables the register pair when set; disables the register pair when clear.

Before attempting to access these SMRR registers, software must test bit 11 in the IA32_MTRRCAP register. If SMRR is not supported, reads from or writes to these registers cause general-protection exceptions.

When the valid flag in the IA32_SMRR_PHYSMASK MSR is 1, accesses to the specified address range are treated as follows:
If the logical processor is in SMM, accesses use the memory type in the IA32_SMRR_PHYSBASE MSR.
If the logical processor is not in SMM, write accesses are ignored and read accesses return a fixed value for each byte. The uncacheable memory type (UC) is used in this case.

The above items apply even if the address range specified overlaps with a range specified by the MTRRs.
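
The range test defined by the PhysBase and PhysMask fields can be sketched as follows. The function name and the example register values are hypothetical. The sketch works on plain integers and assumes that addresses at or above 4 GBytes never match, because the text defines the fields only for bits 12 through 31.

```python
FIELD_31_12 = 0xFFFFF000   # bits 31:12 of either MSR, low 12 bits taken as 0
VALID_FLAG = 1 << 11       # V flag in IA32_SMRR_PHYSMASK


def smrr_covers(address, physbase_msr, physmask_msr):
    """True if the address falls in the range described by the SMRR pair."""
    if not physmask_msr & VALID_FLAG:
        return False                      # register pair disabled
    if address >= 2**32:
        return False                      # assumption of this sketch
    base = physbase_msr & FIELD_31_12
    mask = physmask_msr & FIELD_31_12
    # Address_Within_Range AND PhysMask = PhysBase AND PhysMask
    return (address & mask) == (base & mask)
```

For instance, an assumed base of 7F000000H with a mask of FF800000H and the valid flag set describes an 8-MByte range, so 7F123456H matches and 7F800000H does not.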

1. For some processor models, these MSRs can be accessed by RDMSR and WRMSR only if the SMRR interface has been enabled using a model-specific bit in the IA32_FEATURE_CONTROL MSR.


Figure 11-8 IA32_SMRR_PHYSBASE and IA32_SMRR_PHYSMASK SMRR Pair (IA32_SMRR_PHYSBASE: bits 63:32 reserved, bits 31:12 PhysBase (base address of range), bits 11:8 reserved, bits 7:0 Type (memory type for range); IA32_SMRR_PHYSMASK: bits 63:32 reserved, bits 31:12 PhysMask (sets range mask), bit 11 V (valid), bits 10:0 reserved)


...

15. Updates to Chapter 16, Volume 3B


Change bars show changes to Chapter 16 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------...


16.4.1

Internal Machine Check Errors

Table 16-13 Machine Check Error Codes for IA32_MC4_STATUS

Type                                  Bit No.  Bit Function                       Bit Description
MCA error codes¹                      0-15     MCACOD
Model specific errors                 19:16    Reserved except for the following  0000b - No Error
                                                                                  0001b - Non_IMem_Sel
                                                                                  0010b - I_Parity_Error
                                                                                  0011b - Bad_OpCode
                                                                                  0100b - I_Stack_Underflow
                                                                                  0101b - I_Stack_Overflow
                                                                                  0110b - D_Stack_Underflow
                                                                                  0111b - D_Stack_Overflow
                                                                                  1000b - Non-DMem_Sel
                                                                                  1001b - D_Parity_Error
                                      23-20    Reserved                           Reserved
                                      31-24    Reserved except for the following  00h - No Error
                                                                                  0Dh - MC_IMC_FORCE_SR_S3_TIMEOUT
                                                                                  0Eh - MC_CPD_UNCPD_ST_TIMOUT
                                                                                  0Fh - MC_PKGS_SAFE_WP_TIMEOUT
                                                                                  43h - MC_PECI_MAILBOX_QUIESCE_TIMEOUT
                                                                                  5Ch - MC_MORE_THAN_ONE_LT_AGENT
                                                                                  60h - MC_INVALID_PKGS_REQ_PCH
                                                                                  61h - MC_INVALID_PKGS_REQ_QPI
                                                                                  62h - MC_INVALID_PKGS_RES_QPI
                                                                                  63h - MC_INVALID_PKGC_RES_PCH
                                                                                  64h - MC_INVALID_PKG_STATE_CONFIG
                                                                                  70h - MC_WATCHDG_TIMEOUT_PKGC_SLAVE
                                                                                  71h - MC_WATCHDG_TIMEOUT_PKGC_MASTER
                                                                                  72h - MC_WATCHDG_TIMEOUT_PKGS_MASTER
                                                                                  7ah - MC_HA_FAILSTS_CHANGE_DETECTED
                                                                                  81h - MC_RECOVERABLE_DIE_THERMAL_TOO_HOT
                                      56-32    Reserved                           Reserved
Status register validity indicators¹  57-63

NOTES:
1. These fields are architecturally defined. Refer to Chapter 15, Machine-Check Architecture, for more information.
...
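
As an illustration of how the model-specific fields in Table 16-13 might be decoded in software, the sketch below extracts bits 15:0, 19:16, and 31:24 from a raw status value and looks the codes up. The dictionary and function names are hypothetical; the code strings are copied from the table, and any encoding not listed there is reported as "Reserved".

```python
# Encodings copied from Table 16-13; all Python names are illustrative.
MC4_CODES_19_16 = {
    0b0000: "No Error",
    0b0001: "Non_IMem_Sel",
    0b0010: "I_Parity_Error",
    0b0011: "Bad_OpCode",
    0b0100: "I_Stack_Underflow",
    0b0101: "I_Stack_Overflow",
    0b0110: "D_Stack_Underflow",
    0b0111: "D_Stack_Overflow",
    0b1000: "Non-DMem_Sel",
    0b1001: "D_Parity_Error",
}

MC4_CODES_31_24 = {
    0x00: "No Error",
    0x0D: "MC_IMC_FORCE_SR_S3_TIMEOUT",
    0x0E: "MC_CPD_UNCPD_ST_TIMOUT",
    0x0F: "MC_PKGS_SAFE_WP_TIMEOUT",
    0x43: "MC_PECI_MAILBOX_QUIESCE_TIMEOUT",
    0x5C: "MC_MORE_THAN_ONE_LT_AGENT",
    0x60: "MC_INVALID_PKGS_REQ_PCH",
    0x61: "MC_INVALID_PKGS_REQ_QPI",
    0x62: "MC_INVALID_PKGS_RES_QPI",
    0x63: "MC_INVALID_PKGC_RES_PCH",
    0x64: "MC_INVALID_PKG_STATE_CONFIG",
    0x70: "MC_WATCHDG_TIMEOUT_PKGC_SLAVE",
    0x71: "MC_WATCHDG_TIMEOUT_PKGC_MASTER",
    0x72: "MC_WATCHDG_TIMEOUT_PKGS_MASTER",
    0x7A: "MC_HA_FAILSTS_CHANGE_DETECTED",
    0x81: "MC_RECOVERABLE_DIE_THERMAL_TOO_HOT",
}


def decode_mc4_status(status):
    """Split a raw IA32_MC4_STATUS value into the fields of Table 16-13."""
    return {
        "mcacod": status & 0xFFFF,
        "bits_19_16": MC4_CODES_19_16.get((status >> 16) & 0xF, "Reserved"),
        "bits_31_24": MC4_CODES_31_24.get((status >> 24) & 0xFF, "Reserved"),
    }
```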


16.4.3

Integrated Memory Controller Machine Check Errors

MC error codes associated with integrated memory controllers are reported in the MSRs IA32_MC8_STATUS through IA32_MC11_STATUS. The supported error codes follow the architectural MCACOD definition type 1MMMCCCC (see Chapter 15, Machine-Check Architecture). MSR_ERROR_CONTROL.[bit 1] can enable additional information logging of the IMC. The additional error information logged by the IMC is stored in IA32_MCi_STATUS and IA32_MCi_MISC (i = 8, 11).

Table 16-15 Intel IMC MC Error Codes for IA32_MCi_STATUS (i= 8, 11)

Type                                  Bit No.  Bit Function                       Bit Description
MCA error codes¹                      0-15     MCACOD                             Bus error format: 1PPTRRRRIILL
Model specific errors                 31:16    Reserved except for the following  0x001 - Address parity error
                                                                                  0x002 - HA Wrt buffer Data parity error
                                                                                  0x004 - HA Wrt byte enable parity error
                                                                                  0x008 - Corrected patrol scrub error
                                                                                  0x010 - Uncorrected patrol scrub error
                                                                                  0x020 - Corrected spare error
                                                                                  0x040 - Uncorrected spare error
Model specific errors                 36-32    Other info                         When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first device error when corrected error is detected during normal read.
                                      37       Reserved                           Reserved
                                      56-38                                       See Chapter 15, Machine-Check Architecture
Status register validity indicators¹  57-63

NOTES:
1. These fields are architecturally defined. Refer to Chapter 15, Machine-Check Architecture, for more information.

Table 16-16 Intel IMC MC Error Codes for IA32_MCi_MISC (i= 8, 11)

Type                   Bit No.  Bit Function  Bit Description
MCA addr info¹         0-8                    See Chapter 15, Machine-Check Architecture
Model specific errors  13:9                   When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second device error when corrected error is detected during normal read. Otherwise contain parity error if MCi_Status indicates HA_WB_Data or HA_W_BE parity error.
Model specific errors  29-14                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error bit mask.
Model specific errors  45-30                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error bit mask.
                       50:46                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error failing rank.
                       55:51                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error failing rank.
                       58:56                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log first-device error failing DIMM slot.
                       61-59                  When MSR_ERROR_CONTROL.[1] is set, allows the iMC to log second-device error failing DIMM slot.
                       62-63                  Reserved

NOTES:
1. These fields are architecturally defined. Refer to Chapter 15, Machine-Check Architecture, for more information.
...
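
The bit ranges listed in Table 16-16 lend themselves to a small field extractor. The sketch below is illustrative only: the field names are hypothetical labels derived from the table descriptions, and the values are meaningful only when MSR_ERROR_CONTROL.[1] is set, as the table states.

```python
# (low bit, high bit) per field, taken from Table 16-16; names are assumptions.
IMC_MISC_FIELDS = {
    "second_device_error": (9, 13),
    "first_device_error_bit_mask": (14, 29),
    "second_device_error_bit_mask": (30, 45),
    "first_device_failing_rank": (46, 50),
    "second_device_failing_rank": (51, 55),
    "first_device_failing_dimm_slot": (56, 58),
    "second_device_failing_dimm_slot": (59, 61),
}


def extract_bits(value, low, high):
    """Return bits high:low of value as an unsigned integer."""
    width = high - low + 1
    return (value >> low) & ((1 << width) - 1)


def decode_imc_misc(misc):
    """Decode the model-specific fields of a raw IA32_MCi_MISC value."""
    return {name: extract_bits(misc, low, high)
            for name, (low, high) in IMC_MISC_FIELDS.items()}
```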

16. Updates to Chapter 17, Volume 3B


Change bars show changes to Chapter 17 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3B: System Programming Guide, Part 2. -----------------------------------------------------------------------------------------...

17.4

LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING OVERVIEW

P6 family processors introduced the ability to set breakpoints on taken branches, interrupts, and exceptions, and to single-step from one branch to the next. This capability has been modified and extended in the Pentium 4, Intel Xeon, Pentium M, Intel Core Solo, Intel Core Duo, Intel Core 2 Duo, Intel Core i7 and Intel Atom processors to allow logging of branch trace messages in a branch trace store (BTS) buffer in memory.

See the following sections for processor specific implementation of last branch, interrupt and exception recording:
Section 17.5, Last Branch, Interrupt, and Exception Recording (Intel Core 2 Duo and Intel Atom Processor Family)
Section 17.6, Last Branch, Interrupt, and Exception Recording for Processors based on Intel Microarchitecture code name Nehalem
Section 17.7, Last Branch, Interrupt, and Exception Recording for Processors based on Intel Microarchitecture code name Sandy Bridge
Section 17.8, Last Branch, Call Stack, Interrupt, and Exception Recording for Processors based on Intel Microarchitecture code name Haswell
Section 17.9, Last Branch, Interrupt, and Exception Recording (Processors based on Intel NetBurst Microarchitecture)
Section 17.10, Last Branch, Interrupt, and Exception Recording (Intel Core Solo and Intel Core Duo Processors)
Section 17.11, Last Branch, Interrupt, and Exception Recording (Pentium M Processors)
Section 17.12, Last Branch, Interrupt, and Exception Recording (P6 Family Processors)

The following subsections of Section 17.4 describe common features of profiling branches. These features are generally enabled using the IA32_DEBUGCTL MSR (older processors may have implemented a subset or model-specific features; see definitions of MSR_DEBUGCTLA, MSR_DEBUGCTLB, MSR_DEBUGCTL).


...

17.4.8.1

LBR Stack and Intel 64 Processors

LBR MSRs are 64 bits wide. If IA-32e mode is disabled, only the lower 32 bits of the address are recorded. If IA-32e mode is enabled, the processor writes 64-bit values into the MSR. In 64-bit mode, last branch records store 64-bit addresses; in compatibility mode, the upper 32 bits of last branch records are cleared.

Figure 0-1. 64-bit Address Layout of LBR MSR (MSR_LASTBRANCH_0_FROM_IP through MSR_LASTBRANCH_(N-1)_FROM_IP: bits 63:0 Source Address; MSR_LASTBRANCH_0_TO_IP through MSR_LASTBRANCH_(N-1)_TO_IP: bits 63:0 Destination Address)


Software should query the architectural MSR IA32_PERF_CAPABILITIES[5:0] about the format of the address that is stored in the LBR stack. Five formats are defined by the following encoding:
000000B (32-bit record format): Stores 32-bit offset in current CS of respective source/destination.
000001B (64-bit LIP record format): Stores 64-bit linear address of respective source/destination.
000010B (64-bit EIP record format): Stores 64-bit offset (effective address) of respective source/destination.
000011B (64-bit EIP record format) and Flags: Stores 64-bit offset (effective address) of respective source/destination. LBR flags are supported in the upper bits of the FROM register in the LBR stack. See LBR stack details below for flag support and definition.
000100B (64-bit EIP record format), Flags and TSX: Stores 64-bit offset (effective address) of respective source/destination. LBR flags are supported in the upper bits of the FROM register in the LBR stack. TSX fields are also supported.

Processor support for the architectural MSR IA32_PERF_CAPABILITIES is indicated by CPUID.01H:ECX[PERF_CAPAB_MSR] (bit 15).
...
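
A lookup of the LBR record format from the low six bits of IA32_PERF_CAPABILITIES might be sketched as follows. The names are hypothetical, the input is a plain integer rather than an MSR read, and the value 000100B used here for the format with Flags and TSX is an assumption of this sketch.

```python
# Format names follow the encoding list above; the 0b000100 entry is an assumption.
LBR_FORMATS = {
    0b000000: "32-bit record format",
    0b000001: "64-bit LIP record format",
    0b000010: "64-bit EIP record format",
    0b000011: "64-bit EIP record format and Flags",
    0b000100: "64-bit EIP record format, Flags and TSX",
}


def lbr_format(perf_capabilities):
    """Name the LBR address format encoded in IA32_PERF_CAPABILITIES[5:0]."""
    return LBR_FORMATS.get(perf_capabilities & 0x3F, "unrecognized")
```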

17.7

LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON INTEL MICROARCHITECTURE CODE NAME SANDY BRIDGE

Generally, all of the last branch record, interrupt, and exception recording facilities described in Section 17.6, Last Branch, Interrupt, and Exception Recording for Processors based on Intel Microarchitecture code name Nehalem, apply to processors based on Intel microarchitecture code name Sandy Bridge. For processors based on Intel microarchitecture code name Ivy Bridge, the same holds true.


One difference of note is that, in Intel microarchitecture code name Nehalem, MSR_LBR_SELECT is shared between two logical processors in the same core; in Intel microarchitecture code name Sandy Bridge, each logical processor has its own MSR_LBR_SELECT. The filtering semantics for Near_ind_jmp and Near_rel_jmp have been enhanced; see Table 17-10.

Table 17-10 MSR_LBR_SELECT for Intel microarchitecture code name Sandy Bridge

Bit Field      Bit Offset  Access  Description
CPL_EQ_0       0           R/W     When set, do not capture branches occurring in ring 0
CPL_NEQ_0      1           R/W     When set, do not capture branches occurring in ring >0
JCC            2           R/W     When set, do not capture conditional branches
NEAR_REL_CALL  3           R/W     When set, do not capture near relative calls
NEAR_IND_CALL  4           R/W     When set, do not capture near indirect calls
NEAR_RET       5           R/W     When set, do not capture near returns
NEAR_IND_JMP   6           R/W     When set, do not capture near indirect jumps except near indirect calls and near returns
NEAR_REL_JMP   7           R/W     When set, do not capture near relative jumps except near relative calls
FAR_BRANCH     8           R/W     When set, do not capture far branches
Reserved       63:9                Must be zero

17.8

LAST BRANCH, CALL STACK, INTERRUPT, AND EXCEPTION RECORDING FOR PROCESSORS BASED ON INTEL MICROARCHITECTURE CODE NAME HASWELL

Generally, all of the last branch record, interrupt, and exception recording facilities described in Section 17.7, Last Branch, Interrupt, and Exception Recording for Processors based on Intel Microarchitecture code name Sandy Bridge, apply to next generation processors based on Intel Microarchitecture code name Haswell. The LBR facility also supports an alternate capability to profile call stacks. The LBR facility is configured for call stack profiling by writing 1 to MSR_LBR_SELECT.EN_CALLSTACK[bit 9]; see Table 17-11. If MSR_LBR_SELECT.EN_CALLSTACK is clear, the LBR facility will capture branches normally as described in Section 17.7.

Table 17-11 MSR_LBR_SELECT for Intel microarchitecture code name Haswell

Bit Field      Bit Offset  Access  Description
CPL_EQ_0       0           R/W     When set, do not capture branches occurring in ring 0
CPL_NEQ_0      1           R/W     When set, do not capture branches occurring in ring >0
JCC            2           R/W     When set, do not capture conditional branches
NEAR_REL_CALL  3           R/W     When set, do not capture near relative calls
NEAR_IND_CALL  4           R/W     When set, do not capture near indirect calls
NEAR_RET       5           R/W     When set, do not capture near returns
NEAR_IND_JMP   6           R/W     When set, do not capture near indirect jumps except near indirect calls and near returns
NEAR_REL_JMP   7           R/W     When set, do not capture near relative jumps except near relative calls
FAR_BRANCH     8           R/W     When set, do not capture far branches
EN_CALLSTACK   9                   Enable LBR stack to use LIFO filtering to capture Call stack profile
Reserved       63:10               Must be zero

The call stack profiling capability is an enhancement of the LBR facility. The LBR stack is a ring buffer typically used to profile control flow transitions resulting from branches. However, the finite depth of the LBR stack often makes it less effective when profiling certain high-level languages (e.g. C++), where a transition of the execution flow is accompanied by a large number of leaf function calls, each of which returns an individual parameter to form the list of parameters for the main execution function call. A long list of such parameters returned by the leaf functions would serve to flush the data captured in the LBR stack, often losing the main execution context.

When the call stack feature is enabled, the LBR stack will capture unfiltered call data normally, but as return instructions are executed the last captured branch record is flushed from the on-chip registers in a last-in first-out (LIFO) manner. Thus, branch information relative to leaf functions will not be captured, while preserving the call stack information of the main line execution path.

The configuration of the call stack facility is summarized below:
• Set IA32_DEBUGCTL.LBR (bit 0) to enable the LBR stack to capture branch records. The source and target addresses of the call branches will be captured in the 16 pairs of From/To LBR MSRs that form the LBR stack.
• Program the Top of Stack (TOS) MSR that points to the last valid from/to pair. This register is incremented by 1, modulo 16, before recording the next pair of addresses.
• Program the branch filtering bits of MSR_LBR_SELECT (bits 8:0) as desired.
• Program the MSR_LBR_SELECT to enable LIFO filtering of return instructions with:
  - The following bits in MSR_LBR_SELECT must be set to 1: JCC, NEAR_IND_JMP, NEAR_REL_JMP, FAR_BRANCH, EN_CALLSTACK.
  - The following bits in MSR_LBR_SELECT must be cleared: NEAR_REL_CALL, NEAR_IND_CALL, NEAR_RET.
  - At most one of CPL_EQ_0, CPL_NEQ_0 is set.
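As an illustration of the MSR_LBR_SELECT rules above, the following C sketch builds and checks a call-stack-mode value using the bit offsets of Table 17-11. The names lbr_select_callstack and lbr_select_callstack_valid, and the LBR_SEL_* constants, are hypothetical helpers introduced for this example; the sketch only computes the 64-bit value and does not issue WRMSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks derived from the bit offsets in Table 17-11 (names are illustrative). */
enum {
    LBR_SEL_CPL_EQ_0      = 1u << 0,
    LBR_SEL_CPL_NEQ_0     = 1u << 1,
    LBR_SEL_JCC           = 1u << 2,
    LBR_SEL_NEAR_REL_CALL = 1u << 3,
    LBR_SEL_NEAR_IND_CALL = 1u << 4,
    LBR_SEL_NEAR_RET      = 1u << 5,
    LBR_SEL_NEAR_IND_JMP  = 1u << 6,
    LBR_SEL_NEAR_REL_JMP  = 1u << 7,
    LBR_SEL_FAR_BRANCH    = 1u << 8,
    LBR_SEL_EN_CALLSTACK  = 1u << 9
};

/* Hypothetical helper: MSR_LBR_SELECT value for call stack profiling.
 * cpl_filter is 0, LBR_SEL_CPL_EQ_0, or LBR_SEL_CPL_NEQ_0. */
uint64_t lbr_select_callstack(uint64_t cpl_filter)
{
    return cpl_filter | LBR_SEL_JCC | LBR_SEL_NEAR_IND_JMP |
           LBR_SEL_NEAR_REL_JMP | LBR_SEL_FAR_BRANCH | LBR_SEL_EN_CALLSTACK;
}

/* Hypothetical helper: checks a value against the call-stack rules listed above. */
bool lbr_select_callstack_valid(uint64_t v)
{
    const uint64_t must_set = LBR_SEL_JCC | LBR_SEL_NEAR_IND_JMP |
                              LBR_SEL_NEAR_REL_JMP | LBR_SEL_FAR_BRANCH |
                              LBR_SEL_EN_CALLSTACK;
    const uint64_t must_clear = LBR_SEL_NEAR_REL_CALL | LBR_SEL_NEAR_IND_CALL |
                                LBR_SEL_NEAR_RET;
    const uint64_t cpl_both = LBR_SEL_CPL_EQ_0 | LBR_SEL_CPL_NEQ_0;

    if (v >> 10)                        /* bits 63:10 are reserved, must be zero */
        return false;
    if ((v & must_set) != must_set)     /* all required filter bits are set */
        return false;
    if (v & must_clear)                 /* calls and returns must be captured */
        return false;
    return (v & cpl_both) != cpl_both;  /* at most one CPL filter bit */
}
```

For example, lbr_select_callstack(LBR_SEL_CPL_EQ_0) yields a value that profiles call stacks while filtering out ring 0 branches.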

17.8.1

LBR Stack Enhancement

Processors based on Intel microarchitecture code name Haswell provide 16 pairs of MSRs to record last branch record information. The layout of each MSR pair is enumerated by IA32_PERF_CAPABILITIES[5:0] = 04H, and is shown in Table 17-12 and Table 17-7.

Table 17-12 IA32_LASTBRANCH_x_FROM_IP with TSX Information

Bit Field  Bit Offset  Access  Description
Data       47:0        R/O     The linear address of the branch instruction itself; this is the branch-from address.
SIGN_EXT   60:48       R/O     Signed extension of bit 47 of this register.
TSX_ABORT  61          R/O     When set, indicates a TSX Abort entry.
                               LBR_FROM: EIP at the time of the TSX Abort.
                               LBR_TO: EIP of the start of HLE region, or EIP of the RTM Abort Handler.
IN_TSX     62          R/O     When set, indicates the entry occurred in a TSX region.
MISPRED    63          R/O     When set, indicates either the target of the branch was mispredicted and/or the direction (taken/non-taken) was mispredicted; otherwise, the target branch was predicted.
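The field layout of Table 17-12 can be decoded as in the following C sketch. The struct lbr_from_info and the function lbr_decode_from are hypothetical names introduced here; the input is assumed to be a raw 64-bit value already read from an IA32_LASTBRANCH_x_FROM_IP MSR.

```c
#include <stdbool.h>
#include <stdint.h>

/* Decoded view of one IA32_LASTBRANCH_x_FROM_IP value (illustrative type). */
struct lbr_from_info {
    uint64_t from_ip;   /* bits 47:0, sign-extended from bit 47 */
    bool     tsx_abort; /* bit 61 */
    bool     in_tsx;    /* bit 62 */
    bool     mispred;   /* bit 63 */
};

/* Hypothetical helper: splits a raw FROM_IP value into its Table 17-12 fields. */
struct lbr_from_info lbr_decode_from(uint64_t raw)
{
    struct lbr_from_info info;
    uint64_t addr = raw & 0x0000FFFFFFFFFFFFull;  /* Data field, bits 47:0 */

    if (addr & (1ull << 47))                      /* rebuild the canonical address */
        addr |= 0xFFFF000000000000ull;

    info.from_ip   = addr;
    info.tsx_abort = (raw >> 61) & 1;
    info.in_tsx    = (raw >> 62) & 1;
    info.mispred   = (raw >> 63) & 1;
    return info;
}
```

Masking the flag bits before using the address matters because bits 63:61 carry status rather than address information in this format.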

17.9

LAST BRANCH, INTERRUPT, AND EXCEPTION RECORDING (PROCESSORS BASED ON INTEL NETBURST MICROARCHITECTURE)

Pentium 4 and Intel Xeon processors based on Intel NetBurst microarchitecture provide the following methods for recording taken branches, interrupts and exceptions:
• Store branch records in the last branch record (LBR) stack MSRs for the most recent taken branches, interrupts, and/or exceptions. A branch record consists of a branch-from and a branch-to instruction address.
• Send the branch records out on the system bus as branch trace messages (BTMs).
• Log BTMs in a memory-resident branch trace store (BTS) buffer.

To support these functions, the processor provides the following MSRs and related facilities:
• MSR_DEBUGCTLA MSR: Enables last branch, interrupt, and exception recording; single-stepping on taken branches; branch trace messages (BTMs); and branch trace store (BTS). This register is named DebugCtlMSR in the P6 family processors.
• Debug store (DS) feature flag (CPUID.1:EDX.DS[bit 21]): Indicates that the processor provides the debug store (DS) mechanism, which allows BTMs to be stored in a memory-resident BTS buffer.
• CPL-qualified debug store (DS) feature flag (CPUID.1:ECX.DS-CPL[bit 4]): Indicates that the processor provides a CPL-qualified debug store (DS) mechanism, which allows software to selectively skip sending and storing BTMs, according to specified current privilege level settings, into a memory-resident BTS buffer.
• IA32_MISC_ENABLE MSR: Indicates that the processor provides the BTS facilities.
• Last branch record (LBR) stack: The LBR stack is a circular stack that consists of four MSRs (MSR_LASTBRANCH_0 through MSR_LASTBRANCH_3) for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, models 0H-02H]. The LBR stack consists of 16 MSR pairs (MSR_LASTBRANCH_0_FROM_IP through MSR_LASTBRANCH_15_FROM_IP and MSR_LASTBRANCH_0_TO_IP through MSR_LASTBRANCH_15_TO_IP) for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, model 03H].
• Last branch record top-of-stack (TOS) pointer: The TOS Pointer MSR contains a 2-bit pointer (0-3) to the MSR in the LBR stack that contains the most recent branch, interrupt, or exception recorded for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, models 0H-02H]. This pointer becomes a 4-bit pointer (0-15) for the Pentium 4 and Intel Xeon processor family [CPUID family 0FH, model 03H]. See also: Table 17-12, Figure 17-12, and Section 17.9.2, LBR Stack for Processors Based on Intel NetBurst Microarchitecture.
• Last exception record: See Section 17.9.3, Last Exception Records.
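Because the LBR stack is circular and the TOS pointer names the most recent entry, software that reads the stack typically walks backward from TOS. The C sketch below computes that visiting order for a stack of 4 or 16 entries. The function name lbr_walk_order is hypothetical, and the sketch assumes that entries older than TOS sit at successively lower indices (wrapping around), which is an assumption of this example rather than a statement from the text above.

```c
/* Hypothetical helper: fills order[0..depth-1] with LBR stack indices,
 * most recent entry first. tos must be in the range 0..depth-1, and
 * depth is 4 or 16 for the processor families described above.
 * Assumes older records sit at decreasing indices, modulo depth. */
void lbr_walk_order(unsigned tos, unsigned depth, unsigned *order)
{
    for (unsigned i = 0; i < depth; i++)
        order[i] = (tos + depth - i) % depth;
}
```

With a 2-bit TOS value of 1 and a depth of 4, the entries would be visited in the order 1, 0, 3, 2. Entries that have not been written since the stack was enabled may not hold valid records.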

...

17.13.3

Time-Stamp Counter Adjustment

Software can modify the value of the time-stamp counter (TSC) of a logical processor by using the WRMSR instruction to write to the IA32_TIME_STAMP_COUNTER MSR (address 10H). Because such a write applies only to that logical processor, software seeking to synchronize the TSC values of multiple logical processors must perform these writes on each logical processor. It may be difficult for software to do this in a way that ensures that all logical processors will have the same value for the TSC at a given point in time.

The synchronization of TSC adjustment can be simplified by using the 64-bit IA32_TSC_ADJUST MSR (address 3BH). Like the IA32_TIME_STAMP_COUNTER MSR, the IA32_TSC_ADJUST MSR is maintained separately for each logical processor. A logical processor maintains and uses the IA32_TSC_ADJUST MSR as follows:
• On RESET, the value of the IA32_TSC_ADJUST MSR is 0.

• If an execution of WRMSR to the IA32_TIME_STAMP_COUNTER MSR adds (or subtracts) value X from the TSC, the logical processor also adds (or subtracts) value X from the IA32_TSC_ADJUST MSR.
• If an execution of WRMSR to the IA32_TSC_ADJUST MSR adds (or subtracts) value X from that MSR, the logical processor also adds (or subtracts) value X from the TSC.

Unlike the TSC, the value of the IA32_TSC_ADJUST MSR changes only in response to WRMSR (either to the MSR itself, or to the IA32_TIME_STAMP_COUNTER MSR). Its value does not otherwise change as time elapses. Software seeking to adjust the TSC can do so by using WRMSR to write the same value to the IA32_TSC_ADJUST MSR on each logical processor. Processor support for the IA32_TSC_ADJUST MSR is indicated by CPUID.(EAX=07H, ECX=0H):EBX.TSC_ADJUST (bit 1). ...
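The relationship between the two MSRs can be illustrated with a small software model. The C sketch below is not processor code and performs no MSR access; struct lp_tsc and the model_* functions are hypothetical names that mimic the rules above for one logical processor, using unsigned 64-bit wraparound arithmetic for additions and subtractions.

```c
#include <stdint.h>

/* Software model of one logical processor's TSC state (illustrative only). */
struct lp_tsc {
    uint64_t tsc;        /* models IA32_TIME_STAMP_COUNTER */
    uint64_t tsc_adjust; /* models IA32_TSC_ADJUST, 0 after RESET */
};

/* Models WRMSR to IA32_TIME_STAMP_COUNTER: the change applied to the TSC
 * is also applied to IA32_TSC_ADJUST. */
void model_wrmsr_tsc(struct lp_tsc *lp, uint64_t new_tsc)
{
    uint64_t delta = new_tsc - lp->tsc;  /* wraps modulo 2^64 */
    lp->tsc = new_tsc;
    lp->tsc_adjust += delta;
}

/* Models WRMSR to IA32_TSC_ADJUST: the change applied to the MSR
 * is also applied to the TSC. */
void model_wrmsr_tsc_adjust(struct lp_tsc *lp, uint64_t new_adjust)
{
    uint64_t delta = new_adjust - lp->tsc_adjust;
    lp->tsc_adjust = new_adjust;
    lp->tsc += delta;
}

/* Models the passage of time: only the TSC advances. */
void model_tick(struct lp_tsc *lp, uint64_t cycles)
{
    lp->tsc += cycles;
}
```

In this model, two logical processors that started with equal TSC values return to equal values after the same number is written to tsc_adjust on both, regardless of any intervening writes to the TSC of either one.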

17. Updates to Chapter 18, Volume 3B


Change bars show changes to Chapter 18 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2.
------------------------------------------------------------------------------------------
...

18.1

PERFORMANCE MONITORING OVERVIEW

Performance monitoring was introduced in the Pentium processor with a set of model-specific performance-monitoring counter MSRs. These counters permit selection of processor performance parameters to be monitored and measured. The information obtained from these counters can be used for tuning system and compiler performance.

In the Intel P6 family of processors, the performance monitoring mechanism was enhanced to permit a wider selection of events to be monitored and to allow greater control over the events to be monitored. Next, Pentium 4 and Intel Xeon processors introduced a new performance monitoring mechanism and a new set of performance events.

The performance monitoring mechanisms and performance events defined for the Pentium, P6 family, Pentium 4, and Intel Xeon processors are not architectural. They are all model specific (not compatible among processor families). Intel Core Solo and Intel Core Duo processors support a set of architectural performance events and a set of non-architectural performance events. Processors based on Intel Core microarchitecture and Intel Atom microarchitecture support enhanced architectural performance events and non-architectural performance events.

Starting with Intel Core Solo and Intel Core Duo processors, there are two classes of performance monitoring capabilities. The first class supports events for monitoring performance using counting or sampling usage. These events are non-architectural and vary from one processor model to another. They are similar to those available in Pentium M processors. These non-architectural performance monitoring events are specific to the microarchitecture and may change with enhancements. They are discussed in Section 18.3, Performance Monitoring (Intel Core Solo and Intel Core Duo Processors). Non-architectural events for a given microarchitecture cannot be enumerated using CPUID; they are listed in Chapter 19, Performance-Monitoring Events.
The second class of performance monitoring capabilities is referred to as architectural performance monitoring. This class supports the same counting and sampling usages, with a smaller set of available events. The visible behavior of architectural performance events is consistent across processor implementations. Availability of architectural performance monitoring capabilities is enumerated using CPUID.0AH. These events are discussed in Section 18.2.

See also:
• Section 18.2, Architectural Performance Monitoring

• Section 18.3, Performance Monitoring (Intel Core Solo and Intel Core Duo Processors)
• Section 18.4, Performance Monitoring (Processors Based on Intel Core Microarchitecture)
• Section 18.5, Performance Monitoring (Processors Based on Intel Atom Microarchitecture)
• Section 18.6, Performance Monitoring for Processors Based on Intel Microarchitecture Code Name Nehalem
• Section 18.7, Performance Monitoring for Processors Based on Intel Microarchitecture Code Name Westmere
• Section 18.8, Performance Monitoring for Processors Based on Intel Microarchitecture Code Name Sandy Bridge
• Section 18.8.8, Intel Xeon Processor E5 Family Uncore Performance Monitoring Facility
• Section 18.9, 3rd Generation Intel Core Processor Performance Monitoring Facility
• Section 18.10, Next Generation Intel Core Processor Performance Monitoring Facility
• Section 18.11, Performance Monitoring (Processors Based on Intel NetBurst Microarchitecture)
• Section 18.12, Performance Monitoring and Intel Hyper-Threading Technology in Processors Based on Intel NetBurst Microarchitecture
• Section 18.15, Performance Monitoring and Dual-Core Technology
• Section 18.16, Performance Monitoring on 64-bit Intel Xeon Processor MP with Up to 8-MByte L3 Cache
• Section 18.18, Performance Monitoring (P6 Family Processor)
• Section 18.19, Performance Monitoring (Pentium Processors)
...

18.2.2.2

Architectural Performance Monitoring Version 3 Facilities

The facilities provided by architectural performance monitoring versions 1 and 2 are also supported by architectural performance monitoring version 3. Additionally, version 3 provides enhancements to support a processor core comprising more than one logical processor, i.e. a processor core supporting Intel Hyper-Threading Technology or simultaneous multi-threading capability. Specifically, CPUID leaf 0AH provides enumeration mechanisms to query:
• The number of general-purpose performance counters (IA32_PMCx), reported in CPUID.0AH:EAX[15:8], and the bit width of general-purpose performance counters (see also Section 18.2.1.1), reported in CPUID.0AH:EAX[23:16].
• The bit vector representing the set of architectural performance monitoring events supported (see Section 18.2.3).
• The number of fixed-function performance counters and the bit width of fixed-function performance counters (see also Section 18.2.2.1).

Each general-purpose performance counter IA32_PMCx (starting at MSR address 0C1H) is associated with a corresponding IA32_PERFEVTSELx MSR (starting at MSR address 186H). The bit field layout of IA32_PERFEVTSELx MSRs is defined architecturally in Figure 18-6.

[Figure: bit-field diagram of IA32_PERFEVTSELx.
Bits 7:0: Event Select. Bits 15:8: Unit Mask (UMASK).
Bit 16: USR (User Mode). Bit 17: OS (Operating system mode). Bit 18: E (Edge detect). Bit 19: PC (Pin control). Bit 20: INT (APIC interrupt enable). Bit 21: ANY (Any Thread). Bit 22: EN (Enable counters). Bit 23: INV (Invert counter mask).
Bits 31:24: Counter Mask (CMASK). Bits 63:32: Reserved.]

Figure 18-6 Layout of IA32_PERFEVTSELx MSRs Supporting Architectural Performance Monitoring Version 3
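The field positions of Figure 18-6 can be expressed as a short C sketch that assembles an IA32_PERFEVTSELx value. The EVTSEL_* macros and the function perfevtsel_encode are hypothetical names for this example; the sketch only computes the value and does not write any MSR.

```c
#include <stdint.h>

/* Single-bit flags of IA32_PERFEVTSELx, bits 16 through 23 (illustrative names). */
#define EVTSEL_USR  (1ull << 16)  /* count in user mode */
#define EVTSEL_OS   (1ull << 17)  /* count in operating system mode */
#define EVTSEL_EDGE (1ull << 18)  /* edge detect */
#define EVTSEL_PC   (1ull << 19)  /* pin control */
#define EVTSEL_INT  (1ull << 20)  /* APIC interrupt enable */
#define EVTSEL_ANY  (1ull << 21)  /* AnyThread (version 3) */
#define EVTSEL_EN   (1ull << 22)  /* enable counter */
#define EVTSEL_INV  (1ull << 23)  /* invert counter mask */

/* Hypothetical helper: packs event select (bits 7:0), unit mask (bits 15:8),
 * flag bits (23:16), and counter mask (bits 31:24) into one value. */
uint64_t perfevtsel_encode(uint8_t event, uint8_t umask, uint8_t cmask,
                           uint64_t flags)
{
    return (uint64_t)event |
           ((uint64_t)umask << 8) |
           (flags & 0x00FF0000ull) |   /* keep only bits 23:16 */
           ((uint64_t)cmask << 24);
}
```

For instance, perfevtsel_encode(0xB7, 0x01, 0, EVTSEL_USR | EVTSEL_OS | EVTSEL_EN) packs event code 0xB7 with unit mask 0x01 for counting in both user and operating system modes; adding EVTSEL_ANY would request counting across all logical processors sharing the core.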
Bit 21 (AnyThread) of IA32_PERFEVTSELx is supported in architectural performance monitoring version 3. When set to 1, it enables counting the associated event conditions (including matching the thread's CPL with the OS/USR setting of IA32_PERFEVTSELx) occurring across all logical processors sharing a processor core. When bit 21 is 0, the counter only increments on the associated event conditions (including matching the thread's CPL with the OS/USR setting of IA32_PERFEVTSELx) occurring in the logical processor which programmed the IA32_PERFEVTSELx MSR.

Each fixed-function performance counter IA32_FIXED_CTRx (starting at MSR address 309H) is configured by a 4-bit control block in the IA32_PERF_FIXED_CTR_CTRL MSR. The control block also allows thread-specificity configuration using an AnyThread bit. The layout of the IA32_PERF_FIXED_CTR_CTRL MSR is shown in Figure 18-7.

[Figure: bit-field diagram of the fixed-function counter control MSR, one 4-bit control block per counter.
Bits 3:0: controls for IA32_FIXED_CTR0. Bits 1:0: ENABLE (0: disable; 1: OS; 2: User; 3: All ring levels). Bit 2: AnyThread for IA32_FIXED_CTR0. Bit 3: PMI, enable PMI on overflow on IA32_FIXED_CTR0.
Bits 7:4: Cntr1, controls for IA32_FIXED_CTR1 (same layout).
Bits 11:8: Cntr2, controls for IA32_FIXED_CTR2 (same layout).
Bits 63:12: Reserved.]

Figure 18-7 Layout of IA32_FIXED_CTR_CTRL MSR Supporting Architectural Performance Monitoring Version 3
Each control block for a fixed-function performance counter provides an AnyThread bit (bit position 2 + 4*N, N = 0, 1, etc.). When set to 1, it enables counting the associated event conditions (including matching the thread's CPL with the ENABLE setting of the corresponding control block of IA32_PERF_FIXED_CTR_CTRL) occurring across all logical processors sharing a processor core. When an AnyThread bit is 0 in IA32_PERF_FIXED_CTR_CTRL, the corresponding fixed counter only increments on the associated event conditions occurring in the logical processor which programmed the IA32_PERF_FIXED_CTR_CTRL MSR.

The IA32_PERF_GLOBAL_CTRL, IA32_PERF_GLOBAL_STATUS, and IA32_PERF_GLOBAL_OVF_CTRL MSRs provide single-bit controls/status for each general-purpose and fixed-function performance counter. Figure 18-8 and Figure 18-9 show the layout of these MSRs for N general-purpose performance counters (where N is reported by CPUID.0AH:EAX[15:8]) and three fixed-function counters.
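The 4-bit control block for fixed-function counter N, including the AnyThread bit at position 2 + 4*N, can be sketched in C as follows. The enum fixed_ring and the function fixed_ctr_ctrl_field are hypothetical names used only for this illustration; the result is a value to be OR-ed into the control MSR image, and no MSR is written.

```c
#include <stdint.h>

/* ENABLE encodings of a fixed-counter control block (illustrative names). */
enum fixed_ring {
    FIXED_DISABLE = 0,  /* counter disabled */
    FIXED_OS      = 1,  /* count in OS mode only */
    FIXED_USR     = 2,  /* count in user mode only */
    FIXED_ALL     = 3   /* count at all ring levels */
};

/* Hypothetical helper: builds the control block for fixed counter n.
 * ENABLE occupies bits 1:0 of the block, AnyThread bit 2, PMI bit 3;
 * the block is then shifted to bit position 4*n. */
uint64_t fixed_ctr_ctrl_field(unsigned n, enum fixed_ring ring,
                              int any_thread, int pmi)
{
    uint64_t block = (uint64_t)ring;
    if (any_thread)
        block |= 1ull << 2;
    if (pmi)
        block |= 1ull << 3;
    return block << (4 * n);
}
```

OR-ing the fields for counters 0, 1, and 2 produces the full control value for three fixed-function counters.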

Note: The Intel Atom processor family supports two general-purpose performance monitoring counters (i.e. N = 2 in Figure 18-9); other processor families in Intel 64 architecture may support a different value of N in Figure 18-9. The number N is reported by CPUID.0AH:EAX[15:8]. The Intel Core i7 processor supports four general-purpose performance monitoring counters (i.e. N = 4 in Figure 18-9).

[Figure: bit-field diagram of IA32_PERF_GLOBAL_CTRL (Global Enable Controls).
Bits N-1:0: IA32_PMC0 enable through IA32_PMC(N-1) enable.
Bits 34:32: IA32_FIXED_CTR0 enable, IA32_FIXED_CTR1 enable, IA32_FIXED_CTR2 enable.
All other bits: Reserved.]

Figure 18-8 Layout of Global Performance Monitoring Control MSR

[Figure: bit-field diagrams of IA32_PERF_GLOBAL_STATUS (Global Overflow Status) and IA32_PERF_GLOBAL_OVF_CTRL.
IA32_PERF_GLOBAL_STATUS: bit 63 CondChgd; bit 62 OvfBuffer; bits 34:32 IA32_FIXED_CTR2, IA32_FIXED_CTR1, IA32_FIXED_CTR0 Overflow; bits N-1:0 IA32_PMC(N-1) through IA32_PMC0 Overflow.
IA32_PERF_GLOBAL_OVF_CTRL: bit 63 ClrCondChgd; bit 62 ClrOvfBuffer; bits 34:32 IA32_FIXED_CTR2, IA32_FIXED_CTR1, IA32_FIXED_CTR0 ClrOverflow; bits N-1:0 IA32_PMC(N-1) through IA32_PMC0 ClrOverflow.]

Figure 18-9 Global Performance Monitoring Overflow Status and Control MSRs
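As a sketch of how the enumerated counter count N relates to the global control layout of Figure 18-8, the C fragment below extracts N and the counter width from a CPUID.0AH:EAX value and builds a mask that enables all N general-purpose counters plus the three fixed-function counters. The function names are hypothetical, and the EAX value is assumed to have been obtained elsewhere (for example, through a compiler's CPUID intrinsic).

```c
#include <stdint.h>

/* Number of general-purpose counters: CPUID.0AH:EAX[15:8]. */
unsigned pmu_num_gp_counters(uint32_t eax)
{
    return (eax >> 8) & 0xFF;
}

/* Bit width of general-purpose counters: CPUID.0AH:EAX[23:16]. */
unsigned pmu_gp_counter_width(uint32_t eax)
{
    return (eax >> 16) & 0xFF;
}

/* Hypothetical helper: IA32_PERF_GLOBAL_CTRL image that enables
 * IA32_PMC0..IA32_PMC(n_gp-1) in bits n_gp-1:0 and the three
 * fixed-function counters in bits 34:32. */
uint64_t perf_global_ctrl_enable_all(unsigned n_gp)
{
    uint64_t gp = (n_gp >= 32) ? 0xFFFFFFFFull : ((1ull << n_gp) - 1);
    return gp | (0x7ull << 32);
}
```

With N = 4, as the note above gives for the Intel Core i7 processor, the mask covers bits 3:0 and bits 34:32.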
...

18.6.1.1

Precise Event Based Sampling (PEBS)

All four general-purpose performance counters, IA32_PMCx, can be used for PEBS if the performance event supports PEBS. Software uses IA32_MISC_ENABLE[7] and IA32_MISC_ENABLE[12] to detect whether the performance monitoring facility and PEBS functionality are supported in the processor. The MSR IA32_PEBS_ENABLE provides 4 bits that software must use to enable which IA32_PMCx overflow condition will cause the PEBS record to be captured. Additionally, the PEBS record is expanded to allow latency information to be captured. The MSR IA32_PEBS_ENABLE provides 4 additional bits that software must use to enable latency data recording in the PEBS record upon the respective IA32_PMCx overflow condition. The layout of IA32_PEBS_ENABLE for processors based on Intel microarchitecture code name Nehalem is shown in Figure 18-15. When a counter is enabled to capture machine state (PEBS_EN_PMCx = 1), the processor will write machine state information to a memory buffer specified by software as detailed below. When the counter IA32_PMCx overflows from maximum count to zero, the PEBS hardware is armed.

[Figure: bit-field diagram of IA32_PEBS_ENABLE.
Bits 3:0: PEBS_EN_PMC0 through PEBS_EN_PMC3 (R/W).
Bits 35:32: LL_EN_PMC0 through LL_EN_PMC3 (R/W).
All other bits: Reserved. RESET value: 0x00000000_00000000.]

Figure 18-15 Layout of IA32_PEBS_ENABLE MSR
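The two groups of enable bits in IA32_PEBS_ENABLE can be illustrated with a brief C sketch. The function pebs_enable_bits is a hypothetical helper; it assumes the layout of Figure 18-15 (PEBS_EN_PMCx at bit x, LL_EN_PMCx at bit 32 + x) and only computes a value.

```c
#include <stdint.h>

/* Hypothetical helper: IA32_PEBS_ENABLE bits for counter IA32_PMCx.
 * pmc selects the counter (0..3); load_latency additionally sets the
 * matching LL_EN_PMCx bit. Returns 0 for an out-of-range counter. */
uint64_t pebs_enable_bits(unsigned pmc, int load_latency)
{
    uint64_t value;

    if (pmc > 3)
        return 0;
    value = 1ull << pmc;               /* PEBS_EN_PMCx */
    if (load_latency)
        value |= 1ull << (32 + pmc);   /* LL_EN_PMCx */
    return value;
}
```

Values for several counters can be OR-ed together before the MSR is written.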


Upon occurrence of the next PEBS event, the PEBS hardware triggers an assist and causes a PEBS record to be written. The format of the PEBS record is indicated by the bit field IA32_PERF_CAPABILITIES[11:8] (see Figure 18-40). The behavior of PEBS assists is reported by IA32_PERF_CAPABILITIES[6] (see Figure 18-40). The return instruction pointer (RIP) reported in the PEBS record will point to the instruction after (+1) the instruction that causes the PEBS assist. The machine state reported in the PEBS record is the machine state after the instruction that causes the PEBS assist is retired. For instance, if the instructions:
    mov eax, [eax] ; causes PEBS assist
    nop
are executed, the PEBS record will report the address of the nop, and the value of EAX in the PEBS record will show the value read from memory, not the target address of the read operation.

The PEBS record format is shown in Table 18-12, and each field in the PEBS record is 64 bits long. The PEBS record format, along with the debug/store area storage format, does not change regardless of whether IA-32e mode is active or not. CPUID.01H:ECX.DTES64[bit 2] reports whether the processor's DS storage format support is mode-independent. When set, it uses 64-bit DS storage format. ...

18.6.1.3

Off-core Response Performance Monitoring in the Processor Core

Software that programs a performance event using the off-core response facility can choose any of the four IA32_PERFEVTSELx MSRs, with specific event codes and a predefined mask bit value. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_0. There is only one off-core response configuration MSR. Table 18-14 lists the event code, mask value and additional off-core configuration MSR that must be programmed to count off-core response events using IA32_PMCx.

Table 18-14 Off-Core Response Event Encoding

Event code in IA32_PERFEVTSELx  Mask Value in IA32_PERFEVTSELx  Required Off-core Response MSR
0xB7                            0x01                            MSR_OFFCORE_RSP_0 (address 0x1A6)

The layout of MSR_OFFCORE_RSP_0 is shown in Figure 18-18. Bits 7:0 specify the request type of a transaction request to the uncore. Bits 15:8 specify the response of the uncore subsystem.

[Figure: bit-field diagram of the off-core response MSR; all defined fields are R/W.
Bits 7:0, REQUEST TYPE: bit 0 DMND_DATA_RD; bit 1 DMND_RFO; bit 2 DMND_IFETCH; bit 3 WB; bit 4 PF_DATA_RD; bit 5 PF_RFO; bit 6 PF_IFETCH; bit 7 OTHER.
Bits 15:8, RESPONSE TYPE: bit 8 UNCORE_HIT; bit 9 OTHER_CORE_HIT_SNP; bit 10 OTHER_CORE_HITM; bit 11 RESERVED; bit 12 REMOTE_CACHE_FWD; bit 13 REMOTE_DRAM; bit 14 LOCAL_DRAM; bit 15 NON_DRAM.
Bits 63:16: Reserved. RESET value: 0x00000000_00000000.]

Figure 18-18 Layout of MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 to Configure Off-core Response Events
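A value for MSR_OFFCORE_RSP_0 combines one or more request type bits with one or more response type bits. The C sketch below follows the bit positions of Figure 18-18; the OFFCORE_* macros and the function offcore_rsp0_value are hypothetical names, and the sketch only computes the value to be written to address 0x1A6.

```c
#include <stdint.h>

/* Request type bits, 7:0 (names follow the figure; macros are illustrative). */
#define OFFCORE_REQ_DMND_DATA_RD  (1u << 0)
#define OFFCORE_REQ_DMND_RFO      (1u << 1)
#define OFFCORE_REQ_DMND_IFETCH   (1u << 2)
#define OFFCORE_REQ_WB            (1u << 3)
#define OFFCORE_REQ_PF_DATA_RD    (1u << 4)
#define OFFCORE_REQ_PF_RFO        (1u << 5)
#define OFFCORE_REQ_PF_IFETCH     (1u << 6)
#define OFFCORE_REQ_OTHER         (1u << 7)

/* Response type bits, 15:8 (bit 11 is reserved). */
#define OFFCORE_RSP_UNCORE_HIT          (1u << 8)
#define OFFCORE_RSP_OTHER_CORE_HIT_SNP  (1u << 9)
#define OFFCORE_RSP_OTHER_CORE_HITM     (1u << 10)
#define OFFCORE_RSP_REMOTE_CACHE_FWD    (1u << 12)
#define OFFCORE_RSP_REMOTE_DRAM         (1u << 13)
#define OFFCORE_RSP_LOCAL_DRAM          (1u << 14)
#define OFFCORE_RSP_NON_DRAM            (1u << 15)

/* Hypothetical helper: combines request and response selections,
 * discarding anything outside the defined fields (including bit 11). */
uint64_t offcore_rsp0_value(uint32_t request, uint32_t response)
{
    return (uint64_t)((request & 0x00FFu) | (response & 0xF700u));
}
```

For example, offcore_rsp0_value(OFFCORE_REQ_DMND_DATA_RD, OFFCORE_RSP_LOCAL_DRAM | OFFCORE_RSP_REMOTE_DRAM) selects demand data reads that were satisfied from local or remote DRAM; the matching counter would be programmed with event code 0xB7 and mask value 0x01 per Table 18-14.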
...

18.8

PERFORMANCE MONITORING FOR PROCESSORS BASED ON INTEL MICROARCHITECTURE CODE NAME SANDY BRIDGE

Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx processor series, and Intel Xeon processor E3-1200 family are based on Intel microarchitecture code name Sandy Bridge; this section describes the performance monitoring facilities provided in the processor core. The core PMU supports architectural performance monitoring capability with version ID 3 (see Section 18.2.2.2) and a host of non-architectural monitoring capabilities. Architectural performance monitoring events and non-architectural monitoring events are programmed using fixed counters and programmable counters/event select MSRs described in Section 18.2.2.2. The core PMU's capabilities are similar to those described in Section 18.6.1 and Section 18.7, with some differences and enhancements relative to Intel microarchitecture code name Westmere summarized in Table 18-19.

Table 18-19 Core PMU Comparison

Box                                     Sandy Bridge                             Westmere                     Comment
# of Fixed counters per thread          3                                        3                            Use CPUID to enumerate # of counters.
# of general-purpose counters per core  8                                        8
Counter width (R,W)                     R:48, W:32/48                            R:48, W:32                   See Section 18.2.2.3.
# of programmable counters per thread   4 or (8 if a core not shared by two      4                            Use CPUID to enumerate # of counters.
                                        threads)
Precise Event Based Sampling (PEBS)     See Table 18-21                          See Table 18-10              IA32_PMC4-IA32_PMC7 do not support PEBS.
Events
PEBS-Load Latency                       See Section 18.8.4.2; Data source        Data source encoding
                                        encoding, STLB miss encoding, Lock
                                        transaction encoding
PEBS-Precise Store                      Section 18.8.4.3                         No
PEBS-PDIR                               yes (using precise INST_RETIRED.ALL)     No
Off-core Response Event                 MSR 1A6H and 1A7H; Extended request and  MSR 1A6H and 1A7H, limited   Nehalem supports 1A6H only.
                                        response types                           response types

...

18.8.5

Off-core Response Performance Monitoring

The core PMU in processors based on Intel microarchitecture code name Sandy Bridge provides an off-core response facility similar to that of the prior generation. Off-core response can be programmed only with a specific pair of event select and counter MSRs, and with specific event codes and a predefined mask bit value in a dedicated MSR to specify attributes of the off-core transaction. Two event codes are dedicated for off-core response event programming. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_x. Table 18-24 lists the event code, mask value and additional off-core configuration MSR that must be programmed to count off-core response events using IA32_PMCx.

Table 18-24 Off-Core Response Event Encoding

Counter  Event code  UMask  Required Off-core Response MSR
PMC0-3   0xB7        0x01   MSR_OFFCORE_RSP_0 (address 0x1A6)
PMC0-3   0xBB        0x01   MSR_OFFCORE_RSP_1 (address 0x1A7)

The layouts of MSR_OFFCORE_RSP_0 and MSR_OFFCORE_RSP_1 are shown in Figure 18-30 and Figure 18-31. Bits 15:0 specify the request type of a transaction request to the uncore. Bits 30:16 specify supplier information, and bits 37:31 specify snoop response information. ...

18.8.7

Intel Xeon Processor E5 Family Performance Monitoring Facility

The Intel Xeon Processor E5 Family (and Intel Core i7-3930K Processor) are based on Intel microarchitecture code name Sandy Bridge. While the processor cores share the same microarchitecture as those of the Intel Xeon Processor E3 Family and second generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx processor series, the uncore subsystems are different. An overview of the uncore performance monitoring facilities of the Intel Xeon processor E5 family (and Intel Core i7-3930K processor) is given in Section 18.8.8.

Thus, the performance monitoring facilities in the processor core generally are the same as those described in Section 18.8 through Section 18.8.5. However, the MSR_OFFCORE_RSP_0/MSR_OFFCORE_RSP_1 Response Supplier Info field shown in Table 18-26 applies to Intel Core Processors with a CPUID signature of DisplayFamily_DisplayModel encoding of 06_2AH; Intel Xeon processors with a CPUID signature of DisplayFamily_DisplayModel encoding of 06_2DH support an additional field for the remote DRAM controller, shown in Table 18-29. Additionally, there are some small differences in the non-architectural performance monitoring events (see Table 19-7).

Table 18-29 MSR_OFFCORE_RSP_x Supplier Info Field Definitions

Subtype        Bit Name  Offset  Description
Common         Any       16      (R/W). Catch all value for any response types.
Supplier Info  NO_SUPP   17      (R/W). No Supplier Information available
               LLC_HITM  18      (R/W). M-state initial lookup stat in L3.
               LLC_HITE  19      (R/W). E-state
               LLC_HITS  20      (R/W). S-state
               LLC_HITF  21      (R/W). F-state
               LOCAL     22      (R/W). Local DRAM Controller
               Remote    30:23   (R/W): Remote DRAM Controller (either all 0s or all 1s)
...
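The three fields of MSR_OFFCORE_RSP_x described in Section 18.8.5 (request type in bits 15:0, supplier information in bits 30:16, snoop response in bits 37:31) can be combined as in the following C sketch, with supplier bit offsets taken from Table 18-29. The OFFCORE_SUPP_* macros and the function offcore_rsp_compose are hypothetical names; request and snoop bits are passed as raw field values because their individual definitions are not reproduced here.

```c
#include <stdint.h>

/* Supplier info bits per Table 18-29 (macro names are illustrative). */
#define OFFCORE_SUPP_ANY       (1ull << 16)
#define OFFCORE_SUPP_NO_SUPP   (1ull << 17)
#define OFFCORE_SUPP_LLC_HITM  (1ull << 18)
#define OFFCORE_SUPP_LLC_HITE  (1ull << 19)
#define OFFCORE_SUPP_LLC_HITS  (1ull << 20)
#define OFFCORE_SUPP_LLC_HITF  (1ull << 21)
#define OFFCORE_SUPP_LOCAL     (1ull << 22)
#define OFFCORE_SUPP_REMOTE    (0xFFull << 23)  /* bits 30:23, all 0s or all 1s */

/* Hypothetical helper: request is the raw 16-bit request type field,
 * supplier is an OR of OFFCORE_SUPP_* bits (already in bits 30:16),
 * snoop is the raw 7-bit snoop response field. */
uint64_t offcore_rsp_compose(uint64_t request, uint64_t supplier, uint64_t snoop)
{
    return (request & 0xFFFFull) |
           (supplier & 0x7FFF0000ull) |
           ((snoop & 0x7Full) << 31);
}
```

Because the Remote field must be either all 0s or all 1s, the sketch exposes it as a single 8-bit-wide constant rather than as individual bits.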

18.9

3RD GENERATION INTEL CORE PROCESSOR PERFORMANCE MONITORING FACILITY

The 3rd Generation Intel Core Processor Family and Intel Xeon Processor E3-1200v2 Product Family are based on Intel microarchitecture code name Ivy Bridge. The performance monitoring facilities in the processor core generally are the same as those described in Section 18.8 through Section 18.8.5. The non-architectural performance monitoring events supported by the processor core are listed in Table 19-7.

18.10

NEXT GENERATION INTEL CORE PROCESSOR PERFORMANCE MONITORING FACILITY

The Next Generation Intel Core processor is based on Intel microarchitecture code name Haswell. The core PMU supports architectural performance monitoring capability with version ID 3 (see Section 18.2.2.2) and a host of non-architectural monitoring capabilities. Architectural performance monitoring events and non-architectural monitoring events are programmed using fixed counters and programmable counters/event select MSRs as described in Section 18.2.2.2. The core PMU's capabilities are similar to those described in Section 18.8, with some differences and enhancements summarized in Table 18-31.

Table 18-31 Core PMU Comparison

Box                                     Haswell                                  Sandy Bridge                             Comment
# of Fixed counters per thread          3                                        3
# of general-purpose counters per core  8                                        8
Counter width (R,W)                     R:48, W:32/48                            R:48, W:32/48                            See Section 18.2.2.3.
# of programmable counters per thread   4 or (8 if a core not shared by two      4 or (8 if a core not shared by two      Use CPUID to enumerate # of counters.
                                        threads)                                 threads)
Precise Event Based Sampling (PEBS)     See Table 18-21                          See Table 18-21                          IA32_PMC4-IA32_PMC7 do not support PEBS.
Events
PEBS-Load Latency                       See Section 18.8.4.2                     See Section 18.8.4.2
PEBS-Precise Store                      No, replaced by Data Address profiling   Section 18.8.4.3
PEBS-PDIR                               yes (using precise INST_RETIRED.ALL)     yes (using precise INST_RETIRED.ALL)
PEBS-EventingIP                         yes                                      no
Data Address Profiling                  yes                                      no
LBR Profiling                           yes                                      yes
Call Stack Profiling                    yes, see Section 17.8                    no                                       Use LBR facility
Off-core Response Event                 MSR 1A6H and 1A7H; Extended request and  MSR 1A6H and 1A7H; Extended request and
                                        response types                           response types
Intel TSX support for Perfmon           See Section 18.10.5                      no

18.10.1 Precise Event Based Sampling (PEBS) Facility


The PEBS facility in the Next Generation Intel Core processor is similar to that in processors based on Intel microarchitecture code name Sandy Bridge, with several enhanced features. The key components and differences of the PEBS facility relative to Intel microarchitecture code name Sandy Bridge are summarized in Table 18-32.

Table 18-32 PEBS Facility Comparison

Box                       Haswell                                  Sandy Bridge                             Comment
Valid IA32_PMCx           PMC0-PMC3                                PMC0-PMC3                                No PEBS on PMC4-PMC7
PEBS Buffer Programming   Section 18.6.1.1                         Section 18.6.1.1                         Unchanged
IA32_PEBS_ENABLE Layout   Figure 18-29                             Figure 18-15
PEBS record layout        Table 18-33, Enhanced fields at offsets  Table 18-12, Enhanced fields at offsets
                          98H, A0H, A8H, B0H                       98H, A0H, A8H
PEBS Events               See Table 18-21                          See Table 18-21                          IA32_PMC4-IA32_PMC7 do not support PEBS.
PEBS-Load Latency         See Table 18-22                          Table 18-22
PEBS-Precise Store        no, replaced by data address profiling   yes; see Section 18.8.4.3
PEBS-PDIR                 yes                                      yes                                      IA32_PMC1 only
SAMPLING Restriction      Small SAV(CountDown) value incurs higher overhead than prior generation.

Only IA32_PMC0 through IA32_PMC3 support PEBS.

NOTE
PEBS events are only valid when the following fields of IA32_PERFEVTSELx are all zero: AnyThread, Edge, Invert, CMask.

18.10.2 PEBS Data Format


The PEBS record format for the Next Generation Intel Core processor is shown in Table 18-33. The PEBS record format, along with the debug/store area storage format, does not change regardless of whether IA-32e mode is active or not. CPUID.01H:ECX.DTES64[bit 2] reports whether the processor's DS storage format support is mode-independent. When set, it uses 64-bit DS storage format.

Table 18-33 PEBS Record Format for Next Generation Intel Core Processor Family

Byte Offset  Field     Byte Offset  Field
0x0          R/EFLAGS  0x60         R10
0x8          R/EIP     0x68         R11
0x10         R/EAX     0x70         R12
0x18         R/EBX     0x78         R13
0x20         R/ECX     0x80         R14
0x28         R/EDX     0x88         R15
0x30         R/ESI     0x90         IA32_PERF_GLOBAL_STATUS
0x38         R/EDI     0x98         Data Linear Address
0x40         R/EBP     0xA0         Data Source Encoding
0x48         R/ESP     0xA8         Latency value (core cycles)
0x50         R8        0xB0         EventingIP
0x58         R9

The layout of PEBS records is almost identical to that shown in Table 18-12. Offset 0xB0 is a new field that records the eventing IP address of the retired instruction that triggered the PEBS assist. The PEBS record fields at offsets 0x98, 0xA0, and 0xA8 record data gathered from three of the PEBS capabilities in prior processor generations: load latency facility (Section 18.8.4.2), PDIR (Section 18.8.4.4), and precise store (Section 18.8.4.3). In the core PMU of the next generation processor, the load latency facility and PDIR capabilities are unchanged. However, precise store is replaced by an enhanced capability, data address profiling, that is not restricted to store addresses. Data address profiling also records information in PEBS records at offsets 0x98, 0xA0, and 0xA8.
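
The record layout in Table 18-33 can be sketched as a flat sequence of 8-byte little-endian fields. The example below is illustrative only: the field names and the helper `parse_pebs_record` are hypothetical, and it covers only the fields from offset 0x0 through 0xB0. Any fields a processor writes beyond 0xB0 are ignored here.

```python
import struct

# Field names in offset order, 8 bytes each, offsets 0x0 through 0xB0.
# The names are illustrative; only the order and offsets come from Table 18-33.
PEBS_FIELDS = [
    "rflags", "rip", "rax", "rbx", "rcx", "rdx", "rsi", "rdi", "rbp", "rsp",
    "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15",
    "perf_global_status",     # 0x90
    "data_linear_address",    # 0x98
    "data_source_encoding",   # 0xA0
    "latency",                # 0xA8
    "eventing_ip",            # 0xB0
]
PARSED_SIZE = 8 * len(PEBS_FIELDS)  # 0xB8 bytes covered by this sketch


def parse_pebs_record(raw: bytes) -> dict:
    """Unpack the fields at offsets 0x0..0xB0 from one raw PEBS record."""
    values = struct.unpack_from("<%dQ" % len(PEBS_FIELDS), raw, 0)
    return dict(zip(PEBS_FIELDS, values))
```

Because `struct.unpack_from` reads from the start of the buffer, a longer record can be passed unchanged; the trailing bytes are left for the caller to interpret.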

18.10.3 PEBS Data Address Profiling


The Data Linear Address facility is also abbreviated as DataLA. The facility is a replacement or extension of the precise store facility in previous processor generations. The DataLA facility complements the load latency facility by providing a means to profile load and store memory references in the system, leverages the PEBS facility, and provides additional information about sampled loads and stores. Having precise memory reference events with linear address information for both loads and stores provides information to improve data structure layout, eliminate remote node references, and identify cache-line conflicts in NUMA systems. The DataLA facility in the next generation processor supports the following events configured to use PEBS:

Table 18-34 Precise Events That Support Data Linear Address Profiling

| Event Name | Event Name |
|---|---|
| MEM_UOPS_RETIRED.STLB_MISS_LOADS | MEM_UOPS_RETIRED.STLB_MISS_STORES |
| MEM_UOPS_RETIRED.LOCK_LOADS | MEM_UOPS_RETIRED.LOCK_STORES |
| MEM_UOPS_RETIRED.SPLIT_LOADS | MEM_UOPS_RETIRED.SPLIT_STORES |
| MEM_UOPS_RETIRED.ALL_LOADS | MEM_UOPS_RETIRED.ALL_STORES |
| MEM_LOAD_UOPS_RETIRED.L1_HIT | MEM_LOAD_UOPS_RETIRED.L2_HIT |
| MEM_LOAD_UOPS_RETIRED.LLC_HIT | MEM_LOAD_UOPS_RETIRED.L1_MISS |
| MEM_LOAD_UOPS_RETIRED.L2_MISS | MEM_LOAD_UOPS_RETIRED.LLC_MISS |
| MEM_LOAD_UOPS_RETIRED.HIT_LFB | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS |
| MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM |
| UOPS_RETIRED.ALL (if load or store is tagged) | MEM_LOAD_UOPS_MISC_RETIRED.UC |
| MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE | MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM |
| MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM_SNP_HIT | MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM |
| MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_DRAM_SNP_HIT | MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_HITM |
| MEM_LOAD_UOPS_LLC_MISS_RETIRED.REMOTE_FWD | MEM_LOAD_UOPS_MISC_RETIRED.NON_DRAM |
| MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS | |

DataLA can use any one of the IA32_PMC0-IA32_PMC3 counters. Counter overflows will initiate the generation of PEBS records. Upon counter overflow, hardware captures the linear address and possibly other status information of the retiring memory uop. This information is then written to the PEBS record that is subsequently generated.

To enable the DataLA facility, software must complete the following steps. Note that the DataLA facility relies on the PEBS facility, so the PEBS configuration requirements must be completed before attempting to capture DataLA information.
• Complete the PEBS configuration steps.
• Program an event listed in Table 18-34 using any one of IA32_PERFEVTSEL0-IA32_PERFEVTSEL3.
• Set the corresponding IA32_PEBS_ENABLE.PEBS_EN_CTRx bit and IA32_PEBS_ENABLE[63]. This enables the corresponding IA32_PMCx as a PEBS counter and enables the DataLA facility, respectively.
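
The last step above can be sketched as a small value computation. This is a minimal sketch under stated assumptions: the helper name `pebs_enable_for_datala` is hypothetical, and it assumes that PEBS_EN_CTRx occupies bit x of IA32_PEBS_ENABLE for x = 0 to 3. Bit 63 is taken from the text. Writing the value to the MSR is outside the scope of the example.

```python
DATALA_EN = 1 << 63  # IA32_PEBS_ENABLE[63], as described in the steps above


def pebs_enable_for_datala(counter: int) -> int:
    """Compute an IA32_PEBS_ENABLE value for DataLA on IA32_PMC<counter>.

    Assumption: PEBS_EN_CTRx is bit x of IA32_PEBS_ENABLE (x = 0..3).
    """
    if counter not in range(4):
        raise ValueError("DataLA can only use IA32_PMC0 through IA32_PMC3")
    return (1 << counter) | DATALA_EN
```

For example, selecting IA32_PMC2 yields a value with bit 2 and bit 63 set.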

When the DataLA facility is enabled, the relevant information written into a PEBS record affects entries at offsets 98H, A0H and A8H, as shown in Table 18-35.

Table 18-35 Layout of Data Linear Address Information In PEBS Record

| Field | Offset | Description |
|---|---|---|
| Data Linear Address | 98H | The linear address of the load or the destination of the store. |
| Store Status | A0H | DCU Hit (Bit 0): The store hit the data cache closest to the core (L1 cache) if this bit is set, otherwise the store missed the data cache. This information is valid only for the following store events: UOPS_RETIRED.ALL (if store is tagged), MEM_UOPS_RETIRED.STLB_MISS_STORES, MEM_UOPS_RETIRED.LOCK_STORES, MEM_UOPS_RETIRED.SPLIT_STORES, MEM_UOPS_RETIRED.ALL_STORES. Other bits are zero. The STLB_MISS, LOCK bit information can be obtained by programming the corresponding store event in Table 18-34. |
| Reserved | A8H | Always zero |
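
As an illustration of Table 18-35, the sketch below reads the two DataLA entries directly from a raw PEBS record. The function name `decode_datala` and the returned key names are hypothetical. As the table notes, the DCU Hit bit is meaningful only when the sampled event is one of the listed store events, which this helper does not check.

```python
import struct


def decode_datala(raw: bytes) -> dict:
    """Extract the DataLA fields at offsets 98H and A0H of a PEBS record."""
    linear_address, store_status = struct.unpack_from("<QQ", raw, 0x98)
    return {
        "linear_address": linear_address,
        # Bit 0 of Store Status: set if the store hit the L1 data cache.
        "dcu_hit": bool(store_status & 1),
    }
```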

18.10.3.1 EventingIP Record


The PEBS record layout for processors based on Intel microarchitecture code name Haswell adds a new field at offset 0B0H. This is the EventingIP field, which records the IP address of the retired instruction that triggered the PEBS assist. The EIP/RIP field at offset 08H records the IP address of the next instruction to be executed following the PEBS assist.

18.10.4 Off-core Response Performance Monitoring


The core PMU facility to collect off-core response events is similar to that described in Section 18.8.5. The event codes are listed in Table 18-24. Each event code for off-core response monitoring requires programming an associated configuration MSR, MSR_OFFCORE_RSP_x. Software must program MSR_OFFCORE_RSP_x according to:
• Transaction request type encoding (bits 15:0): see Table 18-36.
• Supplier information (bits 30:16): see Table 18-26.
• Snoop response information (bits 37:31): see Table 18-27.

Table 18-36 MSR_OFFCORE_RSP_x Request_Type Definition (Haswell)

| Bit Name | Offset | Description |
|---|---|---|
| DMND_DATA_RD | 0 | (R/W). Counts the number of demand and DCU prefetch data reads of full and partial cachelines as well as demand data page table entry cacheline reads. Does not count L2 data read prefetches or instruction fetches. |
| DMND_RFO | 1 | (R/W). Counts the number of demand and DCU prefetch reads for ownership (RFO) requests generated by a write to data cacheline. Does not count L2 RFO prefetches. |
| DMND_IFETCH | 2 | (R/W). Counts the number of demand and DCU prefetch instruction cacheline reads. Does not count L2 code read prefetches. |
| Reserved | 3 | Reserved |
| PF_DATA_RD | 4 | (R/W). Counts the number of data cacheline reads generated by L2 prefetchers. |
| PF_RFO | 5 | (R/W). Counts the number of RFO requests generated by L2 prefetchers. |
| PF_IFETCH | 6 | (R/W). Counts the number of code reads generated by L2 prefetchers. |
| Reserved | 7-14 | Reserved |
| OTHER | 15 | (R/W). Any other request that crosses IDI, including I/O. |
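
The three bit ranges of MSR_OFFCORE_RSP_x can be combined as sketched below. This is an illustrative sketch only: the function name `offcore_rsp_value` is hypothetical, the request-type bit positions come from Table 18-36, and the supplier and snoop fields are passed as raw bit patterns because their encodings (Table 18-26 and Table 18-27) are not reproduced in this section.

```python
# Request-type bit positions from Table 18-36 (reserved bits omitted).
REQUEST_TYPE_BITS = {
    "DMND_DATA_RD": 0,
    "DMND_RFO": 1,
    "DMND_IFETCH": 2,
    "PF_DATA_RD": 4,
    "PF_RFO": 5,
    "PF_IFETCH": 6,
    "OTHER": 15,
}


def offcore_rsp_value(request_types, supplier_bits=0, snoop_bits=0) -> int:
    """Compose an MSR_OFFCORE_RSP_x value.

    supplier_bits is the raw 15-bit field placed at bits 30:16;
    snoop_bits is the raw 7-bit field placed at bits 37:31.
    """
    if supplier_bits >> 15 or snoop_bits >> 7:
        raise ValueError("supplier is 15 bits wide, snoop is 7 bits wide")
    value = 0
    for name in request_types:
        value |= 1 << REQUEST_TYPE_BITS[name]
    return value | (supplier_bits << 16) | (snoop_bits << 31)
```

For instance, `offcore_rsp_value(["DMND_DATA_RD", "PF_DATA_RD"])` selects demand and L2-prefetch data reads and leaves the supplier and snoop fields for the caller to fill in.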

18.10.5 Performance Monitoring and Intel TSX


Intel TSX allows multi-threaded programs to make forward progress with less synchronization overhead. If a target workload for performance monitoring contains instruction streams using Intel TSX, the transactional code regions in the workload may encounter the following scenarios: (a) the transactional code on some logical processors may execute speculatively and commit results with synchronization overhead elided, or (b) the speculatively executed transactional code aborts and the transactional code restarts normal execution, experiencing the cost of the synchronization primitive. For details of transactional code behavior of Intel TSX, see Chapter 8 of Intel Architecture Instruction Set Extensions Programming Reference.

If a processor supports Intel TSX, the core PMU enhances its IA32_PERFEVTSELx MSR with two additional bit fields for event filtering. Support for Intel TSX is indicated by either (a) CPUID.(EAX=07H, ECX=0):EBX.RTM[bit 11] = 1, or (b) CPUID.(EAX=07H, ECX=0):EBX.HLE[bit 4] = 1. The TSX-enhanced layout of IA32_PERFEVTSELx is shown in Figure 18-34. The two additional bit fields are:
• IN_TX (bit 32): When set, the counter will only include counts that occurred inside a transactional region, regardless of whether that region was aborted or committed. This bit may only be set if the processor supports HLE or RTM.
• IN_TXCP (bit 33): When set, the counter will not include counts that occurred inside of an aborted transactional region. This bit may only be set if the processor supports HLE or RTM. This bit may only be set for IA32_PERFEVTSEL2.

When the IA32_PERFEVTSELx MSR is programmed with both IN_TX=0 and IN_TXCP=0 on a processor that supports Intel TSX, the result in a counter may include detectable conditions associated with a transaction code region for its aborted execution (if any) and completed execution.
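
The constraints on IN_TX and IN_TXCP can be expressed as a pre-check before an event select value is written. The sketch below is illustrative and assumes the caller has already obtained the EBX value returned by CPUID leaf 07H, sub-leaf 0; the helper names `supports_tsx` and `validate_tsx_filters` are hypothetical.

```python
HLE = 1 << 4    # CPUID.(EAX=07H, ECX=0):EBX bit 4
RTM = 1 << 11   # CPUID.(EAX=07H, ECX=0):EBX bit 11


def supports_tsx(cpuid_07_ebx: int) -> bool:
    """Return True if either HLE or RTM is reported."""
    return bool(cpuid_07_ebx & (HLE | RTM))


def validate_tsx_filters(counter: int, in_tx: bool, in_txcp: bool,
                         cpuid_07_ebx: int) -> None:
    """Raise ValueError if the requested TSX filter bits are not permitted."""
    if (in_tx or in_txcp) and not supports_tsx(cpuid_07_ebx):
        raise ValueError("IN_TX and IN_TXCP require HLE or RTM support")
    if in_txcp and counter != 2:
        raise ValueError("IN_TXCP may only be set for IA32_PERFEVTSEL2")
```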

[Figure: bit layout of IA32_PERFEVTSELx, most significant bit first]
Bits 63:34  Reserved
Bit 33      IN_TXCP: In Tx exclude abort
Bit 32      IN_TX: In Trans. Rgn
Bits 31:24  Counter Mask (CMASK)
Bit 23      INV: Invert counter mask
Bit 22      EN: Enable counters
Bit 21      ANY: Any Thread
Bit 20      INT: APIC interrupt enable
Bit 19      PC: Pin control
Bit 18      E: Edge detect
Bit 17      OS: Operating system mode
Bit 16      USR: User Mode
Bits 15:8   Unit Mask (UMASK)
Bits 7:0    Event Select

Figure 18-34 Layout of IA32_PERFEVTSELx MSRs Supporting Intel TSX


A common usage of setting IN_TXCP=1 is to capture the number of events that were discarded due to a transactional abort. With IA32_PMC2 configured to count in such a manner, when a TX region aborts, the value of that counter is restored to the value it had prior to the aborted transactional region. As a result, any updates performed to the counter during the aborted transactional region are discarded.

On the other hand, setting IN_TX=1 can be used to drill down on the performance characteristics of transactional code regions. When a PMCx is configured with the corresponding IA32_PERFEVTSELx.IN_TX=1, only eventing conditions that occur inside transactional code regions are propagated to the event logic and reflected in the counter result. Eventing conditions specified by IA32_PERFEVTSELx but occurring outside a transactional code region are discarded.

The following example illustrates using three counters to drill down on cycles spent inside and outside of transactional regions:
• Program IA32_PERFEVTSEL2 to count Unhalted_Core_Cycles with (IN_TXCP=1, IN_TX=0), such that IA32_PMC2 will count cycles spent due to aborted TSX transactions;
• Program IA32_PERFEVTSEL0 to count Unhalted_Core_Cycles with (IN_TXCP=0, IN_TX=1), such that IA32_PMC0 will count cycles spent by the transactional code regions;
• Program IA32_PERFEVTSEL1 to count Unhalted_Core_Cycles with (IN_TXCP=0, IN_TX=0), such that IA32_PMC1 will count total cycles spent by the non-transactional code and transactional code regions.
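
The three-counter example can be sketched as event select values. This is a minimal sketch under stated assumptions: Unhalted_Core_Cycles is encoded as event 3CH with umask 00H (the encoding listed for CPU_CLK_UNHALTED.THREAD_P), the USR, OS and EN bits are set as an illustrative choice, and the helper name `evtsel` is hypothetical. The bit positions follow Figure 18-34.

```python
USR = 1 << 16       # count in user mode
OS = 1 << 17        # count in operating system mode
EN = 1 << 22        # enable the counter
IN_TX = 1 << 32     # count only inside transactional regions
IN_TXCP = 1 << 33   # exclude counts from aborted transactional regions


def evtsel(event: int, umask: int, in_tx: bool = False,
           in_txcp: bool = False) -> int:
    """Build an IA32_PERFEVTSELx value (illustrative; USR, OS, EN always set)."""
    value = (event & 0xFF) | ((umask & 0xFF) << 8) | USR | OS | EN
    if in_tx:
        value |= IN_TX
    if in_txcp:
        value |= IN_TXCP
    return value


# Unhalted_Core_Cycles: event 3CH, umask 00H.
PROGRAMS = {
    "IA32_PERFEVTSEL2": evtsel(0x3C, 0x00, in_txcp=True),
    "IA32_PERFEVTSEL0": evtsel(0x3C, 0x00, in_tx=True),
    "IA32_PERFEVTSEL1": evtsel(0x3C, 0x00),
}
```

Keeping IN_TXCP on IA32_PERFEVTSEL2 reflects the restriction that this bit may only be set for that MSR.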

Additionally, a number of performance events are solely focused on characterizing the execution of Intel TSX transactional code; they are listed in Table 19-3.

18.10.5.1 Intel TSX and PEBS Support


If a PEBS event would have occurred inside a transactional region, then the transactional region first aborts, and then the PEBS event is processed. Two of the TSX performance monitoring events in Table 19-3 also support using the PEBS facility to capture additional information. They are:
• HLE_RETIRED.ABORTED (encoding 0xc8 mask 0x4),
• RTM_RETIRED.ABORTED (encoding 0xc9 mask 0x4).

A transactional abort (HLE_RETIRED.ABORTED, RTM_RETIRED.ABORTED) can also be programmed to cause PEBS events. In this scenario, a PEBS event is processed following the abort. Pending a PEBS record inside of a transactional region will cause a transactional abort. If a PEBS record was pended at the time of the abort or on an overflow of the TSX PEBS events listed above, only the following PEBS entries will be valid (enumerated by PEBS entry offset 0xB8 bits[33:32] to indicate an HLE abort or an RTM abort):
• Offset 0x98 Data Linear Address (if the uop that triggered PEBS was a load or a store),
• Offset 0xB0 EventingIP,
• Offset 0xB8 TX Abort Information

In the case of HLE, an aborted transaction will restart execution deterministically at the start of the HLE region. In the case of RTM, an aborted transaction will transfer execution to the RTM fallback handler. The layout of the TX Abort Information field is given in Table 18-37.

Table 18-37 TX Abort Information Field Definition

| Bit Name | Offset | Description |
|---|---|---|
| Cycles_Last_Block | 31:0 | The number of cycles in the last TSX region, regardless of whether that region had aborted or committed. |
| HLE_Abort | 32 | If set, the abort information corresponds to an aborted HLE execution |
| RTM_Abort | 33 | If set, the abort information corresponds to an aborted RTM execution |
| Instruction_Abort | 34 | If set, the transactional abort was associated with the instruction corresponding to the eventing IP |
| Non_Instruction_Abort | 35 | If set, the instruction corresponding to the eventing IP may not necessarily be related to the transactional abort. |
| Retry | 36 | If set, retrying the transactional execution may have succeeded. This value matches the RTM Abort Status Information in EAX bit[1] |
| Memory_Data_Conflict | 37 | If set, another logical processor conflicted with a memory address that was part of the transactional region that aborted. Matches RTM Abort Encoding EAX bit[2] |
| Capacity | 38 | Matches RTM Abort Encoding EAX bit[3] |
| Reserved | 63:39 | Reserved |
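
A decoder for the TX Abort Information field follows directly from Table 18-37. The sketch below is illustrative; the function name `decode_tx_abort` and the shape of the returned dictionary are hypothetical, while the bit names and positions are those in the table.

```python
# Single-bit flags from Table 18-37, keyed by bit position.
TX_ABORT_FLAGS = {
    32: "HLE_Abort",
    33: "RTM_Abort",
    34: "Instruction_Abort",
    35: "Non_Instruction_Abort",
    36: "Retry",
    37: "Memory_Data_Conflict",
    38: "Capacity",
}


def decode_tx_abort(field: int) -> dict:
    """Split the 64-bit TX Abort Information field (PEBS offset 0xB8)."""
    return {
        "cycles_last_block": field & 0xFFFFFFFF,   # bits 31:0
        "flags": [name for bit, name in sorted(TX_ABORT_FLAGS.items())
                  if (field >> bit) & 1],
    }
```

Bits 63:39 are reserved and are ignored by this helper.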

18.10.6 Uncore Performance Monitoring Facilities in Next Generation Intel Core Processors
The uncore sub-system in the Next Generation Intel Core processors provides its own performance monitoring facility. The uncore PMU facility provides dedicated MSRs to select uncore performance monitoring events in a manner similar to that described in Section 18.8.6. The ARB unit and each C-Box provide local pairs of event select MSR and counter register. The layout of the event select MSRs in the C-Boxes is identical to that shown in Figure 18-32. At the uncore domain level, there is a master set of control MSRs that centrally manages all the performance monitoring facilities of uncore units. Figure 18-33 shows the layout of the uncore domain global control. Additionally, there is a fixed counter, counting uncore clockticks, for the uncore domain. Table 18-28 summarizes the number of MSRs for the uncore PMU for each box.

Table 18-37 Uncore PMU MSR Summary

| Box | # of Boxes | Counters per Box | Counter Width | General Purpose | Global Enable | Comment |
|---|---|---|---|---|---|---|
| C-Box | SKU specific | 2 | 44 | Yes | Per-box | Up to 4, see Table 35-12 MSR_UNC_CBO_CONFIG |
| ARB | 1 | 2 | 44 | Yes | Uncore | |
| Fixed Counter | N.A. | N.A. | 48 | No | Uncore | |

The uncore performance events for the C-Box and ARB units are listed in Table 19-4. ...

18. Updates to Chapter 19, Volume 3B


Change bars show changes to Chapter 19 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3B: System Programming Guide, Part 2.
------------------------------------------------------------------------------------------
...
This chapter lists the performance-monitoring events that can be monitored with the Intel 64 or IA-32 processors. The ability to monitor performance events and the events that can be monitored in these processors are mostly model-specific, except for architectural performance events, described in Section 19.1. Non-architectural performance events (i.e. model-specific events) are listed for each generation of microarchitecture:
• Section 19.2 - Next Generation Intel Core Processors
• Section 19.3 - Processors based on Intel microarchitecture code name Ivy Bridge
• Section 19.4 - Processors based on Intel microarchitecture code name Sandy Bridge
• Section 19.5 - Processors based on Intel microarchitecture code name Nehalem
• Section 19.6 - Processors based on Intel microarchitecture code name Westmere
• Section 19.7 - Processors based on Enhanced Intel Core microarchitecture
• Section 19.8 - Processors based on Intel Core microarchitecture
• Section 19.9 - Processors based on Intel Atom microarchitecture
• Section 19.10 - Intel Core Solo and Intel Core Duo processors
• Section 19.11 - Processors based on Intel NetBurst microarchitecture
• Section 19.12 - Pentium M family processors
• Section 19.13 - P6 family processors
• Section 19.14 - Pentium processors

NOTE
These performance-monitoring events are intended to be used as guides for performance tuning. The counter values reported by the performance-monitoring events are approximate and believed to be useful as relative guides for tuning software. Known discrepancies are documented where applicable. ...

19.2 PERFORMANCE MONITORING EVENTS FOR NEXT GENERATION INTEL CORE PROCESSORS

The Next Generation Intel Core Processors are based on the Intel microarchitecture code name Haswell. They support the architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3CH and 06_45H.
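
The DisplayFamily_DisplayModel check can be sketched from the value returned in EAX by CPUID leaf 01H. The example below is illustrative; the helper names are hypothetical, and obtaining the raw EAX value is left to the caller. It applies the usual rule that the extended model field contributes only when the family field is 06H or 0FH, and the extended family field only when the family field is 0FH.

```python
def display_family_model(cpuid_01_eax: int) -> tuple:
    """Return (DisplayFamily, DisplayModel) from CPUID.01H:EAX."""
    model = (cpuid_01_eax >> 4) & 0xF
    family = (cpuid_01_eax >> 8) & 0xF
    ext_model = (cpuid_01_eax >> 16) & 0xF
    ext_family = (cpuid_01_eax >> 20) & 0xFF
    display_family = family + ext_family if family == 0xF else family
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model
    return display_family, display_model


# Signatures named in the paragraph above: 06_3CH and 06_45H.
SIGNATURES = {(0x06, 0x3C), (0x06, 0x45)}


def core_events_apply(cpuid_01_eax: int) -> bool:
    """Return True if the signature matches 06_3CH or 06_45H."""
    return display_family_model(cpuid_01_eax) in SIGNATURES
```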

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 03H | 02H | LD_BLOCKS.STORE_FORWARD | Loads blocked by overlapping with store buffer that cannot be forwarded. | |
| 05H | 01H | MISALIGN_MEM_REF.LOADS | Speculative cache-line split load uops dispatched to L1D. | |
| 05H | 02H | MISALIGN_MEM_REF.STORES | Speculative cache-line split Store-address uops dispatched to L1D. | |
| 07H | 01H | LD_BLOCKS_PARTIAL.ADDRESS_ALIAS | False dependencies in MOB due to partial compare on address. | |
| 08H | 01H | DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK | Misses in all TLB levels that cause a page walk of any page size. | |
| 08H | 02H | DTLB_LOAD_MISSES.WALK_COMPLETED_4K | Completed page walks due to demand load misses that caused 4K page walks in any TLB levels. | |
| 08H | 04H | DTLB_LOAD_MISSES.WALK_COMPLETED_2M_4M | Completed page walks due to demand load misses that caused 2M/4M page walks in any TLB levels. | |
| 08H | 0EH | DTLB_LOAD_MISSES.WALK_COMPLETED | Completed page walks in any TLB of any page size due to demand load misses | |
| 08H | 10H | DTLB_LOAD_MISSES.WALK_DURATION | Cycle PMH is busy with a walk. | |
| 08H | 20H | DTLB_LOAD_MISSES.STLB_HIT_4K | Load misses that missed DTLB but hit STLB (4K). | |
| 08H | 40H | DTLB_LOAD_MISSES.STLB_HIT_2M | Load misses that missed DTLB but hit STLB (2M). | |
| 08H | 60H | DTLB_LOAD_MISSES.STLB_HIT | Number of cache load STLB hits. No page walk. | |
| 08H | 80H | DTLB_LOAD_MISSES.PDE_CACHE_MISS | DTLB demand load misses with low part of linear-to-physical address translation missed | |
| 0DH | 03H | INT_MISC.RECOVERY_CYCLES | Cycles waiting to recover after Machine Clears except JEClear. Set Cmask = 1. | Set Edge to count occurrences |
| 0EH | 01H | UOPS_ISSUED.ANY | Increments each cycle the # of Uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles of this core. | Set Cmask = 1, Inv = 1 to count stalled cycles |
| 0EH | 10H | UOPS_ISSUED.FLAGS_MERGE | Number of flags-merge uops allocated. Such uops add delay. | |

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 0EH | 20H | UOPS_ISSUED.SLOW_LEA | Number of slow LEA or similar uops allocated. Such uop has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not. | |
| 0EH | 40H | UOPS_ISSUED.SINGLE_MUL | Number of multiply packed/scalar single precision uops allocated. | |
| 24H | 21H | L2_RQSTS.DEMAND_DATA_RD_MISS | Demand Data Read requests that missed L2, no rejects. | |
| 24H | 41H | L2_RQSTS.DEMAND_DATA_RD_HIT | Demand Data Read requests that hit L2 cache. | |
| 24H | E1H | L2_RQSTS.ALL_DEMAND_DATA_RD | Counts any demand and L1 HW prefetch data load requests to L2. | |
| 24H | 42H | L2_RQSTS.RFO_HIT | Counts the number of store RFO requests that hit the L2 cache. | |
| 24H | 22H | L2_RQSTS.RFO_MISS | Counts the number of store RFO requests that miss the L2 cache. | |
| 24H | E2H | L2_RQSTS.ALL_RFO | Counts all L2 store RFO requests. | |
| 24H | 44H | L2_RQSTS.CODE_RD_HIT | Number of instruction fetches that hit the L2 cache. | |
| 24H | 24H | L2_RQSTS.CODE_RD_MISS | Number of instruction fetches that missed the L2 cache. | |
| 24H | 27H | L2_RQSTS.ALL_DEMAND_MISS | Demand requests that miss L2 cache. | |
| 24H | E7H | L2_RQSTS.ALL_DEMAND_REFERENCES | Demand requests to L2 cache. | |
| 24H | E4H | L2_RQSTS.ALL_CODE_RD | Counts all L2 code requests. | |
| 24H | 50H | L2_RQSTS.L2_PF_HIT | Counts all L2 HW prefetcher requests that hit L2. | |
| 24H | 30H | L2_RQSTS.L2_PF_MISS | Counts all L2 HW prefetcher requests that missed L2. | |
| 24H | F8H | L2_RQSTS.ALL_PF | Counts all L2 HW prefetcher requests. | |
| 24H | 3FH | L2_RQSTS.MISS | All requests that missed L2. | |
| 24H | FFH | L2_RQSTS.REFERENCES | All requests to L2 cache. | |
| 27H | 50H | L2_DEMAND_RQSTS.WB_HIT | Not rejected writebacks that hit L2 cache | |
| 2EH | 4FH | LONGEST_LAT_CACHE.REFERENCE | This event counts requests originating from the core that reference a cache line in the last level cache. | see Table 19-1 |
| 2EH | 41H | LONGEST_LAT_CACHE.MISS | This event counts each cache miss condition for references to the last level cache. | see Table 19-1 |
| 3CH | 00H | CPU_CLK_UNHALTED.THREAD_P | Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. | see Table 19-1 |
| 3CH | 01H | CPU_CLK_THREAD_UNHALTED.REF_XCLK | Increments at the frequency of XCLK (100 MHz) when not halted. | see Table 19-1 |

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 48H | 01H | L1D_PEND_MISS.PENDING | Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge = 1 to count occurrences. | Counter 2 only; Set Cmask = 1 to count cycles. |
| 49H | 01H | DTLB_STORE_MISSES.MISS_CAUSES_A_WALK | Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G). | |
| 49H | 02H | DTLB_STORE_MISSES.WALK_COMPLETED_4K | Completed page walks due to store misses in one or more TLB levels of 4K page structure. | |
| 49H | 04H | DTLB_STORE_MISSES.WALK_COMPLETED_2M_4M | Completed page walks due to store misses in one or more TLB levels of 2M/4M page structure. | |
| 49H | 0EH | DTLB_STORE_MISSES.WALK_COMPLETED | Completed page walks due to store miss in any TLB levels of any page size (4K/2M/4M/1G). | |
| 49H | 10H | DTLB_STORE_MISSES.WALK_DURATION | Cycles PMH is busy with this walk. | |
| 49H | 20H | DTLB_STORE_MISSES.STLB_HIT_4K | Store misses that missed DTLB but hit STLB (4K). | |
| 49H | 40H | DTLB_STORE_MISSES.STLB_HIT_2M | Store misses that missed DTLB but hit STLB (2M). | |
| 49H | 60H | DTLB_STORE_MISSES.STLB_HIT | Store operations that miss the first TLB level but hit the second and do not cause page walks. | |
| 49H | 80H | DTLB_STORE_MISSES.PDE_CACHE_MISS | DTLB store misses with low part of linear-to-physical address translation missed. | |
| 4CH | 01H | LOAD_HIT_PRE.SW_PF | Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch. | |
| 4CH | 02H | LOAD_HIT_PRE.HW_PF | Non-SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. | |
| 51H | 01H | L1D.REPLACEMENT | Counts the number of lines brought into the L1 data cache. | |
| 58H | 04H | MOVE_ELIMINATION.INT_NOT_ELIMINATED | Number of integer Move Elimination candidate uops that were not eliminated. | |
| 58H | 08H | MOVE_ELIMINATION.SIMD_NOT_ELIMINATED | Number of SIMD Move Elimination candidate uops that were not eliminated. | |
| 58H | 01H | MOVE_ELIMINATION.INT_ELIMINATED | Number of integer Move Elimination candidate uops that were eliminated. | |
| 58H | 02H | MOVE_ELIMINATION.SIMD_ELIMINATED | Number of SIMD Move Elimination candidate uops that were eliminated. | |
| 5CH | 01H | CPL_CYCLES.RING0 | Unhalted core cycles when the thread is in ring 0. | Use Edge to count transition |
| 5CH | 02H | CPL_CYCLES.RING123 | Unhalted core cycles when the thread is not in ring 0. | |
| 5EH | 01H | RS_EVENTS.EMPTY_CYCLES | Cycles the RS is empty for the thread. | |
| 63H | 01H | LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION | Cycles in which the L1D and L2 are locked, due to a UC lock or split lock. | |

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 63H | 02H | LOCK_CYCLES.CACHE_LOCK_DURATION | Cycles in which the L1D is locked. | |
| 79H | 02H | IDQ.EMPTY | Counts cycles the IDQ is empty. | |
| 79H | 04H | IDQ.MITE_UOPS | Increment each cycle # of uops delivered to IDQ from MITE path. Set Cmask = 1 to count cycles. | Can combine Umask 04H and 20H |
| 79H | 08H | IDQ.DSB_UOPS | Increment each cycle. # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles. | Can combine Umask 08H and 10H |
| 79H | 10H | IDQ.MS_DSB_UOPS | Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery. | Can combine Umask 04H, 08H |
| 79H | 20H | IDQ.MS_MITE_UOPS | Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles. | Can combine Umask 04H, 08H |
| 79H | 30H | IDQ.MS_UOPS | Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles. | Can combine Umask 04H, 08H |
| 79H | 18H | IDQ.ALL_DSB_CYCLES_ANY_UOPS | Counts cycles DSB is delivered at least one uops. Set Cmask = 1. | |
| 79H | 18H | IDQ.ALL_DSB_CYCLES_4_UOPS | Counts cycles DSB is delivered four uops. Set Cmask = 4. | |
| 79H | 24H | IDQ.ALL_MITE_CYCLES_ANY_UOPS | Counts cycles MITE is delivered at least one uops. Set Cmask = 1. | |
| 79H | 24H | IDQ.ALL_MITE_CYCLES_4_UOPS | Counts cycles MITE is delivered four uops. Set Cmask = 4. | |
| 79H | 3CH | IDQ.MITE_ALL_UOPS | # of uops delivered to IDQ from any path. | |
| 80H | 02H | ICACHE.MISSES | Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. | |
| 85H | 01H | ITLB_MISSES.MISS_CAUSES_A_WALK | Misses in ITLB that causes a page walk of any page size. | |
| 85H | 02H | ITLB_MISSES.WALK_COMPLETED_4K | Completed page walks due to misses in ITLB 4K page entries. | |
| 85H | 04H | ITLB_MISSES.WALK_COMPLETED_2M_4M | Completed page walks due to misses in ITLB 2M/4M page entries. | |
| 85H | 0EH | ITLB_MISSES.WALK_COMPLETED | Completed page walks in ITLB of any page size. | |
| 85H | 10H | ITLB_MISSES.WALK_DURATION | Cycle PMH is busy with a walk. | |
| 85H | 20H | ITLB_MISSES.STLB_HIT_4K | ITLB misses that hit STLB (4K). | |
| 85H | 40H | ITLB_MISSES.STLB_HIT_2M | ITLB misses that hit STLB (2M). | |
| 85H | 60H | ITLB_MISSES.STLB_HIT | ITLB misses that hit STLB. No page walk. | |

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| 87H | 01H | ILD_STALL.LCP | Stalls caused by changing prefix length of the instruction. | |
| 87H | 04H | ILD_STALL.IQ_FULL | Stall cycles due to IQ is full. | |
| 88H | 01H | BR_INST_EXEC.COND | Qualify conditional near branch instructions executed, but not necessarily retired. | Must combine with umask 40H, 80H |
| 88H | 02H | BR_INST_EXEC.DIRECT_JMP | Qualify all unconditional near branch instructions excluding calls and indirect branches. | Must combine with umask 80H |
| 88H | 04H | BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET | Qualify executed indirect near branch instructions that are not calls nor returns. | Must combine with umask 80H |
| 88H | 08H | BR_INST_EXEC.RETURN_NEAR | Qualify indirect near branches that have a return mnemonic. | Must combine with umask 80H |
| 88H | 10H | BR_INST_EXEC.DIRECT_NEAR_CALL | Qualify unconditional near call branch instructions, excluding non call branch, executed. | Must combine with umask 80H |
| 88H | 20H | BR_INST_EXEC.INDIRECT_NEAR_CALL | Qualify indirect near calls, including both register and memory indirect, executed. | Must combine with umask 80H |
| 88H | 40H | BR_INST_EXEC.NONTAKEN | Qualify non-taken near branches executed. | Applicable to umask 01H only |
| 88H | 80H | BR_INST_EXEC.TAKEN | Qualify taken near branches executed. Must combine with 01H, 02H, 04H, 08H, 10H, 20H. | |
| 88H | FFH | BR_INST_EXEC.ALL_BRANCHES | Counts all near executed branches (not necessarily retired). | |
| 89H | 01H | BR_MISP_EXEC.COND | Qualify conditional near branch instructions mispredicted. | Must combine with umask 40H, 80H |
| 89H | 04H | BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET | Qualify mispredicted indirect near branch instructions that are not calls nor returns. | Must combine with umask 80H |
| 89H | 08H | BR_MISP_EXEC.RETURN_NEAR | Qualify mispredicted indirect near branches that have a return mnemonic. | Must combine with umask 80H |
| 89H | 10H | BR_MISP_EXEC.DIRECT_NEAR_CALL | Qualify mispredicted unconditional near call branch instructions, excluding non call branch, executed. | Must combine with umask 80H |
| 89H | 20H | BR_MISP_EXEC.INDIRECT_NEAR_CALL | Qualify mispredicted indirect near calls, including both register and memory indirect, executed. | Must combine with umask 80H |
| 89H | 40H | BR_MISP_EXEC.NONTAKEN | Qualify mispredicted non-taken near branches executed. | Applicable to umask 01H only |
| 89H | 80H | BR_MISP_EXEC.TAKEN | Qualify mispredicted taken near branches executed. Must combine with 01H, 02H, 04H, 08H, 10H, 20H. | |
| 89H | FFH | BR_MISP_EXEC.ALL_BRANCHES | Counts all near executed branches (not necessarily retired). | |
| 9CH | 01H | IDQ_UOPS_NOT_DELIVERED.CORE | Count number of non-delivered uops to RAT per thread. | Use Cmask to qualify uop b/w |
| A1H | 01H | UOPS_EXECUTED_PORT.PORT_0 | Cycles which a Uop is dispatched on port 0 in this thread. | Set AnyThread to count per core |

Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)

| Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment |
|---|---|---|---|---|
| A1H | 02H | UOPS_EXECUTED_PORT.PORT_1 | Cycles which a Uop is dispatched on port 1 in this thread. | Set AnyThread to count per core |
| A1H | 04H | UOPS_EXECUTED_PORT.PORT_2 | Cycles which a uop is dispatched on port 2 in this thread. | Set AnyThread to count per core |
| A1H | 08H | UOPS_EXECUTED_PORT.PORT_3 | Cycles which a uop is dispatched on port 3 in this thread. | Set AnyThread to count per core |
| A1H | 10H | UOPS_EXECUTED_PORT.PORT_4 | Cycles which a uop is dispatched on port 4 in this thread. | Set AnyThread to count per core |
| A1H | 20H | UOPS_EXECUTED_PORT.PORT_5 | Cycles which a uop is dispatched on port 5 in this thread. | Set AnyThread to count per core |
| A1H | 40H | UOPS_EXECUTED_PORT.PORT_6 | Cycles which a Uop is dispatched on port 6 in this thread. | Set AnyThread to count per core |
| A1H | 80H | UOPS_EXECUTED_PORT.PORT_7 | Cycles which a Uop is dispatched on port 7 in this thread | Set AnyThread to count per core |
| A2H | 01H | RESOURCE_STALLS.ANY | Cycles Allocation is stalled due to Resource Related reason. | |
| A2H | 04H | RESOURCE_STALLS.RS | Cycles stalled due to no eligible RS entry available. | |
| A2H | 08H | RESOURCE_STALLS.SB | Cycles stalled due to no store buffers available (not including draining from sync). | |
| A2H | 10H | RESOURCE_STALLS.ROB | Cycles stalled due to re-order buffer full. | |
| A3H | 02H | CYCLE_ACTIVITY.CYCLES_LDM_PENDING | Cycles with pending memory loads. Set Cmask=2 to count cycle. | |
| A3H | 08H | CYCLE_ACTIVITY.CYCLES_L1D_PENDING | Cycles with pending L1 cache miss loads. Set Cmask=8 to count cycle. | PMC2 only |
| AEH | 01H | ITLB.ITLB_FLUSH | Counts the number of ITLB flushes, includes 4k/2M/4M pages. | |
| B0H | 02H | OFFCORE_REQUESTS.DEMAND_CODE_RD | Demand code read requests sent to uncore. | |
| B0H | 04H | OFFCORE_REQUESTS.DEMAND_RFO | Demand RFO read requests sent to uncore, including regular RFOs, locks, ItoM. | |
| B0H | 08H | OFFCORE_REQUESTS.ALL_DATA_RD | Data read requests sent to uncore (demand and prefetch). | |
| B1H | 02H | UOPS_EXECUTED.CORE | Counts total number of uops to be executed per-core each cycle. | Do not need to set ANY |
| B7H | 01H | OFF_CORE_RESPONSE_0 | See Section 18.8.5, Off-core Response Performance Monitoring. | Requires MSR 01A6H |
| BBH | 01H | OFF_CORE_RESPONSE_1 | See Section 18.8.5, Off-core Response Performance Monitoring. | Requires MSR 01A7H |
| BCH | 11H | PAGE_WALKER_LOADS.DTLB_L1 | Number of DTLB page walker loads that hit in the L1+FB. | |
| BCH | 21H | PAGE_WALKER_LOADS.ITLB_L1 | Number of ITLB page walker loads that hit in the L1+FB. | |


Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
BCH | 12H | PAGE_WALKER_LOADS.DTLB_L2 | Number of DTLB page walker loads that hit in the L2. |
BCH | 22H | PAGE_WALKER_LOADS.ITLB_L2 | Number of ITLB page walker loads that hit in the L2. |
BCH | 14H | PAGE_WALKER_LOADS.DTLB_L3 | Number of DTLB page walker loads that hit in the L3. |
BCH | 24H | PAGE_WALKER_LOADS.ITLB_L3 | Number of ITLB page walker loads that hit in the L3. |
BCH | 18H | PAGE_WALKER_LOADS.DTLB_MEMORY | Number of DTLB page walker loads from memory. |
BCH | 28H | PAGE_WALKER_LOADS.ITLB_MEMORY | Number of ITLB page walker loads from memory. |
BDH | 01H | TLB_FLUSH.DTLB_THREAD | DTLB flush attempts of the thread-specific entries. |
BDH | 20H | TLB_FLUSH.STLB_ANY | Count number of STLB flush attempts. |
C0H | 00H | INST_RETIRED.ANY_P | Number of instructions at retirement. | See Table 19-1
C0H | 01H | INST_RETIRED.ALL | Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution. | PMC1 only; Must quiesce other PMCs.
C1H | 08H | OTHER_ASSISTS.AVX_TO_SSE | Number of transitions from AVX-256 to legacy SSE when penalty applicable. |
C1H | 10H | OTHER_ASSISTS.SSE_TO_AVX | Number of transitions from SSE to AVX-256 when penalty applicable. |
C1H | 40H | OTHER_ASSISTS.ANY_WB_ASSIST | Number of microcode assists invoked by HW upon uop writeback. |
C2H | 01H | UOPS_RETIRED.ALL | Counts the number of micro-ops retired. Use cmask=1 and invert to count active cycles or stalled cycles. | Supports PEBS, use Any=1 for core granular.
C2H | 02H | UOPS_RETIRED.RETIRE_SLOTS | Counts the number of retirement slots used each cycle. |
C3H | 02H | MACHINE_CLEARS.MEMORY_ORDERING | Counts the number of machine clears due to memory order conflicts. |
C3H | 04H | MACHINE_CLEARS.SMC | Number of self-modifying-code machine clears detected. |
C3H | 20H | MACHINE_CLEARS.MASKMOV | Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. |
C4H | 00H | BR_INST_RETIRED.ALL_BRANCHES | Branch instructions at retirement. | See Table 19-1
C4H | 01H | BR_INST_RETIRED.CONDITIONAL | Counts the number of conditional branch instructions retired. | Supports PEBS
C4H | 02H | BR_INST_RETIRED.NEAR_CALL | Direct and indirect near call instructions retired. |


Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
C4H | 04H | BR_INST_RETIRED.ALL_BRANCHES | Counts the number of branch instructions retired. |
C4H | 08H | BR_INST_RETIRED.NEAR_RETURN | Counts the number of near return instructions retired. |
C4H | 10H | BR_INST_RETIRED.NOT_TAKEN | Counts the number of not taken branch instructions retired. |
C4H | 20H | BR_INST_RETIRED.NEAR_TAKEN | Number of near taken branches retired. |
C4H | 40H | BR_INST_RETIRED.FAR_BRANCH | Number of far branches retired. |
C5H | 00H | BR_MISP_RETIRED.ALL_BRANCHES | Mispredicted branch instructions at retirement. | See Table 19-1
C5H | 01H | BR_MISP_RETIRED.CONDITIONAL | Mispredicted conditional branch instructions retired. | Supports PEBS
C5H | 04H | BR_MISP_RETIRED.ALL_BRANCHES | Mispredicted macro branch instructions retired. |
CAH | 02H | FP_ASSIST.X87_OUTPUT | Number of X87 FP assists due to output values. |
CAH | 04H | FP_ASSIST.X87_INPUT | Number of X87 FP assists due to input values. |
CAH | 08H | FP_ASSIST.SIMD_OUTPUT | Number of SIMD FP assists due to output values. |
CAH | 10H | FP_ASSIST.SIMD_INPUT | Number of SIMD FP assists due to input values. |
CAH | 1EH | FP_ASSIST.ANY | Cycles with any input/output SSE* or FP assists. |
CCH | 20H | ROB_MISC_EVENTS.LBR_INSERTS | Count cases of saving new LBR records by hardware. |
CDH | 01H | MEM_TRANS_RETIRED.LOAD_LATENCY | Sample loads with specified latency threshold. PMC3 only. | Specify threshold in MSR 0x3F6
D0H | 01H | MEM_UOP_RETIRED.LOADS | Qualify retired memory uops that are loads. Combine with umask 10H, 20H, 40H, 80H. | Supports PEBS and DataLA
D0H | 02H | MEM_UOP_RETIRED.STORES | Qualify retired memory uops that are stores. Combine with umask 10H, 20H, 40H, 80H. | Supports PEBS and DataLA
D0H | 10H | MEM_UOP_RETIRED.STLB_MISS | Qualify retired memory uops with STLB miss. Must combine with umask 01H, 02H, to produce counts. | Supports PEBS and DataLA
D0H | 20H | MEM_UOP_RETIRED.LOCK | Qualify retired memory uops with lock. Must combine with umask 01H, 02H, to produce counts. | Supports PEBS and DataLA
D0H | 40H | MEM_UOP_RETIRED.SPLIT | Qualify retired memory uops with line split. Must combine with umask 01H, 02H, to produce counts. | Supports PEBS and DataLA
D0H | 80H | MEM_UOP_RETIRED.ALL | Qualify any retired memory uops. Must combine with umask 01H, 02H, to produce counts. | Supports PEBS and DataLA
D1H | 01H | MEM_LOAD_UOPS_RETIRED.L1_HIT | Retired load uops with L1 cache hits as data sources. | Supports PEBS and DataLA
D1H | 02H | MEM_LOAD_UOPS_RETIRED.L2_HIT | Retired load uops with L2 cache hits as data sources. | Supports PEBS and DataLA


Table 19-2 Non-Architectural Performance Events In the Processor Core of Next Generation Intel Core Processors (Contd.)
Event Num. D1H D1H D1H Umask Value 04H 10H 40H Event Mask Mnemonic Description Comment Supports PEBS and DataLA Supports PEBS and DataLA

MEM_LOAD_UOPS_RETIRED.LL Retired load uops with LLC cache hits as data C_HIT sources. MEM_LOAD_UOPS_RETIRED.L 2_MISS Retired load uops missed L2. Unknown data source excluded.

MEM_LOAD_UOPS_RETIRED.HI Retired load uops which data sources were load T_LFB uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were LLC hit ETIRED.XSNP_MISS and cross-core snoop missed in on-pkg core cache. Supports PEBS and DataLA

D2H D2H D2H D2H D3H E6H F0H F0H F0H F0H F0H F0H F0H F0H F1H F1H F1H F1H F2H F2H

01H 02H 04H 08H 01H 1FH 01H 02H 04H 08H 10H 20H 40H 80H 01H 02H 04H 07H 05H 06H

MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were LLC and Supports PEBS and ETIRED.XSNP_HIT cross-core snoop hits in on-pkg core cache. DataLA MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were HitM ETIRED.XSNP_HITM responses from shared LLC. MEM_LOAD_UOPS_LLC_HIT_R Retired load uops which data sources were hits in ETIRED.XSNP_NONE LLC without snoops required. MEM_LOAD_UOPS_LLC_MISS_ RETIRED.LOCAL_DRAM BACLEARS.ANY Retired load uops which data sources missed LLC but serviced from local dram. Number of front end re-steers due to BPU misprediction. RFO requests that access L2 cache. L2 cache accesses when fetching instructions. Any MLC or LLC HW prefetch accessing L2, including rejects. L1D writebacks that access L2 cache. L2 fill requests that access L2 cache. L2 writebacks that access L2 cache. Transactions accessing L2 pipe. L2 cache lines in I state filling L2. L2 cache lines in S state filling L2. L2 cache lines in E state filling L2. L2 cache lines filling L2. Counting does not cover rejects. Counting does not cover rejects. Counting does not cover rejects. Counting does not cover rejects. Supports PEBS and DataLA Supports PEBS and DataLA Supports PEBS and DataLA.

L2_TRANS.DEMAND_DATA_RD Demand Data Read requests that access L2 cache. L2_TRANS.RFO L2_TRANS.CODE_RD L2_TRANS.ALL_PF L2_TRANS.L1D_WB L2_TRANS.L2_FILL L2_TRANS.L2_WB L2_TRANS.ALL_REQUESTS L2_LINES_IN.I L2_LINES_IN.S L2_LINES_IN.E L2_LINES_IN.ALL

L2_LINES_OUT.DEMAND_CLEA Clean L2 cache lines evicted by demand. N L2_LINES_OUT.DEMAND_DIRT Dirty L2 cache lines evicted by demand. Y
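Several rows in Table 19-2 are qualifiers that only produce counts when their umask bits are OR-ed with another row of the same event. The sketch below illustrates this for event D0H (MEM_UOP_RETIRED), using the umask values listed in the table. The dictionary and function names are hypothetical helpers introduced for illustration; they are not part of any Intel tool or API.

```python
# Umask bits for event D0H (MEM_UOP_RETIRED) as listed in Table 19-2.
# All names below are illustrative assumptions, not an existing API.
MEM_UOP_RETIRED_EVENT = 0xD0

# "Type" bits: which kind of memory uop is counted.
TYPE_BITS = {"LOADS": 0x01, "STORES": 0x02}

# "Qualifier" bits: which property of the uop is required.
QUALIFIER_BITS = {"STLB_MISS": 0x10, "LOCK": 0x20, "SPLIT": 0x40, "ALL": 0x80}


def combined_umask(type_name: str, qualifier_name: str) -> int:
    """OR one type bit with one qualifier bit to form a usable umask."""
    return TYPE_BITS[type_name] | QUALIFIER_BITS[qualifier_name]


if __name__ == "__main__":
    for t in TYPE_BITS:
        for q in QUALIFIER_BITS:
            print(f"event {MEM_UOP_RETIRED_EVENT:02X}H, "
                  f"umask {combined_umask(t, q):02X}H: {t} with {q}")
```

Under this reading of the table, all retired load uops would use umask 81H and retired store uops that missed the STLB would use umask 12H.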


Table 19-3 Intel TSX Performance Events


Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
54H | 01H | TX_MEM.ABORT_CONFLICT | Number of times a transactional abort was signaled due to a data conflict on a transactionally accessed address. |
54H | 02H | TX_MEM.ABORT_CAPACITY | Number of times a transactional abort was signaled due to a data capacity limitation. |
54H | 04H | TX_MEM.ABORT_HLE_STORE_TO_ELIDED_LOCK | Number of times a HLE transactional region aborted due to a non XRELEASE prefixed instruction writing to an elided lock in the elision buffer. |
54H | 08H | TX_MEM.ABORT_HLE_ELISION_BUFFER_NOT_EMPTY | Number of times an HLE transactional execution aborted due to NoAllocatedElisionBuffer being nonzero. |
54H | 10H | TX_MEM.ABORT_HLE_ELISION_BUFFER_MISMATCH | Number of times an HLE transactional execution aborted due to XRELEASE lock not satisfying the address and value requirements in the elision buffer. |
54H | 20H | TX_MEM.ABORT_HLE_ELISION_BUFFER_UNSUPPORTED_ALIGNMENT | Number of times an HLE transactional execution aborted due to an unsupported read alignment from the elision buffer. |
54H | 40H | TX_MEM.ABORT_HLE_ELISION_BUFFER_FULL | Number of times HLE lock could not be elided due to ElisionBufferAvailable being zero. |
5DH | 01H | TX_EXEC.MISC1 | Counts the number of times a class of instructions that may cause a transactional abort was executed. Since this is the count of execution, it may not always cause a transactional abort. |
5DH | 02H | TX_EXEC.MISC2 | Counts the number of times a class of instructions that may cause a transactional abort was executed inside a transactional region. |
5DH | 04H | TX_EXEC.MISC3 | Counts the number of times an instruction execution caused the nest count supported to be exceeded. |
5DH | 08H | TX_EXEC.MISC4 | Counts the number of times an HLE XACQUIRE instruction was executed inside an RTM transactional region. |


Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
C8H | 01H | HLE_RETIRED.START | Number of times an HLE execution started. | If HLE is supported
C8H | 02H | HLE_RETIRED.COMMIT | Number of times an HLE execution successfully committed. | If HLE is supported
C8H | 04H | HLE_RETIRED.ABORTED | Number of times an HLE execution aborted due to any reasons (multiple categories may count as one). | If HLE is supported
C8H | 08H | HLE_RETIRED.ABORTED_MISC1 | Number of times an HLE execution aborted due to various memory events. | If HLE is supported
C8H | 10H | HLE_RETIRED.ABORTED_MISC2 | Number of times an HLE execution aborted due to uncommon conditions. | If HLE is supported
C8H | 20H | HLE_RETIRED.ABORTED_MISC3 | Number of times an HLE execution aborted due to HLE-unfriendly instructions. | If HLE is supported
C8H | 40H | HLE_RETIRED.ABORTED_MISC4 | Number of times an HLE execution aborted due to incompatible memory type. | If HLE is supported
C8H | 80H | HLE_RETIRED.ABORTED_MISC5 | Number of times an HLE execution aborted due to none of the previous 4 categories (e.g. interrupt). | If HLE is supported
C9H | 01H | RTM_RETIRED.START | Number of times an RTM execution started. | If RTM is supported
C9H | 02H | RTM_RETIRED.COMMIT | Number of times an RTM execution successfully committed. | If RTM is supported
C9H | 04H | RTM_RETIRED.ABORTED | Number of times an RTM execution aborted due to any reasons (multiple categories may count as one). | If RTM is supported
C9H | 08H | RTM_RETIRED.ABORTED_MISC1 | Number of times an RTM execution aborted due to various memory events. | If RTM is supported
C9H | 10H | RTM_RETIRED.ABORTED_MISC2 | Number of times an RTM execution aborted due to uncommon conditions. | If RTM is supported
C9H | 20H | RTM_RETIRED.ABORTED_MISC3 | Number of times an RTM execution aborted due to HLE-unfriendly instructions. | If RTM is supported
C9H | 40H | RTM_RETIRED.ABORTED_MISC4 | Number of times an RTM execution aborted due to incompatible memory type. | If RTM is supported
C9H | 80H | RTM_RETIRED.ABORTED_MISC5 | Number of times an RTM execution aborted due to none of the previous 4 categories (e.g. interrupt). | If RTM is supported

Non-architectural performance-monitoring events located in the uncore sub-system are implementation-specific and can differ between platforms using processors based on Intel microarchitecture Sandy Bridge. Processors with a CPUID signature of DisplayFamily_DisplayModel 06_3CH and 06_45H support the performance events listed in Table 19-4.
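The DisplayFamily_DisplayModel signature used above is derived from the value that CPUID leaf 01H returns in EAX. The following is a minimal sketch of that derivation, assuming the EAX value has already been obtained by some other means (Python cannot execute CPUID directly). The function names and the sample EAX values are illustrative assumptions.

```python
def display_family_model(eax: int) -> str:
    """Format a CPUID.01H:EAX value as DisplayFamily_DisplayModel.

    Field positions follow the CPUID version-information layout:
    model in bits 7:4, family in bits 11:8, extended model in
    bits 19:16, extended family in bits 27:20.
    """
    model_id = (eax >> 4) & 0xF
    family_id = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is only added when the base family is 0FH.
    display_family = family_id + ext_family if family_id == 0xF else family_id

    # The extended model is only used for family 06H and 0FH.
    if family_id in (0x6, 0xF):
        display_model = (ext_model << 4) | model_id
    else:
        display_model = model_id

    return f"{display_family:02X}_{display_model:02X}H"


def supports_table_19_4(eax: int) -> bool:
    """True if the signature is one of those named for Table 19-4."""
    return display_family_model(eax) in {"06_3CH", "06_45H"}


if __name__ == "__main__":
    # Sample EAX values chosen for illustration only.
    for sample in (0x000306C3, 0x00040651, 0x000306A9):
        print(hex(sample), display_family_model(sample),
              supports_table_19_4(sample))
```

On Linux, the same family and model values are typically visible in /proc/cpuinfo, which avoids the need to issue CPUID from user code.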


Table 19-4 Non-Architectural Uncore Performance Events In the Next Generation Intel Core Processors
Event Num. (1) | Umask Value | Event Mask Mnemonic | Description | Comment
22H | 01H | UNC_CBO_XSNP_RESPONSE.MISS | A snoop misses in some processor core. | Must combine with one of the umask values of 20H, 40H, 80H
22H | 02H | UNC_CBO_XSNP_RESPONSE.INVAL | A snoop invalidates a non-modified line in some processor core. | Must combine with one of the umask values of 20H, 40H, 80H
22H | 04H | UNC_CBO_XSNP_RESPONSE.HIT | A snoop hits a non-modified line in some processor core. | Must combine with one of the umask values of 20H, 40H, 80H
22H | 08H | UNC_CBO_XSNP_RESPONSE.HITM | A snoop hits a modified line in some processor core. | Must combine with one of the umask values of 20H, 40H, 80H
22H | 10H | UNC_CBO_XSNP_RESPONSE.INVAL_M | A snoop invalidates a modified line in some processor core. | Must combine with one of the umask values of 20H, 40H, 80H
22H | 20H | UNC_CBO_XSNP_RESPONSE.EXTERNAL_FILTER | Filter on cross-core snoops initiated by this Cbox due to external snoop request. | Must combine with at least one of 01H, 02H, 04H, 08H, 10H
22H | 40H | UNC_CBO_XSNP_RESPONSE.XCORE_FILTER | Filter on cross-core snoops initiated by this Cbox due to processor core memory request. | Must combine with at least one of 01H, 02H, 04H, 08H, 10H
22H | 80H | UNC_CBO_XSNP_RESPONSE.EVICTION_FILTER | Filter on cross-core snoops initiated by this Cbox due to LLC eviction. | Must combine with at least one of 01H, 02H, 04H, 08H, 10H
34H | 01H | UNC_CBO_CACHE_LOOKUP.M | LLC lookup request that accesses cache and found line in M-state. | Must combine with one of the umask values of 10H, 20H, 40H, 80H
34H | 02H | UNC_CBO_CACHE_LOOKUP.E | LLC lookup request that accesses cache and found line in E-state. | Must combine with one of the umask values of 10H, 20H, 40H, 80H
34H | 04H | UNC_CBO_CACHE_LOOKUP.S | LLC lookup request that accesses cache and found line in S-state. | Must combine with one of the umask values of 10H, 20H, 40H, 80H
34H | 08H | UNC_CBO_CACHE_LOOKUP.I | LLC lookup request that accesses cache and found line in I-state. | Must combine with one of the umask values of 10H, 20H, 40H, 80H
34H | 10H | UNC_CBO_CACHE_LOOKUP.READ_FILTER | Filter on processor core initiated cacheable read requests. Must combine with at least one of 01H, 02H, 04H, 08H. |
34H | 20H | UNC_CBO_CACHE_LOOKUP.WRITE_FILTER | Filter on processor core initiated cacheable write requests. Must combine with at least one of 01H, 02H, 04H, 08H. |
34H | 40H | UNC_CBO_CACHE_LOOKUP.EXTSNP_FILTER | Filter on external snoop requests. Must combine with at least one of 01H, 02H, 04H, 08H. |
34H | 80H | UNC_CBO_CACHE_LOOKUP.ANY_REQUEST_FILTER | Filter on any IRQ or IPQ initiated requests including uncacheable, non-coherent requests. Must combine with at least one of 01H, 02H, 04H, 08H. |
80H | 01H | UNC_ARB_TRK_OCCUPANCY.ALL | Counts cycles weighted by the number of requests waiting for data returning from the memory controller. Accounts for coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC. | Counter 0 only
81H | 01H | UNC_ARB_TRK_REQUEST.ALL | Counts the number of coherent and non-coherent requests initiated by IA cores, processor graphic units, or LLC. |


Table 19-4 Non-Architectural Uncore Performance Events In the Next Generation Intel Core Processors
Event Num. (1) | Umask Value | Event Mask Mnemonic | Description | Comment
81H | 20H | UNC_ARB_TRK_REQUEST.WRITES | Counts the number of allocated write entries, including full, partial, and LLC evictions. |
81H | 80H | UNC_ARB_TRK_REQUEST.EVICTIONS | Counts the number of LLC evictions allocated. |
83H | 01H | UNC_ARB_COH_TRK_OCCUPANCY.ALL | Cycles weighted by number of requests pending in Coherency Tracker. | Counter 0 only
84H | 01H | UNC_ARB_COH_TRK_REQUEST.ALL | Number of requests allocated in Coherency Tracker. |

NOTES:
1. The uncore events must be programmed using MSRs located in specific performance monitoring units in the uncore. UNC_CBO* events are supported using MSR_UNC_CBO* MSRs; UNC_ARB* events are supported using MSR_UNC_ARB* MSRs.

19.3 PERFORMANCE MONITORING EVENTS FOR 3RD GENERATION INTEL CORE PROCESSORS

3rd Generation Intel Core Processors are based on the Intel microarchitecture code name Ivy Bridge. They support architectural performance-monitoring events listed in Table 19-1. Non-architectural performance-monitoring events in the processor core are listed in Table 19-5. The events in Table 19-5 apply to processors with CPUID signature of DisplayFamily_DisplayModel encoding with the following values: 06_3AH.
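The Event Num. and Umask Value columns of Table 19-5, together with the Cmask, Inv, Edge, and AnyThread hints in its descriptions, correspond to fields of an IA32_PERFEVTSELx register. The sketch below packs those fields into a register value, assuming the architectural performance-monitoring layout (event select in bits 7:0, umask in bits 15:8, USR 16, OS 17, Edge 18, AnyThread 21, EN 22, INV 23, Cmask in bits 31:24). The function name and its parameters are hypothetical; the layout should be checked against the performance-monitoring chapter before use.

```python
def perfevtsel(event: int, umask: int, *, usr: bool = True,
               os_mode: bool = True, edge: bool = False,
               any_thread: bool = False, enable: bool = True,
               inv: bool = False, cmask: int = 0) -> int:
    """Pack event-select fields into an IA32_PERFEVTSELx-style value.

    Hypothetical helper; bit positions are an assumption based on the
    architectural performance-monitoring register layout.
    """
    for name, field in (("event", event), ("umask", umask), ("cmask", cmask)):
        if not 0 <= field <= 0xFF:
            raise ValueError(f"{name} must fit in 8 bits")

    value = event | (umask << 8)
    value |= int(usr) << 16         # count while in user mode
    value |= int(os_mode) << 17     # count while in kernel mode
    value |= int(edge) << 18        # count rising edges instead of cycles
    value |= int(any_thread) << 21  # count across logical processors of a core
    value |= int(enable) << 22      # enable the counter
    value |= int(inv) << 23         # invert the Cmask comparison
    value |= cmask << 24            # counter mask threshold
    return value


if __name__ == "__main__":
    # UOPS_ISSUED.ANY (event 0EH, umask 01H) with Cmask = 1 and Inv = 1,
    # which the table describes as counting stalled cycles.
    print(hex(perfevtsel(0x0E, 0x01, inv=True, cmask=1)))
```

Keeping every qualifier as a keyword argument is a design choice made here so that a call site reads like the table's comment column, for example `inv=True, cmask=1`.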

Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
03H | 02H | LD_BLOCKS.STORE_FORWARD | Loads blocked by overlapping with store buffer that cannot be forwarded. |
05H | 01H | MISALIGN_MEM_REF.LOADS | Speculative cache-line split load uops dispatched to L1D. |
05H | 02H | MISALIGN_MEM_REF.STORES | Speculative cache-line split Store-address uops dispatched to L1D. |
07H | 01H | LD_BLOCKS_PARTIAL.ADDRESS_ALIAS | False dependencies in MOB due to partial compare on address. |
08H | 81H | DTLB_LOAD_MISSES.MISS_CAUSES_A_WALK | Misses in all TLB levels that cause a page walk of any page size from demand loads. |
08H | 82H | DTLB_LOAD_MISSES.WALK_COMPLETED | Misses in all TLB levels that caused page walk completed of any size by demand loads. |
08H | 84H | DTLB_LOAD_MISSES.WALK_DURATION | Cycle PMH is busy with a walk due to demand loads. |
0EH | 01H | UOPS_ISSUED.ANY | Increments each cycle the # of uops issued by the RAT to RS. Set Cmask = 1, Inv = 1, Any = 1 to count stalled cycles of this core. | Set Cmask = 1, Inv = 1 to count stalled cycles


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
0EH | 10H | UOPS_ISSUED.FLAGS_MERGE | Number of flags-merge uops allocated. Such uops add delay. |
0EH | 20H | UOPS_ISSUED.SLOW_LEA | Number of slow LEA or similar uops allocated. Such uop has 3 sources (e.g. 2 sources + immediate) regardless if as a result of LEA instruction or not. |
0EH | 40H | UOPS_ISSUED.SINGLE_MUL | Number of multiply packed/scalar single precision uops allocated. |
14H | 01H | ARITH.FPU_DIV_ACTIVE | Cycles that the divider is active, includes INT and FP. Set 'edge =1, cmask=1' to count the number of divides. |
24H | 01H | L2_RQSTS.DEMAND_DATA_RD_HIT | Demand Data Read requests that hit L2 cache. |
24H | 03H | L2_RQSTS.ALL_DEMAND_DATA_RD | Counts any demand and L1 HW prefetch data load requests to L2. |
24H | 04H | L2_RQSTS.RFO_HITS | Counts the number of store RFO requests that hit the L2 cache. |
24H | 08H | L2_RQSTS.RFO_MISS | Counts the number of store RFO requests that miss the L2 cache. |
24H | 0CH | L2_RQSTS.ALL_RFO | Counts all L2 store RFO requests. |
24H | 10H | L2_RQSTS.CODE_RD_HIT | Number of instruction fetches that hit the L2 cache. |
24H | 20H | L2_RQSTS.CODE_RD_MISS | Number of instruction fetches that missed the L2 cache. |
24H | 30H | L2_RQSTS.ALL_CODE_RD | Counts all L2 code requests. |
24H | 40H | L2_RQSTS.PF_HIT | Counts all L2 HW prefetcher requests that hit L2. |
24H | 80H | L2_RQSTS.PF_MISS | Counts all L2 HW prefetcher requests that missed L2. |
24H | C0H | L2_RQSTS.ALL_PF | Counts all L2 HW prefetcher requests. |
27H | 01H | L2_STORE_LOCK_RQSTS.MISS | RFOs that miss cache lines. |
27H | 08H | L2_STORE_LOCK_RQSTS.HIT_M | RFOs that hit cache lines in M state. |
27H | 0FH | L2_STORE_LOCK_RQSTS.ALL | RFOs that access cache lines in any state. |
28H | 01H | L2_L1D_WB_RQSTS.MISS | Not rejected writebacks that missed LLC. |
28H | 04H | L2_L1D_WB_RQSTS.HIT_E | Not rejected writebacks from L1D to L2 cache lines in E state. |
28H | 08H | L2_L1D_WB_RQSTS.HIT_M | Not rejected writebacks from L1D to L2 cache lines in M state. |
28H | 0FH | L2_L1D_WB_RQSTS.ALL | Not rejected writebacks from L1D to L2 cache lines in any state. |
2EH | 4FH | LONGEST_LAT_CACHE.REFERENCE | This event counts requests originating from the core that reference a cache line in the last level cache. | See Table 19-1


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
2EH | 41H | LONGEST_LAT_CACHE.MISS | This event counts each cache miss condition for references to the last level cache. | See Table 19-1
3CH | 00H | CPU_CLK_UNHALTED.THREAD_P | Counts the number of thread cycles while the thread is not in a halt state. The thread enters the halt state when it is running the HLT instruction. The core frequency may change from time to time due to power or thermal throttling. | See Table 19-1
3CH | 01H | CPU_CLK_THREAD_UNHALTED.REF_XCLK | Increments at the frequency of XCLK (100 MHz) when not halted. | See Table 19-1
48H | 01H | L1D_PEND_MISS.PENDING | Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge = 1 to count occurrences. | PMC2 only; Set Cmask = 1 to count cycles.
49H | 01H | DTLB_STORE_MISSES.MISS_CAUSES_A_WALK | Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G). |
49H | 02H | DTLB_STORE_MISSES.WALK_COMPLETED | Miss in all TLB levels causes a page walk that completes of any page size (4K/2M/4M/1G). |
49H | 04H | DTLB_STORE_MISSES.WALK_DURATION | Cycles PMH is busy with this walk. |
49H | 10H | DTLB_STORE_MISSES.STLB_HIT | Store operations that miss the first TLB level but hit the second and do not cause page walks. |
4CH | 01H | LOAD_HIT_PRE.SW_PF | Non-SW-prefetch load dispatches that hit fill buffer allocated for S/W prefetch. |
4CH | 02H | LOAD_HIT_PRE.HW_PF | Non-SW-prefetch load dispatches that hit fill buffer allocated for H/W prefetch. |
51H | 01H | L1D.REPLACEMENT | Counts the number of lines brought into the L1 data cache. |
58H | 04H | MOVE_ELIMINATION.INT_NOT_ELIMINATED | Number of integer Move Elimination candidate uops that were not eliminated. |
58H | 08H | MOVE_ELIMINATION.SIMD_NOT_ELIMINATED | Number of SIMD Move Elimination candidate uops that were not eliminated. |
58H | 01H | MOVE_ELIMINATION.INT_ELIMINATED | Number of integer Move Elimination candidate uops that were eliminated. |
58H | 02H | MOVE_ELIMINATION.SIMD_ELIMINATED | Number of SIMD Move Elimination candidate uops that were eliminated. |
5CH | 01H | CPL_CYCLES.RING0 | Unhalted core cycles when the thread is in ring 0. | Use Edge to count transition
5CH | 02H | CPL_CYCLES.RING123 | Unhalted core cycles when the thread is not in ring 0. |
5EH | 01H | RS_EVENTS.EMPTY_CYCLES | Cycles the RS is empty for the thread. |
5FH | 01H | DTLB_LOAD_MISSES.STLB_HIT | Counts load operations that missed 1st level DTLB but hit the 2nd level. |


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
60H | 01H | OFFCORE_REQUESTS_OUTSTANDING.DEMAND_DATA_RD | Offcore outstanding Demand Data Read transactions in SQ to uncore. Set Cmask=1 to count cycles. |
60H | 02H | OFFCORE_REQUESTS_OUTSTANDING.DEMAND_CODE_RD | Offcore outstanding Demand Code Read transactions in SQ to uncore. Set Cmask=1 to count cycles. |
60H | 04H | OFFCORE_REQUESTS_OUTSTANDING.DEMAND_RFO | Offcore outstanding RFO store transactions in SQ to uncore. Set Cmask=1 to count cycles. |
60H | 08H | OFFCORE_REQUESTS_OUTSTANDING.ALL_DATA_RD | Offcore outstanding cacheable data read transactions in SQ to uncore. Set Cmask=1 to count cycles. |
63H | 01H | LOCK_CYCLES.SPLIT_LOCK_UC_LOCK_DURATION | Cycles in which the L1D and L2 are locked, due to a UC lock or split lock. |
63H | 02H | LOCK_CYCLES.CACHE_LOCK_DURATION | Cycles in which the L1D is locked. |
79H | 02H | IDQ.EMPTY | Counts cycles the IDQ is empty. |
79H | 04H | IDQ.MITE_UOPS | Increment each cycle # of uops delivered to IDQ from MITE path. Set Cmask = 1 to count cycles. | Can combine Umask 04H and 20H
79H | 08H | IDQ.DSB_UOPS | Increment each cycle # of uops delivered to IDQ from DSB path. Set Cmask = 1 to count cycles. | Can combine Umask 08H and 10H
79H | 10H | IDQ.MS_DSB_UOPS | Increment each cycle # of uops delivered to IDQ when MS_busy by DSB. Set Cmask = 1 to count cycles. Add Edge=1 to count # of delivery. | Can combine Umask 04H, 08H
79H | 20H | IDQ.MS_MITE_UOPS | Increment each cycle # of uops delivered to IDQ when MS_busy by MITE. Set Cmask = 1 to count cycles. | Can combine Umask 04H, 08H
79H | 30H | IDQ.MS_UOPS | Increment each cycle # of uops delivered to IDQ from MS by either DSB or MITE. Set Cmask = 1 to count cycles. | Can combine Umask 04H, 08H
79H | 18H | IDQ.ALL_DSB_CYCLES_ANY_UOPS | Counts cycles DSB is delivered at least one uops. Set Cmask = 1. |
79H | 18H | IDQ.ALL_DSB_CYCLES_4_UOPS | Counts cycles DSB is delivered four uops. Set Cmask = 4. |
79H | 24H | IDQ.ALL_MITE_CYCLES_ANY_UOPS | Counts cycles MITE is delivered at least one uops. Set Cmask = 1. |
79H | 24H | IDQ.ALL_MITE_CYCLES_4_UOPS | Counts cycles MITE is delivered four uops. Set Cmask = 4. |
79H | 3CH | IDQ.MITE_ALL_UOPS | # of uops delivered to IDQ from any path. |
80H | 02H | ICACHE.MISSES | Number of Instruction Cache, Streaming Buffer and Victim Cache Misses. Includes UC accesses. |


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
85H | 01H | ITLB_MISSES.MISS_CAUSES_A_WALK | Misses in all ITLB levels that cause page walks. |
85H | 02H | ITLB_MISSES.WALK_COMPLETED | Misses in all ITLB levels that cause completed page walks. |
85H | 04H | ITLB_MISSES.WALK_DURATION | Cycle PMH is busy with a walk. |
85H | 10H | ITLB_MISSES.STLB_HIT | Number of cache load STLB hits. No page walk. |
87H | 01H | ILD_STALL.LCP | Stalls caused by changing prefix length of the instruction. |
87H | 04H | ILD_STALL.IQ_FULL | Stall cycles due to IQ is full. |
88H | 01H | BR_INST_EXEC.COND | Qualify conditional near branch instructions executed, but not necessarily retired. | Must combine with umask 40H, 80H
88H | 02H | BR_INST_EXEC.DIRECT_JMP | Qualify all unconditional near branch instructions excluding calls and indirect branches. | Must combine with umask 80H
88H | 04H | BR_INST_EXEC.INDIRECT_JMP_NON_CALL_RET | Qualify executed indirect near branch instructions that are not calls nor returns. | Must combine with umask 80H
88H | 08H | BR_INST_EXEC.RETURN_NEAR | Qualify indirect near branches that have a return mnemonic. | Must combine with umask 80H
88H | 10H | BR_INST_EXEC.DIRECT_NEAR_CALL | Qualify unconditional near call branch instructions, excluding non call branch, executed. | Must combine with umask 80H
88H | 20H | BR_INST_EXEC.INDIRECT_NEAR_CALL | Qualify indirect near calls, including both register and memory indirect, executed. | Must combine with umask 80H
88H | 40H | BR_INST_EXEC.NONTAKEN | Qualify non-taken near branches executed. | Applicable to umask 01H only
88H | 80H | BR_INST_EXEC.TAKEN | Qualify taken near branches executed. | Must combine with 01H, 02H, 04H, 08H, 10H, 20H.
88H | FFH | BR_INST_EXEC.ALL_BRANCHES | Counts all near executed branches (not necessarily retired). |
89H | 01H | BR_MISP_EXEC.COND | Qualify conditional near branch instructions mispredicted. | Must combine with umask 40H, 80H
89H | 04H | BR_MISP_EXEC.INDIRECT_JMP_NON_CALL_RET | Qualify mispredicted indirect near branch instructions that are not calls nor returns. | Must combine with umask 80H
89H | 08H | BR_MISP_EXEC.RETURN_NEAR | Qualify mispredicted indirect near branches that have a return mnemonic. | Must combine with umask 80H
89H | 10H | BR_MISP_EXEC.DIRECT_NEAR_CALL | Qualify mispredicted unconditional near call branch instructions, excluding non call branch, executed. | Must combine with umask 80H
89H | 20H | BR_MISP_EXEC.INDIRECT_NEAR_CALL | Qualify mispredicted indirect near calls, including both register and memory indirect, executed. | Must combine with umask 80H
89H | 40H | BR_MISP_EXEC.NONTAKEN | Qualify mispredicted non-taken near branches executed. | Applicable to umask 01H only
89H | 80H | BR_MISP_EXEC.TAKEN | Qualify mispredicted taken near branches executed. | Must combine with 01H, 02H, 04H, 08H, 10H, 20H.


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
89H | FFH | BR_MISP_EXEC.ALL_BRANCHES | Counts all near executed branches (not necessarily retired). |
9CH | 01H | IDQ_UOPS_NOT_DELIVERED.CORE | Count number of non-delivered uops to RAT per thread. | Use Cmask to qualify uop b/w
A1H | 01H | UOPS_DISPATCHED_PORT.PORT_0 | Cycles in which a uop is dispatched on port 0. |
A1H | 02H | UOPS_DISPATCHED_PORT.PORT_1 | Cycles in which a uop is dispatched on port 1. |
A1H | 04H | UOPS_DISPATCHED_PORT.PORT_2_LD | Cycles in which a load uop is dispatched on port 2. |
A1H | 08H | UOPS_DISPATCHED_PORT.PORT_2_STA | Cycles in which a store address uop is dispatched on port 2. |
A1H | 0CH | UOPS_DISPATCHED_PORT.PORT_2 | Cycles in which a uop is dispatched on port 2. |
A1H | 10H | UOPS_DISPATCHED_PORT.PORT_3_LD | Cycles in which a load uop is dispatched on port 3. |
A1H | 20H | UOPS_DISPATCHED_PORT.PORT_3_STA | Cycles in which a store address uop is dispatched on port 3. |
A1H | 30H | UOPS_DISPATCHED_PORT.PORT_3 | Cycles in which a uop is dispatched on port 3. |
A1H | 40H | UOPS_DISPATCHED_PORT.PORT_4 | Cycles in which a uop is dispatched on port 4. |
A1H | 80H | UOPS_DISPATCHED_PORT.PORT_5 | Cycles in which a uop is dispatched on port 5. |
A2H | 01H | RESOURCE_STALLS.ANY | Cycles Allocation is stalled due to Resource Related reason. |
A2H | 04H | RESOURCE_STALLS.RS | Cycles stalled due to no eligible RS entry available. |
A2H | 08H | RESOURCE_STALLS.SB | Cycles stalled due to no store buffers available (not including draining from sync). |
A2H | 10H | RESOURCE_STALLS.ROB | Cycles stalled due to re-order buffer full. |
A3H | 01H | CYCLE_ACTIVITY.CYCLES_L2_PENDING | Cycles with pending L2 miss loads. Set AnyThread to count per core. |
A3H | 02H | CYCLE_ACTIVITY.CYCLES_LDM_PENDING | Cycles with pending memory loads. Set AnyThread to count per core. | PMC0-3 only.
A3H | 08H | CYCLE_ACTIVITY.CYCLES_L1D_PENDING | Cycles with pending L1 cache miss loads. Set AnyThread to count per core. | PMC2 only
A3H | 04H | CYCLE_ACTIVITY.CYCLES_NO_EXECUTE | Cycles of dispatch stalls. Set AnyThread to count per core. |
ABH | 01H | DSB2MITE_SWITCHES.COUNT | Number of DSB to MITE switches. |
ABH | 02H | DSB2MITE_SWITCHES.PENALTY_CYCLES | Cycles DSB to MITE switches caused delay. |
ACH | 08H | DSB_FILL.EXCEED_DSB_LINES | DSB Fill encountered > 3 DSB lines. |


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
AEH | 01H | ITLB.ITLB_FLUSH | Counts the number of ITLB flushes, includes 4k/2M/4M pages. |
B0H | 01H | OFFCORE_REQUESTS.DEMAND_DATA_RD | Demand data read requests sent to uncore. |
B0H | 02H | OFFCORE_REQUESTS.DEMAND_CODE_RD | Demand code read requests sent to uncore. |
B0H | 04H | OFFCORE_REQUESTS.DEMAND_RFO | Demand RFO read requests sent to uncore, including regular RFOs, locks, ItoM. |
B0H | 08H | OFFCORE_REQUESTS.ALL_DATA_RD | Data read requests sent to uncore (demand and prefetch). |
B1H | 01H | UOPS_EXECUTED.THREAD | Counts total number of uops to be executed per thread each cycle. Set Cmask = 1, INV = 1 to count stall cycles. |
B1H | 02H | UOPS_EXECUTED.CORE | Counts total number of uops to be executed per core each cycle. | Do not need to set ANY
B7H | 01H | OFFCORE_RESPONSE_0 | See Section 18.8.5, Off-core Response Performance Monitoring. | Requires MSR 01A6H
BBH | 01H | OFFCORE_RESPONSE_1 | See Section 18.8.5, Off-core Response Performance Monitoring. | Requires MSR 01A7H
BDH | 01H | TLB_FLUSH.DTLB_THREAD | DTLB flush attempts of the thread-specific entries. |
BDH | 20H | TLB_FLUSH.STLB_ANY | Count number of STLB flush attempts. |
C0H | 00H | INST_RETIRED.ANY_P | Number of instructions at retirement. | See Table 19-1
C0H | 01H | INST_RETIRED.ALL | Precise instruction retired event with HW to reduce effect of PEBS shadow in IP distribution. | PMC1 only
C1H | 08H | OTHER_ASSISTS.AVX_STORE | Number of assists associated with 256-bit AVX store operations. |
C1H | 10H | OTHER_ASSISTS.AVX_TO_SSE | Number of transitions from AVX-256 to legacy SSE when penalty applicable. |
C1H | 20H | OTHER_ASSISTS.SSE_TO_AVX | Number of transitions from SSE to AVX-256 when penalty applicable. |
C2H | 01H | UOPS_RETIRED.ALL | Counts the number of micro-ops retired. Use cmask=1 and invert to count active cycles or stalled cycles. | Supports PEBS, use Any=1 for core granular.
C2H | 02H | UOPS_RETIRED.RETIRE_SLOTS | Counts the number of retirement slots used each cycle. |
C3H | 02H | MACHINE_CLEARS.MEMORY_ORDERING | Counts the number of machine clears due to memory order conflicts. |
C3H | 04H | MACHINE_CLEARS.SMC | Number of self-modifying-code machine clears detected. |
C3H | 20H | MACHINE_CLEARS.MASKMOV | Counts the number of executed AVX masked load operations that refer to an illegal address range with the mask bits set to 0. |


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
C4H | 00H | BR_INST_RETIRED.ALL_BRANCHES | Branch instructions at retirement. | See Table 19-1
C4H | 01H | BR_INST_RETIRED.CONDITIONAL | Counts the number of conditional branch instructions retired. | Supports PEBS
C4H | 02H | BR_INST_RETIRED.NEAR_CALL | Direct and indirect near call instructions retired. |
C4H | 04H | BR_INST_RETIRED.ALL_BRANCHES | Counts the number of branch instructions retired. |
C4H | 08H | BR_INST_RETIRED.NEAR_RETURN | Counts the number of near return instructions retired. |
C4H | 10H | BR_INST_RETIRED.NOT_TAKEN | Counts the number of not taken branch instructions retired. |
C4H | 20H | BR_INST_RETIRED.NEAR_TAKEN | Number of near taken branches retired. |
C4H | 40H | BR_INST_RETIRED.FAR_BRANCH | Number of far branches retired. |
C5H | 00H | BR_MISP_RETIRED.ALL_BRANCHES | Mispredicted branch instructions at retirement. | See Table 19-1
C5H | 01H | BR_MISP_RETIRED.CONDITIONAL | Mispredicted conditional branch instructions retired. | Supports PEBS
C5H | 02H | BR_MISP_RETIRED.NEAR_CALL | Direct and indirect mispredicted near call instructions retired. |
C5H | 04H | BR_MISP_RETIRED.ALL_BRANCHES | Mispredicted macro branch instructions retired. |
C5H | 10H | BR_MISP_RETIRED.NOT_TAKEN | Mispredicted not taken branch instructions retired. |
C5H | 20H | BR_MISP_RETIRED.TAKEN | Mispredicted taken branch instructions retired. |
CAH | 02H | FP_ASSIST.X87_OUTPUT | Number of X87 FP assists due to output values. |
CAH | 04H | FP_ASSIST.X87_INPUT | Number of X87 FP assists due to input values. |
CAH | 08H | FP_ASSIST.SIMD_OUTPUT | Number of SIMD FP assists due to output values. |
CAH | 10H | FP_ASSIST.SIMD_INPUT | Number of SIMD FP assists due to input values. |
CAH | 1EH | FP_ASSIST.ANY | Cycles with any input/output SSE* or FP assists. |
CCH | 20H | ROB_MISC_EVENTS.LBR_INSERTS | Count cases of saving new LBR records by hardware. |
CDH | 01H | MEM_TRANS_RETIRED.LOAD_LATENCY | Sample loads with specified latency threshold. PMC3 only. | Specify threshold in MSR 0x3F6
CDH | 02H | MEM_TRANS_RETIRED.PRECISE_STORE | Sample stores and collect precise store operation via PEBS record. PMC3 only. | See Section 18.8.4.3
D0H | 01H | MEM_UOPS_RETIRED.LOADS | Qualify retired memory uops that are loads. Combine with umask 10H, 20H, 40H, 80H. | Supports PEBS


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
D0H | 02H | MEM_UOPS_RETIRED.STORES | Qualify retired memory uops that are stores. Combine with umask 10H, 20H, 40H, 80H. |
D0H | 10H | MEM_UOPS_RETIRED.STLB_MISS | Qualify retired memory uops with STLB miss. Must combine with umask 01H, 02H, to produce counts. |
D0H | 20H | MEM_UOPS_RETIRED.LOCK | Qualify retired memory uops with lock. Must combine with umask 01H, 02H, to produce counts. |
D0H | 40H | MEM_UOPS_RETIRED.SPLIT | Qualify retired memory uops with line split. Must combine with umask 01H, 02H, to produce counts. |
D0H | 80H | MEM_UOPS_RETIRED.ALL | Qualify any retired memory uops. Must combine with umask 01H, 02H, to produce counts. |
D1H | 01H | MEM_LOAD_UOPS_RETIRED.L1_HIT | Retired load uops with L1 cache hits as data sources. | Supports PEBS
D1H | 02H | MEM_LOAD_UOPS_RETIRED.L2_HIT | Retired load uops with L2 cache hits as data sources. |
D1H | 04H | MEM_LOAD_UOPS_RETIRED.LLC_HIT | Retired load uops with LLC cache hits as data sources. |
D1H | 20H | MEM_LOAD_UOPS_RETIRED.LLC_MISS | Retired load uops which data sources were data missed LLC (excluding unknown data source). |
D1H | 40H | MEM_LOAD_UOPS_RETIRED.HIT_LFB | Retired load uops which data sources were load uops missed L1 but hit FB due to preceding miss to the same cache line with data not ready. |
D2H | 01H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS | Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache. | Supports PEBS
D2H | 02H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT | Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache. | Supports PEBS
D2H | 04H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM | Retired load uops which data sources were HitM responses from shared LLC. |
D2H | 08H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE | Retired load uops which data sources were hits in LLC without snoops required. |
D3H | 01H | MEM_LOAD_UOPS_LLC_MISS_RETIRED.LOCAL_DRAM | Retired load uops which data sources missed LLC but serviced from local dram. | Supports PEBS.
E6H | 1FH | BACLEARS.ANY | Number of front end re-steers due to BPU misprediction. |
F0H | 01H | L2_TRANS.DEMAND_DATA_RD | Demand Data Read requests that access L2 cache. |
F0H | 02H | L2_TRANS.RFO | RFO requests that access L2 cache. |
F0H | 04H | L2_TRANS.CODE_RD | L2 cache accesses when fetching instructions. |
F0H | 08H | L2_TRANS.ALL_PF | Any MLC or LLC HW prefetch accessing L2, including rejects. |
F0H | 10H | L2_TRANS.L1D_WB | L1D writebacks that access L2 cache. |
F0H | 20H | L2_TRANS.L2_FILL | L2 fill requests that access L2 cache. |


Table 19-5 Non-Architectural Performance Events In the Processor Core of 3rd Generation Intel Core i7, i5, i3 Processors (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
F0H | 40H | L2_TRANS.L2_WB | L2 writebacks that access L2 cache. |
F0H | 80H | L2_TRANS.ALL_REQUESTS | Transactions accessing L2 pipe. |
F1H | 01H | L2_LINES_IN.I | L2 cache lines in I state filling L2. | Counting does not cover rejects.
F1H | 02H | L2_LINES_IN.S | L2 cache lines in S state filling L2. | Counting does not cover rejects.
F1H | 04H | L2_LINES_IN.E | L2 cache lines in E state filling L2. | Counting does not cover rejects.
F1H | 07H | L2_LINES_IN.ALL | L2 cache lines filling L2. | Counting does not cover rejects.
F2H | 01H | L2_LINES_OUT.DEMAND_CLEAN | Clean L2 cache lines evicted by demand. |
F2H | 02H | L2_LINES_OUT.DEMAND_DIRTY | Dirty L2 cache lines evicted by demand. |
F2H | 04H | L2_LINES_OUT.PF_CLEAN | Clean L2 cache lines evicted by the MLC prefetcher. |
F2H | 08H | L2_LINES_OUT.PF_DIRTY | Dirty L2 cache lines evicted by the MLC prefetcher. |
F2H | 0AH | L2_LINES_OUT.DIRTY_ALL | Dirty L2 cache lines filling the L2. | Counting does not cover rejects.

...

Table 19-6 Non-Architectural Performance Events In the Processor Core Common to 2nd Generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx Processor Series and Intel Xeon Processors E5 Family (Contd.)
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
...
48H | 01H | L1D_PEND_MISS.PENDING | Increments the number of outstanding L1D misses every cycle. Set Cmask = 1 and Edge = 1 to count occurrences. | PMC2 only; Set Cmask = 1 to count cycles.
...
5BH | 0CH | RESOURCE_STALLS2.ALL_FL_EMPTY | Cycles stalled due to free list empty. | PMC0-3 only regardless HTT
...
A2H | 01H | RESOURCE_STALLS.ANY | Cycles Allocation is stalled due to Resource Related reason. |
...
A3H | 02H | CYCLE_ACTIVITY.CYCLES_L1D_PENDING | Cycles with pending L1 cache miss loads. Set AnyThread to count per core. | PMC2 only
A3H | 01H | CYCLE_ACTIVITY.CYCLES_L2_PENDING | Cycles with pending L2 miss loads. Set AnyThread to count per core. |
A3H | 04H | CYCLE_ACTIVITY.CYCLES_NO_DISPATCH | Cycles of dispatch stalls. Set AnyThread to count per core. | PMC0-3 only
...
B1H | 01H | UOPS_DISPATCHED.THREAD | Counts total number of uops to be dispatched per-thread each cycle. Set Cmask = 1, INV = 1 to count stall cycles. | PMC0-3 only regardless HTT
...
B7H | 01H | OFF_CORE_RESPONSE_0 | See Section 18.8.5, "Off-core Response Performance Monitoring". | Requires MSR 01A6H
BBH | 01H | OFF_CORE_RESPONSE_1 | See Section 18.8.5, "Off-core Response Performance Monitoring". | Requires MSR 01A7H
...
D0H | 01H | MEM_UOP_RETIRED.LOADS | Qualify retired memory uops that are loads. Combine with umask 10H, 20H, 40H, 80H. | Supports PEBS. PMC0-3 only regardless HTT.
...
D1H | 01H | MEM_LOAD_UOPS_RETIRED.L1_HIT | Retired load uops with L1 cache hits as data sources. | Supports PEBS. PMC0-3 only regardless HTT
...
D4H | 02H | MEM_LOAD_UOPS_MISC_RETIRED.LLC_MISS | Retired load uops with unknown information as data source in cache serviced the load. | Supports PEBS. PMC0-3 only regardless HTT
...

Table 19-7 Non-Architectural Performance Events applicable only to the Processor core for 2nd Generation Intel Core i7-2xxx, Intel Core i5-2xxx, Intel Core i3-2xxx Processor Series
Event Num. | Umask Value | Event Mask Mnemonic | Description | Comment
D2H | 01H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_MISS | Retired load uops which data sources were LLC hit and cross-core snoop missed in on-pkg core cache. | Supports PEBS. PMC0-3 only regardless HTT
D2H | 02H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HIT | Retired load uops which data sources were LLC and cross-core snoop hits in on-pkg core cache. |
D2H | 04H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_HITM | Retired load uops which data sources were HitM responses from shared LLC. |
D2H | 08H | MEM_LOAD_UOPS_LLC_HIT_RETIRED.XSNP_NONE | Retired load uops which data sources were hits in LLC without snoops required. |
...

19. Updates to Chapter 24, Volume 3C

Change bars show changes to Chapter 24 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
------------------------------------------------------------------------------------------


...

24.4.1 Guest Register State

The following fields in the guest-state area correspond to processor registers:
• Control registers CR0, CR3, and CR4 (64 bits each; 32 bits on processors that do not support Intel 64 architecture).
• Debug register DR7 (64 bits; 32 bits on processors that do not support Intel 64 architecture).
• RSP, RIP, and RFLAGS (64 bits each; 32 bits on processors that do not support Intel 64 architecture).[1]
• The following fields for each of the registers CS, SS, DS, ES, FS, GS, LDTR, and TR:
  - Selector (16 bits).
  - Base address (64 bits; 32 bits on processors that do not support Intel 64 architecture). The base-address fields for CS, SS, DS, and ES have only 32 architecturally-defined bits; nevertheless, the corresponding VMCS fields have 64 bits on processors that support Intel 64 architecture.
  - Segment limit (32 bits). The limit field is always a measure in bytes.
  - Access rights (32 bits). The format of this field is given in Table 24-2 and detailed as follows:
    - The low 16 bits correspond to bits 23:8 of the upper 32 bits of a 64-bit segment descriptor. While bits 19:16 of code-segment and data-segment descriptors correspond to the upper 4 bits of the segment limit, the corresponding bits (bits 11:8) are reserved in this VMCS field.
    - Bit 16 indicates an unusable segment. Attempts to use such a segment fault except in 64-bit mode. In general, a segment register is unusable if it has been loaded with a null selector.[2]
    - Bits 31:17 are reserved.

Table 24-2 Format of Access Rights


Bit Position(s) | Field
3:0 | Segment type
4 | S: Descriptor type (0 = system; 1 = code or data)
6:5 | DPL: Descriptor privilege level
7 | P: Segment present
11:8 | Reserved
12 | AVL: Available for use by system software

1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For processors that do not support Intel 64 architecture, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to the lower 32 bits of the indicated register.
2. There are a few exceptions to this statement. For example, a segment with a non-null selector may be unusable following a task switch that fails after its commit point; see "Interrupt 10 - Invalid TSS Exception (#TS)" in Section 6.14, "Exception and Interrupt Handling in 64-bit Mode," of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A. In contrast, the TR register is usable after processor reset despite having a null selector; see Table 10-1 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.


Table 24-2 Format of Access Rights (Contd.)


Bit Position(s) | Field
13 | Reserved (except for CS). L: 64-bit mode active (for CS only)
14 | D/B: Default operation size (0 = 16-bit segment; 1 = 32-bit segment)
15 | G: Granularity
16 | Segment unusable (0 = usable; 1 = unusable)
31:17 | Reserved
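The mapping from a segment descriptor to the access-rights field, and the layout in Table 24-2, can be sketched as follows. This is an illustrative sketch only: the function and type names are hypothetical, and `desc_high` is assumed to be the upper 32 bits of an 8-byte segment descriptor.

```python
from typing import NamedTuple


class AccessRights(NamedTuple):
    seg_type: int      # bits 3:0
    s: int             # bit 4: 0 = system, 1 = code or data
    dpl: int           # bits 6:5
    present: bool      # bit 7
    avl: bool          # bit 12
    long_mode: bool    # bit 13 (L, meaningful for CS only)
    db: int            # bit 14
    granularity: bool  # bit 15
    unusable: bool     # bit 16


RESERVED_MASK = 0xFFFE0F00  # bits 11:8 and bits 31:17


def access_rights_from_descriptor(desc_high: int, unusable: bool = False) -> int:
    """Build the VMCS access-rights value from the upper descriptor dword."""
    ar = (desc_high >> 8) & 0xFFFF  # descriptor bits 23:8 become field bits 15:0
    ar &= ~0x0F00                   # limit bits 19:16 would land in 11:8; reserved here
    if unusable:
        ar |= 1 << 16
    return ar


def decode_access_rights(ar: int) -> AccessRights:
    if ar & RESERVED_MASK:
        raise ValueError("reserved bits (11:8 or 31:17) are set")
    return AccessRights(
        seg_type=ar & 0xF,
        s=(ar >> 4) & 1,
        dpl=(ar >> 5) & 3,
        present=bool(ar & (1 << 7)),
        avl=bool(ar & (1 << 12)),
        long_mode=bool(ar & (1 << 13)),
        db=(ar >> 14) & 1,
        granularity=bool(ar & (1 << 15)),
        unusable=bool(ar & (1 << 16)),
    )


# A flat 32-bit code segment whose upper descriptor dword is 0x00CF9B00.
ar = access_rights_from_descriptor(0x00CF9B00)
print(hex(ar), decode_access_rights(ar))
```

Masking bits 11:8 matters because a descriptor with a 4-GByte limit carries FH in limit bits 19:16, which would otherwise set reserved bits in the field.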

The base address, segment limit, and access rights compose the "hidden" part (or "descriptor cache") of each segment register. These data are included in the VMCS because it is possible for a segment register's descriptor cache to be inconsistent with the segment descriptor in memory (in the GDT or the LDT) referenced by the segment register's selector.
The value of the DPL field for SS is always equal to the logical processor's current privilege level (CPL).[1]
• The following fields for each of the registers GDTR and IDTR:
  - Base address (64 bits; 32 bits on processors that do not support Intel 64 architecture).
  - Limit (32 bits). The limit fields contain 32 bits even though these fields are specified as only 16 bits in the architecture.
• The following MSRs:
  - IA32_DEBUGCTL (64 bits)
  - IA32_SYSENTER_CS (32 bits)
  - IA32_SYSENTER_ESP and IA32_SYSENTER_EIP (64 bits; 32 bits on processors that do not support Intel 64 architecture)
  - IA32_PERF_GLOBAL_CTRL (64 bits). This field is supported only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-entry control.
  - IA32_PAT (64 bits). This field is supported only on processors that support either the 1-setting of the "load IA32_PAT" VM-entry control or that of the "save IA32_PAT" VM-exit control.
  - IA32_EFER (64 bits). This field is supported only on processors that support either the 1-setting of the "load IA32_EFER" VM-entry control or that of the "save IA32_EFER" VM-exit control.
• The register SMBASE (32 bits). This register contains the base address of the logical processor's SMRAM image.

24.4.2 Guest Non-Register State

In addition to the register state described in Section 24.4.1, the guest-state area includes the following fields that characterize guest state but which do not correspond to processor registers:
• Activity state (32 bits). This field identifies the logical processor's activity state. When a logical processor is executing instructions normally, it is in the active state. Execution of certain instructions and the occurrence of certain events may cause a logical processor to transition to an inactive state in which it ceases to execute instructions.

1. In protected mode, CPL is also associated with the RPL field in the CS selector. However, the RPL fields are not meaningful in real-address mode or in virtual-8086 mode.


The following activity states are defined:[1]
  - 0: Active. The logical processor is executing instructions normally.
  - 1: HLT. The logical processor is inactive because it executed the HLT instruction.
  - 2: Shutdown. The logical processor is inactive because it incurred a triple fault[2] or some other serious error.
  - 3: Wait-for-SIPI. The logical processor is inactive because it is waiting for a startup-IPI (SIPI).
Future processors may include support for other activity states. Software should read the VMX capability MSR IA32_VMX_MISC (see Appendix A.6) to determine what activity states are supported.
• Interruptibility state (32 bits). The IA-32 architecture includes features that permit certain events to be blocked for a period of time. This field contains information about such blocking. Details and the format of this field are given in Table 24-3.

Table 24-3 Format of Interruptibility State


Bit Position(s) | Bit Name | Notes
0 | Blocking by STI | See the "STI - Set Interrupt Flag" section in Chapter 4 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B. Execution of STI with RFLAGS.IF = 0 blocks interrupts (and, optionally, other events) for one instruction after its execution. Setting this bit indicates that this blocking is in effect.
1 | Blocking by MOV SS | See the "MOV - Move a Value from the Stack" and "POP - Pop a Value from the Stack" sections in Chapter 4 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2B, and Section 6.8.3 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A. Execution of a MOV to SS or a POP to SS blocks interrupts for one instruction after its execution. In addition, certain debug exceptions are inhibited between a MOV to SS or a POP to SS and a subsequent instruction. Setting this bit indicates that the blocking of all these events is in effect. This document uses the term "blocking by MOV SS," but it applies equally to POP SS.
2 | Blocking by SMI | See Section 34.2. System-management interrupts (SMIs) are disabled while the processor is in system-management mode (SMM). Setting this bit indicates that blocking of SMIs is in effect.
3 | Blocking by NMI | See Section 6.7.1 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A and Section 34.8. Delivery of a non-maskable interrupt (NMI) or a system-management interrupt (SMI) blocks subsequent NMIs until the next execution of IRET. See Section 25.3 for how this behavior of IRET may change in VMX non-root operation. Setting this bit indicates that blocking of NMIs is in effect. Clearing this bit does not imply that NMIs are not (temporarily) blocked for other reasons. If the "virtual NMIs" VM-execution control (see Section 24.6.1) is 1, this bit does not control the blocking of NMIs. Instead, it refers to "virtual-NMI blocking" (the fact that guest software is not ready for an NMI).
31:4 | Reserved | VM entry will fail if these bits are not 0. See Section 26.3.1.5.
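The layout in Table 24-3 can be illustrated with a small decoder. This is a sketch under the assumption that the 32-bit field value has already been read from the VMCS by some other means; the names used here are hypothetical.

```python
INTERRUPTIBILITY_BITS = {
    0: "blocking by STI",
    1: "blocking by MOV SS",
    2: "blocking by SMI",
    3: "blocking by NMI",
}
RESERVED_MASK = 0xFFFFFFF0  # bits 31:4


def decode_interruptibility(state: int) -> list:
    """Return the names of the blocking conditions set in the field value."""
    if not 0 <= state <= 0xFFFFFFFF:
        raise ValueError("interruptibility state is a 32-bit field")
    if state & RESERVED_MASK:
        # Table 24-3: VM entry fails if bits 31:4 are not 0.
        raise ValueError("bits 31:4 are reserved and must be 0")
    return [name for bit, name in INTERRUPTIBILITY_BITS.items()
            if state & (1 << bit)]


print(decode_interruptibility(0b0101))
# ['blocking by STI', 'blocking by SMI']
```

When the "virtual NMIs" control is 1, a caller would interpret the bit 3 entry as virtual-NMI blocking instead, as the table notes.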

1. Execution of the MWAIT instruction may put a logical processor into an inactive state. However, this VMCS field never reflects this state. See Section 27.1.
2. A triple fault occurs when a logical processor encounters an exception while attempting to deliver a double fault.


• Pending debug exceptions (64 bits; 32 bits on processors that do not support Intel 64 architecture). IA-32 processors may recognize one or more debug exceptions without immediately delivering them.[1] This field contains information about such exceptions. This field is described in Table 24-4.

Table 24-4 Format of Pending-Debug-Exceptions


Bit Position(s) | Bit Name | Notes
3:0 | B3-B0 | When set, each of these bits indicates that the corresponding breakpoint condition was met. Any of these bits may be set even if the corresponding enabling bit in DR7 is not set.
11:4 | Reserved | VM entry fails if these bits are not 0. See Section 26.3.1.5.
12 | Enabled breakpoint | When set, this bit indicates that at least one data or I/O breakpoint was met and was enabled in DR7.
13 | Reserved | VM entry fails if this bit is not 0. See Section 26.3.1.5.
14 | BS | When set, this bit indicates that a debug exception would have been triggered by single-step execution mode.
63:15 | Reserved | VM entry fails if these bits are not 0. See Section 26.3.1.5. Bits 63:32 exist only on processors that support Intel 64 architecture.
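As an illustration of Table 24-4, the sketch below checks the reserved bits of a pending-debug-exceptions value and extracts the defined ones. The function name and the returned dictionary keys are hypothetical, and the value is assumed to be the 64-bit form of the field.

```python
DEFINED_BITS = 0xF | (1 << 12) | (1 << 14)  # B3-B0, enabled breakpoint, BS


def decode_pending_debug_exceptions(value: int) -> dict:
    """Decode a pending-debug-exceptions value laid out as in Table 24-4."""
    if not 0 <= value <= 0xFFFFFFFFFFFFFFFF:
        raise ValueError("field is at most 64 bits wide")
    if value & ~DEFINED_BITS:
        # Bits 11:4, 13, and 63:15 are reserved; VM entry fails if any is set.
        raise ValueError("reserved bits are set")
    return {
        "breakpoints": [i for i in range(4) if value & (1 << i)],
        "enabled_breakpoint": bool(value & (1 << 12)),
        "single_step": bool(value & (1 << 14)),
    }


print(decode_pending_debug_exceptions(0x4005))
# {'breakpoints': [0, 2], 'enabled_breakpoint': False, 'single_step': True}
```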

• VMCS link pointer (64 bits). This field is included for future expansion. Software should set this field to FFFFFFFF_FFFFFFFFH to avoid VM-entry failures (see Section 26.3.1.5).
• VMX-preemption timer value (32 bits). This field is supported only on processors that support the 1-setting of the "activate VMX-preemption timer" VM-execution control. This field contains the value that the VMX-preemption timer will use following the next VM entry with that setting. See Section 25.5.1 and Section 26.6.4.
• Page-directory-pointer-table entries (PDPTEs; 64 bits each). These four (4) fields (PDPTE0, PDPTE1, PDPTE2, and PDPTE3) are supported only on processors that support the 1-setting of the "enable EPT" VM-execution control. They correspond to the PDPTEs referenced by CR3 when PAE paging is in use (see Section 4.4 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A). They are used only if the "enable EPT" VM-execution control is 1.
• Guest interrupt status (16 bits). This field is supported only on processors that support the 1-setting of the "virtual-interrupt delivery" VM-execution control. It characterizes part of the guest's virtual-APIC state and does not correspond to any processor or APIC registers. It comprises two 8-bit subfields:
  - Requesting virtual interrupt (RVI). This is the low byte of the guest interrupt status. The processor treats this value as the vector of the highest priority virtual interrupt that is requesting service. (The value 0 implies that there is no such interrupt.)
  - Servicing virtual interrupt (SVI). This is the high byte of the guest interrupt status. The processor treats this value as the vector of the highest priority virtual interrupt that is in service. (The value 0 implies that there is no such interrupt.)
  See Chapter 29 for more information on the use of this field.
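The two subfields of the guest interrupt status can be separated and recombined as in the following sketch. The helper names are hypothetical; only the byte positions (RVI in the low byte, SVI in the high byte) come from the text above.

```python
def split_guest_interrupt_status(status: int) -> tuple:
    """Return (rvi, svi) from a 16-bit guest interrupt status value."""
    if not 0 <= status <= 0xFFFF:
        raise ValueError("guest interrupt status is a 16-bit field")
    rvi = status & 0xFF         # low byte: requesting virtual interrupt
    svi = (status >> 8) & 0xFF  # high byte: servicing virtual interrupt
    return rvi, svi


def make_guest_interrupt_status(rvi: int, svi: int) -> int:
    """Combine RVI and SVI vectors (0 means no such interrupt)."""
    if not (0 <= rvi <= 0xFF and 0 <= svi <= 0xFF):
        raise ValueError("RVI and SVI are 8-bit vectors")
    return (svi << 8) | rvi


rvi, svi = split_guest_interrupt_status(0x3041)
print(hex(rvi), hex(svi))  # 0x41 0x30
```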

1. For example, execution of a MOV to SS or a POP to SS may inhibit some debug exceptions for one instruction. See Section 6.8.3 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A. In addition, certain events incident to an instruction (for example, an INIT signal) may take priority over debug traps generated by that instruction. See Table 6-2 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.


24.5 HOST-STATE AREA

This section describes fields contained in the host-state area of the VMCS. As noted earlier, processor state is loaded from these fields on every VM exit (see Section 27.5). All fields in the host-state area correspond to processor registers:
• CR0, CR3, and CR4 (64 bits each; 32 bits on processors that do not support Intel 64 architecture).
• RSP and RIP (64 bits each; 32 bits on processors that do not support Intel 64 architecture).
• Selector fields (16 bits each) for the segment registers CS, SS, DS, ES, FS, GS, and TR. There is no field in the host-state area for the LDTR selector.
• Base-address fields for FS, GS, TR, GDTR, and IDTR (64 bits each; 32 bits on processors that do not support Intel 64 architecture).
• The following MSRs:
  - IA32_SYSENTER_CS (32 bits)
  - IA32_SYSENTER_ESP and IA32_SYSENTER_EIP (64 bits; 32 bits on processors that do not support Intel 64 architecture).
  - IA32_PERF_GLOBAL_CTRL (64 bits). This field is supported only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-exit control.
  - IA32_PAT (64 bits). This field is supported only on processors that support the 1-setting of the "load IA32_PAT" VM-exit control.
  - IA32_EFER (64 bits). This field is supported only on processors that support the 1-setting of the "load IA32_EFER" VM-exit control.
In addition to the state identified here, some processor state components are loaded with fixed values on every VM exit; there are no fields corresponding to these components in the host-state area. See Section 27.5 for details of how state is loaded on VM exits.
...

24.6.1 Pin-Based VM-Execution Controls

The pin-based VM-execution controls constitute a 32-bit vector that governs the handling of asynchronous events (for example: interrupts).[1] Table 24-5 lists the controls. See Chapter 25 for how these controls affect processor behavior in VMX non-root operation.

Table 24-5 Definitions of Pin-Based VM-Execution Controls


Bit Position(s) | Name | Description
0 | External-interrupt exiting | If this control is 1, external interrupts cause VM exits. Otherwise, they are delivered normally through the guest interrupt-descriptor table (IDT). If this control is 1, the value of RFLAGS.IF does not affect interrupt blocking.
3 | NMI exiting | If this control is 1, non-maskable interrupts (NMIs) cause VM exits. Otherwise, they are delivered normally using descriptor 2 of the IDT. This control also determines interactions between IRET and blocking by NMI (see Section 25.3).

1. Some asynchronous events cause VM exits regardless of the settings of the pin-based VM-execution controls (see Section 25.2).


Table 24-5 Definitions of Pin-Based VM-Execution Controls (Contd.)


Bit Position(s) | Name | Description
5 | Virtual NMIs | If this control is 1, NMIs are never blocked and the "blocking by NMI" bit (bit 3) in the interruptibility-state field indicates "virtual-NMI blocking" (see Table 24-3). This control also interacts with the "NMI-window exiting" VM-execution control (see Section 24.6.2).
6 | Activate VMX-preemption timer | If this control is 1, the VMX-preemption timer counts down in VMX non-root operation; see Section 25.5.1. A VM exit occurs when the timer counts down to zero; see Section 25.2.
7 | Process posted interrupts | If this control is 1, the processor treats interrupts with the posted-interrupt notification vector (see Section 24.6.8) specially, updating the virtual-APIC page with posted-interrupt requests (see Section 29.6).

All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_PINBASED_CTLS and IA32_VMX_TRUE_PINBASED_CTLS (see Appendix A.3.1) to determine how to set reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.1).
The first processors to support the virtual-machine extensions supported only the 1-settings of bits 1, 2, and 4. The VMX capability MSR IA32_VMX_PINBASED_CTLS will always report that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_PINBASED_CTLS, and software should consult this MSR to discover support for the 0-settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1.
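One way to apply the rule about reserved bits is sketched below. It assumes the capability-MSR layout described in Appendix A.3.1, which is not reproduced in this excerpt: bits 31:0 report the allowed 0-settings (a 1 there means the control must be 1) and bits 63:32 report the allowed 1-settings (a 0 there means the control must be 0). The function name and the sample MSR value are hypothetical; reading the MSR itself is outside the sketch.

```python
def adjust_controls(desired: int, capability_msr: int) -> int:
    """Fit a desired 32-bit control vector to what the capability MSR allows."""
    must_be_one = capability_msr & 0xFFFFFFFF         # assumed: bits 31:0
    may_be_one = (capability_msr >> 32) & 0xFFFFFFFF  # assumed: bits 63:32
    unsupported = desired & ~may_be_one & 0xFFFFFFFF
    if unsupported:
        raise ValueError(f"controls not supported: {unsupported:#x}")
    # Reserved-to-1 bits (for example bits 1, 2, and 4) are forced on.
    return desired | must_be_one


# Hypothetical MSR value: bits 7:0 may be 1; bits 1, 2, and 4 must be 1.
cap = (0xFF << 32) | 0x16
wanted = (1 << 0) | (1 << 3)  # external-interrupt exiting, NMI exiting
print(hex(adjust_controls(wanted, cap)))  # 0x1f
```

Raising an error for unsupported controls, rather than silently clearing them, is a design choice of this sketch; a real monitor might instead fall back to another mechanism.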

24.6.2 Processor-Based VM-Execution Controls

The processor-based VM-execution controls constitute two 32-bit vectors that govern the handling of synchronous events, mainly those caused by the execution of specific instructions.[1] These are the primary processor-based VM-execution controls and the secondary processor-based VM-execution controls. Table 24-6 lists the primary processor-based VM-execution controls. See Chapter 25 for more details of how these controls affect processor behavior in VMX non-root operation.

Table 24-6 Definitions of Primary Processor-Based VM-Execution Controls


Bit Position(s) | Name | Description
2 | Interrupt-window exiting | If this control is 1, a VM exit occurs at the beginning of any instruction if RFLAGS.IF = 1 and there is no other blocking of interrupts (see Section 24.4.2).
3 | Use TSC offsetting | This control determines whether executions of RDTSC, executions of RDTSCP, and executions of RDMSR that read from the IA32_TIME_STAMP_COUNTER MSR return a value modified by the TSC offset field (see Section 24.6.5 and Section 25.3).
7 | HLT exiting | This control determines whether executions of HLT cause VM exits.
9 | INVLPG exiting | This control determines whether executions of INVLPG cause VM exits.
10 | MWAIT exiting | This control determines whether executions of MWAIT cause VM exits.
11 | RDPMC exiting | This control determines whether executions of RDPMC cause VM exits.
12 | RDTSC exiting | This control determines whether executions of RDTSC and RDTSCP cause VM exits.

1. Some instructions cause VM exits regardless of the settings of the processor-based VM-execution controls (see Section 25.1.2), as do task switches (see Section 25.2).


Table 24-6 Definitions of Primary Processor-Based VM-Execution Controls (Contd.)


Bit Position(s) | Name | Description
15 | CR3-load exiting | In conjunction with the CR3-target controls (see Section 24.6.7), this control determines whether executions of MOV to CR3 cause VM exits. See Section 25.1.3. The first processors to support the virtual-machine extensions supported only the 1-setting of this control.
16 | CR3-store exiting | This control determines whether executions of MOV from CR3 cause VM exits. The first processors to support the virtual-machine extensions supported only the 1-setting of this control.
19 | CR8-load exiting | This control determines whether executions of MOV to CR8 cause VM exits.
20 | CR8-store exiting | This control determines whether executions of MOV from CR8 cause VM exits.
21 | Use TPR shadow | Setting this control to 1 enables TPR virtualization and other APIC-virtualization features. See Chapter 29.
22 | NMI-window exiting | If this control is 1, a VM exit occurs at the beginning of any instruction if there is no virtual-NMI blocking (see Section 24.4.2).
23 | MOV-DR exiting | This control determines whether executions of MOV DR cause VM exits.
24 | Unconditional I/O exiting | This control determines whether executions of I/O instructions (IN, INS/INSB/INSW/INSD, OUT, and OUTS/OUTSB/OUTSW/OUTSD) cause VM exits.
25 | Use I/O bitmaps | This control determines whether I/O bitmaps are used to restrict executions of I/O instructions (see Section 24.6.4 and Section 25.1.3). For this control, "0" means "do not use I/O bitmaps" and "1" means "use I/O bitmaps." If the I/O bitmaps are used, the setting of the "unconditional I/O exiting" control is ignored.
27 | Monitor trap flag | If this control is 1, the monitor trap flag debugging feature is enabled. See Section 25.5.2.
28 | Use MSR bitmaps | This control determines whether MSR bitmaps are used to control execution of the RDMSR and WRMSR instructions (see Section 24.6.9 and Section 25.1.3). For this control, "0" means "do not use MSR bitmaps" and "1" means "use MSR bitmaps." If the MSR bitmaps are not used, all executions of the RDMSR and WRMSR instructions cause VM exits.
29 | MONITOR exiting | This control determines whether executions of MONITOR cause VM exits.
30 | PAUSE exiting | This control determines whether executions of PAUSE cause VM exits.
31 | Activate secondary controls | This control determines whether the secondary processor-based VM-execution controls are used. If this control is 0, the logical processor operates as if all the secondary processor-based VM-execution controls were also 0.

All other bits in this field are reserved, some to 0 and some to 1. Software should consult the VMX capability MSRs IA32_VMX_PROCBASED_CTLS and IA32_VMX_TRUE_PROCBASED_CTLS (see Appendix A.3.2) to determine how to set reserved bits. Failure to set reserved bits properly causes subsequent VM entries to fail (see Section 26.2.1.1).
The first processors to support the virtual-machine extensions supported only the 1-settings of bits 1, 4-6, 8, 13-16, and 26. The VMX capability MSR IA32_VMX_PROCBASED_CTLS will always report that these bits must be 1. Logical processors that support the 0-settings of any of these bits will support the VMX capability MSR IA32_VMX_TRUE_PROCBASED_CTLS, and software should consult this MSR to discover support for the 0-settings of these bits. Software that is not aware of the functionality of any one of these bits should set that bit to 1.
Bit 31 of the primary processor-based VM-execution controls determines whether the secondary processor-based VM-execution controls are used. If that bit is 0, VM entry and VMX non-root operation function as if all the secondary processor-based VM-execution controls were 0. Processors that support only the 0-setting of bit 31 of the primary processor-based VM-execution controls do not support the secondary processor-based VM-execution controls.
Table 24-7 lists the secondary processor-based VM-execution controls. See Chapter 25 for more details of how these controls affect processor behavior in VMX non-root operation.

Table 24-7 Definitions of Secondary Processor-Based VM-Execution Controls


Bit Position(s) | Name | Description
0 | Virtualize APIC accesses | If this control is 1, the logical processor treats specially accesses to the page with the APIC-access address. See Section 29.4.
1 | Enable EPT | If this control is 1, extended page tables (EPT) are enabled. See Section 28.2.
2 | Descriptor-table exiting | This control determines whether executions of LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, and STR cause VM exits.
3 | Enable RDTSCP | If this control is 0, any execution of RDTSCP causes an invalid-opcode exception (#UD).
4 | Virtualize x2APIC mode | If this control is 1, the logical processor treats specially RDMSR and WRMSR to APIC MSRs (in the range 800H–8FFH). See Section 29.5.
5 | Enable VPID | If this control is 1, cached translations of linear addresses are associated with a virtual-processor identifier (VPID). See Section 28.1.
6 | WBINVD exiting | This control determines whether executions of WBINVD cause VM exits.
7 | Unrestricted guest | This control determines whether guest software may run in unpaged protected mode or in real-address mode.
8 | APIC-register virtualization | If this control is 1, the logical processor virtualizes certain APIC accesses. See Section 29.4 and Section 29.5.
9 | Virtual-interrupt delivery | This control enables the evaluation and delivery of pending virtual interrupts as well as the emulation of writes to the APIC registers that control interrupt prioritization.
10 | PAUSE-loop exiting | This control determines whether a series of executions of PAUSE can cause a VM exit (see Section 24.6.13 and Section 25.1.3).
11 | RDRAND exiting | This control determines whether executions of RDRAND cause VM exits.
12 | Enable INVPCID | If this control is 0, any execution of INVPCID causes an invalid-opcode exception (#UD).
13 | Enable VM functions | Setting this control to 1 enables use of the VMFUNC instruction in VMX non-root operation. See Section 25.5.5.

All other bits in this field are reserved to 0. Software should consult the VMX capability MSR IA32_VMX_PROCBASED_CTLS2 (see Appendix A.3.3) to determine which bits may be set to 1. Failure to clear reserved bits causes subsequent VM entries to fail (see Section 26.2.1.1). ...
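The reserved-bit rules above can be illustrated with a small sketch. It assumes the capability-MSR layout given in Appendix A.3 (bits 31:0 report which controls must be 1, bits 63:32 report which controls may be 1). The function names are hypothetical, and reading the MSR itself is not shown.

```python
def adjust_controls(desired: int, capability_msr: int) -> int:
    """Fit a desired 32-bit control value to what a capability MSR allows.

    Assumed layout (Appendix A.3): a 1 in bits 31:0 means the control
    must be 1; a 1 in bits 63:32 means the control may be 1.
    """
    must_be_one = capability_msr & 0xFFFFFFFF
    may_be_one = (capability_msr >> 32) & 0xFFFFFFFF
    # Force the required 1-settings, then drop anything not allowed to be 1.
    return (desired | must_be_one) & may_be_one


def unsupported_controls(desired: int, capability_msr: int) -> int:
    """Return the requested controls whose 1-setting is not supported."""
    may_be_one = (capability_msr >> 32) & 0xFFFFFFFF
    return desired & ~may_be_one & 0xFFFFFFFF
```

A caller would typically check `unsupported_controls` first, so that a feature silently dropped by `adjust_controls` is noticed before VM entry.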

24.6.8 Controls for APIC Virtualization

There are three mechanisms by which software accesses registers of the logical processor's local APIC:

- If the local APIC is in xAPIC mode, software can perform memory-mapped accesses to addresses in the 4-KByte page referenced by the physical address in the IA32_APIC_BASE MSR (see Section 10.4.4, Local APIC Status and Location, in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, and Intel 64 Architecture Processor Topology Enumeration).[1]
- If the local APIC is in x2APIC mode, software can access the local APIC's registers using the RDMSR and WRMSR instructions (see Intel 64 Architecture Processor Topology Enumeration).
- In 64-bit mode, software can access the local APIC's task-priority register (TPR) using the MOV CR8 instruction.

1. If the local APIC does not support x2APIC mode, it is always in xAPIC mode.

There are five processor-based VM-execution controls (see Section 24.6.2) that control such accesses. These are use TPR shadow, virtualize APIC accesses, virtualize x2APIC mode, virtual-interrupt delivery, and APIC-register virtualization. These controls interact with the following fields:

- APIC-access address (64 bits). This field contains the physical address of the 4-KByte APIC-access page. If the virtualize APIC accesses VM-execution control is 1, accesses to this page may cause VM exits or be virtualized by the processor. See Section 29.4. The APIC-access address exists only on processors that support the 1-setting of the virtualize APIC accesses VM-execution control.
- Virtual-APIC address (64 bits). This field contains the physical address of the 4-KByte virtual-APIC page. The processor uses the virtual-APIC page to virtualize certain accesses to APIC registers and to manage virtual interrupts; see Chapter 29. Depending on the setting of the controls indicated earlier, the virtual-APIC page may be accessed by the following operations:
    - The MOV CR8 instructions (see Section 29.3).
    - Accesses to the APIC-access page if, in addition, the virtualize APIC accesses VM-execution control is 1 (see Section 29.4).
    - The RDMSR and WRMSR instructions if, in addition, the value of ECX is in the range 800H–8FFH (indicating an APIC MSR) and the virtualize x2APIC mode VM-execution control is 1 (see Section 29.5).
  If the use TPR shadow VM-execution control is 1, VM entry ensures that the virtual-APIC address is 4-KByte aligned. The virtual-APIC address exists only on processors that support the 1-setting of the use TPR shadow VM-execution control.
- TPR threshold (32 bits). Bits 3:0 of this field determine the threshold below which bits 7:4 of VTPR (see Section 29.1.1) cannot fall. If the virtual-interrupt delivery VM-execution control is 0, a VM exit occurs after an operation (e.g., an execution of MOV to CR8) that reduces the value of those bits below the TPR threshold. See Section 29.1.2. The TPR threshold exists only on processors that support the 1-setting of the use TPR shadow VM-execution control.
- EOI-exit bitmap (4 fields; 64 bits each). These fields are supported only on processors that support the 1-setting of the virtual-interrupt delivery VM-execution control. They are used to determine which virtualized writes to the APIC's EOI register cause VM exits:
    - EOI_EXIT0 contains bits for vectors from 0 (bit 0) to 63 (bit 63).
    - EOI_EXIT1 contains bits for vectors from 64 (bit 0) to 127 (bit 63).
    - EOI_EXIT2 contains bits for vectors from 128 (bit 0) to 191 (bit 63).
    - EOI_EXIT3 contains bits for vectors from 192 (bit 0) to 255 (bit 63).
  See Section 29.1.4 for more information on the use of this field.
- Posted-interrupt notification vector (16 bits). This field is supported only on processors that support the 1-setting of the process posted interrupts VM-execution control. Its low 8 bits contain the interrupt vector that is used to notify a logical processor that virtual interrupts have been posted. See Section 29.6 for more information on the use of this field.
- Posted-interrupt descriptor address (64 bits). This field is supported only on processors that support the 1-setting of the process posted interrupts VM-execution control. It is the physical address of a 64-byte aligned posted-interrupt descriptor. See Section 29.6 for more information on the use of this field.
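As an illustration of the EOI-exit bitmap layout described above, the following sketch maps an interrupt vector to one of the four 64-bit fields and a bit within it. The function names are hypothetical. The sketch assumes that a set bit requests a VM exit for that vector; Section 29.1.4 is the authoritative description.

```python
def eoi_exit_location(vector: int) -> tuple:
    """Return (field index, bit index) for a vector, per the EOI_EXIT0..3 layout."""
    if not 0 <= vector <= 255:
        raise ValueError("interrupt vectors are in the range 0-255")
    return vector // 64, vector % 64


def eoi_write_causes_vm_exit(vector: int, eoi_exit: list) -> bool:
    """eoi_exit holds the four 64-bit fields EOI_EXIT0..EOI_EXIT3 as integers.

    Assumption: a bit set to 1 means the virtualized EOI write for that
    vector causes a VM exit.
    """
    field, bit = eoi_exit_location(vector)
    return bool((eoi_exit[field] >> bit) & 1)
```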

...

20. Updates to Chapter 25, Volume 3C

Change bars show changes to Chapter 25 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.

------------------------------------------------------------------------------------------
...

In a virtualized environment using VMX, the guest software stack typically runs on a logical processor in VMX non-root operation. This mode of operation is similar to that of ordinary processor operation outside of the virtualized environment. This chapter describes the differences between VMX non-root operation and ordinary processor operation with special attention to causes of VM exits (which bring a logical processor from VMX non-root operation to root operation). The differences between VMX non-root operation and ordinary processor operation are described in the following sections:

- Section 25.1, Instructions That Cause VM Exits
- Section 25.2, Other Causes of VM Exits
- Section 25.3, Changes to Instruction Behavior in VMX Non-Root Operation
- Section 25.4, Other Changes in VMX Non-Root Operation
- Section 25.5, Features Specific to VMX Non-Root Operation

Chapter 24, Virtual-Machine Control Structures, describes the data control structures that govern VMX non-root operation. Chapter 26, VM Entries, describes the operation of VM entries by which the processor transitions from VMX root operation to VMX non-root operation. Chapter 27, VM Exits, describes the operation of VM exits by which the processor transitions from VMX non-root operation to VMX root operation. Chapter 28, VMX Support for Address Translation, describes two features that support address translation in VMX non-root operation. Chapter 29, APIC Virtualization and Virtual Interrupts, describes features that support virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC) in VMX non-root operation. ...

25.1.2 Instructions That Cause VM Exits Unconditionally

The following instructions cause VM exits when they are executed in VMX non-root operation: CPUID, GETSEC,[1] INVD, and XSETBV. This is also true of instructions introduced with VMX, which include: INVEPT, INVVPID, VMCALL,[2] VMCLEAR, VMLAUNCH, VMPTRLD, VMPTRST, VMREAD, VMRESUME, VMWRITE, VMXOFF, and VMXON.

1. An execution of GETSEC in VMX non-root operation causes a VM exit if CR4.SMXE[Bit 14] = 1 regardless of the value of CPL or RAX. An execution of GETSEC causes an invalid-opcode exception (#UD) if CR4.SMXE[Bit 14] = 0.
2. Under the dual-monitor treatment of SMIs and SMM, executions of VMCALL cause SMM VM exits in VMX root operation outside SMM. See Section 34.15.2.

25.1.3 Instructions That Cause VM Exits Conditionally

Certain instructions cause VM exits in VMX non-root operation depending on the setting of the VM-execution controls. The following instructions can cause fault-like VM exits based on the conditions described:

CLTS. The CLTS instruction causes a VM exit if the bits in position 3 (corresponding to CR0.TS) are set in both the CR0 guest/host mask and the CR0 read shadow.

HLT. The HLT instruction causes a VM exit if the HLT exiting VM-execution control is 1.

IN, INS/INSB/INSW/INSD, OUT, OUTS/OUTSB/OUTSW/OUTSD. The behavior of each of these instructions is determined by the settings of the unconditional I/O exiting and use I/O bitmaps VM-execution controls:
- If both controls are 0, the instruction executes normally.
- If the unconditional I/O exiting VM-execution control is 1 and the use I/O bitmaps VM-execution control is 0, the instruction causes a VM exit.
- If the use I/O bitmaps VM-execution control is 1, the instruction causes a VM exit if it attempts to access an I/O port corresponding to a bit set to 1 in the appropriate I/O bitmap (see Section 24.6.4). If an I/O operation wraps around the 16-bit I/O-port space (accesses ports FFFFH and 0000H), the I/O instruction causes a VM exit (the unconditional I/O exiting VM-execution control is ignored if the use I/O bitmaps VM-execution control is 1).
See Section 25.1.1 for information regarding the priority of VM exits relative to faults that may be caused by the INS and OUTS instructions.
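The I/O rules above can be modeled as a short decision function. This is a sketch under stated assumptions, not a description of processor internals: it assumes that I/O bitmap A covers ports 0000H–7FFFH and I/O bitmap B covers ports 8000H–FFFFH (Section 24.6.4), that bit n of a bitmap is bit n mod 8 of byte n div 8, and that every port touched by a multi-byte access is checked. All names are hypothetical.

```python
def io_access_causes_vm_exit(port: int, size: int,
                             unconditional_io_exiting: bool,
                             use_io_bitmaps: bool,
                             bitmap_a: bytes, bitmap_b: bytes) -> bool:
    """Decide whether an I/O access of `size` bytes at `port` causes a VM exit.

    bitmap_a and bitmap_b are assumed to be 4096-byte I/O bitmaps A and B.
    """
    if not use_io_bitmaps:
        # Bitmaps not in use: only the unconditional control matters.
        return unconditional_io_exiting
    for offset in range(size):
        p = port + offset
        if p > 0xFFFF:
            # The access wraps around the 16-bit I/O-port space.
            return True
        if p < 0x8000:
            bitmap, index = bitmap_a, p
        else:
            bitmap, index = bitmap_b, p - 0x8000
        if (bitmap[index // 8] >> (index % 8)) & 1:
            return True
    return False
```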

INVLPG. The INVLPG instruction causes a VM exit if the INVLPG exiting VM-execution control is 1.

INVPCID. The INVPCID instruction causes a VM exit if the INVLPG exiting and enable INVPCID VM-execution controls are both 1.[1]

LGDT, LIDT, LLDT, LTR, SGDT, SIDT, SLDT, STR. These instructions cause VM exits if the descriptor-table exiting VM-execution control is 1.[2]

LMSW. In general, the LMSW instruction causes a VM exit if it would write, for any bit set in the low 4 bits of the CR0 guest/host mask, a value different than the corresponding bit in the CR0 read shadow. LMSW never clears bit 0 of CR0 (CR0.PE); thus, LMSW causes a VM exit if either of the following is true:
- The bits in position 0 (corresponding to CR0.PE) are set in both the CR0 guest/host mask and the source operand, and the bit in position 0 is clear in the CR0 read shadow.
- For any bit position in the range 3:1, the bit in that position is set in the CR0 guest/host mask and the values of the corresponding bits in the source operand and the CR0 read shadow differ.

MONITOR. The MONITOR instruction causes a VM exit if the MONITOR exiting VM-execution control is 1.

MOV from CR3. The MOV from CR3 instruction causes a VM exit if the CR3-store exiting VM-execution control is 1. The first processors to support the virtual-machine extensions supported only the 1-setting of this control.

MOV from CR8. The MOV from CR8 instruction causes a VM exit if the CR8-store exiting VM-execution control is 1.

MOV to CR0. The MOV to CR0 instruction causes a VM exit unless the value of its source operand matches, for the position of each bit set in the CR0 guest/host mask, the corresponding bit in the CR0 read shadow. (If every bit is clear in the CR0 guest/host mask, MOV to CR0 cannot cause a VM exit.)

MOV to CR3. The MOV to CR3 instruction causes a VM exit unless the CR3-load exiting VM-execution control is 0 or the value of its source operand is equal to one of the CR3-target values specified in the VMCS. If the CR3-target count is n, only the first n CR3-target values are considered; if the CR3-target count is 0, MOV to CR3 always causes a VM exit. The first processors to support the virtual-machine extensions supported only the 1-setting of the CR3-load exiting VM-execution control. These processors always consult the CR3-target controls to determine whether an execution of MOV to CR3 causes a VM exit.

1. Enable INVPCID is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the enable INVPCID VM-execution control were 0. See Section 24.6.2.
2. Descriptor-table exiting is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the descriptor-table exiting VM-execution control were 0. See Section 24.6.2.
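The mask-and-shadow comparisons described above for LMSW, MOV to CR0, and MOV to CR3 reduce to a few bitwise tests. The following sketch expresses them directly; the function and parameter names are hypothetical, and the values are treated as plain integers rather than VMCS fields.

```python
def mov_to_cr0_causes_vm_exit(source: int, guest_host_mask: int,
                              read_shadow: int) -> bool:
    """VM exit unless every masked bit of the source matches the read shadow."""
    return ((source ^ read_shadow) & guest_host_mask) != 0


def lmsw_causes_vm_exit(source: int, guest_host_mask: int,
                        read_shadow: int) -> bool:
    """LMSW touches only CR0[3:0] and never clears CR0.PE (bit 0)."""
    # Bit 0: exit only on an attempt to set PE while the shadow has it clear.
    pe_exit = (guest_host_mask & source & ~read_shadow & 0x1) != 0
    # Bits 3:1: exit on any masked bit that differs from the shadow.
    other_exit = ((source ^ read_shadow) & guest_host_mask & 0xE) != 0
    return pe_exit or other_exit


def mov_to_cr3_causes_vm_exit(source: int, cr3_load_exiting: bool,
                              cr3_targets: list, cr3_target_count: int) -> bool:
    """Only the first cr3_target_count target values are considered."""
    if not cr3_load_exiting:
        return False
    return source not in cr3_targets[:cr3_target_count]
```

With a CR3-target count of 0 the slice is empty, so every MOV to CR3 exits when the CR3-load exiting control is 1, matching the text.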

MOV to CR4. The MOV to CR4 instruction causes a VM exit unless the value of its source operand matches, for the position of each bit set in the CR4 guest/host mask, the corresponding bit in the CR4 read shadow.

MOV to CR8. The MOV to CR8 instruction causes a VM exit if the CR8-load exiting VM-execution control is 1.

MOV DR. The MOV DR instruction causes a VM exit if the MOV-DR exiting VM-execution control is 1. Such VM exits represent an exception to the principles identified in Section 25.1.1 in that they take priority over the following:
- general-protection exceptions based on privilege level; and
- invalid-opcode exceptions that occur because CR4.DE = 1 and the instruction specified access to DR4 or DR5.

MWAIT. The MWAIT instruction causes a VM exit if the MWAIT exiting VM-execution control is 1. If this control is 0, the behavior of the MWAIT instruction may be modified (see Section 25.3).

PAUSE. The behavior of this instruction depends on CPL and the settings of the PAUSE exiting and PAUSE-loop exiting VM-execution controls:[1]
- CPL = 0.
    - If the PAUSE exiting and PAUSE-loop exiting VM-execution controls are both 0, the PAUSE instruction executes normally.
    - If the PAUSE exiting VM-execution control is 1, the PAUSE instruction causes a VM exit (the PAUSE-loop exiting VM-execution control is ignored if CPL = 0 and the PAUSE exiting VM-execution control is 1).
    - If the PAUSE exiting VM-execution control is 0 and the PAUSE-loop exiting VM-execution control is 1, the following treatment applies. The processor determines the amount of time between this execution of PAUSE and the previous execution of PAUSE at CPL 0. If this amount of time exceeds the value of the VM-execution control field PLE_Gap, the processor considers this execution to be the first execution of PAUSE in a loop. (It also does so for the first execution of PAUSE at CPL 0 after VM entry.) Otherwise, the processor determines the amount of time since the most recent execution of PAUSE that was considered to be the first in a loop. If this amount of time exceeds the value of the VM-execution control field PLE_Window, a VM exit occurs. For purposes of these computations, time is measured based on a counter that runs at the same rate as the timestamp counter (TSC).
- CPL > 0.
    - If the PAUSE exiting VM-execution control is 0, the PAUSE instruction executes normally.
    - If the PAUSE exiting VM-execution control is 1, the PAUSE instruction causes a VM exit.
  The PAUSE-loop exiting VM-execution control is ignored if CPL > 0.

RDMSR. The RDMSR instruction causes a VM exit if any of the following are true:
- The use MSR bitmaps VM-execution control is 0.
- The value of ECX is not in the range 00000000H–00001FFFH or C0000000H–C0001FFFH.
- The value of ECX is in the range 00000000H–00001FFFH and bit n in the read bitmap for low MSRs is 1, where n is the value of ECX.
- The value of ECX is in the range C0000000H–C0001FFFH and bit n in the read bitmap for high MSRs is 1, where n is the value of ECX & 00001FFFH.
See Section 24.6.9 for details regarding how these bitmaps are identified.

RDPMC. The RDPMC instruction causes a VM exit if the RDPMC exiting VM-execution control is 1.

RDRAND. The RDRAND instruction causes a VM exit if the RDRAND exiting VM-execution control is 1.[2]

1. PAUSE-loop exiting is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the PAUSE-loop exiting VM-execution control were 0. See Section 24.6.2.
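The PAUSE-loop exiting treatment described above for CPL = 0 (PAUSE exiting 0, PAUSE-loop exiting 1) can be modeled as a small state machine. This is an illustrative software model only: the class and method names are hypothetical, and time is an abstract integer tick count standing in for the TSC-rate counter.

```python
class PauseLoopTracker:
    """Model of the PLE_Gap / PLE_Window computation for PAUSE at CPL 0."""

    def __init__(self, ple_gap: int, ple_window: int):
        self.ple_gap = ple_gap
        self.ple_window = ple_window
        self.last_pause = None   # time of the previous PAUSE at CPL 0
        self.loop_start = None   # time of the PAUSE considered first in a loop

    def on_vm_entry(self) -> None:
        # The first PAUSE at CPL 0 after VM entry starts a new loop.
        self.last_pause = None
        self.loop_start = None

    def pause_at_cpl0(self, now: int) -> bool:
        """Return True if this execution of PAUSE causes a VM exit."""
        first_in_loop = (self.last_pause is None
                         or now - self.last_pause > self.ple_gap)
        self.last_pause = now
        if first_in_loop:
            self.loop_start = now
            return False
        return now - self.loop_start > self.ple_window
```

In this model a guest spinning with PAUSE every 8 ticks, with PLE_Gap = 10 and PLE_Window = 100, exits on the first PAUSE more than 100 ticks after the loop began. A single gap longer than PLE_Gap restarts the measurement.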

RDTSC. The RDTSC instruction causes a VM exit if the RDTSC exiting VM-execution control is 1.

RDTSCP. The RDTSCP instruction causes a VM exit if the RDTSC exiting and enable RDTSCP VM-execution controls are both 1.[1]

RSM. The RSM instruction causes a VM exit if executed in system-management mode (SMM).[2]

WBINVD. The WBINVD instruction causes a VM exit if the WBINVD exiting VM-execution control is 1.[3]

WRMSR. The WRMSR instruction causes a VM exit if any of the following are true:
- The use MSR bitmaps VM-execution control is 0.
- The value of ECX is not in the range 00000000H–00001FFFH or C0000000H–C0001FFFH.
- The value of ECX is in the range 00000000H–00001FFFH and bit n in the write bitmap for low MSRs is 1, where n is the value of ECX.
- The value of ECX is in the range C0000000H–C0001FFFH and bit n in the write bitmap for high MSRs is 1, where n is the value of ECX & 00001FFFH.
See Section 24.6.9 for details regarding how these bitmaps are identified.
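The RDMSR and WRMSR conditions share one structure and differ only in which pair of bitmaps is consulted. The sketch below captures that structure. It assumes each bitmap is passed as a 1-KByte byte sequence and that bit n is bit n mod 8 of byte n div 8; the function name is hypothetical, and locating the bitmaps in memory (Section 24.6.9) is not shown.

```python
def msr_access_causes_vm_exit(ecx: int, use_msr_bitmaps: bool,
                              bitmap_low: bytes, bitmap_high: bytes) -> bool:
    """Apply the RDMSR/WRMSR rule to one access.

    Pass the read bitmaps to model RDMSR, or the write bitmaps for WRMSR.
    """
    if not use_msr_bitmaps:
        return True
    if 0x00000000 <= ecx <= 0x00001FFF:
        bitmap, n = bitmap_low, ecx
    elif 0xC0000000 <= ecx <= 0xC0001FFF:
        bitmap, n = bitmap_high, ecx & 0x00001FFF
    else:
        # MSRs outside both ranges always cause a VM exit.
        return True
    return bool((bitmap[n // 8] >> (n % 8)) & 1)
```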

25.2 OTHER CAUSES OF VM EXITS

In addition to VM exits caused by instruction execution, the following events can cause VM exits:

Exceptions. Exceptions (faults, traps, and aborts) cause VM exits based on the exception bitmap (see Section 24.6.3). If an exception occurs, its vector (in the range 0–31) is used to select a bit in the exception bitmap. If the bit is 1, a VM exit occurs; if the bit is 0, the exception is delivered normally through the guest IDT. This use of the exception bitmap applies also to exceptions generated by the instructions INT3, INTO, BOUND, and UD2.

Page faults (exceptions with vector 14) are specially treated. When a page fault occurs, a processor consults (1) bit 14 of the exception bitmap; (2) the error code produced with the page fault [PFEC]; (3) the page-fault error-code mask field [PFEC_MASK]; and (4) the page-fault error-code match field [PFEC_MATCH]. It checks if PFEC & PFEC_MASK = PFEC_MATCH. If there is equality, the specification of bit 14 in the exception bitmap is followed (for example, a VM exit occurs if that bit is set). If there is inequality, the meaning of that bit is reversed (for example, a VM exit occurs if that bit is clear).

Thus, if software desires VM exits on all page faults, it can set bit 14 in the exception bitmap to 1 and set the page-fault error-code mask and match fields each to 00000000H. If software desires VM exits on no page faults, it can set bit 14 in the exception bitmap to 1, the page-fault error-code mask field to 00000000H, and the page-fault error-code match field to FFFFFFFFH.

Triple fault. A VM exit occurs if the logical processor encounters an exception while attempting to call the double-fault handler and that exception itself does not cause a VM exit due to the exception bitmap. This applies to the case in which the double-fault exception was generated within VMX non-root operation, the case in which the double-fault exception was generated during event injection by VM entry, and to the case in which VM entry is injecting a double-fault exception.

2. RDRAND exiting is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the RDRAND exiting VM-execution control were 0. See Section 24.6.2.

1. Enable RDTSCP is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the enable RDTSCP VM-execution control were 0. See Section 24.6.2.
2. Execution of the RSM instruction outside SMM causes an invalid-opcode exception regardless of whether the processor is in VMX operation. It also does so in VMX root operation in SMM; see Section 34.15.3.
3. WBINVD exiting is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the WBINVD exiting VM-execution control were 0. See Section 24.6.2.

External interrupts. An external interrupt causes a VM exit if the external-interrupt exiting VM-execution control is 1. Otherwise, the interrupt is delivered normally through the IDT. (If a logical processor is in the shutdown state or the wait-for-SIPI state, external interrupts are blocked. The interrupt is not delivered through the IDT and no VM exit occurs.)

Non-maskable interrupts (NMIs). An NMI causes a VM exit if the NMI exiting VM-execution control is 1. Otherwise, it is delivered using descriptor 2 of the IDT. (If a logical processor is in the wait-for-SIPI state, NMIs are blocked. The NMI is not delivered through the IDT and no VM exit occurs.)

INIT signals. INIT signals cause VM exits. A logical processor performs none of the operations normally associated with these events. Such exits do not modify register state or clear pending events as they would outside of VMX operation. (If a logical processor is in the wait-for-SIPI state, INIT signals are blocked. They do not cause VM exits in this case.)

Start-up IPIs (SIPIs). SIPIs cause VM exits. If a logical processor is not in the wait-for-SIPI activity state when a SIPI arrives, no VM exit occurs and the SIPI is discarded. VM exits due to SIPIs do not perform any of the normal operations associated with those events: they do not modify register state as they would outside of VMX operation. (If a logical processor is not in the wait-for-SIPI state, SIPIs are blocked. They do not cause VM exits in this case.)

Task switches. Task switches are not allowed in VMX non-root operation. Any attempt to effect a task switch in VMX non-root operation causes a VM exit. See Section 25.4.2.

System-management interrupts (SMIs). If the logical processor is using the dual-monitor treatment of SMIs and system-management mode (SMM), SMIs cause SMM VM exits. See Section 34.15.2.[1]

VMX-preemption timer. A VM exit occurs when the timer counts down to zero. See Section 25.5.1 for details of operation of the VMX-preemption timer.
- Debug-trap exceptions and higher priority events take priority over VM exits caused by the VMX-preemption timer. VM exits caused by the VMX-preemption timer take priority over VM exits caused by the NMI-window exiting VM-execution control and lower priority events.
- These VM exits wake a logical processor from the same inactive states as would a non-maskable interrupt. Specifically, they wake a logical processor from the shutdown state and from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the wait-for-SIPI state.

1. Under the dual-monitor treatment of SMIs and SMM, SMIs also cause SMM VM exits if they occur in VMX root operation outside SMM. If the processor is using the default treatment of SMIs and SMM, SMIs are delivered as described in Section 34.14.1.

In addition, there are controls that cause VM exits based on the readiness of guest software to receive interrupts:

- If the interrupt-window exiting VM-execution control is 1, a VM exit occurs before execution of any instruction if RFLAGS.IF = 1 and there is no blocking of events by STI or by MOV SS (see Table 24-3). Such a VM exit occurs immediately after VM entry if the above conditions are true (see Section 26.6.5).
    - Non-maskable interrupts (NMIs) and higher priority events take priority over VM exits caused by this control. VM exits caused by this control take priority over external interrupts and lower priority events.
    - These VM exits wake a logical processor from the same inactive states as would an external interrupt. Specifically, they wake a logical processor from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the shutdown state or the wait-for-SIPI state.
- If the NMI-window exiting VM-execution control is 1, a VM exit occurs before execution of any instruction if there is no virtual-NMI blocking and there is no blocking of events by MOV SS (see Table 24-3). (A logical processor may also prevent such a VM exit if there is blocking of events by STI.) Such a VM exit occurs immediately after VM entry if the above conditions are true (see Section 26.6.6).
    - VM exits caused by the VMX-preemption timer and higher priority events take priority over VM exits caused by this control. VM exits caused by this control take priority over non-maskable interrupts (NMIs) and lower priority events.
    - These VM exits wake a logical processor from the same inactive states as would an NMI. Specifically, they wake a logical processor from the shutdown state and from the states entered using the HLT and MWAIT instructions. These VM exits do not occur if the logical processor is in the wait-for-SIPI state.
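The page-fault filtering rule given earlier in this section (PFEC & PFEC_MASK compared with PFEC_MATCH, with bit 14 of the exception bitmap reversed on inequality) can be written as a few lines of code. The function name is hypothetical, and the arguments are plain integers standing in for the VMCS fields.

```python
def page_fault_causes_vm_exit(pfec: int, pfec_mask: int, pfec_match: int,
                              exception_bitmap: int) -> bool:
    """Decide whether a page fault with error code `pfec` causes a VM exit."""
    bit14 = bool((exception_bitmap >> 14) & 1)
    if (pfec & pfec_mask) == pfec_match:
        # Equality: follow bit 14 of the exception bitmap.
        return bit14
    # Inequality: the meaning of bit 14 is reversed.
    return not bit14
```

For example, setting bit 14 with mask and match both equal to 2 (the write bit of the page-fault error code) selects only page faults caused by writes.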

25.3

CHANGES TO INSTRUCTION BEHAVIOR IN VMX NON-ROOT OPERATION

The behavior of some instructions is changed in VMX non-root operation. Some of these changes are determined by the settings of certain VM-execution control fields. The following items detail such changes: CLTS. Behavior of the CLTS instruction is determined by the bits in position 3 (corresponding to CR0.TS) in the CR0 guest/host mask and the CR0 read shadow: If bit 3 in the CR0 guest/host mask is 0, CLTS clears CR0.TS normally (the value of bit 3 in the CR0 read shadow is irrelevant in this case), unless CR0.TS is fixed to 1 in VMX operation (see Section 23.8), in which case CLTS causes a general-protection exception. If bit 3 in the CR0 guest/host mask is 1 and bit 3 in the CR0 read shadow is 0, CLTS completes but does not change the contents of CR0.TS. If the bits in position 3 in the CR0 guest/host mask and the CR0 read shadow are both 1, CLTS causes a VM exit. INVPCID. Behavior of the INVPCID instruction is determined first by the setting of the enable INVPCID VM-execution control:1 If the enable INVPCID VM-execution control is 0, INVPCID causes an invalid-opcode exception (#UD). If the enable INVPCID VM-execution control is 1, treatment is based on the setting of the INVLPG exiting VM-execution control: If the INVLPG exiting VM-execution control is 0, INVPCID operates normally. If the INVLPG exiting VM-execution control is 1, INVPCID causes a VM exit.

IRET. Behavior of IRET with regard to NMI blocking (see Table 24-3) is determined by the settings of the NMI exiting and virtual NMIs VM-execution controls: If the NMI exiting VM-execution control is 0, IRET operates normally and unblocks NMIs. (If the NMI exiting VM-execution control is 0, the virtual NMIs control must be 0; see Section 26.2.1.1.) If the NMI exiting VM-execution control is 1, IRET does not affect blocking of NMIs. If, in addition, the virtual NMIs VM-execution control is 1, the logical processor tracks virtual-NMI blocking. In this case, IRET removes any virtual-NMI blocking. The unblocking of NMIs or virtual NMIs specified above occurs even if IRET causes a fault.

LMSW. Outside of VMX non-root operation, LMSW loads its source operand into CR0[3:0], but it does not clear CR0.PE if that bit is set. In VMX non-root operation, an execution of LMSW that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR0[3:0] corresponding to a bit set in the CR0 guest/host mask. An attempt to set any other bit in CR0[3:0] to a value not supported in VMX operation (see Section 23.8) causes a general-protection exception. Attempts to clear CR0.PE are ignored without fault. MOV from CR0. The behavior of MOV from CR0 is determined by the CR0 guest/host mask and the CR0 read shadow. For each position corresponding to a bit clear in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR0. For each position corresponding to a bit set in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR0 read shadow. Thus, if every bit is cleared in the CR0 guest/host mask, MOV from CR0 reads normally from CR0; if every bit is set in the CR0 guest/host mask, MOV from CR0 returns the value of the CR0 read shadow.

1. Enable INVPCID is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the enable INVPCID VM-execution control were 0. See Section 24.6.2.

Intel 64 and IA-32 Architectures Software Developers Manual Documentation Changes

177

Depending on the contents of the CR0 guest/host mask and the CR0 read shadow, bits may be set in the destination that would never be set when reading directly from CR0. MOV from CR3. If the enable EPT VM-execution control is 1 and an execution of MOV from CR3 does not cause a VM exit (see Section 25.1.3), the value loaded from CR3 is a guest-physical address; see Section 28.2.1. MOV from CR4. The behavior of MOV from CR4 is determined by the CR4 guest/host mask and the CR4 read shadow. For each position corresponding to a bit clear in the CR4 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR4. For each position corresponding to a bit set in the CR4 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR4 read shadow. Thus, if every bit is cleared in the CR4 guest/host mask, MOV from CR4 reads normally from CR4; if every bit is set in the CR4 guest/host mask, MOV from CR4 returns the value of the CR4 read shadow. Depending on the contents of the CR4 guest/host mask and the CR4 read shadow, bits may be set in the destination that would never be set when reading directly from CR4. MOV from CR8. If the MOV from CR8 instruction does not cause a VM exit (see Section 25.1.3), its behavior is modified if the use TPR shadow VM-execution control is 1; see Section 29.3. MOV to CR0. An execution of MOV to CR0 that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR0 corresponding to a bit set in the CR0 guest/host mask. Treatment of attempts to modify other bits in CR0 depends on the setting of the unrestricted guest VM-execution control:1 If the control is 0, MOV to CR0 causes a general-protection exception if it attempts to set any bit in CR0 to a value not supported in VMX operation (see Section 23.8). 
If the control is 1, MOV to CR0 causes a general-protection exception if it attempts to set any bit in CR0 other than bit 0 (PE) or bit 31 (PG) to a value not supported in VMX operation. It remains the case, however, that MOV to CR0 causes a general-protection exception if it would result in CR0.PE = 0 and CR0.PG = 1 or if it would result in CR0.PG = 1, CR4.PAE = 0, and IA32_EFER.LME = 1. MOV to CR3. If the enable EPT VM-execution control is 1 and an execution of MOV to CR3 does not cause a VM exit (see Section 25.1.3), the value loaded into CR3 is treated as a guest-physical address; see Section 28.2.1. If PAE paging is not being used, the instruction does not use the guest-physical address to access memory and it does not cause it to be translated through EPT.2 If PAE paging is being used, the instruction translates the guest-physical address through EPT and uses the result to load the four (4) page-directory-pointer-table entries (PDPTEs). The instruction does not use the guest-physical addresses the PDPTEs to access memory and it does not cause them to be translated through EPT. MOV to CR4. An execution of MOV to CR4 that does not cause a VM exit (see Section 25.1.3) leaves unmodified any bit in CR4 corresponding to a bit set in the CR4 guest/host mask. Such an execution causes a general-protection exception if it attempts to set any bit in CR4 (not corresponding to a bit set in the CR4 guest/host mask) to a value not supported in VMX operation (see Section 23.8). MOV to CR8. If the MOV to CR8 instruction does not cause a VM exit (see Section 25.1.3), its behavior is modified if the use TPR shadow VM-execution control is 1; see Section 29.3. MWAIT. Behavior of the MWAIT instruction (which always causes an invalid-opcode exception#UDif CPL > 0) is determined by the setting of the MWAIT exiting VM-execution control: If the MWAIT exiting VM-execution control is 1, MWAIT causes a VM exit.

1. Unrestricted guest is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the unrestricted guest VM-execution control were 0. See Section 24.6.2. 2. A logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1 and IA32_EFER.LMA = 0. See Section 4.4 in the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3A.


If the MWAIT exiting VM-execution control is 0, MWAIT does not cause the processor to enter an implementation-dependent optimized state if (1) ECX[0] = 1; and (2) either (a) the interrupt-window exiting VM-execution control is 0; or (b) the logical processor has recognized a pending virtual interrupt (see Section 29.2.1). Instead, control passes to the instruction following the MWAIT instruction.

RDMSR. Section 25.1.3 identifies when executions of the RDMSR instruction cause VM exits. If such an execution causes neither a fault due to CPL > 0 nor a VM exit, the instruction's behavior may be modified for certain values of ECX: If ECX contains 10H (indicating the IA32_TIME_STAMP_COUNTER MSR), the value returned by the instruction is determined by the setting of the use TSC offsetting VM-execution control as well as the TSC offset: If the control is 0, the instruction operates normally, loading EAX:EDX with the value of the IA32_TIME_STAMP_COUNTER MSR. If the control is 1, the instruction loads EAX:EDX with the sum (using signed addition) of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset (interpreted as a signed value).

The 1-setting of the use TSC-offsetting VM-execution control does not affect executions of RDMSR if ECX contains 6E0H (indicating the IA32_TSC_DEADLINE MSR). Such executions return the APIC-timer deadline relative to the actual timestamp counter without regard to the TSC offset. If ECX is in the range 800H–8FFH (indicating an APIC MSR), instruction behavior may be modified if the virtualize x2APIC mode VM-execution control is 1; see Section 29.5.¹

RDTSC. Behavior of the RDTSC instruction is determined by the settings of the RDTSC exiting and use TSC offsetting VM-execution controls as well as the TSC offset: If both controls are 0, RDTSC operates normally. If the RDTSC exiting VM-execution control is 0 and the use TSC offsetting VM-execution control is 1, RDTSC loads EAX:EDX with the sum (using signed addition) of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset (interpreted as a signed value). If the RDTSC exiting VM-execution control is 1, RDTSC causes a VM exit.

RDTSCP. Behavior of the RDTSCP instruction is determined first by the setting of the enable RDTSCP VM-execution control:² If the enable RDTSCP VM-execution control is 0, RDTSCP causes an invalid-opcode exception (#UD). If the enable RDTSCP VM-execution control is 1, treatment is based on the settings of the RDTSC exiting and use TSC offsetting VM-execution controls as well as the TSC offset: If both controls are 0, RDTSCP operates normally. If the RDTSC exiting VM-execution control is 0 and the use TSC offsetting VM-execution control is 1, RDTSCP loads EAX:EDX with the sum (using signed addition) of the value of the IA32_TIME_STAMP_COUNTER MSR and the value of the TSC offset (interpreted as a signed value); it also loads ECX with the value of bits 31:0 of the IA32_TSC_AUX MSR. If the RDTSC exiting VM-execution control is 1, RDTSCP causes a VM exit.
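The TSC-offsetting arithmetic described for RDMSR (ECX = 10H), RDTSC, and RDTSCP can be illustrated with a small model. This is only a sketch: the function names are hypothetical, the TSC offset is assumed to be passed as the raw 64-bit field value, and the result is assumed to wrap modulo 2^64. VM exits and the #UD case for RDTSCP are not modeled.

```python
MASK64 = (1 << 64) - 1


def to_signed64(raw):
    """Interpret a raw 64-bit field value as a signed integer."""
    raw &= MASK64
    return raw - (1 << 64) if raw >> 63 else raw


def guest_tsc(host_tsc, tsc_offset_raw, use_tsc_offsetting):
    """Return the 64-bit timestamp value a guest would read (hypothetical helper)."""
    if not use_tsc_offsetting:
        # Control is 0: the instruction operates normally.
        return host_tsc & MASK64
    # Control is 1: signed addition of the counter and the signed offset,
    # truncated to 64 bits (wrap-around is an assumption of this sketch).
    return (host_tsc + to_signed64(tsc_offset_raw)) & MASK64


def split_halves(value):
    """Split a 64-bit value into (high 32 bits, low 32 bits) for the register pair."""
    return (value >> 32) & 0xFFFFFFFF, value & 0xFFFFFFFF
```

A negative offset makes the guest-visible counter run behind the real one, which is how a monitor could hide time spent outside the guest.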

SMSW. The behavior of SMSW is determined by the CR0 guest/host mask and the CR0 read shadow. For each position corresponding to a bit clear in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in CR0. For each position corresponding to a bit set in the CR0 guest/host mask, the destination operand is loaded with the value of the corresponding bit in the CR0 read shadow. Thus, if

1. Virtualize x2APIC mode is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the virtualize x2APIC mode VM-execution control were 0. See Section 24.6.2. 2. Enable RDTSCP is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the enable RDTSCP VM-execution control were 0. See Section 24.6.2.


every bit is cleared in the CR0 guest/host mask, MOV from CR0 reads normally from CR0; if every bit is set in the CR0 guest/host mask, MOV from CR0 returns the value of the CR0 read shadow. Note the following: (1) for any memory destination or for a 16-bit register destination, only the low 16 bits of the CR0 guest/host mask and the CR0 read shadow are used (bits 63:16 of a register destination are left unchanged); (2) for a 32-bit register destination, only the low 32 bits of the CR0 guest/host mask and the CR0 read shadow are used (bits 63:32 of the destination are cleared); and (3) depending on the contents of the CR0 guest/host mask and the CR0 read shadow, bits may be set in the destination that would never be set when reading directly from CR0.

WRMSR. Section 25.1.3 identifies when executions of the WRMSR instruction cause VM exits. If such an execution causes neither a fault due to CPL > 0 nor a VM exit, the instruction's behavior may be modified for certain values of ECX: If ECX contains 79H (indicating the IA32_BIOS_UPDT_TRIG MSR), no microcode update is loaded, and control passes to the next instruction. This implies that microcode updates cannot be loaded in VMX non-root operation. If ECX contains 808H (indicating the TPR MSR), 80BH (the EOI MSR), or 83FH (the self-IPI MSR), instruction behavior may be modified if the virtualize x2APIC mode VM-execution control is 1; see Section 29.5.¹
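The guest/host mask and read shadow rules that this section gives for MOV from CR0, MOV from CR4, SMSW, MOV to CR0, and MOV to CR4 reduce to a per-bit selection. The sketch below models them with hypothetical function names; it does not model the general-protection checks on unsupported values or any VM exit, and the write helper only shows which bits are left unmodified.

```python
MASK64 = (1 << 64) - 1


def guest_read_cr(cr, mask, shadow):
    """Value seen by MOV from CR0/CR4: shadow bits where the mask is set, real bits elsewhere."""
    return ((cr & ~mask) | (shadow & mask)) & MASK64


def guest_write_cr(cr, mask, source):
    """New CR value after MOV to CR0/CR4: bits with the mask set stay unmodified."""
    return ((cr & mask) | (source & ~mask)) & MASK64


def smsw_register(dest, cr0, mask, shadow, width):
    """SMSW to a 16-bit or 32-bit register destination, per notes (1) and (2)."""
    value = guest_read_cr(cr0, mask, shadow)
    if width == 16:
        # Bits 63:16 of the destination register are left unchanged.
        return (dest & ~0xFFFF & MASK64) | (value & 0xFFFF)
    if width == 32:
        # Bits 63:32 of the destination are cleared.
        return value & 0xFFFFFFFF
    raise ValueError("this sketch only models 16-bit and 32-bit destinations")
```

With an all-zero mask the read returns the real register; with an all-ones mask it returns the read shadow, matching the two limiting cases stated in the text.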

25.4

OTHER CHANGES IN VMX NON-ROOT OPERATION

Treatments of event blocking and of task switches differ in VMX non-root operation as described in the following sections. ...

25.4.2

Treatment of Task Switches

Task switches are not allowed in VMX non-root operation. Any attempt to effect a task switch in VMX non-root operation causes a VM exit. However, the following checks are performed (in the order indicated), possibly resulting in a fault, before there is any possibility of a VM exit due to task switch:
1. If a task gate is being used, appropriate checks are made on its P bit and on the proper values of the relevant privilege fields. The following cases detail the privilege checks performed:
a. If CALL, INT n, or JMP accesses a task gate in IA-32e mode, a general-protection exception occurs.
b. If CALL, INT n, INT3, INTO, or JMP accesses a task gate outside IA-32e mode, privilege-level checks are performed on the task gate but, if they pass, privilege levels are not checked on the referenced task-state segment (TSS) descriptor.
c. If CALL or JMP accesses a TSS descriptor directly in IA-32e mode, a general-protection exception occurs.
d. If CALL or JMP accesses a TSS descriptor directly outside IA-32e mode, privilege levels are checked on the TSS descriptor.
e. If a non-maskable interrupt (NMI), an exception, or an external interrupt accesses a task gate in the IDT in IA-32e mode, a general-protection exception occurs.
f. If a non-maskable interrupt (NMI), an exception other than breakpoint exceptions (#BP) and overflow exceptions (#OF), or an external interrupt accesses a task gate in the IDT outside IA-32e mode, no privilege checks are performed.

1. Virtualize x2APIC mode is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the virtualize x2APIC mode VM-execution control were 0. See Section 24.6.2.


g. If IRET is executed with RFLAGS.NT = 1 in IA-32e mode, a general-protection exception occurs.
h. If IRET is executed with RFLAGS.NT = 1 outside IA-32e mode, a TSS descriptor is accessed directly and no privilege checks are made.
2. Checks are made on the new TSS selector (for example, that it is within GDT limits).
3. The new TSS descriptor is read. (A page fault results if a relevant GDT page is not present.)
4. The TSS descriptor is checked for proper values of type (depends on type of task switch), P bit, S bit, and limit.
Only if checks 1–4 all pass (do not generate faults) might a VM exit occur. However, the ordering between a VM exit due to a task switch and a page fault resulting from accessing the old TSS or the new TSS is implementation-specific. Some processors may generate a page fault (instead of a VM exit due to a task switch) if accessing either TSS would cause a page fault. Other processors may generate a VM exit due to a task switch even if accessing either TSS would cause a page fault.

If an attempt at a task switch through a task gate in the IDT causes an exception (before generating a VM exit due to the task switch) and that exception causes a VM exit, information about the event whose delivery accessed the task gate is recorded in the IDT-vectoring information fields and information about the exception that caused the VM exit is recorded in the VM-exit interruption-information fields. See Section 27.2. The fact that a task gate was being accessed is not recorded in the VMCS.

If an attempt at a task switch through a task gate in the IDT causes a VM exit due to the task switch, information about the event whose delivery accessed the task gate is recorded in the IDT-vectoring fields of the VMCS. Since the cause of such a VM exit is a task switch and not an interruption, the valid bit for the VM-exit interruption information field is 0. See Section 27.2. ...
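Cases a through h of check 1 can be read as a lookup from (what initiated the task switch, whether a task gate is used, whether the processor is in IA-32e mode) to the privilege treatment. The table below is a sketch of that reading only; the enum and function names are hypothetical, and combinations that the list does not enumerate are deliberately left out rather than guessed.

```python
from enum import Enum


class Source(Enum):
    CALL_OR_JMP = 1
    INT_N = 2
    INT3_OR_INTO = 3
    HW_EVENT = 4      # NMI, external interrupt, or exception (outside IA-32e: not #BP/#OF)
    IRET_NT = 5       # IRET executed with RFLAGS.NT = 1


class Outcome(Enum):
    GP_FAULT = "general-protection exception"
    CHECK_GATE_ONLY = "privilege checked on task gate, not on TSS descriptor"
    CHECK_TSS = "privilege checked on TSS descriptor"
    NO_PRIV_CHECK = "no privilege checks"


# Key: (source, via_task_gate, ia32e_mode) -> (case letter, outcome)
CASES = {
    (Source.CALL_OR_JMP, True, True): ("a", Outcome.GP_FAULT),
    (Source.INT_N, True, True): ("a", Outcome.GP_FAULT),
    (Source.CALL_OR_JMP, True, False): ("b", Outcome.CHECK_GATE_ONLY),
    (Source.INT_N, True, False): ("b", Outcome.CHECK_GATE_ONLY),
    (Source.INT3_OR_INTO, True, False): ("b", Outcome.CHECK_GATE_ONLY),
    (Source.CALL_OR_JMP, False, True): ("c", Outcome.GP_FAULT),
    (Source.CALL_OR_JMP, False, False): ("d", Outcome.CHECK_TSS),
    (Source.HW_EVENT, True, True): ("e", Outcome.GP_FAULT),
    (Source.HW_EVENT, True, False): ("f", Outcome.NO_PRIV_CHECK),
    (Source.IRET_NT, False, True): ("g", Outcome.GP_FAULT),
    (Source.IRET_NT, False, False): ("h", Outcome.NO_PRIV_CHECK),
}


def privilege_treatment(source, via_task_gate, ia32e_mode):
    """Return (case letter, outcome), or None if the list does not cover the combination."""
    return CASES.get((source, via_task_gate, ia32e_mode))
```

Every IA-32e entry in the table resolves to a general-protection exception, so a fault rather than a task-switch VM exit is the outcome in each IA-32e case listed.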

25.5.4

APIC Virtualization

APIC virtualization is a collection of features that can be used to support the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC). When APIC virtualization is enabled, the processor emulates many accesses to the APIC, tracks the state of the virtual APIC, and delivers virtual interrupts, all in VMX non-root operation without a VM exit. Details of APIC virtualization are given in Chapter 29. ...

25.5.5.3

EPTP Switching

EPTP switching is VM function 0. This VM function allows software in VMX non-root operation to load a new value for the EPT pointer (EPTP), thereby establishing a different EPT paging-structure hierarchy (see Section 28.2 for details of the operation of EPT). Software is limited to selecting from a list of potential EPTP values configured in advance by software in VMX root operation. Specifically, the value of ECX is used to select an entry from the EPTP list, the 4-KByte structure referenced by the EPTP-list address (see Section 24.6.14; because this structure contains 512 8-Byte entries, VMFUNC causes a VM exit if ECX ≥ 512). If the selected entry is a valid EPTP value (it would not cause VM entry to fail; see Section 26.2.1.1), it is stored in the EPTP field of the current VMCS and is used for subsequent accesses using guest-physical addresses. The following pseudocode provides details:

IF ECX ≥ 512
    THEN VM exit;
    ELSE
        tent_EPTP ← 8 bytes from EPTP-list address + 8 * ECX;
        IF tent_EPTP is not a valid EPTP value (would cause VM entry to fail if in EPTP)
            THEN VM exit;
            ELSE
                write tent_EPTP to the EPTP field in the current VMCS;
                start using tent_EPTP as the new EPTP value for address translation;
        FI;
FI;

Execution of the EPTP-switching VM function does not modify the state of any registers; no flags are modified. As noted in Section 25.5.5.2, an execution of the EPTP-switching VM function that causes a VM exit (as specified above) uses the basic exit reason 59, indicating VMFUNC. The length of the VMFUNC instruction is saved into the VM-exit instruction-length field. No additional VM-exit information is provided.

An execution of VMFUNC that loads EPTP from the EPTP list (and thus does not cause a fault or VM exit) is called an EPTP-switching VMFUNC. After an EPTP-switching VMFUNC, control passes to the next instruction. The logical processor starts creating and using guest-physical and combined mappings associated with the new value of bits 51:12 of EPTP; the combined mappings created and used are associated with the current VPID and PCID (these are not changed by VMFUNC).¹ If the enable VPID VM-execution control is 0, an EPTP-switching VMFUNC invalidates combined mappings associated with VPID 0000H (for all PCIDs and for all EP4TA values, where EP4TA is the value of bits 51:12 of EPTP).

Because an EPTP-switching VMFUNC may change the translation of guest-physical addresses, it may affect use of the guest-physical address in CR3. The EPTP-switching VMFUNC cannot itself cause a VM exit due to an EPT violation or an EPT misconfiguration due to the translation of that guest-physical address through the new EPT paging structures. The following items provide details that apply if CR0.PG = 1: If 32-bit paging or IA-32e paging is in use (either CR4.PAE = 0 or IA32_EFER.LMA = 1), the next memory access with a linear address uses the translation of the guest-physical address in CR3 through the new EPT paging structures.
As a result, this access may cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during that translation. If PAE paging is in use (CR4.PAE = 1 and IA32_EFER.LMA = 0), an EPTP-switching VMFUNC does not load the four page-directory-pointer-table entries (PDPTEs) from the guest-physical address in CR3. The logical processor continues to use the four guest-physical addresses already present in the PDPTEs. The guest-physical address in CR3 is not translated through the new EPT paging structures (until some operation that would load the PDPTEs). The EPTP-switching VMFUNC cannot itself cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during the translation of a guest-physical address in any of the PDPTEs. A subsequent memory access with a linear address uses the translation of the guest-physical address in the appropriate PDPTE through the new EPT paging structures. As a result, such an access may cause a VM exit due to an EPT violation or an EPT misconfiguration encountered during that translation.

If an EPTP-switching VMFUNC establishes an EPTP value that enables accessed and dirty flags for EPT (by setting bit 6), subsequent memory accesses may fail to set those flags as specified if there has been no appropriate execution of INVEPT since the last use of an EPTP value that does not enable accessed and dirty flags for EPT (because bit 6 is clear) and that is identical to the new value on bits 51:12. ...
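The EPTP-switching pseudocode in this section can be exercised as a small software model. This is a sketch under stated assumptions: the EPTP list is modeled as a 4096-byte buffer with little-endian 8-byte entries, the VMCS as a dictionary, and the validity test as a caller-supplied predicate; `VMExit`, `eptp_switch`, and the `"EPTP"` key are hypothetical names, not part of any real interface.

```python
import struct

EPTP_LIST_ENTRIES = 512      # 4-KByte structure holding 8-byte entries
EXIT_REASON_VMFUNC = 59      # basic exit reason stated in the text


class VMExit(Exception):
    """Models a VM exit and carries its basic exit reason."""

    def __init__(self, reason):
        super().__init__(reason)
        self.reason = reason


def eptp_switch(ecx, eptp_list_page, is_valid_eptp, vmcs):
    """Model VM function 0; returns the new EPTP or raises VMExit."""
    if ecx >= EPTP_LIST_ENTRIES:
        raise VMExit(EXIT_REASON_VMFUNC)
    # tent_EPTP <- 8 bytes from EPTP-list address + 8 * ECX
    (tent_eptp,) = struct.unpack_from("<Q", eptp_list_page, 8 * ecx)
    if not is_valid_eptp(tent_eptp):
        raise VMExit(EXIT_REASON_VMFUNC)
    # Write the EPTP field; later translations would use the new value.
    vmcs["EPTP"] = tent_eptp
    return tent_eptp
```

Note that the VMCS is only written after both checks pass, so a failing VMFUNC leaves the current EPTP in place.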

21. Updates to Chapter 26, Volume 3C


Change bars show changes to Chapter 26 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
------------------------------------------------------------------------------------------

1. If the enable VPID VM-execution control is 0, the current VPID is 0000H; if CR4.PCIDE = 0, the current PCID is 000H.


...

26.2.1.1

VM-Execution Control Fields

VM entries perform the following checks on the VM-execution control fields:¹
Reserved bits in the pin-based VM-execution controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.3.1).
Reserved bits in the primary processor-based VM-execution controls must be set properly. Software may consult the VMX capability MSRs to determine the proper settings (see Appendix A.3.2).
If the activate secondary controls primary processor-based VM-execution control is 1, reserved bits in the secondary processor-based VM-execution controls must be cleared. Software may consult the VMX capability MSRs to determine which bits are reserved (see Appendix A.3.3). If the activate secondary controls primary processor-based VM-execution control is 0 (or if the processor does not support the 1-setting of that control), no checks are performed on the secondary processor-based VM-execution controls. The logical processor operates as if all the secondary processor-based VM-execution controls were 0.
The CR3-target count must not be greater than 4. Future processors may support a different number of CR3-target values. Software should read the VMX capability MSR IA32_VMX_MISC to determine the number of values supported (see Appendix A.6).
If the use I/O bitmaps VM-execution control is 1, bits 11:0 of each I/O-bitmap address must be 0. Neither address should set any bits beyond the processor's physical-address width.²,³
If the use MSR bitmaps VM-execution control is 1, bits 11:0 of the MSR-bitmap address must be 0. The address should not set any bits beyond the processor's physical-address width.⁴
If the use TPR shadow VM-execution control is 1, the virtual-APIC address must satisfy the following checks: Bits 11:0 of the address must be 0. The address should not set any bits beyond the processor's physical-address width.⁵ If all of the above checks are satisfied and the use TPR shadow VM-execution control is 1, bytes 3:1 of VTPR (see Section 29.1.1) may be cleared (behavior may be implementation-specific). The clearing of these bytes may occur even if the VM entry fails. This is true either if the failure causes control to pass to the instruction following the VM-entry instruction or if it causes processor state to be loaded from the host-state area of the VMCS.
If the use TPR shadow VM-execution control is 1 and the virtual-interrupt delivery VM-execution control is 0, bits 31:4 of the TPR threshold VM-execution control field must be 0.⁶
The following check is performed if the use TPR shadow VM-execution control is 1 and the virtualize APIC accesses and virtual-interrupt delivery VM-execution controls are both 0: the value of bits 3:0 of the TPR threshold VM-execution control field should not be greater than the value of bits 7:4 of VTPR (see Section 29.1.1).

1. If the activate secondary controls primary processor-based VM-execution control is 0, VM entry operates as if each secondary processor-based VM-execution control were 0.
2. Software can determine a processor's physical-address width by executing CPUID with 80000008H in EAX. The physical-address width is returned in bits 7:0 of EAX.
3. If IA32_VMX_BASIC[48] is read as 1, these addresses must not set any bits in the range 63:32; see Appendix A.1.
4. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1.
5. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1.
6. Virtual-interrupt delivery is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the virtual-interrupt delivery VM-execution control were 0. See Section 24.6.2.


If the NMI exiting VM-execution control is 0, the virtual NMIs VM-execution control must be 0.
If the virtual NMIs VM-execution control is 0, the NMI-window exiting VM-execution control must be 0.
If the virtualize APIC-accesses VM-execution control is 1, the APIC-access address must satisfy the following checks: Bits 11:0 of the address must be 0. The address should not set any bits beyond the processor's physical-address width.¹
If the use TPR shadow VM-execution control is 0, the following VM-execution controls must also be 0: virtualize x2APIC mode, APIC-register virtualization, and virtual-interrupt delivery.²
If the virtualize x2APIC mode VM-execution control is 1, the virtualize APIC accesses VM-execution control must be 0.
If the virtual-interrupt delivery VM-execution control is 1, the external-interrupt exiting VM-execution control must be 1.
If the process posted interrupts VM-execution control is 1, the following must be true:³ The virtual-interrupt delivery VM-execution control is 1. The acknowledge interrupt on exit VM-exit control is 1. The posted-interrupt notification vector has a value in the range 0–255 (bits 15:8 are all 0). Bits 5:0 of the posted-interrupt descriptor address are all 0. The posted-interrupt descriptor address does not set any bits beyond the processor's physical-address width.⁴
If the enable VPID VM-execution control is 1, the value of the VPID VM-execution control field must not be 0000H.⁵
If the enable EPT VM-execution control is 1, the EPTP VM-execution control field (see Table 24-8 in Section 24.6.11) must satisfy the following checks:⁶ The EPT memory type (bits 2:0) must be a value supported by the processor as indicated in the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A.10). Bits 5:3 (1 less than the EPT page-walk length) must be 3, indicating an EPT page-walk length of 4; see Section 28.2.2. Bit 6 (enable bit for accessed and dirty flags for EPT) must be 0 if bit 21 of the IA32_VMX_EPT_VPID_CAP MSR (see Appendix A.10) is read as 0, indicating that the processor does not support accessed and dirty flags for EPT. Reserved bits 11:7 and 63:N (where N is the processor's physical-address width) must all be 0.
If the unrestricted guest VM-execution control is 1, the enable EPT VM-execution control must also be 1.⁷

1. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1.
2. Virtualize x2APIC mode and APIC-register virtualization are secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if these controls were 0. See Section 24.6.2.
3. Process posted interrupts is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the process posted interrupts VM-execution control were 0. See Section 24.6.2.
4. If IA32_VMX_BASIC[48] is read as 1, this address must not set any bits in the range 63:32; see Appendix A.1.
5. Enable VPID is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the enable VPID VM-execution control were 0. See Section 24.6.2.
6. Enable EPT is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the enable EPT VM-execution control were 0. See Section 24.6.2.


If the enable VM functions processor-based VM-execution control is 1, reserved bits in the VM-function controls must be clear.¹ Software may consult the VMX capability MSRs to determine which bits are reserved (see Appendix A.11). In addition, the following check is performed based on the setting of bits in the VM-function controls (see Section 24.6.14): If the EPTP switching VM-function control is 1, the enable EPT VM-execution control must also be 1. In addition, the EPTP-list address must satisfy the following checks: Bits 11:0 of the address must be 0. The address must not set any bits beyond the processor's physical-address width.

If the enable VM functions processor-based VM-execution control is 0, no checks are performed on the VM-function controls. ...
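Several of the checks in this section share two shapes: an address whose low bits must be 0 and which must fit within the physical-address width, and the bit-field checks on EPTP. The sketch below models only those; the function names are hypothetical, and the physical-address width, the set of supported EPT memory types, and accessed/dirty support are assumed to have been read from CPUID and the capability MSRs by the caller.

```python
def address_ok(addr, phys_width, low_zero_bits=12):
    """True if the low bits are 0 and no bit at or above phys_width is set."""
    low_mask = (1 << low_zero_bits) - 1
    return (addr & low_mask) == 0 and (addr >> phys_width) == 0


def eptp_failures(eptp, phys_width, supported_memtypes, ad_flags_supported):
    """Return the list of EPTP checks that fail (empty list means all pass)."""
    failures = []
    if (eptp & 0x7) not in supported_memtypes:        # bits 2:0, EPT memory type
        failures.append("memory type")
    if ((eptp >> 3) & 0x7) != 3:                      # bits 5:3, page-walk length minus 1
        failures.append("page-walk length")
    if (eptp >> 6) & 1 and not ad_flags_supported:    # bit 6, accessed/dirty enable
        failures.append("accessed/dirty enable")
    if (eptp >> 7) & 0x1F or (eptp >> phys_width):    # bits 11:7 and 63:N reserved
        failures.append("reserved bits")
    return failures
```

The same `address_ok` helper covers the 4-KByte-aligned addresses (I/O bitmaps, MSR bitmap, virtual-APIC, APIC-access, EPTP list) with the default of 12 low bits, and the posted-interrupt descriptor address with `low_zero_bits=6`.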

26.3.2.5

Updating Non-Register State

Section 28.3 describes how the VMX architecture controls how a logical processor manages information in the TLBs and paging-structure caches. The following items detail how VM entries invalidate cached mappings: If the enable VPID VM-execution control is 0, the logical processor invalidates linear mappings and combined mappings associated with VPID 0000H (for all PCIDs); combined mappings for VPID 0000H are invalidated for all EP4TA values (EP4TA is the value of bits 51:12 of EPTP). VM entries are not required to invalidate any guest-physical mappings, nor are they required to invalidate any linear mappings or combined mappings if the enable VPID VM-execution control is 1.

If the virtual-interrupt delivery VM-execution control is 1, VM entry loads the values of RVI and SVI from the guest interrupt-status field in the VMCS (see Section 24.4.2). After doing so, the logical processor first causes PPR virtualization (Section 29.1.3) and then evaluates pending virtual interrupts (Section 29.2.1). If a virtual interrupt is recognized, it may be delivered in VMX non-root operation immediately after VM entry (including any specified event injection) completes; see Section 26.6.5. See Section 29.2.2 for details regarding the delivery of virtual interrupts. ...

26.5.1.2

VM Exits During Event Injection

An event being injected never causes a VM exit directly regardless of the settings of the VM-execution controls. For example, setting the NMI exiting VM-execution control to 1 does not cause a VM exit due to injection of an NMI. However, the event-delivery process may lead to a VM exit: If the vector in the VM-entry interruption-information field identifies a task gate in the IDT, the attempted task switch may cause a VM exit just as it would had the injected event occurred during normal execution in VMX non-root operation (see Section 25.4.2). If event delivery encounters a nested exception, a VM exit may occur depending on the contents of the exception bitmap (see Section 25.2). If event delivery generates a double-fault exception (due to a nested exception); the logical processor encounters another nested exception while attempting to call the double-fault handler; and that exception

7. Unrestricted guest and enable EPT are both secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if both these controls were 0. See Section 24.6.2. 1. Enable VM functions is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the enable VM functions VM-execution control were 0. See Section 24.6.2.


does not cause a VM exit due to the exception bitmap; then a VM exit occurs due to triple fault (see Section 25.2). If event delivery injects a double-fault exception and encounters a nested exception that does not cause a VM exit due to the exception bitmap, then a VM exit occurs due to triple fault (see Section 25.2). If the virtualize APIC accesses VM-execution control is 1 and event delivery generates an access to the APIC-access page, that access is treated as described in Section 29.4 and may cause a VM exit.1

If the event-delivery process does cause a VM exit, the processor state before the VM exit is determined just as it would be had the injected event occurred during normal execution in VMX non-root operation. If the injected event directly accesses a task gate that causes a VM exit or if the first nested exception encountered causes a VM exit, information about the injected event is saved in the IDT-vectoring information field (see Section 27.2.3). ...

26.6.3

Delivery of Pending Debug Exceptions after VM Entry

The pending debug exceptions field in the guest-state area indicates whether there are debug exceptions that have not yet been delivered (see Section 24.4.2). This section describes how these are treated on VM entry. There are no pending debug exceptions after VM entry if any of the following are true:
The VM entry is vectoring with one of the following interruption types: external interrupt, non-maskable interrupt (NMI), hardware exception, or privileged software exception.
The interruptibility-state field does not indicate blocking by MOV SS and the VM entry is vectoring with either of the following interruption types: software interrupt or software exception.
The VM entry is not vectoring and the activity-state field indicates either shutdown or wait-for-SIPI.

If none of the above hold, the pending debug exceptions field specifies the debug exceptions that are pending for the guest. There are valid pending debug exceptions if either the BS bit (bit 14) or the enable-breakpoint bit (bit 12) is 1. If there are valid pending debug exceptions, they are handled as follows:
If the VM entry is not vectoring, the pending debug exceptions are treated as they would be had they been encountered normally in guest execution: If the logical processor is not blocking such exceptions (the interruptibility-state field indicates no blocking by MOV SS), a debug exception is delivered after VM entry (see below). If the logical processor is blocking such exceptions (due to blocking by MOV SS), the pending debug exceptions are held pending or lost as would normally be the case.
If the VM entry is vectoring (with interruption type software interrupt or software exception and with blocking by MOV SS), the following items apply: For injection of a software interrupt or of a software exception with vector 3 (#BP) or vector 4 (#OF), the pending debug exceptions are treated as they would be had they been encountered normally in guest execution if the corresponding instruction (INT3 or INTO) were executed after a MOV SS that encountered a debug trap. For injection of a software exception with a vector other than 3 and 4, the pending debug exceptions may be lost or they may be delivered after injection (see below).
If there are no valid pending debug exceptions (as defined above), no pending debug exceptions are delivered after VM entry.

If a pending debug exception is delivered after VM entry, it has the priority of traps on the previous instruction (see Section 6.9 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A). Thus, INIT signals and system-management interrupts (SMIs) take priority over such an exception, as do VM exits induced by the TPR threshold (see Section 26.6.7) and pending MTF VM exits (see Section 26.6.8). The exception takes priority over any pending non-maskable interrupt (NMI) or external interrupt and also over VM exits due to the 1-settings of the interrupt-window exiting and NMI-window exiting VM-execution controls.

A pending debug exception delivered after VM entry causes a VM exit if bit 1 (#DB) is 1 in the exception bitmap. If it does not cause a VM exit, it updates DR6 normally. ...

1. Virtualize APIC accesses is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if the virtualize APIC accesses VM-execution control were 0. See Section 24.6.2.
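The rules of this section for pending debug exceptions can be condensed into a single decision function. The sketch below is one reading of the text, with hypothetical names and string labels chosen for illustration; it classifies the treatment only and does not model delivery priority or the exception-bitmap check.

```python
HW_TYPES = {"external interrupt", "NMI", "hardware exception",
            "privileged software exception"}
SW_TYPES = {"software interrupt", "software exception"}
BS_BIT = 1 << 14                 # BS bit of the pending debug exceptions field
ENABLED_BP_BIT = 1 << 12         # enable-breakpoint bit


def pending_debug_treatment(pending_field, vectoring_type=None, vector=None,
                            blocking_by_mov_ss=False, activity_state="active"):
    """Classify how pending debug exceptions are treated on VM entry.

    vectoring_type is None when the VM entry is not vectoring.
    """
    # Conditions under which there are no pending debug exceptions at all.
    if vectoring_type in HW_TYPES:
        return "none"
    if vectoring_type in SW_TYPES and not blocking_by_mov_ss:
        return "none"
    if vectoring_type is None and activity_state in ("shutdown", "wait-for-SIPI"):
        return "none"
    # Valid only if the BS bit or the enable-breakpoint bit is 1.
    if not pending_field & (BS_BIT | ENABLED_BP_BIT):
        return "none"
    if vectoring_type is None:
        if blocking_by_mov_ss:
            return "held pending or lost"
        return "delivered after VM entry"
    # Vectoring with a software type and blocking by MOV SS.
    if vectoring_type == "software interrupt" or vector in (3, 4):
        return "as if executed after MOV SS"
    return "lost or delivered after injection"
```

For example, a non-vectoring entry with the BS bit set and no MOV SS blocking yields "delivered after VM entry", while the same field with an injected NMI yields "none".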

26.6.5

Interrupt-Window Exiting and Virtual-Interrupt Delivery

If the interrupt-window exiting VM-execution control is 1, an open interrupt window may cause a VM exit immediately after VM entry (see Section 25.2 for details). If the interrupt-window exiting VM-execution control is 0 but the virtual-interrupt delivery VM-execution control is 1, a virtual interrupt may be delivered immediately after VM entry (see Section 26.3.2.5 and Section 29.2.1). The following items detail the treatment of these events: ... These events occur after any event injection specified for VM entry. Non-maskable interrupts (NMIs) and higher priority events take priority over these events. These events take priority over external interrupts and lower priority events. These events wake the logical processor if it just entered the HLT state because of a VM entry (see Section 26.6.2). They do not occur if the logical processor just entered the shutdown state or the wait-for-SIPI state.

26.6.7

VM Exits Induced by the TPR Threshold

If the "use TPR shadow" and "virtualize APIC accesses" VM-execution controls are both 1 and the "virtual-interrupt delivery" VM-execution control is 0, a VM exit occurs immediately after VM entry if the value of bits 3:0 of the TPR threshold VM-execution control field is greater than the value of bits 7:4 of VTPR (see Section 29.1.1).¹ The following items detail the treatment of these VM exits:

• The VM exits are not blocked if RFLAGS.IF = 0 or by the setting of bits in the interruptibility-state field in the guest-state area.
• The VM exits follow event injection if such injection is specified for VM entry.
• VM exits caused by this control take priority over system-management interrupts (SMIs), INIT signals, and lower priority events. They thus have priority over the VM exits described in Section 26.6.5, Section 26.6.6, and Section 26.6.8, as well as any interrupts or debug exceptions that may be pending at the time of VM entry.
• These VM exits wake the logical processor if it just entered the HLT state as part of a VM entry (see Section 26.6.2). They do not occur if the logical processor just entered the shutdown state or the wait-for-SIPI state. If such a VM exit is suppressed because the processor just entered the shutdown state, it occurs after the delivery of any event that causes the logical processor to leave the shutdown state while remaining in VMX non-root operation (e.g., due to an NMI that occurs while the "NMI-exiting" VM-execution control is 0).
• The basic exit reason is "TPR below threshold".

...

1. "Virtualize APIC accesses" and "virtual-interrupt delivery" are secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, VM entry functions as if these controls were 0. See Section 24.6.2.
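The comparison that triggers this VM exit can be sketched in Python as follows. This is only an illustration of the stated condition; the function name and parameter names are hypothetical, and the three controls are passed in as plain booleans rather than read from a VMCS.

```python
def tpr_threshold_exit_after_entry(use_tpr_shadow: bool,
                                   virtualize_apic_accesses: bool,
                                   virtual_interrupt_delivery: bool,
                                   tpr_threshold: int,
                                   vtpr: int) -> bool:
    """Return True if the stated condition for a VM exit immediately
    after VM entry holds (hypothetical helper)."""
    # Both controls must be 1 and virtual-interrupt delivery must be 0.
    if not (use_tpr_shadow and virtualize_apic_accesses):
        return False
    if virtual_interrupt_delivery:
        return False
    threshold = tpr_threshold & 0xF   # bits 3:0 of the TPR threshold
    vtpr_class = (vtpr >> 4) & 0xF    # bits 7:4 of VTPR
    return threshold > vtpr_class
```

For example, with a TPR threshold of 5 and a VTPR of 40H, bits 7:4 of VTPR are 4, so the condition holds; with a VTPR of 50H it does not.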


22. Updates to Chapter 27, Volume 3C


Change bars show changes to Chapter 27 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.

------------------------------------------------------------------------------------------

...

VM exits occur in response to certain instructions and events in VMX non-root operation as detailed in Section 25.1 through Section 25.2. VM exits perform the following operations:

1. Information about the cause of the VM exit is recorded in the VM-exit information fields and VM-entry control fields are modified as described in Section 27.2.
2. Processor state is saved in the guest-state area (Section 27.3).
3. MSRs may be saved in the VM-exit MSR-store area (Section 27.4). This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM.
4. The following may be performed in parallel and in any order (Section 27.5):
   - Processor state is loaded based in part on the host-state area and some VM-exit controls. This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM. See Section 34.15.6 for information on how processor state is loaded by such VM exits.
   - Address-range monitoring is cleared.
5. MSRs may be loaded from the VM-exit MSR-load area (Section 27.6). This step is not performed for SMM VM exits that activate the dual-monitor treatment of SMIs and SMM.

VM exits are not logged with last-branch records, do not produce branch-trace messages, and do not update the branch-trace store.

Section 27.1 clarifies the nature of the architectural state before a VM exit begins. The steps described above are detailed in Section 27.2 through Section 27.6.

Section 34.15 describes the dual-monitor treatment of system-management interrupts (SMIs) and system-management mode (SMM). Under this treatment, ordinary transitions to SMM are replaced by VM exits to a separate SMM monitor. Called "SMM VM exits", these are caused by the arrival of an SMI or the execution of VMCALL in VMX root operation. SMM VM exits differ from other VM exits in ways that are detailed in Section 34.15.2.

27.1

ARCHITECTURAL STATE BEFORE A VM EXIT

This section describes the architectural state that exists before a VM exit, especially for VM exits caused by events that would normally be delivered through the IDT. Note the following:

• An exception causes a VM exit directly if the bit corresponding to that exception is set in the exception bitmap. A non-maskable interrupt (NMI) causes a VM exit directly if the "NMI exiting" VM-execution control is 1. An external interrupt causes a VM exit directly if the "external-interrupt exiting" VM-execution control is 1. A start-up IPI (SIPI) that arrives while a logical processor is in the wait-for-SIPI activity state causes a VM exit directly. INIT signals that arrive while the processor is not in the wait-for-SIPI activity state cause VM exits directly.
• An exception, NMI, external interrupt, or software interrupt causes a VM exit indirectly if it does not do so directly but delivery of the event causes a nested exception, double fault, task switch, APIC access (see Section 29.4), EPT violation, or EPT misconfiguration that causes a VM exit.
• An event results in a VM exit if it causes a VM exit (directly or indirectly).
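The "directly" conditions listed above can be summarized in a small Python sketch. It mirrors only the conditions stated in this section; the event names, the function name, and the keyword parameters are hypothetical, and any further qualification of individual exceptions described elsewhere in the manual is not modeled.

```python
def causes_vm_exit_directly(event: str, *, vector: int = 0,
                            exception_bitmap: int = 0,
                            nmi_exiting: bool = False,
                            external_interrupt_exiting: bool = False,
                            in_wait_for_sipi: bool = False) -> bool:
    """Return True if the event causes a VM exit directly, following
    only the conditions listed in the text (hypothetical helper)."""
    if event == "exception":
        # The bit for this vector must be set in the exception bitmap.
        return bool((exception_bitmap >> vector) & 1)
    if event == "nmi":
        return nmi_exiting
    if event == "external_interrupt":
        return external_interrupt_exiting
    if event == "sipi":
        # The text covers only SIPIs arriving in the wait-for-SIPI state;
        # returning False otherwise is an assumption of this sketch.
        return in_wait_for_sipi
    if event == "init":
        return not in_wait_for_sipi
    raise ValueError(f"unknown event kind: {event!r}")
```

An event that returns False here may still result in a VM exit indirectly, for example if its delivery causes an EPT violation.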


The following bullets detail when architectural state is and is not updated in response to VM exits:

• If an event causes a VM exit directly, it does not update architectural state as it would have if it had not caused the VM exit:
  - A debug exception does not update DR6, DR7.GD, or IA32_DEBUGCTL.LBR. (Information about the nature of the debug exception is saved in the exit qualification field.)
  - A page fault does not update CR2. (The linear address causing the page fault is saved in the exit-qualification field.)
  - An NMI causes subsequent NMIs to be blocked, but only after the VM exit completes.
  - An external interrupt does not acknowledge the interrupt controller and the interrupt remains pending, unless the "acknowledge interrupt on exit" VM-exit control is 1. In such a case, the interrupt controller is acknowledged and the interrupt is no longer pending.
  - The flags L0-L3 in DR7 (bit 0, bit 2, bit 4, and bit 6) are not cleared when a task switch causes a VM exit.
  - If a task switch causes a VM exit, none of the following are modified by the task switch: old task-state segment (TSS); new TSS; old TSS descriptor; new TSS descriptor; RFLAGS.NT¹; or the TR register.
  - No last-exception record is made if the event that would do so directly causes a VM exit.
  - If a machine-check exception causes a VM exit directly, this does not prevent machine-check MSRs from being updated. These are updated by the machine-check event itself and not the resulting machine-check exception.
  - If the logical processor is in an inactive state (see Section 24.4.2) and not executing instructions, some events may be blocked but others may return the logical processor to the active state. Unblocked events may cause VM exits.² If an unblocked event causes a VM exit directly, a return to the active state occurs only after the VM exit completes.³ The VM exit generates any special bus cycle that is normally generated when the active state is entered from that activity state.
• MTF VM exits (see Section 25.5.2 and Section 26.6.8) are not blocked in the HLT activity state. If an MTF VM exit occurs in the HLT activity state, the logical processor returns to the active state only after the VM exit completes. MTF VM exits are blocked in the shutdown state and the wait-for-SIPI state.
• If an event causes a VM exit indirectly, the event does update architectural state:
  - A debug exception updates DR6, DR7, and the IA32_DEBUGCTL MSR. No debug exceptions are considered pending.
  - A page fault updates CR2.
  - An NMI causes subsequent NMIs to be blocked before the VM exit commences.
  - An external interrupt acknowledges the interrupt controller and the interrupt is no longer pending.
  - If the logical processor had been in an inactive state, it enters the active state and, before the VM exit commences, generates any special bus cycle that is normally generated when the active state is entered from that activity state.
  - There is no blocking by STI or by MOV SS when the VM exit commences.

1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For processors that do not support Intel 64 architecture, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to the lower 32 bits of the indicated register.
2. If a VM exit takes the processor from an inactive state resulting from execution of a specific instruction (HLT or MWAIT), the value saved for RIP by that VM exit will reference the following instruction.
3. An exception is made if the logical processor had been inactive due to execution of MWAIT; in this case, it is considered to have become active before the VM exit.


  - Processor state that is normally updated as part of delivery through the IDT (CS, RIP, SS, RSP, RFLAGS) is not modified. However, the incomplete delivery of the event may write to the stack.
  - The treatment of last-exception records is implementation dependent:
    • Some processors make a last-exception record when beginning the delivery of an event through the IDT (before it can encounter a nested exception). Such processors perform this update even if the event encounters a nested exception that causes a VM exit (including the case where nested exceptions lead to a triple fault).
    • Other processors delay making a last-exception record until event delivery has reached some event handler successfully (perhaps after one or more nested exceptions). Such processors do not update the last-exception record if a VM exit or triple fault occurs before an event handler is reached.

  - If the "virtual NMIs" VM-execution control is 1, VM entry injects an NMI, and delivery of the NMI causes a nested exception, double fault, task switch, or APIC access that causes a VM exit, virtual-NMI blocking is in effect before the VM exit commences.
• If a VM exit results from a fault, EPT violation, or EPT misconfiguration encountered during execution of IRET and the "NMI exiting" VM-execution control is 0, any blocking by NMI is cleared before the VM exit commences. However, the previous state of blocking by NMI may be recorded in the VM-exit interruption-information field; see Section 27.2.2.
• If a VM exit results from a fault, EPT violation, or EPT misconfiguration encountered during execution of IRET and the "virtual NMIs" VM-execution control is 1, virtual-NMI blocking is cleared before the VM exit commences. However, the previous state of virtual-NMI blocking may be recorded in the VM-exit interruption-information field; see Section 27.2.2.
• Suppose that a VM exit is caused directly by an x87 FPU Floating-Point Error (#MF) or by any of the following events if the event was unblocked due to (and given priority over) an x87 FPU Floating-Point Error: an INIT signal, an external interrupt, an NMI, an SMI, or a machine-check exception. In these cases, there is no blocking by STI or by MOV SS when the VM exit commences.
• Normally, a last-branch record may be made when an event is delivered through the IDT. However, if such an event results in a VM exit before delivery is complete, no last-branch record is made.
• If a machine-check exception results in a VM exit, processor state is suspect and may result in suspect state being saved to the guest-state area. A VM monitor should consult the RIPV and EIPV bits in the IA32_MCG_STATUS MSR before resuming a guest that caused a VM exit resulting from a machine-check exception.
• If a VM exit results from a fault, APIC access (see Section 29.4), EPT violation, or EPT misconfiguration encountered while executing an instruction, data breakpoints due to that instruction may have been recognized and information about them may be saved in the pending debug exceptions field (see Section 27.3.4).
• The following VM exits are considered to happen after an instruction is executed:
  - VM exits resulting from debug traps (single-step, I/O breakpoints, and data breakpoints).
  - VM exits resulting from debug exceptions whose recognition was delayed by blocking by MOV SS.
  - VM exits resulting from some machine-check exceptions.
  - Trap-like VM exits due to execution of MOV to CR8 when the "CR8-load exiting" VM-execution control is 0 and the "use TPR shadow" VM-execution control is 1 (see Section 29.3). (Such VM exits can occur only from 64-bit mode and thus only on processors that support Intel 64 architecture.)
  - Trap-like VM exits due to execution of WRMSR when the "use MSR bitmaps" VM-execution control is 1; the value of ECX is in the range 800H-8FFH; the bit corresponding to the ECX value in the write bitmap for low MSRs is 0; and the "virtualize x2APIC mode" VM-execution control is 1. See Section 29.5.
  - VM exits caused by APIC-write emulation (see Section 29.4.3.2) that result from APIC accesses as part of instruction execution.


For these VM exits, the instruction's modifications to architectural state complete before the VM exit occurs. Such modifications include those to the logical processor's interruptibility state (see Table 24-3). If there had been blocking by MOV SS, POP SS, or STI before the instruction executed, such blocking is no longer in effect.
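The four conditions stated above for the WRMSR case can be checked with a short Python sketch. The function name and parameters are hypothetical. The sketch assumes that the write bitmap for low MSRs is supplied as a 1-KByte buffer holding one bit per MSR index, with the bit for MSR n stored in bit (n mod 8) of byte (n div 8); that layout is an assumption of the example rather than something stated in this section.

```python
def x2apic_wrmsr_conditions_hold(ecx: int,
                                 write_bitmap_low: bytes,
                                 use_msr_bitmaps: bool,
                                 virtualize_x2apic_mode: bool) -> bool:
    """Check the stated preconditions for the WRMSR case
    (hypothetical helper; bitmap layout is assumed, see lead-in)."""
    if not (use_msr_bitmaps and virtualize_x2apic_mode):
        return False
    if not 0x800 <= ecx <= 0x8FF:
        return False
    byte_index, bit_index = divmod(ecx, 8)
    # The bit for this MSR in the write bitmap for low MSRs must be 0.
    return ((write_bitmap_low[byte_index] >> bit_index) & 1) == 0
```

With an all-zero bitmap, a WRMSR with ECX = 808H satisfies the conditions; setting the corresponding bit in the bitmap makes the check fail.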

27.2.1

Basic VM-Exit Information

Section 24.9.1 defines the basic VM-exit information fields. The following items detail their use.

• Exit reason.
  - Bits 15:0 of this field contain the basic exit reason. It is loaded with a number indicating the general cause of the VM exit. Appendix C lists the numbers used and their meaning.
  - The remainder of the field (bits 31:16) is cleared to 0 (certain SMM VM exits may set some of these bits; see Section 34.15.2.3).¹
• Exit qualification. This field is saved for VM exits due to the following causes: debug exceptions; page-fault exceptions; start-up IPIs (SIPIs); system-management interrupts (SMIs) that arrive immediately after the retirement of I/O instructions; task switches; INVEPT; INVLPG; INVPCID; INVVPID; LGDT; LIDT; LLDT; LTR; SGDT; SIDT; SLDT; STR; VMCLEAR; VMPTRLD; VMPTRST; VMREAD; VMWRITE; VMXON; control-register accesses; MOV DR; I/O instructions; MWAIT; accesses to the APIC-access page (see Section 29.4); EPT violations; EOI virtualization (Section 29.1.4); and APIC-write emulation (see Section 29.4.3.3). For all other VM exits, this field is cleared. The following items provide details:

  - For a debug exception, the exit qualification contains information about the debug exception. The information has the format given in Table 24-4.

  ...

  - For an APIC-access VM exit resulting from a linear access or a guest-physical access to the APIC-access page (see Section 29.4), the exit qualification contains information about the access and has the format given in Table 27-6.²

Table 27-6. Exit Qualification for APIC-Access VM Exits from Linear Accesses and Guest-Physical Accesses

Bit Position(s)   Contents
11:0              If the APIC-access VM exit is due to a linear access, the offset of access within the APIC page.
                  Undefined if the APIC-access VM exit is due to a guest-physical access.
15:12             Access type:
                    0 = linear access for a data read during instruction execution
                    1 = linear access for a data write during instruction execution
                    2 = linear access for an instruction fetch
                    3 = linear access (read or write) during event delivery
                    10 = guest-physical access during event delivery
                    15 = guest-physical access for an instruction fetch or during instruction execution
                    Other values not used
63:16             Reserved (cleared to 0). Bits 63:32 exist only on processors that support Intel 64 architecture.

1. Bit 13 of this field is set on certain VM-entry failures; see Section 26.7.
2. The exit qualification is undefined if the access was part of the logging of a branch record or a precise-event-based-sampling (PEBS) record to the DS save area. It is recommended that software configure the paging structures so that no address in the DS save area translates to an address on the APIC-access page.
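The layout in Table 27-6 can be decoded with a few shifts and masks. The following Python sketch is illustrative only; the function name and the returned dictionary keys are hypothetical. It reports the offset as `None` for guest-physical accesses, where the table says bits 11:0 are undefined.

```python
ACCESS_TYPES = {
    0: "linear access for a data read during instruction execution",
    1: "linear access for a data write during instruction execution",
    2: "linear access for an instruction fetch",
    3: "linear access (read or write) during event delivery",
    10: "guest-physical access during event delivery",
    15: "guest-physical access for an instruction fetch "
        "or during instruction execution",
}
LINEAR_TYPES = {0, 1, 2, 3}


def decode_apic_access_qualification(qual: int) -> dict:
    """Split an APIC-access exit qualification into its Table 27-6
    fields (hypothetical helper)."""
    access_type = (qual >> 12) & 0xF          # bits 15:12
    # Bits 11:0 are meaningful only for linear accesses.
    offset = (qual & 0xFFF) if access_type in LINEAR_TYPES else None
    return {
        "access_type": access_type,
        "description": ACCESS_TYPES.get(access_type, "not used"),
        "offset": offset,
    }
```

For instance, an exit qualification of 1080H decodes to access type 1 (linear data write) at offset 080H within the APIC page.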


Such a VM exit that sets bits 15:12 of the exit qualification to 0000b (data read during instruction execution) or 0001b (data write during instruction execution) sets bit 12 (which distinguishes data read from data write) to the value that would have been stored in bit 1 (W/R) of the page-fault error code had the access caused a page fault instead of an APIC-access VM exit. This implies the following:

• For an APIC-access VM exit caused by the CLFLUSH instruction, the access type is "data read during instruction execution".
• For an APIC-access VM exit caused by the ENTER instruction, the access type is "data write during instruction execution".
• For an APIC-access VM exit caused by the MASKMOVQ instruction or the MASKMOVDQU instruction, the access type is "data write during instruction execution".
• For an APIC-access VM exit caused by the MONITOR instruction, the access type is "data read during instruction execution".

Such a VM exit stores 1 for bit 31 of the IDT-vectoring information field (see Section 27.2.3) if and only if it sets bits 15:12 of the exit qualification to 0011b (linear access during event delivery) or 1010b (guest-physical access during event delivery).

See Section 29.4.4 for further discussion of these instructions and APIC-access VM exits.

For APIC-access VM exits resulting from physical accesses to the APIC-access page (see Section 29.4.6), the exit qualification is undefined.

...

An EPT violation that occurs as a result of execution of a read-modify-write operation sets bit 1 (data write). Whether it also sets bit 0 (data read) is implementation-specific and, for a given implementation, may differ for different kinds of read-modify-write operations.

Bit 12 is undefined in any of the following cases:

• If the "NMI exiting" VM-execution control is 1 and the "virtual NMIs" VM-execution control is 0.
• If the VM exit sets the valid bit in the IDT-vectoring information field (see Section 27.2.3).

Otherwise, bit 12 is defined as follows:

• If the "virtual NMIs" VM-execution control is 0, the EPT violation was caused by a memory access as part of execution of the IRET instruction, and blocking by NMI (see Table 24-3) was in effect before execution of IRET, bit 12 is set to 1.
• If the "virtual NMIs" VM-execution control is 1, the EPT violation was caused by a memory access as part of execution of the IRET instruction, and virtual-NMI blocking was in effect before execution of IRET, bit 12 is set to 1.
• For all other relevant VM exits, bit 12 is cleared to 0.

For VM exits caused as part of EOI virtualization (Section 29.1.4), bits 7:0 of the exit qualification are set to the vector of the virtual interrupt that was dismissed by the EOI virtualization. Bits above bit 7 are cleared.

For APIC-write VM exits (Section 29.4.3.3), bits 11:0 of the exit qualification are set to the page offset of the write access that caused the VM exit.¹ Bits above bit 11 are cleared.

...

1. Execution of WRMSR with ECX = 83FH (self-IPI MSR) can lead to an APIC-write VM exit; the exit qualification for such an APIC-write VM exit is 3F0H.
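The footnote's example (ECX = 83FH giving an exit qualification of 3F0H) is consistent with mapping an x2APIC MSR index to a page offset by taking its low 8 bits and multiplying by 16. The Python sketch below applies that mapping. The function name is hypothetical, and the general formula is an assumption that matches the footnote's single data point rather than something this section spells out.

```python
def x2apic_msr_to_page_offset(ecx: int) -> int:
    """Map an x2APIC MSR index (800H-8FFH) to the APIC-page offset
    that would appear in bits 11:0 of the exit qualification.
    Assumed mapping: offset = (ecx - 800H) * 16."""
    if not 0x800 <= ecx <= 0x8FF:
        raise ValueError("ECX is outside the x2APIC MSR range 800H-8FFH")
    return (ecx - 0x800) << 4


# The self-IPI MSR from the footnote.
print(hex(x2apic_msr_to_page_offset(0x83F)))
```

Under the same assumed mapping, ECX = 808H corresponds to offset 080H.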


27.2.3

Information for VM Exits During Event Delivery

Section 24.9.3 defined fields containing information for VM exits that occur while delivering an event through the IDT and as a result of any of the following cases:¹

• A fault occurs during event delivery and causes a VM exit (because the bit associated with the fault is set to 1 in the exception bitmap).
• A task switch is invoked through a task gate in the IDT. The VM exit occurs due to the task switch only after the initial checks of the task switch pass (see Section 25.4.2).
• Event delivery causes an APIC-access VM exit (see Section 29.4).
• An EPT violation or EPT misconfiguration that occurs during event delivery.

These fields are used for VM exits that occur during delivery of events injected as part of VM entry (see Section 26.5.1.2). ...

27.2.4

Information for VM Exits Due to Instruction Execution

Section 24.9.4 defined fields containing information for VM exits that occur due to instruction execution. (The VM-exit instruction length is also used for VM exits that occur during the delivery of a software interrupt or software exception.) The following items detail their use.

• VM-exit instruction length. This field is used in the following cases:
  - For fault-like VM exits due to attempts to execute one of the following instructions that cause VM exits unconditionally (see Section 25.1.2) or based on the settings of VM-execution controls (see Section 25.1.3): CLTS, CPUID, GETSEC, HLT, IN, INS, INVD, INVEPT, INVLPG, INVPCID, INVVPID, LGDT, LIDT, LLDT, LMSW, LTR, MONITOR, MOV CR, MOV DR, MWAIT, OUT, OUTS, PAUSE, RDMSR, RDPMC, RDRAND, RDTSC, RDTSCP, RSM, SGDT, SIDT, SLDT, STR, VMCALL, VMCLEAR, VMLAUNCH, VMPTRLD, VMPTRST, VMREAD, VMRESUME, VMWRITE, VMXOFF, VMXON, WBINVD, WRMSR, and XSETBV.²
  - For VM exits due to software exceptions (those generated by executions of INT3 or INTO).
  - For VM exits due to faults encountered during delivery of a software interrupt, privileged software exception, or software exception.
  - For VM exits due to attempts to effect a task switch via instruction execution. These are VM exits that produce an exit reason indicating task switch and either of the following:
    • An exit qualification indicating execution of CALL, IRET, or JMP instruction.
    • An exit qualification indicating a task gate in the IDT and an IDT-vectoring information field indicating that the task gate was encountered during delivery of a software interrupt, privileged software exception, or software exception.
  - For APIC-access VM exits resulting from accesses (see Section 29.4) during delivery of a software interrupt, privileged software exception, or software exception.³

...

1. This includes the case in which a VM exit occurs while delivering a software interrupt (INT n) through the 16-bit IVT (interrupt vector table) that is used in virtual-8086 mode with virtual-machine extensions (if RFLAGS.VM = CR4.VME = 1).
2. This item applies only to fault-like VM exits. It does not apply to trap-like VM exits following executions of the MOV to CR8 instruction when the "use TPR shadow" VM-execution control is 1 or to those following executions of the WRMSR instruction when the "virtualize x2APIC mode" VM-execution control is 1.
3. The VM-exit instruction-length field is not defined following APIC-access VM exits resulting from physical accesses (see Section 29.4.6) even if encountered during delivery of a software interrupt, privileged software exception, or software exception.


27.3.3

Saving RIP, RSP, and RFLAGS

The contents of the RIP, RSP, and RFLAGS registers are saved as follows:

• The value saved in the RIP field is determined by the nature and cause of the VM exit:
  - If the VM exit occurs due to an attempt to execute an instruction that causes VM exits unconditionally or that has been configured to cause a VM exit via the VM-execution controls, the value saved references that instruction.
  - If the VM exit is caused by an occurrence of an INIT signal, a start-up IPI (SIPI), or system-management interrupt (SMI), the value saved is that which was in RIP before the event occurred.
  - If the VM exit occurs due to the 1-setting of either the "interrupt-window exiting" VM-execution control or the "NMI-window exiting" VM-execution control, the value saved is that which would be in the register had the VM exit not occurred.
  - If the VM exit is due to an external interrupt, non-maskable interrupt (NMI), or hardware exception (as defined in Section 27.2.2), the value saved is the return pointer that would have been saved (either on the stack had the event been delivered through a trap or interrupt gate,¹ or into the old task-state segment had the event been delivered through a task gate).
  - If the VM exit is due to a triple fault, the value saved is the return pointer that would have been saved (either on the stack had the event been delivered through a trap or interrupt gate, or into the old task-state segment had the event been delivered through a task gate) had delivery of the double fault not encountered the nested exception that caused the triple fault.
  - If the VM exit is due to a software exception (due to an execution of INT3 or INTO), the value saved references the INT3 or INTO instruction that caused that exception.
  - Suppose that the VM exit is due to a task switch that was caused by execution of CALL, IRET, or JMP or by execution of a software interrupt (INT n) or software exception (due to execution of INT3 or INTO) that encountered a task gate in the IDT. The value saved references the instruction that caused the task switch (CALL, IRET, JMP, INT n, INT3, or INTO).
  - Suppose that the VM exit is due to a task switch that was caused by a task gate in the IDT that was encountered for any reason except the direct access by a software interrupt or software exception. The value saved is that which would have been saved in the old task-state segment had the task switch completed normally.
  - If the VM exit is due to an execution of MOV to CR8 or WRMSR that reduced the value of bits 7:4 of VTPR (see Section 29.1.1) below that of the TPR threshold VM-execution control field (see Section 29.1.2), the value saved references the instruction following the MOV to CR8 or WRMSR.
  - If the VM exit was caused by APIC-write emulation (see Section 29.4.3.2) that results from an APIC access as part of instruction execution, the value saved references the instruction following the one whose execution caused the APIC-write emulation.
• The contents of the RSP register are saved into the RSP field.
• With the exception of the resume flag (RF; bit 16), the contents of the RFLAGS register are saved into the RFLAGS field. RFLAGS.RF is saved as follows:
  - If the VM exit is caused directly by an event that would normally be delivered through the IDT, the value saved is that which would appear in the saved RFLAGS image (either that which would be saved on the stack had the event been delivered through a trap or interrupt gate² or into the old task-state segment had the event been delivered through a task gate) had the event been delivered through the IDT. See below for VM exits due to task switches caused by task gates in the IDT.

1. The reference here is to the full value of RIP before any truncation that would occur had the stack width been only 32 bits or 16 bits.
2. The reference here is to the full value of RFLAGS before any truncation that would occur had the stack width been only 32 bits or 16 bits.

  - If the VM exit is caused by a triple fault, the value saved is that which the logical processor would have in RF in the RFLAGS register had the triple fault taken the logical processor to the shutdown state.
  - If the VM exit is caused by a task switch (including one caused by a task gate in the IDT), the value saved is that which would have been saved in the RFLAGS image in the old task-state segment (TSS) had the task switch completed normally without exception.
  - If the VM exit is caused by an attempt to execute an instruction that unconditionally causes VM exits or one that was configured to do so with a VM-execution control, the value saved is 0.¹
  - For APIC-access VM exits and for VM exits caused by EPT violations and EPT misconfigurations, the value saved depends on whether the VM exit occurred during delivery of an event through the IDT:
    • If the VM exit stored 0 for bit 31 of the IDT-vectoring information field (because the VM exit did not occur during delivery of an event through the IDT; see Section 27.2.3), the value saved is 1.
    • If the VM exit stored 1 for bit 31 of the IDT-vectoring information field (because the VM exit did occur during delivery of an event through the IDT), the value saved is the value that would have appeared in the saved RFLAGS image had the event been delivered through the IDT (see above).
  - For all other VM exits, the value saved is the value RFLAGS.RF had before the VM exit occurred.

...

27.3.4

Saving Non-Register State

Information corresponding to guest non-register state is saved as follows:

• The activity-state field is saved with the logical processor's activity state before the VM exit.² See Section 27.1 for details of how events leading to a VM exit may affect the activity state.
• The interruptibility-state field is saved to reflect the logical processor's interruptibility before the VM exit. See Section 27.1 for details of how events leading to a VM exit may affect this state. VM exits that end outside system-management mode (SMM) save bit 2 (blocking by SMI) as 0 regardless of the state of such blocking before the VM exit. Bit 3 (blocking by NMI) is treated specially if the "virtual NMIs" VM-execution control is 1. In this case, the value saved for this field does not indicate the blocking of NMIs but rather the state of virtual-NMI blocking.
• The pending debug exceptions field is saved as clear for all VM exits except the following:
  - A VM exit caused by an INIT signal, a machine-check exception, or a system-management interrupt (SMI).
  - A VM exit with basic exit reason "TPR below threshold",³ "virtualized EOI", "APIC write", or "monitor trap flag".
  - VM exits that are not caused by debug exceptions and that occur while there is MOV-SS blocking of debug exceptions.
  For VM exits that do not clear the field, the value saved is determined as follows:

1. This is true even if RFLAGS.RF was 1 before the instruction was executed. If, in response to such a VM exit, a VM monitor re-enters the guest to re-execute the instruction that caused the VM exit (for example, after clearing the VM-execution control that caused the VM exit), the instruction may encounter a code breakpoint that has already been processed. A VM monitor can avoid this by setting the guest value of RFLAGS.RF to 1 before resuming guest software.
2. If this activity state was an inactive state resulting from execution of a specific instruction (HLT or MWAIT), the value saved for RIP by that VM exit will reference the following instruction.
3. This item includes VM exits that occur as a result of certain VM entries (Section 26.6.7).


  - Each of bits 3:0 may be set if it corresponds to a matched breakpoint. This may be true even if the corresponding breakpoint is not enabled in DR7.
  - Suppose that a VM exit is due to an INIT signal, a machine-check exception, or an SMI; or that a VM exit has basic exit reason "TPR below threshold" or "monitor trap flag". In this case, the value saved sets bits corresponding to the causes of any debug exceptions that were pending at the time of the VM exit. If the VM exit occurs immediately after VM entry, the value saved may match that which was loaded on VM entry (see Section 26.6.3). Otherwise, the following items apply:
    - Bit 12 (enabled breakpoint) is set to 1 if there was at least one matched data or I/O breakpoint that was enabled in DR7. Bit 12 is also set if it had been set on VM entry, causing there to be valid pending debug exceptions (see Section 26.6.3) and the VM exit occurred before those exceptions were either delivered or lost. In other cases, bit 12 is cleared to 0.
    - Bit 14 (BS) is set if RFLAGS.TF = 1 in either of the following cases:
      - IA32_DEBUGCTL.BTF = 0 and the cause of a pending debug exception was the execution of a single instruction.
      - IA32_DEBUGCTL.BTF = 1 and the cause of a pending debug exception was a taken branch.
  - Suppose that a VM exit is due to another reason (but not a debug exception) and occurs while there is MOV-SS blocking of debug exceptions. In this case, the value saved sets bits corresponding to the causes of any debug exceptions that were pending at the time of the VM exit. If the VM exit occurs immediately after VM entry (no instructions were executed in VMX non-root operation), the value saved may match that which was loaded on VM entry (see Section 26.6.3). Otherwise, the following items apply:
    - Bit 12 (enabled breakpoint) is set to 1 if there was at least one matched data or I/O breakpoint that was enabled in DR7. Bit 12 is also set if it had been set on VM entry, causing there to be valid pending debug exceptions (see Section 26.6.3) and the VM exit occurred before those exceptions were either delivered or lost. In other cases, bit 12 is cleared to 0.
    - The setting of bit 14 (BS) is implementation-specific. However, it is not set if RFLAGS.TF = 0 or IA32_DEBUGCTL.BTF = 1.

  - The reserved bits in the field are cleared.
• If the "save VMX-preemption timer value" VM-exit control is 1, the value of the timer is saved into the VMX-preemption timer-value field. This is the value loaded from this field on VM entry as subsequently decremented (see Section 25.5.1). VM exits due to timer expiration save the value 0. Other VM exits may also save the value 0 if the timer expired during VM exit. (If the "save VMX-preemption timer value" VM-exit control is 0, VM exit does not modify the value of the VMX-preemption timer-value field.)
• If the logical processor supports the 1-setting of the "enable EPT" VM-execution control, values are saved into the four PDPTE fields as follows:
  - If the "enable EPT" VM-execution control is 1 and the logical processor was using PAE paging at the time of the VM exit, the PDPTE values currently in use are saved:1
    - The values saved into bits 11:9 of each of the fields are undefined.
    - If the value saved into one of the fields has bit 0 (present) clear, the value saved into bits 63:1 of that field is undefined. That value need not correspond to the value that was loaded by VM entry or to any value that might have been loaded in VMX non-root operation.
    - If the value saved into one of the fields has bit 0 (present) set, the value saved into bits 63:12 of the field is a guest-physical address.

1. A logical processor uses PAE paging if CR0.PG = 1, CR4.PAE = 1, and IA32_EFER.LMA = 0. See Section 4.4 in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A. "Enable EPT" is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VM exit functions as if the "enable EPT" VM-execution control were 0. See Section 24.6.2.


  - If the "enable EPT" VM-execution control is 0 or the logical processor was not using PAE paging at the time of the VM exit, the values saved are undefined.
...

23. Updates to Chapter 28, Volume 3C


Change bars show changes to Chapter 28 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
------------------------------------------------------------------------------------------
...

28.3.3.3

Guidelines for Use of the INVVPID Instruction

The need for VMM software to use the INVVPID instruction depends on how that software is virtualizing memory (e.g., see Section 32.3, "Memory Virtualization").
If EPT is not in use, it is likely that the VMM is virtualizing the guest paging structures. Such a VMM may configure the VMCS so that all or some of the operations that invalidate entries in the TLBs and the paging-structure caches (e.g., the INVLPG instruction) cause VM exits. If VMM software is emulating these operations, it may be necessary to use the INVVPID instruction to ensure that the logical processor's TLBs and paging-structure caches are appropriately invalidated.
Requirements of when software should use the INVVPID instruction depend on the specific algorithm being used for page-table virtualization. The following items provide guidelines for software developers:
• Emulation of the INVLPG instruction may require execution of the INVVPID instruction as follows:
  - The INVVPID type is individual-address (0).
  - The VPID in the INVVPID descriptor is the one assigned to the virtual processor whose execution is being emulated.
  - The linear address in the INVVPID descriptor is that of the operand of the INVLPG instruction being emulated.
• Some instructions invalidate all entries in the TLBs and paging-structure caches except for global translations. An example is the MOV to CR3 instruction. (See Section 4.10, "Caching Translation Information", in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A for details regarding global translations.) Emulation of such an instruction may require execution of the INVVPID instruction as follows:
  - The INVVPID type is single-context-retaining-globals (3).
  - The VPID in the INVVPID descriptor is the one assigned to the virtual processor whose execution is being emulated.
• Some instructions invalidate all entries in the TLBs and paging-structure caches, including those for global translations. An example is the MOV to CR4 instruction if the value of bit 7 (page global enable, PGE) is changing. Emulation of such an instruction may require execution of the INVVPID instruction as follows:
  - The INVVPID type is single-context (1).
  - The VPID in the INVVPID descriptor is the one assigned to the virtual processor whose execution is being emulated.
If EPT is not in use, the logical processor associates all mappings it creates with the current VPID, and it will use such mappings to translate linear addresses. For that reason, a VMM should not use the same VPID for different non-EPT guests that use different page tables. Doing so may result in one guest using translations that pertain to the other.


If EPT is in use, the instructions enumerated above might not be configured to cause VM exits and the VMM might not be emulating them. In that case, executions of the instructions by guest software properly invalidate the required entries in the TLBs and paging-structure caches (see Section 28.3.3.1); execution of the INVVPID instruction is not required.
If EPT is in use, the logical processor associates all mappings it creates with the value of bits 51:12 of the current EPTP. If a VMM uses different EPTP values for different guests, it may use the same VPID for those guests. Doing so cannot result in one guest using translations that pertain to the other.
The following guidelines apply more generally and are appropriate even if EPT is in use:
• As detailed in Section 29.4.5, an access to the APIC-access page might not cause an APIC-access VM exit if software does not properly invalidate information that may be cached from the paging structures. If, at one time, the current VPID on a logical processor was a non-zero value X, it is recommended that software use the INVVPID instruction with the "single-context" INVVPID type and with VPID X in the INVVPID descriptor before a VM entry on the same logical processor that establishes VPID X and either (a) the "virtualize APIC accesses" VM-execution control was changed from 0 to 1; or (b) the value of the APIC-access address was changed.
• Software can use the INVVPID instruction with the "all-context" INVVPID type immediately after execution of the VMXON instruction or immediately prior to execution of the VMXOFF instruction. Either prevents potentially undesired retention of information cached from paging structures between separate uses of VMX operation.
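The emulation guidelines in this section amount to a small lookup from the emulated operation to an INVVPID type. The following Python sketch is illustrative only: the operation names and the function `invvpid_for_emulated_op` are hypothetical, and the value 2 for the all-context type is taken from the INVVPID instruction reference rather than from this section. It only selects the descriptor contents; it does not execute INVVPID.

```python
from enum import IntEnum


class InvvpidType(IntEnum):
    INDIVIDUAL_ADDRESS = 0
    SINGLE_CONTEXT = 1
    ALL_CONTEXT = 2  # not numbered in this section; per the INVVPID reference
    SINGLE_CONTEXT_RETAINING_GLOBALS = 3


def invvpid_for_emulated_op(op, vpid, linear_address=None):
    """Return (type, vpid, linear_address) for a hypothetical emulated operation.

    op is one of the illustrative names "invlpg", "mov_to_cr3", or
    "mov_to_cr4_pge_change"; vpid is the VPID of the virtual processor
    whose execution is being emulated.
    """
    if op == "invlpg":
        if linear_address is None:
            raise ValueError("INVLPG emulation needs the operand's linear address")
        return (InvvpidType.INDIVIDUAL_ADDRESS, vpid, linear_address)
    if op == "mov_to_cr3":
        # Invalidates everything except global translations.
        return (InvvpidType.SINGLE_CONTEXT_RETAINING_GLOBALS, vpid, None)
    if op == "mov_to_cr4_pge_change":
        # Invalidates everything, including global translations.
        return (InvvpidType.SINGLE_CONTEXT, vpid, None)
    raise ValueError(f"no guideline modeled for {op!r}")
```

Whether INVVPID is actually required still depends on the page-table virtualization algorithm in use, as the text notes.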

28.3.3.4

Guidelines for Use of the INVEPT Instruction

The following items provide guidelines for use of the INVEPT instruction to invalidate information cached from the EPT paging structures.
• Software should use the INVEPT instruction with the "single-context" INVEPT type after making any of the following changes to an EPT paging-structure entry (the INVEPT descriptor should contain an EPTP value that references, directly or indirectly, the modified EPT paging structure):
  - Changing any of the privilege bits 2:0 from 1 to 0.
  - Changing the physical address in bits 51:12.
  - Clearing bit 8 (the accessed flag) if accessed and dirty flags for EPT will be enabled.
  - For an EPT PDPTE or an EPT PDE, changing bit 7 (which determines whether the entry maps a page).
  - For the last EPT paging-structure entry used to translate a guest-physical address (an EPT PDPTE with bit 7 set to 1, an EPT PDE with bit 7 set to 1, or an EPT PTE), changing either bits 5:3 or bit 6. (These bits determine the effective memory type of accesses using that EPT paging-structure entry; see Section 28.2.5.)
  - For the last EPT paging-structure entry used to translate a guest-physical address (an EPT PDPTE with bit 7 set to 1, an EPT PDE with bit 7 set to 1, or an EPT PTE), clearing bit 9 (the dirty flag) if accessed and dirty flags for EPT will be enabled.
• Software should use the INVEPT instruction with the "single-context" INVEPT type before a VM entry with an EPTP value X such that X[6] = 1 (accessed and dirty flags for EPT are enabled) if the logical processor had earlier been in VMX non-root operation with an EPTP value Y such that Y[6] = 0 (accessed and dirty flags for EPT are not enabled) and Y[51:12] = X[51:12].
• Software may use the INVEPT instruction after modifying a present EPT paging-structure entry to change any of the privilege bits 2:0 from 0 to 1. Failure to do so may cause an EPT violation that would not otherwise occur. Because an EPT violation invalidates any mappings that would be used by the access that caused the EPT violation (see Section 28.3.3.1), an EPT violation will not recur if the original access is performed again, even if the INVEPT instruction is not executed.


• Because a logical processor does not cache any information derived from EPT paging-structure entries that are not present or misconfigured (see Section 28.2.3.1), it is not necessary to execute INVEPT following modification of an EPT paging-structure entry that had been not present or misconfigured.
• As detailed in Section 29.4.5, an access to the APIC-access page might not cause an APIC-access VM exit if software does not properly invalidate information that may be cached from the EPT paging structures. If EPT was in use on a logical processor at one time with EPTP X, it is recommended that software use the INVEPT instruction with the "single-context" INVEPT type and with EPTP X in the INVEPT descriptor before a VM entry on the same logical processor that enables EPT with EPTP X and either (a) the "virtualize APIC accesses" VM-execution control was changed from 0 to 1; or (b) the value of the APIC-access address was changed.
• Software can use the INVEPT instruction with the "all-context" INVEPT type immediately after execution of the VMXON instruction or immediately prior to execution of the VMXOFF instruction. Either prevents potentially undesired retention of information cached from EPT paging structures between separate uses of VMX operation.
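The per-entry guidelines for INVEPT can be sketched as a predicate over the old and new values of an EPT paging-structure entry. This Python sketch is an illustration under stated assumptions, not a definitive rule set: the function name, the `level` strings, and the `ad_flags_will_be_enabled` parameter are hypothetical; an entry is treated as present when any of bits 2:0 is set; and misconfigured entries are not modeled.

```python
PRIV_MASK = 0x7                            # bits 2:0 (privilege bits)
ADDR_MASK = ((1 << 52) - 1) & ~0xFFF       # bits 51:12 (physical address)
MEMTYPE_MASK = (0x7 << 3) | (1 << 6)       # bits 5:3 and bit 6
BIT_MAPS_PAGE = 1 << 7
BIT_ACCESSED = 1 << 8
BIT_DIRTY = 1 << 9


def invept_recommended(old, new, level, ad_flags_will_be_enabled=False):
    """Return True if single-context INVEPT is recommended after old -> new.

    level is one of the illustrative strings "PML4E", "PDPTE", "PDE", "PTE".
    """
    if (old & PRIV_MASK) == 0:
        return False                       # was not present: nothing is cached
    if old & ~new & PRIV_MASK:
        return True                        # a privilege bit went from 1 to 0
    if (old ^ new) & ADDR_MASK:
        return True                        # physical address changed
    if ad_flags_will_be_enabled and (old & ~new & BIT_ACCESSED):
        return True                        # accessed flag cleared
    if level in ("PDPTE", "PDE") and ((old ^ new) & BIT_MAPS_PAGE):
        return True                        # bit 7 changed
    is_leaf = level == "PTE" or (level in ("PDPTE", "PDE") and bool(old & BIT_MAPS_PAGE))
    if is_leaf:
        if (old ^ new) & MEMTYPE_MASK:
            return True                    # effective memory type changed
        if ad_flags_will_be_enabled and (old & ~new & BIT_DIRTY):
            return True                    # dirty flag cleared
    return False
```

Changing a privilege bit from 0 to 1 returns False here because the text makes INVEPT optional in that case; a VMM may still choose to invalidate to avoid a spurious EPT violation.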

In a system containing more than one logical processor, software must account for the fact that information from an EPT paging-structure entry may be cached on logical processors other than the one that modifies that entry. The process of propagating the changes to a paging-structure entry is commonly referred to as "TLB shootdown". A discussion of TLB shootdown appears in Section 4.10.5, "Propagation of Paging-Structure Changes to Multiple Processors", in the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3A.
...

24. Updates to Chapter 29, Volume 3C


Chapter 29 is a new chapter of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
------------------------------------------------------------------------------------------
...

CHAPTER 29 APIC VIRTUALIZATION AND VIRTUAL INTERRUPTS

The VMCS includes controls that enable the virtualization of interrupts and the Advanced Programmable Interrupt Controller (APIC). When these controls are used, the processor will emulate many accesses to the APIC, track the state of the virtual APIC, and deliver virtual interrupts, all in VMX non-root operation without a VM exit.1
The processor tracks the state of the virtual APIC using a virtual-APIC page identified by the virtual-machine monitor (VMM). Section 29.1 discusses the virtual-APIC page and how the processor uses it to track the state of the virtual APIC.
The following are the VM-execution controls relevant to APIC virtualization and virtual interrupts (see Section 24.6 for information about the locations of these controls):
• Virtual-interrupt delivery. This control enables the evaluation and delivery of pending virtual interrupts (Section 29.2). It also enables the emulation of writes (memory-mapped or MSR-based, as enabled) to the APIC registers that control interrupt prioritization.
• Use TPR shadow. This control enables emulation of accesses to the APIC's task-priority register (TPR) via CR8 (Section 29.3) and, if enabled, via the memory-mapped or MSR-based interfaces.

1. In most cases, it is not necessary for a virtual-machine monitor (VMM) to inject virtual interrupts as part of VM entry.


• Virtualize APIC accesses. This control enables virtualization of memory-mapped accesses to the APIC (Section 29.4) by causing VM exits on accesses to a VMM-specified APIC-access page. Some of the other controls, if set, may cause some of these accesses to be emulated rather than causing VM exits.
• Virtualize x2APIC mode. This control enables virtualization of MSR-based accesses to the APIC (Section 29.5).
• APIC-register virtualization. This control allows memory-mapped and MSR-based reads of most APIC registers (as enabled) by satisfying them from the virtual-APIC page. It directs memory-mapped writes to the APIC-access page to the virtual-APIC page, following them by VM exits for VMM emulation.
• Process posted interrupts. This control allows software to post virtual interrupts in a data structure and send a notification to another logical processor; upon receipt of the notification, the target processor will process the posted interrupts by copying them into the virtual-APIC page (Section 29.6).

"Virtualize APIC accesses", "virtualize x2APIC mode", "virtual-interrupt delivery", and "APIC-register virtualization" are all secondary processor-based VM-execution controls. If bit 31 of the primary processor-based VM-execution controls is 0, the processor operates as if these controls were all 0. See Section 24.6.2.

29.1

VIRTUAL APIC STATE

The virtual-APIC page is a 4-KByte region of memory that the processor uses to virtualize certain accesses to APIC registers and to manage virtual interrupts. The physical address of the virtual-APIC page is the virtual-APIC address, a 64-bit VM-execution control field in the VMCS (see Section 24.6.8). Depending on the settings of certain VM-execution controls, the processor may virtualize certain fields on the virtual-APIC page with functionality analogous to that performed by the local APIC. Section 29.1.1 identifies and defines these fields. Section 29.1.2, Section 29.1.3, Section 29.1.4, and Section 29.1.5 detail the actions taken to virtualize updates to some of these fields.

29.1.1

Virtualized APIC Registers

Depending on the setting of certain VM-execution controls, a logical processor may virtualize certain accesses to APIC registers using the following fields on the virtual-APIC page:
• Virtual task-priority register (VTPR): the 32-bit field located at offset 080H on the virtual-APIC page.
• Virtual processor-priority register (VPPR): the 32-bit field located at offset 0A0H on the virtual-APIC page.
• Virtual end-of-interrupt register (VEOI): the 32-bit field located at offset 0B0H on the virtual-APIC page.
• Virtual interrupt-service register (VISR): the 256-bit value comprising eight non-contiguous 32-bit fields at offsets 100H, 110H, 120H, 130H, 140H, 150H, 160H, and 170H on the virtual-APIC page. Bit x of the VISR is at bit position (x & 1FH) at offset (100H | ((x & E0H) >> 1)). The processor uses only the low 4 bytes of each of the 16-byte fields at offsets 100H, 110H, 120H, 130H, 140H, 150H, 160H, and 170H.
• Virtual interrupt-request register (VIRR): the 256-bit value comprising eight non-contiguous 32-bit fields at offsets 200H, 210H, 220H, 230H, 240H, 250H, 260H, and 270H on the virtual-APIC page. Bit x of the VIRR is at bit position (x & 1FH) at offset (200H | ((x & E0H) >> 1)). The processor uses only the low 4 bytes of each of the 16-byte fields at offsets 200H, 210H, 220H, 230H, 240H, 250H, 260H, and 270H.
• Virtual interrupt-command register (VICR_LO): the 32-bit field located at offset 300H on the virtual-APIC page.
• Virtual interrupt-command register (VICR_HI): the 32-bit field located at offset 310H on the virtual-APIC page.
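The location of a VISR or VIRR bit can be computed directly from the vector number. The Python sketch below is an illustration; the function name `vapic_bit_location` is hypothetical, and it reads the offset expression as a right shift by 1, which is what produces the 16-byte stride between the eight 32-bit fields listed above.

```python
VISR_BASE = 0x100  # first in-service field on the virtual-APIC page
VIRR_BASE = 0x200  # first interrupt-request field on the virtual-APIC page


def vapic_bit_location(base, vector):
    """Return (page_offset, bit_position) of bit `vector` in VISR or VIRR."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be in 0..255")
    # (vector & 0xE0) selects one of eight 32-bit fields; >> 1 turns the
    # multiple of 32 into a multiple of 16 bytes.
    offset = base | ((vector & 0xE0) >> 1)
    bit = vector & 0x1F
    return offset, bit
```

For example, vector 41H falls in the third field: offset 120H, bit 1 for VISR, and offset 220H, bit 1 for VIRR.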


29.1.2

TPR Virtualization

The processor performs TPR virtualization in response to the following operations: (1) virtualization of the MOV to CR8 instruction; (2) virtualization of a write to offset 080H on the APIC-access page; and (3) virtualization of the WRMSR instruction with ECX = 808H. See Section 29.3, Section 29.4.3, and Section 29.5 for details of when TPR virtualization is performed.
The following pseudocode details the behavior of TPR virtualization:
    IF "virtual-interrupt delivery" is 0
        THEN
            IF VTPR[7:4] < TPR threshold (see Section 24.6.8)
                THEN cause VM exit due to TPR below threshold;
            FI;
        ELSE
            perform PPR virtualization (see Section 29.1.3);
            evaluate pending virtual interrupts (see Section 29.2.1);
    FI;
Any VM exit caused by TPR virtualization is trap-like: the instruction causing TPR virtualization completes before the VM exit occurs (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction).
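The TPR-virtualization pseudocode can be modeled as a function that reports which follow-up actions the processor takes. This is a minimal Python sketch, assuming the caller supplies the 4-bit TPR threshold; the function name and the action strings are hypothetical labels, not architectural names.

```python
def tpr_virtualization(vtpr, tpr_threshold, virtual_interrupt_delivery):
    """Return the list of actions that follow an update of VTPR.

    vtpr is the 32-bit VTPR value; tpr_threshold is the 4-bit threshold
    from the VMCS; virtual_interrupt_delivery is the control's setting.
    """
    if not virtual_interrupt_delivery:
        if ((vtpr >> 4) & 0xF) < (tpr_threshold & 0xF):
            return ["vm_exit_tpr_below_threshold"]
        return []
    # With virtual-interrupt delivery enabled, no threshold check is made.
    return ["ppr_virtualization", "evaluate_pending_virtual_interrupts"]
```

Because the VM exit is trap-like, a model that also tracked RIP would advance it past the triggering instruction before reporting the exit.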

29.1.3

PPR Virtualization

The processor performs PPR virtualization in response to the following operations: (1) VM entry; (2) TPR virtualization; and (3) EOI virtualization. See Section 26.3.2.5, Section 29.1.2, and Section 29.1.4 for details of when PPR virtualization is performed.
PPR virtualization uses the guest interrupt status (specifically, SVI; see Section 24.4.2) and VTPR. The following pseudocode details the behavior of PPR virtualization:
    IF VTPR[7:4] ≥ SVI[7:4]
        THEN VPPR ← VTPR & FFH;
        ELSE VPPR ← SVI & F0H;
    FI;
PPR virtualization always clears bytes 3:1 of VPPR.
PPR virtualization is caused only by TPR virtualization, EOI virtualization, and VM entry. Delivery of a virtual interrupt also modifies VPPR, but in a different way (see Section 29.2.2). No other operations modify VPPR, even if they modify SVI, VISR, or VTPR.
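PPR virtualization reduces to a comparison of priority classes. The Python sketch below is illustrative; the function name `ppr_virtualization` is hypothetical, and the comparison is taken to be "greater than or equal", mirroring how a local APIC derives PPR from TPR and the highest in-service vector.

```python
def ppr_virtualization(vtpr, svi):
    """Return the new 32-bit VPPR value computed from VTPR and SVI."""
    if ((vtpr >> 4) & 0xF) >= ((svi >> 4) & 0xF):
        return vtpr & 0xFF   # keep the full low byte of VTPR
    return svi & 0xF0        # keep only the priority class of SVI
```

Both branches mask to at most 8 bits, which is why bytes 3:1 of VPPR always end up clear.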

29.1.4

EOI Virtualization

The processor performs EOI virtualization in response to the following operations: (1) virtualization of a write to offset 0B0H on the APIC-access page; and (2) virtualization of the WRMSR instruction with ECX = 80BH. See Section 29.4.3 and Section 29.5 for details of when EOI virtualization is performed. EOI virtualization occurs only if the "virtual-interrupt delivery" VM-execution control is 1.
EOI virtualization uses and updates the guest interrupt status (specifically, SVI; see Section 24.4.2). The following pseudocode details the behavior of EOI virtualization:
    Vector ← SVI;
    VISR[Vector] ← 0; (see Section 29.1.1 for definition of VISR)
    IF any bits set in VISR
        THEN SVI ← highest index of bit set in VISR
        ELSE SVI ← 0;
    FI;
    perform PPR virtualization (see Section 29.1.3);
    IF EOI_exit_bitmap[Vector] = 1 (see Section 24.6.8 for definition of EOI_exit_bitmap)
        THEN cause EOI-induced VM exit with Vector as exit qualification;
        ELSE evaluate pending virtual interrupts; (see Section 29.2.1)
    FI;
Any VM exit caused by EOI virtualization is trap-like: the instruction causing EOI virtualization completes before the VM exit occurs (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction).
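The EOI-virtualization steps can be traced with a small state model. In this Python sketch, VISR and the EOI-exit bitmap are represented as 256-bit Python integers (bit n corresponds to vector n), which is an assumption made for brevity and not the in-memory layout; the function name and the action labels are hypothetical.

```python
def eoi_virtualization(visr, svi, vtpr, eoi_exit_bitmap):
    """Return (visr, svi, vppr, action) after virtualizing an EOI write."""
    vector = svi
    visr &= ~(1 << vector)                        # VISR[Vector] <- 0
    svi = (visr.bit_length() - 1) if visr else 0  # highest set bit, else 0
    # PPR virtualization using the updated SVI.
    if ((vtpr >> 4) & 0xF) >= ((svi >> 4) & 0xF):
        vppr = vtpr & 0xFF
    else:
        vppr = svi & 0xF0
    if (eoi_exit_bitmap >> vector) & 1:
        action = ("eoi_induced_vm_exit", vector)  # vector is the exit qualification
    else:
        action = ("evaluate_pending_virtual_interrupts", None)
    return visr, svi, vppr, action
```

With vectors 61H and 31H in service and SVI = 61H, an EOI clears bit 61H, sets SVI to 31H, and, for VTPR = 20H, leaves VPPR = 30H.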

29.1.5

Self-IPI Virtualization

The processor performs self-IPI virtualization in response to the following operations: (1) virtualization of a write to offset 300H on the APIC-access page; and (2) virtualization of the WRMSR instruction with ECX = 83FH. See Section 29.4.3 and Section 29.5 for details of when self-IPI virtualization is performed. Self-IPI virtualization occurs only if the "virtual-interrupt delivery" VM-execution control is 1.
Each operation that leads to self-IPI virtualization provides an 8-bit vector (see Section 29.4.3 and Section 29.5). Self-IPI virtualization updates the guest interrupt status (specifically, RVI; see Section 24.4.2). The following pseudocode details the behavior of self-IPI virtualization:
    VIRR[Vector] ← 1; (see Section 29.1.1 for definition of VIRR)
    RVI ← max{RVI, Vector};
    evaluate pending virtual interrupts; (see Section 29.2.1)
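Self-IPI virtualization is a two-field update followed by an evaluation. A minimal Python sketch follows, assuming VIRR is held as a 256-bit Python integer with bit n for vector n; the function name is hypothetical, and the evaluation of pending virtual interrupts is left to the caller.

```python
def self_ipi_virtualization(virr, rvi, vector):
    """Return (virr, rvi) after posting `vector` as a virtual self-IPI."""
    if not 0 <= vector <= 255:
        raise ValueError("vector must be an 8-bit value")
    virr |= 1 << vector      # VIRR[Vector] <- 1
    rvi = max(rvi, vector)   # RVI tracks the highest requested vector
    return virr, rvi
```

A lower-numbered vector is still recorded in VIRR, but it leaves RVI unchanged.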

29.2

EVALUATION AND DELIVERY OF VIRTUAL INTERRUPTS

If the "virtual-interrupt delivery" VM-execution control is 1, certain actions in VMX non-root operation or during VM entry cause the processor to evaluate and deliver virtual interrupts. Evaluation of virtual interrupts is triggered by certain actions that change the state of the virtual-APIC page and is described in Section 29.2.1. This evaluation may result in recognition of a virtual interrupt. Once a virtual interrupt is recognized, the processor may deliver it within VMX non-root operation without a VM exit. Virtual-interrupt delivery is described in Section 29.2.2.

29.2.1

Evaluation of Pending Virtual Interrupts

If the "virtual-interrupt delivery" VM-execution control is 1, certain actions cause a logical processor to evaluate pending virtual interrupts. The following actions cause the evaluation of pending virtual interrupts: VM entry; TPR virtualization; EOI virtualization; self-IPI virtualization; and posted-interrupt processing. See Section 26.3.2.5, Section 29.1.2, Section 29.1.4, Section 29.1.5, and Section 29.6 for details of when evaluation of pending virtual interrupts is performed. No other operations cause the evaluation of pending virtual interrupts, even if they modify RVI or VPPR.
Evaluation of pending virtual interrupts uses the guest interrupt status (specifically, RVI; see Section 24.4.2). The following pseudocode details the evaluation of pending virtual interrupts:
    IF "interrupt-window exiting" is 0 AND
       RVI[7:4] > VPPR[7:4] (see Section 29.1.1 for definition of VPPR)
        THEN recognize a pending virtual interrupt;
        ELSE do not recognize a pending virtual interrupt;
    FI;
Once recognized, a virtual interrupt may be delivered in VMX non-root operation; see Section 29.2.2.
Evaluation of pending virtual interrupts is caused only by VM entry, TPR virtualization, EOI virtualization, self-IPI virtualization, and posted-interrupt processing. No other operations do so, even if they modify RVI or VPPR. The logical processor ceases recognition of a pending virtual interrupt following the delivery of a virtual interrupt.
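The recognition test compares only priority classes (bits 7:4), so a request in the same class as VPPR is not recognized. The following Python sketch shows the predicate; the function name is a hypothetical label.

```python
def pending_virtual_interrupt_recognized(rvi, vppr, interrupt_window_exiting):
    """Return True if evaluation recognizes a pending virtual interrupt."""
    if interrupt_window_exiting:
        return False
    return ((rvi >> 4) & 0xF) > ((vppr >> 4) & 0xF)
```

For example, RVI = 41H is recognized against VPPR = 30H, whereas RVI = 35H is not.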

29.2.2

Virtual-Interrupt Delivery

If a virtual interrupt has been recognized (see Section 29.2.1), it will be delivered at an instruction boundary when the following conditions all hold: (1) RFLAGS.IF = 1; (2) there is no blocking by STI; (3) there is no blocking by MOV SS or by POP SS; and (4) the "interrupt-window exiting" VM-execution control is 0.
Virtual-interrupt delivery has the same priority as that of VM exits due to the 1-setting of the "interrupt-window exiting" VM-execution control.1 Thus, non-maskable interrupts (NMIs) and higher priority events take priority over delivery of a virtual interrupt; delivery of a virtual interrupt takes priority over external interrupts and lower priority events.
Virtual-interrupt delivery wakes a logical processor from the same inactive activity states as would an external interrupt. Specifically, it wakes a logical processor from the states entered using the HLT and MWAIT instructions. It does not wake a logical processor in the shutdown state or in the wait-for-SIPI state.
Virtual-interrupt delivery updates the guest interrupt status (both RVI and SVI; see Section 24.4.2) and delivers an event within VMX non-root operation without a VM exit. The following pseudocode details the behavior of virtual-interrupt delivery (see Section 29.1.1 for definition of VISR, VIRR, and VPPR):
    Vector ← RVI;
    VISR[Vector] ← 1;
    SVI ← Vector;
    VPPR ← Vector & F0H;
    VIRR[Vector] ← 0;
    IF any bits set in VIRR
        THEN RVI ← highest index of bit set in VIRR
        ELSE RVI ← 0;
    FI;
    deliver interrupt with Vector through IDT;
    cease recognition of any pending virtual interrupt;
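The state changes made by virtual-interrupt delivery can be traced with a short model. In this Python sketch, VIRR and VISR are 256-bit Python integers (bit n for vector n), an assumption made for brevity; the function name is hypothetical, and the IDT delivery itself is represented only by returning the vector.

```python
def deliver_virtual_interrupt(virr, visr, rvi):
    """Return (virr, visr, rvi, svi, vppr, vector) after delivery."""
    vector = rvi
    visr |= 1 << vector                           # mark vector in service
    svi = vector
    vppr = vector & 0xF0                          # priority class of the vector
    virr &= ~(1 << vector)                        # request is no longer pending
    rvi = (virr.bit_length() - 1) if virr else 0  # next highest request, else 0
    return virr, visr, rvi, svi, vppr, vector
```

Note that VPPR is set from the delivered vector alone, which differs from PPR virtualization, where VTPR also participates.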

29.3

VIRTUALIZING CR8-BASED TPR ACCESSES

In 64-bit mode, software can access the local APIC's task-priority register (TPR) through CR8. Specifically, software uses the MOV from CR8 and MOV to CR8 instructions (see Section 10.8.6, "Task Priority in IA-32e Mode"). This section describes how these accesses can be virtualized.
A virtual-machine monitor can virtualize these CR8-based APIC accesses by setting the "CR8-load exiting" and "CR8-store exiting" VM-execution controls, ensuring that the accesses cause VM exits (see Section 25.1.3). Alternatively, there are methods for virtualizing some CR8-based APIC accesses without VM exits.

1. A logical processor never recognizes or delivers a virtual interrupt if the interrupt-window exiting VM-execution control is 1. Because of this, the relative priority of virtual-interrupt delivery and VM exits due to the 1-setting of that control is not defined.


Normally, an execution of MOV from CR8 or MOV to CR8 that does not fault or cause a VM exit accesses the APIC's TPR. However, such an execution is treated specially if the "use TPR shadow" VM-execution control is 1. The following items provide details:
• MOV from CR8. The instruction loads bits 3:0 of its destination operand with bits 7:4 of VTPR (see Section 29.1.1). Bits 63:4 of the destination operand are cleared.
• MOV to CR8. The instruction stores bits 3:0 of its source operand into bits 7:4 of VTPR; the remainder of VTPR (bits 3:0 and bits 31:8) is cleared. Following this, the processor performs TPR virtualization (see Section 29.1.2).
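The mapping between CR8 and VTPR is a 4-bit shift in each direction. The Python sketch below illustrates it; the function names are hypothetical, and `mov_to_cr8` assumes an execution that does not fault, so only bits 3:0 of the source operand are considered.

```python
def mov_from_cr8(vtpr):
    """Return the 64-bit destination value: VTPR[7:4] in bits 3:0, rest clear."""
    return (vtpr >> 4) & 0xF


def mov_to_cr8(source):
    """Return the new 32-bit VTPR: source[3:0] in bits 7:4, rest clear."""
    return (source & 0xF) << 4
```

After `mov_to_cr8`, the processor would go on to perform TPR virtualization on the new VTPR value.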

29.4

VIRTUALIZING MEMORY-MAPPED APIC ACCESSES

When the local APIC is in xAPIC mode, software accesses the local APIC's control registers using a memory-mapped interface. Specifically, software uses linear addresses that translate to physical addresses on the page frame indicated by the base address in the IA32_APIC_BASE MSR (see Section 10.4.4, "Local APIC Status and Location"). This section describes how these accesses can be virtualized.
A virtual-machine monitor (VMM) can virtualize these memory-mapped APIC accesses by ensuring that any access to a linear address that would access the local APIC instead causes a VM exit. This could be done using paging or the extended page-table mechanism (EPT). Another way is by using the 1-setting of the "virtualize APIC accesses" VM-execution control.
If the "virtualize APIC accesses" VM-execution control is 1, the logical processor treats specially memory accesses using linear addresses that translate to physical addresses in the 4-KByte APIC-access page.1 (The APIC-access page is identified by the APIC-access address, a field in the VMCS; see Section 24.6.8.)
In general, an access to the APIC-access page causes an APIC-access VM exit. APIC-access VM exits provide a VMM with information about the access causing the VM exit. Section 29.4.1 discusses the priority of APIC-access VM exits.
Certain VM-execution controls enable the processor to virtualize certain accesses to the APIC-access page without a VM exit. In general, this virtualization causes these accesses to be made to the virtual-APIC page instead of the APIC-access page.

NOTES
Unless stated otherwise, this section characterizes only linear accesses to the APIC-access page; an access to the APIC-access page is a linear access if (1) it results from a memory access using a linear address; and (2) the access's physical address is the translation of that linear address. Section 29.4.6 discusses accesses to the APIC-access page that are not linear accesses.
The distinction between the APIC-access page and the virtual-APIC page allows a VMM to share paging structures or EPT paging structures among the virtual processors of a virtual machine (the shared paging structures referencing the same APIC-access address, which appears in the VMCS of all the virtual processors) while giving each virtual processor its own virtual APIC (the VMCS of each virtual processor will have a unique virtual-APIC address).

Section 29.4.2 discusses when and how the processor may virtualize read accesses from the APIC-access page. Section 29.4.3 does the same for write accesses. When virtualizing a write to the APIC-access page, the processor typically takes actions in addition to passing the write through to the virtual-APIC page.

1. Even when addresses are translated using EPT (see Section 28.2), the determination of whether an APIC-access VM exit occurs depends on an access's physical address, not its guest-physical address. Even when CR0.PG = 0, ordinary memory accesses by software use linear addresses; the fact that CR0.PG = 0 means only that the identity translation is used to convert linear addresses to physical (or guest-physical) addresses.


The discussion in those sections uses the concept of an operation within which these memory accesses may occur. For those discussions, an operation can be an iteration of a REP-prefixed string instruction, an execution of any other instruction, or delivery of an event through the IDT. The 1-setting of the virtualize APIC accesses VM-execution control may also affect accesses to the APIC-access page that do not result directly from linear addresses. This is discussed in Section 29.4.6.

29.4.1

Priority of APIC-Access VM Exits

The following items specify the priority of APIC-access VM exits relative to other events.
• The priority of an APIC-access VM exit due to a memory access is below that of any page fault or EPT violation that the access may incur. That is, an access does not cause an APIC-access VM exit if it would cause a page fault or an EPT violation.
• A memory access does not cause an APIC-access VM exit until after the accessed flags are set in the paging structures (including EPT paging structures, if enabled).
• A write access does not cause an APIC-access VM exit until after the dirty flags are set in the appropriate paging structure and EPT paging structure (if enabled).
• With respect to all other events, any APIC-access VM exit due to a memory access has the same priority as any page fault or EPT violation that the access could cause. (This item applies to other events that the access may generate as well as events that may be generated by other accesses by the same operation.)

These principles imply, among other things, that an APIC-access VM exit may occur during the execution of a repeated string instruction (including INS and OUTS). Suppose, for example, that the first n iterations (n may be 0) of such an instruction do not access the APIC-access page and that the next iteration does access that page. As a result, the first n iterations may complete and be followed by an APIC-access VM exit. The instruction pointer saved in the VMCS references the repeated string instruction and the values of the general-purpose registers reflect the completion of n iterations.

29.4.2

Virtualizing Reads from the APIC-Access Page

A read access from the APIC-access page causes an APIC-access VM exit if any of the following are true:
• The "use TPR shadow" VM-execution control is 0.
• The access is for an instruction fetch.
• The access is more than 32 bits in size.
• The access is part of an operation for which the processor has already virtualized a write to the APIC-access page.
• The access is not entirely contained within the low 4 bytes of a naturally aligned 16-byte region. An access meets this containment requirement only if bits 3:2 of the access's address are 0 and the same is true of the address of the highest byte accessed.

If none of the above are true, whether a read access is virtualized depends on the setting of the "APIC-register virtualization" VM-execution control:
• If "APIC-register virtualization" is 0, a read access is virtualized if its page offset is 080H (task priority); otherwise, the access causes an APIC-access VM exit.
• If "APIC-register virtualization" is 1, a read access is virtualized if it is entirely within one of the following ranges of offsets:
  - 020H-023H (local APIC ID);
  - 030H-033H (local APIC version);
  - 080H-083H (task priority);
  - 0B0H-0B3H (end of interrupt);
  - 0D0H-0D3H (logical destination);
  - 0E0H-0E3H (destination format);
  - 0F0H-0F3H (spurious-interrupt vector);
  - 100H-103H, 110H-113H, 120H-123H, 130H-133H, 140H-143H, 150H-153H, 160H-163H, or 170H-173H (in-service);
  - 180H-183H, 190H-193H, 1A0H-1A3H, 1B0H-1B3H, 1C0H-1C3H, 1D0H-1D3H, 1E0H-1E3H, or 1F0H-1F3H (trigger mode);
  - 200H-203H, 210H-213H, 220H-223H, 230H-233H, 240H-243H, 250H-253H, 260H-263H, or 270H-273H (interrupt request);
  - 280H-283H (error status);
  - 300H-303H or 310H-313H (interrupt command);
  - 320H-323H, 330H-333H, 340H-343H, 350H-353H, 360H-363H, or 370H-373H (LVT entries);
  - 380H-383H (initial count); or
  - 3E0H-3E3H (divide configuration).
  In all other cases, the access causes an APIC-access VM exit.

A read access from the APIC-access page that is virtualized returns data from the corresponding page offset on the virtual-APIC page.¹
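The read-side rules in Section 29.4.2 can be sketched as a small decision function. This is only an illustrative model, not processor or VMM code: the function and parameter names are hypothetical, the control settings are passed in as plain booleans, and the sketch assumes the access does not cross the 4-KByte page.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if the access is at most 32 bits and lies entirely within the low
 * 4 bytes of a naturally aligned 16-byte region (bits 3:2 of the first and
 * last byte addresses are both 0). */
static bool apic_access_shape_ok(uint32_t offset, uint32_t size)
{
    if (size == 0 || size > 4)
        return false;
    uint32_t last = offset + size - 1;
    return (offset & 0xCu) == 0 && (last & 0xCu) == 0;
}

/* True if the 16-byte-aligned register offset appears in the list of ranges
 * that can be read without a VM exit when APIC-register virtualization is 1. */
static bool read_range_listed(uint32_t reg)
{
    switch (reg) {
    case 0x020: case 0x030: case 0x080: case 0x0B0:
    case 0x0D0: case 0x0E0: case 0x0F0: case 0x280:
    case 0x300: case 0x310: case 0x380: case 0x3E0:
        return true;
    default:
        /* 100H-170H in-service, 180H-1F0H trigger mode,
         * 200H-270H interrupt request, 320H-370H LVT entries */
        return (reg >= 0x100 && reg <= 0x270) ||
               (reg >= 0x320 && reg <= 0x370);
    }
}

/* Returns true if the read is virtualized, false if it causes an
 * APIC-access VM exit. All names here are illustrative assumptions. */
static bool apic_read_is_virtualized(bool use_tpr_shadow,
                                     bool apic_register_virt,
                                     bool is_instruction_fetch,
                                     bool write_already_virtualized,
                                     uint32_t offset, uint32_t size)
{
    assert(offset < 0x1000 && offset + size <= 0x1000);
    if (!use_tpr_shadow || is_instruction_fetch || write_already_virtualized)
        return false;
    if (!apic_access_shape_ok(offset, size))
        return false;
    if (!apic_register_virt)
        return offset == 0x080;          /* only the task-priority offset */
    return read_range_listed(offset & ~0xFu);
}
```

For example, under these assumptions a 4-byte read at offset 020H is virtualized only when "APIC-register virtualization" is 1, while a 4-byte read at offset 082H always causes a VM exit because its highest byte falls outside the low 4 bytes of the 16-byte region.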

29.4.3

Virtualizing Writes to the APIC-Access Page

Whether a write access to the APIC-access page is virtualized depends on the settings of the VM-execution controls and the page offset of the access. Section 29.4.3.1 details when APIC-write virtualization occurs. Unlike reads, writes to the local APIC have side effects; because of this, virtualization of writes to the APIC-access page may require emulation specific to the access's page offset (which identifies the APIC register being accessed). Section 29.4.3.2 describes this APIC-write emulation. For some page offsets, it is necessary for software to complete the virtualization after a write completes. In these cases, the processor causes an APIC-write VM exit to invoke VMM software. Section 29.4.3.3 discusses APIC-write VM exits.

29.4.3.1

Determining Whether a Write Access is Virtualized

A write access to the APIC-access page causes an APIC-access VM exit if any of the following are true:
• The "use TPR shadow" VM-execution control is 0.
• The access is more than 32 bits in size.
• The access is part of an operation for which the processor has already virtualized a write (with a different page offset or a different size) to the APIC-access page.
• The access is not entirely contained within the low 4 bytes of a naturally aligned 16-byte region. An access meets this containment requirement only if bits 3:2 of the access's address are 0 and the same is true of the address of the highest byte accessed.

If none of the above are true, whether a write access is virtualized depends on the settings of the APIC-register virtualization and virtual-interrupt delivery VM-execution controls:

1. The memory type used for accesses that read from the virtual-APIC page is reported in bits 53:50 of the IA32_VMX_BASIC MSR (see Appendix A.1).


• If the "APIC-register virtualization" and "virtual-interrupt delivery" VM-execution controls are both 0, a write access is virtualized if its page offset is 080H; otherwise, the access causes an APIC-access VM exit.
• If the "APIC-register virtualization" VM-execution control is 0 and the "virtual-interrupt delivery" VM-execution control is 1, a write access is virtualized if its page offset is 080H (task priority), 0B0H (end of interrupt), or 300H (interrupt command low); otherwise, the access causes an APIC-access VM exit.
• If "APIC-register virtualization" is 1, a write access is virtualized if it is entirely within one of the following ranges of offsets:
  - 020H-023H (local APIC ID);
  - 080H-083H (task priority);
  - 0B0H-0B3H (end of interrupt);
  - 0D0H-0D3H (logical destination);
  - 0E0H-0E3H (destination format);
  - 0F0H-0F3H (spurious-interrupt vector);
  - 280H-283H (error status);
  - 300H-303H or 310H-313H (interrupt command);
  - 320H-323H, 330H-333H, 340H-343H, 350H-353H, 360H-363H, or 370H-373H (LVT entries);
  - 380H-383H (initial count); or
  - 3E0H-3E3H (divide configuration).
  In all other cases, the access causes an APIC-access VM exit.

The processor virtualizes a write access to the APIC-access page by writing data to the corresponding page offset on the virtual-APIC page.¹ Following this, the processor performs certain actions after completion of the operation of which the access was a part.² APIC-write emulation is described in Section 29.4.3.2.

29.4.3.2

APIC-Write Emulation

If the processor virtualizes a write access to the APIC-access page, it performs additional actions after completion of an operation of which the access was a part. These actions are called APIC-write emulation. The details of APIC-write emulation depend upon the page offset of the virtualized write access:³
• 080H (task priority). The processor clears bytes 3:1 of VTPR and then causes TPR virtualization (Section 29.1.2).
• 0B0H (end of interrupt). If the "virtual-interrupt delivery" VM-execution control is 1, the processor clears VEOI and then causes EOI virtualization (Section 29.1.4); otherwise, the processor causes an APIC-write VM exit (Section 29.4.3.3).
• 300H (interrupt command low). If the "virtual-interrupt delivery" VM-execution control is 1, the processor checks the value of VICR_LO to determine whether the following are all true:
  - Reserved bits (31:20, 17:16, 13) and bit 12 (delivery status) are all 0.
  - Bits 19:18 (destination shorthand) are 01B (self).
  - Bit 15 (trigger mode) is 0 (edge).
  - Bits 10:8 (delivery mode) are 000B (fixed).
  - Bits 7:4 (the upper half of the vector) are not 0000B.
  If all of the items above are true, the processor performs self-IPI virtualization using the 8-bit vector in byte 0 of VICR_LO (Section 29.1.5). If the "virtual-interrupt delivery" VM-execution control is 0, or if any of the items above are false, the processor causes an APIC-write VM exit (Section 29.4.3.3).
• 310H-313H (interrupt command high). The processor clears bytes 2:0 of VICR_HI. No other virtualization or VM exit occurs.
• Any other page offset. The processor causes an APIC-write VM exit (Section 29.4.3.3).

1. The memory type used for accesses that write to the virtual-APIC page is reported in bits 53:50 of the IA32_VMX_BASIC MSR (see Appendix A.1).
2. Recall that, for the purposes of this discussion, an operation is an iteration of a REP-prefixed string instruction, an execution of any other instruction, or delivery of an event through the IDT.
3. For any operation, there can be only one page offset for which a write access was virtualized. This is because a write access is not virtualized if the processor has already virtualized a write access for the same operation with a different page offset.
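The VICR_LO conditions checked for page offset 300H can be expressed as a bit-mask test. The following is a minimal sketch of those checks only; the function name is hypothetical, and the caller is assumed to have already established that the "virtual-interrupt delivery" VM-execution control is 1.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the value written to VICR_LO satisfies every condition
 * listed for self-IPI virtualization; false means an APIC-write VM exit.
 * Bits not named in the list (for example bit 11 and bit 14) are ignored. */
static bool vicr_lo_allows_self_ipi(uint32_t vicr_lo)
{
    const uint32_t must_be_zero =
        0xFFF00000u |   /* bits 31:20, reserved */
        0x00030000u |   /* bits 17:16, reserved */
        0x00002000u |   /* bit 13, reserved */
        0x00001000u |   /* bit 12, delivery status */
        0x00008000u |   /* bit 15, trigger mode (0 = edge) */
        0x00000700u;    /* bits 10:8, delivery mode (000B = fixed) */

    if ((vicr_lo & must_be_zero) != 0)
        return false;
    if (((vicr_lo >> 18) & 0x3u) != 0x1u)   /* destination shorthand: self */
        return false;
    if (((vicr_lo >> 4) & 0xFu) == 0)       /* upper half of vector non-zero */
        return false;
    return true;
}
```

When the function returns true, the vector used for self-IPI virtualization would be byte 0 of the value, that is `vicr_lo & 0xFF`.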

APIC-write emulation takes priority over system-management interrupts (SMIs), INIT signals, and lower priority events. APIC-write emulation is not blocked if RFLAGS.IF = 0 or by the MOV SS, POP SS, or STI instructions. If an operation causes a fault after a write access to the APIC-access page and before APIC-write emulation, APIC-write emulation occurs after the fault is delivered and before the fault handler can execute. However, if the operation causes a VM exit (perhaps due to a fault), the APIC-write emulation does not occur.

29.4.3.3

APIC-Write VM Exits

In certain cases, VMM software must be invoked to complete the virtualization of a write access to the APIC-access page. In these cases, APIC-write emulation causes an APIC-write VM exit. (Section 29.4.3.2 details the cases that cause APIC-write VM exits.)

APIC-write VM exits are invoked by APIC-write emulation, and APIC-write emulation occurs after an operation that performs a write access to the APIC-access page. Because of this, every APIC-write VM exit is trap-like: it occurs after completion of the operation containing the write access that caused the VM exit (for example, the value of CS:RIP saved in the guest-state area of the VMCS references the next instruction).

The basic exit reason for an APIC-write VM exit is "APIC write." The exit qualification is the page offset of the write access that led to the VM exit. As noted in Section 29.5, execution of WRMSR with ECX = 83FH (self-IPI MSR) can lead to an APIC-write VM exit if the "virtual-interrupt delivery" VM-execution control is 1. The exit qualification for such an APIC-write VM exit is 3F0H.

29.4.4

Instruction-Specific Considerations

Certain instructions that use linear addresses may cause page faults even though they do not use those addresses to access memory. The APIC-virtualization features may affect these instructions as well:
• CLFLUSH. With regard to faulting, the processor operates as if CLFLUSH reads from the linear address in its source operand. If that address translates to one on the APIC-access page, the instruction may cause an APIC-access VM exit. If it does not, it will flush the corresponding cache line on the virtual-APIC page instead of the APIC-access page.
• ENTER. With regard to faulting, the processor operates as if ENTER writes to the byte referenced by the final value of the stack pointer (even though it does not if its size operand is non-zero). If that value translates to an address on the APIC-access page, the instruction may cause an APIC-access VM exit. If it does not, it will cause the APIC-write emulation appropriate to the address's page offset.
• MASKMOVQ and MASKMOVDQU. Even if the instruction's mask is zero, the processor may operate with regard to faulting as if MASKMOVQ or MASKMOVDQU writes to memory (the behavior is implementation-specific). In such a situation, an APIC-access VM exit may occur.
• MONITOR. With regard to faulting, the processor operates as if MONITOR reads from the effective address in RAX. If the resulting linear address translates to one on the APIC-access page, the instruction may cause an APIC-access VM exit.¹ If it does not, it will monitor the corresponding address on the virtual-APIC page instead of the APIC-access page.
• PREFETCH. An execution of the PREFETCH instruction that would result in an access to the APIC-access page does not cause an APIC-access VM exit. Such an access may prefetch data; if so, it is from the corresponding address on the virtual-APIC page.

Virtualization of accesses to the APIC-access page is principally intended for basic instructions such as AND, MOV, OR, TEST, XCHG, and XOR. Use of instructions that normally operate on floating-point, SSE, or AVX registers may cause an APIC-access VM exit unconditionally, regardless of the page offset they access on the APIC-access page.

29.4.5

Issues Pertaining to Page Size and TLB Management

The 1-setting of the "virtualize APIC accesses" VM-execution control is guaranteed to apply only if translations to the APIC-access address use a 4-KByte page. The following items provide details:
• If EPT is not in use, any linear address that translates to an address on the APIC-access page should use a 4-KByte page. Any access to a linear address that translates to the APIC-access page using a larger page may operate as if the "virtualize APIC accesses" VM-execution control were 0.
• If EPT is in use, any guest-physical address that translates to an address on the APIC-access page should use a 4-KByte page. Any access to a linear address that translates to a guest-physical address that in turn translates to the APIC-access page using a larger page may operate as if the "virtualize APIC accesses" VM-execution control were 0. (This is true also for guest-physical accesses to the APIC-access page; see Section 29.4.6.1.)

In addition, software should perform appropriate TLB invalidation when making changes that may affect APIC virtualization. The specifics depend on whether VPIDs or EPT is being used:
• VPIDs being used but EPT not being used. Suppose that there is a VPID that has been used before and that software has since made either of the following changes: (1) set the "virtualize APIC accesses" VM-execution control when it had previously been 0; or (2) changed the paging structures so that some linear address translates to the APIC-access address when it previously did not. In that case, software should execute INVVPID (see "INVVPID Invalidate Translations Based on VPID" in Section 30.3) before performing a VM entry on the same logical processor and with the same VPID.²
• EPT being used. Suppose that there is an EPTP value that has been used before and that software has since made either of the following changes: (1) set the "virtualize APIC accesses" VM-execution control when it had previously been 0; or (2) changed the EPT paging structures so that some guest-physical address translates to the APIC-access address when it previously did not. In that case, software should execute INVEPT (see "INVEPT Invalidate Translations Derived from EPT" in Section 30.3) before performing a VM entry on the same logical processor and with the same EPTP value.³
• Neither VPIDs nor EPT being used. No invalidation is required.

Failure to perform the appropriate TLB invalidation may result in the logical processor operating as if the "virtualize APIC accesses" VM-execution control were 0 in response to accesses to the affected address. (No invalidation is necessary if neither VPIDs nor EPT is being used.)

1. This chapter uses the notation RAX, RIP, RSP, RFLAGS, etc. for processor registers because most processors that support VMX operation also support Intel 64 architecture. For IA-32 processors, this notation refers to the 32-bit forms of those registers (EAX, EIP, ESP, EFLAGS, etc.). In a few places, notation such as EAX is used to refer specifically to the lower 32 bits of the indicated register.
2. INVVPID should use either (1) the all-contexts INVVPID type; (2) the single-context INVVPID type with the VPID in the INVVPID descriptor; or (3) the individual-address INVVPID type with the linear address and the VPID in the INVVPID descriptor.
3. INVEPT should use either (1) the global INVEPT type; or (2) the single-context INVEPT type with the EPTP value in the INVEPT descriptor.


29.4.6

APIC Accesses Not Directly Resulting From Linear Addresses

Section 29.4 has described the treatment of accesses that use linear addresses that translate to addresses on the APIC-access page. This section considers memory accesses that do not result directly from linear addresses.
• An access is called a guest-physical access if (1) CR0.PG = 1;¹ (2) the "enable EPT" VM-execution control is 1;² (3) the access's physical address is the result of an EPT translation; and (4) either (a) the access was not generated by a linear address; or (b) the access's guest-physical address is not the translation of the access's linear address. Section 29.4.6.1 discusses the treatment of guest-physical accesses to the APIC-access page.
• An access is called a physical access if (1) either (a) the "enable EPT" VM-execution control is 0; or (b) the access's physical address is not the result of a translation through the EPT paging structures; and (2) either (a) the access is not generated by a linear address; or (b) the access's physical address is not the translation of its linear address. Section 29.4.6.2 discusses the treatment of physical accesses to the APIC-access page.

29.4.6.1

Guest-Physical Accesses to the APIC-Access Page

Guest-physical accesses include the following when guest-physical addresses are being translated using EPT:
• Reads from the guest paging structures when translating a linear address (such an access uses a guest-physical address that is not the translation of that linear address).
• Loads of the page-directory-pointer-table entries by MOV to CR when the logical processor is using (or that causes the logical processor to use) PAE paging (see Section 4.4).
• Updates to the accessed and dirty flags in the guest paging structures when using a linear address (such an access uses a guest-physical address that is not the translation of that linear address).

Every guest-physical access to an address on the APIC-access page causes an APIC-access VM exit. Such accesses are never virtualized regardless of the page offset. The following items specify the priority relative to other events of APIC-access VM exits caused by guest-physical accesses to the APIC-access page. The priority of an APIC-access VM exit caused by a guest-physical access to memory is below that of any EPT violation that that access may incur. That is, a guest-physical access does not cause an APIC-access VM exit if it would cause an EPT violation. With respect to all other events, any APIC-access VM exit caused by a guest-physical access has the same priority as any EPT violation that the guest-physical access could cause.

29.4.6.2

Physical Accesses to the APIC-Access Page

Physical accesses include the following:
• If the "enable EPT" VM-execution control is 0:
  - Reads from the paging structures when translating a linear address.
  - Loads of the page-directory-pointer-table entries by MOV to CR when the logical processor is using (or that causes the logical processor to use) PAE paging (see Section 4.4).
  - Updates to the accessed and dirty flags in the paging structures.
• If the "enable EPT" VM-execution control is 1, accesses to the EPT paging structures (including updates to the accessed and dirty flags for EPT).

1. If the capability MSR IA32_VMX_CR0_FIXED0 reports that CR0.PG must be 1 in VMX operation, CR0.PG must be 1 unless the unrestricted guest VM-execution control and bit 31 of the primary processor-based VM-execution controls are both 1. 2. Enable EPT is a secondary processor-based VM-execution control. If bit 31 of the primary processor-based VM-execution controls is 0, VMX non-root operation functions as if the enable EPT VM-execution control were 0. See Section 24.6.2.


• Any of the following accesses made by the processor to support VMX non-root operation:
  - Accesses to the VMCS region.
  - Accesses to data structures referenced (directly or indirectly) by physical addresses in VM-execution control fields in the VMCS. These include the I/O bitmaps, the MSR bitmaps, and the virtual-APIC page.
• Accesses that effect transitions into and out of SMM.¹ These include the following:
  - Accesses to SMRAM during SMI delivery and during execution of RSM.
  - Accesses during SMM VM exits (including accesses to MSEG) and during VM entries that return from SMM.

A physical access to the APIC-access page may or may not cause an APIC-access VM exit. If it does not cause an APIC-access VM exit, it may access the APIC-access page or the virtual-APIC page. Physical write accesses to the APIC-access page may or may not cause APIC-write emulation or APIC-write VM exits. The priority of an APIC-access VM exit caused by physical access is not defined relative to other events that the access may cause. It is recommended that software not set the APIC-access address to any of the addresses used by physical memory accesses (identified above). For example, it should not set the APIC-access address to the physical address of any of the active paging structures if the enable EPT VM-execution control is 0.

29.5

VIRTUALIZING MSR-BASED APIC ACCESSES

When the local APIC is in x2APIC mode, software accesses the local APIC's control registers using the MSR interface. Specifically, software uses the RDMSR and WRMSR instructions, setting ECX (identifying the MSR being accessed) to values in the range 800H-8FFH (see Section 10.12, "Extended XAPIC (x2APIC)"). This section describes how these accesses can be virtualized.

A virtual-machine monitor can virtualize these MSR-based APIC accesses by configuring the MSR bitmaps (see Section 24.6.9) to ensure that the accesses cause VM exits (see Section 25.1.3). Alternatively, there are methods for virtualizing some MSR-based APIC accesses without VM exits.

Normally, an execution of RDMSR or WRMSR that does not fault or cause a VM exit accesses the MSR indicated in ECX. However, such an execution treats some values of ECX in the range 800H-8FFH specially if the "virtualize x2APIC mode" VM-execution control is 1. The following items provide details:
• RDMSR. The instruction's behavior depends on the setting of the "APIC-register virtualization" VM-execution control.
  - If the "APIC-register virtualization" VM-execution control is 0, behavior depends upon the value of ECX. If ECX contains 808H (indicating the TPR MSR), the instruction reads the 8 bytes from offset 080H on the virtual-APIC page (VTPR and the 4 bytes above it) into EDX:EAX. This occurs even if the local APIC is not in x2APIC mode (no general-protection fault occurs because the local APIC is not in x2APIC mode). If ECX contains any other value in the range 800H-8FFH, the instruction operates normally. If the local APIC is in x2APIC mode and ECX indicates a readable APIC register, EDX and EAX are loaded with the value of that register. If the local APIC is not in x2APIC mode or ECX does not indicate a readable APIC register, a general-protection fault occurs.
  - If "APIC-register virtualization" is 1 and ECX contains a value in the range 800H-8FFH, the instruction reads the 8 bytes from offset X on the virtual-APIC page into EDX:EAX, where X = (ECX & FFH) << 4. This occurs even if the local APIC is not in x2APIC mode (no general-protection fault occurs because the local APIC is not in x2APIC mode).
• WRMSR. The instruction's behavior depends on the value of ECX and the setting of the "virtual-interrupt delivery" VM-execution control.

1. Technically, these accesses do not occur in VMX non-root operation. They are included here for clarity.


Special processing applies in the following cases: (1) ECX contains 808H (indicating the TPR MSR); (2) ECX contains 80BH (indicating the EOI MSR) and the "virtual-interrupt delivery" VM-execution control is 1; and (3) ECX contains 83FH (indicating the self-IPI MSR) and the "virtual-interrupt delivery" VM-execution control is 1.

If special processing applies, no general-protection exception is produced due to the fact that the local APIC is in xAPIC mode. However, WRMSR does perform the normal reserved-bit checking:
• If ECX contains 808H or 83FH, a general-protection fault occurs if either EDX or EAX[31:8] is non-zero.
• If ECX contains 80BH, a general-protection fault occurs if either EDX or EAX is non-zero.

If there is no fault, WRMSR stores EDX:EAX at offset X on the virtual-APIC page, where X = (ECX & FFH) << 4. Following this, the processor performs an operation depending on the value of ECX:
• If ECX contains 808H, the processor performs TPR virtualization (see Section 29.1.2).
• If ECX contains 80BH, the processor performs EOI virtualization (see Section 29.1.4).
• If ECX contains 83FH, the processor checks the value of EAX[7:4] and proceeds as follows:
  - If the value is non-zero, the logical processor performs self-IPI virtualization with the 8-bit vector in EAX[7:0] (see Section 29.1.5).
  - If the value is zero, the logical processor causes an APIC-write VM exit as if there had been a write access to page offset 3F0H on the APIC-access page (see Section 29.4.3.3).

If special processing does not apply, the instruction operates normally. If the local APIC is in x2APIC mode and ECX indicates a writeable APIC register, the value in EDX:EAX is written to that register. If the local APIC is not in x2APIC mode or ECX does not indicate a writeable APIC register, a general-protection fault occurs.
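The MSR-to-offset mapping and the WRMSR special-processing rules above can be modeled as follows. This is a sketch under stated assumptions: the "virtualize x2APIC mode" VM-execution control is taken to be 1, the enum and function names are hypothetical, and the "normal" path (which depends on the real local APIC state) is reported only as a single outcome.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Page offset on the virtual-APIC page for an x2APIC MSR index:
 * X = (ECX & FFH) << 4. */
static uint32_t x2apic_msr_to_vapic_offset(uint32_t ecx)
{
    return (ecx & 0xFFu) << 4;
}

/* Illustrative outcomes of a WRMSR to an MSR in the range 800H-8FFH. */
enum wrmsr_outcome {
    WRMSR_NORMAL,          /* no special processing applies */
    WRMSR_GP_FAULT,        /* reserved-bit check failed */
    WRMSR_TPR_VIRT,        /* TPR virtualization */
    WRMSR_EOI_VIRT,        /* EOI virtualization */
    WRMSR_SELF_IPI_VIRT,   /* self-IPI virtualization, vector in EAX[7:0] */
    WRMSR_APIC_WRITE_EXIT  /* APIC-write VM exit, qualification 3F0H */
};

static enum wrmsr_outcome classify_wrmsr(uint32_t ecx, uint32_t edx,
                                         uint32_t eax,
                                         bool virt_intr_delivery)
{
    bool special = ecx == 0x808u ||
                   (virt_intr_delivery && (ecx == 0x80Bu || ecx == 0x83Fu));
    if (!special)
        return WRMSR_NORMAL;

    if (ecx == 0x80Bu)                       /* EOI MSR: value must be 0 */
        return (edx != 0 || eax != 0) ? WRMSR_GP_FAULT : WRMSR_EOI_VIRT;

    if (edx != 0 || (eax >> 8) != 0)         /* 808H or 83FH: only EAX[7:0] */
        return WRMSR_GP_FAULT;

    if (ecx == 0x808u)
        return WRMSR_TPR_VIRT;

    /* 83FH: the upper half of the vector decides the outcome. */
    return ((eax >> 4) & 0xFu) != 0 ? WRMSR_SELF_IPI_VIRT
                                    : WRMSR_APIC_WRITE_EXIT;
}
```

Note that the mapping sends the self-IPI MSR (83FH) to offset 3F0H, which matches the exit qualification given for the resulting APIC-write VM exit.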

29.6

POSTED-INTERRUPT PROCESSING

Posted-interrupt processing is a feature by which a processor processes virtual interrupts by recording them as pending on the virtual-APIC page. Posted-interrupt processing is enabled by setting the "process posted interrupts" VM-execution control. The processing is performed in response to the arrival of an interrupt with the posted-interrupt notification vector. In response to such an interrupt, the processor processes virtual interrupts recorded in a data structure called a posted-interrupt descriptor. The posted-interrupt notification vector and the address of the posted-interrupt descriptor are fields in the VMCS; see Section 24.6.8.

If the "process posted interrupts" VM-execution control is 1, a logical processor uses a 64-byte posted-interrupt descriptor located at the posted-interrupt descriptor address. The posted-interrupt descriptor has the following format:

Table 0-1. Format of Posted-Interrupt Descriptor


Bit Position(s) | Name | Description
255:0 | Posted-interrupt requests | One bit for each interrupt vector. There is a posted-interrupt request for a vector if the corresponding bit is 1.
256 | Outstanding notification | If this bit is set, there is a notification outstanding for one or more posted interrupts in bits 255:0.
511:257 | Reserved for software and other agents | These bits may be used by software and by other agents in the system (e.g., chipset). The processor does not modify these bits.

The notation PIR (posted-interrupt requests) refers to the 256 posted-interrupt bits in the posted-interrupt descriptor.
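The descriptor layout in the table can be written down as a C structure. The sketch below is an illustration, not a definition from the manual: the type and function names are hypothetical, descriptor bit n is assumed to map to bit (n mod 64) of 64-bit word (n / 64), and C11 atomics stand in for the locked read-modify-write instructions that software is required to use when modifying the descriptor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software view of the 64-byte posted-interrupt descriptor. */
struct posted_interrupt_desc {
    _Atomic uint64_t pir[4];   /* bits 255:0, one request bit per vector */
    _Atomic uint64_t control;  /* bit 256 (bit 0 of this word) is the
                                  outstanding-notification bit; the rest is
                                  reserved for software and other agents */
    uint64_t reserved[3];      /* bits 511:320, also reserved */
};

_Static_assert(sizeof(struct posted_interrupt_desc) == 64,
               "posted-interrupt descriptor must be 64 bytes");

/* Record a posted-interrupt request for a vector (atomic OR). */
static void pid_post_request(struct posted_interrupt_desc *d, uint8_t vector)
{
    atomic_fetch_or(&d->pir[vector / 64], UINT64_C(1) << (vector % 64));
}

/* Set the outstanding-notification bit; returns its previous value. */
static bool pid_test_and_set_notification(struct posted_interrupt_desc *d)
{
    return (atomic_fetch_or(&d->control, UINT64_C(1)) & 1u) != 0;
}

/* Check whether a request is currently recorded for a vector. */
static bool pid_request_pending(struct posted_interrupt_desc *d, uint8_t vector)
{
    return ((atomic_load(&d->pir[vector / 64]) >> (vector % 64)) & 1u) != 0;
}
```

A sender would typically post the request first and then use the previous value of the notification bit to decide whether a notification interrupt still needs to be sent; that protocol is a software design choice and is not specified in this section.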


Use of the posted-interrupt descriptor differs from that of other data structures that are referenced by pointers in a VMCS. There is a general requirement that software ensure that each such data structure is modified only when no logical processor with a current VMCS that references it is in VMX non-root operation. That requirement does not apply to the posted-interrupt descriptor. There is a requirement, however, that such modifications be done using locked read-modify-write instructions.

If the "external-interrupt exiting" VM-execution control is 1, any unmasked external interrupt causes a VM exit (see Section 25.2). If the "process posted interrupts" VM-execution control is also 1, this behavior is changed and the processor handles an external interrupt as follows:¹
1. The local APIC is acknowledged; this provides the processor core with an interrupt vector, called here the physical vector.
2. If the physical vector equals the posted-interrupt notification vector, the logical processor continues to the next step. Otherwise, a VM exit occurs as it would normally due to an external interrupt; the vector is saved in the VM-exit interruption-information field.
3. The processor clears the outstanding-notification bit in the posted-interrupt descriptor. This is done atomically so as to leave the remainder of the descriptor unmodified (e.g., with a locked AND operation).
4. The processor writes zero to the EOI register in the local APIC; this dismisses the interrupt with the posted-interrupt notification vector from the local APIC.
5. The logical processor performs a logical-OR of PIR into VIRR and clears PIR. No other agent can read or write a PIR bit (or group of bits) between the time it is read (to determine what to OR into VIRR) and when it is cleared.
6. The logical processor sets RVI to be the maximum of the old value of RVI and the highest index of all bits that were set in PIR; if no bit was set in PIR, RVI is left unmodified.
7. The logical processor evaluates pending virtual interrupts as described in Section 29.2.1.

The logical processor performs the steps above in an uninterruptible manner. If step #7 leads to recognition of a virtual interrupt, the processor may deliver that interrupt immediately.

Steps #1 to #7 above occur when the interrupt controller delivers an unmasked external interrupt to the CPU core. This delivery can occur when the logical processor is in the active, HLT, or MWAIT states. If the logical processor had been in the active or MWAIT state before the arrival of the interrupt, it is in the active state following completion of step #7; if it had been in the HLT state, it returns to the HLT state after step #7 (if a pending virtual interrupt was recognized, the logical processor may immediately wake from the HLT state).
...
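Steps 5 and 6 of the sequence above (merging PIR into VIRR and updating RVI) can be illustrated with a small software model. This is a sketch only: the processor performs these steps in hardware, the structure and function names are hypothetical, and an atomic exchange per 64-bit group is used here to model the rule that no other agent may touch a group of PIR bits between the read and the clear.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the relevant virtual-APIC state. */
struct vapic_state {
    uint64_t virr[4];  /* 256-bit virtual interrupt-request register */
    uint8_t rvi;       /* requesting virtual interrupt */
};

/* Index of the highest set bit of a non-zero word. */
static int highest_set_bit(uint64_t w)
{
    int i = -1;
    while (w != 0) {
        w >>= 1;
        i++;
    }
    return i;
}

/* OR PIR into VIRR, clear PIR, and raise RVI if a higher vector was posted.
 * Returns true if any PIR bit was set. */
static bool merge_pir_into_virr(_Atomic uint64_t pir[4], struct vapic_state *v)
{
    int highest = -1;
    for (int i = 0; i < 4; i++) {
        /* Read and clear one group of PIR bits in a single atomic step. */
        uint64_t group = atomic_exchange(&pir[i], 0);
        if (group != 0) {
            v->virr[i] |= group;
            highest = i * 64 + highest_set_bit(group);
        }
    }
    if (highest < 0)
        return false;              /* no bit was set: RVI left unmodified */
    if (highest > v->rvi)
        v->rvi = (uint8_t)highest; /* RVI = max(old RVI, highest PIR index) */
    return true;
}
```

After a merge, the model would go on to evaluate pending virtual interrupts (step 7), which is outside the scope of this sketch.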

25. Updates to Chapter 34, Volume 3C


Change bars show changes to Chapter 34 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------...

1. VM entry ensures that the "process posted interrupts" VM-execution control is 1 only if the "external-interrupt exiting" VM-execution control is also 1. See Section 26.2.1.1.


Table 34-3 SMRAM State Save Map for Intel 64 Architecture

Offset (Added to SMBASE + 8000H) | Register | Writable?
7FF8H | CR0 | No
7FF0H | CR3 | No
7FE8H | RFLAGS | Yes
7FE0H | IA32_EFER | Yes
7FD8H | RIP | Yes
7FD0H | DR6 | No
7FC8H | DR7 | No
7FC4H | TR SEL¹ | No
7FC0H | LDTR SEL¹ | No
7FBCH | GS SEL¹ | No
7FB8H | FS SEL¹ | No
7FB4H | DS SEL¹ | No
7FB0H | SS SEL¹ | No
7FACH | CS SEL¹ | No
7FA8H | ES SEL¹ | No
7FA4H | IO_MISC | No
7F9CH | IO_MEM_ADDR | No
7F94H | RDI | Yes
7F8CH | RSI | Yes
7F84H | RBP | Yes
7F7CH | RSP | Yes
7F74H | RBX | Yes
7F6CH | RDX | Yes
7F64H | RCX | Yes
7F5CH | RAX | Yes
7F54H | R8 | Yes
7F4CH | R9 | Yes
7F44H | R10 | Yes
7F3CH | R11 | Yes
7F34H | R12 | Yes
7F2CH | R13 | Yes
7F24H | R14 | Yes
7F1CH | R15 | Yes
7F1BH-7F04H | Reserved | No
7F02H | Auto HALT Restart Field (Word) | Yes
7F00H | I/O Instruction Restart Field (Word) | Yes
7EFCH | SMM Revision Identifier Field (Doubleword) | No
7EF8H | SMBASE Field (Doubleword) | Yes
7EF7H-7EE4H | Reserved | No
7EE0H | Setting of "enable EPT" VM-execution control | No
7ED8H | Value of EPTP VM-execution control field | No
7ED7H-7EA0H | Reserved | No
7E9CH | LDT Base (lower 32 bits) | No
7E98H | Reserved | No
7E94H | IDT Base (lower 32 bits) | No
7E90H | Reserved | No
7E8CH | GDT Base (lower 32 bits) | No
7E8BH-7E44H | Reserved | No
7E40H | CR4 | No
7E3FH-7DF0H | Reserved | No
7DE8H | IO_RIP | Yes
7DE7H-7DDCH | Reserved | No
7DD8H | IDT Base (Upper 32 bits) | No
7DD4H | LDT Base (Upper 32 bits) | No
7DD0H | GDT Base (Upper 32 bits) | No
7DCFH-7C00H | Reserved | No

NOTE:
1. The two most significant bytes are reserved.
...

34.15.6.4 Saving MSRs


The VM-exit MSR-store area is not used by SMM VM exits that activate the dual-monitor treatment. No MSRs are saved into that area. ...

26. Updates to Chapter 35, Volume 3C


Change bars show changes to Chapter 35 of the Intel 64 and IA-32 Architectures Software Developers Manual, Volume 3C: System Programming Guide, Part 3. -----------------------------------------------------------------------------------------...


Table 35-1 CPUID Signature Values of DisplayFamily_DisplayModel

DisplayFamily_DisplayModel | Processor Families/Processor Number Series
06_3CH, 06_45H | Next Generation Intel Core Processor
06_3EH | Next Generation Intel Xeon Processor E5 Family based on Intel microarchitecture Ivy Bridge
06_3AH | 3rd Generation Intel Core Processor and Intel Xeon Processor E3-1200v2 Product Family based on Intel microarchitecture Ivy Bridge
06_2DH | Intel Xeon Processor E5 Family based on Intel microarchitecture Sandy Bridge
06_2FH | Intel Xeon Processor E7 Family
06_2AH | Intel Xeon Processor E3-1200 Family; 2nd Generation Intel Core i7, i5, i3 Processors 2xxx Series
06_2EH | Intel Xeon processor 7500, 6500 series
06_25H, 06_2CH | Intel Xeon processors 3600, 5600 series, Intel Core i7, i5 and i3 Processors
06_1EH, 06_1FH | Intel Core i7 and i5 Processors
06_1AH | Intel Core i7 Processor, Intel Xeon Processor 3400, 3500, 5500 series
06_1DH | Intel Xeon Processor MP 7400 series
06_17H | Intel Xeon Processor 3100, 3300, 5200, 5400 series, Intel Core 2 Quad processors 8000, 9000 series
06_0FH | Intel Xeon Processor 3000, 3200, 5100, 5300, 7300 series, Intel Core 2 Quad processor 6000 series, Intel Core 2 Extreme 6000 series, Intel Core 2 Duo 4000, 5000, 6000, 7000 series processors, Intel Pentium dual-core processors
06_0EH | Intel Core Duo, Intel Core Solo processors
06_0DH | Intel Pentium M processor
06_1CH, 06_26H, 06_27H | Intel Atom Processor Family
0F_06H | Intel Xeon processor 7100, 5000 Series, Intel Xeon Processor MP, Intel Pentium 4, Pentium D processors
0F_03H, 0F_04H | Intel Xeon Processor, Intel Xeon Processor MP, Intel Pentium 4, Pentium D processors
06_09H | Intel Pentium M processor
0F_02H | Intel Xeon Processor, Intel Xeon Processor MP, Intel Pentium 4 processors
0F_0H, 0F_01H | Intel Xeon Processor, Intel Xeon Processor MP, Intel Pentium 4 processors
06_7H, 06_08H, 06_0AH, 06_0BH | Intel Pentium III Xeon Processor, Intel Pentium III Processor
06_03H, 06_05H | Intel Pentium II Xeon Processor, Intel Pentium II Processor
06_01H | Intel Pentium Pro Processor
05_01H, 05_02H, 05_04H | Intel Pentium Processor, Intel Pentium Processor with MMX Technology

...
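The DisplayFamily_DisplayModel signatures in Table 35-1 are derived from the value that CPUID leaf 01H returns in EAX. The following is a minimal sketch of that derivation, not a definitive implementation: the function name display_family_model and the sample EAX values are illustrative assumptions, and the sketch takes the EAX value as an argument rather than executing CPUID itself.

```python
def display_family_model(eax: int) -> str:
    """Format a CPUID.01H:EAX value as a DisplayFamily_DisplayModel string.

    Hypothetical helper for illustration; the field positions follow the
    CPUID leaf 01H layout (family 11:8, model 7:4, extended model 19:16,
    extended family 27:20).
    """
    family = (eax >> 8) & 0xF
    model = (eax >> 4) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF

    # The extended family is added only when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family

    # The extended model is prepended only for families 06H and 0FH.
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) | model
    else:
        display_model = model

    return f"{display_family:02X}_{display_model:02X}H"


if __name__ == "__main__":
    # Sample EAX values chosen for illustration only.
    for sample in (0x000006FB, 0x000106A5, 0x00000F43):
        print(f"{sample:08X}H -> {display_family_model(sample)}")
```

The returned string can be compared directly against the first column of Table 35-1 to select the matching MSR table. Note that the function always prints two hex digits per field, so it yields 0F_00H where the table prints 0F_0H.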


Table 35-2 IA-32 Architectural MSRs


Register Address Hex | Decimal | Architectural MSR Name and bit fields (Former MSR Name) | MSR/Bit Description | Introduced as Architectural MSR
...
3BH | 59 | IA32_TSC_ADJUST | Per Logical Processor TSC Adjust (R/Write to clear) | If CPUID.(EAX=07H, ECX=0H): EBX[1] = 1
 | | 63:0 | THREAD_ADJUST: Local offset value of the IA32_TSC for a logical processor. Reset value is Zero. A write to IA32_TSC will modify the local offset in IA32_TSC_ADJUST and the content of IA32_TSC, but does not affect the internal invariant TSC hardware. |
...
C000_0080H | | IA32_EFER | Extended Feature Enables | If (CPUID.80000001.EDX.[bit 20] or CPUID.80000001.EDX.[bit 29])
 | | 0 | SYSCALL Enable (R/W) Enables SYSCALL/SYSRET instructions in 64-bit mode. |
 | | 7:1 | Reserved. |
 | | 8 | IA-32e Mode Enable (R/W) Enables IA-32e mode operation. |
 | | 9 | Reserved. |
 | | 10 | IA-32e Mode Active (R) Indicates IA-32e mode is active when set. |
 | | 11 | Execute Disable Bit Enable (R/W) |
 | | 63:12 | Reserved. |
...

...
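The IA32_EFER bit layout listed in Table 35-2 can be illustrated with a small decoder. This is a sketch under stated assumptions: the names EFER_FIELDS and decode_efer are hypothetical, the raw 64-bit value is assumed to have been read elsewhere (for example with RDMSR at address C000_0080H in privileged code), and the sample value is made up for illustration.

```python
# Bit positions taken from the IA32_EFER entry in Table 35-2.
EFER_FIELDS = {
    0: "SYSCALL Enable",
    8: "IA-32e Mode Enable",
    10: "IA-32e Mode Active",
    11: "Execute Disable Bit Enable",
}

# Every bit that is not a named field (7:1, 9, 63:12) is reserved.
EFER_RESERVED_MASK = ~sum(1 << bit for bit in EFER_FIELDS) & (2**64 - 1)


def decode_efer(value: int):
    """Return (flags, reserved_bits) for a raw 64-bit IA32_EFER value.

    flags maps each field name to True/False; reserved_bits is the subset
    of the value that falls in reserved positions (expected to be zero).
    """
    flags = {name: bool((value >> bit) & 1) for bit, name in EFER_FIELDS.items()}
    return flags, value & EFER_RESERVED_MASK


if __name__ == "__main__":
    # Illustrative value with bits 0, 8, 10 and 11 set.
    flags, reserved = decode_efer(0xD01)
    for name, is_set in flags.items():
        print(f"{name}: {int(is_set)}")
    print(f"reserved bits set: {reserved:#x}")
```

Returning the reserved bits separately, rather than silently masking them, makes it easy to notice a value that does not match the documented layout.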

35.2 MSRS IN THE INTEL CORE 2 PROCESSOR FAMILY

Table 35-3 lists model-specific registers (MSRs) for the Intel Core 2 processor family and for Intel Xeon processors based on Intel Core microarchitecture; architectural MSR addresses are also included in Table 35-3. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_0FH (see Table 35-1). MSRs listed in Table 35-2 and Table 35-3 are also supported by processors based on the Enhanced Intel Core microarchitecture, which have a CPUID signature with DisplayFamily_DisplayModel of 06_17H.


The column Shared/Unique applies to multi-core processors based on Intel Core microarchitecture. Unique means that each processor core has a separate MSR, or that a bit field in an MSR governs only one core. Shared means that the MSR, or the bit field in the MSR, governs the operation of both processor cores.

Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
0H | 0 | IA32_P5_MC_ADDR | Unique | See Section 35.14, MSRs in Pentium Processors.
1H | 1 | IA32_P5_MC_TYPE | Unique | See Section 35.14, MSRs in Pentium Processors.
6H | 6 | IA32_MONITOR_FILTER_SIZE | Unique | See Section 8.10.5, Monitor/Mwait Address Range Determination, and Table 35-2.
10H | 16 | IA32_TIME_STAMP_COUNTER | Unique | See Section 17.13, Time-Stamp Counter, and Table 35-2.
17H | 23 | IA32_PLATFORM_ID | Shared | Platform ID (R) See Table 35-2.
17H | 23 | MSR_PLATFORM_ID | Shared | Model Specific Platform ID (R)
 | | 7:0 | | Reserved.
 | | 12:8 | | Maximum Qualified Ratio (R) The maximum allowed bus ratio.
 | | 49:13 | | Reserved.
 | | 52:50 | | See Table 35-2.
 | | 63:53 | | Reserved.
1BH | 27 | IA32_APIC_BASE | Unique | See Section 10.4.4, Local APIC Status and Location, and Table 35-2.
2AH | 42 | MSR_EBL_CR_POWERON | Shared | Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration.
 | | 0 | | Reserved.
 | | 1 | | Data Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled. Note: Not all processors implement R/W.
 | | 2 | | Response Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled. Note: Not all processors implement R/W.
 | | 3 | | MCERR# Drive Enable (R/W) 1 = Enabled; 0 = Disabled. Note: Not all processors implement R/W.
 | | 4 | | Address Parity Enable (R/W) 1 = Enabled; 0 = Disabled. Note: Not all processors implement R/W.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 5 | | Reserved.
 | | 6 | | Reserved.
 | | 7 | | BINIT# Driver Enable (R/W) 1 = Enabled; 0 = Disabled. Note: Not all processors implement R/W.
 | | 8 | | Output Tri-state Enabled (R/O) 1 = Enabled; 0 = Disabled
 | | 9 | | Execute BIST (R/O) 1 = Enabled; 0 = Disabled
 | | 10 | | MCERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled
 | | 11 | | Intel TXT Capable Chipset (R/O) 1 = Present; 0 = Not Present
 | | 12 | | BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled
 | | 13 | | Reserved.
 | | 14 | | 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes
 | | 15 | | Reserved.
 | | 17:16 | | APIC Cluster ID (R/O)
 | | 18 | | N/2 Non-Integer Bus Ratio (R/O) 0 = Integer ratio; 1 = Non-integer ratio
 | | 19 | | Reserved.
 | | 21:20 | | Symmetric Arbitration ID (R/O)
 | | 26:22 | | Integer Bus Frequency Ratio (R/O)
3AH | 58 | IA32_FEATURE_CONTROL | Unique | Control Features in Intel 64 Processor (R/W) See Table 35-2.
 | | 3 | Unique | SMRR Enable (R/WL) When this bit is set and the lock bit is set, the SMRR_PHYS_BASE and SMRR_PHYS_MASK registers are read visible and writeable while in SMM.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
40H | 64 | MSR_LASTBRANCH_0_FROM_IP | Unique | Last Branch Record 0 From IP (R/W) One of four pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction for one of the last four branches, exceptions, or interrupts taken by the processor. See also: Last Branch Record Stack TOS at 1C9H; Section 17.11, Last Branch, Interrupt, and Exception Recording (Pentium M Processors).
41H | 65 | MSR_LASTBRANCH_1_FROM_IP | Unique | Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
42H | 66 | MSR_LASTBRANCH_2_FROM_IP | Unique | Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
43H | 67 | MSR_LASTBRANCH_3_FROM_IP | Unique | Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
60H | 96 | MSR_LASTBRANCH_0_TO_IP | Unique | Last Branch Record 0 To IP (R/W) One of four pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction for one of the last four branches, exceptions, or interrupts taken by the processor.
61H | 97 | MSR_LASTBRANCH_1_TO_IP | Unique | Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
62H | 98 | MSR_LASTBRANCH_2_TO_IP | Unique | Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
63H | 99 | MSR_LASTBRANCH_3_TO_IP | Unique | Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
79H | 121 | IA32_BIOS_UPDT_TRIG | Unique | BIOS Update Trigger Register (W) See Table 35-2.
8BH | 139 | IA32_BIOS_SIGN_ID | Unique | BIOS Update Signature ID (RO) See Table 35-2.
A0H | 160 | MSR_SMRR_PHYSBASE | Unique | System Management Mode Base Address register (WO in SMM) Model-specific implementation of SMRR-like interface, read visible and write only in SMM.
 | | 11:0 | | Reserved.
 | | 31:12 | | PhysBase. SMRR physical Base Address.
 | | 63:32 | | Reserved.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
A1H | 161 | MSR_SMRR_PHYSMASK | Unique | System Management Mode Physical Address Mask register (WO in SMM) Model-specific implementation of SMRR-like interface, read visible and write only in SMM.
 | | 10:0 | | Reserved.
 | | 11 | | Valid. Physical address base and range mask are valid.
 | | 31:12 | | PhysMask. SMRR physical address range mask.
 | | 63:32 | | Reserved.
C1H | 193 | IA32_PMC0 | Unique | Performance Counter Register See Table 35-2.
C2H | 194 | IA32_PMC1 | Unique | Performance Counter Register See Table 35-2.
CDH | 205 | MSR_FSB_FREQ | Shared | Scaleable Bus Speed (RO) This field indicates the intended scaleable bus clock speed for processors based on Intel Core microarchitecture:
 | | 2:0 | | 101B: 100 MHz (FSB 400)
 | | | | 001B: 133 MHz (FSB 533)
 | | | | 011B: 167 MHz (FSB 667)
 | | | | 010B: 200 MHz (FSB 800)
 | | | | 000B: 267 MHz (FSB 1067)
 | | | | 100B: 333 MHz (FSB 1333)
 | | | | 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B.
 | | | | 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B.
 | | | | 266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 000B.
 | | | | 333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 100B.
 | | 63:3 | | Reserved.
CDH | 205 | MSR_FSB_FREQ | Shared | Scaleable Bus Speed (RO) This field indicates the intended scaleable bus clock speed for processors based on Enhanced Intel Core microarchitecture:
 | | 2:0 | | 101B: 100 MHz (FSB 400)
 | | | | 001B: 133 MHz (FSB 533)
 | | | | 011B: 167 MHz (FSB 667)
 | | | | 010B: 200 MHz (FSB 800)
 | | | | 000B: 267 MHz (FSB 1067)
 | | | | 100B: 333 MHz (FSB 1333)
 | | | | 110B: 400 MHz (FSB 1600)


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | | | 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B.
 | | | | 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B.
 | | | | 266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 110B.
 | | | | 333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 111B.
 | | 63:3 | | Reserved.
E7H | 231 | IA32_MPERF | Unique | Maximum Performance Frequency Clock Count (RW) See Table 35-2.
E8H | 232 | IA32_APERF | Unique | Actual Performance Frequency Clock Count (RW) See Table 35-2.
FEH | 254 | IA32_MTRRCAP | Unique | See Table 35-2.
 | | 11 | Unique | SMRR Capability Using MSR 0A0H and 0A1H (R)
11EH | 286 | MSR_BBL_CR_CTL3 | Shared |
 | | 0 | | L2 Hardware Enabled (RO) 1 = The L2 is hardware-enabled; 0 = The L2 is hardware-disabled
 | | 7:1 | | Reserved.
 | | 8 | | L2 Enabled (R/W) 1 = L2 cache has been initialized; 0 = Disabled (default). Until this bit is set, the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input.
 | | 22:9 | | Reserved.
 | | 23 | | L2 Not Present (RO) 0 = L2 Present; 1 = L2 Not Present
 | | 63:24 | | Reserved.
174H | 372 | IA32_SYSENTER_CS | Unique | See Table 35-2.
175H | 373 | IA32_SYSENTER_ESP | Unique | See Table 35-2.
176H | 374 | IA32_SYSENTER_EIP | Unique | See Table 35-2.
179H | 377 | IA32_MCG_CAP | Unique | See Table 35-2.
17AH | 378 | IA32_MCG_STATUS | Unique |


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 0 | | RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted.
 | | 1 | | EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error.
 | | 2 | | MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception.
 | | 63:3 | | Reserved.
186H | 390 | IA32_PERFEVTSEL0 | Unique | See Table 35-2.
187H | 391 | IA32_PERFEVTSEL1 | Unique | See Table 35-2.
198H | 408 | IA32_PERF_STATUS | Shared | See Table 35-2.
198H | 408 | MSR_PERF_STATUS | Shared |
 | | 15:0 | | Current Performance State Value.
 | | 30:16 | | Reserved.
 | | 31 | | XE Operation (R/O). If set, XE operation is enabled. Default is cleared.
 | | 39:32 | | Reserved.
 | | 44:40 | | Maximum Bus Ratio (R/O) Indicates maximum bus ratio configured for the processor.
 | | 45 | | Reserved.
 | | 46 | | Non-Integer Bus Ratio (R/O) Indicates non-integer bus ratio is enabled. Applies to processors based on Enhanced Intel Core microarchitecture.
 | | 63:47 | | Reserved.
199H | 409 | IA32_PERF_CTL | Unique | See Table 35-2.
19AH | 410 | IA32_CLOCK_MODULATION | Unique | Clock Modulation (R/W) See Table 35-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
19BH | 411 | IA32_THERM_INTERRUPT | Unique | Thermal Interrupt Control (R/W) See Table 35-2.
19CH | 412 | IA32_THERM_STATUS | Unique | Thermal Monitor Status (R/W) See Table 35-2.
19DH | 413 | MSR_THERM2_CTL | Unique |
 | | 15:0 | | Reserved.
 | | 16 | | TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle); 1 = Thermal Monitor 2 (thermally-initiated frequency transitions). If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 are enabled.
 | | 63:17 | | Reserved.
1A0H | 416 | IA32_MISC_ENABLE | | Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled.
 | | 0 | | Fast-Strings Enable See Table 35-2.
 | | 2:1 | | Reserved.
 | | 3 | Unique | Automatic Thermal Control Circuit Enable (R/W) See Table 35-2.
 | | 6:4 | | Reserved.
 | | 7 | Shared | Performance Monitoring Available (R) See Table 35-2.
 | | 8 | | Reserved.
 | | 9 | | Hardware Prefetcher Disable (R/W) When set, disables the hardware prefetcher operation on streams of data. When clear (default), enables the prefetch queue. Disabling of the hardware prefetcher may impact processor performance.
 | | 10 | Shared | FERR# Multiplexing Enable (R/W) 1 = FERR# asserted by the processor to indicate a pending break event within the processor; 0 = Indicates compatible FERR# signaling behavior. This bit must be set to 1 to support XAPIC interrupt model usage.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 11 | Shared | Branch Trace Storage Unavailable (RO) See Table 35-2.
 | | 12 | Shared | Precise Event Based Sampling Unavailable (RO) See Table 35-2.
 | | 13 | Shared | TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the pre-determined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermally managed state. The BIOS must enable this feature if the TM2 feature flag (CPUID.1:ECX[8]) is set; if the TM2 feature flag is not set, this feature is not supported and BIOS must not alter the contents of the TM2 bit location. The processor is operating out of specification if both this bit and the TM1 bit are set to 0.
 | | 15:14 | | Reserved.
 | | 16 | Shared | Enhanced Intel SpeedStep Technology Enable (R/W) See Table 35-2.
 | | 18 | Shared | ENABLE MONITOR FSM (R/W) See Table 35-2.
 | | 19 | Shared | Adjacent Cache Line Prefetch Disable (R/W) When set to 1, the processor fetches the cache line that contains data currently required by the processor. When set to 0, the processor fetches cache lines that comprise a cache line pair (128 bytes). Single processor platforms should not set this bit. Server platforms should set or clear this bit based on platform performance observed in validation and testing. BIOS may contain a setup option that controls the setting of this bit.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 20 | Shared | Enhanced Intel SpeedStep Technology Select Lock (R/WO) When set, this bit causes the following bits to become read-only: Enhanced Intel SpeedStep Technology Select Lock (this bit), Enhanced Intel SpeedStep Technology Enable bit. The bit must be set before an Enhanced Intel SpeedStep Technology transition is requested. This bit is cleared on reset.
 | | 21 | | Reserved.
 | | 22 | Shared | Limit CPUID Maxval (R/W) See Table 35-2.
 | | 23 | Shared | xTPR Message Disable (R/W) See Table 35-2.
 | | 33:24 | | Reserved.
 | | 34 | Unique | XD Bit Disable (R/W) See Table 35-2.
 | | 36:35 | | Reserved.
 | | 37 | Unique | DCU Prefetcher Disable (R/W) When set to 1, the DCU L1 data cache prefetcher is disabled. The default value after reset is 0. BIOS may write 1 to disable this feature. The DCU prefetcher is an L1 data cache prefetcher. When the DCU prefetcher detects multiple loads from the same line done within a time limit, the DCU prefetcher assumes the next line will be required. The next line is prefetched into the L1 data cache from memory or L2.
 | | 38 | Shared | IDA Disable (R/W) When set to 1 on processors that support IDA, the Intel Dynamic Acceleration feature (IDA) is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to 0 on processors that support IDA, CPUID.06H: EAX[1] reports that the processor's support of IDA is enabled. Note: the power-on default value is used by BIOS to detect hardware support of IDA. If power-on default value is 1, IDA is available in the processor. If power-on default value is 0, IDA is not available.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 39 | Unique | IP Prefetcher Disable (R/W) When set to 1, the IP prefetcher is disabled. The default value after reset is 0. BIOS may write 1 to disable this feature. The IP prefetcher is an L1 data cache prefetcher. The IP prefetcher looks for sequential load history to determine whether to prefetch the next expected data into the L1 cache from memory or L2.
 | | 63:40 | | Reserved.
1C9H | 457 | MSR_LASTBRANCH_TOS | Unique | Last Branch Record Stack TOS (R) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 40H).
1D9H | 473 | IA32_DEBUGCTL | Unique | Debug Control (R/W) See Table 35-2.
1DDH | 477 | MSR_LER_FROM_LIP | Unique | Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled.
1DEH | 478 | MSR_LER_TO_LIP | Unique | Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled.
200H | 512 | IA32_MTRR_PHYSBASE0 | Unique | See Table 35-2.
201H | 513 | IA32_MTRR_PHYSMASK0 | Unique | See Table 35-2.
202H | 514 | IA32_MTRR_PHYSBASE1 | Unique | See Table 35-2.
203H | 515 | IA32_MTRR_PHYSMASK1 | Unique | See Table 35-2.
204H | 516 | IA32_MTRR_PHYSBASE2 | Unique | See Table 35-2.
205H | 517 | IA32_MTRR_PHYSMASK2 | Unique | See Table 35-2.
206H | 518 | IA32_MTRR_PHYSBASE3 | Unique | See Table 35-2.
207H | 519 | IA32_MTRR_PHYSMASK3 | Unique | See Table 35-2.
208H | 520 | IA32_MTRR_PHYSBASE4 | Unique | See Table 35-2.
209H | 521 | IA32_MTRR_PHYSMASK4 | Unique | See Table 35-2.
20AH | 522 | IA32_MTRR_PHYSBASE5 | Unique | See Table 35-2.
20BH | 523 | IA32_MTRR_PHYSMASK5 | Unique | See Table 35-2.
20CH | 524 | IA32_MTRR_PHYSBASE6 | Unique | See Table 35-2.
20DH | 525 | IA32_MTRR_PHYSMASK6 | Unique | See Table 35-2.
20EH | 526 | IA32_MTRR_PHYSBASE7 | Unique | See Table 35-2.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
20FH | 527 | IA32_MTRR_PHYSMASK7 | Unique | See Table 35-2.
250H | 592 | IA32_MTRR_FIX64K_00000 | Unique | See Table 35-2.
258H | 600 | IA32_MTRR_FIX16K_80000 | Unique | See Table 35-2.
259H | 601 | IA32_MTRR_FIX16K_A0000 | Unique | See Table 35-2.
268H | 616 | IA32_MTRR_FIX4K_C0000 | Unique | See Table 35-2.
269H | 617 | IA32_MTRR_FIX4K_C8000 | Unique | See Table 35-2.
26AH | 618 | IA32_MTRR_FIX4K_D0000 | Unique | See Table 35-2.
26BH | 619 | IA32_MTRR_FIX4K_D8000 | Unique | See Table 35-2.
26CH | 620 | IA32_MTRR_FIX4K_E0000 | Unique | See Table 35-2.
26DH | 621 | IA32_MTRR_FIX4K_E8000 | Unique | See Table 35-2.
26EH | 622 | IA32_MTRR_FIX4K_F0000 | Unique | See Table 35-2.
26FH | 623 | IA32_MTRR_FIX4K_F8000 | Unique | See Table 35-2.
277H | 631 | IA32_PAT | Unique | See Table 35-2.
2FFH | 767 | IA32_MTRR_DEF_TYPE | Unique | Default Memory Types (R/W) See Table 35-2.
309H | 777 | IA32_FIXED_CTR0 | Unique | Fixed-Function Performance Counter Register 0 (R/W) See Table 35-2.
309H | 777 | MSR_PERF_FIXED_CTR0 | Unique | Fixed-Function Performance Counter Register 0 (R/W)
30AH | 778 | IA32_FIXED_CTR1 | Unique | Fixed-Function Performance Counter Register 1 (R/W) See Table 35-2.
30AH | 778 | MSR_PERF_FIXED_CTR1 | Unique | Fixed-Function Performance Counter Register 1 (R/W)
30BH | 779 | IA32_FIXED_CTR2 | Unique | Fixed-Function Performance Counter Register 2 (R/W) See Table 35-2.
30BH | 779 | MSR_PERF_FIXED_CTR2 | Unique | Fixed-Function Performance Counter Register 2 (R/W)
345H | 837 | IA32_PERF_CAPABILITIES | Unique | See Table 35-2. See Section 17.4.1, IA32_DEBUGCTL MSR.
345H | 837 | MSR_PERF_CAPABILITIES | Unique | RO. This applies to processors that do not support architectural perfmon version 2.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 5:0 | | LBR Format. See Table 35-2.
 | | 6 | | PEBS Record Format.
 | | 7 | | PEBSSaveArchRegs. See Table 35-2.
 | | 63:8 | | Reserved.
38DH | 909 | IA32_FIXED_CTR_CTRL | Unique | Fixed-Function-Counter Control Register (R/W) See Table 35-2.
38DH | 909 | MSR_PERF_FIXED_CTR_CTRL | Unique | Fixed-Function-Counter Control Register (R/W)
38EH | 910 | IA32_PERF_GLOBAL_STAUS | Unique | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
38EH | 910 | MSR_PERF_GLOBAL_STAUS | Unique | See Section 18.4.2, Global Counter Control Facilities.
38FH | 911 | IA32_PERF_GLOBAL_CTRL | Unique | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
38FH | 911 | MSR_PERF_GLOBAL_CTRL | Unique | See Section 18.4.2, Global Counter Control Facilities.
390H | 912 | IA32_PERF_GLOBAL_OVF_CTRL | Unique | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
390H | 912 | MSR_PERF_GLOBAL_OVF_CTRL | Unique | See Section 18.4.2, Global Counter Control Facilities.
3F1H | 1009 | MSR_PEBS_ENABLE | Unique | See Table 35-2. See Section 18.4.4, Precise Event Based Sampling (PEBS).
 | | 0 | | Enable PEBS on IA32_PMC0. (R/W)
400H | 1024 | IA32_MC0_CTL | Unique | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
401H | 1025 | IA32_MC0_STATUS | Unique | See Section 15.3.2.2, IA32_MCi_STATUS MSRS.
402H | 1026 | IA32_MC0_ADDR | Unique | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
404H | 1028 | IA32_MC1_CTL | Unique | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
405H | 1029 | IA32_MC1_STATUS | Unique | See Section 15.3.2.2, IA32_MCi_STATUS MSRS.
406H | 1030 | IA32_MC1_ADDR | Unique | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
408H | 1032 | IA32_MC2_CTL | Unique | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
409H | 1033 | IA32_MC2_STATUS | Unique | See Section 15.3.2.2, IA32_MCi_STATUS MSRS.
40AH | 1034 | IA32_MC2_ADDR | Unique | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
40CH | 1036 | MSR_MC4_CTL | Unique | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
40DH | 1037 | MSR_MC4_STATUS | Unique | See Section 15.3.2.2, IA32_MCi_STATUS MSRS.
40EH | 1038 | MSR_MC4_ADDR | Unique | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
410H | 1040 | MSR_MC3_CTL | | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
411H | 1041 | MSR_MC3_STATUS | | See Section 15.3.2.2, IA32_MCi_STATUS MSRS.
412H | 1042 | MSR_MC3_ADDR | Unique | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
413H | 1043 | MSR_MC3_MISC | Unique |
414H | 1044 | MSR_MC5_CTL | Unique |
415H | 1045 | MSR_MC5_STATUS | Unique |
416H | 1046 | MSR_MC5_ADDR | Unique |
417H | 1047 | MSR_MC5_MISC | Unique |
419H | 1049 | MSR_MC6_STATUS | Unique | Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 15.3.2.2, IA32_MCi_STATUS MSRS, and Chapter 23.
480H | 1152 | IA32_VMX_BASIC | Unique | Reporting Register of Basic VMX Capabilities (R/O) See Table 35-2. See Appendix A.1, Basic VMX Information.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
481H | 1153 | IA32_VMX_PINBASED_CTLS | Unique | Capability Reporting Register of Pin-based VM-execution Controls (R/O) See Table 35-2. See Appendix A.3, VM-Execution Controls.
482H | 1154 | IA32_VMX_PROCBASED_CTLS | Unique | Capability Reporting Register of Primary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls.
483H | 1155 | IA32_VMX_EXIT_CTLS | Unique | Capability Reporting Register of VM-exit Controls (R/O) See Table 35-2. See Appendix A.4, VM-Exit Controls.
484H | 1156 | IA32_VMX_ENTRY_CTLS | Unique | Capability Reporting Register of VM-entry Controls (R/O) See Table 35-2. See Appendix A.5, VM-Entry Controls.
485H | 1157 | IA32_VMX_MISC | Unique | Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 35-2. See Appendix A.6, Miscellaneous Data.
486H | 1158 | IA32_VMX_CR0_FIXED0 | Unique | Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0.
487H | 1159 | IA32_VMX_CR0_FIXED1 | Unique | Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0.
488H | 1160 | IA32_VMX_CR4_FIXED0 | Unique | Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4.
489H | 1161 | IA32_VMX_CR4_FIXED1 | Unique | Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4.
48AH | 1162 | IA32_VMX_VMCS_ENUM | Unique | Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 35-2. See Appendix A.9, VMCS Enumeration.
48BH | 1163 | IA32_VMX_PROCBASED_CTLS2 | Unique | Capability Reporting Register of Secondary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
600H | 1536 | IA32_DS_AREA | Unique | DS Save Area (R/W) See Table 35-2. See Section 18.11.4, Debug Store (DS) Mechanism.
107CCH | | MSR_EMON_L3_CTR_CTL0 | Unique | GBUSQ Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107CDH | | MSR_EMON_L3_CTR_CTL1 | Unique | GBUSQ Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107CEH | | MSR_EMON_L3_CTR_CTL2 | Unique | GSNPQ Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107CFH | | MSR_EMON_L3_CTR_CTL3 | Unique | GSNPQ Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107D0H | | MSR_EMON_L3_CTR_CTL4 | Unique | FSB Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107D1H | | MSR_EMON_L3_CTR_CTL5 | Unique | FSB Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107D2H | | MSR_EMON_L3_CTR_CTL6 | Unique | FSB Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107D3H | | MSR_EMON_L3_CTR_CTL7 | Unique | FSB Event Control/Counter Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
107D8H | | MSR_EMON_L3_GL_CTL | Unique | L3/FSB Common Control Register (R/W) Apply to Intel Xeon processor 7400 series (processor signature 06_1D) only. See Section 17.2.2.
C000_0080H | | IA32_EFER | Unique | Extended Feature Enables See Table 35-2.
C000_0081H | | IA32_STAR | Unique | System Call Target Address (R/W) See Table 35-2.
C000_0082H | | IA32_LSTAR | Unique | IA-32e Mode System Call Target Address (R/W) See Table 35-2.


Table 35-3 MSRs in Processors Based on Intel Core Microarchitecture (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
C000_0084H | | IA32_FMASK | Unique | System Call Flag Mask (R/W) See Table 35-2.
C000_0100H | | IA32_FS_BASE | Unique | Map of BASE Address of FS (R/W) See Table 35-2.
C000_0101H | | IA32_GS_BASE | Unique | Map of BASE Address of GS (R/W) See Table 35-2.
C000_0102H | | IA32_KERNEL_GSBASE | Unique | Swap Target of BASE Address of GS (R/W) See Table 35-2.
...
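The MSR_FSB_FREQ entry (CDH) in Table 35-3 maps a three-bit encoding to a bus clock. As a minimal sketch, the lookup below follows the bits 2:0 encoding list given for that entry and uses the fractional values (133.33, 166.67, 266.67, 333.33 MHz) that the table recommends for calculations. The names FSB_CLOCK_MHZ, ENHANCED_ONLY_MHZ and fsb_clock_mhz are hypothetical, and the raw MSR value is assumed to have been read elsewhere.

```python
# Bits 2:0 of MSR_FSB_FREQ -> bus clock in MHz, per the encoding list
# in Table 35-3 (Intel Core microarchitecture).
FSB_CLOCK_MHZ = {
    0b101: 100.0,
    0b001: 133.33,
    0b011: 166.67,
    0b010: 200.0,
    0b000: 266.67,
    0b100: 333.33,
}

# Additional encoding listed only for Enhanced Intel Core microarchitecture.
ENHANCED_ONLY_MHZ = {0b110: 400.0}


def fsb_clock_mhz(msr_value: int, enhanced: bool = False):
    """Return the bus clock in MHz for a raw MSR_FSB_FREQ value.

    Only bits 2:0 are examined; bits 63:3 are reserved and ignored.
    Returns None for an encoding that the table does not list.
    """
    encoding = msr_value & 0b111
    table = dict(FSB_CLOCK_MHZ)
    if enhanced:
        table.update(ENHANCED_ONLY_MHZ)
    return table.get(encoding)


if __name__ == "__main__":
    # Illustrative raw values, not readings from real hardware.
    print(fsb_clock_mhz(0b010))
    print(fsb_clock_mhz(0b110, enhanced=True))
```

Returning None for an unlisted encoding, rather than guessing, leaves the decision about unknown hardware to the caller.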

35.3 MSRS IN THE INTEL ATOM PROCESSOR FAMILY

Table 35-4 lists model-specific registers (MSRs) for the Intel Atom processor family; architectural MSR addresses are also included in Table 35-4. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_1CH (see Table 35-1). The column Shared/Unique applies to logical processors sharing the same core in processors based on the Intel Atom microarchitecture. Unique means that each logical processor has a separate MSR, or that a bit field in an MSR governs only one logical processor. Shared means that the MSR, or the bit field in the MSR, governs the operation of both logical processors in the same core.

Table 35-4 MSRs in Intel Atom Processor Family


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
0H | 0 | IA32_P5_MC_ADDR | Shared | See Section 35.14, MSRs in Pentium Processors.
1H | 1 | IA32_P5_MC_TYPE | Shared | See Section 35.14, MSRs in Pentium Processors.
6H | 6 | IA32_MONITOR_FILTER_SIZE | Unique | See Section 8.10.5, Monitor/Mwait Address Range Determination, and Table 35-2.
10H | 16 | IA32_TIME_STAMP_COUNTER | Shared | See Section 17.13, Time-Stamp Counter, and Table 35-2.
17H | 23 | IA32_PLATFORM_ID | Shared | Platform ID (R) See Table 35-2.
17H | 23 | MSR_PLATFORM_ID | Shared | Model Specific Platform ID (R)
 | | 7:0 | | Reserved.


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


Register Address Hex | Dec | Register Name | Shared/Unique | Bit Description
 | | 12:8 | | Maximum Qualified Ratio (R) The maximum allowed bus ratio.
 | | 63:13 | | Reserved.
1BH | 27 | IA32_APIC_BASE | Unique | See Section 10.4.4, Local APIC Status and Location, and Table 35-2.
2AH | 42 | MSR_EBL_CR_POWERON | Shared | Processor Hard Power-On Configuration (R/W) Enables and disables processor features; (R) indicates current processor configuration.
 | | 0 | | Reserved.
 | | 1 | | Data Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled. Always 0.
 | | 2 | | Response Error Checking Enable (R/W) 1 = Enabled; 0 = Disabled. Always 0.
 | | 3 | | AERR# Drive Enable (R/W) 1 = Enabled; 0 = Disabled. Always 0.
 | | 4 | | BERR# Enable for initiator bus requests (R/W) 1 = Enabled; 0 = Disabled. Always 0.
 | | 5 | | Reserved.
 | | 6 | | Reserved.
 | | 7 | | BINIT# Driver Enable (R/W) 1 = Enabled; 0 = Disabled. Always 0.
 | | 8 | | Reserved.
 | | 9 | | Execute BIST (R/O) 1 = Enabled; 0 = Disabled
 | | 10 | | AERR# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled. Always 0.
 | | 11 | | Reserved.
 | | 12 | | BINIT# Observation Enabled (R/O) 1 = Enabled; 0 = Disabled. Always 0.
 | | 13 | | Reserved.


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
|  |  | 14 |  | 1 MByte Power on Reset Vector (R/O) 1 = 1 MByte; 0 = 4 GBytes |
|  |  | 15 |  | Reserved. |
|  |  | 17:16 |  | APIC Cluster ID (R/O) Always 00B. |
|  |  | 19:18 |  | Reserved. |
|  |  | 21:20 |  | Symmetric Arbitration ID (R/O) Always 00B. |
|  |  | 26:22 |  | Integer Bus Frequency Ratio (R/O) |
| 3AH | 58 | IA32_FEATURE_CONTROL | Unique | Control Features in Intel 64 Processor (R/W) See Table 35-2. |
| 40H | 64 | MSR_LASTBRANCH_0_FROM_IP | Unique | Last Branch Record 0 From IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction for one of the last eight branches, exceptions, or interrupts taken by the processor. See also: Last Branch Record Stack TOS at 1C9H; Section 17.11, Last Branch, Interrupt, and Exception Recording (Pentium M Processors). |
| 41H | 65 | MSR_LASTBRANCH_1_FROM_IP | Unique | Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 42H | 66 | MSR_LASTBRANCH_2_FROM_IP | Unique | Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 43H | 67 | MSR_LASTBRANCH_3_FROM_IP | Unique | Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 44H | 68 | MSR_LASTBRANCH_4_FROM_IP | Unique | Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 45H | 69 | MSR_LASTBRANCH_5_FROM_IP | Unique | Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 46H | 70 | MSR_LASTBRANCH_6_FROM_IP | Unique | Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 47H | 71 | MSR_LASTBRANCH_7_FROM_IP | Unique | Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP. |
| 60H | 96 | MSR_LASTBRANCH_0_TO_IP | Unique | Last Branch Record 0 To IP (R/W) One of eight pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction for one of the last eight branches, exceptions, or interrupts taken by the processor. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
| 61H | 97 | MSR_LASTBRANCH_1_TO_IP | Unique | Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 62H | 98 | MSR_LASTBRANCH_2_TO_IP | Unique | Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 63H | 99 | MSR_LASTBRANCH_3_TO_IP | Unique | Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 64H | 100 | MSR_LASTBRANCH_4_TO_IP | Unique | Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 65H | 101 | MSR_LASTBRANCH_5_TO_IP | Unique | Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 66H | 102 | MSR_LASTBRANCH_6_TO_IP | Unique | Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 67H | 103 | MSR_LASTBRANCH_7_TO_IP | Unique | Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. |
| 79H | 121 | IA32_BIOS_UPDT_TRIG | Unique | BIOS Update Trigger Register (W) See Table 35-2. |
| 8BH | 139 | IA32_BIOS_SIGN_ID | Unique | BIOS Update Signature ID (RO) See Table 35-2. |
| C1H | 193 | IA32_PMC0 | Unique | Performance Counter Register See Table 35-2. |
| C2H | 194 | IA32_PMC1 | Unique | Performance Counter Register See Table 35-2. |
| CDH | 205 | MSR_FSB_FREQ | Shared | Scaleable Bus Speed (RO) This field indicates the intended scaleable bus clock speed for processors based on Intel Atom microarchitecture: |
|  |  | 2:0 |  | 101B: 100 MHz (FSB 400); 001B: 133 MHz (FSB 533); 011B: 167 MHz (FSB 667). 133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B. 166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B. |
|  |  | 63:3 |  | Reserved. |
| E7H | 231 | IA32_MPERF | Unique | Maximum Performance Frequency Clock Count (RW) See Table 35-2. |
| E8H | 232 | IA32_APERF | Unique | Actual Performance Frequency Clock Count (RW) See Table 35-2. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
| FEH | 254 | IA32_MTRRCAP | Shared | Memory Type Range Register (R) See Table 35-2. |
| 11EH | 281 | MSR_BBL_CR_CTL3 | Shared |  |
|  |  | 0 |  | L2 Hardware Enabled (RO) 1 = If the L2 is hardware-enabled; 0 = Indicates if the L2 is hardware-disabled |
|  |  | 7:1 |  | Reserved. |
|  |  | 8 |  | L2 Enabled (R/W) 1 = L2 cache has been initialized; 0 = Disabled (default). Until this bit is set the processor will not respond to the WBINVD instruction or the assertion of the FLUSH# input. |
|  |  | 22:9 |  | Reserved. |
|  |  | 23 |  | L2 Not Present (RO) 0 = L2 Present; 1 = L2 Not Present |
|  |  | 63:24 |  | Reserved. |
| 174H | 372 | IA32_SYSENTER_CS | Unique | See Table 35-2. |
| 175H | 373 | IA32_SYSENTER_ESP | Unique | See Table 35-2. |
| 176H | 374 | IA32_SYSENTER_EIP | Unique | See Table 35-2. |
| 17AH | 378 | IA32_MCG_STATUS | Unique |  |
|  |  | 0 |  | RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. |
|  |  | 1 |  | EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. |
|  |  | 2 |  | MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. |
|  |  | 63:3 |  | Reserved. |
| 186H | 390 | IA32_PERFEVTSEL0 | Unique | See Table 35-2. |
| 187H | 391 | IA32_PERFEVTSEL1 | Unique | See Table 35-2. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


Register Address Hex 198H 198H Dec 408 408 IA32_PERF_STATUS MSR_PERF_STATUS 15:0 39:16 44:40 63:45 199H 19AH 409 410 IA32_PERF_CTL IA32_CLOCK_MODULATION Unique Unique Shared Shared Current Performance State Value. Reserved. Maximum Bus Ratio (R/O) Indicates maximum bus ratio configured for the processor. Reserved. See Table 35-2. Clock Modulation (R/W) See Table 35-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 19BH 19CH 19DH 411 412 413 IA32_THERM_INTERRUPT IA32_THERM_STATUS MSR_THERM2_CTL 15:0 16 Unique Unique Shared Reserved. TM_SELECT (R/W) Mode of automatic thermal monitor: 0 = Thermal Monitor 1 (thermally-initiated on-die modulation of the stop-clock duty cycle) 1 = Thermal Monitor 2 (thermally-initiated frequency transitions) If bit 3 of the IA32_MISC_ENABLE register is cleared, TM_SELECT has no effect. Neither TM1 nor TM2 are enabled. 63:17 1A0 416 IA32_MISC_ENABLE 0 2:1 3 6:4 7 8 Shared Unique Unique Reserved. Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. Fast-Strings Enable See Table 35-2. Reserved. Automatic Thermal Control Circuit Enable (R/W) See Table 35-2. Reserved. Performance Monitoring Available (R) See Table 35-2. Reserved. Thermal Interrupt Control (R/W) See Table 35-2. Thermal Monitor Status (R/W) See Table 35-2. See Table 35-2. Register Name Shared/ Unique Bit Description


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
|  |  | 9 |  | Reserved. |
|  |  | 10 | Shared | FERR# Multiplexing Enable (R/W) 1 = FERR# asserted by the processor to indicate a pending break event within the processor; 0 = Indicates compatible FERR# signaling behavior. This bit must be set to 1 to support XAPIC interrupt model usage. |
|  |  | 11 | Shared | Branch Trace Storage Unavailable (RO) See Table 35-2. |
|  |  | 12 | Shared | Precise Event Based Sampling Unavailable (RO) See Table 35-2. |
|  |  | 13 | Shared | TM2 Enable (R/W) When this bit is set (1) and the thermal sensor indicates that the die temperature is at the pre-determined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermally managed state. The BIOS must enable this feature if the TM2 feature flag (CPUID.1:ECX[8]) is set; if the TM2 feature flag is not set, this feature is not supported and BIOS must not alter the contents of the TM2 bit location. The processor is operating out of specification if both this bit and the TM1 bit are set to 0. |
|  |  | 15:14 |  | Reserved. |
|  |  | 16 | Shared | Enhanced Intel SpeedStep Technology Enable (R/W) See Table 35-2. |
|  |  | 18 | Shared | ENABLE MONITOR FSM (R/W) See Table 35-2. |
|  |  | 19 |  | Reserved. |
|  |  | 20 | Shared | Enhanced Intel SpeedStep Technology Select Lock (R/WO) When set, this bit causes the following bits to become read-only: Enhanced Intel SpeedStep Technology Select Lock (this bit), Enhanced Intel SpeedStep Technology Enable bit. The bit must be set before an Enhanced Intel SpeedStep Technology transition is requested. This bit is cleared on reset. |
|  |  | 21 |  | Reserved. |
|  |  | 22 | Unique | Limit CPUID Maxval (R/W) See Table 35-2. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


Register Address Hex Dec 23 33:24 34 63:35 1C9H 457 MSR_LASTBRANCH_TOS Unique Unique Shared xTPR Message Disable (R/W) See Table 35-2. Reserved. XD Bit Disable (R/W) See Table 35-2. Reserved. Last Branch Record Stack TOS (R) Contains an index (bits 0-2) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 40H). 1D9H 1DDH 473 477 IA32_DEBUGCTL MSR_LER_FROM_LIP Unique Unique Debug Control (R/W) See Table 35-2. Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Unique Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 200H 201H 202H 203H 204H 205H 206H 207H 208H 209H 20AH 20BH 20CH 20DH 20EH 20FH 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 IA32_MTRR_PHYSBASE0 IA32_MTRR_PHYSMASK0 IA32_MTRR_PHYSBASE1 IA32_MTRR_PHYSMASK1 IA32_MTRR_PHYSBASE2 IA32_MTRR_PHYSMASK2 IA32_MTRR_PHYSBASE3 IA32_MTRR_PHYSMASK3 IA32_MTRR_PHYSBASE4 IA32_MTRR_PHYSMASK4 IA32_MTRR_PHYSBASE5 IA32_MTRR_PHYSMASK5 IA32_MTRR_PHYSBASE6 IA32_MTRR_PHYSMASK6 IA32_MTRR_PHYSBASE7 IA32_MTRR_PHYSMASK7 Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. Register Name Shared/ Unique Bit Description


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


Register Address Hex 250H 258H 259H 268H 269H 26AH 26BH 26CH 26DH 26EH 26FH 277H 309H 30AH 30BH 345H 38DH 38EH 38FH 390H 3F1H Dec 592 600 601 616 617 618 619 620 621 622 623 631 777 778 779 837 909 910 911 912 1009 IA32_MTRR_FIX64K_ 00000 IA32_MTRR_FIX16K_ 80000 IA32_MTRR_FIX16K_ A0000 IA32_MTRR_FIX4K_C0000 IA32_MTRR_FIX4K_C8000 IA32_MTRR_FIX4K_D0000 IA32_MTRR_FIX4K_D8000 IA32_MTRR_FIX4K_E0000 IA32_MTRR_FIX4K_E8000 IA32_MTRR_FIX4K_F0000 IA32_MTRR_FIX4K_F8000 IA32_PAT IA32_FIXED_CTR0 IA32_FIXED_CTR1 IA32_FIXED_CTR2 IA32_PERF_CAPABILITIES IA32_FIXED_CTR_CTRL IA32_PERF_GLOBAL_ STAUS IA32_PERF_GLOBAL_CTRL IA32_PERF_GLOBAL_OVF_ CTRL MSR_PEBS_ENABLE 0 400H 401H 1024 1025 IA32_MC0_CTL IA32_MC0_STATUS Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Shared Unique Unique Unique Unique Shared Unique Unique Unique Unique Unique See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. Fixed-Function Performance Counter Register 0 (R/W) See Table 35-2. Fixed-Function Performance Counter Register 1 (R/W) See Table 35-2. Fixed-Function Performance Counter Register 2 (R/W) See Table 35-2. See Table 35-2. See Section 17.4.1, IA32_DEBUGCTL MSR. Fixed-Function-Counter Control Register (R/W) See Table 35-2. See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. See Table 35-2. See Section 18.4.4, Precise Event Based Sampling (PEBS). Enable PEBS on IA32_PMC0. (R/W) See Section 15.3.2.1, IA32_MCi_CTL MSRs. See Section 15.3.2.2, IA32_MCi_STATUS MSRS. Register Name Shared/ Unique Bit Description


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
| 402H | 1026 | IA32_MC0_ADDR | Shared | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. |
| 404H | 1028 | IA32_MC1_CTL | Shared | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 405H | 1029 | IA32_MC1_STATUS | Shared | See Section 15.3.2.2, IA32_MCi_STATUS MSRS. |
| 408H | 1032 | IA32_MC2_CTL | Shared | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 409H | 1033 | IA32_MC2_STATUS | Shared | See Section 15.3.2.2, IA32_MCi_STATUS MSRS. |
| 40AH | 1034 | IA32_MC2_ADDR | Shared | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. |
| 40CH | 1036 | MSR_MC3_CTL | Shared | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 40DH | 1037 | MSR_MC3_STATUS | Shared | See Section 15.3.2.2, IA32_MCi_STATUS MSRS. |
| 40EH | 1038 | MSR_MC3_ADDR | Shared | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. |
| 410H | 1040 | MSR_MC4_CTL | Shared | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 411H | 1041 | MSR_MC4_STATUS | Shared | See Section 15.3.2.2, IA32_MCi_STATUS MSRS. |
| 412H | 1042 | MSR_MC4_ADDR | Shared | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. |
| 480H | 1152 | IA32_VMX_BASIC | Unique | Reporting Register of Basic VMX Capabilities (R/O) See Table 35-2. See Appendix A.1, Basic VMX Information. |
| 481H | 1153 | IA32_VMX_PINBASED_CTLS | Unique | Capability Reporting Register of Pin-based VM-execution Controls (R/O) See Table 35-2. See Appendix A.3, VM-Execution Controls. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
| 482H | 1154 | IA32_VMX_PROCBASED_CTLS | Unique | Capability Reporting Register of Primary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls. |
| 483H | 1155 | IA32_VMX_EXIT_CTLS | Unique | Capability Reporting Register of VM-exit Controls (R/O) See Table 35-2. See Appendix A.4, VM-Exit Controls. |
| 484H | 1156 | IA32_VMX_ENTRY_CTLS | Unique | Capability Reporting Register of VM-entry Controls (R/O) See Table 35-2. See Appendix A.5, VM-Entry Controls. |
| 485H | 1157 | IA32_VMX_MISC | Unique | Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 35-2. See Appendix A.6, Miscellaneous Data. |
| 486H | 1158 | IA32_VMX_CR0_FIXED0 | Unique | Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0. |
| 487H | 1159 | IA32_VMX_CR0_FIXED1 | Unique | Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0. |
| 488H | 1160 | IA32_VMX_CR4_FIXED0 | Unique | Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4. |
| 489H | 1161 | IA32_VMX_CR4_FIXED1 | Unique | Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4. |
| 48AH | 1162 | IA32_VMX_VMCS_ENUM | Unique | Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 35-2. See Appendix A.9, VMCS Enumeration. |
| 48BH | 1163 | IA32_VMX_PROCBASED_CTLS2 | Unique | Capability Reporting Register of Secondary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls. |
| 600H | 1536 | IA32_DS_AREA | Unique | DS Save Area (R/W) See Table 35-2. See Section 18.11.4, Debug Store (DS) Mechanism. |
| C000_0080H |  | IA32_EFER | Unique | Extended Feature Enables See Table 35-2. |
| C000_0081H |  | IA32_STAR | Unique | System Call Target Address (R/W) See Table 35-2. |


Table 35-4 MSRs in Intel Atom Processor Family (Contd.)


| Hex | Dec | Register Name | Shared/Unique | Bit Description |
|---|---|---|---|---|
| C000_0082H |  | IA32_LSTAR | Unique | IA-32e Mode System Call Target Address (R/W) See Table 35-2. |
| C000_0084H |  | IA32_FMASK | Unique | System Call Flag Mask (R/W) See Table 35-2. |
| C000_0100H |  | IA32_FS_BASE | Unique | Map of BASE Address of FS (R/W) See Table 35-2. |
| C000_0101H |  | IA32_GS_BASE | Unique | Map of BASE Address of GS (R/W) See Table 35-2. |
| C000_0102H |  | IA32_KERNEL_GSBASE | Unique | Swap Target of BASE Address of GS (R/W) See Table 35-2. |
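As an illustration of how a bit field from Table 35-4 might be consumed, the sketch below decodes the Scaleable Bus Speed field (bits 2:0) of MSR_FSB_FREQ (CDH) using only the three encodings listed in the table. It assumes the 64-bit MSR value has already been read by privileged software; the function and dictionary names are hypothetical.

```python
# Encodings listed for MSR_FSB_FREQ bits 2:0 in Table 35-4. The table asks
# for 133.33 MHz and 166.67 MHz to be used in calculations for the nominal
# "133 MHz" and "167 MHz" settings.
FSB_FREQ_MHZ = {
    0b101: 100.0,
    0b001: 133.33,
    0b011: 166.67,
}


def scaleable_bus_mhz(msr_value):
    """Return the bus clock in MHz, or None for an encoding the table does not list."""
    encoding = msr_value & 0x7  # bits 63:3 are reserved and ignored here
    return FSB_FREQ_MHZ.get(encoding)
```

Returning None for unlisted encodings is a design choice of this sketch; the table does not say how other encodings should be treated.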

Table 35-5 lists model-specific registers (MSRs) that are specific to Intel Atom processors with a CPUID signature of DisplayFamily_DisplayModel 06_27H.

Table 35-5 MSRs Supported by Intel Atom Processors with CPUID Signature 06_27H
| Hex | Dec | Register Name | Scope | Bit Description |
|---|---|---|---|---|
| 3F8H | 1016 | MSR_PKG_C2_RESIDENCY | Package | Package C2 Residency. Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
|  |  | 63:0 | Package | Package C2 Residency Counter. (R/O) Time that this package is in processor-specific C2 states since last reset. Counts at 1 MHz frequency. |
| 3F9H | 1017 | MSR_PKG_C4_RESIDENCY | Package | Package C4 Residency. Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
|  |  | 63:0 | Package | Package C4 Residency Counter. (R/O) Time that this package is in processor-specific C4 states since last reset. Counts at 1 MHz frequency. |
| 3FAH | 1018 | MSR_PKG_C6_RESIDENCY | Package | Package C6 Residency. Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
|  |  | 63:0 | Package | Package C6 Residency Counter. (R/O) Time that this package is in processor-specific C6 states since last reset. Counts at 1 MHz frequency. |

...
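Because the package residency counters in Table 35-5 are described as counting at 1 MHz, the difference between two samples can be converted to time directly. The sketch below shows one way to do that; it assumes the two 64-bit counter samples were read elsewhere, and the function names and the wrap-around handling are illustrative choices rather than anything prescribed by the table.

```python
COUNTER_MASK = (1 << 64) - 1  # the counters occupy bits 63:0


def residency_seconds(start_count, end_count, tick_hz=1_000_000):
    """Seconds spent in the C-state between two counter samples."""
    # Masking keeps the delta correct if the counter wrapped between samples.
    delta = (end_count - start_count) & COUNTER_MASK
    return delta / tick_hz


def residency_fraction(start_count, end_count, wall_seconds):
    """Fraction of the sampling interval that the package spent in the C-state."""
    return residency_seconds(start_count, end_count) / wall_seconds
```

For example, two samples of MSR_PKG_C2_RESIDENCY that differ by 2,500,000 counts over a 10-second interval correspond to 2.5 seconds, or 25 percent, of package C2 residency.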


35.4

MSRS IN THE INTEL MICROARCHITECTURE CODE NAME NEHALEM

Table 35-6 lists model-specific registers (MSRs) that are common to processors based on Intel microarchitecture code name Nehalem. These include the Intel Core i7 and i5 processor families. Architectural MSR addresses are also included in Table 35-6. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_1AH, 06_1EH, 06_1FH, or 06_2EH; see Table 35-1. Additional MSRs specific to 06_1AH, 06_1EH, and 06_1FH are listed in Table 35-7. Some MSRs listed in these tables are used by BIOS. More information about these MSRs can be found at http://biosbits.org. The column Scope represents the package/core/thread scope of an individual bit field of an MSR. Thread means the bit field must be programmed on each logical processor independently. Core means the bit field must be programmed on each processor core independently; a change made through one logical processor affects the other logical processors in the same core. Package means the bit field must be programmed once for each physical package; a change to a bit field with package scope affects all logical processors in that physical package.
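The scope rules above determine how many writes software needs to cover a system. The sketch below picks the logical processors on which a bit field would have to be programmed for a given scope. The `LogicalProcessor` record and the topology list are assumptions made for illustration; a real system would obtain this information from CPUID or the operating system.

```python
from collections import namedtuple

# Hypothetical description of one logical processor in the system.
LogicalProcessor = namedtuple("LogicalProcessor", "cpu package core thread")


def cpus_to_program(topology, scope):
    """Return the CPU numbers that must be programmed for a bit field of this scope."""
    if scope == "Thread":
        # Every logical processor has its own copy.
        return sorted(lp.cpu for lp in topology)
    if scope == "Core":
        key = lambda lp: (lp.package, lp.core)
    elif scope == "Package":
        key = lambda lp: lp.package
    else:
        raise ValueError("unknown scope: %r" % (scope,))

    # One representative logical processor per core or per package is enough,
    # because the change is visible to the others that share it.
    chosen = {}
    for lp in sorted(topology, key=lambda lp: lp.cpu):
        chosen.setdefault(key(lp), lp.cpu)
    return sorted(chosen.values())


# Example: one package with two cores and two logical processors per core.
EXAMPLE_TOPOLOGY = [
    LogicalProcessor(cpu=0, package=0, core=0, thread=0),
    LogicalProcessor(cpu=1, package=0, core=0, thread=1),
    LogicalProcessor(cpu=2, package=0, core=1, thread=0),
    LogicalProcessor(cpu=3, package=0, core=1, thread=1),
]
```

With this example topology, a Thread-scoped field needs four writes, a Core-scoped field two, and a Package-scoped field one.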

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem
Register Address Hex 0H 1H 6H 10H 17H 17H Dec 0 1 6 16 23 23 IA32_P5_MC_ADDR IA32_P5_MC_TYPE IA32_MONITOR_FILTER_ SIZE IA32_TIME_ STAMP_COUNTER IA32_PLATFORM_ID MSR_PLATFORM_ID 49:0 52:50 63:53 1BH 34H 27 52 IA32_APIC_BASE MSR_SMI_COUNT 31:0 63:32 3AH 79H 58 121 IA32_FEATURE_CONTROL IA32_BIOS_ UPDT_TRIG Thread Core Thread Thread Thread Thread Thread Thread Package Package See Section 35.14, MSRs in Pentium Processors. See Section 35.14, MSRs in Pentium Processors. See Section 8.10.5, Monitor/Mwait Address Range Determination, and Table 35-2. See Section 17.13, Time-Stamp Counter, and see Table 35-2. Platform ID (R) See Table 35-2. Model Specific Platform ID (R) Reserved. See Table 35-2. Reserved. See Section 10.4.4, Local APIC Status and Location, and Table 352. SMI Counter (R/O) SMI Count (R/O) Running count of SMI events since last RESET. Reserved. Control Features in Intel 64Processor (R/W) See Table 35-2. BIOS Update Trigger Register (W) See Table 35-2. Scope

Register Name

Bit Description


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 8BH C1H C2H C3H C4H CEH Dec 139 193 194 195 196 206 IA32_BIOS_SIGN_ID IA32_PMC0 IA32_PMC1 IA32_PMC2 IA32_PMC3 MSR_PLATFORM_INFO 7:0 15:8 Package Thread Thread Thread Thread Thread Package BIOS Update Signature ID (RO) See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. See http://biosbits.org. Reserved. Maximum Non-Turbo Ratio (R/O) This is the ratio of the frequency at which the invariant TSC runs. The invariant TSC frequency can be computed by multiplying this ratio by 133.33 MHz. 27:16 28 Package Reserved. Programmable Ratio Limit for Turbo Mode (R/O) When set to 1, indicates that Programmable Ratio Limits for Turbo mode are enabled, and when set to 0, indicates Programmable Ratio Limits for Turbo mode are disabled. 29 Package Programmable TDC-TDP Limit for Turbo Mode (R/O) When set to 1, indicates that TDC/TDP Limits for Turbo mode are programmable, and when set to 0, indicates TDC and TDP Limits for Turbo mode are not programmable. 39:30 47:40 Package Reserved. Maximum Efficiency Ratio (R/O) This is the minimum ratio (maximum efficiency) at which the processor can operate, in units of 133.33 MHz. 63:48 E2H 226 MSR_PKG_CST_CONFIG_CONTROL Core Reserved. C-State Configuration Control (R/W) Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. Scope

Register Name

Bit Description
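The MSR_PLATFORM_INFO fields described on this page can be turned into frequencies by multiplying each ratio by 133.33 MHz, as the descriptions state. The following is a minimal sketch, assuming the 64-bit MSR value has already been read; the function name and the dictionary keys are illustrative.

```python
BUS_CLOCK_MHZ = 133.33  # unit given in the MSR_PLATFORM_INFO field descriptions


def decode_platform_info(value):
    """Split an MSR_PLATFORM_INFO (CEH) value into the fields listed in the table."""
    max_non_turbo_ratio = (value >> 8) & 0xFF    # bits 15:8
    max_efficiency_ratio = (value >> 40) & 0xFF  # bits 47:40
    return {
        "max_non_turbo_ratio": max_non_turbo_ratio,
        "invariant_tsc_mhz": max_non_turbo_ratio * BUS_CLOCK_MHZ,
        "programmable_turbo_ratio_limit": bool((value >> 28) & 1),  # bit 28
        "programmable_tdc_tdp_limit": bool((value >> 29) & 1),      # bit 29
        "max_efficiency_ratio": max_efficiency_ratio,
        "min_operating_mhz": max_efficiency_ratio * BUS_CLOCK_MHZ,
    }
```

For instance, a Maximum Non-Turbo Ratio of 20 corresponds to an invariant TSC frequency of about 2666.6 MHz.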


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
|---|---|---|---|---|
|  |  | 2:0 |  | Package C-State Limit (R/W) Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0 (no package C-state support); 001b: C1 (behavior is the same as 000b); 010b: C3; 011b: C6; 100b: C7; 101b and 110b: Reserved; 111b: No package C-state limit. Note: This field cannot be used to limit package C-state to C3. |
|  |  | 9:3 |  | Reserved. |
|  |  | 10 |  | I/O MWAIT Redirection Enable (R/W) When set, will map IO_read instructions sent to IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. |
|  |  | 14:11 |  | Reserved. |
|  |  | 15 |  | CFG Lock (R/WO) When set, lock bits 15:0 of this register until next reset. |
|  |  | 23:16 |  | Reserved. |
|  |  | 24 |  | Interrupt filtering enable (R/W) When set, processor cores in a deep C-State will wake only when the event message is destined for that core. When 0, all processor cores in a deep C-State will wake for an event message. |
|  |  | 25 |  | C3 state auto demotion enable (R/W) When set, the processor will conditionally demote C6/C7 requests to C3 based on uncore auto-demote information. |
|  |  | 26 |  | C1 state auto demotion enable (R/W) When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore auto-demote information. |
|  |  | 63:27 |  | Reserved. |
| E4H | 228 | MSR_PMG_IO_CAPTURE_BASE | Core | Power Management IO Redirection in C-state (R/W) See http://biosbits.org. |


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
|---|---|---|---|---|
|  |  | 15:0 |  | LVL_2 Base Address (R/W) Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. |
|  |  | 18:16 |  | C-state Range (R/W) Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit 10]: 000b - C3 is the max C-State to include; 001b - C6 is the max C-State to include; 010b - C7 is the max C-State to include. |
|  |  | 63:19 |  | Reserved. |
| E7H | 231 | IA32_MPERF | Thread | Maximum Performance Frequency Clock Count (RW) See Table 35-2. |
| E8H | 232 | IA32_APERF | Thread | Actual Performance Frequency Clock Count (RW) See Table 35-2. |
| FEH | 254 | IA32_MTRRCAP | Thread | See Table 35-2. |
| 174H | 372 | IA32_SYSENTER_CS | Thread | See Table 35-2. |
| 175H | 373 | IA32_SYSENTER_ESP | Thread | See Table 35-2. |
| 176H | 374 | IA32_SYSENTER_EIP | Thread | See Table 35-2. |
| 179H | 377 | IA32_MCG_CAP | Thread | See Table 35-2. |
| 17AH | 378 | IA32_MCG_STATUS | Thread |  |
|  |  | 0 |  | RIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. |
|  |  | 1 |  | EIPV When set, bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. |
|  |  | 2 |  | MCIP When set, bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. |
|  |  | 63:3 |  | Reserved. |
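The three IA32_MCG_STATUS flags described on this page can be extracted with simple masks. The sketch below assumes a machine-check handler has already read the register value; the helper names are hypothetical and the "restartable" check only restates the RIPV description.

```python
MCG_STATUS_RIPV = 1 << 0  # restart IP valid
MCG_STATUS_EIPV = 1 << 1  # error IP valid
MCG_STATUS_MCIP = 1 << 2  # machine check in progress


def decode_mcg_status(value):
    """Return the RIPV, EIPV and MCIP flags of an IA32_MCG_STATUS value."""
    return {
        "RIPV": bool(value & MCG_STATUS_RIPV),
        "EIPV": bool(value & MCG_STATUS_EIPV),
        "MCIP": bool(value & MCG_STATUS_MCIP),
    }


def can_restart(value):
    """True when RIPV says the interrupted program can be reliably restarted."""
    return bool(value & MCG_STATUS_RIPV)


def value_after_handling(value):
    """Value software would write back to clear MCIP after processing the exception."""
    return value & ~MCG_STATUS_MCIP
```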


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 186H 187H 188H 189H 198H Dec 390 391 392 393 408 IA32_PERFEVTSEL0 IA32_PERFEVTSEL1 IA32_PERFEVTSEL2 IA32_PERFEVTSEL3 IA32_PERF_STATUS 15:0 63:16 199H 19AH 409 410 IA32_PERF_CTL IA32_CLOCK_MODULATION Thread Thread Thread Thread Thread Thread Core See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. Current Performance State Value. Reserved. See Table 35-2. Clock Modulation (R/W) See Table 35-2. IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 0 3:1 4 63:5 19BH 19CH 1A0 411 412 416 IA32_THERM_INTERRUPT IA32_THERM_STATUS IA32_MISC_ENABLE 0 2:1 3 6:4 7 10:8 11 Thread Thread Thread Thread Core Core Reserved. On demand Clock Modulation Duty Cycle (R/W) On demand Clock Modulation Enable (R/W) Reserved. Thermal Interrupt Control (R/W) See Table 35-2. Thermal Monitor Status (R/W) See Table 35-2. Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. Fast-Strings Enable See Table 35-2. Reserved. Automatic Thermal Control Circuit Enable (R/W) See Table 35-2. Reserved. Performance Monitoring Available (R) See Table 35-2. Reserved. Branch Trace Storage Unavailable (RO) See Table 35-2. Scope

Register Name

Bit Description


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex Dec 12 15:13 16 18 21:19 22 23 33:24 34 37:35 38 Package Thread Thread Thread Package Thread Thread Precise Event Based Sampling Unavailable (RO) See Table 35-2. Reserved. Enhanced Intel SpeedStep Technology Enable (R/W) See Table 35-2. ENABLE MONITOR FSM. (R/W) See Table 35-2. Reserved. Limit CPUID Maxval (R/W) See Table 35-2. xTPR Message Disable (R/W) See Table 35-2. Reserved. XD Bit Disable (R/W) See Table 35-2. Reserved. Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processors support of turbo mode is enabled. Note: the power-on default value is used by BIOS to detect hardware support of turbo mode. If power-on default value is 1, turbo mode is available in the processor. If power-on default value is 0, turbo mode is not available. 63:39 1A2H 418 MSR_ TEMPERATURE_TARGET 15:0 23:16 Thread Reserved. Temperature Target (R) The minimum temperature at which PROCHOT# will be asserted. The value is degree C. 63:24 1A6H 1AAH 422 426 MSR_OFFCORE_RSP_0 MSR_MISC_PWR_MGMT Thread Reserved. Offcore Response Event Select Register (R/W) See http://biosbits.org. Reserved. Scope

Register Name

Bit Description


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
|---|---|---|---|---|
|  |  | 0 | Package | EIST Hardware Coordination Disable (R/W) When 0, enables hardware coordination of EIST requests from processor cores; when 1, disables hardware coordination of EIST requests. |
|  |  | 1 | Thread | Energy/Performance Bias Enable (R/W) This bit makes the IA32_ENERGY_PERF_BIAS register (MSR 1B0h) visible to software with Ring 0 privileges. This bit's status (1 or 0) is also reflected by CPUID.(EAX=06h):ECX[3]. |
|  |  | 63:2 |  | Reserved. |
| 1ACH | 428 | MSR_TURBO_POWER_CURRENT_LIMIT |  | See http://biosbits.org. |
|  |  | 14:0 | Package | TDP Limit (R/W) TDP limit in 1/8 Watt granularity. |
|  |  | 15 | Package | TDP Limit Override Enable (R/W) A value = 0 indicates override is not active, and a value = 1 indicates active. |
|  |  | 30:16 | Package | TDC Limit (R/W) TDC limit in 1/8 Amp granularity. |
|  |  | 31 | Package | TDC Limit Override Enable (R/W) A value = 0 indicates override is not active, and a value = 1 indicates active. |
|  |  | 63:32 |  | Reserved. |
| 1ADH | 429 | MSR_TURBO_RATIO_LIMIT | Package | Maximum Ratio Limit of Turbo Mode. RO if MSR_PLATFORM_INFO[28] = 0, RW if MSR_PLATFORM_INFO[28] = 1. |
|  |  | 7:0 | Package | Maximum Ratio Limit for 1C. Maximum turbo ratio limit of 1 core active. |
|  |  | 15:8 | Package | Maximum Ratio Limit for 2C. Maximum turbo ratio limit of 2 cores active. |
|  |  | 23:16 | Package | Maximum Ratio Limit for 3C. Maximum turbo ratio limit of 3 cores active. |
|  |  | 31:24 | Package | Maximum Ratio Limit for 4C. Maximum turbo ratio limit of 4 cores active. |
|  |  | 63:32 |  | Reserved. |
| 1C8H | 456 | MSR_LBR_SELECT | Core | Last Branch Record Filtering Select Register (R/W) See Section 17.6.2, Filtering of Last Branch Records. |

Intel 64 and IA-32 Architectures Software Developers Manual Documentation Changes

251
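As a rough illustration of the turbo-related fields listed above, the following sketch decodes raw values of MSR_TURBO_POWER_CURRENT_LIMIT (TDP/TDC limits in 1/8-unit granularity with override-enable bits) and MSR_TURBO_RATIO_LIMIT (one 8-bit ratio per active-core count). The function names and dictionary keys are hypothetical, and reading the registers themselves is outside the scope of the example.

```python
def decode_turbo_power_current_limit(value: int) -> dict:
    """Split a raw MSR_TURBO_POWER_CURRENT_LIMIT value into its fields."""
    return {
        "tdp_limit_watts": (value & 0x7FFF) / 8.0,          # bits 14:0, 1/8 W units
        "tdp_override_enabled": bool((value >> 15) & 1),     # bit 15
        "tdc_limit_amps": ((value >> 16) & 0x7FFF) / 8.0,    # bits 30:16, 1/8 A units
        "tdc_override_enabled": bool((value >> 31) & 1),     # bit 31
    }


def decode_turbo_ratio_limit(value: int) -> list:
    """Return the maximum turbo ratios for 1, 2, 3 and 4 active cores."""
    return [(value >> (8 * i)) & 0xFF for i in range(4)]


if __name__ == "__main__":
    # Illustrative raw values only; not read from real hardware.
    raw = (1 << 31) | (880 << 16) | (1 << 15) | 760
    print(decode_turbo_power_current_limit(raw))
    print(decode_turbo_ratio_limit(0x18191A1B))
```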

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 1C9H Dec 457 MSR_LASTBRANCH_TOS Thread Last Branch Record Stack TOS (R) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 680H). 1D9H 1DDH 473 477 IA32_DEBUGCTL MSR_LER_FROM_LIP Thread Thread Debug Control (R/W) See Table 35-2. Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Thread Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1F2H 1F3H 1FCH 498 499 508 IA32_SMRR_PHYSBASE IA32_SMRR_PHYSMASK MSR_POWER_CTL 0 1 Package Core Core Core See Table 35-2. See Table 35-2. Power Control Register. See http://biosbits.org. Reserved. C1E Enable (R/W) When set to 1, will enable the CPU to switch to the Minimum Enhanced Intel SpeedStep Technology operating point when all execution cores enter MWAIT (C1). 63:2 200H 201H 202H 203H 204H 205H 206H 207H 208H 209H 20AH 512 513 514 515 516 517 518 519 520 521 522 IA32_MTRR_PHYSBASE0 IA32_MTRR_PHYSMASK0 IA32_MTRR_PHYSBASE1 IA32_MTRR_PHYSMASK1 IA32_MTRR_PHYSBASE2 IA32_MTRR_PHYSMASK2 IA32_MTRR_PHYSBASE3 IA32_MTRR_PHYSMASK3 IA32_MTRR_PHYSBASE4 IA32_MTRR_PHYSMASK4 IA32_MTRR_PHYSBASE5 Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Reserved. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. Scope

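The relationship between MSR_LASTBRANCH_TOS and the branch record MSRs can be sketched as follows. The example assumes the sixteen FROM_IP registers start at 680H and the sixteen TO_IP registers start at 6C0H, as listed in this table, and that bits 3:0 of the TOS value select the most recent pair; the function name is hypothetical.

```python
LBR_FROM_BASE = 0x680  # MSR_LASTBRANCH_0_FROM_IP
LBR_TO_BASE = 0x6C0    # MSR_LASTBRANCH_0_TO_IP


def lbr_msr_addresses(tos_value: int) -> tuple:
    """Map a raw MSR_LASTBRANCH_TOS value to the (FROM_IP, TO_IP) MSR addresses."""
    index = tos_value & 0xF  # bits 0-3 hold the top-of-stack index
    return LBR_FROM_BASE + index, LBR_TO_BASE + index


if __name__ == "__main__":
    from_addr, to_addr = lbr_msr_addresses(0x5)
    print(f"{from_addr:X}H {to_addr:X}H")
```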

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 20BH 20CH 20DH 20EH 20FH 210H 211H 212H 213H 250H 258H 259H 268H 269H 26AH 26BH 26CH 26DH 26EH 26FH 277H 280H 281H 282H 283H 284H 285H 286H 287H 288H Dec 523 524 525 526 527 528 529 530 531 592 600 601 616 617 618 619 620 621 622 623 631 640 641 642 643 644 645 646 647 648 IA32_MTRR_PHYSMASK5 IA32_MTRR_PHYSBASE6 IA32_MTRR_PHYSMASK6 IA32_MTRR_PHYSBASE7 IA32_MTRR_PHYSMASK7 IA32_MTRR_PHYSBASE8 IA32_MTRR_PHYSMASK8 IA32_MTRR_PHYSBASE9 IA32_MTRR_PHYSMASK9 IA32_MTRR_FIX64K_ 00000 IA32_MTRR_FIX16K_ 80000 IA32_MTRR_FIX16K_ A0000 IA32_MTRR_FIX4K_C0000 IA32_MTRR_FIX4K_C8000 IA32_MTRR_FIX4K_D0000 IA32_MTRR_FIX4K_D8000 IA32_MTRR_FIX4K_E0000 IA32_MTRR_FIX4K_E8000 IA32_MTRR_FIX4K_F0000 IA32_MTRR_FIX4K_F8000 IA32_PAT IA32_MC0_CTL2 IA32_MC1_CTL2 IA32_MC2_CTL2 IA32_MC3_CTL2 IA32_MC4_CTL2 IA32_MC5_CTL2 IA32_MC6_CTL2 IA32_MC7_CTL2 IA32_MC8_CTL2 Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Package Package Core Core Core Core Package Package Package See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. Scope


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name / Bit Field | Scope | Bit Description |
| 2FFH | 767 | IA32_MTRR_DEF_TYPE | Thread | Default Memory Types (R/W). See Table 35-2. |
| 309H | 777 | IA32_FIXED_CTR0 | Thread | Fixed-Function Performance Counter Register 0 (R/W). See Table 35-2. |
| 30AH | 778 | IA32_FIXED_CTR1 | Thread | Fixed-Function Performance Counter Register 1 (R/W). See Table 35-2. |
| 30BH | 779 | IA32_FIXED_CTR2 | Thread | Fixed-Function Performance Counter Register 2 (R/W). See Table 35-2. |
| 345H | 837 | IA32_PERF_CAPABILITIES | Thread | See Table 35-2. See Section 17.4.1, IA32_DEBUGCTL MSR. |
| | | 5:0 | | LBR Format. See Table 35-2. |
| | | 6 | | PEBS Record Format. |
| | | 7 | | PEBSSaveArchRegs. See Table 35-2. |
| | | 11:8 | | PEBS_REC_FORMAT. See Table 35-2. |
| | | 12 | | SMM_FREEZE. See Table 35-2. |
| | | 63:13 | | Reserved. |
| 38DH | 909 | IA32_FIXED_CTR_CTRL | Thread | Fixed-Function-Counter Control Register (R/W). See Table 35-2. |
| 38EH | 910 | IA32_PERF_GLOBAL_STATUS | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. |
| 38EH | 910 | MSR_PERF_GLOBAL_STATUS | Thread | (RO) |
| | | 61 | | UNC_Ovf. Uncore overflowed if 1. |
| 38FH | 911 | IA32_PERF_GLOBAL_CTRL | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. |
| 390H | 912 | IA32_PERF_GLOBAL_OVF_CTRL | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities. |
| 390H | 912 | MSR_PERF_GLOBAL_OVF_CTRL | Thread | (R/W) |
| | | 61 | | CLR_UNC_Ovf. Set 1 to clear UNC_Ovf. |
| 3F1H | 1009 | MSR_PEBS_ENABLE | Thread | See Section 18.6.1.1, Precise Event Based Sampling (PEBS). |
| | | 0 | | Enable PEBS on IA32_PMC0. (R/W) |
| | | 1 | | Enable PEBS on IA32_PMC1. (R/W) |
| | | 2 | | Enable PEBS on IA32_PMC2. (R/W) |
| | | 3 | | Enable PEBS on IA32_PMC3. (R/W) |

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name / Bit Field | Scope | Bit Description |
| | | 31:4 | | Reserved. |
| | | 32 | | Enable Load Latency on IA32_PMC0. (R/W) |
| | | 33 | | Enable Load Latency on IA32_PMC1. (R/W) |
| | | 34 | | Enable Load Latency on IA32_PMC2. (R/W) |
| | | 35 | | Enable Load Latency on IA32_PMC3. (R/W) |
| | | 63:36 | | Reserved. |
| 3F6H | 1014 | MSR_PEBS_LD_LAT | Thread | See Section 18.6.1.2, Load Latency Performance Monitoring Facility. |
| | | 15:0 | | Minimum threshold latency value of tagged load operation that will be counted. (R/W) |
| | | 63:16 | | Reserved. |
| 3F8H | 1016 | MSR_PKG_C3_RESIDENCY | Package | Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
| | | 63:0 | | Package C3 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C3 states. Counts at the same frequency as the TSC. |
| 3F9H | 1017 | MSR_PKG_C6_RESIDENCY | Package | Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
| | | 63:0 | | Package C6 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C6 states. Counts at the same frequency as the TSC. |
| 3FAH | 1018 | MSR_PKG_C7_RESIDENCY | Package | Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
| | | 63:0 | | Package C7 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C7 states. Counts at the same frequency as the TSC. |
| 3FCH | 1020 | MSR_CORE_C3_RESIDENCY | Core | Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
| | | 63:0 | | CORE C3 Residency Counter. (R/O) Value since last reset that this core is in processor-specific C3 states. Counts at the same frequency as the TSC. |
| 3FDH | 1021 | MSR_CORE_C6_RESIDENCY | Core | Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. |
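The MSR_PEBS_ENABLE layout described in this table (bits 3:0 enable PEBS per counter, bits 35:32 enable load latency per counter) can be decoded with a sketch like the one below. The function name and the returned dictionary shape are hypothetical; the input is assumed to be a raw 64-bit value obtained elsewhere.

```python
def decode_pebs_enable(value: int) -> dict:
    """Report, for IA32_PMC0..IA32_PMC3, whether PEBS and load latency are enabled."""
    return {
        f"IA32_PMC{n}": {
            "pebs": bool((value >> n) & 1),                  # bits 0-3
            "load_latency": bool((value >> (32 + n)) & 1),   # bits 32-35
        }
        for n in range(4)
    }


if __name__ == "__main__":
    # Illustrative raw value: PEBS on PMC0 and PMC2, load latency on PMC1.
    for counter, flags in decode_pebs_enable(0b0101 | (1 << 33)).items():
        print(counter, flags)
```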

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex Dec 63:0 CORE C6 Residency Counter. (R/O) Value since last reset that this core is in processor-specific C6 states. Count at the same frequency as the TSC. 400H 401H 402H 1024 1025 1026 IA32_MC0_CTL IA32_MC0_STATUS IA32_MC0_ADDR Package Package Package See Section 15.3.2.1, IA32_MCi_CTL MSRs. See Section 15.3.2.2, IA32_MCi_STATUS MSRS. See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 403H 404H 405H 406H 1027 1028 1029 1030 MSR_MC0_MISC IA32_MC1_CTL IA32_MC1_STATUS IA32_MC1_ADDR Package Package Package Package See Section 15.3.2.4, IA32_MCi_MISC MSRs. See Section 15.3.2.1, IA32_MCi_CTL MSRs. See Section 15.3.2.2, IA32_MCi_STATUS MSRS. See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 407H 408H 409H 40AH 1031 1032 1033 1034 MSR_MC1_MISC IA32_MC2_CTL IA32_MC2_STATUS IA32_MC2_ADDR Package Core Core Core See Section 15.3.2.4, IA32_MCi_MISC MSRs. See Section 15.3.2.1, IA32_MCi_CTL MSRs. See Section 15.3.2.2, IA32_MCi_STATUS MSRS. See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. 40BH 40CH 40DH 40EH 1035 1036 1037 1038 MSR_MC2_MISC MSR_MC3_CTL MSR_MC3_STATUS MSR_MC3_ADDR Core Core Core Core See Section 15.3.2.4, IA32_MCi_MISC MSRs. See Section 15.3.2.1, IA32_MCi_CTL MSRs. See Section 15.3.2.2, IA32_MCi_STATUS MSRS. 
See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
| 40FH | 1039 | MSR_MC3_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 410H | 1040 | MSR_MC4_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 411H | 1041 | MSR_MC4_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs. |
| 412H | 1042 | MSR_MC4_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. The MSR_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the MSR_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception. |
| 413H | 1043 | MSR_MC4_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 414H | 1044 | MSR_MC5_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 415H | 1045 | MSR_MC5_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs. |
| 416H | 1046 | MSR_MC5_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. |
| 417H | 1047 | MSR_MC5_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 418H | 1048 | MSR_MC6_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 419H | 1049 | MSR_MC6_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16. |
| 41AH | 1050 | MSR_MC6_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. |
| 41BH | 1051 | MSR_MC6_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 41CH | 1052 | MSR_MC7_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 41DH | 1053 | MSR_MC7_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16. |
| 41EH | 1054 | MSR_MC7_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. |
| 41FH | 1055 | MSR_MC7_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 420H | 1056 | MSR_MC8_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs. |
| 421H | 1057 | MSR_MC8_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16. |
| 422H | 1058 | MSR_MC8_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs. |
| 423H | 1059 | MSR_MC8_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs. |
| 480H | 1152 | IA32_VMX_BASIC | Thread | Reporting Register of Basic VMX Capabilities (R/O). See Table 35-2. See Appendix A.1, Basic VMX Information. |
| 481H | 1153 | IA32_VMX_PINBASED_CTLS | Thread | Capability Reporting Register of Pin-based VM-execution Controls (R/O). See Table 35-2. See Appendix A.3, VM-Execution Controls. |

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 482H Dec 1154 IA32_VMX_PROCBASED_ CTLS IA32_VMX_EXIT_CTLS Thread Capability Reporting Register of Primary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls. 483H 1155 Thread Capability Reporting Register of VM-exit Controls (R/O) See Table 35-2. See Appendix A.4, VM-Exit Controls. 484H 1156 IA32_VMX_ENTRY_CTLS Thread Capability Reporting Register of VM-entry Controls (R/O) See Table 35-2. See Appendix A.5, VM-Entry Controls. 485H 1157 IA32_VMX_MISC Thread Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 35-2. See Appendix A.6, Miscellaneous Data. 486H 1158 IA32_VMX_CR0_FIXED0 Thread Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0. 487H 1159 IA32_VMX_CR0_FIXED1 Thread Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0. 488H 1160 IA32_VMX_CR4_FIXED0 Thread Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4. 489H 1161 IA32_VMX_CR4_FIXED1 Thread Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4. 48AH 1162 IA32_VMX_VMCS_ENUM Thread Capability Reporting Register of VMCS Field Enumeration (R/O). See Table 35-2. See Appendix A.9, VMCS Enumeration. 48BH 1163 IA32_VMX_PROCBASED_ CTLS2 IA32_DS_AREA Thread Capability Reporting Register of Secondary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls. 600H 1536 Thread DS Save Area (R/W) See Table 35-2. See Section 18.11.4, Debug Store (DS) Mechanism. Scope


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
| 680H | 1664 | MSR_LASTBRANCH_0_FROM_IP | Thread | Last Branch Record 0 From IP (R/W). One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction for one of the last sixteen branches, exceptions, or interrupts taken by the processor. See also: Last Branch Record Stack TOS at 1C9H; Section 17.6.1, LBR Stack. |
| 681H | 1665 | MSR_LASTBRANCH_1_FROM_IP | Thread | Last Branch Record 1 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 682H | 1666 | MSR_LASTBRANCH_2_FROM_IP | Thread | Last Branch Record 2 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 683H | 1667 | MSR_LASTBRANCH_3_FROM_IP | Thread | Last Branch Record 3 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 684H | 1668 | MSR_LASTBRANCH_4_FROM_IP | Thread | Last Branch Record 4 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 685H | 1669 | MSR_LASTBRANCH_5_FROM_IP | Thread | Last Branch Record 5 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 686H | 1670 | MSR_LASTBRANCH_6_FROM_IP | Thread | Last Branch Record 6 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 687H | 1671 | MSR_LASTBRANCH_7_FROM_IP | Thread | Last Branch Record 7 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 688H | 1672 | MSR_LASTBRANCH_8_FROM_IP | Thread | Last Branch Record 8 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 689H | 1673 | MSR_LASTBRANCH_9_FROM_IP | Thread | Last Branch Record 9 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 68AH | 1674 | MSR_LASTBRANCH_10_FROM_IP | Thread | Last Branch Record 10 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 68BH | 1675 | MSR_LASTBRANCH_11_FROM_IP | Thread | Last Branch Record 11 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 68CH | 1676 | MSR_LASTBRANCH_12_FROM_IP | Thread | Last Branch Record 12 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 68DH | 1677 | MSR_LASTBRANCH_13_FROM_IP | Thread | Last Branch Record 13 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 68EH | 1678 | MSR_LASTBRANCH_14_FROM_IP | Thread | Last Branch Record 14 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
| Hex | Dec | Register Name | Scope | Bit Description |
| 68FH | 1679 | MSR_LASTBRANCH_15_FROM_IP | Thread | Last Branch Record 15 From IP (R/W). See description of MSR_LASTBRANCH_0_FROM_IP. |
| 6C0H | 1728 | MSR_LASTBRANCH_0_TO_IP | Thread | Last Branch Record 0 To IP (R/W). One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction for one of the last sixteen branches, exceptions, or interrupts taken by the processor. |
| 6C1H | 1729 | MSR_LASTBRANCH_1_TO_IP | Thread | Last Branch Record 1 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C2H | 1730 | MSR_LASTBRANCH_2_TO_IP | Thread | Last Branch Record 2 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C3H | 1731 | MSR_LASTBRANCH_3_TO_IP | Thread | Last Branch Record 3 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C4H | 1732 | MSR_LASTBRANCH_4_TO_IP | Thread | Last Branch Record 4 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C5H | 1733 | MSR_LASTBRANCH_5_TO_IP | Thread | Last Branch Record 5 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C6H | 1734 | MSR_LASTBRANCH_6_TO_IP | Thread | Last Branch Record 6 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C7H | 1735 | MSR_LASTBRANCH_7_TO_IP | Thread | Last Branch Record 7 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C8H | 1736 | MSR_LASTBRANCH_8_TO_IP | Thread | Last Branch Record 8 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6C9H | 1737 | MSR_LASTBRANCH_9_TO_IP | Thread | Last Branch Record 9 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6CAH | 1738 | MSR_LASTBRANCH_10_TO_IP | Thread | Last Branch Record 10 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6CBH | 1739 | MSR_LASTBRANCH_11_TO_IP | Thread | Last Branch Record 11 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6CCH | 1740 | MSR_LASTBRANCH_12_TO_IP | Thread | Last Branch Record 12 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6CDH | 1741 | MSR_LASTBRANCH_13_TO_IP | Thread | Last Branch Record 13 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |
| 6CEH | 1742 | MSR_LASTBRANCH_14_TO_IP | Thread | Last Branch Record 14 To IP (R/W). See description of MSR_LASTBRANCH_0_TO_IP. |

Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 6CFH 802H 803H 808H 80AH 80BH 80DH 80FH 810H 811H 812H 813H 814H 815H 816H 817H 818H 819H 81AH 81BH 81CH 81DH 81EH 81FH 820H 821H 822H 823H 824H 825H 826H 827H Dec 1743 2050 2051 2056 2058 2059 2061 2063 2064 2065 2066 2067 2068 2069 2070 2071 2072 2073 2074 2075 2076 2077 2078 2079 2080 2081 2082 2083 2084 2085 2086 2087 MSR_ LASTBRANCH_15_TO_IP IA32_X2APIC_APICID IA32_X2APIC_VERSION IA32_X2APIC_TPR IA32_X2APIC_PPR IA32_X2APIC_EOI IA32_X2APIC_LDR IA32_X2APIC_SIVR IA32_X2APIC_ISR0 IA32_X2APIC_ISR1 IA32_X2APIC_ISR2 IA32_X2APIC_ISR3 IA32_X2APIC_ISR4 IA32_X2APIC_ISR5 IA32_X2APIC_ISR6 IA32_X2APIC_ISR7 IA32_X2APIC_TMR0 IA32_X2APIC_TMR1 IA32_X2APIC_TMR2 IA32_X2APIC_TMR3 IA32_X2APIC_TMR4 IA32_X2APIC_TMR5 IA32_X2APIC_TMR6 IA32_X2APIC_TMR7 IA32_X2APIC_IRR0 IA32_X2APIC_IRR1 IA32_X2APIC_IRR2 IA32_X2APIC_IRR3 IA32_X2APIC_IRR4 IA32_X2APIC_IRR5 IA32_X2APIC_IRR6 IA32_X2APIC_IRR7 Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Last Branch Record 15 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP. x2APIC ID register (R/O) See x2APIC Specification. 
x2APIC Version register (R/O) x2APIC Task Priority register (R/W) x2APIC Processor Priority register (R/O) x2APIC EOI register (W/O) x2APIC Logical Destination register (R/O) x2APIC Spurious Interrupt Vector register (R/W) x2APIC In-Service register bits [31:0] (R/O) x2APIC In-Service register bits [63:32] (R/O) x2APIC In-Service register bits [95:64] (R/O) x2APIC In-Service register bits [127:96] (R/O) x2APIC In-Service register bits [159:128] (R/O) x2APIC In-Service register bits [191:160] (R/O) x2APIC In-Service register bits [223:192] (R/O) x2APIC In-Service register bits [255:224] (R/O) x2APIC Trigger Mode register bits [31:0] (R/O) x2APIC Trigger Mode register bits [63:32] (R/O) x2APIC Trigger Mode register bits [95:64] (R/O) x2APIC Trigger Mode register bits [127:96] (R/O) x2APIC Trigger Mode register bits [159:128] (R/O) x2APIC Trigger Mode register bits [191:160] (R/O) x2APIC Trigger Mode register bits [223:192] (R/O) x2APIC Trigger Mode register bits [255:224] (R/O) x2APIC Interrupt Request register bits [31:0] (R/O) x2APIC Interrupt Request register bits [63:32] (R/O) x2APIC Interrupt Request register bits [95:64] (R/O) x2APIC Interrupt Request register bits [127:96] (R/O) x2APIC Interrupt Request register bits [159:128] (R/O) x2APIC Interrupt Request register bits [191:160] (R/O) x2APIC Interrupt Request register bits [223:192] (R/O) x2APIC Interrupt Request register bits [255:224] (R/O) Scope


Table 35-6 MSRs in Processors Based on Intel Microarchitecture Code Name Nehalem (Contd.)
Register Address Hex 828H 82FH 830H 832H 833H 834H 835H 836H 837H 838H 839H 83EH 83FH C000_ 0080H C000_ 0081H C000_ 0082H C000_ 0084H C000_ 0100H C000_ 0101H C000_ 0102H C000_ 0103H ... Dec 2088 2095 2096 2098 2099 2100 2101 2102 2103 2104 2105 2110 2111 IA32_X2APIC_ESR IA32_X2APIC_LVT_CMCI IA32_X2APIC_ICR IA32_X2APIC_LVT_TIMER IA32_X2APIC_LVT_THERM AL IA32_X2APIC_LVT_PMI IA32_X2APIC_LVT_LINT0 IA32_X2APIC_LVT_LINT1 IA32_X2APIC_LVT_ERROR IA32_X2APIC_INIT_COUNT IA32_X2APIC_CUR_COUNT IA32_X2APIC_DIV_CONF IA32_X2APIC_SELF_IPI IA32_EFER IA32_STAR IA32_LSTAR IA32_FMASK IA32_FS_BASE IA32_GS_BASE IA32_KERNEL_GSBASE IA32_TSC_AUX Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread Thread x2APIC Error Status register (R/W) x2APIC LVT Corrected Machine Check Interrupt register (R/W) x2APIC Interrupt Command register (R/W) x2APIC LVT Timer Interrupt register (R/W) x2APIC LVT Thermal Sensor Interrupt register (R/W) x2APIC LVT Performance Monitor register (R/W) x2APIC LVT LINT0 register (R/W) x2APIC LVT LINT1 register (R/W) x2APIC LVT Error register (R/W) x2APIC Initial Count register (R/W) x2APIC Current Count register (R/O) x2APIC Divide Configuration register (R/W) x2APIC Self IPI register (W/O) Extended Feature Enables See Table 35-2. System Call Target Address (R/W) See Table 35-2. IA-32e Mode System Call Target Address (R/W) See Table 35-2. System Call Flag Mask (R/W) See Table 35-2. Map of BASE Address of FS (R/W) See Table 35-2. Map of BASE Address of GS (R/W) See Table 35-2. Swap Target of BASE Address of GS (R/W) See Table 35-2. AUXILIARY TSC Signature. (R/W) See Table 35-2 and Section 17.13.2, IA32_TSC_AUX Register and RDTSCP Support. Scope


35.7 MSRS IN INTEL PROCESSOR FAMILY (BASED ON INTEL MICROARCHITECTURE CODE NAME SANDY BRIDGE)

Table 35-11 lists model-specific registers (MSRs) that are common to the Intel processor family based on Intel microarchitecture code name Sandy Bridge. All architectural MSRs listed in Table 35-2 are supported. These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2AH or 06_2DH; see Table 35-1. Additional MSRs specific to 06_2AH are listed in Table 35-12.
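To make the DisplayFamily_DisplayModel notation concrete, the sketch below derives it from a CPUID.01H:EAX signature value. It relies on the standard CPUID field layout (family in bits 11:8, model in bits 7:4, extended model in bits 19:16, extended family in bits 27:20), which is taken from the CPUID instruction definition rather than from this section; the function name and the sample signature values are illustrative assumptions.

```python
def display_family_model(cpuid_eax: int) -> str:
    """Format a CPUID.01H:EAX signature as DisplayFamily_DisplayModel."""
    family = (cpuid_eax >> 8) & 0xF
    model = (cpuid_eax >> 4) & 0xF
    ext_model = (cpuid_eax >> 16) & 0xF
    ext_family = (cpuid_eax >> 20) & 0xFF

    # Extended family is added only when the base family is 0FH.
    display_family = family + ext_family if family == 0xF else family
    # Extended model is prepended only for families 06H and 0FH.
    display_model = (ext_model << 4) + model if family in (0x6, 0xF) else model
    return f"{display_family:02X}_{display_model:02X}H"


if __name__ == "__main__":
    # Sample signature values chosen for illustration.
    print(display_family_model(0x000206A7))
    print(display_family_model(0x000206D7))
```

A caller could compare the returned string against the signatures named above to decide which MSR table applies.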

Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge
Register Address Hex 0H 1H 6H 10H 17H 1BH 34H Dec 0 1 6 16 23 27 52 IA32_P5_MC_ADDR IA32_P5_MC_TYPE IA32_MONITOR_FILTER_SIZE IA32_TIME_STAMP_COUNTER IA32_PLATFORM_ID IA32_APIC_BASE MSR_SMI_COUNT 31:0 63:32 3AH 79H 8BH C1H C2H C3H C4H 58 121 139 193 194 195 196 IA32_FEATURE_CONTROL IA32_BIOS_UPDT_TRIG IA32_BIOS_SIGN_ID IA32_PMC0 IA32_PMC1 IA32_PMC2 IA32_PMC3 Thread Core Thread Thread Thread Thread Thread Thread Thread Thread Thread Package Thread Thread See Section 35.14, MSRs in Pentium Processors. See Section 35.14, MSRs in Pentium Processors. See Section 8.10.5, Monitor/Mwait Address Range Determination, and Table 35-2. See Section 17.13, Time-Stamp Counter, and Table 35-2. Platform ID (R) See Table 35-2. See Section 10.4.4, Local APIC Status and Location, and Table 35-2. SMI Counter (R/O) SMI Count (R/O) Count SMIs. Reserved. Control Features in Intel 64 Processor (R/W) See Table 35-2. BIOS Update Trigger Register (W) See Table 35-2. BIOS Update Signature ID (RO) See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. Performance Counter Register See Table 35-2. Register Name Scope Bit Description


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
| Hex | Dec | Register Name / Bit Field | Scope | Bit Description |
| C5H | 197 | IA32_PMC4 | Core | Performance Counter Register. See Table 35-2. |
| C6H | 198 | IA32_PMC5 | Core | Performance Counter Register. See Table 35-2. |
| C7H | 199 | IA32_PMC6 | Core | Performance Counter Register. See Table 35-2. |
| C8H | 200 | IA32_PMC7 | Core | Performance Counter Register. See Table 35-2. |
| CEH | 206 | MSR_PLATFORM_INFO | Package | See http://biosbits.org. |
| | | 7:0 | | Reserved. |
| | | 15:8 | Package | Maximum Non-Turbo Ratio (R/O). This is the ratio of the frequency that the invariant TSC runs at. Frequency = ratio * 100 MHz. |
| | | 27:16 | | Reserved. |
| | | 28 | Package | Programmable Ratio Limit for Turbo Mode (R/O). When set to 1, indicates that Programmable Ratio Limits for Turbo mode are enabled; when set to 0, indicates that Programmable Ratio Limits for Turbo mode are disabled. |
| | | 29 | Package | Programmable TDP Limit for Turbo Mode (R/O). When set to 1, indicates that TDP Limits for Turbo mode are programmable; when set to 0, indicates that TDP Limits for Turbo mode are not programmable. |
| | | 39:30 | | Reserved. |
| | | 47:40 | Package | Maximum Efficiency Ratio (R/O). This is the minimum ratio (maximum efficiency) at which the processor can operate, in units of 100 MHz. |
| | | 63:48 | | Reserved. |
| E2H | 226 | MSR_PKG_CST_CONFIG_CONTROL | Core | C-State Configuration Control (R/W). Note: C-state values are processor-specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States. See http://biosbits.org. |
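The MSR_PLATFORM_INFO fields above lend themselves to a short worked example. This sketch applies the stated formula (frequency = ratio * 100 MHz) to bits 15:8 and bits 47:40 and reports the two capability bits; the function name, dictionary keys, and the sample value are hypothetical.

```python
BUS_CLOCK_MHZ = 100  # ratio unit stated in the table


def decode_platform_info(value: int) -> dict:
    """Decode the documented fields of a raw MSR_PLATFORM_INFO value."""
    max_non_turbo_ratio = (value >> 8) & 0xFF    # bits 15:8
    max_efficiency_ratio = (value >> 40) & 0xFF  # bits 47:40
    return {
        "max_non_turbo_ratio": max_non_turbo_ratio,
        "tsc_frequency_mhz": max_non_turbo_ratio * BUS_CLOCK_MHZ,
        "programmable_turbo_ratio_limit": bool((value >> 28) & 1),
        "programmable_turbo_tdp_limit": bool((value >> 29) & 1),
        "max_efficiency_ratio": max_efficiency_ratio,
        "min_frequency_mhz": max_efficiency_ratio * BUS_CLOCK_MHZ,
    }


if __name__ == "__main__":
    # Illustrative raw value: ratio 34, efficiency ratio 16, bit 28 set.
    print(decode_platform_info((16 << 40) | (1 << 28) | (34 << 8)))
```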

Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
| Hex | Dec | Register Name / Bit Field | Scope | Bit Description |
| | | 2:0 | | Package C-State Limit (R/W). Specifies the lowest processor-specific C-state code name (consuming the least power) for the package. The default is set as factory-configured package C-state limit. The following C-state code name encodings are supported: 000b: C0/C1 (no package C-state support); 001b: C2; 010b: C6 no retention; 011b: C6 retention; 100b: C7; 101b: C7s; 111b: No package C-state limit. Note: This field cannot be used to limit package C-state to C3. |
| | | 9:3 | | Reserved. |
| | | 10 | | I/O MWAIT Redirection Enable (R/W). When set, will map IO_read instructions sent to the IO register specified by MSR_PMG_IO_CAPTURE_BASE to MWAIT instructions. |
| | | 14:11 | | Reserved. |
| | | 15 | | CFG Lock (R/WO). When set, locks bits 15:0 of this register until next reset. |
| | | 24:16 | | Reserved. |
| | | 25 | | C3 state auto demotion enable (R/W). When set, the processor will conditionally demote C6/C7 requests to C3 based on uncore auto-demote information. |
| | | 26 | | C1 state auto demotion enable (R/W). When set, the processor will conditionally demote C3/C6/C7 requests to C1 based on uncore auto-demote information. |
| | | 27 | | Enable C3 undemotion (R/W). When set, enables undemotion from demoted C3. |
| | | 28 | | Enable C1 undemotion (R/W). When set, enables undemotion from demoted C1. |
| | | 63:29 | | Reserved. |
| E4H | 228 | MSR_PMG_IO_CAPTURE_BASE | Core | Power Management IO Redirection in C-state (R/W). See http://biosbits.org. |
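The C-state configuration fields listed above can be decoded along the following lines. The encoding table is copied from the Package C-State Limit description; encodings not listed there are reported as unlisted rather than guessed. Function and key names are hypothetical.

```python
PKG_CSTATE_LIMIT = {
    0b000: "C0/C1 (no package C-state support)",
    0b001: "C2",
    0b010: "C6 no retention",
    0b011: "C6 retention",
    0b100: "C7",
    0b101: "C7s",
    0b111: "No package C-state limit",
}


def decode_pkg_cst_config_control(value: int) -> dict:
    """Decode the documented fields of a raw MSR_PKG_CST_CONFIG_CONTROL value."""
    def bit(n: int) -> bool:
        return bool((value >> n) & 1)

    return {
        "package_cstate_limit": PKG_CSTATE_LIMIT.get(value & 0x7, "unlisted encoding"),
        "io_mwait_redirection": bit(10),
        "cfg_lock": bit(15),
        "c3_auto_demotion": bit(25),
        "c1_auto_demotion": bit(26),
        "c3_undemotion": bit(27),
        "c1_undemotion": bit(28),
    }


if __name__ == "__main__":
    # Illustrative raw value: limit C7, CFG Lock set, C1 auto demotion enabled.
    print(decode_pkg_cst_config_control((1 << 26) | (1 << 15) | 0b100))
```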

Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
| Hex | Dec | Register Name / Bit Field | Scope | Bit Description |
| | | 15:0 | | LVL_2 Base Address (R/W). Specifies the base address visible to software for IO redirection. If IO MWAIT Redirection is enabled, reads to this address will be consumed by the power management logic and decoded to MWAIT instructions. When IO port address redirection is enabled, this is the IO port address reported to the OS/software. |
| | | 18:16 | | C-state Range (R/W). Specifies the encoding value of the maximum C-State code name to be included when IO read to MWAIT redirection is enabled by MSR_PKG_CST_CONFIG_CONTROL[bit10]: 000b - C3 is the max C-State to include; 001b - C6 is the max C-State to include; 010b - C7 is the max C-State to include. |
| | | 63:19 | | Reserved. |
| E7H | 231 | IA32_MPERF | Thread | Maximum Performance Frequency Clock Count (RW). See Table 35-2. |
| E8H | 232 | IA32_APERF | Thread | Actual Performance Frequency Clock Count (RW). See Table 35-2. |
| FEH | 254 | IA32_MTRRCAP | Thread | See Table 35-2. |
| 174H | 372 | IA32_SYSENTER_CS | Thread | See Table 35-2. |
| 175H | 373 | IA32_SYSENTER_ESP | Thread | See Table 35-2. |
| 176H | 374 | IA32_SYSENTER_EIP | Thread | See Table 35-2. |
| 179H | 377 | IA32_MCG_CAP | Thread | See Table 35-2. |
| 17AH | 378 | IA32_MCG_STATUS | Thread | |
| | | 0 | | RIPV. When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) can be used to restart the program. If cleared, the program cannot be reliably restarted. |
| | | 1 | | EIPV. When set, this bit indicates that the instruction addressed by the instruction pointer pushed on the stack (when the machine check was generated) is directly associated with the error. |
| | | 2 | | MCIP. When set, this bit indicates that a machine check has been generated. If a second machine check is detected while this bit is still set, the processor enters a shutdown state. Software should write this bit to 0 after processing a machine check exception. |
| | | 63:3 | | Reserved. |
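A minimal sketch of how the three IA32_MCG_STATUS flags described above might be tested in software follows. It operates on a raw value supplied by the caller; the helper names are hypothetical, and `clear_mcip` only computes the value a handler would write back after processing a machine check, it does not perform the write.

```python
RIPV = 1 << 0  # restart IP valid
EIPV = 1 << 1  # error IP valid
MCIP = 1 << 2  # machine check in progress


def decode_mcg_status(value: int) -> dict:
    """Return the RIPV, EIPV and MCIP flags of a raw IA32_MCG_STATUS value."""
    return {
        "RIPV": bool(value & RIPV),
        "EIPV": bool(value & EIPV),
        "MCIP": bool(value & MCIP),
    }


def clear_mcip(value: int) -> int:
    """Compute the value with MCIP cleared, leaving the other bits unchanged."""
    return value & ~MCIP


if __name__ == "__main__":
    status = 0b101  # illustrative: RIPV and MCIP set
    print(decode_mcg_status(status))
    print(bin(clear_mcip(status)))
```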

Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address Hex 186H 187H 188H 189H 18AH 18BH 18CH 18DH 198H Dec 390 391 392 393 394 395 396 397 408 IA32_ PERFEVTSEL0 IA32_ PERFEVTSEL1 IA32_ PERFEVTSEL2 IA32_ PERFEVTSEL3 IA32_ PERFEVTSEL4 IA32_ PERFEVTSEL5 IA32_ PERFEVTSEL6 IA32_ PERFEVTSEL7 IA32_PERF_STATUS 15:0 63:16 198H 408 MSR_PERF_STATUS 47:32 Package Core Voltage (R/O) P-state core voltage can be computed by MSR_PERF_STATUS[37:32] * (float) 1/(2^13). 199H 19AH 409 410 IA32_PERF_CTL IA32_CLOCK_ MODULATION Thread Thread See Table 35-2. Clock Modulation (R/W) See Table 35-2 IA32_CLOCK_MODULATION MSR was originally named IA32_THERM_CONTROL MSR. 3:0 4 63:5 19BH 19CH 411 412 IA32_THERM_INTERRUPT IA32_THERM_STATUS Core Core On demand Clock Modulation Duty Cycle (R/W) In 6.25% increment On demand Clock Modulation Enable (R/W) Reserved. Thermal Interrupt Control (R/W) See Table 35-2. Thermal Monitor Status (R/W) See Table 35-2. Thread Thread Thread Thread Core Core Core Core Package See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2. See Table 35-2; If CPUID.0AH:EAX[15:8] = 8 See Table 35-2; If CPUID.0AH:EAX[15:8] = 8 See Table 35-2; If CPUID.0AH:EAX[15:8] = 8 See Table 35-2; If CPUID.0AH:EAX[15:8] = 8 See Table 35-2. Current Performance State Value. Reserved. Register Name Scope Bit Description


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address Hex 1A0 Dec 416 IA32_MISC_ENABLE 0 6:1 7 10:8 11 12 15:13 16 18 21:19 22 23 33:24 34 37:35 38 Package Thread Thread Thread Package Thread Thread Thread Thread Thread Enable Misc. Processor Features (R/W) Allows a variety of processor functions to be enabled and disabled. Fast-Strings Enable See Table 35-2 Reserved. Performance Monitoring Available (R) See Table 35-2. Reserved. Branch Trace Storage Unavailable (RO) See Table 35-2. Precise Event Based Sampling Unavailable (RO) See Table 35-2. Reserved. Enhanced Intel SpeedStep Technology Enable (R/W) See Table 35-2. ENABLE MONITOR FSM. (R/W) See Table 35-2. Reserved. Limit CPUID Maxval (R/W) See Table 35-2. xTPR Message Disable (R/W) See Table 35-2. Reserved. XD Bit Disable (R/W) See Table 35-2. Reserved. Turbo Mode Disable (R/W) When set to 1 on processors that support Intel Turbo Boost Technology, the turbo mode feature is disabled and the IDA_Enable feature flag will be clear (CPUID.06H: EAX[1]=0). When set to a 0 on processors that support IDA, CPUID.06H: EAX[1] reports the processors support of turbo mode is enabled. Note: the power-on default value is used by BIOS to detect hardware support of turbo mode. If power-on default value is 1, turbo mode is available in the processor. If power-on default value is 0, turbo mode is not available. 63:39 Reserved. Register Name Scope Bit Description


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address Hex 1A2H Dec 418 MSR_ TEMPERATURE_TARGET 15:0 23:16 Unique Reserved. Temperature Target (R) The minimum temperature at which PROCHOT# will be asserted. The value is degree C. 63:24 1A6H 1A7H 1AAH 1ADH 1B0H 1B1H 1B2H 1C8H 1C9H 422 422 426 428 432 433 434 456 457 MSR_OFFCORE_RSP_0 MSR_OFFCORE_RSP_1 MSR_MISC_PWR_MGMT MSR_TURBO_PWR_ CURRENT_LIMIT IA32_ENERGY_PERF_BIAS IA32_PACKAGE_THERM_ STATUS IA32_PACKAGE_THERM_ INTERRUPT MSR_LBR_SELECT MSR_LASTBRANCH_TOS Package Package Package Thread Thread Thread Thread Reserved. Offcore Response Event Select Register (R/W) Offcore Response Event Select Register (R/W) See http://biosbits.org. See http://biosbits.org. See Table 35-2. See Table 35-2. See Table 35-2. Last Branch Record Filtering Select Register (R/W) See Section 17.6.2, Filtering of Last Branch Records. Last Branch Record Stack TOS (R) Contains an index (bits 0-3) that points to the MSR containing the most recent branch record. See MSR_LASTBRANCH_0_FROM_IP (at 680H). 1D9H 1DDH 473 477 IA32_DEBUGCTL MSR_LER_FROM_LIP Thread Thread Debug Control (R/W) See Table 35-2. Last Exception Record From Linear IP (R) Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1DEH 478 MSR_LER_TO_LIP Thread Last Exception Record To Linear IP (R) This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. 1F2H 1F3H 498 499 IA32_SMRR_PHYSBASE IA32_SMRR_PHYSMASK Core Core See Table 35-2. See Table 35-2. Register Name Scope Bit Description
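As a small illustration of the MSR_TEMPERATURE_TARGET layout above, the temperature target occupies bits 23:16 and is expressed in degrees C. This is a sketch under that reading only; the function name is hypothetical and the raw value is assumed to come from a privileged read of address 1A2H.

```python
def temperature_target_celsius(msr_temperature_target: int) -> int:
    """Extract the Temperature Target field (bits 23:16) in degrees C."""
    return (msr_temperature_target >> 16) & 0xFF
```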


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
1FCH | 508 | MSR_POWER_CTL | Core | See http://biosbits.org.
200H | 512 | IA32_MTRR_PHYSBASE0 | Thread | See Table 35-2.
201H | 513 | IA32_MTRR_PHYSMASK0 | Thread | See Table 35-2.
202H | 514 | IA32_MTRR_PHYSBASE1 | Thread | See Table 35-2.
203H | 515 | IA32_MTRR_PHYSMASK1 | Thread | See Table 35-2.
204H | 516 | IA32_MTRR_PHYSBASE2 | Thread | See Table 35-2.
205H | 517 | IA32_MTRR_PHYSMASK2 | Thread | See Table 35-2.
206H | 518 | IA32_MTRR_PHYSBASE3 | Thread | See Table 35-2.
207H | 519 | IA32_MTRR_PHYSMASK3 | Thread | See Table 35-2.
208H | 520 | IA32_MTRR_PHYSBASE4 | Thread | See Table 35-2.
209H | 521 | IA32_MTRR_PHYSMASK4 | Thread | See Table 35-2.
20AH | 522 | IA32_MTRR_PHYSBASE5 | Thread | See Table 35-2.
20BH | 523 | IA32_MTRR_PHYSMASK5 | Thread | See Table 35-2.
20CH | 524 | IA32_MTRR_PHYSBASE6 | Thread | See Table 35-2.
20DH | 525 | IA32_MTRR_PHYSMASK6 | Thread | See Table 35-2.
20EH | 526 | IA32_MTRR_PHYSBASE7 | Thread | See Table 35-2.
20FH | 527 | IA32_MTRR_PHYSMASK7 | Thread | See Table 35-2.
210H | 528 | IA32_MTRR_PHYSBASE8 | Thread | See Table 35-2.
211H | 529 | IA32_MTRR_PHYSMASK8 | Thread | See Table 35-2.
212H | 530 | IA32_MTRR_PHYSBASE9 | Thread | See Table 35-2.
213H | 531 | IA32_MTRR_PHYSMASK9 | Thread | See Table 35-2.
250H | 592 | IA32_MTRR_FIX64K_00000 | Thread | See Table 35-2.
258H | 600 | IA32_MTRR_FIX16K_80000 | Thread | See Table 35-2.
259H | 601 | IA32_MTRR_FIX16K_A0000 | Thread | See Table 35-2.
268H | 616 | IA32_MTRR_FIX4K_C0000 | Thread | See Table 35-2.
269H | 617 | IA32_MTRR_FIX4K_C8000 | Thread | See Table 35-2.
26AH | 618 | IA32_MTRR_FIX4K_D0000 | Thread | See Table 35-2.
26BH | 619 | IA32_MTRR_FIX4K_D8000 | Thread | See Table 35-2.
26CH | 620 | IA32_MTRR_FIX4K_E0000 | Thread | See Table 35-2.
26DH | 621 | IA32_MTRR_FIX4K_E8000 | Thread | See Table 35-2.
26EH | 622 | IA32_MTRR_FIX4K_F0000 | Thread | See Table 35-2.
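The variable-range MTRR entries above follow a regular pattern: each PHYSBASEn/PHYSMASKn pair sits at consecutive addresses starting from 200H. The sketch below only restates that addressing pattern as listed in this table; the function and constant names are hypothetical.

```python
from typing import Tuple

MTRR_PHYSBASE0 = 0x200   # IA32_MTRR_PHYSBASE0, as listed in the table
NUM_VARIABLE_MTRRS = 10  # PHYSBASE0..PHYSBASE9 appear in the table


def mtrr_pair_addresses(n: int) -> Tuple[int, int]:
    """Return the (PHYSBASEn, PHYSMASKn) MSR addresses for pair n."""
    if not 0 <= n < NUM_VARIABLE_MTRRS:
        raise ValueError(f"pair index {n} is outside 0..{NUM_VARIABLE_MTRRS - 1}")
    base = MTRR_PHYSBASE0 + 2 * n
    return base, base + 1
```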


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
26FH | 623 | IA32_MTRR_FIX4K_F8000 | Thread | See Table 35-2.
277H | 631 | IA32_PAT | Thread | See Table 35-2.
280H | 640 | IA32_MC0_CTL2 | Core | See Table 35-2.
281H | 641 | IA32_MC1_CTL2 | Core | See Table 35-2.
282H | 642 | IA32_MC2_CTL2 | Core | See Table 35-2.
283H | 643 | IA32_MC3_CTL2 | Core | See Table 35-2.
284H | 644 | MSR_MC4_CTL2 | Package | Always 0 (CMCI not supported).
2FFH | 767 | IA32_MTRR_DEF_TYPE | Thread | Default Memory Types (R/W) See Table 35-2.
309H | 777 | IA32_FIXED_CTR0 | Thread | Fixed-Function Performance Counter Register 0 (R/W) See Table 35-2.
30AH | 778 | IA32_FIXED_CTR1 | Thread | Fixed-Function Performance Counter Register 1 (R/W) See Table 35-2.
30BH | 779 | IA32_FIXED_CTR2 | Thread | Fixed-Function Performance Counter Register 2 (R/W) See Table 35-2.
345H | 837 | IA32_PERF_CAPABILITIES | Thread | See Table 35-2. See Section 17.4.1, IA32_DEBUGCTL MSR.
 |  | 5:0 |  | LBR Format. See Table 35-2.
 |  | 6 |  | PEBS Record Format.
 |  | 7 |  | PEBSSaveArchRegs. See Table 35-2.
 |  | 11:8 |  | PEBS_REC_FORMAT. See Table 35-2.
 |  | 12 |  | SMM_FREEZE. See Table 35-2.
 |  | 63:13 |  | Reserved.
38DH | 909 | IA32_FIXED_CTR_CTRL | Thread | Fixed-Function-Counter Control Register (R/W) See Table 35-2.
38EH | 910 | IA32_PERF_GLOBAL_STATUS | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
38FH | 911 | IA32_PERF_GLOBAL_CTRL | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
390H | 912 | IA32_PERF_GLOBAL_OVF_CTRL | Thread | See Table 35-2. See Section 18.4.2, Global Counter Control Facilities.
3F1H | 1009 | MSR_PEBS_ENABLE | Thread | See Section 18.6.1.1, Precise Event Based Sampling (PEBS).
 |  | 0 |  | Enable PEBS on IA32_PMC0. (R/W)
 |  | 1 |  | Enable PEBS on IA32_PMC1. (R/W)
 |  | 2 |  | Enable PEBS on IA32_PMC2. (R/W)


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
 |  | 3 |  | Enable PEBS on IA32_PMC3. (R/W)
 |  | 31:4 |  | Reserved.
 |  | 32 |  | Enable Load Latency on IA32_PMC0. (R/W)
 |  | 33 |  | Enable Load Latency on IA32_PMC1. (R/W)
 |  | 34 |  | Enable Load Latency on IA32_PMC2. (R/W)
 |  | 35 |  | Enable Load Latency on IA32_PMC3. (R/W)
 |  | 63:36 |  | Reserved.
3F6H | 1014 | MSR_PEBS_LD_LAT | Thread | See Section 18.6.1.2, Load Latency Performance Monitoring Facility.
 |  | 15:0 |  | Minimum threshold latency value of tagged load operation that will be counted. (R/W)
 |  | 63:36 |  | Reserved.
3F8H | 1016 | MSR_PKG_C3_RESIDENCY | Package | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | Package C3 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C3 states. Count at the same frequency as the TSC.
3F9H | 1017 | MSR_PKG_C6_RESIDENCY | Package | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | Package C6 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C6 states. Count at the same frequency as the TSC.
3FAH | 1018 | MSR_PKG_C7_RESIDENCY | Package | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | Package C7 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C7 states. Count at the same frequency as the TSC.
3FCH | 1020 | MSR_CORE_C3_RESIDENCY | Core | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | CORE C3 Residency Counter. (R/O) Value since last reset that this core is in processor-specific C3 states. Count at the same frequency as the TSC.
3FDH | 1021 | MSR_CORE_C6_RESIDENCY | Core | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
 |  | 63:0 |  | CORE C6 Residency Counter. (R/O) Value since last reset that this core is in processor-specific C6 states. Count at the same frequency as the TSC.
3FEH | 1022 | MSR_CORE_C7_RESIDENCY | Core | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | CORE C7 Residency Counter. (R/O) Value since last reset that this core is in processor-specific C7 states. Count at the same frequency as the TSC.
400H | 1024 | IA32_MC0_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
401H | 1025 | IA32_MC0_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
402H | 1026 | IA32_MC0_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
403H | 1027 | IA32_MC0_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
404H | 1028 | IA32_MC1_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
405H | 1029 | IA32_MC1_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
406H | 1030 | IA32_MC1_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
407H | 1031 | IA32_MC1_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
408H | 1032 | IA32_MC2_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
409H | 1033 | IA32_MC2_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
40AH | 1034 | IA32_MC2_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
40BH | 1035 | IA32_MC2_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
40CH | 1036 | IA32_MC3_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
40DH | 1037 | IA32_MC3_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
40EH | 1038 | IA32_MC3_ADDR | Core | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
40FH | 1039 | IA32_MC3_MISC | Core | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
410H | 1040 | MSR_MC4_CTL | Core | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
 |  | 0 |  | PCU Hardware Error (R/W) When set, enables signaling of PCU hardware detected errors.
 |  | 1 |  | PCU Controller Error (R/W) When set, enables signaling of PCU controller detected errors.
 |  | 2 |  | PCU Firmware Error (R/W) When set, enables signaling of PCU firmware detected errors.
 |  | 63:3 |  | Reserved.
411H | 1041 | IA32_MC4_STATUS | Core | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
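Because the residency counters are described as counting at the same frequency as the TSC, the fraction of time spent in a given C-state over an interval can be estimated from two samples of the counter and two samples of the TSC. The following is a minimal sketch under that assumption; the function name is hypothetical, the samples are assumed to be taken close together by privileged code, and both values are treated as free-running 64-bit counters.

```python
def residency_percent(counter_start: int, counter_end: int,
                      tsc_start: int, tsc_end: int) -> float:
    """Percentage of the sampled interval spent in the C-state being counted."""
    mask = (1 << 64) - 1
    # Masking the differences keeps the result correct across a 64-bit wrap.
    residency_ticks = (counter_end - counter_start) & mask
    elapsed_ticks = (tsc_end - tsc_start) & mask
    if elapsed_ticks == 0:
        raise ValueError("TSC samples are identical; the interval is empty")
    return 100.0 * residency_ticks / elapsed_ticks
```

The same helper applies to the package and core counters alike, since they share the 63:0 layout and the TSC-rate counting described in the table.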


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
480H | 1152 | IA32_VMX_BASIC | Thread | Reporting Register of Basic VMX Capabilities (R/O) See Table 35-2. See Appendix A.1, Basic VMX Information.
481H | 1153 | IA32_VMX_PINBASED_CTLS | Thread | Capability Reporting Register of Pin-based VM-execution Controls (R/O) See Table 35-2. See Appendix A.3, VM-Execution Controls.
482H | 1154 | IA32_VMX_PROCBASED_CTLS | Thread | Capability Reporting Register of Primary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls.
483H | 1155 | IA32_VMX_EXIT_CTLS | Thread | Capability Reporting Register of VM-exit Controls (R/O) See Table 35-2. See Appendix A.4, VM-Exit Controls.
484H | 1156 | IA32_VMX_ENTRY_CTLS | Thread | Capability Reporting Register of VM-entry Controls (R/O) See Table 35-2. See Appendix A.5, VM-Entry Controls.
485H | 1157 | IA32_VMX_MISC | Thread | Reporting Register of Miscellaneous VMX Capabilities (R/O) See Table 35-2. See Appendix A.6, Miscellaneous Data.
486H | 1158 | IA32_VMX_CR0_FIXED0 | Thread | Capability Reporting Register of CR0 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0.
487H | 1159 | IA32_VMX_CR0_FIXED1 | Thread | Capability Reporting Register of CR0 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.7, VMX-Fixed Bits in CR0.
488H | 1160 | IA32_VMX_CR4_FIXED0 | Thread | Capability Reporting Register of CR4 Bits Fixed to 0 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4.
489H | 1161 | IA32_VMX_CR4_FIXED1 | Thread | Capability Reporting Register of CR4 Bits Fixed to 1 (R/O) See Table 35-2. See Appendix A.8, VMX-Fixed Bits in CR4.
48AH | 1162 | IA32_VMX_VMCS_ENUM | Thread | Capability Reporting Register of VMCS Field Enumeration (R/O) See Table 35-2. See Appendix A.9, VMCS Enumeration.
48BH | 1163 | IA32_VMX_PROCBASED_CTLS2 | Thread | Capability Reporting Register of Secondary Processor-based VM-execution Controls (R/O) See Appendix A.3, VM-Execution Controls.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
4C1H | 1217 | IA32_A_PMC0 | Thread | See Table 35-2.
4C2H | 1218 | IA32_A_PMC1 | Thread | See Table 35-2.
4C3H | 1219 | IA32_A_PMC2 | Thread | See Table 35-2.
4C4H | 1220 | IA32_A_PMC3 | Thread | See Table 35-2.
4C5H | 1221 | IA32_A_PMC4 | Core | See Table 35-2.
4C6H | 1222 | IA32_A_PMC5 | Core | See Table 35-2.
4C7H | 1223 | IA32_A_PMC6 | Core | See Table 35-2.
4C8H | 1224 | IA32_A_PMC7 | Core | See Table 35-2.
600H | 1536 | IA32_DS_AREA | Thread | DS Save Area (R/W) See Table 35-2. See Section 18.11.4, Debug Store (DS) Mechanism.
606H | 1542 | MSR_RAPL_POWER_UNIT | Package | Unit Multipliers used in RAPL Interfaces (R/O) See Section 14.7.1, RAPL Interfaces.
60AH | 1546 | MSR_PKGC3_IRTL | Package | Package C3 Interrupt Response Limit (R/W) Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 9:0 |  | Interrupt response time limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C3 state.
 |  | 12:10 |  | Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns; 001b: 32 ns; 010b: 1024 ns; 011b: 32768 ns; 100b: 1048576 ns; 101b: 33554432 ns.
 |  | 14:13 |  | Reserved.
 |  | 15 |  | Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-state management.
 |  | 63:16 |  | Reserved.
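The interrupt response limit above combines a 10-bit count, a 3-bit time-unit code, and a valid bit. The sketch below converts a raw MSR_PKGCn_IRTL value to nanoseconds using only the field positions and unit encodings listed in the table; the names `TIME_UNIT_NS` and `irtl_limit_ns` are hypothetical, and unit codes not listed in the table are rejected rather than guessed.

```python
from typing import Optional

# Time-unit encodings for bits 12:10, as listed in the table (nanoseconds).
TIME_UNIT_NS = {
    0b000: 1,
    0b001: 32,
    0b010: 1024,
    0b011: 32768,
    0b100: 1048576,
    0b101: 33554432,
}


def irtl_limit_ns(msr_value: int) -> Optional[int]:
    """Return the interrupt response limit in ns, or None if bit 15 (Valid) is clear."""
    if not msr_value & (1 << 15):
        return None
    limit = msr_value & 0x3FF             # bits 9:0: limit in time units
    unit_code = (msr_value >> 10) & 0x7   # bits 12:10: time-unit encoding
    unit_ns = TIME_UNIT_NS.get(unit_code)
    if unit_ns is None:
        raise ValueError(f"time-unit encoding {unit_code:03b}b is not listed")
    return limit * unit_ns
```

For example, a value with the valid bit set, a time unit of 010b, and a limit of 100 decodes to 100 x 1024 ns.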


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
60BH | 1547 | MSR_PKGC6_IRTL | Package | Package C6 Interrupt Response Limit (R/W) This MSR defines the budget allocated for the package to exit from C6 to a C0 state, where interrupt request can be delivered to the core and serviced. Additional core-exit latency may be applicable depending on the actual C-state the core is in. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 9:0 |  | Interrupt response time limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C6 state.
 |  | 12:10 |  | Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns; 001b: 32 ns; 010b: 1024 ns; 011b: 32768 ns; 100b: 1048576 ns; 101b: 33554432 ns.
 |  | 14:13 |  | Reserved.
 |  | 15 |  | Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-state management.
 |  | 63:16 |  | Reserved.
60CH | 1548 | MSR_PKGC7_IRTL | Package | Package C7 Interrupt Response Limit (R/W) This MSR defines the budget allocated for the package to exit from C7 to a C0 state, where interrupt request can be delivered to the core and serviced. Additional core-exit latency may be applicable depending on the actual C-state the core is in. Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 9:0 |  | Interrupt response time limit (R/W) Specifies the limit that should be used to decide if the package should be put into a package C7 state.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
 |  | 12:10 |  | Time Unit (R/W) Specifies the encoding value of time unit of the interrupt response time limit. The following time unit encodings are supported: 000b: 1 ns; 001b: 32 ns; 010b: 1024 ns; 011b: 32768 ns; 100b: 1048576 ns; 101b: 33554432 ns.
 |  | 14:13 |  | Reserved.
 |  | 15 |  | Valid (R/W) Indicates whether the values in bits 12:0 are valid and can be used by the processor for package C-state management.
 |  | 63:16 |  | Reserved.
60DH | 1549 | MSR_PKG_C2_RESIDENCY | Package | Note: C-state values are processor specific C-state code names, unrelated to MWAIT extension C-state parameters or ACPI C-States.
 |  | 63:0 |  | Package C2 Residency Counter. (R/O) Value since last reset that this package is in processor-specific C2 states. Count at the same frequency as the TSC.
610H | 1552 | MSR_PKG_RAPL_POWER_LIMIT | Package | PKG RAPL Power Limit Control (R/W) See Section 14.7.3, Package RAPL Domain.
611H | 1553 | MSR_PKG_ENERGY_STATUS | Package | PKG Energy Status (R/O) See Section 14.7.3, Package RAPL Domain.
614H | 1556 | MSR_PKG_POWER_INFO | Package | PKG RAPL Parameters (R/W) See Section 14.7.3, Package RAPL Domain.
638H | 1592 | MSR_PP0_POWER_LIMIT | Package | PP0 RAPL Power Limit Control (R/W) See Section 14.7.4, PP0/PP1 RAPL Domains.
639H | 1593 | MSR_PP0_ENERGY_STATUS | Package | PP0 Energy Status (R/O) See Section 14.7.4, PP0/PP1 RAPL Domains.
63AH | 1594 | MSR_PP0_POLICY | Package | PP0 Balance Policy (R/W) See Section 14.7.4, PP0/PP1 RAPL Domains.
63BH | 1595 | MSR_PP0_PERF_STATUS | Package | PP0 Performance Throttling Status (R/O) See Section 14.7.4, PP0/PP1 RAPL Domains.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
680H | 1664 | MSR_LASTBRANCH_0_FROM_IP | Thread | Last Branch Record 0 From IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the source instruction for one of the last sixteen branches, exceptions, or interrupts taken by the processor. See also: Last Branch Record Stack TOS at 1C9H; Section 17.6.1, LBR Stack.
681H | 1665 | MSR_LASTBRANCH_1_FROM_IP | Thread | Last Branch Record 1 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
682H | 1666 | MSR_LASTBRANCH_2_FROM_IP | Thread | Last Branch Record 2 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
683H | 1667 | MSR_LASTBRANCH_3_FROM_IP | Thread | Last Branch Record 3 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
684H | 1668 | MSR_LASTBRANCH_4_FROM_IP | Thread | Last Branch Record 4 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
685H | 1669 | MSR_LASTBRANCH_5_FROM_IP | Thread | Last Branch Record 5 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
686H | 1670 | MSR_LASTBRANCH_6_FROM_IP | Thread | Last Branch Record 6 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
687H | 1671 | MSR_LASTBRANCH_7_FROM_IP | Thread | Last Branch Record 7 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
688H | 1672 | MSR_LASTBRANCH_8_FROM_IP | Thread | Last Branch Record 8 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
689H | 1673 | MSR_LASTBRANCH_9_FROM_IP | Thread | Last Branch Record 9 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
68AH | 1674 | MSR_LASTBRANCH_10_FROM_IP | Thread | Last Branch Record 10 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
68BH | 1675 | MSR_LASTBRANCH_11_FROM_IP | Thread | Last Branch Record 11 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
68CH | 1676 | MSR_LASTBRANCH_12_FROM_IP | Thread | Last Branch Record 12 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
68DH | 1677 | MSR_LASTBRANCH_13_FROM_IP | Thread | Last Branch Record 13 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
68EH | 1678 | MSR_LASTBRANCH_14_FROM_IP | Thread | Last Branch Record 14 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
68FH | 1679 | MSR_LASTBRANCH_15_FROM_IP | Thread | Last Branch Record 15 From IP (R/W) See description of MSR_LASTBRANCH_0_FROM_IP.
6C0H | 1728 | MSR_LASTBRANCH_0_TO_IP | Thread | Last Branch Record 0 To IP (R/W) One of sixteen pairs of last branch record registers on the last branch record stack. This part of the stack contains pointers to the destination instruction for one of the last sixteen branches, exceptions, or interrupts taken by the processor.
6C1H | 1729 | MSR_LASTBRANCH_1_TO_IP | Thread | Last Branch Record 1 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C2H | 1730 | MSR_LASTBRANCH_2_TO_IP | Thread | Last Branch Record 2 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C3H | 1731 | MSR_LASTBRANCH_3_TO_IP | Thread | Last Branch Record 3 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C4H | 1732 | MSR_LASTBRANCH_4_TO_IP | Thread | Last Branch Record 4 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C5H | 1733 | MSR_LASTBRANCH_5_TO_IP | Thread | Last Branch Record 5 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C6H | 1734 | MSR_LASTBRANCH_6_TO_IP | Thread | Last Branch Record 6 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C7H | 1735 | MSR_LASTBRANCH_7_TO_IP | Thread | Last Branch Record 7 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C8H | 1736 | MSR_LASTBRANCH_8_TO_IP | Thread | Last Branch Record 8 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6C9H | 1737 | MSR_LASTBRANCH_9_TO_IP | Thread | Last Branch Record 9 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6CAH | 1738 | MSR_LASTBRANCH_10_TO_IP | Thread | Last Branch Record 10 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6CBH | 1739 | MSR_LASTBRANCH_11_TO_IP | Thread | Last Branch Record 11 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6CCH | 1740 | MSR_LASTBRANCH_12_TO_IP | Thread | Last Branch Record 12 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6CDH | 1741 | MSR_LASTBRANCH_13_TO_IP | Thread | Last Branch Record 13 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.


Table 35-11 MSRs Supported by Intel Processors Based on Intel Microarchitecture Code Name Sandy Bridge (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
6CEH | 1742 | MSR_LASTBRANCH_14_TO_IP | Thread | Last Branch Record 14 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6CFH | 1743 | MSR_LASTBRANCH_15_TO_IP | Thread | Last Branch Record 15 To IP (R/W) See description of MSR_LASTBRANCH_0_TO_IP.
6E0H | 1760 | IA32_TSC_DEADLINE | Thread | See Table 35-2.
C000_0080H |  | IA32_EFER | Thread | Extended Feature Enables See Table 35-2.
C000_0081H |  | IA32_STAR | Thread | System Call Target Address (R/W) See Table 35-2.
C000_0082H |  | IA32_LSTAR | Thread | IA-32e Mode System Call Target Address (R/W) See Table 35-2.
C000_0084H |  | IA32_FMASK | Thread | System Call Flag Mask (R/W) See Table 35-2.
C000_0100H |  | IA32_FS_BASE | Thread | Map of BASE Address of FS (R/W) See Table 35-2.
C000_0101H |  | IA32_GS_BASE | Thread | Map of BASE Address of GS (R/W) See Table 35-2.
C000_0102H |  | IA32_KERNEL_GSBASE | Thread | Swap Target of BASE Address of GS (R/W) See Table 35-2.
C000_0103H |  | IA32_TSC_AUX | Thread | AUXILIARY TSC Signature (R/W) See Table 35-2 and Section 17.13.2, IA32_TSC_AUX Register and RDTSCP Support.
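The last branch record entries in Table 35-11 are laid out as sixteen FROM/TO pairs starting at 680H and 6C0H, and MSR_LASTBRANCH_TOS (1C9H) holds an index in bits 0-3 that points to the most recent record. The sketch below maps a TOS value to the corresponding pair of MSR addresses. The function names are hypothetical, and the newest-first walk in `lbr_pairs_newest_first` assumes the stack is filled in increasing index order with wraparound, which this table does not state; see Section 17.6.1 for the authoritative behavior.

```python
from typing import List, Tuple

LBR_FROM_BASE = 0x680  # MSR_LASTBRANCH_0_FROM_IP
LBR_TO_BASE = 0x6C0    # MSR_LASTBRANCH_0_TO_IP
LBR_DEPTH = 16         # sixteen FROM/TO pairs are listed in the table


def newest_lbr_pair(tos_value: int) -> Tuple[int, int]:
    """Return (FROM, TO) MSR addresses of the record that TOS points to."""
    index = tos_value & 0xF  # bits 0-3 of MSR_LASTBRANCH_TOS
    return LBR_FROM_BASE + index, LBR_TO_BASE + index


def lbr_pairs_newest_first(tos_value: int) -> List[Tuple[int, int]]:
    """List all sixteen pairs, starting at TOS and stepping backwards.

    Assumption: older records sit at lower indices, wrapping modulo 16.
    """
    index = tos_value & 0xF
    return [
        (LBR_FROM_BASE + (index - i) % LBR_DEPTH,
         LBR_TO_BASE + (index - i) % LBR_DEPTH)
        for i in range(LBR_DEPTH)
    ]
```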

...

35.7.2 MSRs In Intel Xeon Processor E5 Family (Based on Intel Microarchitecture Code Name Sandy Bridge)

Table 35-13 lists selected model-specific registers (MSRs) that are specific to the Intel Xeon Processor E5 Family (based on Intel microarchitecture code name Sandy Bridge). These processors have a CPUID signature with DisplayFamily_DisplayModel of 06_2DH, see Table 35-1.
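The DisplayFamily_DisplayModel value mentioned above is derived from the version information that CPUID leaf 01H returns in EAX. The following sketch applies the derivation rule given in the CPUID instruction description (extended family is added only when the family field is 0FH; extended model is prepended only when the family field is 06H or 0FH). The function names are hypothetical, and the EAX value is assumed to have been obtained from CPUID elsewhere.

```python
from typing import Tuple


def display_family_model(cpuid_eax: int) -> Tuple[int, int]:
    """Compute (DisplayFamily, DisplayModel) from CPUID.01H:EAX."""
    model = (cpuid_eax >> 4) & 0xF
    family = (cpuid_eax >> 8) & 0xF
    ext_model = (cpuid_eax >> 16) & 0xF
    ext_family = (cpuid_eax >> 20) & 0xFF
    display_family = family + ext_family if family == 0xF else family
    if family in (0x6, 0xF):
        display_model = (ext_model << 4) + model
    else:
        display_model = model
    return display_family, display_model


def format_signature(cpuid_eax: int) -> str:
    """Format the result in the DisplayFamily_DisplayModel style used in this chapter."""
    family, model = display_family_model(cpuid_eax)
    return f"{family:02X}_{model:02X}H"
```

With a sample EAX value of 000206D7H (family 6, extended model 2, model DH), `format_signature` yields the 06_2DH signature that this section associates with the Intel Xeon Processor E5 Family.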


Table 35-13 Selected MSRs Supported by Intel Xeon Processors E5 Family (Based on Intel Microarchitecture Code Name Sandy Bridge)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
17FH | 383 | MSR_ERROR_CONTROL | Package | MC Bank Error Configuration (R/W)
 |  | 0 |  | Reserved
 |  | 1 |  | MemError Log Enable (R/W) When set, enables IMC status bank to log additional info in bits 36:32.
 |  | 63:2 |  | Reserved.
285H | 645 | IA32_MC5_CTL2 | Package | See Table 35-2.
286H | 646 | IA32_MC6_CTL2 | Package | See Table 35-2.
287H | 647 | IA32_MC7_CTL2 | Package | See Table 35-2.
288H | 648 | IA32_MC8_CTL2 | Package | See Table 35-2.
289H | 649 | IA32_MC9_CTL2 | Package | See Table 35-2.
28AH | 650 | IA32_MC10_CTL2 | Package | See Table 35-2.
28BH | 651 | IA32_MC11_CTL2 | Package | See Table 35-2.
28CH | 652 | IA32_MC12_CTL2 | Package | See Table 35-2.
28DH | 653 | IA32_MC13_CTL2 | Package | See Table 35-2.
28EH | 654 | IA32_MC14_CTL2 | Package | See Table 35-2.
28FH | 655 | IA32_MC15_CTL2 | Package | See Table 35-2.
290H | 656 | IA32_MC16_CTL2 | Package | See Table 35-2.
291H | 657 | IA32_MC17_CTL2 | Package | See Table 35-2.
292H | 658 | IA32_MC18_CTL2 | Package | See Table 35-2.
293H | 659 | IA32_MC19_CTL2 | Package | See Table 35-2.
414H | 1044 | MSR_MC5_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
415H | 1045 | MSR_MC5_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
416H | 1046 | MSR_MC5_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
417H | 1047 | MSR_MC5_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
418H | 1048 | MSR_MC6_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
419H | 1049 | MSR_MC6_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
41AH | 1050 | MSR_MC6_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
41BH | 1051 | MSR_MC6_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
41CH | 1052 | MSR_MC7_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
41DH | 1053 | MSR_MC7_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
41EH | 1054 | MSR_MC7_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
41FH | 1055 | MSR_MC7_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.


Table 35-13 Selected MSRs Supported by Intel Xeon Processors E5 Family (Based on Intel Microarchitecture Code Name Sandy Bridge) (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
420H | 1056 | MSR_MC8_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
421H | 1057 | MSR_MC8_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
422H | 1058 | MSR_MC8_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
423H | 1059 | MSR_MC8_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
424H | 1060 | MSR_MC9_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
425H | 1061 | MSR_MC9_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
426H | 1062 | MSR_MC9_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
427H | 1063 | MSR_MC9_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
428H | 1064 | MSR_MC10_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
429H | 1065 | MSR_MC10_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
42AH | 1066 | MSR_MC10_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
42BH | 1067 | MSR_MC10_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
42CH | 1068 | MSR_MC11_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
42DH | 1069 | MSR_MC11_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
42EH | 1070 | MSR_MC11_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
42FH | 1071 | MSR_MC11_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
430H | 1072 | MSR_MC12_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
431H | 1073 | MSR_MC12_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
432H | 1074 | MSR_MC12_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
433H | 1075 | MSR_MC12_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
434H | 1076 | MSR_MC13_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
435H | 1077 | MSR_MC13_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
436H | 1078 | MSR_MC13_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
437H | 1079 | MSR_MC13_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
438H | 1080 | MSR_MC14_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
439H | 1081 | MSR_MC14_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
43AH | 1082 | MSR_MC14_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
43BH | 1083 | MSR_MC14_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
43CH | 1084 | MSR_MC15_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
43DH | 1085 | MSR_MC15_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
43EH | 1086 | MSR_MC15_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
43FH | 1087 | MSR_MC15_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
440H | 1088 | MSR_MC16_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.


Table 35-13 Selected MSRs Supported by Intel Xeon Processors E5 Family (Based on Intel Microarchitecture Code Name Sandy Bridge) (Contd.)
Register Address (Hex) | Register Address (Dec) | Register Name | Scope | Bit Description
441H | 1089 | MSR_MC16_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
442H | 1090 | MSR_MC16_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
443H | 1091 | MSR_MC16_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
444H | 1092 | MSR_MC17_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
445H | 1093 | MSR_MC17_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
446H | 1094 | MSR_MC17_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
447H | 1095 | MSR_MC17_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
448H | 1096 | MSR_MC18_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
449H | 1097 | MSR_MC18_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
44AH | 1098 | MSR_MC18_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
44BH | 1099 | MSR_MC18_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
44CH | 1100 | MSR_MC19_CTL | Package | See Section 15.3.2.1, IA32_MCi_CTL MSRs.
44DH | 1101 | MSR_MC19_STATUS | Package | See Section 15.3.2.2, IA32_MCi_STATUS MSRs, and Chapter 16.
44EH | 1102 | MSR_MC19_ADDR | Package | See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
44FH | 1103 | MSR_MC19_MISC | Package | See Section 15.3.2.4, IA32_MCi_MISC MSRs.
613H | 1555 | MSR_RAPL_PERF_STATUS | Package | RAPL Perf Status (R/O)
618H | 1560 | MSR_DRAM_POWER_LIMIT | Package | DRAM RAPL Power Limit Control (R/W) See Section 14.7.5, DRAM RAPL Domain.
619H | 1561 | MSR_DRAM_ENERGY_STATUS | Package | DRAM Energy Status (R/O) See Section 14.7.5, DRAM RAPL Domain.
61BH | 1563 | MSR_DRAM_PERF_STATUS | Package | DRAM Performance Throttling Status (R/O) See Section 14.7.5, DRAM RAPL Domain.
61CH | 1564 | MSR_DRAM_POWER_INFO | Package | DRAM RAPL Parameters (R/W) See Section 14.7.5, DRAM RAPL Domain.

...
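The machine-check bank registers in Table 35-13 follow a regular layout: each bank i has four consecutive registers (CTL, STATUS, ADDR, MISC) starting at 400H + 4*i, and a CTL2 register at 280H + i. The sketch below restates that pattern as read off the addresses listed for banks 5 through 19; the function and constant names are hypothetical, and whether a given bank exists on a particular processor is not checked here.

```python
from typing import Dict

MC_CTL_BASE = 0x400   # address of the bank 0 CTL register
MC_CTL2_BASE = 0x280  # address of the bank 0 CTL2 register


def mc_bank_addresses(bank: int) -> Dict[str, int]:
    """Return the MSR addresses of the five registers belonging to one bank."""
    if bank < 0:
        raise ValueError("bank index must be non-negative")
    base = MC_CTL_BASE + 4 * bank
    return {
        "CTL": base,
        "STATUS": base + 1,
        "ADDR": base + 2,
        "MISC": base + 3,
        "CTL2": MC_CTL2_BASE + bank,
    }
```

For bank 5 this gives 414H through 417H plus 285H, matching the MSR_MC5_* and IA32_MC5_CTL2 rows of the table.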

35.9 MSRS IN THE NEXT GENERATION INTEL CORE PROCESSORS (BASED ON INTEL MICROARCHITECTURE CODE NAME HASWELL)

The Next Generation Intel Core Processor Family (based on Intel microarchitecture code name Haswell) supports the MSR interfaces listed in Table 35-11, Table 35-12, Table 35-14, and Table 35-15.


Table 35-15 Additional MSRs Supported by Next Generation Intel Core Processors (Based on Intel Microarchitecture Code Name Haswell)
Hex   Dec   Register Name     Scope    Bit Description
3BH   59    IA32_TSC_ADJUST   THREAD   Per-Logical-Processor TSC ADJUST (R/W). See Table 35-2.

35.10 MSRS IN THE PENTIUM 4 AND INTEL XEON PROCESSORS

Table 35-15 lists MSRs (architectural and model-specific) that are defined across processor generations based on Intel NetBurst microarchitecture. These processors can be identified by a CPUID signature with a DisplayFamily encoding of 0FH; see Table 35-1. MSRs with an IA32_ prefix are designated as architectural: the functions of these MSRs and their addresses remain the same for succeeding families of IA-32 processors. MSRs with an MSR_ prefix are model specific with respect to address and functionality. The Model Availability column lists the model encoding value(s) within the Pentium 4 and Intel Xeon processor family that support the register at the specified address. The model encoding value of a processor can be queried using CPUID. See "CPUID - CPU Identification" in Chapter 3 of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 2A.
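As an illustration of the identification step described above, the following minimal sketch splits a CPUID.01H:EAX signature into DisplayFamily and DisplayModel values. The function names and the sample signature values are illustrative assumptions, not part of this manual; obtaining EAX from the CPUID instruction itself is outside the sketch.

```python
def decode_cpuid_signature(eax: int) -> dict:
    """Split a CPUID.01H:EAX signature into display family, display model, stepping."""
    stepping = eax & 0xF
    model = (eax >> 4) & 0xF
    family = (eax >> 8) & 0xF
    ext_model = (eax >> 16) & 0xF
    ext_family = (eax >> 20) & 0xFF
    # The extended family is added only when the base family field is 0FH.
    display_family = family + ext_family if family == 0xF else family
    # The extended model is prepended only for base family 06H or 0FH.
    display_model = (ext_model << 4) + model if family in (0x6, 0xF) else model
    return {
        "display_family": display_family,
        "display_model": display_model,
        "stepping": stepping,
    }


def is_family_0fh(eax: int) -> bool:
    """True when the signature reports DisplayFamily 0FH."""
    return decode_cpuid_signature(eax)["display_family"] == 0x0F


if __name__ == "__main__":
    sample = 0x00000F41  # hypothetical signature used only for illustration
    print(decode_cpuid_signature(sample), is_family_0fh(sample))
```

The resulting display model can then be compared against the Model Availability column of the table that follows.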

Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors


Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
0H    0     IA32_P5_MC_ADDR                    0, 1, 2, 3, 4, 6     Shared             See Section 35.14, MSRs in Pentium Processors.
1H    1     IA32_P5_MC_TYPE                    0, 1, 2, 3, 4, 6     Shared             See Section 35.14, MSRs in Pentium Processors.
6H    6     IA32_MONITOR_FILTER_LINE_SIZE      3, 4, 6              Shared             See Section 8.10.5, Monitor/Mwait Address Range Determination.
10H   16    IA32_TIME_STAMP_COUNTER            0, 1, 2, 3, 4, 6     Unique             Time Stamp Counter. See Table 35-2. On earlier processors, only the lower 32 bits are writable. On any write to the lower 32 bits, the upper 32 bits are cleared. For processor family 0FH, models 3 and 4: all 64 bits are writable.
17H   23    IA32_PLATFORM_ID                   0, 1, 2, 3, 4, 6     Shared             Platform ID (R). See Table 35-2. The operating system can use this MSR to determine slot information for the processor and the proper microcode update to load.
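The write behavior described for IA32_TIME_STAMP_COUNTER can be modeled with a small sketch. It assumes that "earlier processors" means family 0FH models 0, 1, and 2, which the table does not state explicitly; other models are left unspecified. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def tsc_value_after_write(written: int, model: int) -> int:
    """Model the counter value that results from writing `written` to the TSC.

    Assumption: models 0-2 are the "earlier processors" (lower 32 bits
    writable, upper 32 bits cleared); models 3 and 4 accept all 64 bits.
    """
    if model in (3, 4):
        return written & MASK64
    if model in (0, 1, 2):
        return written & MASK32
    raise ValueError("write behavior for this model is not described in the table")


if __name__ == "__main__":
    value = 0x0000001234567890
    print(hex(tsc_value_after_write(value, 2)))
    print(hex(tsc_value_after_write(value, 3)))
```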


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
1BH   27    IA32_APIC_BASE                     0, 1, 2, 3, 4, 6     Unique             APIC Location and Status (R/W). See Table 35-2. See Section 10.4.4, Local APIC Status and Location.
2AH   42    MSR_EBC_HARD_POWERON               0, 1, 2, 3, 4, 6     Shared             Processor Hard Power-On Configuration (R/W). Enables and disables processor features; (R) indicates current processor configuration.
            0      Output Tri-state Enabled (R). Indicates whether tri-state output is enabled (1) or disabled (0) as set by the strapping of SMI#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            1      Execute BIST (R). Indicates whether the execution of the BIST is enabled (1) or disabled (0) as set by the strapping of INIT#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            2      In Order Queue Depth (R). Indicates whether the in order queue depth for the system bus is 1 (1) or up to 12 (0) as set by the strapping of A7#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            3      MCERR# Observation Disabled (R). Indicates whether MCERR# observation is enabled (0) or disabled (1) as determined by the strapping of A9#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            4      BINIT# Observation Enabled (R). Indicates whether BINIT# observation is enabled (0) or disabled (1) as determined by the strapping of A10#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            6:5    APIC Cluster ID (R). Contains the logical APIC cluster ID value as set by the strapping of A12# and A11#. The logical cluster ID value is written into the field on the deassertion of RESET#; the field is set to 1 when the address bus signal is asserted.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
            (MSR_EBC_HARD_POWERON, continued)
            7      Bus Park Disable (R). Indicates whether bus park is enabled (0) or disabled (1) as set by the strapping of A15#. The value in this bit is written on the deassertion of RESET#; the bit is set to 1 when the address bus signal is asserted.
            11:8   Reserved.
            13:12  Agent ID (R). Contains the logical agent ID value as set by the strapping of BR[3:0]. The logical ID value is written into the field on the deassertion of RESET#; the field is set to 1 when the address bus signal is asserted.
            63:14  Reserved.
2BH   43    MSR_EBC_SOFT_POWERON               0, 1, 2, 3, 4, 6     Shared             Processor Soft Power-On Configuration (R/W). Enables and disables processor features.
            0      RCNT/SCNT On Request Encoding Enable (R/W). Controls the driving of RCNT/SCNT on the request encoding. Set to enable (1); clear to disable (0, default).
            1      Data Error Checking Disable (R/W). Set to disable system data bus parity checking; clear to enable parity checking.
            2      Response Error Checking Disable (R/W). Set to disable (default); clear to enable.
            3      Address/Request Error Checking Disable (R/W). Set to disable (default); clear to enable.
            4      Initiator MCERR# Disable (R/W). Set to disable MCERR# driving for initiator bus requests (default); clear to enable.
            5      Internal MCERR# Disable (R/W). Set to disable MCERR# driving for initiator internal errors (default); clear to enable.
            6      BINIT# Driver Disable (R/W). Set to disable BINIT# driver (default); clear to enable driver.
            63:7   Reserved.
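The MSR_EBC_HARD_POWERON fields listed above (bits 0 through 13) can be unpacked from a raw 64-bit value as in the following sketch. The function and key names are illustrative assumptions; reading the MSR itself requires privileged access and is not shown.

```python
def decode_ebc_hard_poweron(value: int) -> dict:
    """Unpack the documented fields of MSR_EBC_HARD_POWERON (2AH)."""
    return {
        "output_tristate_enabled": bool((value >> 0) & 1),    # bit 0
        "execute_bist": bool((value >> 1) & 1),               # bit 1
        "in_order_queue_depth_is_1": bool((value >> 2) & 1),  # bit 2: 1 = depth 1, 0 = up to 12
        "mcerr_observation_disabled": bool((value >> 3) & 1), # bit 3
        "binit_observation_disabled": bool((value >> 4) & 1), # bit 4
        "apic_cluster_id": (value >> 5) & 0x3,                # bits 6:5
        "bus_park_disabled": bool((value >> 7) & 1),          # bit 7
        "agent_id": (value >> 12) & 0x3,                      # bits 13:12
    }


if __name__ == "__main__":
    # Hypothetical raw value chosen only to exercise several fields.
    raw = (1 << 0) | (1 << 3) | (2 << 5) | (3 << 12)
    for name, field in decode_ebc_hard_poweron(raw).items():
        print(f"{name}: {field}")
```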


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
2CH   44    MSR_EBC_FREQUENCY_ID               2, 3, 4, 6           Shared             Processor Frequency Configuration. The bit field layout of this MSR varies according to the MODEL value in the CPUID version information. The following bit field layout applies to Pentium 4 and Xeon Processors with MODEL encoding equal to or greater than 2. (R) The field indicates the current processor frequency configuration.
            15:0   Reserved.
            18:16  Scalable Bus Speed (R/W). Indicates the intended scalable bus speed:
                       Encoding   Scalable Bus Speed
                       000B       100 MHz (Model 2)
                       000B       266 MHz (Model 3 or 4)
                       001B       133 MHz
                       010B       200 MHz
                       011B       166 MHz
                       100B       333 MHz (Model 6)
                   133.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 001B.
                   166.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 011B.
                   266.67 MHz should be utilized if performing calculation with System Bus Speed when encoding is 000B and model encoding = 3 or 4.
                   333.33 MHz should be utilized if performing calculation with System Bus Speed when encoding is 100B and model encoding = 6.
                   All other values are reserved.
            23:19  Reserved.
            31:24  Core Clock Frequency to System Bus Frequency Ratio (R). The processor core clock frequency to system bus frequency ratio observed at the de-assertion of the reset pin.
            63:32  Reserved.
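The encoding table above lends itself to a short worked example. The sketch below decodes the Scalable Bus Speed field (bits 18:16) and the ratio field (bits 31:24) for MODEL encodings of 2 or greater, then multiplies them to estimate the core clock. The function names are hypothetical, and treating the core frequency as bus speed times ratio is an inference from the field's name rather than a formula given in the table.

```python
def scalable_bus_speed_mhz(encoding: int, model: int):
    """Return the bus speed in MHz for a bits-18:16 encoding, or None if reserved."""
    if encoding == 0b000:
        if model == 2:
            return 100.0
        if model in (3, 4):
            return 266.67
        return None
    if encoding == 0b100:
        return 333.33 if model == 6 else None
    # Encodings without a model qualifier in the table.
    return {0b001: 133.33, 0b010: 200.0, 0b011: 166.67}.get(encoding)


def decode_ebc_frequency_id(value: int, model: int) -> dict:
    """Decode MSR_EBC_FREQUENCY_ID (2CH) using the layout for MODEL >= 2."""
    if model < 2:
        raise ValueError("models 0 and 1 use a different bit layout")
    encoding = (value >> 16) & 0x7
    ratio = (value >> 24) & 0xFF
    bus = scalable_bus_speed_mhz(encoding, model)
    core = None if bus is None else bus * ratio
    return {"bus_mhz": bus, "ratio": ratio, "core_mhz": core}


if __name__ == "__main__":
    # Hypothetical raw value: encoding 010B (200 MHz), ratio 15.
    raw = (0b010 << 16) | (15 << 24)
    print(decode_ebc_frequency_id(raw, model=3))
```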


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
2CH   44    MSR_EBC_FREQUENCY_ID               0, 1                 Shared             Processor Frequency Configuration (R). The bit field layout of this MSR varies according to the MODEL value of the CPUID version information. This bit field layout applies to Pentium 4 and Xeon Processors with MODEL encoding less than 2. Indicates current processor frequency configuration.
            20:0   Reserved.
            23:21  Scalable Bus Speed (R/W). Indicates the intended scalable bus speed:
                       Encoding   Scalable Bus Speed
                       000B       100 MHz
                   All other values reserved.
            63:24  Reserved.
3AH   58    IA32_FEATURE_CONTROL               3, 4, 6              Unique             Control Features in IA-32 Processor (R/W). See Table 35-2 (If CPUID.01H:ECX.[bit 5]).
79H   121   IA32_BIOS_UPDT_TRIG                0, 1, 2, 3, 4, 6     Shared             BIOS Update Trigger Register (W). See Table 35-2.
8BH   139   IA32_BIOS_SIGN_ID                  0, 1, 2, 3, 4, 6     Unique             BIOS Update Signature ID (R/W). See Table 35-2.
9BH   155   IA32_SMM_MONITOR_CTL               3, 4, 6              Unique             SMM Monitor Configuration (R/W). See Table 35-2.
FEH   254   IA32_MTRRCAP                       0, 1, 2, 3, 4, 6     Unique             MTRR Information. See Section 11.11.1, MTRR Feature Identification.
174H  372   IA32_SYSENTER_CS                   0, 1, 2, 3, 4, 6     Unique             CS register target for CPL 0 code (R/W). See Table 35-2. See Section 5.8.7, Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.
175H  373   IA32_SYSENTER_ESP                  0, 1, 2, 3, 4, 6     Unique             Stack pointer for CPL 0 stack (R/W). See Table 35-2. See Section 5.8.7, Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
176H  374   IA32_SYSENTER_EIP                  0, 1, 2, 3, 4, 6     Unique             CPL 0 code entry point (R/W). See Table 35-2. See Section 5.8.7, Performing Fast Calls to System Procedures with the SYSENTER and SYSEXIT Instructions.
179H  377   IA32_MCG_CAP                       0, 1, 2, 3, 4, 6     Unique             Machine Check Capabilities (R). See Table 35-2. See Section 15.3.1.1, IA32_MCG_CAP MSR.
17AH  378   IA32_MCG_STATUS                    0, 1, 2, 3, 4, 6     Unique             Machine Check Status (R). See Table 35-2. See Section 15.3.1.2, IA32_MCG_STATUS MSR.
17BH  379   IA32_MCG_CTL                                                               Machine Check Feature Enable (R/W). See Table 35-2. See Section 15.3.1.3, IA32_MCG_CTL MSR.
180H  384   MSR_MCG_RAX                        0, 1, 2, 3, 4, 6     Unique             Machine Check EAX/RAX Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
181H  385   MSR_MCG_RBX                        0, 1, 2, 3, 4, 6     Unique             Machine Check EBX/RBX Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
182H  386   MSR_MCG_RCX                        0, 1, 2, 3, 4, 6     Unique             Machine Check ECX/RCX Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
183H  387   MSR_MCG_RDX                        0, 1, 2, 3, 4, 6     Unique             Machine Check EDX/RDX Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
184H  388   MSR_MCG_RSI                        0, 1, 2, 3, 4, 6     Unique             Machine Check ESI/RSI Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
            (MSR_MCG_RSI, continued)
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
185H  389   MSR_MCG_RDI                        0, 1, 2, 3, 4, 6     Unique             Machine Check EDI/RDI Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
186H  390   MSR_MCG_RBP                        0, 1, 2, 3, 4, 6     Unique             Machine Check EBP/RBP Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
187H  391   MSR_MCG_RSP                        0, 1, 2, 3, 4, 6     Unique             Machine Check ESP/RSP Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
188H  392   MSR_MCG_RFLAGS                     0, 1, 2, 3, 4, 6     Unique             Machine Check EFLAGS/RFLAG Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
189H  393   MSR_MCG_RIP                        0, 1, 2, 3, 4, 6     Unique             Machine Check EIP/RIP Save State. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63:0   Contains register state at time of machine check error. When in non-64-bit modes at the time of the error, bits 63-32 do not contain valid data.
18AH  394   MSR_MCG_MISC                       0, 1, 2, 3, 4, 6     Unique             Machine Check Miscellaneous. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex         Dec   Register Name / Fields and Flags          Model Availability   Shared/Unique(1)   Bit Description
                  (MSR_MCG_MISC, continued)
                  0      DS. When set, the bit indicates that a page assist or page fault occurred during DS normal operation. The processor's response is to shut down. The bit is used as an aid for debugging DS handling code. It is the responsibility of the user (BIOS or operating system) to clear this bit for normal operation.
                  63:1   Reserved.
18BH-18FH   395   MSR_MCG_RESERVED1 - MSR_MCG_RESERVED5                                             Reserved.
190H        400   MSR_MCG_R8                                0, 1, 2, 3, 4, 6     Unique             Machine Check R8. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
                  63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
191H        401   MSR_MCG_R9                                0, 1, 2, 3, 4, 6     Unique             Machine Check R9D/R9. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
                  63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
192H        402   MSR_MCG_R10                               0, 1, 2, 3, 4, 6     Unique             Machine Check R10. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
                  63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
193H        403   MSR_MCG_R11                               0, 1, 2, 3, 4, 6     Unique             Machine Check R11. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
                  63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
194H  404   MSR_MCG_R12                        0, 1, 2, 3, 4, 6     Unique             Machine Check R12. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
195H  405   MSR_MCG_R13                        0, 1, 2, 3, 4, 6     Unique             Machine Check R13. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
196H  406   MSR_MCG_R14                        0, 1, 2, 3, 4, 6     Unique             Machine Check R14. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
197H  407   MSR_MCG_R15                        0, 1, 2, 3, 4, 6     Unique             Machine Check R15. See Section 15.3.2.6, IA32_MCG Extended Machine Check State MSRs.
            63-0   Registers R8-15 (and the associated state-save MSRs) exist only in Intel 64 processors. These registers contain valid information only when the processor is operating in 64-bit mode at the time of the error.
198H  408   IA32_PERF_STATUS                   3, 4, 6              Unique             See Table 35-2. See Section 14.1, Enhanced Intel Speedstep Technology.
199H  409   IA32_PERF_CTL                      3, 4, 6              Unique             See Table 35-2. See Section 14.1, Enhanced Intel Speedstep Technology.
19AH  410   IA32_CLOCK_MODULATION              0, 1, 2, 3, 4, 6     Unique             Thermal Monitor Control (R/W). See Table 35-2. See Section 14.5.3, Software Controlled Clock Modulation.
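The validity notes on the machine check save-state MSRs above can be expressed as a small helper. This is a sketch under stated assumptions: the function name is hypothetical, the caller is assumed to know whether the processor was in 64-bit mode at the time of the error, and MSR_MCG_MISC (18AH) and the reserved range are deliberately excluded.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def valid_mcg_save_state(msr_address: int, raw: int, in_64bit_mode: bool):
    """Return only the portion of a save-state value that the table calls valid.

    180H-189H (RAX..RIP): bits 63-32 are not valid outside 64-bit mode.
    190H-197H (R8..R15): valid only when the error occurred in 64-bit mode.
    Returns None when no part of the value is valid.
    """
    if 0x180 <= msr_address <= 0x189:
        return raw & (MASK64 if in_64bit_mode else MASK32)
    if 0x190 <= msr_address <= 0x197:
        return raw & MASK64 if in_64bit_mode else None
    raise ValueError("not a machine check register save-state MSR")


if __name__ == "__main__":
    sample = 0xDEADBEEF00401000  # hypothetical raw MSR contents
    print(hex(valid_mcg_save_state(0x189, sample, in_64bit_mode=False)))
    print(valid_mcg_save_state(0x190, sample, in_64bit_mode=False))
```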


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
19BH  411   IA32_THERM_INTERRUPT               0, 1, 2, 3, 4, 6     Unique             Thermal Interrupt Control (R/W). See Section 14.5.2, Thermal Monitor, and see Table 35-2.
19CH  412   IA32_THERM_STATUS                  0, 1, 2, 3, 4, 6     Shared             Thermal Monitor Status (R/W). See Section 14.5.2, Thermal Monitor, and see Table 35-2.
19DH  413   MSR_THERM2_CTL                                                             Thermal Monitor 2 Control.
                                               3                    Shared             For Family F, Model 3 processors: When read, specifies the value of the target TM2 transition last written. When set, it sets the next target value for TM2 transition.
                                               4, 6                 Shared             For Family F, Model 4 and Model 6 processors: When read, specifies the value of the target TM2 transition last written. Writes may cause #GP exceptions.
1A0H  416   IA32_MISC_ENABLE                   0, 1, 2, 3, 4, 6     Shared             Enable Miscellaneous Processor Features (R/W).
            0      Fast-Strings Enable. See Table 35-2.
            1      Reserved.
            2      x87 FPU Fopcode Compatibility Mode Enable.
            3      Thermal Monitor 1 Enable. See Section 14.5.2, Thermal Monitor, and see Table 35-2.
            4      Split-Lock Disable. When set, the bit causes an #AC exception to be issued instead of a split-lock cycle. Operating systems that set this bit must align system structures to avoid split-lock scenarios. When the bit is clear (default), normal split-locks are issued to the bus. This debug feature is specific to the Pentium 4 processor.
            5      Reserved.
            6      Third-Level Cache Disable (R/W). When set, the third-level cache is disabled; when clear (default) the third-level cache is enabled. This flag is reserved for processors that do not have a third-level cache.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
            (IA32_MISC_ENABLE, continued)
            (6)    Note that the bit controls only the third-level cache; and only if overall caching is enabled through the CD flag of control register CR0, the page-level cache controls, and/or the MTRRs. See Section 11.5.4, Disabling and Enabling the L3 Cache.
            7      Performance Monitoring Available (R). See Table 35-2.
            8      Suppress Lock Enable. When set, assertion of LOCK on the bus is suppressed during a Split Lock access. When clear (default), LOCK is not suppressed.
            9      Prefetch Queue Disable. When set, disables the prefetch queue. When clear (default), enables the prefetch queue.
            10     FERR# Interrupt Reporting Enable (R/W). When set, interrupt reporting through the FERR# pin is enabled; when clear, this interrupt reporting function is disabled. When this flag is set and the processor is in the stop-clock state (STPCLK# is asserted), asserting the FERR# pin signals to the processor that an interrupt (such as INIT#, BINIT#, INTR, NMI, SMI#, or RESET#) is pending and that the processor should return to normal operation to handle the interrupt. This flag does not affect the normal operation of the FERR# pin (to indicate an unmasked floating-point error) when the STPCLK# pin is not asserted.
            11     Branch Trace Storage Unavailable (BTS_UNAVILABLE) (R). See Table 35-2. When set, the processor does not support branch trace storage (BTS); when clear, BTS is supported.
            12     PEBS_UNAVILABLE: Precise Event Based Sampling Unavailable (R). See Table 35-2. When set, the processor does not support precise event-based sampling (PEBS); when clear, PEBS is supported.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
            (IA32_MISC_ENABLE, continued)
            13     (Model 3) TM2 Enable (R/W). When this bit is set (1) and the thermal sensor indicates that the die temperature is at the predetermined threshold, the Thermal Monitor 2 mechanism is engaged. TM2 will reduce the bus to core ratio and voltage according to the value last written to MSR_THERM2_CTL bits 15:0. When this bit is clear (0, default), the processor does not change the VID signals or the bus to core ratio when the processor enters a thermal managed state. If the TM2 feature flag (ECX[8]) is not set to 1 after executing CPUID with EAX = 1, then this feature is not supported and BIOS must not alter the contents of this bit location. The processor is operating out of spec if both this bit and the TM1 bit are set to disabled states.
            17:14  Reserved.
            18     (Models 3, 4, 6) ENABLE MONITOR FSM (R/W). See Table 35-2.
            19     Adjacent Cache Line Prefetch Disable (R/W). When set to 1, the processor fetches the cache line of the 128-byte sector containing currently required data. When set to 0, the processor fetches both cache lines in the sector. Single processor platforms should not set this bit. Server platforms should set or clear this bit based on platform performance observed in validation and testing. BIOS may contain a setup option that controls the setting of this bit.
            21:20  Reserved.
            22     (Models 3, 4, 6) Limit CPUID MAXVAL (R/W). See Table 35-2. Setting this can cause unexpected behavior to software that depends on the availability of CPUID leaves greater than 3.
            23     (Shared) xTPR Message Disable (R/W). See Table 35-2.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
            (IA32_MISC_ENABLE, continued)
            24     L1 Data Cache Context Mode (R/W). When set, the L1 data cache is placed in shared mode; when clear (default), the cache is placed in adaptive mode. This bit is only enabled for IA-32 processors that support Intel Hyper-Threading Technology. See Section 11.5.6, L1 Data Cache Context Mode. When L1 is running in adaptive mode and CR3s are identical, data in L1 is shared across logical processors. Otherwise, L1 is not shared and cache use is competitive. If the Context ID feature flag (ECX[10]) is set to 0 after executing CPUID with EAX = 1, the ability to switch modes is not supported. BIOS must not alter the contents of IA32_MISC_ENABLE[24].
            33:25  Reserved.
            34     (Unique) XD Bit Disable (R/W). See Table 35-2.
            63:35  Reserved.
1A1H  417   MSR_PLATFORM_BRV                   3, 4, 6              Shared             Platform Feature Requirements (R).
            17:0   Reserved.
            18     PLATFORM Requirements. When set to 1, indicates the processor has specific platform requirements. The details of the platform requirements are listed in the respective data sheets of the processor.
            63:19  Reserved.
1D7H  471   MSR_LER_FROM_LIP                   0, 1, 2, 3, 4, 6     Unique             Last Exception Record From Linear IP (R). Contains a pointer to the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.9.3, Last Exception Records.
            31:0   From Linear IP. Linear address of the last branch instruction.
            63:32  Reserved.
1D7H  471                                                           Unique
            63:0   From Linear IP. Linear address of the last branch instruction (If IA-32e mode is active).
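The IA32_MISC_ENABLE flags listed above can be turned into a readable list with a lookup table keyed by bit position. The sketch below is illustrative only: the flag names are shorthand invented for this example, reserved bits are ignored, and model-specific availability of individual bits is not checked.

```python
# Bit position -> shorthand name, following the IA32_MISC_ENABLE (1A0H) rows above.
MISC_ENABLE_FLAGS = {
    0: "fast_strings_enable",
    2: "x87_fopcode_compat_enable",
    3: "thermal_monitor_1_enable",
    4: "split_lock_disable",
    6: "third_level_cache_disable",
    7: "performance_monitoring_available",
    8: "suppress_lock_enable",
    9: "prefetch_queue_disable",
    10: "ferr_interrupt_reporting_enable",
    11: "bts_unavailable",
    12: "pebs_unavailable",
    13: "tm2_enable",
    18: "enable_monitor_fsm",
    19: "adjacent_cache_line_prefetch_disable",
    22: "limit_cpuid_maxval",
    23: "xtpr_message_disable",
    24: "l1_data_cache_context_mode",
    34: "xd_bit_disable",
}


def decode_misc_enable(value: int) -> list:
    """Return the names of all documented flags set in `value`, lowest bit first."""
    return [
        name
        for bit, name in sorted(MISC_ENABLE_FLAGS.items())
        if (value >> bit) & 1
    ]


if __name__ == "__main__":
    raw = (1 << 0) | (1 << 3) | (1 << 34)  # hypothetical register contents
    print(decode_misc_enable(raw))
```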


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
1D8H  472   MSR_LER_TO_LIP                     0, 1, 2, 3, 4, 6     Unique             Last Exception Record To Linear IP (R). This area contains a pointer to the target of the last branch instruction that the processor executed prior to the last exception that was generated or the last interrupt that was handled. See Section 17.9.3, Last Exception Records.
            31:0   From Linear IP. Linear address of the target of the last branch instruction.
            63:32  Reserved.
1D8H  472                                                           Unique
            63:0   From Linear IP. Linear address of the target of the last branch instruction (If IA-32e mode is active).
1D9H  473   MSR_DEBUGCTLA                      0, 1, 2, 3, 4, 6     Unique             Debug Control (R/W). Controls how several debug features are used. Bit definitions are discussed in the referenced section. See Section 17.9.1, MSR_DEBUGCTLA MSR.
1DAH  474   MSR_LASTBRANCH_TOS                 0, 1, 2, 3, 4, 6     Unique             Last Branch Record Stack TOS (R). Contains an index (0-3 or 0-15) that points to the top of the last branch record stack (that is, that points to the index of the MSR containing the most recent branch record). See Section 17.9.2, LBR Stack for Processors Based on Intel NetBurst Microarchitecture; and addresses 1DBH-1DEH and 680H-68FH.
1DBH  475   MSR_LASTBRANCH_0                   0, 1, 2              Unique             Last Branch Record 0 (R/W). One of four last branch record registers on the last branch record stack. It contains pointers to the source and destination instruction for one of the last four branches, exceptions, or interrupts that the processor took. MSR_LASTBRANCH_0 through MSR_LASTBRANCH_3 at 1DBH-1DEH are available only on family 0FH, models 0H-02H. They have been replaced by the MSRs at 680H-68FH and 6C0H-6CFH. See Section 17.9, Last Branch, Interrupt, and Exception Recording (Processors based on Intel NetBurst Microarchitecture).
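The relationship between MSR_LASTBRANCH_TOS and the branch record MSRs described above can be sketched as an address calculation. The function name is hypothetical. For models after 2, the sketch assumes the 680H-68FH and 6C0H-6CFH ranges are indexed in parallel by the same TOS value; this excerpt lists the ranges but does not spell out that pairing.

```python
def most_recent_lbr_addresses(model: int, tos: int) -> tuple:
    """Return the MSR address(es) holding the most recent branch record.

    Family 0FH models 0-2: one MSR in 1DBH-1DEH, TOS index 0-3.
    Later models: one MSR in 680H-68FH and one in 6C0H-6CFH, TOS index 0-15
    (parallel indexing of the two ranges is an assumption of this sketch).
    """
    if model in (0, 1, 2):
        if not 0 <= tos <= 3:
            raise ValueError("TOS must be 0-3 on models 0-2")
        return (0x1DB + tos,)
    if not 0 <= tos <= 15:
        raise ValueError("TOS must be 0-15")
    return (0x680 + tos, 0x6C0 + tos)


if __name__ == "__main__":
    print([hex(a) for a in most_recent_lbr_addresses(model=2, tos=3)])
    print([hex(a) for a in most_recent_lbr_addresses(model=4, tos=5)])
```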


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
1DDH  477   MSR_LASTBRANCH_2                   0, 1, 2              Unique             Last Branch Record 2. See description of the MSR_LASTBRANCH_0 MSR at 1DBH.
1DEH  478   MSR_LASTBRANCH_3                   0, 1, 2              Unique             Last Branch Record 3. See description of the MSR_LASTBRANCH_0 MSR at 1DBH.
200H  512   IA32_MTRR_PHYSBASE0                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
201H  513   IA32_MTRR_PHYSMASK0                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
202H  514   IA32_MTRR_PHYSBASE1                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
203H  515   IA32_MTRR_PHYSMASK1                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
204H  516   IA32_MTRR_PHYSBASE2                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
205H  517   IA32_MTRR_PHYSMASK2                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
206H  518   IA32_MTRR_PHYSBASE3                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
207H  519   IA32_MTRR_PHYSMASK3                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
208H  520   IA32_MTRR_PHYSBASE4                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
209H  521   IA32_MTRR_PHYSMASK4                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
20AH  522   IA32_MTRR_PHYSBASE5                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
20BH  523   IA32_MTRR_PHYSMASK5                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
20CH  524   IA32_MTRR_PHYSBASE6                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
20DH  525   IA32_MTRR_PHYSMASK6                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
20EH  526   IA32_MTRR_PHYSBASE7                0, 1, 2, 3, 4, 6     Shared             Variable Range Base MTRR. See Section 11.11.2.3, Variable Range MTRRs.
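The variable-range MTRR rows follow a regular pattern: pair n uses IA32_MTRR_PHYSBASEn at 200H + 2n and IA32_MTRR_PHYSMASKn at the next address, for n from 0 through 7 (the mask for pair 7, at 20FH, appears on the table's continuation). A minimal sketch of that address arithmetic, with a hypothetical function name:

```python
def variable_mtrr_addresses(n: int) -> tuple:
    """Return (PHYSBASEn address, PHYSMASKn address) for variable-range pair n."""
    if not 0 <= n <= 7:
        raise ValueError("this table lists variable-range pairs 0 through 7")
    base = 0x200 + 2 * n
    return base, base + 1


if __name__ == "__main__":
    for n in range(8):
        base, mask = variable_mtrr_addresses(n)
        print(f"IA32_MTRR_PHYSBASE{n}: {base:X}H ({base})  "
              f"IA32_MTRR_PHYSMASK{n}: {mask:X}H ({mask})")
```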


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
20FH  527   IA32_MTRR_PHYSMASK7                0, 1, 2, 3, 4, 6     Shared             Variable Range Mask MTRR. See Section 11.11.2.3, Variable Range MTRRs.
250H  592   IA32_MTRR_FIX64K_00000             0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
258H  600   IA32_MTRR_FIX16K_80000             0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
259H  601   IA32_MTRR_FIX16K_A0000             0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
268H  616   IA32_MTRR_FIX4K_C0000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
269H  617   IA32_MTRR_FIX4K_C8000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26AH  618   IA32_MTRR_FIX4K_D0000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26BH  619   IA32_MTRR_FIX4K_D8000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26CH  620   IA32_MTRR_FIX4K_E0000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26DH  621   IA32_MTRR_FIX4K_E8000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26EH  622   IA32_MTRR_FIX4K_F0000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
26FH  623   IA32_MTRR_FIX4K_F8000              0, 1, 2, 3, 4, 6     Shared             Fixed Range MTRR. See Section 11.11.2.2, Fixed Range MTRRs.
277H  631   IA32_PAT                           0, 1, 2, 3, 4, 6     Unique             Page Attribute Table. See Section 11.11.2.2, Fixed Range MTRRs.
2FFH  767   IA32_MTRR_DEF_TYPE                 0, 1, 2, 3, 4, 6     Shared             Default Memory Types (R/W). See Table 35-2. See Section 11.11.2.1, IA32_MTRR_DEF_TYPE MSR.
300H  768   MSR_BPU_COUNTER0                   0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
301H  769   MSR_BPU_COUNTER1                   0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
302H  770   MSR_BPU_COUNTER2                   0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex   Dec   Register Name / Fields and Flags   Model Availability   Shared/Unique(1)   Bit Description
303H  771   MSR_BPU_COUNTER3                   0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
304H  772   MSR_MS_COUNTER0                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
305H  773   MSR_MS_COUNTER1                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
306H  774   MSR_MS_COUNTER2                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
307H  775   MSR_MS_COUNTER3                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
308H  776   MSR_FLAME_COUNTER0                 0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
309H  777   MSR_FLAME_COUNTER1                 0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30AH  778   MSR_FLAME_COUNTER2                 0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30BH  779   MSR_FLAME_COUNTER3                 0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30CH  780   MSR_IQ_COUNTER0                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30DH  781   MSR_IQ_COUNTER1                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30EH  782   MSR_IQ_COUNTER2                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
30FH  783   MSR_IQ_COUNTER3                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
310H  784   MSR_IQ_COUNTER4                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
311H  785   MSR_IQ_COUNTER5                    0, 1, 2, 3, 4, 6     Shared             See Section 18.11.2, Performance Counters.
360H  864   MSR_BPU_CCCR0                      0, 1, 2, 3, 4, 6     Shared             See Section 18.11.3, CCCR MSRs.
361H  865   MSR_BPU_CCCR1                      0, 1, 2, 3, 4, 6     Shared             See Section 18.11.3, CCCR MSRs.
362H  866   MSR_BPU_CCCR2                      0, 1, 2, 3, 4, 6     Shared             See Section 18.11.3, CCCR MSRs.
363H  867   MSR_BPU_CCCR3                      0, 1, 2, 3, 4, 6     Shared             See Section 18.11.3, CCCR MSRs.
364H  868   MSR_MS_CCCR0                       0, 1, 2, 3, 4, 6     Shared             See Section 18.11.3, CCCR MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
365H   869   MSR_MS_CCCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
366H   870   MSR_MS_CCCR2               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
367H   871   MSR_MS_CCCR3               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
368H   872   MSR_FLAME_CCCR0            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
369H   873   MSR_FLAME_CCCR1            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36AH   874   MSR_FLAME_CCCR2            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36BH   875   MSR_FLAME_CCCR3            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36CH   876   MSR_IQ_CCCR0               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36DH   877   MSR_IQ_CCCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36EH   878   MSR_IQ_CCCR2               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
36FH   879   MSR_IQ_CCCR3               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
370H   880   MSR_IQ_CCCR4               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
371H   881   MSR_IQ_CCCR5               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.3, CCCR MSRs.
3A0H   928   MSR_BSU_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A1H   929   MSR_BSU_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A2H   930   MSR_FSB_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A3H   931   MSR_FSB_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A4H   932   MSR_FIRM_ESCR0             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A5H   933   MSR_FIRM_ESCR1             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A6H   934   MSR_FLAME_ESCR0            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
3A7H   935   MSR_FLAME_ESCR1            0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A8H   936   MSR_DAC_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3A9H   937   MSR_DAC_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3AAH   938   MSR_MOB_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3ABH   939   MSR_MOB_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3ACH   940   MSR_PMH_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3ADH   941   MSR_PMH_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3AEH   942   MSR_SAAT_ESCR0             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3AFH   943   MSR_SAAT_ESCR1             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B0H   944   MSR_U2L_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B1H   945   MSR_U2L_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B2H   946   MSR_BPU_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B3H   947   MSR_BPU_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B4H   948   MSR_IS_ESCR0               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B5H   949   MSR_IS_ESCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B6H   950   MSR_ITLB_ESCR0             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B7H   951   MSR_ITLB_ESCR1             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B8H   952   MSR_CRU_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3B9H   953   MSR_CRU_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
3BAH   954   MSR_IQ_ESCR0               0, 1, 2             Shared             See Section 18.11.1, ESCR MSRs.
             This MSR is not available on later processors. It is only available on processor family 0FH, models 01H-02H.
3BBH   955   MSR_IQ_ESCR1               0, 1, 2             Shared             See Section 18.11.1, ESCR MSRs.
             This MSR is not available on later processors. It is only available on processor family 0FH, models 01H-02H.
3BCH   956   MSR_RAT_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3BDH   957   MSR_RAT_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3BEH   958   MSR_SSU_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C0H   960   MSR_MS_ESCR0               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C1H   961   MSR_MS_ESCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C2H   962   MSR_TBPU_ESCR0             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C3H   963   MSR_TBPU_ESCR1             0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C4H   964   MSR_TC_ESCR0               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C5H   965   MSR_TC_ESCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C8H   968   MSR_IX_ESCR0               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3C9H   969   MSR_IX_ESCR1               0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3CAH   970   MSR_ALF_ESCR0              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3CBH   971   MSR_ALF_ESCR1              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3CCH   972   MSR_CRU_ESCR2              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3CDH   973   MSR_CRU_ESCR3              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3E0H   992   MSR_CRU_ESCR4              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
3E1H   993   MSR_CRU_ESCR5              0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3F0H   1008  MSR_TC_PRECISE_EVENT       0, 1, 2, 3, 4, 6    Shared             See Section 18.11.1, ESCR MSRs.
3F1H   1009  MSR_PEBS_ENABLE            0, 1, 2, 3, 4, 6    Shared             Precise Event-Based Sampling (PEBS) (R/W)
             Controls the enabling of precise event sampling and replay tagging.
             12:0    See Table 19-24.
             23:13   Reserved.
             24      UOP Tag
                     Enables replay tagging when set.
             25      ENABLE_PEBS_MY_THR (R/W)
                     Enables PEBS for the target logical processor when set; disables PEBS when clear (default). See Section 18.12.3, IA32_PEBS_ENABLE MSR, for an explanation of the target logical processor. This bit is called ENABLE_PEBS in IA-32 processors that do not support Intel Hyper-Threading Technology.
             26      ENABLE_PEBS_OTH_THR (R/W)
                     Enables PEBS for the target logical processor when set; disables PEBS when clear (default). See Section 18.12.3, IA32_PEBS_ENABLE MSR, for an explanation of the target logical processor. This bit is reserved for IA-32 processors that do not support Intel Hyper-Threading Technology.
             63:27   Reserved.
3F2H   1010  MSR_PEBS_MATRIX_VERT       0, 1, 2, 3, 4, 6    Shared             See Table 19-24.
400H   1024  IA32_MC0_CTL               0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.1, IA32_MCi_CTL MSRs.
401H   1025  IA32_MC0_STATUS            0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.2, IA32_MCi_STATUS MSRs.
402H   1026  IA32_MC0_ADDR              0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
             The IA32_MC0_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
403H   1027  IA32_MC0_MISC              0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.4, IA32_MCi_MISC MSRs.
             The IA32_MC0_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC0_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
404H   1028  IA32_MC1_CTL               0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.1, IA32_MCi_CTL MSRs.
405H   1029  IA32_MC1_STATUS            0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.2, IA32_MCi_STATUS MSRs.
406H   1030  IA32_MC1_ADDR              0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
             The IA32_MC1_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
407H   1031  IA32_MC1_MISC                                  Shared             See Section 15.3.2.4, IA32_MCi_MISC MSRs.
             The IA32_MC1_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC1_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
408H   1032  IA32_MC2_CTL               0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.1, IA32_MCi_CTL MSRs.
409H   1033  IA32_MC2_STATUS            0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.2, IA32_MCi_STATUS MSRs.
40AH   1034  IA32_MC2_ADDR                                                     See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
             The IA32_MC2_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
40BH   1035  IA32_MC2_MISC                                                     See Section 15.3.2.4, IA32_MCi_MISC MSRs.
             The IA32_MC2_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC2_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
40CH   1036  IA32_MC3_CTL               0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.1, IA32_MCi_CTL MSRs.
40DH   1037  IA32_MC3_STATUS            0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.2, IA32_MCi_STATUS MSRs.
40EH   1038  IA32_MC3_ADDR              0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
             The IA32_MC3_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
40FH   1039  IA32_MC3_MISC              0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.4, IA32_MCi_MISC MSRs.
             The IA32_MC3_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC3_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
410H   1040  IA32_MC4_CTL               0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.1, IA32_MCi_CTL MSRs.
411H   1041  IA32_MC4_STATUS            0, 1, 2, 3, 4, 6    Shared             See Section 15.3.2.2, IA32_MCi_STATUS MSRs.
412H   1042  IA32_MC4_ADDR                                                     See Section 15.3.2.3, IA32_MCi_ADDR MSRs.
             The IA32_MC4_ADDR register is either not implemented or contains no address if the ADDRV flag in the IA32_MC4_STATUS register is clear. When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
413H   1043  IA32_MC4_MISC                                                     See Section 15.3.2.4, IA32_MCi_MISC MSRs.
             The IA32_MC4_MISC MSR is either not implemented or does not contain additional information if the MISCV flag in the IA32_MC4_STATUS register is clear.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
             (IA32_MC4_MISC, continued) When not implemented in the processor, all reads and writes to this MSR will cause a general-protection exception.
480H   1152  IA32_VMX_BASIC             3, 4, 6             Unique             Reporting Register of Basic VMX Capabilities (R/O)
             See Table 35-2. See Appendix A.1, Basic VMX Information.
481H   1153  IA32_VMX_PINBASED_CTLS     3, 4, 6             Unique             Capability Reporting Register of Pin-based VM-execution Controls (R/O)
             See Table 35-2. See Appendix A.3, VM-Execution Controls.
482H   1154  IA32_VMX_PROCBASED_CTLS    3, 4, 6             Unique             Capability Reporting Register of Primary Processor-based VM-execution Controls (R/O)
             See Appendix A.3, VM-Execution Controls, and see Table 35-2.
483H   1155  IA32_VMX_EXIT_CTLS         3, 4, 6             Unique             Capability Reporting Register of VM-exit Controls (R/O)
             See Appendix A.4, VM-Exit Controls, and see Table 35-2.
484H   1156  IA32_VMX_ENTRY_CTLS        3, 4, 6             Unique             Capability Reporting Register of VM-entry Controls (R/O)
             See Appendix A.5, VM-Entry Controls, and see Table 35-2.
485H   1157  IA32_VMX_MISC              3, 4, 6             Unique             Reporting Register of Miscellaneous VMX Capabilities (R/O)
             See Appendix A.6, Miscellaneous Data, and see Table 35-2.
486H   1158  IA32_VMX_CR0_FIXED0        3, 4, 6             Unique             Capability Reporting Register of CR0 Bits Fixed to 0 (R/O)
             See Appendix A.7, VMX-Fixed Bits in CR0, and see Table 35-2.
487H   1159  IA32_VMX_CR0_FIXED1        3, 4, 6             Unique             Capability Reporting Register of CR0 Bits Fixed to 1 (R/O)
             See Appendix A.7, VMX-Fixed Bits in CR0, and see Table 35-2.
488H   1160  IA32_VMX_CR4_FIXED0        3, 4, 6             Unique             Capability Reporting Register of CR4 Bits Fixed to 0 (R/O)
             See Appendix A.8, VMX-Fixed Bits in CR4, and see Table 35-2.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
489H   1161  IA32_VMX_CR4_FIXED1        3, 4, 6             Unique             Capability Reporting Register of CR4 Bits Fixed to 1 (R/O)
             See Appendix A.8, VMX-Fixed Bits in CR4, and see Table 35-2.
48AH   1162  IA32_VMX_VMCS_ENUM         3, 4, 6             Unique             Capability Reporting Register of VMCS Field Enumeration (R/O)
             See Appendix A.9, VMCS Enumeration, and see Table 35-2.
48BH   1163  IA32_VMX_PROCBASED_CTLS2   3, 4, 6             Unique             Capability Reporting Register of Secondary Processor-based VM-execution Controls (R/O)
             See Appendix A.3, VM-Execution Controls, and see Table 35-2.
600H   1536  IA32_DS_AREA               0, 1, 2, 3, 4, 6    Unique             DS Save Area (R/W)
             See Table 35-2. See Section 18.11.4, Debug Store (DS) Mechanism.
680H   1664  MSR_LASTBRANCH_0_FROM_IP   3, 4, 6             Unique             Last Branch Record 0 (R/W)
             One of 16 pairs of last branch record registers on the last branch record stack (680H-68FH). This part of the stack contains pointers to the source instruction for one of the last 16 branches, exceptions, or interrupts taken by the processor.
             The MSRs at 680H-68FH and 6C0H-6CFH are not available in processor releases before family 0FH, model 03H. These MSRs replace MSRs previously located at 1DBH-1DEH, which performed the same function for early releases.
             See Section 17.9, Last Branch, Interrupt, and Exception Recording (Processors based on Intel NetBurst Microarchitecture).
681H   1665  MSR_LASTBRANCH_1_FROM_IP   3, 4, 6             Unique             Last Branch Record 1. See description of MSR_LASTBRANCH_0 at 680H.
682H   1666  MSR_LASTBRANCH_2_FROM_IP   3, 4, 6             Unique             Last Branch Record 2. See description of MSR_LASTBRANCH_0 at 680H.
683H   1667  MSR_LASTBRANCH_3_FROM_IP   3, 4, 6             Unique             Last Branch Record 3. See description of MSR_LASTBRANCH_0 at 680H.
684H   1668  MSR_LASTBRANCH_4_FROM_IP   3, 4, 6             Unique             Last Branch Record 4. See description of MSR_LASTBRANCH_0 at 680H.
685H   1669  MSR_LASTBRANCH_5_FROM_IP   3, 4, 6             Unique             Last Branch Record 5. See description of MSR_LASTBRANCH_0 at 680H.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex    Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
686H   1670  MSR_LASTBRANCH_6_FROM_IP   3, 4, 6             Unique             Last Branch Record 6. See description of MSR_LASTBRANCH_0 at 680H.
687H   1671  MSR_LASTBRANCH_7_FROM_IP   3, 4, 6             Unique             Last Branch Record 7. See description of MSR_LASTBRANCH_0 at 680H.
688H   1672  MSR_LASTBRANCH_8_FROM_IP   3, 4, 6             Unique             Last Branch Record 8. See description of MSR_LASTBRANCH_0 at 680H.
689H   1673  MSR_LASTBRANCH_9_FROM_IP   3, 4, 6             Unique             Last Branch Record 9. See description of MSR_LASTBRANCH_0 at 680H.
68AH   1674  MSR_LASTBRANCH_10_FROM_IP  3, 4, 6             Unique             Last Branch Record 10. See description of MSR_LASTBRANCH_0 at 680H.
68BH   1675  MSR_LASTBRANCH_11_FROM_IP  3, 4, 6             Unique             Last Branch Record 11. See description of MSR_LASTBRANCH_0 at 680H.
68CH   1676  MSR_LASTBRANCH_12_FROM_IP  3, 4, 6             Unique             Last Branch Record 12. See description of MSR_LASTBRANCH_0 at 680H.
68DH   1677  MSR_LASTBRANCH_13_FROM_IP  3, 4, 6             Unique             Last Branch Record 13. See description of MSR_LASTBRANCH_0 at 680H.
68EH   1678  MSR_LASTBRANCH_14_FROM_IP  3, 4, 6             Unique             Last Branch Record 14. See description of MSR_LASTBRANCH_0 at 680H.
68FH   1679  MSR_LASTBRANCH_15_FROM_IP  3, 4, 6             Unique             Last Branch Record 15. See description of MSR_LASTBRANCH_0 at 680H.
6C0H   1728  MSR_LASTBRANCH_0_TO_IP     3, 4, 6             Unique             Last Branch Record 0 (R/W)
             One of 16 pairs of last branch record registers on the last branch record stack (6C0H-6CFH). This part of the stack contains pointers to the destination instruction for one of the last 16 branches, exceptions, or interrupts that the processor took.
             See Section 17.9, Last Branch, Interrupt, and Exception Recording (Processors based on Intel NetBurst Microarchitecture).
6C1H   1729  MSR_LASTBRANCH_1_TO_IP     3, 4, 6             Unique             Last Branch Record 1. See description of MSR_LASTBRANCH_0 at 6C0H.
6C2H   1730  MSR_LASTBRANCH_2_TO_IP     3, 4, 6             Unique             Last Branch Record 2. See description of MSR_LASTBRANCH_0 at 6C0H.
6C3H   1731  MSR_LASTBRANCH_3_TO_IP     3, 4, 6             Unique             Last Branch Record 3. See description of MSR_LASTBRANCH_0 at 6C0H.
6C4H   1732  MSR_LASTBRANCH_4_TO_IP     3, 4, 6             Unique             Last Branch Record 4. See description of MSR_LASTBRANCH_0 at 6C0H.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)
Hex         Dec   Register Name              Model Availability  Shared/Unique [1]  Bit Description
6C5H        1733  MSR_LASTBRANCH_5_TO_IP     3, 4, 6             Unique             Last Branch Record 5. See description of MSR_LASTBRANCH_0 at 6C0H.
6C6H        1734  MSR_LASTBRANCH_6_TO_IP     3, 4, 6             Unique             Last Branch Record 6. See description of MSR_LASTBRANCH_0 at 6C0H.
6C7H        1735  MSR_LASTBRANCH_7_TO_IP     3, 4, 6             Unique             Last Branch Record 7. See description of MSR_LASTBRANCH_0 at 6C0H.
6C8H        1736  MSR_LASTBRANCH_8_TO_IP     3, 4, 6             Unique             Last Branch Record 8. See description of MSR_LASTBRANCH_0 at 6C0H.
6C9H        1737  MSR_LASTBRANCH_9_TO_IP     3, 4, 6             Unique             Last Branch Record 9. See description of MSR_LASTBRANCH_0 at 6C0H.
6CAH        1738  MSR_LASTBRANCH_10_TO_IP    3, 4, 6             Unique             Last Branch Record 10. See description of MSR_LASTBRANCH_0 at 6C0H.
6CBH        1739  MSR_LASTBRANCH_11_TO_IP    3, 4, 6             Unique             Last Branch Record 11. See description of MSR_LASTBRANCH_0 at 6C0H.
6CCH        1740  MSR_LASTBRANCH_12_TO_IP    3, 4, 6             Unique             Last Branch Record 12. See description of MSR_LASTBRANCH_0 at 6C0H.
6CDH        1741  MSR_LASTBRANCH_13_TO_IP    3, 4, 6             Unique             Last Branch Record 13. See description of MSR_LASTBRANCH_0 at 6C0H.
6CEH        1742  MSR_LASTBRANCH_14_TO_IP    3, 4, 6             Unique             Last Branch Record 14. See description of MSR_LASTBRANCH_0 at 6C0H.
6CFH        1743  MSR_LASTBRANCH_15_TO_IP    3, 4, 6             Unique             Last Branch Record 15. See description of MSR_LASTBRANCH_0 at 6C0H.
C000_0080H        IA32_EFER                  3, 4, 6             Unique             Extended Feature Enables. See Table 35-2.
C000_0081H        IA32_STAR                  3, 4, 6             Unique             System Call Target Address (R/W). See Table 35-2.
C000_0082H        IA32_LSTAR                 3, 4, 6             Unique             IA-32e Mode System Call Target Address (R/W). See Table 35-2.
C000_0084H        IA32_FMASK                 3, 4, 6             Unique             System Call Flag Mask (R/W). See Table 35-2.
C000_0100H        IA32_FS_BASE               3, 4, 6             Unique             Map of BASE Address of FS (R/W). See Table 35-2.
C000_0101H        IA32_GS_BASE               3, 4, 6             Unique             Map of BASE Address of GS (R/W). See Table 35-2.
C000_0102H        IA32_KERNEL_GSBASE         3, 4, 6             Unique             Swap Target of BASE Address of GS (R/W). See Table 35-2.


Table 35-15 MSRs in the Pentium 4 and Intel Xeon Processors (Contd.)

NOTES:
1. For HT-enabled processors, there may be more than one logical processor per physical unit. If an MSR is Shared, one MSR is shared between the logical processors. If an MSR is Unique, each logical processor has its own MSR.
...
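Several register groups in Table 35-15 sit at regular strides: the last-branch stack occupies 680H-68FH and 6C0H-6CFH, and each machine-check bank takes four consecutive addresses starting at 400H. The following C sketch shows how such addresses and the MSR_PEBS_ENABLE bit positions might be derived rather than listed one by one. All macro and function names here are invented for illustration and are not Intel-defined identifiers; the code only computes numbers and does not execute RDMSR or WRMSR.

```c
#include <stdint.h>

/* Base addresses copied from Table 35-15. */
#define MSR_LASTBRANCH_0_FROM_IP 0x680u
#define MSR_LASTBRANCH_0_TO_IP   0x6C0u
#define IA32_MC0_CTL             0x400u

/* MSR_PEBS_ENABLE (3F1H) bit positions listed in the table. */
#define PEBS_UOP_TAG        (1ull << 24)
#define PEBS_ENABLE_MY_THR  (1ull << 25)
#define PEBS_ENABLE_OTH_THR (1ull << 26)

/* Hypothetical helper: address of the n-th "from" record, n in 0..15. */
static inline uint32_t lbr_from_ip_msr(unsigned n)
{
    return MSR_LASTBRANCH_0_FROM_IP + n;
}

/* Hypothetical helper: address of the n-th "to" record, n in 0..15. */
static inline uint32_t lbr_to_ip_msr(unsigned n)
{
    return MSR_LASTBRANCH_0_TO_IP + n;
}

/* Register order inside one machine-check bank, as the table lists it. */
enum mc_reg { MC_CTL = 0, MC_STATUS = 1, MC_ADDR = 2, MC_MISC = 3 };

/* Hypothetical helper: address of a register in bank 0..4. */
static inline uint32_t mc_bank_msr(unsigned bank, enum mc_reg reg)
{
    return IA32_MC0_CTL + 4u * bank + (uint32_t)reg;
}
```

For example, mc_bank_msr(4, MC_MISC) yields 413H, matching the IA32_MC4_MISC row. The last-branch helpers only apply to family 0FH, model 03H and later, per the note on the 680H row.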

27. Updates to Appendix B, Volume 3C


Change bars show changes to Appendix B of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
...

B.1.1 16-Bit Control Fields

A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-1 enumerates the 16-bit control fields.

Table B-1 Encoding for 16-Bit Control Fields (0000_00xx_xxxx_xxx0B)

Field Name                                 Index        Encoding
Virtual-processor identifier (VPID) [1]    000000000B   00000000H
Posted-interrupt notification vector [2]   000000001B   00000002H

NOTES:
1. This field exists only on processors that support the 1-setting of the "enable VPID" VM-execution control.
2. This field exists only on processors that support the 1-setting of the "process posted interrupts" VM-execution control.
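The encodings in this appendix can be derived from their parts instead of being copied from the tables. The C sketch below assumes the layout implied by the table captions: access type in bit 0, index in bits 9:1, field type in bits 11:10, and width in bits 14:13 (0 for the 16-bit fields, 1 for the 64-bit fields with the 0010_... pattern). The enum and function names are hypothetical, chosen for this example only.

```c
#include <stdint.h>

/* Width codes implied by the captions: 0000_... is 16-bit, 0010_... is 64-bit. */
enum vmcs_width { VMCS_WIDTH_16 = 0, VMCS_WIDTH_64 = 1 };

/* Field types carried in bits 11:10 of an encoding. */
enum vmcs_type {
    VMCS_TYPE_CONTROL = 0,
    VMCS_TYPE_RO_DATA = 1,
    VMCS_TYPE_GUEST   = 2,
    VMCS_TYPE_HOST    = 3
};

/* Build an encoding; 'high' selects the high access type (64-bit fields only). */
static inline uint32_t vmcs_encoding(enum vmcs_width width, enum vmcs_type type,
                                     uint32_t index, uint32_t high)
{
    return ((uint32_t)width << 13)      /* bits 14:13: width       */
         | ((uint32_t)type << 10)       /* bits 11:10: field type  */
         | ((index & 0x1FFu) << 1)      /* bits 9:1:   index       */
         | (high & 1u);                 /* bit 0:      access type */
}
```

With these assumptions, vmcs_encoding(VMCS_WIDTH_16, VMCS_TYPE_CONTROL, 1, 0) gives 00000002H, the posted-interrupt notification vector entry in Table B-1.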

B.1.2 16-Bit Guest-State Fields

A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-2 enumerates 16-bit guest-state fields.

Table B-2 Encodings for 16-Bit Guest-State Fields (0000_10xx_xxxx_xxx0B)

Field Name                   Index        Encoding
Guest ES selector            000000000B   00000800H
Guest CS selector            000000001B   00000802H
Guest SS selector            000000010B   00000804H
Guest DS selector            000000011B   00000806H
Guest FS selector            000000100B   00000808H
Guest GS selector            000000101B   0000080AH
Guest LDTR selector          000000110B   0000080CH
Guest TR selector            000000111B   0000080EH
Guest interrupt status [1]   000001000B   00000810H

NOTES:
1. This field exists only on processors that support the 1-setting of the "virtual-interrupt delivery" VM-execution control.
...

B.2.1 64-Bit Control Fields

A value of 0 in bits 11:10 of an encoding indicates a control field. These fields are distinguished by their index value in bits 9:1. Table B-4 enumerates the 64-bit control fields.

Table B-4 Encodings for 64-Bit Control Fields (0010_00xx_xxxx_xxxAb)

Field Name                                        Index        Encoding
Address of I/O bitmap A (full)                    000000000B   00002000H
Address of I/O bitmap A (high)                                 00002001H
Address of I/O bitmap B (full)                    000000001B   00002002H
Address of I/O bitmap B (high)                                 00002003H
Address of MSR bitmaps (full) [1]                 000000010B   00002004H
Address of MSR bitmaps (high) [1]                              00002005H
VM-exit MSR-store address (full)                  000000011B   00002006H
VM-exit MSR-store address (high)                               00002007H
VM-exit MSR-load address (full)                   000000100B   00002008H
VM-exit MSR-load address (high)                                00002009H
VM-entry MSR-load address (full)                  000000101B   0000200AH
VM-entry MSR-load address (high)                               0000200BH
Executive-VMCS pointer (full)                     000000110B   0000200CH
Executive-VMCS pointer (high)                                  0000200DH
TSC offset (full)                                 000001000B   00002010H
TSC offset (high)                                              00002011H
Virtual-APIC address (full) [2]                   000001001B   00002012H
Virtual-APIC address (high) [2]                                00002013H
APIC-access address (full) [3]                    000001010B   00002014H
APIC-access address (high) [3]                                 00002015H
Posted-interrupt descriptor address (full) [4]    000001011B   00002016H
Posted-interrupt descriptor address (high) [4]                 00002017H
VM-function controls (full) [5]                   000001100B   00002018H
VM-function controls (high) [5]                                00002019H
EPT pointer (EPTP; full) [6]                      000001101B   0000201AH
EPT pointer (EPTP; high) [6]                                   0000201BH
EOI-exit bitmap 0 (EOI_EXIT0; full) [7]           000001110B   0000201CH
EOI-exit bitmap 0 (EOI_EXIT0; high) [7]                        0000201DH
EOI-exit bitmap 1 (EOI_EXIT1; full) [7]           000001111B   0000201EH
EOI-exit bitmap 1 (EOI_EXIT1; high) [7]                        0000201FH
EOI-exit bitmap 2 (EOI_EXIT2; full) [7]           000010000B   00002020H
EOI-exit bitmap 2 (EOI_EXIT2; high) [7]                        00002021H
EOI-exit bitmap 3 (EOI_EXIT3; full) [7]           000010001B   00002022H
EOI-exit bitmap 3 (EOI_EXIT3; high) [7]                        00002023H
EPTP-list address (full) [8]                      000010010B   00002024H
EPTP-list address (high) [8]                                   00002025H

NOTES:
1. This field exists only on processors that support the 1-setting of the "use MSR bitmaps" VM-execution control.
2. This field exists only on processors that support the 1-setting of the "use TPR shadow" VM-execution control.
3. This field exists only on processors that support the 1-setting of the "virtualize APIC accesses" VM-execution control.
4. This field exists only on processors that support the 1-setting of the "process posted interrupts" VM-execution control.
5. This field exists only on processors that support the 1-setting of the "enable VM functions" VM-execution control.
6. This field exists only on processors that support the 1-setting of the "enable EPT" VM-execution control.
7. This field exists only on processors that support the 1-setting of the "virtual-interrupt delivery" VM-execution control.
8. This field exists only on processors that support the 1-setting of the "EPTP switching" VM-function control.

B.2.2 64-Bit Read-Only Data Field

A value of 1 in bits 11:10 of an encoding indicates a read-only data field. These fields are distinguished by their index value in bits 9:1. There is only one such 64-bit field, as given in Table B-5. (As with other 64-bit fields, this one has two encodings.)

Table B-5 Encodings for 64-Bit Read-Only Data Field (0010_01xx_xxxx_xxxAb)

Field Name                           Index        Encoding
Guest-physical address (full) [1]    000000000B   00002400H
Guest-physical address (high) [1]                 00002401H

NOTES:
1. This field exists only on processors that support the 1-setting of the "enable EPT" VM-execution control.
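Each 64-bit field has a "full" and a "high" encoding that differ only in bit 0, as the paired rows of the 64-bit tables show. The C sketch below illustrates that relationship. It relies on the SDM's general rule, not restated in this excerpt, that the high encoding accesses bits 63:32 of the field; the function names are made up for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* The high encoding is the full encoding with bit 0 (access type) set. */
static inline uint32_t vmcs_high_encoding(uint32_t full_encoding)
{
    return full_encoding | 1u;
}

/* True if an encoding uses the high access type. */
static inline bool vmcs_is_high_access(uint32_t encoding)
{
    return (encoding & 1u) != 0;
}

/* Reassemble a 64-bit field value from two 32-bit halves, for example when
 * the halves were read separately through the full and high encodings. */
static inline uint64_t vmcs_combine64(uint32_t low_half, uint32_t high_half)
{
    return ((uint64_t)high_half << 32) | low_half;
}
```

For the guest-physical address field, vmcs_high_encoding(0x2400) returns 0x2401, the second row of Table B-5.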


B.2.3 64-Bit Guest-State Fields

A value of 2 in bits 11:10 of an encoding indicates a field in the guest-state area. These fields are distinguished by their index value in bits 9:1. Table B-6 enumerates the 64-bit guest-state fields.

Table B-6 Encodings for 64-Bit Guest-State Fields (0010_10xx_xxxx_xxxAb)

Field Name                                Index        Encoding
VMCS link pointer (full)                  000000000B   00002800H
VMCS link pointer (high)                               00002801H
Guest IA32_DEBUGCTL (full)                000000001B   00002802H
Guest IA32_DEBUGCTL (high)                             00002803H
Guest IA32_PAT (full) [1]                 000000010B   00002804H
Guest IA32_PAT (high) [1]                              00002805H
Guest IA32_EFER (full) [2]                000000011B   00002806H
Guest IA32_EFER (high) [2]                             00002807H
Guest IA32_PERF_GLOBAL_CTRL (full) [3]    000000100B   00002808H
Guest IA32_PERF_GLOBAL_CTRL (high) [3]                 00002809H
Guest PDPTE0 (full) [4]                   000000101B   0000280AH
Guest PDPTE0 (high) [4]                                0000280BH
Guest PDPTE1 (full) [4]                   000000110B   0000280CH
Guest PDPTE1 (high) [4]                                0000280DH
Guest PDPTE2 (full) [4]                   000000111B   0000280EH
Guest PDPTE2 (high) [4]                                0000280FH
Guest PDPTE3 (full) [4]                   000001000B   00002810H
Guest PDPTE3 (high) [4]                                00002811H

NOTES:
1. This field exists only on processors that support either the 1-setting of the "load IA32_PAT" VM-entry control or that of the "save IA32_PAT" VM-exit control.
2. This field exists only on processors that support either the 1-setting of the "load IA32_EFER" VM-entry control or that of the "save IA32_EFER" VM-exit control.
3. This field exists only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-entry control.
4. This field exists only on processors that support the 1-setting of the "enable EPT" VM-execution control.

B.2.4 64-Bit Host-State Fields

A value of 3 in bits 11:10 of an encoding indicates a field in the host-state area. These fields are distinguished by their index value in bits 9:1. Table B-7 enumerates the 64-bit host-state fields.

Table B-7 Encodings for 64-Bit Host-State Fields (0010_11xx_xxxx_xxxAb)

Field Name                               Index        Encoding
Host IA32_PAT (full) [1]                 000000000B   00002C00H
Host IA32_PAT (high) [1]                              00002C01H
Host IA32_EFER (full) [2]                000000001B   00002C02H
Host IA32_EFER (high) [2]                             00002C03H
Host IA32_PERF_GLOBAL_CTRL (full) [3]    000000010B   00002C04H
Host IA32_PERF_GLOBAL_CTRL (high) [3]                 00002C05H

NOTES:
1. This field exists only on processors that support the 1-setting of the "load IA32_PAT" VM-exit control.
2. This field exists only on processors that support the 1-setting of the "load IA32_EFER" VM-exit control.
3. This field exists only on processors that support the 1-setting of the "load IA32_PERF_GLOBAL_CTRL" VM-exit control.
...
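Going the other way, an encoding can be split back into the parts that Sections B.1 and B.2 describe. The C sketch below is one way to do that; it assumes the width sits in bits 14:13, as the table captions imply, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/* Decoded parts of a VMCS field encoding (illustrative layout). */
struct vmcs_field {
    unsigned width;  /* bits 14:13: 0 = 16-bit, 1 = 64-bit in this excerpt */
    unsigned type;   /* bits 11:10: 0 control, 1 read-only, 2 guest, 3 host */
    unsigned index;  /* bits 9:1 */
    unsigned high;   /* bit 0: 1 selects the high half of a 64-bit field */
};

/* Split an encoding into its width, type, index, and access type. */
static inline struct vmcs_field vmcs_decode(uint32_t encoding)
{
    struct vmcs_field f;
    f.high  = encoding & 1u;
    f.index = (encoding >> 1) & 0x1FFu;
    f.type  = (encoding >> 10) & 0x3u;
    f.width = (encoding >> 13) & 0x3u;
    return f;
}

/* Human-readable name for the field type in bits 11:10. */
static inline const char *vmcs_type_name(unsigned type)
{
    static const char *const names[4] = {
        "control", "read-only data", "guest-state", "host-state"
    };
    return names[type & 3u];
}
```

Decoding 00002C05H this way gives width 1, type 3 (host-state), index 2, and the high access type, which is the last row of Table B-7.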

28. Updates to Appendix C, Volume 3C


Change bars show changes to Appendix C of the Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3C: System Programming Guide, Part 3.
...

Table C-1 Basic Exit Reasons


Basic Exit Reason   Description
0    Exception or non-maskable interrupt (NMI). Either:
     1: Guest software caused an exception and the bit in the exception bitmap associated with the exception's vector was 1.
     2: An NMI was delivered to the logical processor and the "NMI exiting" VM-execution control was 1.
     This case includes executions of BOUND that cause #BR, executions of INT3 (they cause #BP), executions of INTO that cause #OF, and executions of UD2 (they cause #UD).
1    External interrupt. An external interrupt arrived and the "external-interrupt exiting" VM-execution control was 1.
2    Triple fault. The logical processor encountered an exception while attempting to call the double-fault handler and that exception did not itself cause a VM exit due to the exception bitmap.
3    INIT signal. An INIT signal arrived.
4    Start-up IPI (SIPI). A SIPI arrived while the logical processor was in the "wait-for-SIPI" state.
5    I/O system-management interrupt (SMI). An SMI arrived immediately after retirement of an I/O instruction and caused an SMM VM exit (see Section 34.15.2).
6    Other SMI. An SMI arrived and caused an SMM VM exit (see Section 34.15.2) but not immediately after retirement of an I/O instruction.
7    Interrupt window. At the beginning of an instruction, RFLAGS.IF was 1; events were not blocked by STI or by MOV SS; and the "interrupt-window exiting" VM-execution control was 1.
8    NMI window. At the beginning of an instruction, there was no virtual-NMI blocking; events were not blocked by MOV SS; and the "NMI-window exiting" VM-execution control was 1.
9    Task switch. Guest software attempted a task switch.
10   CPUID. Guest software attempted to execute CPUID.
11   GETSEC. Guest software attempted to execute GETSEC.


Table C-1 Basic Exit Reasons (Contd.)


Basic Exit Reason   Description

12   HLT. Guest software attempted to execute HLT and the "HLT exiting" VM-execution control was 1.
13   INVD. Guest software attempted to execute INVD.
14   INVLPG. Guest software attempted to execute INVLPG and the "INVLPG exiting" VM-execution control was 1.
15   RDPMC. Guest software attempted to execute RDPMC and the "RDPMC exiting" VM-execution control was 1.
16   RDTSC. Guest software attempted to execute RDTSC and the "RDTSC exiting" VM-execution control was 1.
17   RSM. Guest software attempted to execute RSM in SMM.
18   VMCALL. VMCALL was executed either by guest software (causing an ordinary VM exit) or by the executive monitor (causing an SMM VM exit; see Section 34.15.2).
19   VMCLEAR. Guest software attempted to execute VMCLEAR.
20   VMLAUNCH. Guest software attempted to execute VMLAUNCH.
21   VMPTRLD. Guest software attempted to execute VMPTRLD.
22   VMPTRST. Guest software attempted to execute VMPTRST.
23   VMREAD. Guest software attempted to execute VMREAD.
24   VMRESUME. Guest software attempted to execute VMRESUME.
25   VMWRITE. Guest software attempted to execute VMWRITE.
26   VMXOFF. Guest software attempted to execute VMXOFF.
27   VMXON. Guest software attempted to execute VMXON.
28   Control-register accesses. Guest software attempted to access CR0, CR3, CR4, or CR8 using CLTS, LMSW, or MOV CR and the VM-execution control fields indicate that a VM exit should occur (see Section 25.1 for details). This basic exit reason is not used for trap-like VM exits following executions of the MOV to CR8 instruction when the "use TPR shadow" VM-execution control is 1.
29   MOV DR. Guest software attempted a MOV to or from a debug register and the "MOV-DR exiting" VM-execution control was 1.
30   I/O instruction. Guest software attempted to execute an I/O instruction and either:
       1: The "use I/O bitmaps" VM-execution control was 0 and the "unconditional I/O exiting" VM-execution control was 1.
       2: The "use I/O bitmaps" VM-execution control was 1 and a bit in the I/O bitmap associated with one of the ports accessed by the I/O instruction was 1.
31   RDMSR. Guest software attempted to execute RDMSR and either:
       1: The "use MSR bitmaps" VM-execution control was 0.
       2: The value of RCX was neither in the range 00000000H - 00001FFFH nor in the range C0000000H - C0001FFFH.
       3: The value of RCX was in the range 00000000H - 00001FFFH and the nth bit in read bitmap for low MSRs was 1, where n was the value of RCX.
       4: The value of RCX was in the range C0000000H - C0001FFFH and the nth bit in read bitmap for high MSRs was 1, where n was the value of RCX & 00001FFFH.
32   WRMSR. Guest software attempted to execute WRMSR and either:
       1: The "use MSR bitmaps" VM-execution control was 0.
       2: The value of RCX was neither in the range 00000000H - 00001FFFH nor in the range C0000000H - C0001FFFH.
       3: The value of RCX was in the range 00000000H - 00001FFFH and the nth bit in write bitmap for low MSRs was 1, where n was the value of RCX.
       4: The value of RCX was in the range C0000000H - C0001FFFH and the nth bit in write bitmap for high MSRs was 1, where n was the value of RCX & 00001FFFH.
33   VM-entry failure due to invalid guest state. A VM entry failed one of the checks identified in Section 26.3.1.
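
The four RDMSR conditions listed for basic exit reason 31 can be read as a small decision procedure. The following C fragment is a minimal sketch of that procedure, not a description of any particular VMM. The struct layout, the function names, and the 1-KByte size of each bitmap are assumptions made for illustration, as is the convention that bit n of a bitmap is bit (n mod 8) of byte (n / 8).

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed for this sketch: each bitmap holds 8192 bits (1 KByte),
 * one bit per MSR index in a 00000000H - 00001FFFH wide range. */
#define MSR_BITMAP_BYTES 1024u

/* Hypothetical container for the two read bitmaps named in the table. */
struct msr_read_bitmaps {
    uint8_t low[MSR_BITMAP_BYTES];   /* read bitmap for low MSRs  */
    uint8_t high[MSR_BITMAP_BYTES];  /* read bitmap for high MSRs */
};

/* Returns the nth bit of a bitmap (assumed little-endian bit numbering). */
static bool bitmap_bit(const uint8_t *bitmap, uint32_t n)
{
    return ((bitmap[n / 8u] >> (n % 8u)) & 1u) != 0u;
}

/* Returns true if RDMSR with the given RCX value would cause a VM exit
 * under the four conditions listed for basic exit reason 31. */
bool rdmsr_causes_vm_exit(bool use_msr_bitmaps,
                          const struct msr_read_bitmaps *bm,
                          uint64_t rcx)
{
    /* Condition 1: the "use MSR bitmaps" control is 0. */
    if (!use_msr_bitmaps)
        return true;

    /* Condition 3: low range, n is the value of RCX. */
    if (rcx <= 0x00001FFFu)
        return bitmap_bit(bm->low, (uint32_t)rcx);

    /* Condition 4: high range, n is RCX & 00001FFFH. */
    if (rcx >= 0xC0000000u && rcx <= 0xC0001FFFu)
        return bitmap_bit(bm->high, (uint32_t)rcx & 0x1FFFu);

    /* Condition 2: RCX is in neither range. */
    return true;
}
```

The WRMSR conditions for basic exit reason 32 have the same shape; a corresponding check would consult the write bitmaps for low and high MSRs instead of the read bitmaps.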

Table C-1 Basic Exit Reasons (Contd.)


Basic Exit Reason   Description

34   VM-entry failure due to MSR loading. A VM entry failed in an attempt to load MSRs. See Section 26.4.
36   MWAIT. Guest software attempted to execute MWAIT and the "MWAIT exiting" VM-execution control was 1.
37   Monitor trap flag. A VM entry occurred due to the 1-setting of the "monitor trap flag" VM-execution control and injection of an MTF VM exit as part of VM entry. See Section 25.5.2.
39   MONITOR. Guest software attempted to execute MONITOR and the "MONITOR exiting" VM-execution control was 1.
40   PAUSE. Either guest software attempted to execute PAUSE and the "PAUSE exiting" VM-execution control was 1 or the "PAUSE-loop exiting" VM-execution control was 1 and guest software executed a PAUSE loop with execution time exceeding PLE_Window (see Section 25.1.3).
41   VM-entry failure due to machine-check event. A machine-check event occurred during VM entry (see Section 26.8).
43   TPR below threshold. The logical processor determined that the value of bits 7:4 of the byte at offset 080H on the virtual-APIC page was below that of the TPR threshold VM-execution control field while the "use TPR shadow" VM-execution control was 1 either as part of TPR virtualization (Section 29.1.2) or VM entry (Section 26.6.7).
44   APIC access. Guest software attempted to access memory at a physical address on the APIC-access page and the "virtualize APIC accesses" VM-execution control was 1 (see Section 29.4).
45   Virtualized EOI. EOI virtualization was performed for a virtual interrupt whose vector indexed a bit set in the EOI-exit bitmap.
46   Access to GDTR or IDTR. Guest software attempted to execute LGDT, LIDT, SGDT, or SIDT and the "descriptor-table exiting" VM-execution control was 1.
47   Access to LDTR or TR. Guest software attempted to execute LLDT, LTR, SLDT, or STR and the "descriptor-table exiting" VM-execution control was 1.
48   EPT violation. An attempt to access memory with a guest-physical address was disallowed by the configuration of the EPT paging structures.
49   EPT misconfiguration. An attempt to access memory with a guest-physical address encountered a misconfigured EPT paging-structure entry.
50   INVEPT. Guest software attempted to execute INVEPT.
51   RDTSCP. Guest software attempted to execute RDTSCP and the "enable RDTSCP" and "RDTSC exiting" VM-execution controls were both 1.
52   VMX-preemption timer expired. The preemption timer counted down to zero.
53   INVVPID. Guest software attempted to execute INVVPID.
54   WBINVD. Guest software attempted to execute WBINVD and the "WBINVD exiting" VM-execution control was 1.
55   XSETBV. Guest software attempted to execute XSETBV.
56   APIC write. Guest software completed a write to the virtual-APIC page that must be virtualized by VMM software (see Section 29.4.3.3).
57   RDRAND. Guest software attempted to execute RDRAND and the "RDRAND exiting" VM-execution control was 1.
58   INVPCID. Guest software attempted to execute INVPCID and the "enable INVPCID" and "INVLPG exiting" VM-execution controls were both 1.
59   VMFUNC. Guest software invoked a VM function with the VMFUNC instruction and the VM function either was not enabled or generated a function-specific condition causing a VM exit.
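
The comparison described for basic exit reason 43 (TPR below threshold) can be sketched in a few lines of C. This is an illustrative fragment only: the function name and parameters are hypothetical, the caller is assumed to have already established that the "use TPR shadow" VM-execution control is 1, and the sketch assumes that only bits 3:0 of the TPR threshold field take part in the comparison.

```c
#include <stdbool.h>
#include <stdint.h>

/* Offset of the byte examined on the virtual-APIC page (080H in the table). */
#define VAPIC_TPR_OFFSET 0x080u

/* Returns true if bits 7:4 of the byte at offset 080H on the virtual-APIC
 * page are below the TPR threshold. The caller is assumed to have checked
 * that the "use TPR shadow" VM-execution control is 1. */
bool tpr_below_threshold(const uint8_t *virtual_apic_page,
                         uint32_t tpr_threshold)
{
    /* Bits 7:4 of the byte at offset 080H. */
    uint8_t vtpr_class = (uint8_t)((virtual_apic_page[VAPIC_TPR_OFFSET] >> 4) & 0xFu);

    /* Assumption for this sketch: only bits 3:0 of the field are compared. */
    uint8_t threshold = (uint8_t)(tpr_threshold & 0xFu);

    return vtpr_class < threshold;
}
```

For example, with the byte at offset 080H equal to 35H, bits 7:4 hold the value 3, so the condition holds for a threshold of 4 and does not hold for a threshold of 3.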

...
