Intel® X86 Encoder Decoder
Fast Encoder for Specific Instructions


Data Structures

struct  xed_enc2_req_payload_t
 This structure is filled in by the various XED ENC2 functions.

union  xed_enc2_req_t
 A wrapper for xed_enc2_req_payload_t.

Functions

XED_DLL_EXPORT void xed_emit_seg_prefix (xed_enc2_req_t *r, xed_reg_enum_t reg)
 Emit a legacy segment prefix byte into the specified request's output buffer.

static XED_INLINE xed_uint32_t xed_enc2_encoded_length (xed_enc2_req_t *r)
 Returns the number of bytes that were used for the encoding.

XED_DLL_EXPORT void xed_enc2_error (const char *fmt,...)
 The error handler routine.

static XED_INLINE void xed_enc2_req_t_init (xed_enc2_req_t *r, xed_uint8_t *output_buffer)
 Zero out a xed_enc2_req_t structure and set the output pointer.

XED_DLL_EXPORT void xed_enc2_set_check_args (xed_bool_t on)
 Turn off (or on) argument checking if using the checked encoder interface.

XED_DLL_EXPORT void xed_enc2_set_error_handler (xed_user_abort_handler_t *fn)
 Set a function taking a variable number of arguments (stdarg) to handle the errors and die.

Detailed Description

The basic idea behind the ENC2 fast encoder is that there is one encode function per variant of every instruction. Instructions are encoded in three encoding spaces (legacy, VEX and EVEX), and every variation needs its own function name. To produce unique names, ENC2 follows a few naming conventions. Legacy-encoded instructions often have three variations in 64b mode (two in other modes) to handle 16-bit, 32-bit and 64-bit operands; those sizes are usually distinguished by "_o16", "_o32" and "_o64" in the ENC2 function names. Unique naming is complicated by the fact that the instruction set often has multiple encodings for the same operation. To disambiguate such alias encodings, some function names include the substring "_vrN", where N is an integer. Similarly, VEX and EVEX encodings of related instructions often need to be distinguished when their instruction name and operands are the same; to accomplish that, all ENC2 EVEX encoding function names contain the substring "_e". The checked interface functions end with "_chk".

For instructions that take conventional x86 memory operands, 6 functions are generated, one per addressing mode. The 6 functions are denoted b, bd8, bd32, bis, bisd8, and bisd32, where b is the base register, i the index register, s the scale factor, d8 an 8-bit displacement, and d32 a 32-bit displacement.

The idea behind having different functions for the different addressing modes is to make the encode functions simpler and more straight-line code. Memory instructions also indicate their effective addressing width with one of "_a16", "_a32" or "_a64" substrings.
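As a rough illustration of how a caller might decide which of the six forms to call, the sketch below maps an operand description to a suffix. The enum and both helper functions are hypothetical names invented for this example and are not part of the XED API; the sketch also assumes the caller wants the smallest displacement that fits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical enum for illustration only; not part of the XED API. */
typedef enum {
    FORM_B, FORM_BD8, FORM_BD32, FORM_BIS, FORM_BISD8, FORM_BISD32
} mem_form_t;

/* Pick the smallest addressing form that can represent the operand.
   has_index: nonzero when an index register (and scale) is used.
   disp:      displacement value; 0 means no displacement is requested. */
static mem_form_t choose_mem_form(int has_index, int32_t disp)
{
    if (disp == 0)
        return has_index ? FORM_BIS : FORM_B;
    if (disp >= INT8_MIN && disp <= INT8_MAX)
        return has_index ? FORM_BISD8 : FORM_BD8;
    return has_index ? FORM_BISD32 : FORM_BD32;
}

/* Map a form to the suffix used in the ENC2 function names. */
static const char *mem_form_suffix(mem_form_t f)
{
    static const char *const names[] = {
        "b", "bd8", "bd32", "bis", "bisd8", "bisd32"
    };
    assert((unsigned)f <= (unsigned)FORM_BISD32);
    return names[f];
}
```

Under these assumptions, an operand with a base, an index and the displacement 0x11223344 selects the bisd32 form, while a base-only operand with displacement -8 selects bd8. Doing this selection once in the caller is what lets each generated encode function stay straight-line.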

The libraries for the ENC2 encoder are built when one includes the "--enc2" switch during the build process. One set of libraries and headers is generated for each supported configuration. Currently Intel® XED ENC2 supports 64b mode with 64b addressing (m64,a64) and 32b mode with 32b addressing (m32,a32). The build process creates an enc2-m64-a64 directory and an enc2-m32-a32 directory, each with two libraries, one for the checked interface and one for the unchecked interface. There are 2 headers as well, one for each library, in the hdr/xed subdirectory of the respective enc2-* directory. On Linux, for a static build, you'd see:

enc2-m64-a64/
    libxed-chk-enc2-m64-a64.a
    libxed-enc2-m64-a64.a
    hdr/
        xed/
            xed-chk-enc2-m64-a64.h
            xed-enc2-m64-a64.h

Given the large size of the generated ENC2 headers, doxygen documentation is not created for those header files. Please view the headers directly in your editor.

Even with the unchecked interface, some register checking is done on the addressing registers. In the x86 encoding system, some choices of base register require that an 8-bit or 32-bit displacement is also used. In those cases, the ENC2 encoder is capable of supplying a zero-valued displacement.
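The underlying architectural rule can be sketched as follows. In the ModRM/SIB encoding, a base register whose low three bits are 101 (RBP or R13 in 64b mode) cannot be encoded with mod=00, because that bit pattern is reserved for displacement-only forms, so an encoder has to fall back to mod=01 with a zero disp8. The function names below are hypothetical and this is only an illustration of the rule, not XED's implementation.

```c
#include <assert.h>
#include <stdint.h>

/* base_regnum is the hardware register number 0..15
   (for example RBP = 5, R12 = 12, R13 = 13). */
static int base_needs_displacement(unsigned base_regnum)
{
    /* Low three bits 101 with mod=00 mean "no base, disp32 follows". */
    return (base_regnum & 7u) == 5u;
}

/* Choose the ModRM.mod field for a base (+ optional displacement) operand:
   0 = no displacement, 1 = 8-bit displacement, 2 = 32-bit displacement. */
static unsigned choose_modrm_mod(unsigned base_regnum, int32_t disp)
{
    assert(base_regnum < 16);
    if (disp == 0 && !base_needs_displacement(base_regnum))
        return 0;
    if (disp >= INT8_MIN && disp <= INT8_MAX)
        return 1;   /* covers the forced zero-valued disp8 case */
    return 2;
}
```

With RBP or R13 as the base and a zero displacement, this sketch returns mod=01, which corresponds to the zero-valued displacement mentioned above. R12 and RSP need a SIB byte but no forced displacement, so they still return mod=00.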

Users can install their own error handler by calling xed_enc2_set_error_handler() and passing a pointer to a function that takes stdarg variable arguments. See examples/xed-enc2-2.c for an example.
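A handler might look roughly like the sketch below. It assumes that xed_user_abort_handler_t describes a function taking a printf-style format string and a va_list; check xed-encode-direct.h for the exact typedef before relying on this. The names my_enc2_error_handler, last_enc2_error and demo_report_error are hypothetical, and demo_report_error is only a stand-in that lets the handler be exercised without linking XED.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Most recent formatted error message, kept for logging or inspection. */
static char last_enc2_error[256];

/* Assumed handler shape: (format string, va_list). This is an assumption
   about xed_user_abort_handler_t, not a confirmed signature. */
static void my_enc2_error_handler(const char *fmt, va_list args)
{
    assert(fmt != NULL);
    vsnprintf(last_enc2_error, sizeof last_enc2_error, fmt, args);
    fprintf(stderr, "ENC2 error: %s\n", last_enc2_error);
}

/* Hypothetical stand-in for the library's variadic error entry point,
   used here only to demonstrate how the handler receives its arguments. */
static void demo_report_error(const char *fmt, ...)
{
    va_list args;
    va_start(args, fmt);
    my_enc2_error_handler(fmt, args);
    va_end(args);
}

/* With XED linked in, registration would presumably be:
       xed_enc2_set_error_handler(my_enc2_error_handler);          */
```

Since xed_enc2_error() is documented to call abort() even when the user handler returns, a handler like this is best suited to logging or cleanup rather than recovery.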

When using the checked interface, one can disable the checking at runtime by calling xed_enc2_set_check_args() with an integer value 0. With a nonzero argument, the argument checking can be re-enabled.

To minimize copying, ENC2 users are required to supply a pointer to an output buffer where the encoding bytes will be placed. That buffer must be at least 15 bytes long. Valid x86 encodings are at most 15 bytes long and only reach that length if redundant legacy prefixes are employed. XED ENC2 does not generate redundant legacy prefixes.

Here is an example of creating an LEA instruction using the checked interface and several fixed registers:

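#include "xed/xed-interface.h"
#include "xed/xed-chk-enc2-m64-a64.h"  /* checked ENC2 header from the listing above; the include path depends on your build */

xed_uint32_t create_lea_64b(xed_uint8_t* output_buffer)
{
    xed_reg_enum_t dest, base, index;
    xed_uint_t scale;
    xed_int32_t disp32;
    xed_enc2_req_t request;

    /* Zero the request and attach the caller-supplied output buffer. */
    xed_enc2_req_t_init(&request, output_buffer);

    dest = XED_REG_R11;
    base = XED_REG_R12;
    index = XED_REG_R13;
    scale = 1;
    disp32 = 0x11223344;

    /* LEA, 64-bit operand, base+index*scale+disp32, 64-bit addressing, checked. */
    xed_enc_lea_rm_q_bisd32_a64_chk(&request,
                                    dest,
                                    base, index, scale, disp32);

    /* Number of bytes written to output_buffer. */
    return xed_enc2_encoded_length(&request);
}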

The call to xed_enc2_req_t_init() zeros out the request structure and sets up the pointer to the output buffer. It is very important to zero the request structure before using it as much of the ENC2 code is optimized to not set zero-valued bits to zero. The call to xed_enc2_encoded_length() returns the number of bytes placed in the output buffer. Getting the length of the encoding is useful for setting the correct buffer pointer for subsequent encoder requests.
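The pointer-advancing pattern described above can be sketched without XED as follows. The encode_fn type and the emit_nop and emit_ret stand-ins are hypothetical; in real code each would be an xed_enc2_req_t_init() call, an ENC2 encode call, and a call to xed_enc2_encoded_length(). The sketch also assumes the caller wants a full 15 bytes of headroom before every instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTR_BYTES 15  /* per-instruction space the text asks for */

/* Hypothetical stand-in for "init request, encode, return encoded length". */
typedef uint32_t (*encode_fn)(uint8_t *out);

static uint32_t emit_nop(uint8_t *out) { out[0] = 0x90; return 1; } /* NOP */
static uint32_t emit_ret(uint8_t *out) { out[0] = 0xC3; return 1; } /* near RET */

/* Encode instructions back to back into buf. Returns the total number of
   bytes written; stops early when fewer than 15 bytes of space remain. */
static size_t encode_sequence(uint8_t *buf, size_t buf_size,
                              const encode_fn *fns, size_t n)
{
    size_t used = 0;
    for (size_t i = 0; i < n; i++) {
        if (buf_size - used < MAX_INSTR_BYTES)
            break;                           /* not enough headroom */
        uint32_t len = fns[i](buf + used);   /* encode at current position */
        assert(len >= 1 && len <= MAX_INSTR_BYTES);
        used += len;                         /* advance by encoded length */
    }
    return used;
}
```

Passing a 32-byte buffer and the sequence NOP, NOP, RET writes three bytes. Passing a 15-byte buffer writes only the first instruction, because the headroom check fails before the second one.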

See examples/xed-enc2-1.c and examples/xed-enc2-2.c for examples.

Function Documentation

◆ xed_emit_seg_prefix()

XED_DLL_EXPORT void xed_emit_seg_prefix ( xed_enc2_req_t *  r,
xed_reg_enum_t  reg 
)

Emit a legacy segment prefix byte into the specified request's output buffer.

◆ xed_enc2_encoded_length()

static XED_INLINE xed_uint32_t xed_enc2_encoded_length ( xed_enc2_req_t *  r)

Returns the number of bytes that were used for the encoding.

◆ xed_enc2_error()

XED_DLL_EXPORT void xed_enc2_error ( const char *  fmt,
  ... 
)

The error handler routine.

This function is called by encoder functions upon detecting argument errors. It first attempts to call the user-registered handler (configured by xed_enc2_set_error_handler()). If no user handler is set, this function calls printf() and then abort(). If the user handler returns, abort() is still called.

◆ xed_enc2_req_t_init()

static XED_INLINE void xed_enc2_req_t_init ( xed_enc2_req_t *  r,
xed_uint8_t *  output_buffer 
)

Zero out a xed_enc2_req_t structure and set the output pointer.

Required before calling any ENC2 encoding function.

◆ xed_enc2_set_check_args()

XED_DLL_EXPORT void xed_enc2_set_check_args ( xed_bool_t  on)

Turn off (or on) argument checking if using the checked encoder interface.

Pass 0 to disable argument checking or 1 to enable it.

◆ xed_enc2_set_error_handler()

XED_DLL_EXPORT void xed_enc2_set_error_handler ( xed_user_abort_handler_t *  fn)

Set a function taking a variable number of arguments (stdarg) to handle the errors and die.

The arguments are like those of printf: a format string followed by a variable number of arguments.
