SLQ Server: Row and Index Storage

The content of a memory-optimized table is stored differently than the content of traditional tables. As you already know from Chapter 15, traditional tables are disk-based tables, meaning that their data and corresponding indices are stored on physical pages.

On the other hand, memory-optimized tables are designed according to byte-addressable memory.

NOTE Byte-addressable means that, given an “address,” this address refers to a single block of 8 bits. The most important consequence is that a byte becomes a working unit. In other words, a byte of memory will be read (or written) with a single read (write) call.

The following subsections give just a high-level overview of row storage of memory- optimized tables. For a detailed description, please refer to Microsoft Docs and white papers concerning these tables.

1. Row Storage

Rows of a memory-optimized table are connected using the tables’ indices, and the index provides the structure for such tables. That is a reason why a memory-optimized table must have at least an index created on it.

Also, the structure of a row of a memory-optimized table is significantly different than the structure of a row of a disk-based table. Each row of a memory-optimized table consists of a header and the corresponding content. The header contains several fields, of which we will explore the two most important ones: Begin Timestamp (Begin-TS) and End Timestamp (End-TS). The value of Begin-TS is the timestamp of the transaction that inserted the row, and the End-TS value is the timestamp for the transaction that deleted the row. That way, there is always information for each row regarding when it was inserted and when it was deleted. If a row is not (yet) deleted, the special value called “infinity” is used for the End-TS value, so the system can differentiate between current and deleted rows. (All deleted rows have the End-TS value, which is different from infinity.)

2. Index Storage

As you already know, each memory-optimized table must have at least one index. There are two types of indices supported for memory-optimized tables:

  • Hash index
  • Nonclustered index

The following two subsections describe these index types.

2.1. Hash Index

A hash index consists of a collection of buckets organized in an array. Besides buckets, each hash index has a hash function, which determines the storage place for each index entry.

In other words, a hash function is applied to each index key, and the result of the function determines the bucket where the particular row will be stored. All key values that hash to the same value are linked together in a chain, and can be accessed with the same pointer in the hash index. Also, when a row is inserted in an indexed table, the hash function will be applied to the index key value in that row and the value of the function again determines the corresponding bucket.

To explain hash indices, let’s take another look at the code from Example 21.3, where such an index is created:

USE sample2;

CREATE TABLE employee_mem

( emp_no int not null, emp_lname CHAR(20) NOT NULL , emp_fname CHAR(20) NOT NULL, dept_no CHAR(4) NOT NULL,

PRIMARY KEY NONCLUSTERED HASH (emp_no)

WITH (BUCKET_COUNT = 1000))

WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

First, you can create an index for a memory-optimized table only by using the CREATE TABLE statement. In other words, Transact-SQL does not support the CREATE INDEX statement for memory-optimized tables. The index created in Example 21.3 is a primary hash index, and it is created at table level for the emp_no column. (For a discussion about table-level and column-level constraints, see Examples 5.7 and 5.8 in Chapter 5.)

When a hash index is created, you must specify a number of buckets using the BUCKET_ COUNT option. It is recommended to choose a number of buckets equal to or greater than the expected number of unique values of the index key column. That way you increase the probability that each bucket will have only rows with a single value in its chain. The number you supply is rounded up to the next power of two, so the value used in Example 21.3 (1000) will be rounded up to 1024.

NOTE A hash table uses a hash function to compute an index into an array of buckets or slots, from which the desired value can be found. This index is called hash index.

2.2. Memory-Optimized Nonclustered Index

The general structure of a memory-optimized nonclustered index is similar to the traditional B-tree data structure, except that index pages are not a fixed sized, and once they are built they are unchangeable. Like a regular B-tree page, each index page contains a set of ordered key values, and for each value there is a corresponding pointer. Example 21.4 shows the creation of a nonclustered index.

NOTE Hash indices and memory-optimized nonclustered indices exist only in memory. The index structure is created when the memory-optimized table is created in memory, both during CREATE TABLE and during database startup. All index operations are not logged in the transaction log.

Example 21.4

CREATE TABLE employee_mem2

( emp_no int not null, emp_lname CHAR(20) NOT NULL ,

emp_fname CHAR(20) NOT NULL, dept_no CHAR(4) NOT NULL,

PRIMARY KEY NONCLUSTERED (emp_no) )

WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

Again, a nonclustered index can be created only using the CREATE TABLE statement. The creation of a nonclustered index differs from the creation of a hash index in two ways. First, the NONCLUSTERED option is used instead of NONCLUSTERED HASH. Second, the number of buffers is not specified, because buffers exist only in relation to hash indices.

NOTE If most of your queries are range queries, the use of a nonclustered index is recommended.

Example 21.5 creates a table with two indices: one hash index and one nonclustered index.

Example 21.5

USE sample2;

CREATE TABLE employee_mem4

(emp_no int not null PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT=1000),

emp_lname CHAR(20) NOT NULL ,

emp_fname CHAR(20) NOT NULL, dept_no CHAR(4) NOT NULL,

INDEX i_lname NONCLUSTERED (emp_lname))

WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

The index on the emp_no column is a hash index, while the i_lname index is a nonclustered index on the emp_lname column. (The former is defined at the column level, while the latter is defined at the table level.)

There are two significant differences between a memory-optimized nonclustered index and the B+ tree structure of the Database Engine. First, a page pointer of the memory-optimized nonclustered index contains a logical page identifier (PID), while a pointer of the B+ tree is always a physical page number. A PID indicates a position in a corresponding mapping table, and it connects each index entry with a physical memory address. Index pages of a memory-optimized nonclustered index are never updated; instead, they are replaced with a new page and the mapping table is updated so that the same PID indicates a new physical memory address.

The second significant difference between memory-optimized nonclustered indices and B-trees is that at the leaf level, modifications caused by DML operations on memory-optimized tables with a nonclustered index are stored in a set of delta values, which together build a delta file. (Delta value means that only the modified part of a row is stored, while the unchanged part exists in the original row.) In other words, each update to a logical page, which can be an insert or delete of a key value on that page, produces a logical page containing a delta record indicating the change that was made. An update is represented by two new delta records, one for the delete and one for the insert of the new value.

Source: Petkovic Dusan (2020), Microsoft SQL Server 2019: A Beginner’s Guide, Seventh Edition-McGraw-Hill Education.

Leave a Reply

Your email address will not be published. Required fields are marked *