SQL Server Graph Databases: Creating Node Tables and Edge Tables

Creating node tables and edge tables is straightforward: if you know the syntax of the CREATE TABLE statement for regular relational tables, you already know most of what you need. All examples that follow use the graph model shown in Figure 31-1.

The graph in Figure 31-1 contains three nodes: Employee, Company, and City. The Employee entity has four attributes (properties): ID, Name, Age, and Sex. The Company entity has four attributes: ID, Name, Sector, and City, while City has three attributes: ID, Cityname, and Statename.

The graph contains three edges: WorksIn, LocatedIn, and LivesIn. WorksIn has a property called Starts that specifies the year in which the particular employee started to work for the specified company. Similarly, the LivesIn edge has the Since property that specifies the date on which an employee moved to a particular city. (The third edge does not have any additional properties.) All three relationships are nonrecursive, meaning that they all connect two different nodes.

1. Creating Node Tables

A node table can be created in any user-defined database. The creation of such a table is very similar to creating a regular relational table, with one extension. Example 31.1 shows the creation of three node tables, corresponding to the entities shown in Figure 31-1.

Example 31.1

CREATE DATABASE graph_db;
GO

USE graph_db;

-- The AS NODE clause marks each table as a node table
CREATE TABLE dbo.Company (
    ID INT NOT NULL PRIMARY KEY,
    name VARCHAR(100) NULL,
    sector VARCHAR(25) NULL,
    city VARCHAR(100) NULL) AS NODE;

CREATE TABLE dbo.Employee (
    ID INT NOT NULL PRIMARY KEY,
    name VARCHAR(100) NULL,
    age INT NULL,
    sex CHAR(10) NULL) AS NODE;

CREATE TABLE dbo.City (
    ID INT NOT NULL PRIMARY KEY,
    name VARCHAR(100) NULL,
    stateName VARCHAR(100) NULL) AS NODE;

The most important extension concerning node tables is the AS NODE clause, written at the end of the CREATE TABLE statement. This clause defines the corresponding table as a node table. The sys.tables catalog view records this in two columns of the BIT data type: is_node and is_edge. For a node table, the value of is_node is set to 1, and the value of is_edge is set to 0. (A detailed description of metadata concerning graph databases is provided in the section “Editing Information Concerning Graph Databases” at the end of this chapter.)
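To check how the tables from Example 31.1 are flagged, you can query sys.tables directly. The following is a minimal sketch; it assumes that graph_db contains only the three node tables created above, in which case each row should show is_node = 1 and is_edge = 0.

```sql
USE graph_db;

-- List every graph table in the current database with its node/edge flags
SELECT name, is_node, is_edge
FROM sys.tables
WHERE is_node = 1 OR is_edge = 1
ORDER BY name;
```

Regular relational tables have both flags set to 0, so the WHERE clause filters them out.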

Whenever you create a node table, an implicit $node_id column is created along with the user-defined columns. This column uniquely identifies each row (node instance) of the corresponding node table. The values in the $node_id column are automatically generated and are a combination of the object_id value of that node table and an internally generated value of the BIGINT data type. (When you display the values of the $node_id column, the corresponding computed values are displayed as JSON strings.) Also, $node_id is a pseudo column that maps to an internal name with a hex string appended to it. In other words, when you select $node_id from the table, the column name appears as $node_id_<hex_string>.
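The two queries below sketch how the pseudo column can be inspected, using the dbo.Company table from Example 31.1. The first refers to the column by its short name, $node_id, and returns no rows until data has been loaded. The second lists the table's columns from sys.columns, where the implicit graph columns should appear under their internal names together with the user-defined ones. The graph_type_desc column is part of sys.columns in versions of SQL Server that support graph tables; the exact internal names depend on the generated hex string.

```sql
USE graph_db;

-- Refer to the pseudo column by its short name; the result set
-- shows the column header as $node_id_<hex_string>
SELECT $node_id, ID, name
FROM dbo.Company;

-- List all columns of the node table, including the implicit graph columns;
-- graph_type_desc is NULL for ordinary user-defined columns
SELECT name, graph_type_desc
FROM sys.columns
WHERE object_id = OBJECT_ID('dbo.Company')
ORDER BY column_id;
```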

After creating the node tables, you load data into them. As Example 31.2 shows, inserting rows into node tables works the same way as for any regular table.

Example 31.2

USE graph_db;