Creating and Populating a SNOMED CT Database

Creating the Database

The first step is to create the database schema. The code below is an excerpt from the SNOMED CT example database MySQL import script which carries out the following steps:

  • Creates a database called "snomedct"

  • Sets the default character set of the database as "utf8mb4"

  • Sets snomedct as the default database for the following script.

  • Sets timeouts appropriate to the import process.

  • Clears sql_mode settings some of which, if present, may cause import issues.

Create SNOMED CT Database

DELIMITER ;
SELECT Now() `--`,"Create Database and Initialize" '--';
-- CREATE DATABASE
DROP DATABASE IF EXISTS `snomedct`;
CREATE DATABASE `snomedct` /*!40100 DEFAULT CHARACTER SET utf8mb4 */;
USE `snomedct`;
-- INITIALIZE SETTINGS
SET GLOBAL net_write_timeout = 60;
SET GLOBAL net_read_timeout=120;
SET GLOBAL sql_mode ='';
SET SESSION sql_mode ='';

Creating Tables for Components

This section contains SQL statements that create database tables to accommodate the data in each of the main component files. Each table creation is accompanied by a summary of the relevant release file specification.

Notes

  1. The title of the file specification summary table is a link to the full file specification.

  2. The table names used on this page are prefixed with full_ as these are the tables into which the full SNOMED CT release will be imported. The loader script also create identically structured tables with the prefix snap_ and the latest snapshot view is loaded into those tables.

  3. The SQL code on this page creates the primary keys for each table (id, effective time) but omits creation of any other indexes. The loader script creates additional indexes after importing data into the table. This enables faster importing of data from the text files as the additional indexes do not need to be updated while importing.

  4. The effectiveTime is set as a DATETIME data type. This supports a specific time in hours, minutes or seconds. In practice, effectiveTime values are formally restricted to YYYYMMDD but we are aware of at least one SNOMED CT extension that includes time units in the effectiveTime field or its release files. The effectiveTime is set by default to a 2000-01-31, a date which predates any SNOMED CT effectiveTime value. In practice, the effectiveTime will always be set by the imported data so the default has no material effect.

  5. Tables are also created for the full_textDefinition table and its snapshot version. As these tables have the same structure as the description tables, the data from the textDefinition release files could be imported into those tables instead. The text definitions would still be distinguishable from the descriptions as they have a different typeId.

Creating a Concept Table

Create Concept Table

DROP TABLE IF EXISTS `full_concept`;
 
CREATE TABLE `full_concept` (
    `id` BIGINT NOT NULL DEFAULT  0,
    `effectiveTime` DATETIME NOT NULL DEFAULT  '2000-01-31 00:00:00',
    `active` TINYINT NOT NULL DEFAULT  0,
    `moduleId` BIGINT NOT NULL DEFAULT  0,
    `definitionStatusId` BIGINT NOT NULL DEFAULT  0,
    PRIMARY KEY (`id`,`effectiveTime`))
    ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Concept File Specification Summary

Creating a Description Table

Create Description Table

DROP TABLE IF EXISTS `full_description`;
 
CREATE TABLE `full_description` (
    `id` BIGINT NOT NULL DEFAULT  0,
    `effectiveTime` DATETIME NOT NULL DEFAULT  '2000-01-31 00:00:00',
    `active` TINYINT NOT NULL DEFAULT  0,
    `moduleId` BIGINT NOT NULL DEFAULT  0,
    `conceptId` BIGINT NOT NULL DEFAULT  0,
    `languageCode` VARCHAR (3) NOT NULL DEFAULT '',
    `typeId` BIGINT NOT NULL DEFAULT  0,
    `term` TEXT NOT NULL,
    `caseSignificanceId` BIGINT NOT NULL DEFAULT  0,
    PRIMARY KEY (`id`,`effectiveTime`))
    ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Description File Specification Summary

Creating a Relationship Table

Create Relationship Table

DROP TABLE IF EXISTS `full_relationship`;
 
CREATE TABLE `full_relationship` (
    `id` BIGINT NOT NULL DEFAULT  0,
    `effectiveTime` DATETIME NOT NULL DEFAULT  '2000-01-31 00:00:00',
    `active` TINYINT NOT NULL DEFAULT  0,
    `moduleId` BIGINT NOT NULL DEFAULT  0,
    `sourceId` BIGINT NOT NULL DEFAULT  0,
    `destinationId` BIGINT NOT NULL DEFAULT  0,
    `relationshipGroup` INT NOT NULL DEFAULT 0,
    `typeId` BIGINT NOT NULL DEFAULT  0,
    `characteristicTypeId` BIGINT NOT NULL DEFAULT  0,
    `modifierId` BIGINT NOT NULL DEFAULT  0,
    PRIMARY KEY (`id`,`effectiveTime`))
    ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Relationship File Specification Summary

Creating Tables for Reference Sets

This section contains SQL statements that create database tables to accommodate the data in some of the reference set release files. Each table creation is accompanied by a summary of the relevant release file specification. The selection of reference set types shown on this page is incomplete but includes at least one example reference set that includes an additional column of each of the permitted types (componentId, string and integer).

Notes

  1. The title of the file specification summary table is a link to the full file specification.

  2. The table names used on this page are prefixed with full_ as these are the tables into which the full SNOMED CT release will be imported. The loader script also create identically structured tables with the prefix snap_ and the latest snapshot view is loaded into those tables.

  3. The SQL code on this page creates the primary keys for each table (id, effective time) but omits creation of any other indexes. The loader script creates additional indexes after importing data into the table. This enables faster importing of data from the text files as the additional indexes do not need to be updated while importing.

  4. The SQL code used on this page does not include any additional optimizations for generating alternative snapshot views. Optimizations discussed in this guide can be added to the tables if required. However, this loader script creates and populates tables for both the full release and the current snapshot views. Therefore, additional optimizations would only deliver performance benefits when querying retrospective snapshot views. Even in this case the performance benefits for most types of query are often limited when compared to the use of unoptimized dynamic views.

  5. The effectiveTime is set as a DATETIME data type. This supports a specific time in hours, minutes or seconds. In practice, effectiveTime values are formally restricted to YYYYMMDD but we are aware of at least one SNOMED CT extension that includes time units in the effectiveTime field or its release files. The effectiveTime is set by default to a 2000-01-31, a date which predates any SNOMED CT effectiveTime value. In previous versions of the script defaults were set to 0000-00-00 but some SQL settings treat these as invalid dates. In practice, the effectiveTime will always be set by the imported data so the default has no material effect.

Creating a Simple Refset Table

Create Simple Refset Table

DROP TABLE IF EXISTS `full_refset_Simple`;
 
CREATE TABLE `full_refset_Simple` (
  `id` char(36) NOT NULL DEFAULT '',
  `effectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  `active` TINYINT NOT NULL DEFAULT  0,
  `moduleId` BIGINT NOT NULL DEFAULT  0,
  `refsetId` BIGINT NOT NULL DEFAULT  0,
  `referencedComponentId` BIGINT NOT NULL DEFAULT  0,
  PRIMARY KEY (`id`,`effectiveTime`))
  ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Simple Refset File Specification Summary

Creating a Language Refset Table

Create Language Refset Table

DROP TABLE IF EXISTS `full_refset_Language`;
 
CREATE TABLE `full_refset_Language` (
  `id` char(36) NOT NULL DEFAULT '',
  `effectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  `active` TINYINT NOT NULL DEFAULT  0,
  `moduleId` BIGINT NOT NULL DEFAULT  0,
  `refsetId` BIGINT NOT NULL DEFAULT  0,
  `referencedComponentId` BIGINT NOT NULL DEFAULT  0,
  `acceptabilityId` BIGINT NOT NULL DEFAULT  0,
  PRIMARY KEY (`id`,`effectiveTime`))
  ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Language Refset File Specification Summary

Creating an Extended Map Refset Table

Create Extended Map Refset Table

DROP TABLE IF EXISTS `full_refset_ExtendedMap`;
 
CREATE TABLE `full_refset_ExtendedMap` (
  `id` char(36) NOT NULL DEFAULT '',
  `effectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  `active` TINYINT NOT NULL DEFAULT  0,
  `moduleId` BIGINT NOT NULL DEFAULT  0,
  `refsetId` BIGINT NOT NULL DEFAULT  0,
  `referencedComponentId` BIGINT NOT NULL DEFAULT  0,
  `mapGroup` INT NOT NULL DEFAULT 0,
  `mapPriority` INT NOT NULL DEFAULT 0,
  `mapRule` TEXT NOT NULL,
  `mapAdvice` TEXT NOT NULL,
  `mapTarget` VARCHAR (200) NOT NULL DEFAULT '',
  `correlationId` BIGINT NOT NULL DEFAULT  0,
  `mapCategoryId` BIGINT NOT NULL DEFAULT  0,
  PRIMARY KEY (`id`,`effectiveTime`))
  ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Extended Map Refset File Specification Summary

Creating a Module Dependency Refset Table

Create Module Dependency Refset Table

DROP TABLE IF EXISTS `full_refset_ModuleDependency`;
 
CREATE TABLE `full_refset_ModuleDependency` (
  `id` char(36) NOT NULL DEFAULT '',
  `effectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  `active` TINYINT NOT NULL DEFAULT  0,
  `moduleId` BIGINT NOT NULL DEFAULT  0,
  `refsetId` BIGINT NOT NULL DEFAULT  0,
  `referencedComponentId` BIGINT NOT NULL DEFAULT  0,
  `sourceEffectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  `targetEffectiveTime` DATETIME NOT NULL
        DEFAULT  '2000-01-31 00:00:00',
  PRIMARY KEY (`id`,`effectiveTime`))
  ENGINE=MyISAM DEFAULT CHARSET=utf8mb4;

Table: Module Dependency Refset File Specification Summary

Importing Release Files

This section contains examples of the SQL statements used to imports data from a component release files in the appropriate database table. Before importing the tables must be created (see Creating Tables for Components and Creating Tables for Reference Sets).

Sample SQL Code for Importing from Full Release

Import Concepts from Full Release

Import Concept File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Terminology/sct2_Concept_Full_INT_[RELEASE-DATE].txt'
INTO TABLE `full_concept`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`definitionStatusId`);

Import Descriptions from Full Release

Import Description File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Terminology/sct2_Description_Full-en_INT_[RELEASE-DATE].txt'
INTO TABLE `full_description`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`conceptId`,`languageCode`,`typeId`,`term`,`caseSignificanceId`);

Import Relationships from Full Release

Import Relationship File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Terminology/sct2_Relationship_Full_INT_[RELEASE-DATE].txt'
INTO TABLE `full_relationship`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`sourceId`,`destinationId`,`relationshipGroup`,`typeId`,`characteristicTypeId`,`modifierId`);

Import Simple Refsets from Full Release

Import a Simple Refset File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Refset/Content/der2_Refset_SimpleFull_INT_[RELEASE-DATE].txt'
INTO TABLE `full_refset_simple`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`refSetId`,`referencedComponentId` );

Import Language Refsets from Full Release

Import a Language Refset File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Refset/Language/der2_cRefset_LanguageFull-en_INT_$RELDATE.txt'
INTO TABLE `full_refset_Language`
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`refsetId`,`referencedComponentId`,`acceptabilityId`);

Import Extended Map Refsets from Full Release

Import an Extended Map Refset File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Full/Refset/Map/der2_iisssccRefset_ExtendedMapFull_INT_$RELDATE.txt'
INTO TABLE `full_refset_ExtendedMap`
LINES TERMINATED BY '\r\n'
IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`refsetId`,`referencedComponentId`,`mapGroup`,`mapPriority`,`mapRule`,`mapAdvice`,`mapTarget`,`correlationId`,`mapCategoryId`);

Sample SQL Code for Importing from a Snapshot Release

Import Concepts from Snapshot Release

Import Concept File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Snapshot/Terminology/sct2_Concept_Snapshot_INT_[RELEASE-DATE].txt'
INTO TABLE `snap_concept`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`definitionStatusId`);

Import Simple Refsets from Snapshot Release

Import a Simple Refset File
LOAD DATA LOCAL INFILE '[RELEASE-FILE-PATH]/[RELEASE-PACKAGE-VERSION-NAME]/Snapshot/Refset/Content/der2_Refset_SimpleSnapshot_INT_[RELEASE-DATE].txt'
INTO TABLE `snap_refset_simple`
LINES TERMINATED BY '\r\n'
 IGNORE 1 LINES
(`id`,`effectiveTime`,`active`,`moduleId`,`refSetId`,`referencedComponentId`);

Last updated