Loom REST API V1.docx - Teradata



Loom REST API

API Version ‘V1’

This documents the ‘v1’ version of the Loom API. This version is applicable for Loom 2.2 and beyond.

API Overview

Operations in Loom are all managed through a set of HTTP-based APIs. While all operations can be performed through the Loom application, users may also access the APIs directly.

The Loom API is designed to be versioned. The initial version is ‘v1’, accessible through the Loom server URL, from the root.

Organization of API

The Loom API is organized into two parts:
- The Resource API centers around resources managed in the Loom Registry: sources, datasets, etc.
- The Activity API focuses on activities that users can perform with Loom: executing transforms, accessing data, etc.

Resources and Resource API

The Resource API is focused on entities, which are exposed as resources according to their types. The routes shown in the following tables are relative to the Loom root; e.g., the ‘datasets’ resources are accessible from the /datasets route under the Loom root.

Resource | Description | Route
sources | Sources of data, not directly controlled by Loom. | /sources
datasets | Sets of data whose lifecycles are controlled by Loom. | /datasets
processes | Processing performed on datasets. | /processes
jobs | Tracking of asynchronous processes executed through Loom. | /jobs
users | User accounts. | /users
glossaries | Glossaries of business terms. | /glossaries
relationships | Dynamic relationships between entities. | /relationships

The ‘generic’ form of the Resource API is accessed through the following endpoints. Note that these parts of the API are less well developed than the type-specific resources shown above.
Resource | Description | Route
entities | Generic entities (irrespective of type) | /entities
types | Type information. | /types

Activity API

The Activity API is focused on activities that users can perform against the Loom system.

Activity | Description | Route
connection | User login, logout, ping | /connect
search | Related to search: full-text search, filters for search, etc. | /search
data | Data access: reading files and getting data from datasets | /data
execution | Executing transformations against datasets | /execute
environment | Environment interactions: browsing the file system, etc. | /environ
system | Loom system information | /system

API Standard Response

Every method returns the following standard response:

    {
      results: { ...the actual result(s)... },
      related: { ...any entities that are referred to by id from within 'results'... },
      count: ...the size of 'results'...,
      errors: [ ...any errors that occurred in processing the request... ]
    }

Note that the results are returned as an array for ‘many’ requests, and as a scalar for ‘individual’ requests (such as when <id> is in the URL). When the return value is the unique identifier of an entity, the results contain a map with a single key, 'entity/id'.

In the documentation of each method, generally only the ‘results’ part of the response structure is described.

Related Section

The ‘related’ section is a map of entity IDs to properties. The entity IDs match property values returned in the ‘results’ section. In that way, results containing properties whose values are entity IDs can be resolved into a more human-readable form. For example, the ‘entity/createdBy’ value is the unique entity identifier of a user in the system; for display purposes, though, the user’s actual name is usually preferred.
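As a minimal sketch of this lookup in Python, assuming the standard response has already been parsed from JSON into a dict: the helper name `resolve_related` is hypothetical and not part of the Loom API, and the sample response only mirrors the structure described in this section.

```python
def resolve_related(response, key, prop):
    """Look up `prop` on the related entity whose ID is stored under `key` in results.

    Assumes an 'individual' response, where 'results' is a single map.
    Returns None when the key or the related entry is missing.
    """
    entity_id = response["results"].get(key)
    related = response.get("related", {})
    return related.get(entity_id, {}).get(prop)


# Hand-written sample response in the shape of the standard response.
response = {
    "results": {
        "entity/name": "SomeEntity",
        "entity/createdBy": "52e28419-2c48-436d-8e7c-643cf331e071",
    },
    "related": {
        "52e28419-2c48-436d-8e7c-643cf331e071": {"user/username": "smusial"},
    },
}

print(resolve_related(response, "entity/createdBy", "user/username"))  # prints smusial
```

For ‘many’ requests, where the results are an array, the same lookup would be applied to each element in turn.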
The user’s name can be obtained from the related section by using the createdBy value from the results section as the lookup key.

Example:

    {
      "results": {
        "entity/name": "SomeEntity",
        "entity/description": "The entity description.",
        "entity/tags": "tag1, tag2, tag3",
        "entity/folder": "test1/test2",
        "entity/createdBy": "52e28419-2c48-436d-8e7c-643cf331e071",
        "entity/modifiedBy": "52e28419-1725-47bd-9884-6149e7b9b446"
      },
      "related": {
        "52e28419-2c48-436d-8e7c-643cf331e071": { "user/username": "smusial" },
        "52e28419-1725-47bd-9884-6149e7b9b446": { "user/username": "bgibson" }
      }
    }

Resources and Resource API

This API exposes the Loom registry entities as resources, using standard REST conventions. The Resource API is organized by type (e.g., Source, Dataset, etc), with two generic parts for entity instances and entity types. For each section, the following information is provided:
- Attributes: orange indicates a domain entity, magenta a struct (subordinate to an entity)
- Requests: summary of the methods available for the resource
- Request Details: calling details for each method

Note that every Entity in Loom automatically has all the core Entity attributes (entity/id, entity/name, etc). See the Model Overview below for details on the core Entity attributes.

Sources

Sources represent sources of data whose lifecycles are not controlled by Loom. Sources are containers of data units that conform to some structural form (tables are currently supported by Loom). Sources are similar in structure to Datasets, but are semantically different, as Datasets are managed by Loom.

Source Entity Attributes

The Source entity has the following attributes, in addition to the core entity attributes.

Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data container. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Default structure for the data units contained in this source, defined by a schema (or possibly multiple schemas) of type ‘data/Schema’. Applies to all data units in the source, unless specifically overridden by a data unit.
source/expandable [boolean] | Whether the source is an expandable collection or not.
source/metadataAccessible [boolean] | Whether the source's metadata can be accessed by Loom.
source/dataAccessible [boolean] | Whether the source's data can be accessed by Loom.
source/entityState [string] | Indicator of the state the entity is in. One of ‘potential’, ‘active’, or ‘deleted’.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/Storage.
data/dataUnit [array of references] | References (pointers via entity ID) to one or more data/DataUnits.

DataUnit Attributes

A Source is a data container, represented by the type DataContainer. All data containers own a set of DataUnits. DataUnits represent the actual data (although in most cases, they are merely proxies for the data, and do not physically contain it). DataUnits are first-class entities, with unique identifiers, so they may be referenced from outside the context of the containing data container.

Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data unit. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Structure for the data, defined by a schema (or possibly multiple schemas) of type ‘data/Schema’. Overrides the default structure of the containing source, to set the structure on this data unit.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/StorageUnit.

Schema Attributes

Data units contain one or more schemas. A Schema is a structure; it is fully owned by a DataUnit and does not (currently) have a unique identifier.

Name [Type] | Description
data/structuralForm [string] | The structural form that the schema represents. Currently, the only supported form is “table”.
data/isDefault [boolean] | If true (or nil if there is only one schema), the schema is the default one for the data unit.

The type of the schema depends on the structural form of the data unit. For data units with a structural form of ‘table’, the schema type is TableSchema.

Storage Attributes

A Source is physically persisted to some system (HDFS, database, etc). The Storage entity represents the persistence information. For example, a source may hold its information in a directory of files, or in a Hive database. Storage is a container; it owns a set of StorageUnits. For example, if the Storage is a directory, the storage units are individual files within the directory.

Name [Type] | Description
persist/storageType [string] | The type of storage. E.g., ‘file/text’, ‘file/binary’, ‘rdb/hive’, ‘rdb/generic’.
persist/location [string] | Location of storage; used to connect to or otherwise access the storage.
persist/application [string] | Application that can process this type of storage.
persist/storageUnit [array of references] | References (pointers via entity IDs) to the storage units for this storage.
persist/format [embedded struct] | Default format for the storage; applies to all storage units unless overridden by a storage unit.

There are additional properties that apply to extensions of the base Storage, for example a FileSet.

StorageUnit Attributes

StorageUnits are proxies for the individual units that hold the data that is exposed from a Source as a DataUnit. For example, if the Storage is a directory, each storage unit is an individual file within that directory.

Name [Type] | Description
persist/location [string] | Absolute location of storage unit, if applicable.
persist/relativeLocation [string] | Relative location of storage unit in storage.
persist/containsData [boolean] | True if storage unit contains data (will get exposed from source as a DataUnit).
persist/format [string] | Storage format for storage unit; overrides storage-level format.
persist/formatType [string] | Type of storage format.

There are additional properties that apply to extensions of the base StorageUnit, for example a FileSetFile.

Format Attributes

Formats define how the bits in persistent storage are to be read and parsed. There is a default format nested under a Storage, which applies to all StorageUnits in that Storage unless overridden. Each StorageUnit can explicitly define a Format, which takes precedence over the default stored in its Storage container.

Name [Type] | Description
persist/formatType [string] | Type of storage format. E.g., ‘text/delim’ or ‘text/pattern’ for storage types of ‘file/text’; ‘binary/avro’ for storage types of ‘file/binary’. There is no format for storage types of ‘rdb/*’.

The interesting properties are associated with specific subclasses of Format, e.g., DelimitedFormat and PatternFormat.

Source Summary Attributes

The SourceSummary structure captures a ‘view’ of a Source entity, pulling in information from related entities (such as scan measurements). Instances of these structs are returned from the ‘summary’ API methods.
Name [Type] | Description
summary/entityID [string] | The unique identifier of the entity that the summary is for.
summary/entityName [string] | The name of the entity that the summary is for.
summary/entityDescription [string] | The description of the entity that the summary is for.
summary/entityCreatedAt [instant] | The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string] | The username of the person who created the entity.
summary/entityModifiedAt [instant] | The timestamp when the entity was last modified.
summary/entityModifiedBy [string] | The username of the person who last modified the entity.
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
persist/storageType [string] | Storage form: file/text, file/binary, rdb/hive, rdb/generic, etc.
persist/location [string] | The location of the source; duplicate of Storage location, for convenience.
data/expandable [boolean] | Whether the source is an expandable collection or not.
data/autoUpdate [boolean] | Whether the source will be auto-updated as new files, etc. are created.
source/metadataAccessible [boolean] | Whether the source's metadata can be accessed by Loom.
source/dataAccessible [boolean] | Whether the source's data can be accessed by Loom.
source/entityState [string] | Indicator of the lifecycle state of the source entity. One of ‘active’ or ‘potential’.
summary.source/nDataset [long] | Number of datasets derived from the source.
summary.source/nDataUnit [long] | Number of data units exposed by the source.
summary.data/dataUnit [array of embedded structs] | Relationship to summary items for each data unit (DataUnitSummary structs).

For a particular source instance, a SourceSummary structure contains one DataUnitSummary for each data unit in the source.

Name [Type] | Description
summary/entityID [string] | The unique identifier of the entity that the summary is for.
summary/entityName [string] | The name of the entity that the summary is for.
summary/entityDescription [string] | The description of the entity that the summary is for.
summary/entityCreatedAt [instant] | The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string] | The username of the person who created the entity.
summary/entityModifiedAt [instant] | The timestamp when the entity was last modified.
summary/entityModifiedBy [string] | The username of the person who last modified the entity.
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
summary.data/nRow [long] | Number of rows in the (2-dimensional) data item.
summary.data/nCol [long] | Number of columns or fields in the data item.
summary.data/sizeBytes [long] | Size of the data item, in bytes; null if unknown.
summary.source/nameInSource [string] | The name of the data entity in its native source (e.g. file path for files).

Requests

Request | Description
GET sources | Get all sources matching the provided filters.
POST sources | Create a new source, given entity metadata and storage information.
GET sources/default | Get a default source instance, for use when creating a new source.
POST sources/default | Create a new source given a location, using all default settings.
GET sources/summary | Get summaries of all sources matching the provided filters.
GET sources/<id> | Get the source with the specified entity ID.
PATCH sources/<id> | Modify the source with the specified ID with updated attributes and storage information.
DELETE sources/<id> | Delete the source with the specified entity ID.
GET sources/<id>/summary | Get the summary of the specified source.
POST sources/<id>/data_units | Add a data unit to an existing source.

The ‘default’ methods are convenience functions that help in defining a source. The GET version provides default settings, which can be edited by users and then saved to Loom using ‘POST /sources’.
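The edit-then-save workflow for the GET version can be sketched in Python as follows. This is only an illustration under stated assumptions: the helper name `build_create_source_params` and the sample values are hypothetical, the defaults are assumed to arrive as a dict with the ‘entity’, ‘storage’ and ‘storage_units’ keys that ‘GET /sources/default’ is documented to return, and the way the parameters are encoded into the actual HTTP request is not specified in this document and is left out.

```python
import copy


def build_create_source_params(defaults, name=None, description=None):
    """Edit the results of GET sources/default into parameters for POST sources.

    Works on a deep copy so the defaults received from the server stay untouched.
    """
    params = copy.deepcopy(defaults)
    entity = params.setdefault("entity", {})
    if name is not None:
        entity["entity/name"] = name
    if description is not None:
        entity["entity/description"] = description
    # POST sources requires a Storage and at least one StorageUnit.
    if not params.get("storage"):
        raise ValueError("a storage instance is required")
    if not params.get("storage_units"):
        raise ValueError("at least one storage unit is required")
    return params


# Made-up defaults, shaped like the documented return value of GET sources/default.
defaults = {
    "entity": {"entity/name": "sales"},
    "storage": {"persist/storageType": "file/text", "persist/location": "/data/sales"},
    "storage_units": [
        {"persist/relativeLocation": "2014.csv", "persist/containsData": True}
    ],
}

params = build_create_source_params(defaults, name="Sales files")
print(params["entity"]["entity/name"])  # prints Sales files
```

Checking the required parameters on the client is a convenience only; the server remains the authority on what it accepts.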
The POST version (POST sources/default) simply creates a source using all defaults, with no user interaction.

Note on setting Data Unit schemas:

There are several possible ways in which Loom will determine the structure (i.e. schema) with which to access data in a data unit contained in a given source.
- The schema may be read directly from the physical storage for that source, e.g. a file header or the Hive metastore.
- The schema may be explicitly set by the user or by the API client.
- The schema may be omitted, and inherited from the default schema set on the source container.

The API methods which result in creation or modification of data unit entities (POST sources, PATCH sources/<id>, POST sources/<id>/data_units) all use the following rules to determine the schema that ultimately gets used for a given data unit entity, listed in priority order:

1. If the 'data/structure' property is set on a data unit entity passed through the API and is a non-empty array, then this property is stored unchanged on the resulting data unit entity in the registry.
   - No validation is done of such a manually supplied schema; if it is inconsistent with the underlying data (e.g. missing fields, unknown fields), this may result in missing data or errors when accessing the underlying data through Loom.
2. If the 'data/structure' property is explicitly set to null on the data unit entity passed through the API, or is an empty array, then no structure is set on the resulting data unit entity stored in the registry; the structure is inherited from the parent source.
   - If no default data/structure is set on the source, then an error code is returned and the operation fails.
   - No validation is done of the source default schema with respect to the data unit's data; if it is inconsistent with the underlying data (e.g. missing fields, unknown fields), this may result in missing data or errors when accessing the underlying data through Loom.
3. If no data unit entity is supplied for a table to be created, or a data unit entity is passed through the API without a 'data/structure' property, and the 'source/metadataAccessible' property is true for the source, then Loom will infer the table structure from the underlying storage. This is the recommended mode of operation when working with Loom's supported storage formats.
   - If Loom can directly access metadata about the structure of the data unit through the underlying storage format (e.g. an embedded schema in a file header, or a schema managed by another metastore), then this structure is translated into a schema entity, stored as part of the data unit entity in Loom, and used whenever the data is accessed through Loom.
   - If the structure of the underlying data cannot be directly determined by examining the underlying storage, as might be the case for unstructured or semi-structured text formats such as CSV or log files, then a structure is generated for the data unit by examining a sample of the underlying data, using auto-generated field names.
     - If a default structure is present on the parent source entity, and matches the generated structure in number and type of fields, then no structure is set on the resulting data unit entity in the registry; the source default schema is used whenever this data unit is accessed.
     - If the parent Source entity has no default structure, or has one that does not match the generated structure in number or type of fields, then the generated structure is stored in the data/structure property of the resulting DataUnit entity in the registry, and overrides the Source default structure.
4. If no DataUnit entity is supplied for a table to be created, or a DataUnit entity is passed through the API without a 'data/structure' property, and the 'source/metadataAccessible' property is false for the Source, then Loom will not include a structure on the DataUnit entity that is stored in the registry.

Request Details

Paths are all relative to the root of the API, including version number.

Get Registered Sources

Get all sources matching the provided filters.

Path: sources
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_source_storagetype: filter on the source’s storage type; values ‘all’ (default) or one of the valid values of persist/storageType
  filter_source_entitystate: filter on the source’s entity state; values ‘all’ (default) or one of the valid values of source/entityState
  filter_ndataunit: filter on the number of data units in the source
  filter_use_frequency: filter on how often the source has been used to create datasets
Returns: array of Source entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters

Register (Create) a Source

Create a new source, given entity metadata and storage information. The values returned from ‘GET /sources/default’ can be edited and passed into this method to create a new source.

Path: sources
Method: POST
Parameters:
  entity: a set of core entity properties (e.g., entity/name, entity/description, etc); optional, if the name can be derived from the storage location or storage name
  data_units: an array of DataUnits, which define metadata and schemas of data units (tables) corresponding to storage units specified via the storage_units parameter. The correspondence between data units and storage units is through their entity names. Optional: if not specified, all storage units with their persist/containsData property set to true will be exposed as tables named the same as the storage units, and the schemas will be derived from the underlying physical persistence if the source/metadataAccessible property is true and they are not explicitly set in the corresponding data unit entity.
  storage: a Storage instance, which defines the container persistence; required
  storage_units: an array of StorageUnits, which define the unit-level persistence; required, and there must be at least 1 storage unit. All storage units with their persist/containsData property set to true will be exposed as tables in the source.
Returns:
  id: ID of created source
Notes:
  The storage units drive the source container contents. Any storage unit with its persist/containsData property set to true will be exposed as a table (data unit) in the source. If the data_units section is not specified, or if there is no data unit in that section corresponding to a storage unit, then the name of the data unit will be derived from that of the storage unit, and the schema will be inferred from the physical persistence (e.g., a file on disk) if source/metadataAccessible is true. If there is a data unit associated with a storage unit, then that data unit can define the descriptive properties (such as description, tags) and the schema to be used. The data unit is tied to its corresponding storage unit through their names, which must match exactly.

Get a Source Instance with Default Settings

Get a default source instance, for use when creating a new source.
The values returned from this method can be edited, and used with ‘POST /sources’ to create a new source in Loom.

Path: sources/default
Method: GET
Parameters:
  location: the location of the source (semantics based on storage_type); required
  storage_type: the type of storage; optional, if the server can derive it
  format_type: the type of format to use; optional (not needed for all storage types)
Returns:
  entity: source object (no entity/id)
  storage: Storage, without embedded StorageUnits
  storage_units: StorageUnits, separate from Storage

Register (Create) a Source using Default Settings

Create a new source given a location, using all default settings. This is a convenience to combine the calls to ‘GET /sources/default’ and ‘POST /sources’ when no modification of the default source is required.

Path: sources/default
Method: POST
Parameters:
  location: the location of the source (semantics based on storage_type); required
  storage_type: the type of storage; optional, if the server can derive it
  format_type: the type of format used to read and parse the source; optional (not needed for all storage types); defaults to ‘text/delim’ if the storage type is ‘file/text’, else null
  entity: a bundle of core properties (e.g., entity/name, entity/description, etc) to be set on the created Source entity; optional (the name can be derived from the location)
Returns:
  id: ID of created source

Get Source Summaries

Get summaries of all sources matching the provided filters.

Path: sources/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_source_storagetype: filter on the source’s storage type; values ‘all’ (default) or one of the valid values of persist/storageType
  filter_source_entitystate: filter on the source’s entity state; values ‘all’ (default) or one of the valid values of source/entityState
  filter_ndataunit: filter on the number of data units in the source
  filter_use_frequency: filter on how often the source has been used to create datasets
Returns: array of SourceSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters

Get a Source Instance

Get the source with the specified entity ID.

Path: sources/<id>
Method: GET
Parameters: none, except for the entity/id that is built into the URL
Returns:
  entity: source entity; persist/storage matches the Storage’s entity/id
  storage: Storage; persist/storageUnit references the StorageUnits’ entity/ids
  storage_units: StorageUnits, separate from Storage

Replace a Source Instance

Replace the source with the specified ID with updated attributes and storage information.

Path: sources/<id>
Method: PATCH
Parameters:
  entity: a set of core entity properties (e.g., entity/name, entity/description, etc); optional, if the name can be derived from the storage location or storage name. If entity/id is included (e.g., when using results from GET /sources/<id>), it must match the ID in the URL; a mismatch is considered an error.
  data_units: an array of DataUnits, which define metadata and schemas of data units (tables) corresponding to storage units specified via the storage_units parameter. The correspondence between data units and storage units is through their entity names. Optional: if not specified, all storage units with their persist/containsData property set to true will be exposed as tables named the same as the storage units, and the schemas will be derived from the underlying physical persistence if the source/metadataAccessible property is true and they are not explicitly set in the corresponding data unit entity.
  storage: the Storage to replace the existing Storage; optional, if not changing the Storage
  storage_units: the StorageUnits to replace the existing StorageUnits, en masse; optional, if not changing the StorageUnits
Returns:
  id: ID of source
Notes:
  The properties specified under the entity parameter are merged with the existing properties.

Delete a Source Instance

Delete the source with the specified ID.

Path: sources/<id>
Method: DELETE
Parameters: none, except for the entity/id that is built into the URL
Returns: nothing

Get a Source Instance’s Summary

Get the summary of the specified source.

Path: sources/<id>/summary
Method: GET
Parameters: none, except for the entity/id that is built into the URL
Returns: SourceSummary struct

Create a DataUnit in an Existing Source

Creates a new data unit (table) in an existing source. This is useful to represent a table that was added to the source through an external system (as opposed to through Loom).

Path: sources/<id>/data_units
Method: POST
Parameters:
  location: a relative path to the item in the source (if an absolute path is provided, Loom will attempt to reconcile it with the source location, and extract the relative path if compatible)
  format: a Format object, for interpreting the persisted data pointed to by location. Optional; the source’s Storage format (the ‘default’ format) is used if not specified. If specified, it must be of the same type as the source’s Storage format.
  entity: a set of properties to assign to the created table. Optional; the table name is derived from location if not specified.
  schema: an optional schema to associate with the data unit as the ‘default’ schema for that data unit. This is only allowed if the source that the data unit is being added to has the source/metadataAccessible property set to false. If this parameter is set, then the format parameter is optional and, if specified, will not be used to read metadata from the persistent store.
Returns:
  id: the ID of the data unit
Notes:
  1. If the source’s source/metadataAccessible property is set to true, then the location and format will be used to read the metadata and infer a schema for the associated data unit. If the source’s source/dataAccessible property is set to true, then the new table’s associated storage unit will have its persist/containsData property set to true; otherwise that property will be set to false.
  2. A schema can only be specified if the source’s source/metadataAccessible property is false. Otherwise, as noted above, the schema will be inferred from the location and format information.
  3. If a schema is specified, the format property is for informational purposes only; it is not used to infer the table metadata.
See Also:
  POST /sources: to create a new source

___________________________________________

Datasets

Datasets represent sets of data whose lifecycles are controlled by Loom. Datasets originate from Sources, and from other Datasets through transformations. Datasets are containers of data units that conform to some structural form (such as tables). Datasets are similar in this way to Sources, but are semantically different, as Datasets are managed by Loom whereas Sources are controlled by some external entity (and so may be changed without direct involvement of Loom).

Dataset Entity Attributes

The Dataset entity has the following attributes, in addition to the core entity attributes.

Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data container. E.g., “table”.
dataset/entityState [string] | Indicates the ‘state’ the entity is currently in. One of ‘pending’, ‘active’, or ‘deleted’. A dataset is ‘pending’ if it is in the process of being created from a long-running transformation.
dataset/sourcedFrom [reference] | The Source identifier, if the dataset was created directly from a source. Null if the dataset was created through a transformation.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/Storage.
data/dataUnit [array of references] | References (pointers via entity ID) to data/DataUnits.

DataUnit Attributes

A Dataset is a data container, represented by the type DataContainer. All data containers own a set of DataUnits. DataUnits represent the actual data (although in most cases, they are merely proxies for the data, and do not physically contain it). DataUnits are first-class entities, with unique identifiers, so they may be referenced from outside the context of the containing data container.

Name [Type] | Description
data/structuralForm [string] | The structural form of the data contained within the data unit. Currently, the only supported form is “table”.
data/structure [array of embedded struct] | Structure for the data, defined by a schema (or possibly multiple schemas) of type ‘data/Schema’.
persist/storage [reference] | Reference (pointer via entity ID) to a persist/StorageUnit.

Schema Attributes

Data units contain one or more schemas. A Schema is a structure; it is fully owned by a DataUnit and does not (currently) have a unique identifier.

Name [Type] | Description
data/structuralForm [string] | The structural form that the schema represents. Currently, the only supported form is “table”.
data/isDefault [boolean] | If true (or nil if there is only one schema), the schema is the default one for the data unit.

The type of the schema depends on the structural form of the data unit. For data units with a structural form of ‘table’, the schema type is TableSchema.

Storage Attributes

A Dataset is physically persisted to some system. The Storage entity represents the persistence information.
For the most part, storage information for datasets is hidden from Loom users (unlike for Sources). See the persistence information under Sources.Dataset Summary AttributesThe DatasetSummary structure captures a ‘view’ of a Dataset entity, pulling in information from related entities (such as scan measurements). Instances of these structs are returned from the ‘summary’ API methods. Name [Type]Descriptionsummary/entityIDstringThe unique identifier of the entity that the summary is for.summary/entityNamestringThe name of the entity that the summary is for.summary/entityDescriptionstringThe description of the entity that the summary is for.summary/entityCreatedAtinstantThe creation timestamp of the entity that the summary is for.summary/entityCreatedBystringThe username of the person who created the entity.summary/entityModifiedAtinstantThe timestamp when the entity was last modified.summary/entityModifiedBystringThe username of the person who last modified the entity.data/structuralFormstringThe structural form of the data contained within the data container. E.g., “table”.data/expandablebooleanWhether the dataset is an expandable collection or not,data/autoUpdatebooleanWhether the dataset will be auto-updated as its associated source (sourcedFrom) is updated.dataset/entityStatestringIndicator of the lifecycle state of the source entity. One of ‘active’ or ‘potential’. 
persist/storageTypestringStorage form: file/text, rdb/hive, etc.summary.dataset/nDataUnitlongNumber of ‘data units’ (e.g., tables) in the dataset.summary.dataset/sourcedFromstringName of Source, if dataset directly derived from source.summary.dataset/nUseslong Number of times Dataset has been transformed.summary.dataset/lastProcessedBystringWho last used the dataset in a transformation (username).summary.dataset/lastProcessedAtinstantWhen the dataset was last used in a transformation.summary.data/dataUnitarray of embedded structsRelationship to summary items for each data unit (DataUnitSummary structs).For a particular dataset instance, a DatasetSummary structure contains one DataUnitSummary for each data unit in the dataset. Name [Type]Descriptionsummary/entityIDstringThe unique identifier of the entity that the summary is for.summary/entityNamestringThe name of the entity that the summary is for.summary/entityDescriptionstringThe description of the entity that the summary is for.summary/entityCreatedAtinstantThe creation timestamp of the entity that the summary is for.summary/entityCreatedBystringThe username of the person who created the entity.summary/entityModifiedAtinstantThe timestamp when the entity was last modified.summary/entityModifiedBystringThe username of the person who last modified the entity.data/structuralFormstringThe structural form of the data contained within the data container. 
E.g., “table”.summary.data/nRowlongNumber of rows in the (2-dimensional) data itemsummary.data/nCollongNumber of columns or fields in the data itemsummary.data/sizeByteslongSize of the data item, in bytes; null if unknownsummary.data/transformedFrom stringName of transformation that emitted the table (if from a Transformation, not ‘from source’).Note on setting Data Unit schemas:Unlike the Source API, where data units can be saved in the registry without a schema and inherit the source default schema at runtime, the Dataset API explicitly attaches a schema to each data unit in datasets saved in the registry. The dataset default schema, if present, will be copied onto each data unit passed to the API that does not have a schema provided by the client. The copying of the default schema onto the data unit prevents subsequent modifications to the default schema from causing problems with processing jobs that were written against a previous version of the schema. If a request is made to store a data unit without a schema, and there is no dataset default schema, then an error will be raised. 
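The schema defaulting rule in this note can be illustrated with a short Python sketch. It is not Loom's implementation: the helper name `resolve_data_unit_schema`, the exception class, and the `schema` dictionary key are assumptions made for the example, since this section does not give the serialized attribute names.

```python
import copy


class MissingSchemaError(ValueError):
    """Raised when neither the data unit nor the dataset supplies a schema."""


def resolve_data_unit_schema(data_unit, dataset_default_schema=None):
    """Return a copy of data_unit with a schema explicitly attached.

    Hypothetical helper: the 'schema' key is illustrative, not a documented
    Loom attribute name.
    """
    resolved = copy.deepcopy(data_unit)
    if resolved.get("schema") is not None:
        # The client supplied a schema, so it is kept unchanged.
        return resolved
    if dataset_default_schema is None:
        # No schema on the data unit and no dataset default: reject the request.
        raise MissingSchemaError(
            "data unit has no schema and the dataset has no default schema"
        )
    # Copy the default rather than referencing it, so that later edits to the
    # default do not alter this data unit's schema.
    resolved["schema"] = copy.deepcopy(dataset_default_schema)
    return resolved


if __name__ == "__main__":
    default_schema = {"columns": [{"name": "id", "type": "long"}]}
    unit = resolve_data_unit_schema({"name": "orders"}, default_schema)
    # Changing the default afterwards leaves the stored data unit untouched.
    default_schema["columns"].append({"name": "added_later", "type": "string"})
    print(len(unit["schema"]["columns"]))
```

The deep copy mirrors the stated reason for copying the default schema: a later change to the dataset default leaves previously stored data units as they were.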
This applies to all API methods that result in creation or modification of a data unit (POST and PUT).RequestsRequestDescriptionGET datasetsGet all datasets matching the provided filters.POST datasetsCreate a new dataset, given a local instance.GET datasets/defaultGet a default dataset tied to an existing source, for use when creating a new dataset.POST datasets/defaultCreate a new dataset tied to an existing source, using all default settings.GET datasets/summaryGet summaries of all datasets matching the provided filters.GET datasets/<id>Get the dataset with the specified entity ID.PUT datasets/<id>Replace the dataset with the specified ID with a local instance.DELETE datasets/<id>Delete the dataset with the specified entity ID.GET datasets/<id>/summaryGet the summary of the specified dataset.POST datasets/<id>/data_unitsAdd a data unit to an existing dataset.Request DetailsPaths are all relative to the root of the API, including version number.Get Registered DatasetsGet all datasets matching the provided filters.Path:datasetsMethod:GETParameters:filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)filter_dataset_obtained_from: filter on where the dataset was directly obtained from; values of ‘all’ (default), ‘source’, or ‘dataset’. filter_ndataunit: filter on the number of data units (tables) in datasets; values ‘all’ (default) or one of several discretized facetsfilter_use_frequency: filter on how often the dataset has been used by processes (in transforms, etc)Returns:array of Dataset entitiesSee Also:GET /search/filters/values: to get the allowed values for a specific filterGET /types/attributes/values: to get the values of enumerated attributes which are used in filters.Register (Create) a DatasetCreate a new dataset from a local dataset object. 
The values returned from ‘GET /datasets/default’ can be edited and passed into this method to create a new dataset.Path:datasetsMethod:POSTParameters:local Dataset objectReturns:id: ID of created datasetGet a Dataset Instance with Default SettingsGet a default dataset instance, for use when creating a new dataset. The values returned from this method can be edited, and used with ‘POST /datasets’ to create a new dataset in Loom.Path:datasets/defaultMethod:GETParameters:source_id: the ID of a source registered in Loom; requiredReturns:dataset object (no entity/id)Register (Create) a Dataset using Default SettingsCreate a new dataset tied to an existing source, using all default settings. This is a convenience to combine the calls to ‘GET /datasets/default’ and ‘POST /datasets’ when no modification of the default dataset is required.Path:datasets/defaultMethod:POSTParameters:source_id: the ID of a source registered in Loom; requiredentity: a bundle of core properties (e.g., entity/name, entity/description, etc) to be set on the created Dataset entity; optional (will be obtained from source fields if not present)Returns:id: ID of created datasetGet Dataset SummariesGet summaries of all datasets matching the provided filters.Path:datasets/summaryMethod:GETParameters:filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)filter_dataset_obtained_from: filter on where the dataset was directly obtained from; values of ‘all’ (default), ‘source’, or ‘dataset’. 
filter_ndataunit: filter on the number of data units (tables) in datasets; values ‘all’ (default) or one of several discretized facetsfilter_use_frequency: filter on how often the dataset has been used by processes (in transforms, etc)Returns:array of DatasetSummary structsSee Also:GET /search/filters/values: to get the allowed values for a specific filterGET /types/attributes/values: to get the values of enumerated attributes which are used in filters.Get a Dataset InstanceGet the dataset with the specified entity ID.Path:datasets/<id>Method:GETParameters:none, except for entity/id that is built into the URL Returns:entity: dataset, with embedded DataUnitsReplace a Dataset InstanceReplace the dataset with the specified ID with the new dataset object.Path:datasets/<id>Method:PUTParameters:the dataset to replace the existing datasetReturns:id: ID of updated datasetDelete a Dataset InstanceDelete the dataset with the specified ID.Path:datasets/<id>Method:DELETEParameters:none, except for entity/id that is built into the URL Returns:nothingGet a Dataset Instance’s SummaryGet the summary of the specified dataset.Path:datasets/<id>/summaryMethod:GETParameters:none, except for entity/id that is built into the URL Returns:DatasetSummary structCreate a DataUnit in an Existing DatasetCreates a new data unit (table) in an existing dataset. There are two overloaded options available, depending on which of the first parameters is passed in. The first creates a data unit in the dataset corresponding to a new data unit in the associated (‘sourcedFrom’) source, if the dataset was created directly from a source. The second is for when a table was added to the dataset through an external process (as opposed to through a Loom process). 
The latter option should be used with great care as incorrect use can result in datasets which are not processable by Loom.Path:datasets/<id>/data_unitsMethod:POSTThese are the parameters for the first overloaded option:Parameters:source_data_unit_name is the name of a table in the source (pointed to by the dataset’s ‘sourcedFrom’ property) to add to the dataset; optional; if unspecified, then all ‘new’ tables in the source. If this is specified and the dataset does not have a sourcedFrom field set, then an error will be raised.schema: is a Schema struct, to define the structure of the new data unit (table) columns. Required.entity: is a set of properties to assign to the created table. Optional, will derive table name from location if not specified.Returns:id: The ID of the data unitThese are the parameters for the second overloaded option:Parameters:location: is a relative (or absolute) path to the persisted item, either on disk (hdfs) or in Hive. If a relative path is passed in, it will be used in conjunction with location in the dataset’s Storage to determine the location of the persisted item on disk or in Hive. If an absolute path is passed in, it will be checked for consistency with the location in the dataset’s Storage, and then used as the location of the persisted item. Required.format: is a Format object, for interpreting the persisted data pointed to by location. Optional; will use dataset’s Storage format (the ‘default’ format) if not specified; if specified, must be of the same type as the dataset’s Storage format.schema: is a Schema struct, to define the structure of the new data unit (table) columns. Required.entity: is a set of properties to assign to the created table. 
Optional, will derive table name from location if not specified.
Returns: id: The ID of the data unit
___________________________________________

Processes

Processes represent processing performed on Datasets (and, to a lesser degree, on Sources).

Process Entity Attributes

The Process entity has the following attributes, in addition to the core entity attributes.

Name [Type]: Description
process/processType [string]: Type of process, corresponding to the ProcessDefinition. One of ‘sql-query’, ‘hiveql’, ‘dataset-import’, or a user-defined type.
process/processClass [string]: General class of this process. One of ‘transform’, ‘import-export’, ‘descriptive’, or a user-defined class.
process/processScope [string]: The scope of the process in terms of its data inputs and outputs. One of ‘container’ or ‘dataunit’.
process/isExecutable [boolean]: True if processes conforming to this definition are executable.
process/entityState [string]: Indicator of the state the entity is in. One of ‘active’ or ‘deleted’.
process/argument [reference]: Configuration arguments defining the process configuration.
process/context [reference]: Optional default data context for input; can be overridden for a ProcessUse.

ProcessUse Attributes

A ProcessUse is a snapshot of a Process at the point in time that the process is used. A process that is executable (isExecutable = TRUE) is ‘used’ by virtue of its being executed, through POST /execute/transform. A process that is not executable (isExecutable = FALSE) can still be used, establishing a link between two or more data containers or data units; that is done via the POST /processes/<process_id>/uses method.

The ProcessUse entity has the following attributes, in addition to the core entity attributes.

Name [Type]: Description
process/process [reference]: Reference (pointer via UUID) to the Process that this is a use of.
process/processClass [string]: General class of this process. One of ‘transform’, ‘import-export’, ‘descriptive’, or a user-defined class.
process/isExecuted [boolean]: True if the process conforming to this definition was executed. (Corresponds to the isExecutable field of the corresponding Process.)
process/jobIdentifer [string]: Stringified identifier of the Job that was spawned, for informational purposes. Not a formal UUID reference, to avoid a bi-directional dependency.
process/argument [reference]: Configuration arguments defining the process configuration.
process/context [reference]: The data contexts for input and output.

Argument/ConfigArgument Attributes

Processes and ProcessUses contain one or more arguments. An Argument is a structure; it is fully owned by the Process or ProcessUse it is attached to, and does not have a unique identifier. A ConfigArgument is an Argument with a value.

Name [Type]: Description
entity/name [string]: The name of the argument.
process.param/index [long]: The index of the argument (1, 2, etc).
process.arg/value [string]: The value of the argument, corresponding to the parameter's valueType.

Context Attributes

Processes and ProcessUses contain one or more contexts. A Context is a structure; it is fully owned by the Process or ProcessUse it is attached to, and does not have a unique identifier.

Name [Type]: Description
entity/name [string]: The name of the context.
process.context/inout [string]: The structural form that the schema represents. Currently, the only supported form is “table”.
process.context/container [reference]: Reference (pointer via UUID) of the data container that the context is for.
process.context/dataUnitName [string]: Name of the data unit in the data container. May be null (missing) for container-level contexts used with Processes with processScope = ‘container’.

Contexts associated with Processes (or their ProcessUses) which are container-level (processScope = ‘container’) will not have a dataUnitName set.
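As an illustration of the Context attributes and the container-level rule above, the following Python sketch builds Context structs as plain dictionaries and flags any that set a data unit name under container scope. The attribute keys come from the table above. The helper names, the ‘input’/‘output’ values for process.context/inout, and the placeholder UUID are assumptions for the example, not part of the documented API.

```python
def make_context(name, inout, container_id, data_unit_name=None):
    """Build a Context struct as a dict keyed by the documented attribute names."""
    context = {
        "entity/name": name,
        # The allowed values of 'inout' are not listed here; 'input' and
        # 'output' below are assumed for illustration.
        "process.context/inout": inout,
        "process.context/container": container_id,
    }
    if data_unit_name is not None:
        context["process.context/dataUnitName"] = data_unit_name
    return context


def container_scope_violations(process_scope, contexts):
    """Return the names of contexts that set dataUnitName under 'container' scope."""
    if process_scope != "container":
        return []
    return [
        c["entity/name"]
        for c in contexts
        if c.get("process.context/dataUnitName") is not None
    ]


if __name__ == "__main__":
    dataset_id = "00000000-0000-0000-0000-000000000001"  # placeholder UUID
    contexts = [
        make_context("in", "input", dataset_id),
        make_context("out", "output", dataset_id, data_unit_name="orders"),
    ]
    # The second context names a data unit, which a container-level
    # process should not do.
    print(container_scope_violations("container", contexts))
```

A client could run a check like this before submitting contexts, for example when creating a ProcessUse, so that a mismatch with the process scope is caught locally.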
Process Summary Attributes

The ProcessSummary structure captures a ‘view’ of a Process entity, pulling in information from related entities (such as scan measurements). Instances of these structs are returned from the ‘summary’ API methods.

Name [Type]: Description
summary/entityID [string]: The unique identifier of the entity that the summary is for.
summary/entityName [string]: The name of the entity that the summary is for.
summary/entityCreatedAt [instant]: The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string]: The username of the person who created the entity.
summary/entityModifiedAt [instant]: The timestamp when the entity was last modified.
summary/entityModifiedBy [string]: The username of the person who last modified the entity.
process/processType [string]: Type of process, corresponding to the ProcessDefinition. One of ‘sql-query’, ‘hiveql’, ‘dataset-import’, or a user-defined type.
process/processClass [string]: General class of this process. One of ‘transform’, ‘import-export’, ‘descriptive’, or a user-defined class.
process/processScope [string]: The scope of the process in terms of its data inputs and outputs. One of ‘container’ or ‘dataunit’.
process/isExecutable [boolean]: True if processes conforming to this definition are executable.
summary.process/nUses [long]: Number of times the process has been used/executed.
summary.process/lastUse [reference (uuid)]: The entity ID of the last process use.
summary.process/lastUsedBy [string]: Who last used the process (username).
summary.process/lastUsedAt [instant]: When the process was last used (e.g., executed).

Requests

Request: Description
GET processes: Get all processes matching the provided filters.
POST processes: Create a new process, given a local instance.
GET processes/default: Get a default process, for use when creating a new process.
GET processes/summary: Get summaries of all processes matching the provided filters.
GET processes/<id>: Get the process with the specified entity ID.
PUT processes/<id>: Replace the process with the specified ID with a local instance.
DELETE processes/<id>: Delete the process with the specified entity ID.
GET processes/<id>/summary: Get the summary of the specified process.
GET processes/<id>/uses: Get all the ‘uses’ of the specified process.
POST processes/<id>/uses: Create a new ‘use’ of the specified process.
GET processes/<id>/uses/<pu_id>: Get a specific process use for a specific process.

Request Details

Paths are all relative to the root of the API, including version number.

Get Registered Processes
Get all processes matching the provided filters.
Path: processes
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_process_class: filter on the process class; values ‘all’ (default) or one of the valid values of process/processClass
  filter_executable: filter on whether executable or not; values ‘all’ (default), true, false
  filter_use_frequency: filter on how often the process has been used
Returns: array of Process entities
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are
used in filters.

Register (Create) a Process
Create a new process, given entity metadata and storage information. The values returned from ‘GET /processes/default’ can be edited and then passed into this method to create a new process.
Path: processes
Method: POST
Parameters: local instance of process
Returns: id: ID of created process

Get a Process Instance with Default Settings
Get a default process instance, for use when creating a new process. The values returned from this method can be edited, and used with ‘POST /processes’ to create a new process in Loom.
Path: processes/default
Method: GET
Parameters:
  container_id: the ID of a dataset or source registered in Loom
  data_unit_name: the name of a data unit in the container
Returns: process object (no entity/id)

Get Process Summaries
Get summaries of all processes matching the provided filters.
Path: processes/summary
Method: GET
Parameters:
  filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)
  filter_process_class: filter on the process class; values ‘all’ (default) or one of the valid values of process/processClass
  filter_executable: filter on whether executable or not; values ‘all’ (default), true, false
  filter_use_frequency: filter on how often the process has been used
Returns: array of ProcessSummary structs
See Also:
  GET /search/filters/values: to get the allowed values for a specific filter
  GET /types/attributes/values: to get the values of enumerated attributes which are used in filters.

Get a Process Instance
Get the process with the specified entity ID.
Path: processes/<id>
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: entity: process entity

Replace a Process Instance
Replace the process with the specified ID with the new process object.
Path: processes/<id>
Method: PUT
Parameters: the process to replace the existing one
Returns: id: ID of process (same as on input)

Delete a Process Instance
Delete the process with the specified ID.
Path: processes/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing

Get a Process Instance’s Summary
Get the summary of the specified process.
Path: processes/<id>/summary
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: ProcessSummary struct

Create a Process Use
Creates a unique ‘use’ of the specified process. A ProcessUse is a snapshot of a Process at the time it was used; this ensures the viability of lineage calculations, as ProcessUses are immutable (whereas Processes are mutable).
Path: processes/<id>/uses
Method: POST
Parameters: contexts: The input and output contexts for this particular use.
Returns: id: The ID of the ProcessUse

Get a Process’s Uses
Get the ‘uses’ of the specified process. This returns a set of ProcessUse instances. A ProcessUse is a snapshot of a Process at the time it was used; this ensures the viability of lineage calculations, as ProcessUses are immutable (whereas Processes are mutable).
Path: processes/<id>/uses
Method: GET
Parameters: none, except for entity/id that is built into the URL
Returns: array of ProcessUse instances

Get a Process Use
Get a specific process use for a specific process.
Path: processes/<id>/uses/<pu_id>
Method: GET
Parameters: none, except for the entity IDs that are built into the URL
Returns: ProcessUse
___________________________________________

Jobs

Jobs represent asynchronous activity performed in the system and tracked by Loom. Jobs are linked to executable Processes.

Job Entity Attributes

Name [Type]: Description
job/processUse [uuid]: Pointer to the ProcessUse the job is tracking execution of.
job/process [uuid]: Pointer to the Process that the ProcessUse was created from.
job/status [string]: Current status of the job execution. One of created, started, in-progress, completed, failed, cancelled, timed-out.
job/executedAt [instant]: When the job was initiated.
Should be equal to or greater than the value of entity/createdAt.job/errorMessagestringError message, if an error occurs (status == failed).job/jobTrackerIDstringHadoop job tracker ID, if available.job/jobLogstringLog associated with the job, if available.job/progressembedded structJob progress information. Contains a single JobProgress if available.JobProgress AttributesA JobProgress is a structure that captures progress information related to a job. It can be embedded in a Job instance, if the Job is completed, or obtained independently, if a Job is still in progress.Name [Type]Descriptionjob.progress/statestringInternal state of the processing engine. Corresponds roughly to the Job’s status field.job.progress/startedAtinstantWhen processing began by the processing engine. Should be equal to or greater than the Job’s entity/createdAt.job.progress/finishedAtinstantWhen processing was completed by the processing engine. Should be equal to or greater than the value of startedAt.job.progress/durationlongDuration of the job so far, in milliseconds. duration = (finishedAt - startedAt).job.progress/cpuTimelongTotal CPU time used so far for the job, in milliseconds.job.progress/stepMetricsembedded structJob progress metrics related to job steps. JobStepMetrics struct.job.progress/dataMetricsembedded structJob progress metrics related to data read and written. JobDataMetrics struct.JobStepMetrics AttributesA JobStepMetrics is a structure that captures low-level details about a Job’s execution and progress. It is always embedded in a JobProgress. 
Name [Type]Descriptionjob.progress.metrics/stepCountlongNumber of steps.job.progress.metrics/stepsFailedlongNumber of steps that failed.job.progress.metrics/stepsPendinglongNumber of steps that have not run yet.job.progress.metrics/stepsRunninglongNumber of steps currently running.job.progress.metrics/stepsSkippedlongNumber of steps skipped.job.progress.metrics/stepsStartedlongNumber of steps that have been started.job.progress.metrics/stepsStoppedlongNumber of steps that have been stopped.job.progress.metrics/stepsSubmittedlongNumber of steps submitted.job.progress.metrics/stepsSuccessfullongNumber of steps completed successfully.JobDataMetrics AttributesA JobDataMetrics is a structure that captures low-level details about a Job’s execution and progress. It is always embedded in a JobProgress. Name [Type]Descriptionjob.progress.metrics/tuplesReadlongThe number of tuples read so far.job.progress.metrics/tuplesWrittenlongThe number of tuples written so far.Job Summary AttributesThe JobSummary structure captures a ‘view’ of a Job entity, pulling in information from related entities (such as the Process that was executed to spawn the job). Instances of these structs are returned from the ‘summary’ API methods. 
Name [Type]: Description
summary/entityID [string]: The unique identifier of the entity that the summary is for.
summary/entityName [string]: The name of the entity that the summary is for.
summary/entityCreatedAt [instant]: The creation timestamp of the entity that the summary is for.
summary/entityCreatedBy [string]: The username of the person who created the entity.
summary/entityModifiedAt [instant]: The timestamp when the entity was last modified.
summary/entityModifiedBy [string]: The username of the person who last modified the entity.
job/status [string]: Current status of the job execution.
job.progress/startedAt [instant]: When the execution engine started processing the job.
job.progress/finishedAt [instant]: When the execution engine finished processing the job.
job.progress/duration [long]: The duration of the job so far, in milliseconds. (Equals finishedAt - startedAt when the job has completed.)
process/processType [string]: Type of process, corresponding to the ProcessDefinition. One of ‘sql-query’, ‘hiveql’, ‘dataset-import’, or a user-defined type.
process/processClass [string]: General class of this process.
One of ‘transform’, ‘import-export’, ‘descriptive’, or user-defined class.job/jobTrackerIDstringHadoop Job Tracker ID, if applicablejob/errorMessagestringError message, if an error occurs (status = FAILED)job/processUsereference (uuid)Pointer to the ProcessUse the job is tracking progress of.summary.job/processIDreference (uuid)Pointer to the Process that the ProcessUse is a snapshot of.summary.job/processNamestringName of the Process (from Process entity/name).RequestsRequestDescriptionGET jobsGet all jobs matching the provided filters.GET jobs/summaryGet summaries of all jobs matching the provided filters.GET jobs/<id>Get the job with the specified entity ID.PUT jobs/<id>Replace the job with the specified ID with the new job object.GET jobs/<id>/summaryGet the summary of the specified job.Request DetailsPaths are all relative to the root of the API, including version number.Get Registered JobsGet all jobs matching the provided filters.Path:jobsMethod:GETParameters:filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)filter_job_status: filter on the job status; values ‘all’ (default) or one of the valid values of job/status filter_job_duration: filter on job durationReturns:array of Job entitiesSee Also:GET /search/filters/values: to get the allowed values for a specific filterGET /types/attributes/values: to get the values of enumerated attributes which are used in filters.Get Job SummariesGet summaries of all jobs matching the provided filters.Path:jobs/summaryMethod:GETParameters:filter_recency: filter on when the entity was last modified; values ‘all’ or ‘recent’ (default)filter_job_status: filter on the job status; values ‘all’ (default) or one of the valid values of job/status filter_job_duration: filter on job durationReturns:array of JobSummary structsSee Also:GET /search/filters/values: to get the allowed values for a specific filterGET /types/attributes/values: to get the values of enumerated attributes which are 
used in filters.Get a Job InstanceGet the job with the specified entity ID.Path:jobs/<id>Method:GETParameters:none, except for entity/id that is built into the URL Returns:entity: job entityReplace a Job InstanceReplace the job with the specified ID with the new job object. This is used to update a job’s core metadata (name, description, folder, tags); it cannot be used to modify job progress or metrics information.Path:jobs/<id>Method:PUTParameters:the job to replace the existing oneReturns:id: ID of job (same as on input)Get a Job Instance’s SummaryGet the summary of the specified job.Path:jobs/<id>/summaryMethod:GETParameters:none, except for entity/id that is built into the URL Returns:JobSummary struct___________________________________________UsersUsers represent users (and in the future, groups) who use Loom.See also the Connection methods for logging into Loom.User Entity AttributesThe User entity has the following attributes, in addition to the core entity attributes. Name [Type]Descriptionuser/usernamestringUser’s unique system user name.user/passwordstringUser’s password, in hashed form.user/emailstringUser’s email address.RequestsRequestDescriptionPOST usersCreate a new user, given a local instance. Returns entity ID.PATCH users/<id>Modify user’s password and/or email address.Request DetailsNew UserCreates a user account, and logs in to that account.Path:usersMethod:POSTParameters:body: User object to be savedReturns:id: the entity ID for the created user entity.Update UserModifies the specified user account (e.g., password, email, etc).Path:users/<id>Method:PATCHParameters:password: current password, if changingbody: attributes from User entity, excluding usernameReturns:id: the entity ID for the modified user entity.___________________________________________Glossaries Glossary Entity AttributesA glossary is a container that holds terms. In addition, terms in a glossary can be organized by ‘subject areas’. 
So, a glossary is also a container of subject areas. The Glossary entity has the following attributes, in addition to the core entity attributes.

Name [Type]: Description
glossary/businessUnit [string]: The business unit that owns or manages the glossary.
glossary/steward [string]: The name of the steward who maintains and manages the glossary.
glossary/namespace [namespace/Namespace]: The namespace assigned to the glossary. Used for RDF interoperability.
glossary/import [array of namespace/Namespace]: The set of namespaces the glossary imports. Used for RDF interoperability.
glossary/subjectArea [array of references]: The subject areas in the glossary. Each subject area is owned by the Glossary. Each term is associated with exactly one of these.
glossary/term [array of references]: The terms in a glossary. Each term is owned by the Glossary, but is considered to be 'contained' within a single subject area.

SubjectArea Attributes

A glossary is a container that holds terms. In addition, terms in a glossary can be organized by ‘subject areas’. So, a glossary is also a container of subject areas. Subject areas are owned by their containing glossary; if the glossary is deleted, the subject areas within it are deleted also. Note that terms are not ‘contained’ by subject areas; they reference them through a property.

The SubjectArea entity has the following attributes, in addition to the core entity attributes.

Name [Type]: Description
glossary/steward [string]: The name of the steward who maintains and manages the subject area in the glossary.
glossary/namespace [namespace/Namespace]: The namespace assigned to the subject area. Used for RDF interoperability.
glossary/import [array of namespace/Namespace]: The set of namespaces the subject area imports. Used for RDF interoperability.

Term Attributes

A glossary is a container that holds terms. Terms are owned by their containing glossary; if the glossary is deleted, the terms within it are deleted also. Terms can reference a subject area in the glossary. However, terms are not ‘contained’ by subject areas; they reference them through a property.

The Term entity has the following attributes, in addition to the core entity attributes.

Name [Type]: Description
glossary/acronym [string]: An acronym by which the term may be known. May be the same as entity name or label.
glossary/alternateName [array of string]: Alternate names by which the term may be known.
glossary/longDescription [string]: A verbose description of the term. (entity/description is treated as a short description.)
glossary/subjectAreaRef [string]: Reference to a SubjectArea within the glossary, by name.
glossary/termStatusRef [string]: Reference to a TermStatus within the glossary, by name.
glossary/term [array of references]: The terms in a glossary. Each term is owned by the Glossary, but is considered to be 'contained' within a single subject area.

There are sub-types of terms: SimpleTerms and (in the future) CompositeTerms.

TermStatus Attributes

A glossary term can have a status, indicating some level of ‘governance’ of the term. Typically, term governance will be managed by a glossary steward.

The TermStatus struct has the following attributes, in addition to the core entity attributes such as entity/name, entity/label, and entity/description.

Name [Type]: Description
glossary/statusIndex [long]: Index of the status, for ordering.

Namespace Attributes

Glossaries and subject areas can be organized by namespaces. This is for the purposes of RDF interoperability.

The Namespace struct has the following attributes, in addition to the core entity attributes such as entity/name, entity/label, and entity/description.
Name [Type]Descriptionnamespace/namespaceUristringThe URI of the namespace for RDF interoperability.namespace/namespacePrefixbooleanThe prefix of the namespace for RDF interoperability.RequestsRequestDescriptionGET glossariesGet all glossaries matching the provided filters.POST glossariesCreate a new glossary, given glossary metadata.GET glossaries/<id>Get the glossary with the specified entity ID.PATCH glossaries/<id>Update the glossary with the specified entity ID.DELETE glossaries/<id>Delete the glossary with the specified entity ID.GET glossaries/<gid>/areasGet all glossary subject areas matching the filters.POST glossaries/<gid>/areasCreate a new subject area in a glossary.GET glossaries/<gid>/areas/<id>Get the subject area with the specified entity ID.PATCH glossaries/<gid>/areas/<id>Update the subject area with the specified entity ID.DELETE glossaries/<gid>/areas/<id>Delete the subject area with the specified entity ID.GET glossaries/<gid>/termsGet all glossary terms matching the filters.POST glossaries/<gid>/termsCreate a new term in a glossary.GET glossaries/<gid>/terms/<id>Get the term with the specified entity ID.PATCH glossaries/<gid>/terms/<id>Update the term with the specified entity ID.DELETE glossaries/<gid>/terms/<id>Delete the term with the specified entity ID.Request DetailsNote this API is ‘flat’ with respect to the containment hierarchy of a glossary. I.e., since glossary terms have unique entity IDs, they can be referenced directly, without referencing the containing subject area. Similarly, subject areas also have unique IDs, and can be referenced directly. 
(The containing glossary is still included in API methods, to be consistent with other parts of the Loom API, and to allow for retrieval by name in addition to by unique ID).Also note that while Terms are ‘contained’ within SubjectAreas, in terms of the formal data model, they are serialized as siblings, with the ‘whole-part’ relationship defined indirectly, using the subjectAreaRef property of a Term to point to their associated SubjectArea using the name (or namespace prefix) of the Subject Area.Register (Create) a GlossaryCreate a new glossary with a set of descriptive properties. Path:glossaries Method:POSTParameters:entity: properties to assign to the new glossary; must include glossary/namespace if the namespace parameter is not specified, and may contain glossary/import if the import parameter is not specified.namespace: the namespace to assign to the glossary entity; optional, if there is a glossary/namespace property defined within the entity that is passed in.imports: the namespaces of other glossaries (or subject areas) to import, so that terms within the glossary can reference terms in other glossaries; optional, if a glossary/import property is defined within the entity that is passed in.Returns:ID of the new glossary entity.Errors:1. If the glossary name (entity/name) matches that of another glossary in Loom.2. If the glossary’s display name (entity/label) matches that of another glossary in Loom. (Would be confusing in user interfaces).3. If the glossary’s namespace prefix or URI match those of another glossary in Loom. (In reality, the prefix does not have to be unique; this is a convenience, or sloppiness, initially.)Notes:1. The entity parameter is required, and must have at least the entity/name specified. If the namespace parameter is not specified, the entity parameter must include the glossary/namespace property. The entity/type need not be specified.2. 
The glossary name (entity/name), display name (entity/label), namespace prefix, and namespace URI must all be unique within the context of the Loom registry.
See Also:
GET glossaries: to get all defined glossaries

Get Registered Glossaries
Get all glossaries registered with Loom. All contained subject areas are also returned. However, contained terms are not returned. (The focus of this method is to retrieve the ‘organizational scheme’ consisting of glossaries and subject areas.)
Path: glossaries
Method: GET
Parameters: (none)
Returns: array of Glossary entities, with contained SubjectArea entities.
Notes:
1. When no glossaries are defined, will return empty array.
See Also:
POST glossaries: to create a new glossary
GET glossaries/<id>: to get a specific glossary

Get a Glossary
Get the glossary with the specified entity ID. The returned instance will include all subject areas in the glossary. It will include all terms in the glossary (linked to their respective subject areas) only if requested.
Path: glossaries/<id>
Method: GET
Parameters:
include_terms: If true, the glossary terms will be included in the response; default is false.
Returns: a Glossary entity, with contained SubjectArea entities; contained terms are also included if the include_terms parameter is true.
Notes:
1. The ‘related’ section should contain the following sections:
contentSummary, with counts of the total number of terms (‘nTerm’) and the total number of subject areas (‘nSubjectArea’).
governanceSummary, an array of structures closely aligned with the actual TermStatus instances (with their name for identification, label for display, description for tooltip, and index for ordering), along with a "count" field with the number of terms in the glossary with that status. See the attached JSON example. 
The array elements should be ordered in the same order as the glossary/statusIndex values.
See Also:
POST glossaries: to create a new glossary
GET glossaries: to get all glossaries
GET glossaries/statuses: to get all term statuses

Update a Glossary
Update the properties of the glossary with the specified ID, with the new or updated attributes specified. This does not change the contents of the glossary (its subject areas and terms).
Path: glossaries/<id>
Method: PATCH
Parameters:
entity: a set of glossary properties (e.g., entity/name, entity/description, glossary/import, etc), which will replace the existing glossary properties; may include glossary/namespace if the namespace parameter is not specified, and may contain glossary/import if the imports parameter is not specified.
namespace: the namespace to assign to the glossary entity; optional if a glossary/namespace property is defined within the entity that is passed in, or if the namespace is not to be changed.
imports: the namespaces of other glossaries (or subject areas) to import, so that terms within the glossary can reference terms in other glossaries; optional if a glossary/import property is defined within the entity that is passed in, or if the imports are not to be changed.
Returns: ID of the updated glossary entity.
Errors:
1. If the glossary’s display name (entity/label) matches that of another glossary in Loom. (This would be confusing in user interfaces.)
2. If the glossary’s namespace URI matches that of another glossary in Loom.
Notes:
1. Changing the entity name or the namespace prefix is not allowed.
See Also:
POST glossaries: to create a new glossary instance
GET glossaries/<id>: to get a specific glossary instance
GET glossaries: to get all defined glossaries

Delete a Glossary
Delete the glossary with the specified ID. 
This will delete all the subject areas and all the terms in the glossary.
Path: glossaries/<id>
Method: DELETE
Parameters: none, except for the entity/id that is built into the URL
Returns: nothing
___________________________________________
Add a Subject Area to a Glossary
Create a new subject area within a glossary. The subject area will be owned by the glossary; if the glossary is deleted, the subject area will be deleted also.
Path: glossaries/<glossary_id>/areas
Method: POST
Parameters:
entity: properties to assign to the new subject area; must include glossary/namespace if the namespace parameter is not specified, and may contain glossary/import if the imports parameter is not specified.
namespace: the namespace to assign to the subject area entity; optional if a glossary/namespace property is defined within the entity that is passed in.
imports: the namespaces of other glossaries (or subject areas) to import, so that terms associated with the subject area can reference terms in other glossaries; optional if a glossary/import property is defined within the entity that is passed in.
Returns: ID of the new SubjectArea entity.
Errors:
1. If the subject area name (entity/name) matches that of another subject area in the glossary.
2. If the subject area’s display name (entity/label) matches that of another subject area in the glossary. (This would be confusing in user interfaces.)
3. If the subject area’s namespace prefix or URI matches that of another subject area in the glossary.
Notes:
1. The entity parameter is required, and must have at least the entity/name specified. If the namespace parameter is not specified, the entity parameter must include the glossary/namespace property. The entity/type need not be specified.
2. Subject areas are owned by their containing glossary; if the glossary is deleted, the subject area will be deleted also.
3. 
The subject area name (entity/name), display name (entity/label), namespace prefix, and namespace URI must all be unique within the context of the glossary.
See Also:
GET glossaries/<glossary_id>/areas: to get all subject areas in the glossary

Get all Subject Areas in a Glossary
Get all subject areas within a specified glossary. The terms associated with each subject area are not returned by this method.
Path: glossaries/<glossary_id>/areas
Method: GET
Parameters: (none)
Returns: an array of SubjectArea entities.
Notes:
1. When no subject areas are defined, will return empty array.
See Also:
POST glossaries/<glossary_id>/areas: to add a new subject area to a glossary
GET glossaries/<glossary_id>/areas/<id>: to get a specific subject area

Get a Subject Area and its Terms
Get a specific subject area from the glossary. The returned instance will include the subject area and all its terms.
Path: glossaries/<glossary_id>/areas/<id>
Method: GET
Parameters: (none)
Returns: a SubjectArea entity under “subjectArea”, and an array of Terms under “terms”.
Errors:
1. If the ID does not match the entity/id of a SubjectArea in the glossary.
Notes:
1. When a subject area that doesn’t have any terms (such as one newly created) is retrieved, there should be no terms returned in the results.
See Also:
POST glossaries/<glossary_id>/areas: to add a new subject area to a glossary
GET glossaries/<glossary_id>/areas: to get all the subject areas in the glossary

Update a Subject Area in a Glossary
Update a glossary subject area with the specified ID, with the new or updated attributes specified. This does not change the terms associated with the subject area. This method cannot be used to rename the subject area, or to change its namespace. 
(Renaming will not be supported in Loom 2.2, although changing the display name will be permitted.)
Path: glossaries/<glossary_id>/areas/<id>
Method: PATCH
Parameters:
entity: a set of subject area properties (e.g., entity/name, entity/description, glossary/import, etc), which will replace the existing subject area properties; may include glossary/namespace if the namespace parameter is not specified, and may contain glossary/import if the imports parameter is not specified.
namespace: the namespace to assign to the subject area entity; optional if a glossary/namespace property is defined within the entity that is passed in, or if the namespace is not to be changed.
imports: the namespaces of other glossaries (or subject areas) to import, so that terms within the subject area can reference terms in other glossaries; optional if a glossary/import property is defined within the entity that is passed in, or if the imports are not to be changed.
Returns: ID of the updated subject area entity.
Errors:
1. If the subject area’s display name (entity/label) matches that of another subject area in the glossary. (This would be confusing in user interfaces.)
2. If the subject area’s namespace URI matches that of another subject area in the glossary.
Notes:
The name and namespace prefix cannot be modified by this method. That is because Terms reference their associated subject areas using these properties.

Delete a Subject Area from a Glossary
Delete the subject area with the specified ID from the glossary. Depending on the term_disposition parameter, this either deletes the terms associated with the subject area from the glossary or leaves them in place.
Path: glossaries/<glossary_id>/areas/<id>
Method: DELETE
Parameters:
term_disposition: Indicates how to deal with terms that are associated with the subject area being deleted. Options are ‘delete’, which will delete the terms; or ‘leave’, which will leave the terms, but set their subject area to null.
Returns: nothing
Errors:
1. 
Calling this for an entity that does not exist will yield an error.
___________________________________________
Add a Term to a Glossary
Create a new term in a glossary. Typically, a term will be associated with a subject area in the glossary. The term will be owned by the glossary; if the glossary is deleted, the term will be deleted also. The term is tightly tied to, but not owned by, the associated subject area if one is defined; if the associated subject area is deleted, it is optional to also delete all the associated terms.
Path: glossaries/<glossary_id>/terms
Method: POST
Parameters:
entity: Properties to assign to the new term, e.g., entity/name (required), entity/label, entity/description, etc. May contain a glossary/subjectAreaRef property, if the area_ref parameter is not specified, whose value must match the name of a subject area in the glossary. May contain a glossary/termStatusRef property, if the status_ref parameter is not specified, whose value must match the name of a termStatus in Loom. May contain a glossary/property array, if the composite_properties parameter is not specified, which denotes that the term is a composite and specifies the properties to embed in the composite (may be null if no properties are specified for a composite upon creation).
composite_properties: An array of business properties for a composite term. Optional if the glossary/property property is specified or if the term is a simple term. This parameter (or the glossary/property property) may be specified without a value, indicating that the term is composite but does not currently have any properties.
area_ref: The name of the subject area the term will be in. Optional if the glossary/subjectAreaRef property is specified as part of the entity parameter.
status_ref: The name of the term status that will be assigned to the term. Optional if the glossary/termStatusRef property is specified as part of the entity parameter.
Returns: ID of the new Term entity.
Errors:
1. 
If the combination of the term name (entity/name) and subject area matches that of another term in the glossary (i.e., same name in same or ‘default’ subject area).
2. If the term display name (entity/label) matches that of another term in the glossary. (That would be confusing for UI displays.)
3. If the glossary/termStatusRef has a value that does not match the name (entity/name) of a valid TermStatus in the registry.
4. If the area_ref or the glossary/subjectAreaRef property has a value that does not match the name of a SubjectArea in the glossary.
5. If an attempt is made to create a new term without a valid technical name (entity/name).
6. If an attempt is made to add a term with the same combination of technical name and subject area as another term in the glossary.
Notes:
1. If not explicitly specified using entity/type, the type is inferred from the presence or absence of the ‘glossary/property’ property or the ‘composite_properties’ parameter. Either may have the value ‘null’ (or, for the parameter, be given without a value), which indicates that the term is a composite with no properties currently defined.
2. Terms are owned by their containing glossary; if the glossary is deleted, the terms will be deleted also.
3. Terms are not owned by their associated subject areas; if the subject area is deleted, it is optional whether or not to delete the associated terms.
4. The entity parameter is required, and must have at least the entity/name specified. The entity/type need not be specified.
5. If the display name (entity/label) is not specified upon input, then the entity name should be copied into the entity/label field for use as the display name.
6. Term names within a given SubjectArea must be unique. Term names in different SubjectAreas can be the same, as their qualified names (namespace:name) will be unique.
7. Term display names (entity/label) must be unique within the entire glossary. 
This is more conservative than the restriction on technical names (which must be unique within a subject area), for usability reasons (seeing different terms named the same thing in a glossary might be confusing).
8. It is permissible to add a term to the glossary that has the same technical name as another term, as long as its subject area is different.
See Also:
GET glossaries/<glossary_id>/terms: to get all terms in the glossary

Get all Terms in a Glossary Matching some Constraints
Get all terms within a specified glossary that match some input constraints.
Path: glossaries/<glossary_id>/terms
Method: GET
Parameters:
area_ref: The subject area to restrict the terms to. If not specified, terms matching all subject areas are returned. If the special name ‘default’ is specified, then only terms associated with the glossary directly (no subject area references) are returned.
status_ref: The status to restrict the terms to. If not specified, then terms with any status are returned.
Returns: an array of Term entities matching the specified input constraints.
Notes:
1. When no terms are defined, or when no defined terms match the input constraints, this method will return an empty array.
See Also:
POST glossaries/<glossary_id>/terms: to add a new term to a glossary
GET glossaries/<glossary_id>/terms/<id>: to get a specific term

Get a Term from the Glossary
Get a specific term from the glossary. There are multiple possible variations of this method, depending on how the get a term using its
Path: glossaries/<glossary_id>/terms/<id>
Method: GET
Parameters: (none)
Returns: a Term entity
Errors:
1. If the ID does not match the entity/id of a term in the glossary.
See Also:
POST glossaries/<glossary_id>/terms: to add a new term to a glossary
GET glossaries/<glossary_id>/terms: to get a set of terms

Update a Term in a Glossary
Update a specific term in the glossary, with the new or updated attributes specified. This method can be used to change the subject area that a term is associated with. 
However, it cannot be used to rename the term. (Renaming will not be supported in Loom 2.2, although changing the display name will be permitted.)
Path: glossaries/<glossary_id>/terms/<id>
Method: PATCH
Parameters:
entity: A set of term properties (e.g., entity/name, entity/label, entity/description, etc), which will replace the current set of properties.
composite_properties: An array of properties for a composite term, which will replace the current set of composite properties. Optional if the glossary/property property is specified or if the term is a simple term. This parameter (or the glossary/property property) may be specified without a value, indicating that the term is composite but does not currently have any properties.
area_ref: The name of a different subject area the term will be in. Optional if the glossary/subjectAreaRef property is specified as part of the entity parameter, or if the term will remain tied to the original subject area.
status_ref: The name of the term status that will be assigned to the term. Optional if the glossary/termStatusRef property is specified as part of the entity parameter, or if the term status will remain unchanged.
Returns: ID of the updated term entity.
Errors:
1. Attempting to change the name of the term will yield an error.
2. Attempting to update a term with an area_ref or glossary/subjectAreaRef that does not match the name of a subject area in the glossary will yield an error.
Notes:
1. The term’s name cannot be modified by this method. That is because CompositeTerms reference their properties via qnames, formed with the SubjectArea prefix and Term name, so changing term names would break those property bindings.
2. The subject area that a term is associated with can be changed with this method by specifying a new (valid) value for the glossary/subjectAreaRef property of the entity that is passed in.
3. 
Updating a term without the area_ref or glossary/subjectAreaRef specified will result in the term having the same subject area association it had before the call.

Delete a Term from a Glossary
Delete the glossary term with the specified ID from the glossary.
Path: glossaries/<glossary_id>/terms/<id>
Method: DELETE
Parameters: none, except for the entity/id that is built into the URL
Returns: nothing
Errors:
1. Calling this for an entity that does not exist should yield an error.
___________________________________________
Relationships

Relationship Entity Attributes
A relationship is an entity that can relate any two entities. Relationships have a reference to a RelationshipType, which defines the constraints for relationships.
The Relationship entity has the following attributes, in addition to the core entity attributes.
Name [Type]: Description
relate/relationshipType [reference to RelationshipType]: The relationship type, defining constraints on the relationship.
relate/role1 [Role struct]: The first role (of two) defining one of the ends of the relationship.
relate/role2 [Role struct]: The second role (of two) defining one of the ends of the relationship.

RelationshipType Attributes
RelationshipTypes define the constraints for relationships.
The RelationshipType entity has the following attributes, in addition to the core entity attributes.
Name [Type]: Description
relate/acronym [string]: An acronym by which the type might be known.
relate/alternateName [array of string]: A set of alternate names by which the relationship type can be known.
relate/role1 [RoleType struct]: The first role type (of two) defining constraints on one of the ends of relationship instances.
relate/role2 [RoleType struct]: The second role type (of two) defining constraints on one of the ends of relationship instances.

Role Attributes
Roles define the ends of a relationship. 
A role has an implied reference to a RoleType; i.e., the Role at ‘end 1’ is associated with the RoleType at ‘end 1’ of the associated relationship type; similarly for the role at ‘end 2’.
The Role struct has the following attributes, in addition to the core entity attributes such as entity/name, entity/label, and entity/description.
Name [Type]: Description
relate/entity [reference]: The entity that the role is bound to.

RoleType Attributes
Role types define the constraints on the ends of relationships that reference the relationship type of which the role type is a part.
The RoleType struct has the following attributes, in addition to the core entity attributes such as entity/name, entity/label, and entity/description.
Name [Type]: Description
relate/isNavigable [boolean]: Whether the end of the relationship denoted by the role is navigable or not.
relate/constrainedType [string]: Constraint on type that can fill a role in a relationship instance; null for 'any'.

Requests
Request: Description
GET relationships: Get all relationships matching the provided filters.
POST relationships: Create a new relationship.
GET relationships/<id>: Get the relationship with the specified entity ID.
PATCH relationships/<id>: Update the relationship with the specified entity ID.
DELETE relationships/<id>: Delete the relationship with the specified entity ID.
GET relationships/types: Get all relationship types matching the provided filters.

Request Details

Create a Relationship between Two Entities
Create a new relationship between two entities, optionally overriding the default relationship name and role names.
Path: relationships
Method: POST
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) registered with the system; optional - if not specified, the generic 'relatedEntity' relationship will be used
related1_id: the entity/id of the first entity to relate (required)
related2_id: the entity/id of the second entity to relate (required)
rel_entity: a bucket of properties (entity/name, entity/label, entity/description, etc) to put on the relationship entity itself; optional - if not specified, the relationship properties will be set from the associated relationship type
role1: a bucket of properties (entity/name, entity/label, entity/description, etc) describing the role of the first entity in the relationship; optional - if not specified, the role properties will be set from the associated role type
role2: a bucket of properties (entity/name, entity/label, entity/description, etc) describing the role of the second entity in the relationship; optional - if not specified, the role properties will be set from the associated role type
Returns: ID of the new Relationship entity.
Errors:
1. If exactly two valid entity IDs are not passed in.
2. If the relationship type is specified, but is invalid.
3. If the specified entities at the ends of the relationship do not correspond to the type constraints defined by their respective role types (relate/constrainedType).
Notes:
1. If the relationship type is not specified, the generic 'RelatedEntity' relationship will be used.
2. If the relationship entity properties are not specified, the corresponding properties from the relationship type will be used.
3. If the role for an end is not specified, the properties from the associated RoleType (from the RelationshipType) will be used.
4. The type of an entity at a relationship end (related1_id, related2_id) must be compatible (accounting for inheritance) with the type specified by the relate/constrainedType field of the corresponding role type.
5. 
If a label is specified for any of the Relationship or Roles, but no name, then the label will be ‘sanitized’ and used to generate the name.
See Also:
GET relationships: to get all relationships, given some constraints

Get all Relationships Matching some Constraints
Get all relationships, optionally constrained by relationship type or name, or additionally by entity ID (at either end) or related entity type (at either end if no entity ID specified, else at the opposite end).
Path: relationships
Method: GET
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) to constrain the relationships returned to; optional.
rel_name: a name to restrict the relationships returned to; optional. Can be used independent of or in conjunction with rel_type_id.
related_id: the entity/id of an entity at one end of the relationship (can be at either end); optional.
related_type: the type of entities at one end of the relationship (can be either end, if no related entity ID is specified, otherwise at the opposite end); optional. If not specified, there is no constraint on the type of the related entities. Can use the special values ‘all-technical’, ‘all-glossary’, and ‘all’ (which is the same as ‘core/Entity’).
Returns: an array of Relationship entities.
Errors:
1. If the specified relationship type ID is invalid.
2. If the specified related ID is invalid.
Notes:
1. The relationship name may be more restrictive than the relationship type (specified via the rel_type_id). That is because a relationship can be named independently of the relationship type’s name. Or, in the case where the relationships were given the same name as the relationship types, then specifying this is just another way to identify the relationship type (effectively).
2. There is no restriction on what a relationship can be named. The same name can be used for multiple instances of the same type, or for instances of different relationship types.
3. 
The related type constraint should take inheritance into account. E.g., specifying data/DataContainer should return all instances that have that type, or source/Source, or dataset/Dataset.
See Also:
POST relationships: to create a new relationship
GET relationships/<id>: to get a relationship

Get a Relationship
Get a relationship given its unique identifier.
Path: relationships/<id>
Method: GET
Parameters: none (except for the ID in the URL)
Returns: a Relationship entity
Errors:
1. If the specified relationship ID is invalid.
See Also:
POST relationships: to create a new relationship

Update a Relationship
Update a relationship. This performs a partial update, not a full replace.
Path: relationships/<id>
Method: PATCH
Parameters:
rel_type_id: the entity/id of a relationship type (relate/RelationshipType) registered with the system; optional - if not specified, the existing relationship type is left unchanged
related1_id: the entity/id of the first entity to relate; optional - if not specified, the existing entity on ‘role 1’ is left unchanged
related2_id: the entity/id of the second entity to relate; optional - if not specified, the existing entity on ‘role 2’ is left unchanged
rel_entity: a bucket of properties (entity/name, entity/label, entity/description, etc) to put on the relationship entity itself; optional - if specified, the existing relationship properties will be fully replaced with the new values; if not specified, the relationship properties will be left unchanged
role1: a bucket of properties (entity/name, entity/label, entity/description, etc) describing the role of the first entity in the relationship; optional - if specified, the existing role properties will be fully replaced with the new values; if not specified, the role properties will be left unchanged
role2: a bucket of properties (entity/name, entity/label, entity/description, etc) describing the role of the second entity in the relationship; optional - if specified, the existing role properties will be fully replaced with the new values; if not specified, the role properties will be left unchanged
Returns: ID of the updated Relationship entity.
Errors:
1. If an invalid entity ID is passed in.
2. If the relationship type is specified, but is invalid.
3. If the specified entities at the ends of the relationship do not correspond to the type constraints defined by their respective role types (relate/constrainedType).
Notes:
1. For any parameter that is specified, its contents will fully replace the corresponding values of the current relationship.
2. If a new relationship type is specified, but one or more of rel_entity, role1, and role2 are not specified, then the existing values of those will be replaced with the ‘default’ values obtained from the new relationship type. So, the relationship name, label, and description will be replaced with the corresponding values from the new relationship type. Similarly, the role name, etc will be replaced with the corresponding values from the role types (for each role).
3. If an entity is specified for one or both ends of the relationship (related1_id and/or related2_id), the type of the entity must be compatible (accounting for inheritance) with the type specified by the relate/constrainedType field of the corresponding role type. (I.e., related1 must be compatible with roleType1, and related2 must be compatible with roleType2.) This applies whether a new relationship type is specified or not.
See Also:
POST relationships: to create a new relationship

Delete a Relationship
Delete a relationship given its unique identifier.
Path: relationships/<id>
Method: DELETE
Parameters: none (except for the ID in the URL)
Returns: nothing
Errors:
1. If the specified relationship ID is invalid.
See Also:
POST relationships: to create a new relationship
___________________________________________
Get Relationship Types
There is a fixed set of ‘global’ RelationshipType instances in the Loom registry. 
This method retrieves all the RelationshipType instances that are registered, optionally constrained by those types that allow for a specific type of entity to be related.
Path: relationships/types
Method: GET
Parameters:
related_type: the type of entities allowed at one end of the relationship (can be either end); optional - if not specified, all relationship types are returned.
Returns: array of RelationshipType entities.
Notes:
1. This method will return all the RelationshipType instances registered if no parameters are specified.

Entities
This is a ‘generic’ API for accessing instance-related information about the Loom registry model.

Requests
There is currently only one ‘polymorphic’ (type-agnostic) method available. In the future, there will be more methods for interacting with entity instances without declaring the type.
Request: Description
DELETE entities/<id>: Delete the entity with the specified entity ID.
DELETE entities: Delete all entities in the specified folder; optionally recursive.

Delete an entity
Delete the entity with the specified ID. This performs type-specific delete processing where appropriate (i.e., delegates to DELETE /sources, etc).
Path: entities/<id>
Method: DELETE
Parameters: none, except for entity/id that is built into the URL
Returns: nothing

Delete all entities in a folder
Deletes all entities ‘in’ the specified folder. Optionally recurses down the virtual folder hierarchy. 
Path: entities
Method: DELETE
Parameters:
folder: The starting folder; use a single slash (‘/’) or an empty string (“”) for the ‘root’ folder.
recurse: Indicates whether to delete recursively or not (default false).
Returns: nothing
___________________________________________
Types
This is a ‘generic’ API for accessing type-related information about the Loom registry model.

Requests
Request: Description
GET types/extensions: Get extension attributes for specified entity type.
GET types/attributes/values: Get allowed values for the specified attribute.

Request Details

Type Extensions
Returns all the extension attributes for a data type. Results are an array of attribute definitions.
Path: types/extensions
Method: GET
Parameters:
type: The entity type to retrieve extension attributes for.
Returns: An array of attribute definitions associated with the entity type; e.g.,
[{"meta.attribute.ref/type":"dataset/Dataset","meta.attribute/doc": "Related To","meta.attribute/cardinality":"many","meta.attribute/valueType":"uuid","meta.attribute/name":"dataset.extension/relatedTo"},{"meta.attribute/doc":"Source","meta.attribute/fulltext":"true","meta.attribute/cardinality":"one","meta.attribute/valueType":"string","meta.attribute/name":"dataset.extension/source"}]

Attribute Allowed Values
Returns the allowed values for a specific attribute. This assumes the attribute holds enumerated values.
Path: types/attributes/values
Method: GET
Parameters:
entity: The groups of attributes of interest, based on an entity type. This parameter is ignored if attribute is set. Must be a single value or array of: source/Source, process/Process or data.table/Column
attribute: Defines the attributes of interest. 
Must be a single value or array of: persist/storageType, data/structuralForm, process/processType, data.table/dataType
Returns: Map of attribute-name to [value,label] pairs; e.g.,
{"data/structuralForm":[["table","Table"]],"persist/storageType":[["file/text","Text Files"],["rdb/generic","Relational Database"]]}
___________________________________________
Activity API
The Activity API focuses on general user activities, such as executing transforms, managing sessions, etc. These are independent of metadata-manipulation operations performed through the Resource API.

Connection Operations
These operations are related to establishing, checking, and terminating connections to Loom.
Note that in Loom 1.0, connection establishes a cookie so that entity creation and modification events are tied to a specific user (via the entity/createdBy and entity/updatedBy fields). Loom does not have formal secure authentication in 1.0.

User Entity Attributes
The connection operations deal with users, and as such use User entity information. See the Users section above for attribute values.

Requests
Request: Description
POST connect/login: Log user in to Loom.
POST connect/logout: Log user out of Loom.
GET connect/ping: Checks if the connection is still valid. If it is, returns information for the user associated with the session.

Request Details

Login
Provide user credentials to get a session cookie for that user. Both the username and password parameters are strings. The password is sent in the clear, and will rely on the transport layer for security.
Path: connect/login
Method: POST
Parameters: username, password
Returns: entity/id: The entity ID of the user who logged in.
Headers: Set-Cookie with session-id set.

Logout
Releases the user’s connection. 
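Login and logout both revolve around the session cookie. As a minimal sketch, a client could pull the session value out of the Set-Cookie header returned by connect/login as follows. It assumes the cookie is literally named session-id (the text only says that Set-Cookie has session-id set); the helper name and the sample header values are hypothetical.

```python
from http.cookies import SimpleCookie
from typing import Optional

# Assumption: the cookie name is exactly "session-id".
SESSION_COOKIE_NAME = "session-id"


def session_id_from_set_cookie(header_value: str) -> Optional[str]:
    """Return the session cookie value from a Set-Cookie header, or None if absent."""
    cookie = SimpleCookie()
    cookie.load(header_value)  # parses "name=value; Attr=..." pairs
    morsel = cookie.get(SESSION_COOKIE_NAME)
    return morsel.value if morsel is not None else None


# Hypothetical header values, for illustration only.
print(session_id_from_set_cookie("session-id=abc123; Path=/"))
print(session_id_from_set_cookie("other=1; Path=/"))
```

A client would send the extracted value back in the Cookie header on later requests such as connect/ping and connect/logout.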
This resets the user’s session cookie.
Path: connect/logout
Method: POST
Parameters: none - retrieves current user from the session-id in cookies.
Returns: empty

Ping
Checks if the caller is currently connected to Loom, and if so, returns the entity containing all of the user data for the currently logged in user. The current user is determined from the session information (see /connect/login).
Path: connect/ping
Method: GET
Parameters: session cookie in headers
Returns: User entity.
___________________________________________
Search Operations
Related to search - getting filters, defining filters, etc.

Requests
Request | Description
GET search/filters/values | Get the values for specific named filters, used for the multi-value ‘GET’ methods in the Resource API.
POST search/text | Search through text fields for matching entities.
POST search/text/glossaries | Search through all glossaries or a single glossary for terms matching the search value. (New in 2.2)
GET search/folders | Get all entities under the specified folder.
GET search/lineage/data | Get lineage for a specific dataset, source, or data unit.
GET search/lineage/process | Get lineage starting from a specific process or process use.
GET search/related/entities | Get entities of some type that are traversable in a specified direction from a single instance.
GET search/related/processes | Get all the processes and process uses that a specific data unit has been used in.
GET search/related/dataflow | Get all the ‘dataflow segments’ that a specified entity is part of.
GET search/totals | Get the total number of entities of each of the main types exposed by Loom: source/Source, dataset/Dataset, process/Process, job/Job

Request Details
Paths are all relative to the root of the API, including version number.

Get Filter Values for a Filter
Get the values for specific named filters, used for the multi-value ‘GET’ methods in the Resource API.
Path: search/filters/values
Method: GET
Parameters:
context: The name of a filter grouping (e.g., ‘dataset’, ‘source’, etc).
filter: The name of a filter (e.g., ‘filter_recency’)
Returns: Map of filter key to filter value. E.g., {<filter_name> : [ ["<value_1>", "<label_1>"], ... ] ... }
The filter key is passed into Loom API methods, whereas the value is used for display purposes.
See Also: GET /types/attributes/values: To get the values for a specific attribute. Filters often comprise the values of an enumerated attribute, plus the value ‘all’ (and sometimes ‘none’).

Perform Full-text Search over all Entities
Search through text fields for matching entities. The following core entity fields can be included in these searches: entity/name, entity/description, entity/label, entity/tags. In addition, some domain model attributes (those with the meta-attribute of meta.attribute/fulltext=true) can be included in these searches.
Path: search/text
Method: POST
Parameters:
types: entity types to return if their properties, or the properties of a related type, are matched. One of ‘all’, ‘technical’, ‘glossary’, or a list of specific types.
properties: properties to search over (not every instance being matched will have all of the specified properties; properties missing for a given type are ignored).
By default, all of the 'core' properties for an entity are included: entity/name, entity/description, entity/label, entity/tags
match_terms: list of search words or phrases, to be matched individually and AND’d together
match_type: one of "exact" or "wildcard" (default "wildcard"); a wildcard search can also be embedded in individual search terms using the wildcard (‘*’) character
related_types: entity types of related entities to search over; if not specified, relationships will not be followed. One of ‘all’, ‘technical’, ‘glossary’, or a list of specific types.
_offset: Provides an offset into the results.
_limit: Specifies how many results should be returned, starting at the offset position.
Returns: An array of SearchResults, each containing the following information:
returnEntity: the entity that was returned as a result of the match; must be compatible with the ‘types’ parameter
  entity/id
  entity/name
  entity/label
  entity/type
  entity/modifiedAt
  entity/modifiedBy
containerEntity
  entity/id
  entity/name
  entity/label
  entity/type
Notes: If multiple properties are specified, then a match will occur if the match term is found in any of those properties, on the target types (types parameter) or on any related types (if related_types parameter is specified).

Get Entities in Folder
Lists all entities of a given type that appear within a folder. This recursively includes all sub-folders under the specified one. If a folder is not specified, then the root folder is presumed. Note that the type parameter is unnamed in the request, and appears at the end of the path.
Path: search/folders
Method: GET
Parameters:
type: The type of entities to get; ‘all’ for all
folder: The folder to look in.
‘/’ for the root folder.
recurse: If true (default), recurse through the folder hierarchy
Returns:
entities: An array of entities of the requested type, in the folder or one of its sub-folders.
folders: A sorted array of the folders that the objects appear in.

Get Lineage starting from a Data Container or Unit
Get lineage starting from a specific dataset, source, or data unit within a dataset or source.
Path: search/lineage/data (previous method search/lineage is deprecated)
Method: GET
Parameters:
container_id: The ID of the dataset or source; if specified and no data unit information is provided, then container-level lineage will be computed, otherwise data unit-level lineage will be computed
data_unit_name: The name of the data unit within the container (container_id must be specified); if specified, then data unit-level lineage will be computed
data_unit_id: The ID of the data unit (container_id is optional, but must be consistent if specified); if specified, then data unit-level lineage will be computed
direction: Specifies the direction. Optional; if excluded, both directions are given.
format: The format the lineage will be returned in. One of ‘graph’ (default) or ‘nested’. The ‘graph’ format returns a node-link structure with all node and link details in the ‘related’ section of the response.
The ‘nested’ format returns a deeply nested structure.
direction: The direction to compute lineage for; one of ‘up’ (for upstream), ‘down’ (for downstream), or ‘both’ (default).
steps_max: the maximum number of steps taken in the graph traversal; default is ‘unlimited’
Returns (‘graph’ option):
nodes: an array of the nodes in the graph, representing entities that are connected via the links; the ‘node’ key contains the entity ID
  node ‘node’ is the UUID of the data entity (dataset, source, or data unit)
  node ‘label’ is the name of the node
  node ‘type’ is one of ‘data’ or ‘process’
  node ‘group’ is the UUID of the container for data nodes, or of the process for process (use) nodes
  node ‘state’ is the entity state
links: an array of links, with the ‘source’ and ‘target’ keys containing the IDs of the entities at the start and end of each link.
  link ‘source’ is the UUID of the node the directed link comes from
  link ‘target’ is the UUID of the node the directed link goes to
  link ‘label’ is a string of the form ‘source_type->target_type’
  link ‘type’ is ‘lineage’
Returns (‘nested’ option):
up: the lineage upstream (‘backward’ direction) of the starting entity
down: the lineage downstream (‘forward’ direction) of the starting entity
Notes: The combinations possible on input are:
container_id only - container-level lineage
data_unit_id only - data unit-level lineage
container_id and data_unit_name - server looks up data_unit_id, computes data unit-level lineage
container_id and data_unit_id - same as previous, but data unit must be in the container or it is an error
Notes: Example return structure for ‘graph’ option (with details keyed off UUIDs in the ‘related’ part of the response):
{"nodes": [ {"node": <uuid>, "label": <string>, "type": <string>, "group": <uuid>}, ... ], "links": [ {"source": <uuid>, "target": <uuid>, "label": <string>, "type": <string>}, ... ]}

Get Lineage starting from a Process
Get lineage starting from a specific process or process use. This method always returns lineage in the ‘graph’ format.
(The ‘data’ lineage has an option to do ‘nested’ also, but that does not make as much sense when starting from a process.)
Path: search/lineage/process
Method: GET
Parameters:
process_id: The ID of the process or process use (required)
granularity: The granularity at which lineage will be computed. One of ‘container’ or ‘data_unit’ (optional; default is container if not specified).
steps_max: the maximum number of steps taken in the graph traversal; default is ‘unlimited’
Returns:
nodes: an array of the nodes in the graph, representing entities that are connected via the links; the ‘node’ key contains the entity ID
  node ‘node’ is the UUID of the data entity (dataset, source, or data unit)
  node ‘label’ is the name of the node
  node ‘type’ is one of ‘data’ or ‘process’
  node ‘group’ is the UUID of the container for data nodes, or of the process for process (use) nodes
  node ‘state’ is the entity state
links: an array of links, with the ‘source’ and ‘target’ keys containing the IDs of the entities at the start and end of each link.
  link ‘source’ is the UUID of the node the directed link comes from
  link ‘target’ is the UUID of the node the directed link goes to
  link ‘label’ is a string of the form ‘source_type->target_type’
  link ‘type’ is ‘lineage’
Notes: Example return structure for ‘graph’ option (with details keyed off UUIDs in the ‘related’ part of the response):
{"nodes": [ {"node": <uuid>, "label": <string>, "type": <string>, "group": <uuid>}, ...
], "links": [ {"source": <uuid>, "target": <uuid>, "label": <string>, "type": <string>}, ... ]}

Get Related Entities
Get entities of some type that are traversable in a specified direction from a single instance.
Path: search/related/entities
Method: GET
Parameters:
entity_id: the ID of the starting entity; currently, this must be of type source/Source
related_type: the type of the target entities; currently, this must be of type dataset/Dataset
steps_max: the maximum number of steps taken in the graph traversal; default 1
Returns:
related_entities: array of all entities of the specified type that are related to the starting entity
steps_taken: the number of steps taken in the graph traversal; will be less than or equal to steps_max

Get Processes
Get Processes and Process Uses that have a Dataset or DataUnit in their input or output contexts. NOTE: This method will be merged into /search/related/entities in an upcoming release.
Path: search/related/processes
Method: GET
Parameters:
dataset_id: The ID of the dataset; if specified and no data unit information is provided, then all processes involving the dataset will be returned.
data_unit_name: The name of the data unit within the dataset (dataset_id must be specified); if specified, then only processes using the specific data unit will be returned.
data_unit_id: The ID of the data unit (dataset_id is optional, but must be consistent if specified); if specified, then only processes using the specific data unit will be returned.
direction: The direction; one of ‘in’ (for input to process), ‘out’ (for output from process), or ‘both’ (default)
class: The class of processes to return; one of ‘process’, ‘process_use’, or ‘all’ (with ‘all’ being the default if not specified).
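As an illustration of the search/related/processes parameters, a request URL might be assembled as in the following sketch. It is not Loom client code: the base URL is a placeholder, passing the parameters in the query string is an assumption (the document only states that the method is GET), and the helper name related_processes_url is hypothetical.

```python
from urllib.parse import urlencode

# Placeholder base URL (assumption); the real host, port, and API root
# depend on the Loom installation.
BASE_URL = "http://localhost:8080/v1"

VALID_DIRECTIONS = {"in", "out", "both"}
VALID_CLASSES = {"process", "process_use", "all"}


def related_processes_url(dataset_id=None, data_unit_name=None, data_unit_id=None,
                          direction="both", process_class="all", base_url=BASE_URL):
    """Build a GET URL for search/related/processes (hypothetical helper).

    Query-string encoding of the parameters is an assumption.
    """
    if direction not in VALID_DIRECTIONS:
        raise ValueError(f"direction must be one of {sorted(VALID_DIRECTIONS)}")
    if process_class not in VALID_CLASSES:
        raise ValueError(f"class must be one of {sorted(VALID_CLASSES)}")
    # Documented rule: data_unit_name is only meaningful together with dataset_id.
    if data_unit_name is not None and dataset_id is None:
        raise ValueError("data_unit_name requires dataset_id")
    # Assumption: at least one identifying parameter must be supplied.
    if dataset_id is None and data_unit_id is None:
        raise ValueError("dataset_id or data_unit_id is required")

    params = {}
    if dataset_id is not None:
        params["dataset_id"] = dataset_id
    if data_unit_name is not None:
        params["data_unit_name"] = data_unit_name
    if data_unit_id is not None:
        params["data_unit_id"] = data_unit_id
    params["direction"] = direction
    params["class"] = process_class
    return f"{base_url}/search/related/processes?{urlencode(params)}"


if __name__ == "__main__":
    # The IDs below are made-up illustrative values.
    print(related_processes_url(dataset_id="abc", data_unit_name="t1", direction="in"))
```

Validating the enumerated values on the client side keeps malformed requests from reaching the server; whether the server itself rejects them, and with what error, is not stated in this document.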
Returns:
in: array of Processes and ProcessUses that the dataset / data unit are input to
out: array of Processes and ProcessUses that the dataset / data unit are output from
Notes: The combinations possible are:
dataset_id and data_unit_name - server looks up data_unit_id, returns the matching processes
data_unit_id only - ok, will return the processes for that data unit
dataset_id and data_unit_id - same as previous, but data unit must be in the dataset or it is an error
dataset_id only - error
data_unit_name only - error

Get Related Dataflows
Get all ‘dataflow segments’ that involve a specified entity. This is a useful way to find not only which entities are related (within 2 graph links) to a given entity, but also how they are related.
A dataflow segment is of the form:
                    /-> process
                   /
data-input* -> process-use -> data-output*
                   \
                    \-> job
where:
data-input is a set of ‘Source [DataUnit]’ or ‘Dataset [DataUnit]’ pairs, input to and output from a ProcessUse
DataUnit does not have to be present for all process uses (e.g., source -> dataset)
There can in general be multiple input pairs and multiple output pairs
process-use is a ProcessUse instance
data-input is a Source/DataUnit or Dataset/DataUnit pair, input to a ProcessUse
process is the Process entity that the ProcessUse is a snapshot of
job contains Job information for the executing or executed ProcessUse (empty if not executable)
The main segment contains the main data flow information: data input to process use and data output.
The process and job information are secondary info, and can be optionally omitted.
Path: search/related/dataflow
Method: GET
Parameters:
entity_id - the ID of the reference entity; must be of type Source, Dataset, Process, ProcessUse, or Job.
include_secondary - whether to include ‘secondary’ entities (process and job entities), or not; default is true, to include the secondary entities.
Returns: an array containing structures with the following fields:
use-role - the role that the starting entity plays in the segment (one of ‘data-input’, ‘data-output’, ‘process-use’, ‘process’, or ‘job’)
use-date - the modifiedAt date from the process-use (copied up for convenience)
data-input - the data input entity (and contained entity if there is one)
process-use - the process use
data-output - the data output entity (and contained entity if there is one)
process - the process that the process use is a snapshot of
job - the job tracking execution info (empty if not an executable process)
Each path field (data-input, etc) contains the following core properties:
  entity/id
  entity/name
  entity/type
  entity/modifiedAt
  entity/modifiedBy
In addition, the ‘data’ fields (data-input and data-output) will have an additional sub-field, called ‘contained-entity’, if the ProcessUse context had a dataUnitName property.
In that case, this will have the same 5 core properties for the data unit.
All other ‘auxiliary’ information should be in the ‘related’ section:
for all types: description, modifiedByUser, entityState
for sources: storageType (technical and human readable), location, # tables (calc)
for datasets: # tables (calc)
for processes: processType, processClass, processScope, # uses (calc)
for process uses: job ID, job name, job status, job duration (raw and display)
for jobs: process use ID, job name, job status, job duration (raw and display)
Notes:
If include_secondary is false, the ‘process’ and ‘job’ fields will not be included in the response.
The ‘use-role’ and ‘use-date’ are used to extract out particular segments. For example, a data entity can be an input to or an output from a process use, or both. The role allows the client to ‘filter’ based on these use contexts (similar to the ‘in’ and ‘out’ grouping output from search/related/processes). The use-date is a copy of the process-use modifiedAt, and is placed at the top level to assign that same timestamp to the overall segment, and provide an easy way to sort segments by timestamp.

Get Entity Totals
Get the total number of entities of each of the main types exposed by Loom: source/Source, dataset/Dataset, process/Process, job/Job.
Path: search/totals
Method: GET
Parameters: none
Returns: map of type to count
___________________________________________
Data Access Operations
These operations focus on basic data access.
These do not deal with any kind of filtering, processing, or transformations; those kinds of operations are handled through the Processing API.

Requests
Request | Description
GET data/file/read_lines | Get the first rows from a text file.
GET data/file/read_parsed | Get the first parsed lines from a text file.
GET data/dataset/head | Get the first rows from an individual data unit within a dataset.
GET data/dataset/stats | Get the statistics for an individual data unit within a dataset.

Request Details

Read Lines from Text File
Get the first rows from a text file in HDFS.
Path: data/file/read_lines
Method: GET
Parameters:
location: The absolute path to the file in the file system.
nrow: The number of rows to return; default 10.
Returns: An array of strings, each of which is a line of text from the file.

Read Parsed Lines from Text File
Get the first parsed lines from a text file in HDFS.
Path: data/file/read_parsed
Method: POST
Parameters:
location: The absolute path to the file in the file system.
file_format: The Format struct, used to interpret the bits on disk
nrow: The number of rows to return; default 10.
Returns:
rows: An array of rows from the file. Each row is an array of strings that are the columns parsed from the lines in the file.
columns: An array of strings that gives the column names used in the file.
The column names are parsed from the file's header if possible, else the columns are assigned auto-generated names.

Get the First Rows from a Data Unit
Get the first rows from an individual data unit within a dataset.
Path: data/dataset/head
Method: GET
Parameters:
dataset_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (dataset_id must be specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (dataset_id is optional, but must be consistent if specified); optional if data_unit_name is specified
nrow: The number of rows to return; default 10.
Returns:
records: the records
column_names: the names of the columns
Notes: The combinations possible are:
dataset_id and data_unit_name - server looks up data_unit_id, returns records
data_unit_id only - ok, will return records for that
dataset_id and data_unit_id - same as previous, but data unit must be in the dataset or it is an error
dataset_id only - error
data_unit_name only - error

Get the Statistics for a Data Unit
Get the statistics for an individual data unit within a dataset.
Path: data/dataset/stats
Method: GET
Parameters:
container_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (container_id must be specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (container_id is optional, but must be consistent if specified); optional if data_unit_name is specified
Returns: A ‘scan metadata’ structure, consisting of (only primary fields shown):
scan.table/numRecords: the number of records
scan.table/columnMetadata: the columns:
  scan.table/column - the column that the metadata is for, with sub-fields of
    entity/type
    entity/name
    data.table/dataType
  scan.table/columnType - one of 'string', 'number', or 'object'
  scan.table/nullValues - number of null values
  scan.table/emptValues - number of empty string values, for string columns
  scan.table/minValue
- the minimum value, for numeric columns
  scan.table/maxValue - the maximum value, for numeric columns
  scan.table/meanValue - the mean, for numeric columns
  scan.table/stdDev - the standard deviation, for numeric columns
Notes: The possible parameter combinations are:
container_id and data_unit_name - server looks up data_unit_id, returns stats
data_unit_id only - ok, will return statistics for that
container_id and data_unit_id - same as previous, but data unit must be in the dataset or it is an error
container_id only - error
data_unit_name only - error

Calculate the Statistics for a Data Unit
Calculate the statistics for an individual data unit within a dataset.
Path: data/dataset/stats
Method: POST
Parameters:
container_id: The ID of the dataset; optional if data_unit_id is specified
data_unit_name: the name of the data unit within the dataset (container_id must be specified); optional if data_unit_id is specified
data_unit_id: The ID of the data unit (container_id is optional, but must be consistent if specified); optional if data_unit_name is specified
Returns: ID of the data unit the statistics were computed for.
Notes: The possible parameter combinations are:
container_id and data_unit_name - server looks up data_unit_id
data_unit_id only - ok
container_id and data_unit_id - same as previous, but data unit must be in the dataset or it is an error
container_id only - error
data_unit_name only - error
___________________________________________
Execution Operations
These operations deal with processing data -- executing transformations and tracking job progress.

Requests
Request | Description
POST execute/transform | Execute the specified transformation.
GET execute/status | Get the status of an executed job.

Request Details

Execute a Transform
Execute the specified transformation.
Path: execute/transform
Method: POST
Parameters:
process_id: The ID of the process to execute.
contexts: The input and output data contexts; not required if process has a default context with both dataset and data unit name
defined.
Returns:
id: The ID of a Job to track the progress of the execution
Notes: If a context is not provided, the process being executed must have a default context defined. If that is the case, the input context will be the default input context, and the output context will be automatically generated as follows: the dataset will be the same as the input dataset, and the output data unit (table) name will be auto-generated.

Get Execution Status
Get the status of an executed job. The job may be in-progress, or may have completed (or failed).
Path: execute/status
Method: GET
Parameters:
job_id: The ID of the job (appended to URL)
Returns: The job status and progress, as a list with 2 fields:
job/status: the job's execution status
job/progress: the details about the job's execution, a JobProgress
___________________________________________
Environment Operations
These operations deal with the external environment which Loom interacts with.

Environment Struct Attributes

FileInfo Attributes
A FileInfo is a structure that describes a file or directory in HDFS.
Name | Type | Description
path | string | The full path of the file or directory.
isDir | boolean | If true, is a directory; otherwise is a file.
length | integer | The length of the file in bytes.
blockSize | integer | The number of bytes in a file block.
modificationTime | instant | When the file or directory was last modified.
owner | string | Username in HDFS of user who owns the file or directory.
group | string | Name of group in HDFS that owns the file or directory.
permission | string | The file permissions, in string format (e.g., “644”).
replication | integer | How many times the file is replicated across a cluster.

Requests
Request | Description
GET environ/fs/home_dir | Get the home directory of the file system.
GET environ/fs/list_info | Get a listing of files in a directory.
This is not recursive.
GET environ/fs/file_info | Get information about a specific file.
POST environ/fs/files | Upload one or more files to a specified directory in HDFS.
GET environ/hive/dblist | Get a list of databases in a Hive instance.

Request Details

File System Home Directory
Gets the name of the file system home directory. The file system can either be the native file system on the Loom server, or a Hadoop File System (HDFS) managed by the server. See also Apache WebHDFS documentation.
Path: environ/fs/home_dir
Method: GET
Parameters: none
Returns: The path of the home directory

File System List Files
Gets all file system details for everything in the provided path. If location is a directory, then lists the details for everything in that directory. If location is a file, then lists the details for just that file. The file system can either be the native file system on the Loom server, or a Hadoop File System (HDFS) managed by the server. See also Apache WebHDFS documentation.
Path: environ/fs/list_info
Method: GET
Parameters:
location: The path to get details for. May be a file or a directory.
Returns: An array containing FileInfo objects (see above). E.g.,
[{"path":"","group":"wheel","blockSize":33554432,"modificationTime":"2013-07-10T23:44:47Z","length":102,"owner":"root","isDir":true,"replication":1,"permission":"rwxr-xr-x"},
 {"path":"","group":"wheel","blockSize":33554432,"modificationTime":"2013-07-23T02:30:47Z","length":0,"owner":"pag","isDir":false,"replication":1,"permission":"rw-r--r--"}]

File System File Information
Get the file system details for a specified path. If path is a directory, then returns just the description of that directory. See also Apache WebHDFS documentation.
Path: environ/fs/file_info
Method: GET
Parameters:
location: The path to get details for. May be a file or a directory.
Returns: A FileInfo object (see above). E.g.
{"path":"","group":"wheel","blockSize":33554432,"modificationTime":"2013-07-23T19:44:29Z","length":476,"owner":"root","isDir":true,"replication":1,"permission":"rwxrwxrwt"}

File Upload
Upload one or more files to a specified directory in HDFS.
Path: environ/fs/files
Method: POST
Parameters:
file: The fully-qualified path to the file to upload.
target_directory: The destination directory in HDFS, where the file will be uploaded to.
Returns: No return value.

Hive Database Listing
Get the databases in a Hive instance.
Path: environ/hive/dblist
Method: GET
Parameters:
location: If present, list the tables in the database named by the path
all: If true, include databases that are Loom datasets; optional, defaults to false
Returns: A map containing tables, and if location specified, names of databases
{tables: [ { dbname:
             name:
             owner:
             tableType: /* "MANAGED" | "EXTERNAL" */
             parameters:
             viewOriginalText:
             viewExpandedText: }, ... ]
 paths: [ "dbname_1", "dbname_2", ... ] /* only if location="" or empty */
}
___________________________________________
System Operations
System operations deal with the Loom system itself.

System Struct Attributes

SystemVersion Attributes
A SystemVersion is a structure that describes the Loom version.
Name | Type | Description
versionDate | instant | The date and time when the Loom version was released
versionNumber | string | The Loom release identifier (e.g., 1.0.5)
buildNumber | string | The Loom build identifier; mostly for internal Revelytix use
versionAndBuild | string | The Loom release identifier with the build identifier appended
productName | string | The name of the product (always ‘Loom’)
productEdition | string | The name of the product edition (e.g., ‘Standard’)

SystemConfig Attributes
A SystemConfig is a structure that describes the Loom configuration. These are the properties defined in the ‘loom.properties’ file under the ‘config’ directory in your Loom installation.
Name | Type | Description
persist.mode | string | The mechanism that Loom uses for persisting datasets.
If ‘loom’, then Loom manages persistence in directories of HDFS files. If ‘hive’, then Loom uses Hive to persist datasets as databases in Hive. Default: loom
dataset.persist.dir | string | The location in HDFS where Loom-managed datasets are created, when running in ‘loom’ persist mode. Defaults to a directory named 'loom-datasets' in the HDFS working directory.
activeScan.hdfs.enabled | boolean | If true, active scanning of potential sources in HDFS is enabled. Default: false
activeScan.hdfs.baseDir | string | Comma-separated list of directories under which to scan for potential sources in HDFS. Directories may be specified as an absolute hdfs:// URL or a relative path that will be resolved against the Loom working directory. The scan is recursive, so all sub-directories of each configured directory will be scanned. Defaults to Loom working directory.
activeScan.hdfs.scanIntervalMinutes | integer | The interval, in minutes, at which Loom will scan HDFS for potential sources. Default: 60
activeScan.hdfs.parseLines | integer | The number of records to parse from a file in HDFS to determine whether it's a potential source. Default: 50
activeScan.hdfs.scoreThreshold | float | The threshold above which the confidence level must be for a file in HDFS to be considered a potential source. The confidence level is a computed value between 0 and 1. Default: 0.25
activeScan.hdfs.maxBufferSize | long | The maximum amount of data to read into memory from an HDFS file to determine whether it's a potential source. Default: 8388608
security.enabled | boolean | Enables or disables Loom security. If security is enabled, user impersonation is performed. Default: false
security.authentication | string | Configures how authentication is done: does the user exist and have permission? Username and password will always be requested in order to get a session.
-- jaas - (default) Use JAAS. JAAS is configured in security-unix.conf. Must have a valid session to access API.
-- loom - Use Loom username/password for a valid session.
API rejects requests without a valid session.
-- disabled - Use Loom username/password to get a session. Valid session not required to access API.
ssl.enabled | boolean | Enables SSL (https) support. Disabled by default.
ssl.port | long | Configures the SSL port. Defaults to 8443.

Requests
Request | Description
GET system/version | Get system information, such as Loom version.
GET system/config | Get system configuration information.
POST system/log_message | Writes a message to the system log.
GET system/backup | Get the contents of the registry; for release migration.
POST system/restore | Fill the registry with the backed up contents from another registry.

Request Details

Get Version
Returns version information for the instance of Loom.
Path: system/version
Method: GET
Parameters: none
Returns: SystemVersion struct (see above)

Get Configuration
Returns the system configuration information for the instance of Loom.
Path: system/config
Method: GET
Parameters: none
Returns: SystemConfig struct (see above); e.g.
{"activeScan.hdfs.enabled":true,
 "activeScan.hdfs.baseDir":["/data/dataset"],
 "activeScan.hdfs.scanIntervalMinutes":60,
 "activeScan.hdfs.parseLines":50,
 "activeScan.hdfs.scoreThreshold":0.25,
 "activeScan.hdfs.maxBufferSize":8388608,
 "persist.mode":"hdfs",
 "dataset.persist.dir":"data/loom-datasets",
 "security.enabled": false,
 "security.authentication": "disabled",
 "ssl.enabled": false,
 "ssl.port": 8443,
 "jobService.threadPool.size": 10}

Log a Message
Writes a message to the Loom system log. This is useful for ‘tagging’ activities before or after they are performed. Will emit the username from the current session.
Path: system/log_message
Method: POST
Parameters:
level: The log level. One of ‘info’, ‘debug’, ‘trace’, ‘fatal’, ‘warn’.
message: The message to write
Returns: none; writes a message to the txnlog, e.g.
2013-08-08 10:41:03,701- INFO - [fabric.txnlog] - [nREPL-worker-36] - Logged by <gary>: THIS IS A MESSAGE

Backup Registry
Get the contents of the registry; for release migration. This can be restored to a new registry using /system/restore.
Path: system/backup
Method: GET
Parameters: none
Returns: array of entity objects

Restore a Backup
Fill the registry with the backed up contents from another registry. The backup contents are obtained from /system/backup.
Path: system/restore
Method: POST
Parameters: array of entity objects wrapped in ‘results’ (output from backup)
Returns: array of entity/id’s of restored objects
___________________________________________
Registry Model Notes

Registry Model Overview
The following figure shows a high-level view of the Loom registry model. The model is comprised of a set of ‘domain’ models, which define a set of connected Entity Types. The models themselves have dependencies based on cross-model relationships between entity types in the domains. Some domain models are extensible, and have specific sub-models to expose specific functionality.
The primary 3 entity types are Source, Dataset, and Process. Source and Dataset models represent unmanaged and managed sets of data, respectively; they both are derived from the base DataContainer, which contains DataUnits (e.g., tables) with schema information. Data containers and data units have underlying persistent storage, which is proxied by Storage (container) and storage units. The Process entity represents processing performed on data entities, in a generic form.
Lineage is derived from the inter-connected instances of data entities (containers or data units) and processes. In order to compute valid lineage, a ‘snapshot’ of a process must be taken when it is used (executed, or just used in a relationship). This provides the immutability of processes from a lineage perspective.
From a data perspective, immutability is provided by making dataset units (i.e., tables) non-modifiable once they have been used.

Core Entity Attributes
All entities have the following core attributes. In addition to these, entities of each type have their own type-specific attributes.

Name | Type | Description
entity/id | string | Unique identifier for the entity. In the form of a UUID. Entity IDs are used for references between entities.
entity/name | string | Name of the entity.
entity/description | string | Description of the entity.
entity/folder | string | Registry folder in which the entity is organized.
entity/tags | array of string? | Tags applied to the entity by users.
entity/createdAt | instant (long) | Timestamp when the entity was created.
entity/createdBy | string | User ID of the person who created the entity.
entity/modifiedAt | instant (long) | Timestamp when the entity was last modified.
entity/modifiedBy | string | User ID of the person who last modified the entity.

Usage Scenarios

Basic Entity/Metadata Operations

Create a Source
A data source is a set of data whose lifecycle is not managed and controlled by Loom. These may be registered with Loom as a Source. This yields the following benefits:
- visibility of the source to a wider set of users
- definition and management of metadata describing the source
- determination of how to read and parse the data
- understanding of the data characteristics, through descriptive statistics and other measurements
- first step in creating a dataset for more thorough data preparation and analysis
- ability to define relationships between sets of data

Here is how you create a Source entity in Loom, to represent a data source in its ‘native’ form. This version allows for user interaction; there is a more streamlined version if the user knows all information up-front. 
The example is for a text file-based source.

Step | Description | Notes/Methods
1 | Identify location | Input to API
2 | Identify storage type, and format type | Inputs to API
3 | Read source metadata, return structures | GET /sources/default
4 | Identify which parts of the source to include | Set ‘containsData’
5 | Review raw data in files | GET /data/file/read-lines
6 | Review parsed data based on current format | GET /data/file/read-parsed
7 | Modify format characteristics to best parse data | Change Format properties
8 | Register the source with Loom | POST /sources

Create a Dataset
A dataset is a set of data that is created by Loom, and whose lifecycle is controlled and managed by Loom. These are represented by Datasets in Loom. Datasets may be initially created in Loom from Sources; thereafter, any processing that is performed on a dataset in Loom will result in the automatic creation and registration of another dataset. Datasets are used for data preparation -- and subsequently data analysis -- using Loom. The benefits of defining a dataset are:
- visibility of the dataset to a wider set of users
- definition and management of metadata describing the dataset
- Loom controls how the data are persisted, and can optimize this for efficient processing
- understanding of the data characteristics, through descriptive statistics and other measurements
- use for data preparation and cleansing in Loom
- relationships between datasets automatically captured when processing occurs

Here is how you create a Dataset entity in Loom from an existing Source, to represent a managed set of data. This version involves user interaction; there is a more streamlined sequence of steps if no user interaction is required.

Step | Description | Notes/Methods
1 | Get default dataset from registered source | GET /datasets/default <src>
2 | Change which data units to include |
3 | Change schema information | Edit data unit Schemas
4 | Register the dataset with Loom | POST /datasets

Create a Process
A process represents some processing of data. 
Data resides in Sources and Datasets, so processing acts upon those (specifically, upon the data in tables in them). A process is represented by a Process entity in Loom. A process is similar to a function or method in a programming language -- it has a name, some input parameters, and some output parameters. In the case of Loom, there is also the notion of data contexts, which arguments may bind to during execution.

There are a variety of types of processes (including any type a user wants to define). The most common when using Loom for data processing is a SQL Query.

Here is how you create a Process entity in Loom, to represent processing of data. This version involves user interaction; there is a more streamlined sequence of steps if no user interaction is required.

Step | Description | Objects/Methods
1 | Define the input arguments (name-value pairs) | Arguments
2 | Define the input and output data contexts | Contexts
3 | Define the local Process | Process
4 | Register the Process with Loom | POST /processes

Overlay Metadata on Corporate Resources
Another set of use cases involves using Loom primarily as a metadata registry, to attach metadata, manage resources, define relationships, and determine lineage between data resources that are generated and used outside of Loom.

Here is the basic sequence of steps. 
The full sequencing of each activity is consolidated down to one primary method for each; the additional details are covered above.

Step | Description | Objects/Methods
1 | Register Sources for each set of data | POST /sources
2 | Register Processes | POST /processes
3 | ‘Use’ Processes to link Sources together | POST /processes/<id>/uses
4 | Compute lineage relative to a Source | GET /search/lineage

API Examples

Source
The following is an example of the composite structure that is returned from and passed into the various /sources methods.

JSON Composite Structure
{
  "entity": {
    "entity/type": ["data/DataContainer", "source/Source"],
    "entity/name": "ModifiedName",
    "entity/folder": "test/test2/test3",
    "entity/description": "Created by RLoom",
    "entity/tags": "RLoom",
    "data/structuralForm": "table",
    "source/dataAccessible": true,
    "source/metadataAccessible": true,
    "source/entityState": "active",
    "source/expandable": true,
    "data/dataUnit": []
  },
  "storage": {
    "persist/format": {
      "persist.file.text/headerRow": true,
      "entity/type": ["persist/StorageFormat", "persist.file/DelimitedFormat"],
      "persist/formatType": "text/delim",
      "persist.file.delim/delimiter": ",",
      "persist.file.delim/quoteChar": "\"",
      "persist.file.text/skipRows": 0
    },
    "entity/type": ["persist/Storage", "persist.file/FileSet"],
    "persist/storageType": "file/text",
    "persist/location": "/data/datasets/earthquakes",
    "persist/application": "",
    "persist.file/isSingleFile": true,
    "persist/storageUnit": []
  },
  "storage_units": [
    {
      "entity/type": ["persist/StorageUnit", "persist.file/FileSetFile"],
      "persist/location": "",
      "persist/relativeLocation": "earthquakes.ddl",
      "persist/containsData": false,
      "persist.file/fileExtension": "ddl",
      "persist.file/isLogicalFile": false,
      "persist.file/isBinary": false
    },
    {
      "entity/name": "eqs7day",
      "entity/type": ["persist/StorageUnit", "persist.file/FileSetFile"],
      "persist/location": "",
      "persist/relativeLocation": "eqs7day.csv",
      "persist/containsData": true,
      "persist.file/fileExtension": "csv",
      "persist.file/isLogicalFile": false,
      "persist.file/isBinary": false
    },
    {
      "entity/type": ["persist/StorageUnit", "persist.file/FileSetFile"],
      "persist/location": "",
      "persist/relativeLocation": "README.txt",
      "persist/containsData": false,
      "persist.file/fileExtension": "txt",
      "persist.file/isLogicalFile": false,
      "persist.file/isBinary": false
    }
  ]
}

Process
The following is an example of the structure that is returned from and passed into the various /processes methods.

JSON for SQL Transform with no Default Input Context
The following is an example of the JSON that can be POSTed to /processes to define an ad-hoc SQL process that can be executed through Loom. This particular example does not have a default input context defined.
{
  "entity/description": "Created by RLoom",
  "process/argument": [
    {
      "entity/type": ["process/Argument", "process/ConfigArgument"],
      "entity/name": "transformText"
    }
  ],
  "entity/type": ["process/Process"],
  "process/processClass": "transform",
  "process/processType": "sql-query",
  "process/processScope": "dataunit",
  "entity/name": "SQLProcess_NoContexts",
  "entity/folder": "test",
  "process/isExecutable": true,
  "entity/tags": "RLoom"
}

JSON for Linking 3 Sources
The following is an example of the JSON that can be POSTed to /processes to define a ‘descriptive’ (non-executable) process to link 2 sources as inputs with 1 as output. 
{
  "entity/type": ["process/Process"],
  "entity/name": "ProcessLinkingSources",
  "entity/description": "2 input sources, 1 output",
  "entity/folder": "test1/test2",
  "entity/tags": "tag1, tag2",
  "process/processClass": "descriptive",
  "process/processType": "lineage-process",
  "process/processScope": "container",
  "process/isExecutable": false,
  "process/argument": [
    {
      "entity/type": ["process/Argument", "process/ConfigArgument"],
      "entity/name": "relationship",
      "process.arg/value": "link"
    },
    {
      "entity/type": ["process/Argument", "process/ConfigArgument"],
      "entity/name": "source1",
      "process.arg/value": "Source1"
    },
    {
      "entity/type": ["process/Argument", "process/ConfigArgument"],
      "entity/name": "source2",
      "process.arg/value": "Source2"
    },
    {
      "entity/type": ["process/Argument", "process/ConfigArgument"],
      "entity/name": "source3",
      "process.arg/value": "Source3"
    }
  ],
  "process/context": [
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source1",
      "process.context/inout": "in",
      "process.context/container": "51ee9a72-af33-4798-97dd-15b71ea86d49"
    },
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source2",
      "process.context/inout": "in",
      "process.context/container": "51ee9a74-fdc2-47c9-a9c3-203f165f353d"
    },
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source3",
      "process.context/inout": "out",
      "process.context/container": "51ee9a77-e4d6-4135-9160-2ea07d565927"
    }
  ]
}

Executing/Using Processes
Processes are used, or realized, either through execution (for executable processes) or by explicitly defining a ‘use’ (for non-executable processes). In both cases, a ProcessUse is created. A ProcessUse is a snapshot of the Process at the instant the process was used.

When using a process, the data contexts must be specified. (Default input contexts from the Process may be used, or new input contexts may be specified; output contexts must always be specified.) A context represents a single set of data involved in the processing. 
If the process is scoped at the container level, then the contexts will contain only data container references (i.e., references to Sources and Datasets, one per context). If the process is data unit scoped, then the contexts will have not only a container reference, but also the name of a data unit (i.e., table) within the container. In addition to a name, a container, and an optional data unit name, each context must specify whether it is used for input or output.

When executing or using a process through the API, an array of Contexts is provided.

JSON for Array of Contexts, Container-Level Process
This is an example of the JSON for the ‘contexts’ parameter for POST /execute/transform and POST /processes/<process_id>/uses.
{
  "contexts": [
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source1",
      "process.context/inout": "in",
      "process.context/container": "51ee9a72-af33-4798-97dd-15b71ea86d49"
    },
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source2",
      "process.context/inout": "in",
      "process.context/container": "51ee9a74-fdc2-47c9-a9c3-203f165f353d"
    },
    {
      "entity/type": ["process/Context"],
      "entity/name": "Source0",
      "process.context/inout": "out",
      "process.context/container": "51ee9a77-e4d6-4135-9160-2ea07d565927"
    }
  ]
}

JSON for Array of Contexts, DataUnit-Level Process
This is an example of the JSON for the ‘contexts’ parameter for POST /execute/transform and POST /processes/<process_id>/uses. 
This would be similar to contexts used in single-table SQL query executions (1 table in, 1 table out).
{
  "contexts": [
    {
      "entity/type": ["process/Context"],
      "entity/name": "input",
      "process.context/inout": "in",
      "process.context/container": "520d1f28-35e2-47e8-aa80-6509d674bda1",
      "process.context/dataUnitName": "eqs7day"
    },
    {
      "entity/type": ["process/Context"],
      "entity/name": "output",
      "process.context/inout": "out",
      "process.context/container": "520d1f28-35e2-47e8-aa80-6509d674bda1",
      "process.context/dataUnitName": "Result01"
    }
  ]
}

API Notes

Notes on HTTP calls
Each API is accessed by an HTTP URL whose path starts with the root path of the API. The individual API operations are addressed both by extensions to this root path, and by the HTTP methods used to access the URL.

Methods
Some HTTP clients (e.g. browsers) may not support all of the methods used by an API. In this case, a POST operation may be used in conjunction with a URL parameter named _method. The value of the _method parameter will be used to override the POST method.

As an example, if it is not possible to send a PATCH request to the URL: a POST may be sent instead to the URL:

Host and Port
When referencing the Loom Server via its URL, the host name must be resolvable to the server that Loom is running on, through DNS or through the machines listed in the local hosts file (an explicit IP address may be used instead). The default port is 8080, although that can be changed when the Loom Server is started. Firewalls should be configured to allow the host machine to accept connections to this port number.

Input Parameters
Most API operations require several parameters. For GET requests, these parameters are provided as part of the URL. For other HTTP requests (POST, PUT, PATCH), the parameters are provided in the body, as a JSON map of parameter names to values. Occasionally an unnamed parameter may appear in the path, as is often seen in REST operations. 
This is the case, for example, when performing operations on a specific entity instance, in which case the entity ID is part of the URL.

_method Parameter
Due to URL length limitations, it is sometimes not possible to put the required input parameters in the URL for a GET request. In those cases, the special _method parameter should be placed in the operation URL, and a POST operation should be used. The back-end service will interpret the request as an HTTP GET, but will look for the parameters in the request body, rather than in the URL.

Responses
All responses are JSON map structures in the HTTP body. This is consistent even when a response holds a single value or an array.

Reference properties
Many entities contain references by ID to other entities. For instance, a job currently contains a reference to a query, an input dataset, and an output dataset. Invariably, a client (e.g. the UI) requires the name and folder path of the referred-to entity in order to display the reference to the user. In addition, the ID is used to construct a hyperlink to the referred-to entity. The API provides a way to return this information, in the ‘related’ part of the standard response structure.

References and composites
Certain kinds of references are not to named entities but are instead references to internal entities. For instance, a DataUnit contains a reference to a Schema entity, which contains references to Column entities, and so on. In most cases, those kinds of references to internal objects are expanded into nested JSON objects when passed through the API.

Serialization
Entities are serialized as JSON. In general, a sub-graph is serialized as a nested JSON structure, where references (UUID ‘pointer’ attributes) and composite structs are treated similarly, as nested values under a parent.
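Two of the conventions in these notes, the _method override for long GET requests and the lookup of referenced entities in the ‘related’ part of a response, can be sketched in Python as follows. This is a minimal illustration and not part of Loom itself. The host name loom.example.com is a placeholder, the helper names build_get_via_post and resolve_reference are hypothetical, and the version prefix of the root URL is omitted because it is not shown here. Only the request object is built; nothing is sent.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

# Placeholder root URL (assumption); 8080 is the documented default port.
LOOM_ROOT = "http://loom.example.com:8080"


def build_get_via_post(path, params, root=LOOM_ROOT):
    """Build a POST request that the server is asked to treat as a GET.

    The _method override goes in the URL; the actual parameters go in
    the JSON body instead of the query string.
    """
    url = f"{root}/{path.lstrip('/')}?{urlencode({'_method': 'GET'})}"
    body = json.dumps(params).encode("utf-8")
    return Request(url, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


def resolve_reference(response, entity_id, prop="entity/name"):
    """Return a display property for an entity ID from the 'related' map.

    Falls back to the raw ID when the entity is not listed in 'related'.
    """
    related = response.get("related") or {}
    return related.get(entity_id, {}).get(prop, entity_id)
```

For example, build_get_via_post("search/lineage", {"id": "<entity id>"}) yields a POST request addressed to search/lineage?_method=GET with the parameters in the body; the parameter name "id" is made up for illustration. Given a parsed response, resolve_reference(response, results["entity/createdBy"]) returns the creating user's name if that user appears in ‘related’.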