Bigdatawg.nist.gov



DRAFT User Guide

The following document is a draft of a user guide to the nine NIST Big Data Interoperability Framework (NBDIF) volumes. The objective of this document is to provide a very high-level overview of the contents of the nine volumes, the relationships between the volumes, the location (within the volumes) of some topics of interest to the Big Data community, and related NIST resources. The proposed focus of each section is highlighted in blue. The blue text serves to guide development of the document and will be removed later. Example text (i.e., text similar to what will be included in the document) is not highlighted.

Volume Overview

OVERALL GOAL OF THE NBDIF

This section will describe (at a very high level) the purpose for creating the compendium of NBDIF documents (e.g., Why were the documents created? What do the documents contain?)

Suggested Reading Pathways

This section will suggest a reading pathway (i.e., a suggested order in which to read the volumes) for a few user groups. The user groups are very high-level, broad groups and are not intended to address every type of user. One goal of the User Guide is to assist the reader in finding information of interest within the volumes of the NBDIF. Readers from different disciplines may prefer to take different navigation paths on their journey through the volumes. The suggested reading pathways below are intended only as a guide; they are not the only order in which the NBDIF volumes can be read.

Analyst (e.g., Security and Privacy Analysts, Vertical Application End-Users)
Suggested reading pathway narrative

Business User (e.g., Project Managers, Senior Executives)
Suggested reading pathway narrative: Vol 3 Vol n Vol n2.
Engineer (e.g., Chief Data Officers, Database Managers, Developers, IT Staff, System Architects)
Suggested reading pathway narrative

Scientist (e.g., Data Scientists, Vertical Application End-Users)
Suggested reading pathway narrative

Standards Developer
Suggested reading pathway narrative

Volume Abstracts

This section describes at a very high level (i.e., 2-4 sentences) the focus of each volume.

NBDIF Index and Thesaurus

This section will list major topics and subtopics covered within the volumes and point users to the location of significant discussion of the subtopic within the volumes.

Index

Columns: Topic, Subtopic, Vol 1, Vol 2, Vol 3, Vol 4, Vol 5, Vol 6, Vol 7, Vol 8, Vol 9

ANALYTICS ??????
  application: pg 44, 50, 289, rerun
  as a central concept in the reference architecture: pg 21, 22
  as a general requirement characteristic: pg 16
  as a key component of use case documentation: pg 30, 31, 32
  as it relates to the gaps: pg 35, 40, 43, 49, 50
  as part of a workflow: pg 268, 274
  benefit of: pg 327
  current approaches: pg 9, 14, 15, 17, 327
  definition: pg 269
  facets of: pg 272
  future needs: pg 8, 23, 25, 26, 50, 327
  in relation to security and privacy: pg 19, 20
  pertaining to data cleaning and quality: pg 47, 49
  relationship with transformation: pg 51
  types of: pg 329
  UIMA: pg 64

CLEANING
  during migration: pg 42
  as it relates to quality: pg 47
  time sink: pg 39, 41
  pervasive project issue: pg 25
  position in a ML workflow: pg 28
  usecase 8, M0165 (recheck)
  usecase 2-1 (recheck)
  usecase 41 (recheck)
  usecase 51 (recheck)
  relationship to schema on read: sec 2
  relationship to preparation: sec 5.3
  differing levels of importance: sec 5

[DATA] PREP
  zero
  as part of the data life cycle: pg 15
  in the context of 'translation': pg 17
  time sink: pg 48
  in context of NBDRA Activity: pg 77
  in the context of FAIR: pg 0 and sec 6.3.5
  in the context of governance: sec 6.3.4
  time sink: sec 6.2.1.6
  related to implementation: sec 6.3.5
  importance to checklist design: sec 6.3.6.2
  level of priority, checklist ID 2d??
  fundamental part of the analytics lifecycle: section 2, 5.3
  in relation to transformation, cleaning, integration, and schema on read: sec 2
  changes to the order of steps in processing: sec 5.3.1

GRAPHS
  knowledge graphs and RDF: graph DB, pg 43; pg 37, 39
  data structure: pg 41
  usecase 34 (recheck): usecase 34, 6 hits
  usecase 21 (recheck): usecase 21, 3 hits
  usecase 32 (recheck): usecase 32, concept graph
  usecase 51 (recheck): usecase 51, pg 191
  support for graph data and processing: sec 3.2
  a classification dimension for mapping the 51 usecases: appendix E
  as a visualization technique, 1.19??
  usecase 2-3 (recheck)
  graph DB: sec 4.2.2
  visualization challenges: sec 3.3.5

LANGUAGE
  related to English and terminology: pg 14
  gaps affecting interoperability: pg 35, 52
  related to query: pg 39, 40, 41
  when accepted by a service: pg 28
  in geo spatial mapping: pg 41
  related acronyms: pg 53, 54, 55, 56
  related standards: appendix B
  NLP: usecase 15, 22, 31, 2-3
  XBRL: usecase 5
  KML: usecase 13
  XML: usecase 29
  stemming: usecase 34
  ENVRI [is a] reference model: usecase 42??
  semistructured data, XML, JSON: sec 3.2.5, sec 3.31
  issue with the term nosql: sec 4.2.2
  new programming languages [sql layers?]: sec 4.4.6
  related acronyms [3]: appendix A

MASHUP
  metadata issues in mashup: pg 38
  in commercial usecases: pg 90, 91, 118, 210, 240
  benefit: pg 8
  relationship with Variety: pg 268
  relationship to linked data: sec 5.4.6
  [neg] effect on security and privacy: sec 5.5.2

METADATA
  synonymous with descriptive data: pg 14
  its role in ETL: pg 15, 16
  quality: pg 26, 31, 47, 48, 49
  as a gap affecting interoperability: pg 35, 37, 38; pg 8, 11
  in data analysis: pg 39, 40
  in persistent identifiers [PIDs]: usecase 6, 25, 32 (error on pg 254); pg 42
  related acronyms: pg 53, 55, 56
  related standards: appendix B
  related to FAIR: pg 40
  in open data: pg 19
  data checklists: pg 44, 47, 48, 49, 50, 51, 52, 53, 56, 57, 60, 62, 63
  metadata issues in mashup: pg 38
  related to cleaning: pg 39
  in relation to provenance: sec 8.6
  usecase 7 (recheck)
  usecase 32, 33, and 34 (recheck)
  usecase 45 (recheck)
  usecase 2-1 and appendix F (recheck)
  as a key aspect of data science, in template: sec 3.4
  types of, and importance to data science: sec 5.4.6, 20 hits
  definition: sec 2 and sec 3.3.3
  data fusion and integration: sec 3.2.3
  in the context of the four main characteristics of big data: sec 3.3
  in relation to vertical data scientists: sec 5.2

QUALITY
  gaps: pg 42, 44
  data checklists: pg 43, 44, 48, 49, 51, 55, 56, 58
  related to interfaces: pg 20
  as an aspect of projects and implementation: pg 14, 19
  minimum viable quality: pg 25
  levels: vol 9, p 37
  FAIR: pg 40
  governance: pg 41

SEARCH
  applicability to gaps 2 and 4: pg 37, 38, 39, 40, 42
  prevalence in use cases: pg 30, 31, 32
  as a data consumer requirement: sec 3.2; pg 17
  issues in data virtualization: pg 51
  related standards: appendix B
  usecase 15 (recheck)
  usecase 21 (recheck)
  usecase 2-1 (recheck): usecase 2-1 appendix F
  in relation to analytics, template q section 1.35??
  usecase 8 (recheck): usecase 8, 8 hits
  usecase 2 (recheck): usecase 2, 5 hits
  usecase 18 (recheck)
  usecase 34 (recheck)

Thesaurus

This could be a grouping of synonyms for topics, possibly presented in a graphic format. Each reader brings their own terminology and understanding. A subtopic of interest to readers from a particular audience may be obscured if the volumes use terminology that is unfamiliar to them.

Highlights of Select Tables

The following are some significant tables within the volumes.

Use Case to requirements mapping in Volume 3
Use Case to NBDRA mapping (Volume 4, Appendix E)
NIST Big Data Security and Privacy Safety Levels (Volume 4, Appendix A)

Additional NIST Resources

This section will provide links to other NIST documents and websites that cover these topics in more depth. The list will be organized by volume. Resources outside of NIST will not be included because we don’t have the resources to create a complete list.
Volume 1

Volume 2

Volume 3

Volume 4
Cloud Security
NIST Special Publication 500-322: Evaluation of Cloud Computing Services Based on NIST SP 800-145, February 2018.
NIST Special Publication 500-291, Version 2: NIST Cloud Computing Standards Roadmap, July 2013.
NIST Special Publication 500-299: DRAFT NIST Cloud Computing Security Reference Architecture, May 2013.
Risk Management Framework

Volume 5

Volume 6

Volume 7

Volume 8
Microservices: Microservices, Containers for Vol 8. Draft SP 800-180

Volume 9

Individual Volume Table of Contents

These are the tables of contents copied from each volume. They were copied from the drafts on 9/10/19.

Volume 1, Definitions

Volume 2, Big Data Taxonomies
Executive Summary
1 Introduction
1.1 Background
1.2 Scope and Objectives of the Definitions and Taxonomies Subgroup
1.3 Report Production
1.4 Report Structure
2 Reference Architecture Taxonomy
2.1 Actors and Roles
2.2 System Orchestrator
2.3 Data Provider
2.4 Big Data Application Provider
2.5 Big Data Framework Provider
2.6 Data Consumer
2.7 Management Fabric
2.8 Security and Privacy Fabric
3 Data Characteristic Hierarchy
3.1 Data Elements
3.2 Records
3.3 Inter-Dependent Records
3.4 Datasets
3.5 Multiple Datasets
4 Summary
Appendix A: Acronyms
Appendix B: Bibliography

Volume 3, Use Cases and General Requirements

Volume 4, Security and Privacy

Volume 5, Architectures White Paper Survey

Volume 6, Reference Architecture

Volume 7, Standards Roadmap

Volume 8, Reference Architecture Interfaces

Volume 9, Adoption and Modernization