Artificial Intelligence and Machine Learning

Vijay Gadepally, Jeremy Kepner, Lauren Milechin, Siddharth Samsi

DISTRIBUTION STATEMENT A. Approved for public release. Distribution is unlimited. This material is based upon work supported by the Under Secretary of Defense for Research and Engineering under Air Force Contract No. FA8702-15-D-0001. Any opinions, findings, conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the Under Secretary of Defense for Research and Engineering.

© 2020 Massachusetts Institute of Technology. Delivered to the U.S. Government with Unlimited Rights, as defined in DFARS Part 252.227-7013 or 7014 (Feb 2014). Notwithstanding any copyright notice, U.S. Government rights in this work are defined by DFARS 252.227-7013 or DFARS 252.227-7014 as detailed above. Use of this work other than as specifically authorized by the U.S. Government may violate any copyrights that exist in this work.

Slide contributions from: Siddharth Samsi, Albert Reuther, Jeremy Kepner, David Martinez, Lauren Milechin

Outline

- Artificial Intelligence Overview
- Machine Learning Deep Dives
  - Supervised Learning
  - Unsupervised Learning
  - Reinforcement Learning
- Conclusions/Summary


What is Artificial Intelligence?

Narrow AI: The theory and development of computer systems that perform tasks that augment human intelligence, such as perceiving, classifying, learning, abstracting, reasoning, and/or acting

General AI: Full autonomy


Definition adapted from the Oxford dictionary and input from Prof. Patrick Winston (MIT)

AI. Why Now?

Big Data + Compute Power + Machine Learning Algorithms
(Figure: three converging trends; source: DARPA / public domain)

The convergence of high performance computing, big data, and algorithms enables widespread AI development.


AI Canonical Architecture

Pipeline (left to right): Sensors and Sources supply structured and unstructured data; Data Conditioning turns data into information; Algorithms (e.g., knowledge-based methods, unsupervised and supervised learning, transfer learning, reinforcement learning, etc.) turn information into knowledge; Human-Machine Teaming turns knowledge into courses of action (CoA) across the spectrum from human, to human-machine complement, to machine; Users (missions) gain insight.

Cross-cutting foundations:
- Modern Computing: CPUs, GPUs, TPUs, neuromorphic, custom, quantum, . . .
- Robust AI: explainable AI, metrics and bias assessment, verification & validation, security (e.g., counter AI), and policy, ethics, safety and training

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action

© IEEE. Figure 1 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see
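To make the pipeline concrete, the sketch below (in Python) walks toy data through the canonical stages. Every function name and the toy cleaning/counting logic are illustrative assumptions, not part of the published architecture.

    # Minimal sketch of the AI canonical architecture as a pipeline of stages.
    # All names and the toy logic are illustrative, not from the source slides.

    def condition_data(raw_records):
        """Data conditioning: turn raw structured/unstructured inputs into information."""
        return [r.strip().lower() for r in raw_records if r]  # toy cleaning step

    def apply_algorithms(information):
        """Algorithms: supervised/unsupervised/reinforcement learning would go here."""
        knowledge = {}
        for item in information:               # toy "knowledge": token frequencies
            for token in item.split():
                knowledge[token] = knowledge.get(token, 0) + 1
        return knowledge

    def human_machine_teaming(knowledge, top_k=3):
        """Human-machine teaming: surface a ranked summary for a human to act on (CoA)."""
        ranked = sorted(knowledge.items(), key=lambda kv: kv[1], reverse=True)
        return ranked[:top_k]

    if __name__ == "__main__":
        raw = ["Sensor A reports anomaly", "sensor a reports anomaly", "Status nominal"]
        info = condition_data(raw)               # data -> information
        knowledge = apply_algorithms(info)       # information -> knowledge
        print(human_machine_teaming(knowledge))  # knowledge -> insight for users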

Select History of Artificial Intelligence

Timeline, 1950-2017 (AI Winters: 1974-1980 and 1987-1993):

- 1950: "Computing Machinery and Intelligence" ("Turing Test") published in MIND, vol. LIX
- 1955: Western Joint Computer Conference Session on Learning Machines, with MIT Lincoln Laboratory staff papers: W.A. Clark, "Generalization of Pattern Recognition in a Self-Organizing System"; G. Dineen, "Programming Pattern Recognition"; O. Selfridge, "Pattern Recognition and Modern Computers"
- 1956: Dartmouth Summer Research Project on AI (J. McCarthy, M. Minsky, N. Rochester, O. Selfridge, C. Shannon, and others)
- 1957: Frank Rosenblatt's neural networks, "Perceiving and Recognizing Automaton"
- 1957: Memory Test Computer, the first computer to simulate the operation of neural networks
- 1958: National Physical Laboratory (UK) Symposium on the Mechanization of Thought Processes (J. Forgie and J. Allen)
- 1959: Arthur Samuel, "Some Studies in Machine Learning Using the Game of Checkers," IBM Journal of R&D
- 1960: Recognizing handwritten characters, Robert Larson of the SRI AI Center
- 1961: James Slagle (Minsky student, MIT), solving freshman calculus
- 1979: "An Assessment of AI from a Lincoln Laboratory Perspective," internal MIT LL publication
- 1982: Expert systems pioneer: the DENDRAL project at Stanford
- 1984: Hidden Markov models
- 1986-present: The return of neural networks
- 1988: Statistical machine translation
- 1989: Convolutional neural networks
- 1994: Human-level spontaneous speech recognition
- 1997: IBM Deep Blue defeats reigning chess champion Garry Kasparov
- 2001-present: The availability of very large data sets
- 2005: Google's Arabic- and Chinese-to-English translation
- 2007: DARPA Grand Challenge ("Urban Challenge")
- 2011: IBM Watson defeats former Jeopardy! champions Brad Rutter and Ken Jennings
- 2012: Team from the University of Toronto (Geoff Hinton's lab) wins the ImageNet Large Scale Visual Recognition Challenge with deep-learning software
- 2014: Google's GoogleNet object classification at near-human performance
- 2015: DeepMind achieves human expert level of play on Atari games (using only raw pixels and scores)
- 2016: DeepMind AlphaGo defeats top human Go player Lee Sedol
- 2016: DARPA Cyber Grand Challenge

Adapted from: The Quest for Artificial Intelligence, Nils J. Nilsson, 2010, and MIT Lincoln Laboratory Library and Archives


Artificial Intelligence Evolution

Four Waves of AI*, each characterized by its relative strength in perceiving, learning, abstracting, and reasoning:

1. Reasoning: Handcrafted Knowledge
2. Learning: Statistical Learning (lots of data enabled non-expert systems)
3. Context: Contextual Adaptation (adding context to AI systems)
4. Abstraction: System Evolution (ability of the system to abstract)

* Waves adapted from John Launchbury, Director I2O, DARPA

Spectrum of Commercial Organizations in the Machine Intelligence Field

Source: Shivon Zilis, 2016.
© Shivon Zilis and James Cham, designed by Heidi Skinner. All rights reserved. This content is excluded from our Creative Commons license. For more information, see


Data is Critical To Breakthroughs in AI

Year | Breakthrough in AI | Dataset (First Available) | Algorithm (First Proposed)
1994 | Human-level read-speech recognition | Spoken Wall Street Journal articles and other texts (1991) | Hidden Markov Model (1984)
1997 | IBM Deep Blue defeated Garry Kasparov | 700,000 Grandmaster chess games, a.k.a. "The Extended Book" (1991) | Negascout planning algorithm (1983)
2005 | Google's Arabic- and Chinese-to-English translation | 1.8 trillion tokens from Google Web and News pages (collected in 2005) | Statistical machine translation algorithm (1988)
2011 | IBM Watson became the world Jeopardy! champion | 8.6 million documents from Wikipedia, Wiktionary, Wikiquote, and Project Gutenberg (updated in 2010) | Mixture-of-Experts algorithm (1991)
2014 | Google's GoogleNet object classification at near-human performance | ImageNet corpus of 1.5 million labeled images and 1,000 object categories (2010) | Convolutional neural network algorithm (1989)
2015 | Google's DeepMind achieved human parity in playing 29 Atari games by learning general control from video | Arcade Learning Environment dataset of over 50 Atari games (2013) | Q-learning algorithm (1992)

Average number of years from dataset availability to breakthrough: 3. Average number of years from algorithm proposal to breakthrough: 18.

Source: Train AI 2017

AI Canonical Architecture (recap)

Sensors and Sources (structured and unstructured data) → Data Conditioning → Algorithms (knowledge-based, unsupervised and supervised learning, transfer learning, reinforcement learning, etc.) → Human-Machine Teaming (CoA) → Users (Missions), underpinned by Modern Computing (CPUs, GPUs, TPUs, neuromorphic, custom, quantum) and Robust AI (explainable AI, metrics and bias assessment, verification & validation, security, and policy, ethics, safety and training).

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit; CoA = Courses of Action


Unstructured and Structured Data
(Pipeline stage: Data Conditioning)

Structured data types: speech, sensors, network logs, metadata
Unstructured data types: social media, human behavior, reports, side channel

Data Conditioning/Storage Technologies (Data to Information):

Technology | Capabilities provided
Infrastructure/Databases | Indexing/organization/structure; domain-specific languages; high-performance data access; declarative interfaces
Data Curation | Unsupervised machine learning; dimensionality reduction; clustering/pattern recognition; outlier detection
Data Labeling | Initial data exploration; highlight missing or incomplete data; reorient sensors/recapture data; look for errors/biases in collection

Data conditioning often takes up 80+% of the overall AI/ML development work.
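The curation capabilities listed above are largely unsupervised. Below is a minimal sketch of that step, assuming scikit-learn and NumPy are available (neither library is named in the slides): dimensionality reduction, clustering, and outlier detection applied to unlabeled feature vectors.

    # Minimal data-curation sketch: dimensionality reduction, clustering,
    # and outlier detection on synthetic, unlabeled feature vectors.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans
    from sklearn.ensemble import IsolationForest

    rng = np.random.default_rng(0)
    X = rng.normal(size=(500, 20))       # stand-in for conditioned feature vectors

    X_reduced = PCA(n_components=3).fit_transform(X)                    # dimensionality reduction
    clusters = KMeans(n_clusters=4, n_init=10).fit_predict(X_reduced)   # clustering / pattern finding
    flags = IsolationForest(random_state=0).fit_predict(X_reduced)      # outlier detection (-1 = outlier)

    print("cluster sizes:", np.bincount(clusters))
    print("flagged outliers:", int((flags == -1).sum()))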

Machine Learning Algorithms Taxonomy
(Pipeline stage: Algorithms)

The five "tribes" of machine learning*:
- Symbolists (e.g., expert systems)
- Bayesians (e.g., naive Bayes)
- Analogizers (e.g., SVM)
- Connectionists (e.g., DNN)
- Evolutionaries (e.g., genetic programming)

Deep neural nets are a subset of neural nets, which are a subset of machine learning, which is a subset of artificial intelligence (image adapted from "Deep Learning" by Ian Goodfellow, Yoshua Bengio, and Aaron Courville).

* "The Five Tribes of Machine Learning," Pedro Domingos

DNN = Deep Neural Networks; SVM = Support Vector Machines; Exp. Sys. = Expert Systems
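As a rough illustration of how three of these tribes differ in practice, the sketch below fits one representative model from each on the same toy classification task. It assumes scikit-learn is available; the dataset and hyperparameters are arbitrary, and the Symbolist and Evolutionary tribes are omitted because they have no direct scikit-learn counterpart.

    # Representative models from three of the five tribes on one toy task.
    from sklearn.datasets import make_classification
    from sklearn.model_selection import train_test_split
    from sklearn.naive_bayes import GaussianNB        # Bayesians
    from sklearn.svm import SVC                       # Analogizers (SVM)
    from sklearn.neural_network import MLPClassifier  # Connectionists (small NN)

    X, y = make_classification(n_samples=600, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    models = [("naive Bayes", GaussianNB()),
              ("SVM", SVC()),
              ("small neural net", MLPClassifier(hidden_layer_sizes=(32,), max_iter=1000))]

    for name, model in models:
        model.fit(X_tr, y_tr)
        print(f"{name:>16}: test accuracy = {model.score(X_te, y_te):.2f}")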


Modern AI Computing Engines
(Pipeline stage: Modern Computing)

Computing class | What it provides to AI
CPU | Most popular computing platform; general-purpose compute
GPU | Used by most for training algorithms (well suited to neural network backpropagation)
TPU | Speeds up inference time (domain-specific architecture)
Neuromorphic | Active research area
Custom | Ability to speed up specific computations of interest (e.g., graphs)
Quantum | Benefits unproven to date; recent results on HHL (linear systems of equations)

GPU = Graphics Processing Unit; TPU = Tensor Processing Unit
HHL = Harrow-Hassidim-Lloyd quantum algorithm
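As a small illustration of how software targets these computing classes, the sketch below (assuming PyTorch, which the slides do not mention) runs one training step on a GPU when one is present and otherwise falls back to the CPU; TPUs and other accelerators are reached through similar device abstractions.

    # One toy training step, placed on GPU if available, otherwise CPU.
    import torch

    device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

    model = torch.nn.Linear(128, 10).to(device)          # toy model
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

    x = torch.randn(64, 128, device=device)              # toy input batch
    y = torch.randint(0, 10, (64,), device=device)       # toy labels

    loss = torch.nn.functional.cross_entropy(model(x), y)  # forward pass
    loss.backward()                                         # backpropagation
    optimizer.step()
    print(f"ran one training step on: {device}")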

Selected Results
(Pipeline stage: Modern Computing)

This slide shows benchmark figures:

- SpGEMM performance using the Graph Processor (G102): traversed edges per second versus power (watts), comparing a projected ASIC graph processor and a measured FPGA graph processor against measured Cray XK7 Titan and Cray XT4 Franklin systems at scales from an 8-node mini-chassis up to 16k nodes (64 racks), spanning embedded through data center applications.

- Neural network processing performance (AlexNet comparison, forward-backward pass): peak GOps/second versus peak power (W) for a range of accelerators (e.g., TrueNorth, MovidiusX, TPU Edge, MIT Eyeriss, WaveSystem, WaveDPU, DGX-1, DGX-2, DGX-Station, TPU1/2/3, Arria, Turing, V100, P100, K80, Goya, GraphCoreNode, GraphCoreC2, Nervana, Xavier, Xeon Phi 7210F/7290F, 2x SkyLake-SP, Jetson TX1/TX2), with efficiency contours at 100 GigaOps/W, 1 TeraOps/W, and 10 TeraOps/W. The legend distinguishes computation precision (Int8, Int16, Float16, Float16->Float32, Float32, Float64), form factor (chip, card, system), and computation type (inference vs. training).

© IEEE. Figure 2 in Reuther, A., et al., "Survey and Benchmarking of Machine Learning Accelerators," 2019 IEEE High Performance Extreme Computing Conference (HPEC), Waltham, MA, USA, 2019, pp. 1-9, doi: 10.1109/HPEC.2019.8916327. All rights reserved. This content is excluded from our Creative Commons license. For more information, see

Reuther, Albert, et al. "Survey and Benchmarking of Machine Learning Accelerators." arXiv preprint arXiv:1908.11348 (2019).
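The accelerator figure compares devices along efficiency contours (operations per watt). The sketch below only shows how that metric is computed; the two entries are made-up placeholders, not measured values from the survey.

    # Peak efficiency (GOps per watt) from peak throughput and peak power.
    # The numbers below are illustrative placeholders, NOT survey data.
    accelerators = {
        "hypothetical_embedded_chip":   {"peak_gops": 1_000,   "peak_watts": 2},
        "hypothetical_datacenter_card": {"peak_gops": 100_000, "peak_watts": 300},
    }

    for name, spec in accelerators.items():
        efficiency = spec["peak_gops"] / spec["peak_watts"]   # GOps/W
        print(f"{name}: {efficiency:,.0f} GOps/W")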


Robust AI: Preserving Trust
(Pipeline stage: Robust AI)

Confidence Level vs. Consequence of Actions (figure): the confidence level in the machine making the decision (low to high) is plotted against the consequence of actions (low to high). Decisions with high machine confidence and low consequence are best matched to machines; decisions with low confidence and high consequence are best matched to humans; the region in between is where machines augment humans.

Robust AI features: explainable AI, metrics, validation & verification, security, and policy, ethics, safety, and training.


Importance of Robust AI

Issue | Solutions
User unfamiliarity or mistrust leads to lack of adoption | Seamless integration, model expansion, transparent uncertainty
Unknown relationship between arbitrary input and machine output | Explainability, dimensionality reduction, feature importance inference
Algorithms need to meet mission specifications | Robust training, "portfolio" methods, regularization
System vulnerable to adversarial action (both cyber and physical) | Model failure detection, red teaming
Unwanted actions when controlling heavy or dangerous machinery | Risk sensitivity, robust inference, high decision thresholds
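One of the solution patterns above, risk sensitivity with high decision thresholds, can be sketched as a simple deferral rule: the machine acts autonomously only when its confidence clears a bar that rises with the consequence of the action. The threshold values and consequence categories below are illustrative assumptions, not numbers from the slides.

    # Confidence-vs-consequence deferral rule (illustrative thresholds only).
    CONSEQUENCE_THRESHOLDS = {"low": 0.70, "medium": 0.90, "high": 0.99}

    def decide(model_confidence: float, consequence: str) -> str:
        """Return who should act, given machine confidence and action consequence."""
        if model_confidence >= CONSEQUENCE_THRESHOLDS[consequence]:
            return "machine acts"
        return "defer to human"

    if __name__ == "__main__":
        print(decide(0.95, "low"))    # machine acts
        print(decide(0.95, "high"))   # defer to human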

