Machine Learning Guide

PUBLIC

SAP Data Intelligence

2024-04-20

© 2024 SAP SE or an SAP affiliate company. All rights reserved.

Content

1 Machine Learning in SAP Data Intelligence . . . 3
2 ML Scenario Manager . . . 5
  2.1 Setting Up Your Machine Learning Scenario . . . 6
      Registering a Dataset . . . 7
      Adding a Jupyter Notebook . . . 7
      Adding a Pipeline . . . 8
  2.2 Scenario Versions . . . 12
      Creating a New Version of Your Scenario . . . 13
  2.3 Exporting and Importing Scenarios . . . 13
  2.4 Metrics Explorer . . . 16
      Browsing Run Collections and Runs . . . 19
      Comparing Runs . . . 19
  2.5 Troubleshooting . . . 20
      Workspace Contains Files Larger Than 2 MB . . . 20
3 JupyterLab Environment . . . 22
  3.1 Running Kernels and Terminals . . . 22
  3.2 Setting Up a Virtual Environment . . . 23
  3.3 Accessing DI_DATA_LAKE . . . 24
  3.4 Using the Data Browser Extension . . . 25
  3.5 Copying Resources to MLSM . . . 26
4 Python SDK . . . 28
  4.1 Setting the Context . . . 29
  4.2 Retrieving ML Scenario Metadata . . . 29
  4.3 Working with Pipelines . . . 29
  4.4 Versioning an ML Scenario . . . 30
  4.5 Executing or Deploying Pipelines . . . 30
  4.6 Working with Training Pipelines . . . 31
  4.7 Accessing Artifacts . . . 33
  4.8 Tracking Metrics . . . 38
      Runs and Run Collections . . . 38
      Using the SDK to Track Metrics . . . 40
5 Deprecation of Machine Learning Components . . . 46

1 Machine Learning in SAP Data Intelligence

This guide provides an overview of key concepts related to machine learning and demonstrates how SAP Data Intelligence can be used to perform data science tasks.

SAP Data Intelligence empowers scientists to leverage the power of machine learning and gain valuable insights from their data. With the ML Scenario Manager, users can model data, while the Metrics Explorer provides tools for visualizing and analyzing data.

If you prefer to access the machine learning functionality programmatically, you can do so using the software development kit (SDK), which is available from a Jupyter notebook or from the Pipeline operator.

Data Modeling

• ML Scenario Manager [page 5]

Data Visualization

• Metrics Explorer [page 16]

Software Development Kit

• Python SDK [page 28]

2 ML Scenario Manager

ML Scenario Manager helps you to organize your data science artifacts and to manage your tasks in a central location. As a multi-faceted data science application, it is built around the key concept of the machine learning (ML) scenario, which may contain datasets, pipelines, and Jupyter notebooks.

You access ML Scenario Manager from the SAP Data Intelligence launchpad. You can then use it to create your machine learning scenarios and add artifacts. You can also manage your model performance metrics and deployment history. If necessary, you can version an ML scenario as part of your end-to-end workflow and create a new branch from a previous version.

A typical process within ML Scenario Manager involves the following:

• Managing your datasets and model artifacts
• Creating Jupyter notebooks for your experiments (see JupyterLab Environment [page 22])
• Creating and managing data pipelines
• Viewing executions and performance metrics
• Tracking your model deployments

See ML Scenario Manager in Action

If you'd like to see ML Scenario Manager in action, check out the video in the Web version of this document.

Related Information

Setting Up Your Machine Learning Scenario [page 6]

Scenario Versions [page 12]

Exporting and Importing Scenarios [page 13]

Metrics Explorer [page 16]
