Deploying Data Factory Pipelines to Microsoft Azure


Author: Alberto Alonso Marcos


Copyright © 2019 Sogeti. All rights reserved.

In this article we describe in detail the steps to follow to set up automatic deployment of Azure Data Factory (ADF) pipelines across the Development (dev), Staging (stg) and Production (prd) environments. In software development, continuous integration (CI) and continuous deployment (CD) are used to release better code faster. That possibility also exists for data engineers working with Azure Data Factory: we can move pipelines between the different environments and, working this way, several people on the team can work on the same data pipeline at the same time. In this case, we are going to work through an example of automating the deployment from dev to prd, all thanks to ADF's integration with Azure DevOps. Let's see how. The infrastructure necessary to complete this process is:

- Azure Resource Group
- Azure SQL Database
- Azure Key Vault
- Azure Data Factory
- Azure DevOps

As we mentioned earlier, we will have to "repeat" everything in the three environments we will be working with: Development, Staging and Production. The first is where we develop the pipelines collaboratively among the members of the team. The second is for Quality Assurance or testing; its objective is to verify that the pipelines really do what they should. Finally, the Production environment represents the real world.


We create the first resource group, in this case for dev, and call it sogetiaa-rgdev. From here, we begin to create all the components within this resource group. We start by creating the database sogetiaa-db-dev and its server sogetiaa-serverdev.
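The article performs these steps in the Azure portal. As a rough equivalent, here is a minimal sketch with the Azure SDK for Python (azure-identity, azure-mgmt-resource, azure-mgmt-sql); the subscription ID, location and administrator credentials are placeholders of my own, not values from the article:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.sql import SqlManagementClient

credential = DefaultAzureCredential()
subscription_id = "<subscription-id>"  # placeholder

# Resource group for the dev environment
resource_client = ResourceManagementClient(credential, subscription_id)
resource_client.resource_groups.create_or_update(
    "sogetiaa-rgdev", {"location": "westeurope"}  # location is a placeholder
)

# Logical SQL server and the dev database
sql_client = SqlManagementClient(credential, subscription_id)
sql_client.servers.begin_create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-serverdev",
    {
        "location": "westeurope",
        "administrator_login": "<admin-user>",         # placeholder
        "administrator_login_password": "<password>",  # placeholder
    },
).result()
sql_client.databases.begin_create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-serverdev",
    "sogetiaa-db-dev",
    {"location": "westeurope"},
).result()
```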

Finally, to complete this first part, we create an Azure Key Vault. We call it sogetiaa-key-dev.
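The same step scripted against azure-mgmt-keyvault, again as a hedged sketch; the tenant ID and location are placeholders:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.keyvault import KeyVaultManagementClient

credential = DefaultAzureCredential()
kv_client = KeyVaultManagementClient(credential, "<subscription-id>")

# Create the vault with an empty access-policy list; the policy for the
# data factory is added in a later step
kv_client.vaults.begin_create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-key-dev",
    {
        "location": "westeurope",  # placeholder
        "properties": {
            "tenant_id": "<tenant-id>",  # placeholder
            "sku": {"family": "A", "name": "standard"},
            "access_policies": [],
        },
    },
).result()
```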

In the second part, the first thing we must provision is the Azure Data Factory. The name that we will give it is sogetiaa-factory-dev. In this case, we leave the Git option unchecked; we will configure it later.
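For reference, provisioning the factory with azure-mgmt-datafactory might look like the sketch below; leaving out repo_configuration corresponds to leaving the Git option unchecked:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import Factory

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# No repo_configuration is passed, so the factory starts without Git
adf_client.factories.create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-factory-dev",
    Factory(location="westeurope"),  # location is a placeholder
)
```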


The next thing is to copy the connection string of the Azure SQL Database in the dev environment and create a secret in the Azure Key Vault named storage-access-key. This is good practice, as it avoids publishing sensitive information. We copy the connection string from our database.


And we create the secret by adding our connection string.
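Scripted with the azure-keyvault-secrets client, this step looks roughly like the following; the connection string value is a placeholder:

```python
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

credential = DefaultAzureCredential()
secret_client = SecretClient(
    vault_url="https://sogetiaa-key-dev.vault.azure.net/", credential=credential
)

# The value is the connection string copied from the Azure SQL Database blade
secret_client.set_secret("storage-access-key", "<connection-string>")
```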

In the next step, we add an access policy, in this case with the Get permission on secrets, selecting our sogetiaa-factory-dev as the security principal.
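A sketch of the same policy change with azure-mgmt-keyvault; the tenant ID and the factory's managed-identity object ID are placeholders you would read from the portal:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.keyvault import KeyVaultManagementClient

credential = DefaultAzureCredential()
kv_client = KeyVaultManagementClient(credential, "<subscription-id>")

# Grant the factory's managed identity the Get permission on secrets
kv_client.vaults.update_access_policy(
    "sogetiaa-rgdev",
    "sogetiaa-key-dev",
    "add",
    {
        "properties": {
            "access_policies": [
                {
                    "tenant_id": "<tenant-id>",                    # placeholder
                    "object_id": "<factory-managed-identity-id>",  # placeholder
                    "permissions": {"secrets": ["get"]},
                }
            ]
        }
    },
)
```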

Now, in the database, we create a schema named stg and a customer table in both the dbo schema and the newly created one. In the latter we insert three records and check that everything is correct.
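The article does not show the table definition, so the columns below are illustrative; a minimal sketch with pyodbc:

```python
import pyodbc

# ODBC-style connection string for the dev database (placeholder values)
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};"
    "SERVER=sogetiaa-serverdev.database.windows.net;"
    "DATABASE=sogetiaa-db-dev;UID=<user>;PWD=<password>"
)
cur = conn.cursor()

# The stg schema, plus a customer table in both schemas
# (the column layout is an assumption, not taken from the article)
cur.execute("CREATE SCHEMA stg")
cur.execute("CREATE TABLE stg.customer (id INT PRIMARY KEY, name NVARCHAR(50))")
cur.execute("CREATE TABLE dbo.customer (id INT PRIMARY KEY, name NVARCHAR(50))")

# Three records in the staging table
cur.execute(
    "INSERT INTO stg.customer VALUES (1, 'Alice'), (2, 'Bob'), (3, 'Carol')"
)
conn.commit()

# Check that everything is correct
for row in cur.execute("SELECT * FROM stg.customer"):
    print(row)
```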


Perfect. Now we just have to create the Azure Data Factory pipeline that passes the data from stg.customer to dbo.customer. To do this, the first thing is to create a pipeline and an input dataset. The implementation makes use of the secret that we previously stored in Azure Key Vault. We test the connection of the Linked Service created between ADF and our Azure Key Vault account; it is correct, so we can continue.
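The Linked Service to Key Vault can also be defined in code; a sketch with azure-mgmt-datafactory, where the linked-service name ls_keyvault is a hypothetical choice of my own:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureKeyVaultLinkedService,
    LinkedServiceResource,
)

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Linked Service pointing at the dev Key Vault account
adf_client.linked_services.create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-factory-dev",
    "ls_keyvault",  # hypothetical name
    LinkedServiceResource(
        properties=AzureKeyVaultLinkedService(
            base_url="https://sogetiaa-key-dev.vault.azure.net/"
        )
    ),
)
```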

Now we only have to include the name of the secret that we want to use from our Azure Key Vault account, and that's it. We test the connection.
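In code, the Azure SQL Linked Service resolves its connection string through that secret; a sketch follows (ls_azuresql is a hypothetical name, and it references the ls_keyvault Linked Service from the previous sketch):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureKeyVaultSecretReference,
    AzureSqlDatabaseLinkedService,
    LinkedServiceReference,
    LinkedServiceResource,
)

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# The connection string is not stored here: it is read from Key Vault
adf_client.linked_services.create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-factory-dev",
    "ls_azuresql",  # hypothetical name
    LinkedServiceResource(
        properties=AzureSqlDatabaseLinkedService(
            connection_string=AzureKeyVaultSecretReference(
                store=LinkedServiceReference(
                    reference_name="ls_keyvault", type="LinkedServiceReference"
                ),
                secret_name="storage-access-key",
            )
        )
    ),
)
```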


Now we only have to indicate the name of the source table and move on to the next step.


We check that everything is correct.

We repeat the process for the dbo.customer table and check it as well.
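Both datasets can be sketched with the same SDK; the dataset names are hypothetical, and the odd schema_type_properties_schema keyword is how the Python models expose the table schema:

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlTableDataset,
    DatasetResource,
    LinkedServiceReference,
)

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")
ls_ref = LinkedServiceReference(
    reference_name="ls_azuresql", type="LinkedServiceReference"
)

# One dataset per table: stg.customer as source, dbo.customer as sink
for ds_name, schema in [("ds_stg_customer", "stg"), ("ds_dbo_customer", "dbo")]:
    adf_client.datasets.create_or_update(
        "sogetiaa-rgdev",
        "sogetiaa-factory-dev",
        ds_name,  # hypothetical names
        DatasetResource(
            properties=AzureSqlTableDataset(
                linked_service_name=ls_ref,
                schema_type_properties_schema=schema,
                table="customer",
            )
        ),
    )
```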

And in the pipeline we include the Copy Data activity, configuring it to use our stg.customer table as the source.
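As a final sketch, the pipeline with its Copy Data activity wired from the stg dataset to the dbo dataset (activity and pipeline names are again hypothetical):

```python
from azure.identity import DefaultAzureCredential
from azure.mgmt.datafactory import DataFactoryManagementClient
from azure.mgmt.datafactory.models import (
    AzureSqlSink,
    AzureSqlSource,
    CopyActivity,
    DatasetReference,
    PipelineResource,
)

credential = DefaultAzureCredential()
adf_client = DataFactoryManagementClient(credential, "<subscription-id>")

# Copy Data activity: stg.customer is the source, dbo.customer the sink
copy_activity = CopyActivity(
    name="CopyStgToDbo",  # hypothetical name
    inputs=[DatasetReference(reference_name="ds_stg_customer", type="DatasetReference")],
    outputs=[DatasetReference(reference_name="ds_dbo_customer", type="DatasetReference")],
    source=AzureSqlSource(),
    sink=AzureSqlSink(),
)

adf_client.pipelines.create_or_update(
    "sogetiaa-rgdev",
    "sogetiaa-factory-dev",
    "pl_copy_customer",  # hypothetical name
    PipelineResource(activities=[copy_activity]),
)
```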

