powerapps-docs/maker/data-platform/export-to-data-lake-data-adf.md
title: "Ingest Microsoft Dataverse data with Azure Data Factory | MicrosoftDocs"
description: Learn how to use Azure Data Factory to create dataflows, transform, and run analysis on Dataverse data
ms.custom: ""
ms.date: 03/22/2021
ms.reviewer: "matp"
author: sabinn-msft
ms.service: powerapps
This article shows you how to perform the following tasks:

1. Set the Data Lake Storage Gen2 storage account with the Dataverse data as a *source* in a Data Factory dataflow.

2. Transform the Dataverse data in Data Factory with a dataflow.

3. Set the Data Lake Storage Gen2 storage account with the Dataverse data as a *sink* in a Data Factory dataflow.
## Prerequisites

This section describes the prerequisites necessary to ingest exported Dataverse data with Data Factory.

### Azure roles

The user account that's used to sign in to Azure must be a member of the *contributor* or *owner* role, or an *administrator* of the Azure subscription. To view the permissions that you have in the subscription, go to the [Azure portal](https://portal.azure.com/), select your username in the upper-right corner, select **...**, and then select **My permissions**. If you have access to multiple subscriptions, select the appropriate one. To create and manage child resources for Data Factory in the Azure portal, including datasets, linked services, pipelines, triggers, and integration runtimes, you must belong to the *Data Factory Contributor* role at the resource group level or above.

### Export to data lake

This guide assumes that you've already exported Dataverse data by using the [Export to Data Lake service](export-to-data-lake.md).

In this example, account table data is exported to the data lake.
This guide assumes that you've already created a data factory under the same subscription and resource group as the storage account containing the exported Dataverse data.
## Set the Data Lake Storage Gen2 storage account as a source

1. Open [Azure Data Factory](https://ms-adf.azure.com/en-us/datafactories) and select the data factory that is on the same subscription and resource group as the storage account containing your exported Dataverse data. Then select **Create data flow** from the home page.

2. Turn on **Data flow debug** mode and select your preferred time to live. This might take up to 10 minutes, but you can proceed with the following steps.
4. Under **Source settings**, do the following:

   - **Output stream name**: Enter the name you want.
   - **Source type**: Select **Common Data Model**.
5. Under **Source options**, do the following:

   - **Metadata format**: Select **Model.json**.
   - **Root location**: Enter the container name in the first box (**Container**), or **Browse** for the container name and select **OK**.
   - **Entity**: Enter the table name, or **Browse** for the table.

6. Check the **Projection** tab to ensure that your schema has been imported successfully. If you don't see any columns, select **Schema options** and check the **Infer drifted column types** option. Configure the formatting options to match your data set, and then select **Apply**.

7. You can view your data in the **Data preview** tab to ensure that the source creation was complete and accurate.
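The **Model.json** metadata format selected above refers to the CDM metadata document that the Export to Data Lake service writes alongside the data files. As a rough sanity check outside of Data Factory, you can list the tables a model.json describes. This is a sketch only: the `entities` array with `name` fields follows the CDM model.json layout, and the sample document here is an invented fragment, not a real exported file.

```python
import json

# Hypothetical, heavily abbreviated model.json fragment; a real file exported
# by Export to Data Lake carries far more metadata per entity.
sample_model_json = """
{
  "name": "cdm",
  "entities": [
    { "name": "account", "attributes": [] }
  ]
}
"""

def list_entities(model_json_text):
    """Return the entity (table) names declared in a model.json document."""
    model = json.loads(model_json_text)
    return [entity["name"] for entity in model.get("entities", [])]

print(list_entities(sample_model_json))  # ['account']
```

If the table you exported (here, *account*) doesn't appear in the list, the source's **Entity** setting won't have anything to bind to.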
138
91
139
-
-**Schema linked service**: Select the same storage container as the source settings.
140
-
-**Container**: Enter the container name.
141
-
-**Corpus folder**: Leave blank.
142
-
-**table**: Enter text in the format **/*table*Res.cdm.json/*table***, replacing *table* with the table name you want, such as account.
92
+
## Transform your Dataverse data
93
+
After setting the exported Dataverse data in the Data Lake Storage Gen2 storage account as a source in the Data Factory dataflow, there are many possibilities for transforming your data. More information: [Azure Data Factory](/azure/data-factory/introduction)
143
94
144
-

95
+
Follow these instructions to create a rank for the each row by the *revenue* of the account.
1. Select **+** in the lower-right corner of the previous transformation, and then search for and select **Rank**.

2. On the **Rank settings** tab, do the following:

   - **Output stream name**: Enter the name you want, such as *Rank1*.
   - **Incoming stream**: Select the source name you want. In this case, the source from the previous step.
   - **Options**: Leave the options unchecked.
   - **Rank column**: Enter a name for the generated rank column.
   - **Sort conditions**: Select the *revenue* column, and sort in *Descending* order.
3. You can view your data in the **Data preview** tab, where you'll find the new *revenueRank* column in the right-most position.
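Conceptually, the Rank transformation configured above orders the rows by *revenue* descending and numbers them starting at 1. A minimal Python sketch of the equivalent computation follows; the sample account rows are invented for illustration.

```python
# Invented sample rows standing in for exported account data.
accounts = [
    {"name": "Contoso", "revenue": 500000},
    {"name": "Fabrikam", "revenue": 1250000},
    {"name": "Adventure Works", "revenue": 800000},
]

# Sort by revenue descending, then assign a 1-based rank, mirroring the
# Rank transformation's sort condition and generated rank column.
ranked = sorted(accounts, key=lambda row: row["revenue"], reverse=True)
for position, row in enumerate(ranked, start=1):
    row["revenueRank"] = position

print([(row["name"], row["revenueRank"]) for row in ranked])
# [('Fabrikam', 1), ('Adventure Works', 2), ('Contoso', 3)]
```

Note that the real transformation also offers options such as dense ranking for tied values; this sketch simply numbers rows in sorted order.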
## Set the Data Lake Storage Gen2 storage account as a sink

Ultimately, you must set a sink for your dataflow. Follow these instructions to place your transformed data as a delimited text file in the data lake.

1. Select **+** in the lower-right corner of the previous transformation, and then search for and select **Sink**.
2. On the **Sink** tab, do the following:

   - **Output stream name**: Enter the name you want, such as *Sink1*.
   - **Incoming stream**: Select the source name you want. In this case, the source from the previous step.
   - **Sink type**: Select **DelimitedText**.
   - **Linked service**: Select your Data Lake Storage Gen2 storage container that has the data you exported by using the Export to Data Lake service.
3. On the **Settings** tab, do the following:

   - **Folder path**: Enter the container name in the first box (**File system**), or **Browse** for the container name and select **OK**.
   - **File name option**: Select **Output to single file**.
   - **Output to single file**: Enter a file name, such as *ADFOutput*.
   - Leave all other default settings.

4. On the **Optimize** tab, set the **Partition option** to **Single partition**.

5. You can view your data in the **Data preview** tab.
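With **Output to single file** and a single partition, the sink writes all transformed rows to one delimited text file. As a rough sketch of the shape of that output, here is the equivalent written with Python's `csv` module; the column names and sample values are assumptions carried over from the rank example, and the real sink's defaults (delimiter, headers, quoting) are configurable in Data Factory.

```python
import csv
import io

# Invented sample rows representing the ranked account data.
rows = [
    {"name": "Fabrikam", "revenue": 1250000, "revenueRank": 1},
    {"name": "Adventure Works", "revenue": 800000, "revenueRank": 2},
    {"name": "Contoso", "revenue": 500000, "revenueRank": 3},
]

# Write everything to a single comma-delimited "file" (an in-memory buffer
# here, standing in for a file such as ADFOutput in the container).
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["name", "revenue", "revenueRank"])
writer.writeheader()
writer.writerows(rows)

print(buffer.getvalue())
```

The result is one header line followed by one line per row, which is what you should see when you open the output file in the storage container.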
## Run your dataflow

[Analyze Dataverse data in Azure Data Lake Storage Gen2 with Power BI](export-to-data-lake-data-powerbi.md)