powerapps-docs/maker/data-platform/azure-synapse-link-delta-lake.md
Lines changed: 13 additions & 13 deletions
@@ -6,7 +6,7 @@ ms.author: jasonhuang
 ms.reviewer: matp
 ms.service: powerapps
 ms.topic: how-to
-ms.date: 06/26/2023
+ms.date: 09/27/2023
 ms.custom: template-how-to
 ---
 # Export Dataverse data in Delta Lake format
@@ -27,10 +27,10 @@ provides the following information and shows you how to perform the following ta
 > - It's important to take these additional costs into consideration when deciding to use this feature as they are not optional and must be paid in order to continue using this feature.
 >
 > [!NOTE]
-> Synapse Link status in the Maker Portal UI now accurately reflects the Delta Lake conversion state:
-> 1. `Count` will show number of records in delta lake table.
-> 1. `Last synchronized on` Datetime will represent the last successful conversion timestamp
-> 1. `Sycn status` will be shown as **active** once the delta lake conversion completes
+> The Azure Synapse Link status in Power Apps (make.powerapps.com) reflects the delta lake conversion state:
+> - `Count` shows the number of records in the delta lake table.
+> - `Last synchronized on` Datetime represents the last successful conversion timestamp.
+> - `Sync status` is shown as **active** once the delta lake conversion completes.
 
 ## What is Delta Lake?
 
@@ -44,15 +44,15 @@ Apache Parquet is the baseline format for Delta Lake, enabling you to leverage t
 - **Reliability**: Delta Lake provides ACID transactions, ensuring data consistency and reliability even in the face of failures or concurrent access.
 - **Performance**: Delta Lake leverages the columnar storage format of Parquet, providing better compression and encoding techniques, which can lead to improved query performance compared to query CSV files.
 - **Cost-effective**: The Delta Lake file format is a highly compressed data storage technology that offers significant potential storage savings for businesses. This format is specifically designed to optimize data processing and potentially reduce the total amount of data processed or running time required for on-demand computing.
-- **Data protection compliance**: Delta Lake with Synapse Link provides tools and features including soft-delete and hard-delete to comply various data privacy regulations, including General Data Protection Regulation (GDPR).
+- **Data protection compliance**: Delta Lake with Azure Synapse Link provides tools and features including soft-delete and hard-delete to comply various data privacy regulations, including General Data Protection Regulation (GDPR).
 
-## How Delta Lake works with Synapse Link for Dataverse?
+## How Delta Lake works with Azure Synapse Link for Dataverse?
 
-When setting up an Azure Synapse Link for Dataverse, you can enable the **export to Delta Lake** feature and connect with a Synapse workspace and Spark pool. Synapse Link exports the selected Dataverse tables in CSV format at designated time intervals, processing them through a Delta Lake conversion Spark job. Upon the completion of this conversion process, CSV data is cleaned up for storage saving. Additionally, a series of maintenance jobs are scheduled to run on a daily basis, automatically performing compaction and vacuuming processes to merge and clean up data files to further optimize storage and improve query performance.
+When setting up an Azure Synapse Link for Dataverse, you can enable the **export to Delta Lake** feature and connect with a Synapse workspace and Spark pool. Azure Synapse Link exports the selected Dataverse tables in CSV format at designated time intervals, processing them through a Delta Lake conversion Spark job. Upon the completion of this conversion process, CSV data is cleaned up for storage saving. Additionally, a series of maintenance jobs are scheduled to run on a daily basis, automatically performing compaction and vacuuming processes to merge and clean up data files to further optimize storage and improve query performance.
 
 ## Prerequisites
 
-- Dataverse: You must have the Dataverse **system administrator** security role. Additionally, tables you want to export via Synapse Link must have the **Track changes** property enabled. More information: [Advanced options](create-edit-entities-portal.md#advanced-options)
+- Dataverse: You must have the Dataverse **system administrator** security role. Additionally, tables you want to export via Azure Synapse Link must have the **Track changes** property enabled. More information: [Advanced options](create-edit-entities-portal.md#advanced-options)
 - Azure Data Lake Storage Gen2: You must have an Azure Data Lake Storage Gen2 account and **Owner** and **Storage Blob Data Contributor** role access. Your storage account must enable **Hierarchical namespace** and **public network access** for both initial setup and delta sync. **Allow storage account key access** is required only for the initial setup.
 - Synapse workspace: You must have a Synapse workspace and **Owner** role in access control(IAM) and the **Synapse Administrator** role access within the Synapse Studio. The Synapse workspace must be in the same region as your Azure Data Lake Storage Gen2 account. The storage account must be added as a linked service within the Synapse Studio. To create a Synapse workspace, go to [Creating a Synapse workspace](/azure/synapse-analytics/get-started-create-workspace).
 - A Spark Pool in the connected Azure Synapse workspace with **Apache Spark Version 3.1** using this [recommended Spark Pool configuration](#recommended-spark-pool-configuration). For information about how to create a Spark Pool, go to [Create new Apache Spark pool](/azure/synapse-analytics/quickstart-create-apache-spark-pool-portal#create-new-apache-spark-pool).
@@ -73,9 +73,9 @@ This configuration can be considered a bootstrap step for average use cases.
 
 1. Sign into [Power Apps](https://make.powerapps.com/?utm_source=padocs&utm_medium=linkinadoc&utm_campaign=referralsfromdoc) and select the environment you want.
 1. On the left navigation pane, select **Azure Synapse Link**. [!INCLUDE [left-navigation-pane](../../includes/left-navigation-pane.md)]
-1. On the command bar select **+ New link**
+1. On the command bar, select **+ New link**
 1. Select **Connect to your Azure Synapse Analytics workspace**, and then select the **Subscription**, **Resource group**, and **Workspace name**.
-1. Select **Use Spark pool for processing**, and then select the pre-created **Spark pool** and **Storage account**.
+1. Select **Use Spark pool for processing**, and then select the precreated **Spark pool** and **Storage account**.
 :::image type="content" source="media/synapse-link-usesparkpool.png" alt-text="Azure Synapse Link for Dataverse configuration that includes spark pool.":::
 
 1. Select **Next**.
@@ -90,14 +90,14 @@ This configuration can be considered a bootstrap step for average use cases.
 
 ## View your data from Synapse workspace
 
-1. Select the Azure Synapse link you want, and then select **Go to Azure Synapse Analytics workspace** on the command bar.
+1. Select the Azure Synapse Link you want, and then select **Go to Azure Synapse Analytics workspace** on the command bar.
 1. Expand **Lake Databases** on the left pane, select **dataverse-***environmentNameorganizationUniqueName*,
 and then expand **Tables**. All Parquet tables are listed and available for analysis with the naming convention
 *DataverseTableName.***(Non_partitioned Table)**.
 
 ## View your data from Azure Data Lake Storage Gen2
 
-1. Select the Azure Synapse link you want, and then select **Go to Azure data lake** on the command
+1. Select the Azure Synapse Link you want, and then select **Go to Azure data lake** on the command
 bar.
 1. Select the **Containers** under **Data Storage**.
 1. Select **dataverse-***environmentName-organizationUniqueName*. All parquet files are stored in the