### YamlMime:FAQ
metadata:
  title: Frequently asked questions about exporting Dataverse table data to Azure data lake
  description: Get answers to frequently asked questions about Power Apps export to data lake.
  author: sabinn-msft
  ms.service: powerapps
  ms.search.keywords:
  ms.date: 02/10/2021
  ms.author: sabinn
  ms.reviewer: matp
title: Frequently asked questions about exporting Dataverse table data to Azure data lake
summary: This article provides answers to frequently asked questions about exporting Dataverse table data to Azure data lake.
sections:
- name: General
  questions:
  - question: When should I use a yearly or monthly partition strategy?
    answer: |
      For Dataverse tables where the data volume within a year is high, we recommend that you use monthly partitions. Doing so results in smaller files and better performance. Additionally, if the rows in Dataverse tables are updated frequently, splitting the data into multiple smaller files helps improve performance for in-place update scenarios.
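      For example, with a monthly strategy each table folder contains one file per month instead of one file per year, so consumers can read only the partitions they need. Below is a minimal sketch, assuming the exported files follow a `<year>-<month>.csv` naming pattern and carry a header row; the exact layout and schema handling in your lake may differ (for example, schema metadata may live in a separate model.json file):

      ```python
      from pathlib import Path

      import pandas as pd

      # Hypothetical mount path for an exported "account" table folder.
      table_dir = Path("/lake/dataverse-container/account")

      # Read only the 2021 monthly partitions, e.g. 2021-01.csv, 2021-02.csv, ...
      monthly_files = sorted(table_dir.glob("2021-*.csv"))
      accounts_2021 = pd.concat((pd.read_csv(f) for f in monthly_files), ignore_index=True)
      print(len(accounts_2021), "rows across", len(monthly_files), "monthly partitions")
      ```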
  - question: When do I use Append only mode for a historical view of changes?
    answer: |
      Append only mode is the recommended option for writing Dataverse table data to the lake, especially when data volumes are high within a partition and the data changes frequently. It's a commonly used and highly recommended option for enterprise customers. You can also choose this mode for scenarios where the intent is to incrementally review changes from Dataverse and process them for ETL, AI, and ML scenarios. Because this mode provides a history of changes instead of only the latest change from an in-place update, it enables time series AI scenarios, such as prediction or forecasting analytics based on historical values.
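      As a minimal sketch of consuming this history, assuming a hypothetical partition path and a hypothetical modification-timestamp column named `SinkModifiedOn` (neither name is defined by this article):

      ```python
      import pandas as pd

      # In append-only mode, every create, update, and delete is a new row,
      # so the partition file doubles as a change log for the table.
      history = pd.read_csv("account/2021-01.csv", parse_dates=["SinkModifiedOn"])

      # Example: a per-day count of changes, usable as a simple time series
      # input for forecasting or prediction scenarios.
      changes_per_day = (history.set_index("SinkModifiedOn")
                                .resample("D")
                                .size())
      print(changes_per_day.head())
      ```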
  - question: What happens when I add a column?
    answer: |
      When you add a new column to a table in the source, it's also added at the end of the file in the destination, in the corresponding file partition. Rows that existed before the column was added won't show the new column, but new or updated rows will include it.
  - question: What happens when I delete a column?
    answer: |
      When you delete a column from a table in the source, the column isn't dropped from the destination. Instead, the column is no longer updated: its value is set to null in new or updated rows, while previous rows are preserved.
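      As an illustration covering both this question and the previous one, here's a hypothetical sequence of destination rows; the table and column names are invented:

      ```python
      # A table starts with two columns: a record ID and a "name" column.
      row_v1 = ["id-1", "Contoso"]        # written before any schema change
      # A "creditlimit" column is added in the source: it appears at the end
      # of the file, but only in rows written afterward.
      row_v2 = ["id-2", "Fabrikam", "50000"]
      # The "name" column is then deleted in the source: it stays in the
      # destination and is simply null in rows written afterward.
      row_v3 = ["id-3", None, "75000"]
      print(row_v1, row_v2, row_v3, sep="\n")
      ```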
  - question: What happens when I delete a row?
    answer: |
      Deleting a row is handled differently based on the data write option you choose:
      - In-place update: This is the default mode. When you delete a table row in this mode, the row is also deleted from the corresponding data partition in the Azure data lake. In other words, data is hard-deleted from the destination.
      - Append-only: In this mode, when a Dataverse table row is deleted, it isn't hard-deleted from the destination. Instead, a row with isDeleted=True is appended to the file in the corresponding data partition in the Azure data lake, as shown in the sketch after this list.
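      In append-only mode, downstream consumers typically reconstruct the latest state of each record and filter out these delete markers. A minimal, self-contained sketch with the same hypothetical path and column names used earlier (`Id`, `SinkModifiedOn`, `isDeleted`):

      ```python
      import pandas as pd

      # Hypothetical append-only partition file.
      history = pd.read_csv("account/2021-01.csv", parse_dates=["SinkModifiedOn"])

      # The most recent version of each record decides whether it still exists.
      latest = (history.sort_values("SinkModifiedOn")
                       .groupby("Id", as_index=False)
                       .last())

      active = latest[latest["isDeleted"] != True]      # current rows
      tombstones = latest[latest["isDeleted"] == True]  # deleted rows, kept as history
      ```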
  - question: What happens when I add a new table to sync to the data lake?
    answer: |
      Export to data lake processes a fixed set of tables for initial sync at any given time. When a new table is added to the export to data lake profile, the incremental (delta) sync for the existing tables is paused. Delta messages for existing tables continue to queue, but export to data lake won't process any messages from the delta queue until the initial sync for the newly added table is complete. This is true whether you create a new linked lake, create a new export to data lake profile, or add a table to an existing profile. For new or existing profiles, we recommend that you add no more than five tables to sync to the data lake at a time; adding more tables won't expedite the sync process. While the initial sync for the newly added tables is ongoing, any delta changes on those tables are queued into a single queue, and export to data lake won't process those delta changes until the initial sync for all previously added tables is complete.
  - question: Does adding multiple Dataverse tables simultaneously to the lake impact performance?
    answer: |
      Adding multiple tables to export to data lake affects performance in the following scenarios:
      - Initial sync: When tables are first added, only a fixed set of tables is synced at a time for the initial data. We recommend adding no more than five tables at a time to a new or existing linked lake or export to data lake profile. Adding more than the recommended number of tables doesn't expedite the sync, because only that fixed set of tables is processed at any given time.
      - Delta sync: After the initial sync of the tables is complete, any change in Dataverse to the selected tables is added to a common queue in export to data lake. The queue is processed by a fixed set of parallel listeners. If you add more tables, export to data lake still processes the messages, but a higher number of messages adds latency before the data syncs to the lake.

additionalContent: |

  ## See also

  [Export table data to Azure Data Lake Storage Gen2](export-to-data-lake.md)