Commit 32fae08

JoanneHendrickson authored and VesaJuvonen committed
Update the migration-api-overview.md (SharePoint#2102)
Added section on MD5 change to ship disk option, formatting and grammar edits.
1 parent fbca359 commit 32fae08

1 file changed: +44 -12 lines changed

docs/apis/migration-api-overview.md

Lines changed: 44 additions & 12 deletions
Original file line number | Diff line number | Diff line change
@@ -48,7 +48,7 @@ The required permissions are as follows in the Azure Storage API:
4848
SharedAccessBlobPermissions.List)
4949
```
5050

51-
**Note:** The change to enforce Read and List permissions on the SAS token is coming in a future build, and until then will be not be enforced, however it is best practice to use these values.
51+
**Note:** The change to enforce Read and List permissions on the SAS token is coming in a future build. Until then it will not be enforced. However, it is a best practice to use these values.
5252

5353
All files in the container must have at least a single snapshot applied to them to ensure that no file modification can occur from the customer during import. Any file that does not have a snapshot will be skipped during import and an error thrown, although the job will attempt to continue to import. The import pipeline will use the latest snapshot of the file available at the time of import. The following is an example of code that might be used to create a snapshot on a file after it is uploaded to Azure Blob Storage:
5454

@@ -59,7 +59,7 @@ blob.CreateSnapshot();
5959
```
6060

6161
> [!NOTE]
62-
> The change to require and use latest SnapShots on all files is coming in a future build, and until then will be ignored.
62+
> The change to require and use the latest SnapShots on all files is coming in a future build, and until then will be ignored.
6363
6464
##### azureContainerManifestUri
6565

@@ -133,8 +133,8 @@ SPMigrationJobState is an enumeration that tracks possible major states in the i
133133
|**Member name**|**Description**|
134134
|:-----|:-----|
135135
|None |Migration job is currently unknown to the queue, either through completion and removal, or invalid job identifier. Value=0.|
136-
|Queued |Migration job is currently known by the queue, and not being processed. Value=2.|
137-
|Processing |Migration job is currently known by the queue, and is being actively processed. Value=4.|
136+
|Queued |Migration job is currently known by the queue and not being processed. Value=2.|
137+
|Processing |Migration job is currently known by the queue and is being actively processed. Value=4.|
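
The numeric values in the table can be mirrored on the client side. The following is a minimal Python sketch, not part of the documented API: the `SPMigrationJobState` values come from the table above, while the member spelling `NONE` (Python reserves `None`) and the helper `is_known_to_queue` are assumptions made for illustration.

```python
from enum import IntEnum


class SPMigrationJobState(IntEnum):
    """Client-side mirror of the documented states (values from the table)."""
    NONE = 0        # unknown to the queue: completed and removed, or invalid job ID
    QUEUED = 2      # known to the queue, not yet being processed
    PROCESSING = 4  # known to the queue and actively being processed


def is_known_to_queue(state_value: int) -> bool:
    """Hypothetical helper: True while the queue still tracks the job."""
    return SPMigrationJobState(state_value) is not SPMigrationJobState.NONE
```

Because `None` covers both completion and an invalid job identifier, a caller would still need the import logs to tell the two cases apart.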
138138

139139
## Import Package Structure
140140

@@ -153,9 +153,9 @@ Package structure is based on a constrained version of the Content Deployment pa
153153

154154
### Content structure
155155

156-
File content that is referenced within the manifest of the package structure must be stored in either a flat or hierarchical structure within the Azure Blob Store Container defined by the CreateMigrationJob’s `azureContainerSourceUri` parameter. For example import packages generated form a legacy version export will not be hierarchical, and will instead have all files stored at the root level with a pattern like ########.dat where the # symbols are hexadecimal characters starting at 0 and no file names are repeated within a package. Alternately, a package generated from a file share can have the source folder hierarchy and file names preserved in the same hierarchy.
156+
File content that is referenced within the manifest of the package structure must be stored in either a flat or hierarchical structure within the Azure Blob Store Container defined by the CreateMigrationJob’s `azureContainerSourceUri` parameter. For example, import packages generated from a legacy version export will not be hierarchical, and will instead have all files stored at the root level with a pattern like ########.dat, where the # symbols are hexadecimal characters starting at 0 and no file names are repeated within a package. Alternatively, a package generated from a file share can have the source folder hierarchy and file names preserved in the same hierarchy.
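
As an illustration of the flat naming pattern, the following Python sketch checks whether a blob name looks like `########.dat`. It assumes exactly eight hexadecimal characters (one per `#`) and accepts either letter case; both details, and the function names, are assumptions rather than documented rules.

```python
import re

# Assumed reading of "########.dat": eight hexadecimal characters, then ".dat".
FLAT_NAME = re.compile(r"[0-9A-Fa-f]{8}\.dat")


def is_flat_package_name(blob_name: str) -> bool:
    """Return True if the name matches the assumed flat-package pattern."""
    return FLAT_NAME.fullmatch(blob_name) is not None


def repeated_names(blob_names):
    """Return the set of names that occur more than once in a package."""
    seen, repeated = set(), set()
    for name in blob_names:
        if name in seen:
            repeated.add(name)
        seen.add(name)
    return repeated
```

A hierarchical package built from a file share would fail `is_flat_package_name` by design, since it keeps the source folder names and file names.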
157157

158-
The main requirement for the structure is that the FileValue references in the **Manifest.XML** file must refer to the exact name and physical hierarchy that the content is stored in within the Azure Blob Store location for import. The destination file names and folder hierarchy from the import operation are not directly related to the physical naming and hierarchy, and are instead defined through the **Manifest.XML** file.
158+
The main requirement for the structure is that the FileValue references in the **Manifest.XML** file must refer to the exact name and physical hierarchy that the content is stored in within the Azure Blob Store location for import. The destination file names and folder hierarchy from the import operation are not directly related to the physical naming and hierarchy and are instead defined through the **Manifest.XML** file.
159159

160160
### ExportSettings.XML
161161

@@ -177,7 +177,7 @@ The **Manifest.XML** is the primary descriptor for metadata within the package,
177177

178178
The main requirement for **Manifest.XML** to import successfully through the pipeline is that the Web ID and Document Library ID/List ID be consistent with the target location. If a Web ID is used that doesn’t match the target location, errors will occur because the parent web for the import operation cannot be found.
179179

180-
Likewise an incorrect Document Library ID/List ID will prevent the importation into the target Document Library or List. IDs should never be reused within the same site collection, so same packages should not be imported to the same target site collection regardless of the destination web.
180+
Likewise, an incorrect Document Library ID/List ID will prevent the importation into the target Document Library or List. IDs should never be reused within the same site collection, so the same package should not be imported to the same target site collection more than once, regardless of the destination web.
181181

182182
For individual files and folders within the document library or list, their identifiers should be consistent between import events to the same location. Specifically, performing an import of a package generated from a file share would initially require generating new GUIDs for each file and folder, along with matching GUIDs for the list items that represent them. Therefore, performing a second import against the same target using the same package would keep the same IDs, but performing a second import against the same target using a new package for the same content would result in ID conflicts and import errors for all items in conflict.
183183

@@ -229,7 +229,7 @@ Url="/personal/username/_catalog/users" />
229229

230230
The **UserGroupMap.XML** file is expected to be at the root of the Azure Blob Store Container defined by the CreateMigrationJob’s `azureContainerManifestUri` parameter. This required file is validated using the constrained **DeploymentUserGroupMap.XSD**, which is unchanged from the current published full 2013 package schema.
231231

232-
The **UserGroupMap.XML** file may not contain any User or Group entries, but doing so will prevent author or security information from being populated during import and warnings will be logged in this case. Login and SID values for users must be either adjusted to match the values in SharePoint Online, or if the account no longer should exist can be listed as `IsDeleted = “true”` to prevent lookup failures and additional slowdown during the import operation.
232+
The **UserGroupMap.XML** file may contain no User or Group entries, but in that case author and security information will not be populated during import, and warnings will be logged. Login and SID values for users must either be adjusted to match the values in SharePoint Online or, if the account should no longer exist, be listed as `IsDeleted = “true”` to prevent lookup failures and additional slowdown during the import operation.
233233

234234
### ViewFormsList.XML
235235

@@ -245,11 +245,43 @@ Upon completion, these logs will be copied to the `azureContainerManifestUri` lo
245245

246246
Several log types can be included such as the full import log, along with warning and error files that contain only the subset of import warnings or errors respectively. Log files have unique `datetime` and `job id` stamps to allow each attempted import event to have a unique log for better debugging purposes.
247247

248+
## Changes for those using the "Ship Disk" option
249+
250+
To use the Migration API, you must have a temporary storage container in Azure. When uploading files into the temporary storage, an MD5 is required as a property on every file. However, when shipping the data on hard drives, this MD5 property doesn’t get assigned automatically. As a workaround, we have adapted the Migration API to allow the MD5 to be passed for every file as part of the manifest. This also applies to initialization vector (IV) values when encrypting the data.
251+
252+
Since the MD5 is generated at the source instead of at the upload time in Azure, Microsoft can confirm the integrity of the file directly against the source MD5.
253+
254+
### What is stored in those Azure Blob Containers?
255+
256+
The Migration API requires the Azure Container for content passing and also for log and queue reporting. The container contents can be summarized as follows:<br>
257+
258+
|**Content**|**Manifest**|
259+
|:-----|:-----|
260+
|Files and folders|XML files|
261+
262+
263+
There are two new optional parameters in manifest.xml:
264+
265+
- MD5Hash <br>
266+
- InitializationVector
267+
268+
269+
#### Preparing the package
270+
The method for calling the migration job doesn’t change; only the package generation needs to be changed.
271+
272+
In the Manifest container one file is named Manifest.xml. There are 2 optional attributes added to the file node: *MD5Hash* and *InitializationVector*. <br>
273+
274+
Example:
275+
276+
```xml
277+
<File MD5Hash="CXPP/MWYxY87NjjnLZrFg==" InitializationVector="4WlC5zQK0r9s39LoB2w==" />
278+
```
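
Package generation tooling could compute these attribute values at the source. The following Python sketch is illustrative only: it assumes both values are Base64-encoded (as the example above suggests), that `MD5Hash` is the MD5 digest of the file bytes, and that a real `File` node carries other attributes omitted here. The function names are hypothetical.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET
from typing import Optional


def md5_base64(content: bytes) -> str:
    """Base64 of the raw 16-byte MD5 digest (assumed encoding for MD5Hash)."""
    return base64.b64encode(hashlib.md5(content).digest()).decode("ascii")


def file_node(content: bytes, iv: Optional[bytes] = None) -> str:
    """Build a bare File element carrying only the two optional attributes."""
    node = ET.Element("File")
    node.set("MD5Hash", md5_base64(content))
    if iv is not None:
        # Only relevant when the content is encrypted; encoding is an assumption.
        node.set("InitializationVector", base64.b64encode(iv).decode("ascii"))
    return ET.tostring(node, encoding="unicode")
```

Computing the hash before the data leaves the source is what allows the integrity check to run against the source MD5 rather than one generated at upload time.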
279+
248280
## Best Practices and Special Mentions
249281

250282
### Package size
251283

252-
Even if the API support 15GB files, we recommend package sizes of up to 250 MB OR 250 items (depending which one comes first). If you have one large files larger than that recommended size limit you should send it in its own package. Same goes for versions, each versions counts against the size limit and item count. Additionally all the versions of a file should be in the same package.
284+
Even though the API supports 15 GB files, we recommend package sizes of up to 250 MB or 250 items (whichever comes first). If you have a file larger than that recommended size limit, you should send it in its own package. The same applies to versions; each version counts against the size limit and item count. Additionally, all the versions of a file should be in the same package.
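
The recommendation can be sketched as a simple greedy planner. This Python example is an assumption-laden illustration, not part of the Migration API: it treats 1 MB as 1024 × 1024 bytes, counts every entry (including each version) as one item, and does not try to keep all versions of a file together, which real tooling would need to handle. The names `plan_packages`, `MAX_BYTES`, and `MAX_ITEMS` are hypothetical.

```python
MAX_BYTES = 250 * 1024 * 1024  # recommended package size (assumes binary megabytes)
MAX_ITEMS = 250                # recommended item count per package


def plan_packages(items):
    """Group (name, size_in_bytes) entries into packages within both limits.

    An entry larger than MAX_BYTES is placed in a package of its own.
    """
    packages, current, current_bytes = [], [], 0
    for name, size in items:
        if size > MAX_BYTES:
            packages.append([(name, size)])
            continue
        # Close the current package if adding this entry would break a limit.
        if current and (len(current) >= MAX_ITEMS or current_bytes + size > MAX_BYTES):
            packages.append(current)
            current, current_bytes = [], 0
        current.append((name, size))
        current_bytes += size
    if current:
        packages.append(current)
    return packages
```

With 300 one-byte entries, this planner produces one package of 250 items and a second of 50.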
253285

254286
### File size
255287

@@ -267,7 +299,7 @@ Import packages can have references to multiple versions of a file, major and mi
267299

268300
The identifiers in the import package are used explicitly during import to identify content. This allows preservation of existing identifiers for document library contents from a source environment. However, it also adds complexity during import package creation or transformation, because the package must explicitly reference the target web and list identifiers. Content type identifiers, file/folder item GUIDs, and list item integer identifiers are all preserved during import. If incorrect identifiers are specified in the package, import will fail.
269301

270-
Additionally due to identifier preservation, import events can potentially be done in successive iterations using different packages, allowing items to potentially move in location if their identifiers have not changed.
302+
Additionally, due to identifier preservation, import events can potentially be done in successive iterations using different packages, allowing items to potentially move in location if their identifiers have not changed.
271303

272304
<a name="OverwriteAPI"> </a>
273305

@@ -289,11 +321,11 @@ To prevent unintended file modification of the source blobs, the import pipeline
289321

290322
### Security and encryption
291323

292-
The import pipeline is using Azure Blob Storage security model as is. This means we will not do any special treatment for those azure containers that would differentiate from any other azure containers. Additionally the import pipeline currently does not accept encryption keys for content from the customer. Any encrypted content will be treated as opaque files that SharePoint may list, but be unable to index, the same as if encrypted files were uploaded through the UI to the environment.
324+
The import pipeline uses the Azure Blob Storage security model as is. This means we will not apply any special treatment to those Azure containers that would differentiate them from any other Azure containers. Additionally, the import pipeline currently does not accept encryption keys for content from the customer. Any encrypted content will be treated as opaque files that SharePoint may list but be unable to index, the same as if encrypted files were uploaded through the UI to the environment.
293325

294326
### Events and event handlers
295327

296-
The import pipeline allows event handlers to be referenced on list items, but doesn’t allow defining event handlers at the list level at this time. The import pipeline does not fire events as items are imported, so existing event handlers will not fire due to the import event. 
328+
The import pipeline allows event handlers to be referenced on list items but doesn’t allow defining event handlers at the list level at this time. The import pipeline does not fire events as items are imported, so existing event handlers will not fire due to the import event. 
297329

298330
## Appendices
299331

0 commit comments
