Commit 624f5d9

Charlie Owen authored and Phil Sturgeon committed
Edits to the file system learning resource.
1 parent 5caf481 commit 624f5d9

1 file changed

+137
-124
lines changed

learn/file-system.md

Lines changed: 137 additions & 124 deletions
@@ -1,11 +1,13 @@
11
---
22
layout: page
3-
title: Building a mount point schema
3+
title: Modeling a file system with JSON Schema
44
---
55

6-
This example shows a possible JSON representation of a hypothetical machine's mount points as represented in an `/etc/fstab` file.
6+
> Not all constraints on an fstab file can be modeled using JSON Schema alone; however, it can represent a good number of them, and the exercise is useful to demonstrate how constraints work.
77
8-
An entry in an fstab file can have many different forms. Here is a possible representation of a full fstab:
8+
This example shows a possible JSON Schema representation of file system mount points as represented in an [`/etc/fstab`](https://en.wikipedia.org/wiki/Fstab) file.
9+
10+
An entry in an fstab file can have many different forms; here is an example:
911

1012
```json
1113
{
@@ -41,54 +43,63 @@ An entry in an fstab file can have many different forms. Here is a possible repr
4143
}
4244
```
4345

44-
Not all constraints to an fstab file can be modeled using JSON Schema alone; however, it can represent a good number of them. We will add constraints one after the other until we get to a satisfactory result.
46+
## Creating the schema outline
47+
48+
We will start with a base JSON Schema expressing the following constraints:
4549

46-
Base schema
47-
-----------
50+
* the list of entries is a JSON object;
51+
* the member names (or property names) of this object must all be valid, absolute paths;
52+
* there must be an entry for the root filesystem (i.e., `/`).
4853

49-
We will start with a base schema expressing the following constraints:
54+
Building out our JSON Schema from top to bottom:
5055

51-
- the list of entries is a JSON object;
52-
- the member names (or property names) of this object must all be valid, absolute paths;
53-
- there must be an entry for the root filesystem (ie, `/`).
56+
* The [`$id`](http://json-schema.org/latest/json-schema-core.html#rfc.section.8.2) keyword.
57+
* The [`$schema`](http://json-schema.org/latest/json-schema-core.html#rfc.section.7) keyword.
58+
* The [`type`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.1.1) validation keyword.
59+
* The [`required`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.5.3) validation keyword.
60+
* The [`properties`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.5.4) validation keyword with only a `/` entry.
61+
* The [`patternProperties`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.5.5) validation keyword to match other property names via a regular expression (note: it does not match `/`).
62+
* The [`additionalProperties`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.5.6) validation keyword.
63+
* The value here is `false` to constrain object properties to be either `/` or to match the regular expression.
5464

55-
We also want the schema to be regarded as a draft v6 schema, we must therefore specify *$schema*:
65+
> You will notice that the regular expression is explicitly anchored (with `^` and `$`): in JSON Schema, regular expressions (in `patternProperties` and in `pattern`) are not anchored by default.
5666
5767
```json
5868
{
59-
"$schema": "http://json-schema.org/draft-06/schema#",
69+
"$id": "http://example.com/fstab",
70+
"$schema": "http://json-schema.org/draft-07/schema#",
6071
"type": "object",
72+
"required": [ "/" ],
6173
"properties": {
6274
"/": {}
6375
},
6476
"patternProperties": {
6577
"^(/[^/]+)+$": {}
6678
},
6779
"additionalProperties": false,
68-
"required": [ "/" ]
6980
}
7081
```
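To make the interplay of `required`, `properties`, `patternProperties` and `additionalProperties` concrete, here is a minimal Python sketch of the same rules. The function names are hypothetical, and Python's `re` dialect differs slightly from the ECMA 262 dialect that JSON Schema recommends, so treat this as an approximation rather than a validator.

```python
import re

# Pattern copied from the schema's "patternProperties" keyword.
MOUNT_POINT = re.compile(r"^(/[^/]+)+$")


def is_allowed_key(name: str) -> bool:
    """Return True if the base schema would accept this property name."""
    if name == "/":  # declared explicitly under "properties"
        return True
    # JSON Schema patterns are unanchored, so re.search (not re.fullmatch)
    # is the closest analogue; the ^ and $ in the pattern do the anchoring.
    return MOUNT_POINT.search(name) is not None


def is_valid_fstab(doc: dict) -> bool:
    """Check "required": ["/"] and "additionalProperties": false."""
    return "/" in doc and all(is_allowed_key(key) for key in doc)


# Without the anchors, a relative path would slip through, because the
# expression only needs to match somewhere inside the string.
unanchored_hit = re.search(r"(/[^/]+)+", "relative/path") is not None
```

Dropping `^` and `$` is exactly the mistake the note above warns about: `unanchored_hit` is true even though `relative/path` is not an absolute path.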
7182

72-
Note how the valid paths constraint is enforced here:
83+
## Starting an entry
7384

74-
- we have a *properties* keyword with only a `/` entry;
75-
- we use *patternProperties* to match other property names via a regular expression (note that it does not match `/`);
76-
- as *additionalProperties* is false, it constrains object properties to be either `/` or to match the regular expression.
85+
We will start with an outline of the JSON Schema, which adds new concepts to what we've already demonstrated.
7786

78-
You will notice that the regular expression is explicitly anchored (with `^` and `$`): in JSON Schema, regular expressions (in *patternProperties* and in *pattern*) are not anchored by default.
87+
We saw these keywords in the prior exercise: `$id`, `$schema`, `type`, `required` and `properties`.
7988

80-
For now, the schemas describing individual entries are empty: we will start describing the constraints in the following paragraphs, using another schema, which we will reference from the main schema when we are ready.
89+
To this we add:
8190

82-
The entry schema - starting out
83-
-------------------------------
84-
85-
Here again we will proceed step by step. We will start with the global structure of our schema, which will be as such:
91+
* The [`description`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.10.1) annotation keyword.
92+
* The [`oneOf`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.7.3) keyword.
93+
* The [`$ref`](http://json-schema.org/latest/json-schema-core.html#rfc.section.8.3) keyword.
94+
* In this case, all references used are local to the schema using a relative fragment URI (`#/...`).
95+
* The [`definitions`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.9) keyword.
96+
* Including several key names which we will define later.
8697

8798
```json
8899
{
89-
"id": "http://some.site.somewhere/entry-schema#",
90-
"$schema": "http://json-schema.org/draft-06/schema#",
91-
"description": "schema for an fstab entry",
100+
"$id": "http://example.com/entry-schema",
101+
"$schema": "http://json-schema.org/draft-07/schema#",
102+
"description": "JSON Schema for an fstab entry",
92103
"type": "object",
93104
"required": [ "storage" ],
94105
"properties": {
@@ -111,37 +122,26 @@ Here again we will proceed step by step. We will start with the global structure
111122
}
112123
```
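The `oneOf` keyword accepts an instance only when it is valid against exactly one of the listed subschemas. A small Python sketch of that counting rule follows; the two predicate functions are hypothetical stand-ins for subschemas, not the definitions filled in later.

```python
from typing import Any, Callable, Iterable


def one_of(instance: Any, checks: Iterable[Callable[[Any], bool]]) -> bool:
    """Mimic oneOf: valid only if exactly one subschema check passes."""
    return sum(1 for check in checks if check(instance)) == 1


# Hypothetical stand-ins for two subschemas, for illustration only.
def looks_like_disk(storage: Any) -> bool:
    return isinstance(storage, dict) and storage.get("type") == "disk"


def looks_like_nfs(storage: Any) -> bool:
    return isinstance(storage, dict) and storage.get("type") == "nfs"
```

Note that two matching subschemas are as much a failure as none. This matters when two definitions share the same `type` value: they have to differ in some other constraint for `oneOf` to succeed.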
113124

114-
You should already be familiar with some of the constraints:
115-
116-
- an fstab entry must be an object (`"type": "object"`);
117-
- it must have one property with name *storage* (`"required": [ "storage" ]`);
118-
- the *storage* property must also be an object.
119-
120-
There are a couple of novelties:
121-
122-
- you will notice the appearance of JSON References, via the *$ref* keyword; here, all references used are local to the schema, and the fragment part is a URI encoded JSON Pointer;
123-
- you will notice the appearance of an *id*: this is the URI of this resource; we assume here that this URI is the actual URI of this schema;
124-
- the *oneOf* keyword is new in draft v4; its value is an array of schemas, and an instance is valid if and only if it is valid against exactly one of these schemas;
125-
- finally, the *definitions* keyword is a standardized placeholder in which you can define inline subschemas to be used in a schema.
126-
127-
### The *fstype*, *options* and *readonly* properties
128-
129-
The entry schema - adding constraints
130-
-------------------------------------
125+
## Constraining entries
131126

132-
Let's now extend this skeleton to add constraints to these three properties. Note that none of them are required:
127+
Let's now extend this skeleton to add constraints to some of the properties.
133128

134-
- we will pretend that we only support `ext3`, `ext4` and `btrfs` as filesystem types;
135-
- *options* must be an array, and the items in this array must be strings; moreover, there must be at least one item, and all items should be unique;
136-
- *readonly* must be a boolean.
129+
* Our `fstype` key uses the [`enum`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.1.2) validation keyword, restricting it to the filesystem types we pretend to support (`ext3`, `ext4` and `btrfs`).
130+
* Our `options` key uses the following:
131+
* The `type` validation keyword (see above).
132+
* The [`minItems`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.4.4) validation keyword.
133+
* The [`items`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.4.1) validation keyword.
134+
* The [`uniqueItems`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.4.5) validation keyword.
135+
* Together these say: `options` must be an array whose items are strings; there must be at least one item, and all items must be unique.
136+
* Our `readonly` key uses the `type` validation keyword and must be a boolean.
137137

138138
With these added constraints, the schema now looks like this:
139139

140140
```json
141141
{
142-
"id": "http://some.site.somewhere/entry-schema#",
143-
"$schema": "http://json-schema.org/draft-06/schema#",
144-
"description": "schema for an fstab entry",
142+
"$id": "http://example.com/entry-schema",
143+
"$schema": "http://json-schema.org/draft-07/schema#",
144+
"description": "JSON Schema for an fstab entry",
145145
"type": "object",
146146
"required": [ "storage" ],
147147
"properties": {
@@ -160,10 +160,14 @@ With these added constraints, the schema now looks like this:
160160
"options": {
161161
"type": "array",
162162
"minItems": 1,
163-
"items": { "type": "string" },
163+
"items": {
164+
"type": "string"
165+
},
164166
"uniqueItems": true
165167
},
166-
"readonly": { "type": "boolean" }
168+
"readonly": {
169+
"type": "boolean"
170+
}
167171
},
168172
"definitions": {
169173
"diskDevice": {},
@@ -174,106 +178,117 @@ With these added constraints, the schema now looks like this:
174178
}
175179
```
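As a rough illustration of what the four keywords on `options` require in combination, the following Python sketch applies the same checks by hand. The function name and the sample option strings used with it are assumptions made for this example.

```python
def valid_options(value) -> bool:
    """Approximate the constraints placed on the "options" key."""
    return (
        isinstance(value, list)                           # "type": "array"
        and len(value) >= 1                               # "minItems": 1
        and all(isinstance(item, str) for item in value)  # "items": string
        and len(set(value)) == len(value)                 # "uniqueItems": true
    )
```

For instance, `valid_options(["nosuid", "noexec"])` passes, while an empty list, a list with a repeated string, or a list containing a number does not.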
176180

177-
For now, all definitions are empty (an empty JSON Schema validates all instances). We will write schemas for individual definitions below, and fill these schemas into the entry schema.
181+
## The `diskDevice` definition
178182

179-
The *diskDevice* storage type
180-
-----------------------------
183+
One new keyword is introduced here:
181184

182-
This storage type has two required properties, *type* and *device*. The type can only be *disk*, and the device must be an absolute path starting with */dev*. No other properties are allowed:
185+
* The [`pattern`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.3.3) validation keyword states that the `device` key must be an absolute path starting with `/dev`.
183186

184187
```json
185188
{
186-
"properties": {
187-
"type": { "enum": [ "disk" ] },
188-
"device": {
189-
"type": "string",
190-
"pattern": "^/dev/[^/]+(/[^/]+)*$"
191-
}
192-
},
193-
"required": [ "type", "device" ],
194-
"additionalProperties": false
189+
"diskDevice": {
190+
"properties": {
191+
"type": {
192+
"enum": [ "disk" ]
193+
},
194+
"device": {
195+
"type": "string",
196+
"pattern": "^/dev/[^/]+(/[^/]+)*$"
197+
}
198+
},
199+
"required": [ "type", "device" ],
200+
"additionalProperties": false
201+
}
195202
}
196203
```
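The sketch below mirrors the `diskDevice` definition in plain Python, assuming the storage value has already been checked to be a JSON object (a `dict` here). The function name and the sample device paths in the usage note are hypothetical.

```python
import re

# Pattern copied from the "device" property of the diskDevice definition.
DEVICE = re.compile(r"^/dev/[^/]+(/[^/]+)*$")


def valid_disk_device(storage: dict) -> bool:
    """Approximate the diskDevice definition for a dict-shaped storage value."""
    return (
        # "required" plus "additionalProperties": false: exactly these keys
        set(storage) == {"type", "device"}
        and storage["type"] == "disk"             # "enum": ["disk"]
        and isinstance(storage["device"], str)    # "type": "string"
        and DEVICE.search(storage["device"]) is not None
    )
```

Under these assumptions, `{"type": "disk", "device": "/dev/sda1"}` is accepted, while a device of `/dev/` or `/mnt/sda1`, or any extra property, is rejected.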
197204

198-
You will have noted that we need not specify that *type* must be a string: the constraint described by *enum* is enough.
205+
## The `diskUUID` definition
199206

200-
The *diskUUID* storage type
201-
---------------------------
207+
No new keywords are introduced here.
202208

203-
This storage type has two required properties, *type* and *label*. The type can only be *disk*, and the label must be a valid UUID. No other properties are allowed:
209+
We do have a new key, `label`; the `pattern` validation keyword states that it must be a valid UUID.
204210

205211
```json
206212
{
207-
"properties": {
208-
"type": { "enum": [ "disk" ] },
209-
"label": {
210-
"type": "string",
211-
"pattern": "^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
212-
}
213-
},
214-
"required": [ "type", "label" ],
215-
"additionalProperties": false
213+
"diskUUID": {
214+
"properties": {
215+
"type": {
216+
"enum": [ "disk" ]
217+
},
218+
"label": {
219+
"type": "string",
220+
"pattern": "^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
221+
}
222+
},
223+
"required": [ "type", "label" ],
224+
"additionalProperties": false
225+
}
216226
}
217227
```
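The `label` pattern insists on the canonical 8-4-4-4-12 hexadecimal layout. As a point of comparison, this Python sketch checks a string against the same expression and against the standard library's `uuid.UUID` constructor, which is more lenient. The sample UUID values are made up for illustration.

```python
import re
import uuid

# Pattern copied from the "label" property of the diskUUID definition.
UUID_PATTERN = re.compile(
    r"^[a-fA-F0-9]{8}-[a-fA-F0-9]{4}-[a-fA-F0-9]{4}"
    r"-[a-fA-F0-9]{4}-[a-fA-F0-9]{12}$"
)


def matches_schema_pattern(label: str) -> bool:
    """True if the label has the hyphenated layout the schema demands."""
    return UUID_PATTERN.search(label) is not None


def parses_as_uuid(label: str) -> bool:
    """True if Python's uuid module can parse the label at all."""
    try:
        uuid.UUID(label)
        return True
    except ValueError:
        return False
```

A string such as `56f8e4a21c3b4d5e9f700123456789ab` (no hyphens) parses with `uuid.UUID` but fails the schema's pattern, so the pattern is the stricter of the two checks.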
218228

219-
The *nfs* storage type
220-
----------------------
229+
## The `nfs` definition
221230

222-
This storage type has three required properties: *type*, *server* and *remotePath*. What is more, the server may be either a host name, an IPv4 address or an IPv6 address.
231+
We find another new keyword:
223232

224-
For the constraints on *server*, we use a new keyword: *format*. While it is not required that *format* be supported, we will suppose that it is here:
233+
* The [`format`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.7) annotation and assertion keyword.
225234

226235
```json
227236
{
228-
"properties": {
229-
"type": { "enum": [ "nfs" ] },
230-
"remotePath": {
231-
"type": "string",
232-
"pattern": "^(/[^/]+)+$"
237+
"nfs": {
238+
"properties": {
239+
"type": { "enum": [ "nfs" ] },
240+
"remotePath": {
241+
"type": "string",
242+
"pattern": "^(/[^/]+)+$"
243+
},
244+
"server": {
245+
"type": "string",
246+
"oneOf": [
247+
{ "format": "hostname" },
248+
{ "format": "ipv4" },
249+
{ "format": "ipv6" }
250+
]
251+
}
233252
},
234-
"server": {
235-
"type": "string",
236-
"oneOf": [
237-
{ "format": "hostname" },
238-
{ "format": "ipv4" },
239-
{ "format": "ipv6" }
240-
]
241-
}
242-
},
243-
"required": [ "type", "server", "remotePath" ],
244-
"additionalProperties": false
253+
"required": [ "type", "server", "remotePath" ],
254+
"additionalProperties": false
255+
}
245256
}
246257
```
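Because `server` combines three `format` values under `oneOf`, it is worth seeing which formats a given string could satisfy. The Python sketch below uses the standard `ipaddress` module for the IP checks and a deliberately simplified hostname rule; that rule is an assumption made for this example, and real validators may implement `hostname` differently or ignore `format` altogether.

```python
import ipaddress
import re

# Simplified RFC 1123-style label check: an assumption for illustration,
# not the rule any particular validator applies.
HOSTNAME_LABEL = re.compile(r"^[A-Za-z0-9]([A-Za-z0-9-]{0,61}[A-Za-z0-9])?$")


def is_hostname(value: str) -> bool:
    labels = value.split(".")
    return 0 < len(value) <= 253 and all(
        HOSTNAME_LABEL.search(label) for label in labels
    )


def is_ipv4(value: str) -> bool:
    try:
        ipaddress.IPv4Address(value)
        return True
    except ValueError:
        return False


def is_ipv6(value: str) -> bool:
    try:
        ipaddress.IPv6Address(value)
        return True
    except ValueError:
        return False


def matching_formats(value: str) -> list:
    """List which of the three formats the value satisfies in this sketch."""
    checks = [("hostname", is_hostname), ("ipv4", is_ipv4), ("ipv6", is_ipv6)]
    return [name for name, check in checks if check(value)]
```

With this purely syntactic hostname rule, a dotted-quad address satisfies both `hostname` and `ipv4`, which `oneOf` would reject because more than one branch matches. Whether that happens in practice depends on the validator, so it is worth testing the schema with the tool you intend to use.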
247258

248-
The *tmpfs* storage type
249-
------------------------
259+
## The `tmpfs` definition
260+
261+
Our last definition introduces two new keywords:
250262

251-
This storage type has two required properties: *type* and *sizeInMB*. The size can only be an integer. What is more, we will require that the size be between 16 and 512, inclusive:
263+
* The [`minimum`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.2.4) validation keyword.
264+
* The [`maximum`](http://json-schema.org/latest/json-schema-validation.html#rfc.section.6.2.2) validation keyword.
265+
* Together these require the size be between 16 and 512, inclusive.
252266

253267
```json
254268
{
255-
"properties": {
256-
"type": { "enum": [ "tmpfs" ] },
257-
"sizeInMB": {
258-
"type": "integer",
259-
"minimum": 16,
260-
"maximum": 512
261-
}
262-
},
263-
"required": [ "type", "sizeInMB" ],
264-
"additionalProperties": false
269+
"tmpfs": {
270+
"properties": {
271+
"type": { "enum": [ "tmpfs" ] },
272+
"sizeInMB": {
273+
"type": "integer",
274+
"minimum": 16,
275+
"maximum": 512
276+
}
277+
},
278+
"required": [ "type", "sizeInMB" ],
279+
"additionalProperties": false
280+
}
265281
}
266282
```
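Both `minimum` and `maximum` are inclusive bounds, which the following Python sketch spells out for `sizeInMB`. The function name is hypothetical. The sketch is also stricter than JSON Schema in one respect: it rejects a float such as `16.0`, which draft-06 and later treat as an integer.

```python
def valid_size_in_mb(value) -> bool:
    """Approximate "type": "integer" with inclusive bounds 16 and 512."""
    # bool is a subclass of int in Python, but JSON true/false are not integers.
    if isinstance(value, bool) or not isinstance(value, int):
        return False
    return 16 <= value <= 512
```

The boundary values 16 and 512 are accepted; 15 and 513 are not.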
267283

268-
The full entry schema
269-
---------------------
284+
## The full entry schema
270285

271286
The resulting schema is quite large:
272287

273288
```json
274289
{
275290
"$id": "http://some.site.somewhere/entry-schema#",
276-
"$schema": "http://json-schema.org/draft-06/schema#",
291+
"$schema": "http://json-schema.org/draft-07/schema#",
277292
"description": "schema for an fstab entry",
278293
"type": "object",
279294
"required": [ "storage" ],
@@ -356,14 +371,13 @@ The resulting schema is quite large:
356371
}
357372
```
358373

359-
Plugging this into our main schema
360-
----------------------------------
374+
## Plugging this into our main schema
361375

362376
Now that all possible entries have been described, we can refer to the entry schema from our main schema. We will, again, use a JSON Reference here:
363377

364378
```json
365379
{
366-
"$schema": "http://json-schema.org/draft-06/schema#",
380+
"$schema": "http://json-schema.org/draft-07/schema#",
367381
"type": "object",
368382
"properties": {
369383
"/": { "$ref": "http://some.site.somewhere/entry-schema#" }
@@ -376,17 +390,16 @@ Now that all possible entries have been described, we can refer to the entry sch
376390
}
377391
```
378392

379-
Wrapping up
380-
-----------
393+
## Wrapping up
381394

382395
This example is much more advanced than the previous example; you will have learned of schema referencing and identification, you will have been introduced to other keywords. There are also a few additional points to consider.
383396

384397
### The schema can be improved
385398

386399
This is only an example for learning purposes. Some additional constraints could be described. For instance:
387400

388-
- it makes no sense for `/` to be mounted on a tmpfs filesystem;
389-
- it makes no sense to specify the filesystem type if the storage is either NFS or tmpfs.
401+
* it makes no sense for `/` to be mounted on a tmpfs filesystem;
402+
* it makes no sense to specify the filesystem type if the storage is either NFS or tmpfs.
390403

391404
As an exercise, you can always try to add these constraints. It would probably require splitting the schema further.
392405

@@ -400,6 +413,6 @@ If we take an NFS entry as an example, JSON Schema alone cannot check that the s
400413

401414
While this is not a concern if you know that the schema you write will be used by you alone, you should keep this in mind if you write a schema which other people can potentially use. The schema we have written here has some features which can be problematic for portability:
402415

403-
- *format* support is optional, and as such other tools may ignore this keyword: this can lead to a different validation outcome for the same data;
404-
- it uses regular expressions: care should be taken not to use any advanced features (such as lookarounds), since they may not be supported at the other end;
405-
- it uses *$schema* to express the need to use draft v6 compliant processing, but not all tools support draft v6.
416+
* *format* support is optional, and as such other tools may ignore this keyword: this can lead to a different validation outcome for the same data;
417+
* it uses regular expressions: care should be taken not to use any advanced features (such as lookarounds), since they may not be supported at the other end;
418+
* it uses *$schema* to express the need to use draft-07 compliant processing, but not all tools support draft-07.
