Advanced Referencing
The vast majority of referencing scenarios can be handled as described in the earlier parts of this tutorial. However, consider the following schema bundles:
>>> bundle1 = {
... "$schema": "https://json-schema.org/draft/2020-12/schema",
... "$id": "https://example.com/bundle1",
... "$defs": {
... "a": {
... "$id": "https://example.com/source1/a",
... "$ref": "../source2/b"
... },
... "b": {
... "$id": "https://example.com/source1/b",
... "type": "object"
... }
... }
... }
>>> bundle2 = {
... "$schema": "https://json-schema.org/draft/2020-12/schema",
... "$id": "https://example.com/bundle2",
... "$defs": {
... "a": {
... "$id": "https://example.com/source2/a",
... "$ref": "../source1/b"
... },
... "b": {
... "$id": "https://example.com/source2/b",
... "type": "array"
... }
... }
... }
There are several conditions here, including one not visible in the schemas but plausible in many software environments:
Mutual references (which the normal
JSONSchema
constructor call cannot handle)References only to URIs from subschema
"$id"
keywords (which the normalLocalSource
orRemoteSource
configurations cannot handle)Your code may need to handle schemas with contents that it does not know in advance. (which prevents creative use of a
Source
subclass to map the subschema"$id"
URIs to their top-level"$id"
in some way, assuming that the top-level"$id"
would normally be findable)
Together, these conditions require an extra step to manage.
Deferring reference resolution
Reference resolution can be deferred using a JSONSchema
constructor parameter. Deferred references must be resolved by calling
resolve_references()
prior to calling
evaluate()
:
>>> schema1 = JSONSchema(bundle1, resolve_references=False)
>>> schema2 = JSONSchema(bundle2)
>>> schema1.resolve_references()
We could have deferred reference resolution on both schemas, and then called
resolve_references()
on both of them.
But since schema1
was already present in the catalog, the catalog was
already aware of the "$id"
URIs needed to resolve references for schema2
.
Deferring through the Catalog
If schemas such as our mutually referencing bundles are being loaded through
the Catalog
, we need to configure the catalog to
defer resolution on all loaded schemas. This can be done through
jschon.create_catalog()
:
>>> from jschon import create_catalog, URI, LocalSource
>>> deferred_catalog = create_catalog('2020-12', name='deferred', resolve_references=False)
>>> deferred_catalog.add_uri_source(
... URI('https://example.com/'),
... LocalSource(base_dir='/opt/schemas/', suffix='.json'),
... )
>>> cat_bundle1 = deferred_catalog.get_schema(URI("https://example.com/bundle1"))
>>> cat_bundle2 = deferred_catalog.get_schema(URI("https://example.com/bundle2"))
>>> cat_bundle1.references_resolved
False
>>> cat_bundle2.references_resolved
False
We can use the jschon.catalog.Catalog.resolve_references()
convenience
method to resolve all references in all schemas in a particular schema cache.
We are using the default cache here so we do not need to pass a cacheid
:
>>> deferred_catalog.resolve_references()
>>> cat_bundle1.references_resolved
True
>>> cat_bundle2.references_resolved
True
You can access this method through a JSONSchema
instance, in which case it is a good idea to pass the cacheid
unless you
are certain the schema is using the default cache:
>>> cat_bundle1.catalog.resolve_references(cacheid=cat_bundle1.cacheid)
Note that resolving references may cause additional schemas to be loaded.
resolve_references()
will resolve references
in newly loaded schemas as well, until either the entire schema cache is
fully resolved as it would have been without deferral, or an error occurs.
Metaschemas and deferred resolution
The create_metaschema()
method validates
the metaschema when it is created. Therefore it also resolves references
in the metaschema cache just prior to calling
validate()
.