B2SHARE v2 vs v3 JSON Structural Differences
Overview
This document provides a technical comparison of the B2SHARE v2 and B2SHARE v3 JSON record structures. B2SHARE v3 is based on InvenioRDM v13, but introduces several custom extensions and data model adjustments, particularly in metadata, community handling, and PID organization.
All examples below use fictional data for demonstration purposes only.
Structural Overview
| Aspect | B2SHARE v2 | B2SHARE v3 |
|---|---|---|
| Base system | Invenio 2.x | Customized InvenioRDM v13 |
| Record ID | UUID (b1a2c3d4e5f6...) | Short alphanumeric ID (abcd1-efg23) |
| Record model | Flat JSON (metadata and files at top level) | Hierarchical JSON (with parent, pids, access, etc.) |
| Schema handling | $schema URI per community | community_extension_schema field with associated community_extension |
| File metadata | Simple list of files with metadata | Nested object with per-file metadata, access, and PIDs |
| PID management | Embedded DOI/ePIC in metadata | Centralized pids section with structured providers |
| Access control | open_access and embargo_date | Object-based model (access.record, embargo) |
| Communities | Community UUID under metadata.community | Full parent.communities.entries structure with slug and policies |
| Licensing | Simple key-value structure | SPDX-aligned rights[] array |
Key Section Differences
Top-level structure
v2 example
{
  "id": "b1a2c3d4e5f6",
  "_pid": "10.23728/b2share.b1a2c3d4e5f6",
  "metadata": {...},
  "files": [...]
}
v3 example
{
  "id": "abcd1-efg23",
  "parent": {...},
  "pids": {...},
  "metadata": {...},
  "access": {...},
  "files": {...}
}
Main difference:
B2SHARE v3 adds modular sections for pids, parent, access, versions, and stats, following InvenioRDM’s structured model.
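When a mixed set of exported records has to be processed, the two layouts can usually be told apart from their top-level keys alone. The following is a minimal sketch, assuming the records have already been parsed into Python dictionaries. The function name `detect_record_version` is hypothetical, and the heuristic relies only on the example structures shown above, not on any official B2SHARE API.

```python
def detect_record_version(record: dict) -> str:
    """Guess whether a parsed record follows the v2 or the v3 layout.

    Heuristic derived from the examples in this document: v3 records carry
    a "parent" or "pids" section and store files as an object, whereas v2
    records keep files as a plain list and expose a top-level "_pid".
    """
    if "parent" in record or "pids" in record:
        return "v3"
    files = record.get("files")
    if isinstance(files, dict):
        return "v3"
    if isinstance(files, list) or "_pid" in record:
        return "v2"
    return "unknown"
```

Records that match neither pattern are reported as "unknown" rather than guessed, so they can be inspected by hand.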
Metadata block
| v2 Path | v3 Path | Notes |
|---|---|---|
| metadata.titles[0].title | metadata.title | Flattened field |
| metadata.descriptions[0].description | metadata.description | Flattened field |
| metadata.creators[].creator_name | metadata.creators[].person_or_org.name | Structured under person_or_org |
| metadata.creators[].family_name / given_name | metadata.creators[].person_or_org.family_name / given_name | Mandatory in v3 |
| metadata.contributors[].contributor_type | metadata.contributors[].role.id | Uses controlled vocabulary |
| metadata.contact_email | metadata.contact_emails[] | Supports multiple contacts |
| metadata.resource_types | metadata.resource_types[] | Now an array (was a single object) |
| metadata.disciplines[] | metadata.subjects[].id | Subjects reference vocabulary IDs |
| metadata.keywords[] | metadata.subjects[] | Merged keywords and disciplines |
| metadata.community_specific.<uuid> | metadata.community_extension.<slug> | Custom fields renamed |
| metadata.instruments[] | metadata.instruments[] | Array of objects with pid and pid_type |
| metadata.temporal_coverage | metadata.temporal_coverage | Added in B2SHARE v3 |
Example:
// v2
"metadata": {
  "titles": [{"title": "Sample Forest Dataset"}],
  "creators": [{"creator_name": "Doe, Jane"}],
  "disciplines": ["2.1.1 → Biology → Ecology"],
  "contact_email": "jane.doe@example.org",
  "open_access": false,
  "embargo_date": "2025-12-31",
  "instruments": [
    {"name": "xyz"}
  ]
}
// v3
"metadata": {
  "title": "Sample Forest Dataset",
  "creators": [
    {
      "person_or_org": {
        "type": "personal",
        "given_name": "Jane",
        "family_name": "Doe",
        "name": "Doe, Jane"
      }
    }
  ],
  "subjects": [{"id": "2.1.1"}],
  "contact_emails": [{"contact_email": "jane.doe@example.org"}],
  "instruments": [
    {"pid": "10.1234/instrument-xyz", "pid_type": "doi", "name": "xyz"} // pid, pid_type, and name are all mandatory in v3
  ]
}
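The path mapping in the table can be expressed as a small conversion routine. This is a minimal sketch under several assumptions, not the actual B2SHARE migration code: the function name `convert_metadata_v2_to_v3` is hypothetical, every creator is treated as a person whose name follows the "Family, Given" convention, the vocabulary ID of a discipline is taken to be its leading token (as in "2.1.1 → Biology → Ecology"), and free-text keywords are stored under a `subject` key as in InvenioRDM. Instruments are deliberately left out, because a v2 instrument has no `pid` or `pid_type` to carry over.

```python
def convert_metadata_v2_to_v3(v2_meta: dict) -> dict:
    """Map a subset of v2 metadata fields to their v3 paths."""
    v3_meta = {}

    # titles[0].title -> title, descriptions[0].description -> description
    if v2_meta.get("titles"):
        v3_meta["title"] = v2_meta["titles"][0]["title"]
    if v2_meta.get("descriptions"):
        v3_meta["description"] = v2_meta["descriptions"][0]["description"]

    # creators[].creator_name -> creators[].person_or_org.name
    creators = []
    for creator in v2_meta.get("creators", []):
        name = creator["creator_name"]
        # Assumption: "Family, Given" naming when no explicit parts exist.
        family, _, given = name.partition(",")
        person = {
            "type": "personal",  # assumption: no organisational creators
            "name": name,
            "family_name": creator.get("family_name") or family.strip(),
            "given_name": creator.get("given_name") or given.strip(),
        }
        # Drop parts that could not be determined.
        person = {key: value for key, value in person.items() if value}
        creators.append({"person_or_org": person})
    if creators:
        v3_meta["creators"] = creators

    # disciplines[] and keywords[] are merged into subjects[]
    subjects = []
    for discipline in v2_meta.get("disciplines", []):
        if discipline:
            subjects.append({"id": discipline.split()[0]})
    for keyword in v2_meta.get("keywords", []):
        subjects.append({"subject": keyword})  # assumed key for free text
    if subjects:
        v3_meta["subjects"] = subjects

    # contact_email -> contact_emails[]
    if v2_meta.get("contact_email"):
        v3_meta["contact_emails"] = [{"contact_email": v2_meta["contact_email"]}]

    # resource_types: single object -> array
    if v2_meta.get("resource_types"):
        v3_meta["resource_types"] = [v2_meta["resource_types"]]

    return v3_meta
```

Applied to the v2 example above, the sketch yields the v3 title, creator, subject, and contact fields shown in the v3 example. The access-related fields (`open_access`, `embargo_date`) are ignored here because they no longer live under metadata in v3.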
PID management
| v2 | v3 |
|---|---|
| DOI and ePIC stored under metadata | Moved to pids section |
| Record UUID doubles as internal PID | Introduced b2rec (16-byte UUID) |
| OAI identifier stored under metadata.pids | Added pids.oai |
// v3 example
"pids": {
  "doi": {"identifier": "10.23728/b2share.0001abcd", "provider": "datacite"},
  "epic": {"identifier": "http://hdl.handle.net/11304/xyz", "provider": "epic"},
  "b2rec": {"identifier": "8f0fdd0163f044a082f8c2571205aaaa", "provider": "b2rec"}
}
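To illustrate the centralized layout, the sketch below assembles a `pids` section from identifiers that have already been extracted from a v2 record. The helper names `is_valid_b2rec` and `build_pids` are hypothetical; the provider strings are copied from the example above, and a 16-byte UUID is assumed to be serialized as 32 hexadecimal characters without hyphens, as in that example.

```python
import string
from typing import Optional


def is_valid_b2rec(identifier: str) -> bool:
    """Return True for 32 hex characters, i.e. a 16-byte UUID without hyphens."""
    return len(identifier) == 32 and all(
        char in string.hexdigits for char in identifier
    )


def build_pids(
    doi: Optional[str] = None,
    epic: Optional[str] = None,
    b2rec: Optional[str] = None,
) -> dict:
    """Assemble a v3-style pids section from the identifiers that are known."""
    pids = {}
    if doi:
        pids["doi"] = {"identifier": doi, "provider": "datacite"}
    if epic:
        pids["epic"] = {"identifier": epic, "provider": "epic"}
    if b2rec:
        if not is_valid_b2rec(b2rec):
            raise ValueError(f"b2rec must be 32 hex characters, got {b2rec!r}")
        pids["b2rec"] = {"identifier": b2rec, "provider": "b2rec"}
    return pids
```

Identifiers that are not supplied are simply omitted from the result, so the same helper works for records that lack a DOI or an ePIC handle.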
File structure
v2
"files": [
  {"key": "data.csv", "size": 12345, "checksum": "md5:abc..."}
]
v3
"files": {
  "count": 1,
  "entries": {
    "data.csv": {
      "size": 12345,
      "checksum": "md5:abc...",
      "mimetype": "text/csv",
      "pids": {"epic": {"identifier": "http://hdl.handle.net/11304/1234"}},
      "links": {"content": "https://example.org/api/files/data.csv/content"}
    }
  }
}
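The change from a list to a keyed object can be sketched as follows. This is an illustration only, assuming each v2 file entry has a unique `key`; the function name `convert_files_v2_to_v3` is hypothetical. Fields that exist only in v3 (`mimetype`, per-file `pids`, `links`) cannot be derived from the v2 list and are therefore not produced.

```python
def convert_files_v2_to_v3(v2_files: list) -> dict:
    """Turn the v2 list of files into the v3 object keyed by file name."""
    entries = {}
    for file_info in v2_files:
        # The file name moves from the "key" field to the dictionary key;
        # all remaining fields (size, checksum, ...) are copied unchanged.
        entries[file_info["key"]] = {
            field: value for field, value in file_info.items() if field != "key"
        }
    return {"count": len(entries), "entries": entries}
```

Because entries are keyed by name, two v2 files sharing the same `key` would collapse into one entry, which is why the uniqueness assumption matters.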
Access control
| v2 | v3 |
|---|---|
| Only for files: "open_access": true or false | Both files and records: "access": {"record": "public"} or "access": {"record": "restricted"} |
| Only for files: embargo_date | Both files and records: "embargo": {"until": "YYYY-MM-DD", "active": true} |
| Owners listed in metadata | Ownership under parent.access.owned_by |
| No access link management | Secret links via access_links API |
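A v3-style access object could be assembled roughly as shown below. This is a sketch under stated assumptions rather than the B2SHARE implementation: the function name `build_v3_access` is hypothetical, the `embargo` object is nested under `access` as in upstream InvenioRDM, and an embargo is treated as active strictly before its `until` date. How the v2 file-level `open_access` flag should map to a record-level value is a migration decision that this sketch does not make.

```python
from datetime import date
from typing import Optional


def build_v3_access(
    public: bool,
    embargo_until: Optional[str] = None,
    today: Optional[date] = None,
) -> dict:
    """Build an access object with an optional embargo.

    embargo_until is an ISO date string (YYYY-MM-DD); today can be passed
    explicitly to make the result reproducible.
    """
    access = {"record": "public" if public else "restricted"}
    if embargo_until is not None:
        until = date.fromisoformat(embargo_until)  # raises ValueError if malformed
        reference = today or date.today()
        access["embargo"] = {
            "until": embargo_until,
            "active": until > reference,  # assumption: inactive from the until date on
        }
    return access
```

For instance, `build_v3_access(False, "2025-12-31", today=date(2025, 11, 10))` describes a restricted record whose embargo is still active on the reference date.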
Community model
| v2 | v3 |
|---|---|
| metadata.community (UUID only) | parent.communities.entries (with slug, metadata, access policy) |
| metadata.community_specific | metadata.community_extension |
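Since v2 keys community-specific fields by community UUID and v3 keys them by slug, a migration needs some lookup between the two. The sketch below assumes such a lookup is available as a plain dictionary; both the function name `convert_community_fields` and the `uuid_to_slug` table are hypothetical, and the UUID and slug in the usage example are fictional.

```python
def convert_community_fields(v2_meta: dict, uuid_to_slug: dict) -> dict:
    """Re-key community_specific.<uuid> blocks as community_extension.<slug>."""
    extension = {}
    for community_uuid, fields in v2_meta.get("community_specific", {}).items():
        if community_uuid not in uuid_to_slug:
            # Fail loudly instead of silently dropping custom fields.
            raise KeyError(f"no slug known for community {community_uuid}")
        extension[uuid_to_slug[community_uuid]] = fields
    return {"community_extension": extension}


# Usage with fictional identifiers:
example = convert_community_fields(
    {"community_specific": {"11111111-2222-3333-4444-555555555555": {"plot_size": 40}}},
    {"11111111-2222-3333-4444-555555555555": "forest-lab"},
)
```

Only the outer key changes in this sketch; the field contents are passed through untouched, which assumes the community's own fields kept their names between versions.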
Summary
- Creators must now include the family_name.
- Subjects reference vocabulary IDs instead of full strings.
- Instruments require pid, pid_type, and name.
- The b2rec PID is always a 16-byte UUID.
- In v2, the embargo model applied only to files and was simplified (open_access + embargo_date); in v3 it is explicit and can be applied to both records and files.
- Files, access, and community models are fully modular in v3.
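Some of the rules in this summary lend themselves to an automated check. The following is a minimal sketch that covers only the creator, instrument, and b2rec rules; the function name `check_v3_record` is hypothetical, and a 16-byte UUID is assumed to appear as 32 hexadecimal characters, as in the pids example.

```python
import string

REQUIRED_INSTRUMENT_FIELDS = ("pid", "pid_type", "name")


def check_v3_record(record: dict) -> list:
    """Return a list of human-readable problems; an empty list means no findings."""
    problems = []
    metadata = record.get("metadata", {})

    # Creators must include family_name.
    for index, creator in enumerate(metadata.get("creators", [])):
        if not creator.get("person_or_org", {}).get("family_name"):
            problems.append(f"creators[{index}]: missing family_name")

    # Instruments require pid, pid_type, and name.
    for index, instrument in enumerate(metadata.get("instruments", [])):
        missing = [
            field for field in REQUIRED_INSTRUMENT_FIELDS if not instrument.get(field)
        ]
        if missing:
            problems.append(f"instruments[{index}]: missing {', '.join(missing)}")

    # The b2rec PID, when present, must be a 16-byte UUID (32 hex characters).
    b2rec = record.get("pids", {}).get("b2rec", {}).get("identifier")
    if b2rec is not None:
        is_hex = all(char in string.hexdigits for char in b2rec)
        if len(b2rec) != 32 or not is_hex:
            problems.append("pids.b2rec: expected 32 hex characters (16 bytes)")

    return problems
```

Collecting all problems in a list, instead of raising on the first one, makes it easier to review a whole batch of converted records in one pass.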
Last update : 10.11.2025
Last review : 10.11.2025