B2SHARE v2 vs v3 JSON Structural Differences


Overview

This document provides a technical comparison of the B2SHARE v2 and B2SHARE v3 JSON record structures. B2SHARE v3 is based on InvenioRDM v13, but introduces several custom extensions and data model adjustments, particularly in metadata, community handling, and PID organization.

All examples below use fictional data for demonstration purposes only.


Structural Overview

| Aspect | B2SHARE v2 | B2SHARE v3 |
|---|---|---|
| Base system | Invenio 2.x | Customized InvenioRDM v13 |
| Record ID | UUID (b1a2c3d4e5f6...) | Short alphanumeric ID (abcd1-efg23) |
| Record model | Flat JSON (metadata and files at top level) | Hierarchical JSON (with parent, pids, access, etc.) |
| Schema handling | $schema URI per community | community_extension_schema field with associated community_extension |
| File metadata | Simple list of files with metadata | Nested object with per-file metadata, access, and PIDs |
| PID management | Embedded DOI/ePIC in metadata | Centralized pids section with structured providers |
| Access control | open_access and embargo_date | Object-based model (access.record, embargo) |
| Communities | Community UUID under metadata.community | Full parent.communities.entries structure with slug and policies |
| Licensing | Simple key-value structure | SPDX-aligned rights[] array |

Key Section Differences

Top-level structure

v2 example

{
  "id": "b1a2c3d4e5f6",
  "_pid": "10.23728/b2share.b1a2c3d4e5f6",
  "metadata": {...},
  "files": [...]
}

v3 example

{
  "id": "abcd1-efg23",
  "parent": {...},
  "pids": {...},
  "metadata": {...},
  "access": {...},
  "files": {...}
}

Main difference: B2SHARE v3 adds modular sections for pids, parent, and access, plus versions and stats (omitted from the abbreviated example above), following InvenioRDM’s structured model.
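
The structural differences above are enough to tell the two layouts apart programmatically. The following is a minimal sketch, not part of B2SHARE itself; the function name `detect_record_version` is hypothetical, and the heuristic relies only on the keys shown in the two examples.

```python
def detect_record_version(record: dict) -> str:
    """Guess whether a record dict follows the v2 or the v3 layout."""
    # v3 records carry InvenioRDM-style sections and an object-valued "files".
    if "parent" in record or "pids" in record or isinstance(record.get("files"), dict):
        return "v3"
    # v2 records keep "files" as a plain list and the PID in "_pid".
    if "_pid" in record or isinstance(record.get("files"), list):
        return "v2"
    return "unknown"


# Fictional records mirroring the two examples above.
v2_record = {"id": "b1a2c3d4e5f6", "_pid": "10.23728/b2share.b1a2c3d4e5f6",
             "metadata": {}, "files": []}
v3_record = {"id": "abcd1-efg23", "parent": {}, "pids": {},
             "metadata": {}, "access": {}, "files": {}}

print(detect_record_version(v2_record))  # v2
print(detect_record_version(v3_record))  # v3
```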


Metadata block

| v2 Path | v3 Path | Notes |
|---|---|---|
| metadata.titles[0].title | metadata.title | Flattened field |
| metadata.descriptions[0].description | metadata.description | Flattened field |
| metadata.creators[].creator_name | metadata.creators[].person_or_org.name | Structured under person_or_org |
| metadata.creators[].family_name / given_name | metadata.creators[].person_or_org.family_name / given_name | Mandatory in v3 |
| metadata.contributors[].contributor_type | metadata.contributors[].role.id | Uses controlled vocabulary |
| metadata.contact_email | metadata.contact_emails[] | Supports multiple contacts |
| metadata.resource_types | metadata.resource_types[] | Now an array (was a single object) |
| metadata.disciplines[] | metadata.subjects[].id | Subjects reference vocabulary IDs |
| metadata.keywords[] | metadata.subjects[] | Merged keywords and disciplines |
| metadata.community_specific.<uuid> | metadata.community_extension.<slug> | Custom fields renamed |
| metadata.instruments[] | metadata.instruments[] | Array of objects with pid, pid_type, and name (all mandatory in v3) |
| metadata.temporal_coverage | metadata.temporal_coverage | Added in B2SHARE v3 |

Example:

// v2
"metadata": {
  "titles": [{"title": "Sample Forest Dataset"}],
  "creators": [{"creator_name": "Doe, Jane"}],
  "disciplines": ["2.1.1 → Biology → Ecology"],
  "contact_email": "jane.doe@example.org",
  "open_access": false,
  "embargo_date": "2025-12-31",
  "instruments": [
    {"name": "xyz"}
  ]
}

// v3
"metadata": {
  "title": "Sample Forest Dataset",
  "creators": [
    {
      "person_or_org": {
        "type": "personal",
        "given_name": "Jane",
        "family_name": "Doe",
        "name": "Doe, Jane"
      }
    }
  ],
  "subjects": [{"id": "2.1.1"}],
  "contact_emails": [{"contact_email": "jane.doe@example.org"}],
  "instruments": [
    {"pid": "10.1234/instrument-xyz", "pid_type": "doi", "name": "xyz"} // all 3 fields are mandatory for version 3
  ]
}
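
The path mapping in the table can be sketched as a small conversion function. This is an illustration under stated assumptions, not the official migration code: the function name `convert_metadata_v2_to_v3` is hypothetical, every creator is assumed to be a person whose `creator_name` is written as "Family, Given", and the vocabulary ID of a discipline is assumed to be the part before the first "→". Keywords and instruments are left out because the v3 values (for example the instrument `pid`) cannot be derived from v2 data alone.

```python
def convert_metadata_v2_to_v3(v2: dict) -> dict:
    """Map a subset of v2 metadata fields onto their v3 paths."""
    v3 = {}

    # titles[0].title -> title, descriptions[0].description -> description
    if v2.get("titles"):
        v3["title"] = v2["titles"][0]["title"]
    if v2.get("descriptions"):
        v3["description"] = v2["descriptions"][0]["description"]

    # creator_name -> person_or_org.name, with family/given names split out.
    creators = []
    for creator in v2.get("creators", []):
        name = creator.get("creator_name", "")
        family, _, given = name.partition(",")
        creators.append({
            "person_or_org": {
                "type": "personal",  # assumption: every creator is a person
                "given_name": creator.get("given_name") or given.strip(),
                "family_name": creator.get("family_name") or family.strip(),
                "name": name,
            }
        })
    if creators:
        v3["creators"] = creators

    # disciplines[] -> subjects[].id, keeping only the leading vocabulary ID.
    subjects = [{"id": d.split("→")[0].strip()} for d in v2.get("disciplines", [])]
    if subjects:
        v3["subjects"] = subjects

    # contact_email -> contact_emails[]
    if v2.get("contact_email"):
        v3["contact_emails"] = [{"contact_email": v2["contact_email"]}]

    return v3


v2_metadata = {
    "titles": [{"title": "Sample Forest Dataset"}],
    "creators": [{"creator_name": "Doe, Jane"}],
    "disciplines": ["2.1.1 → Biology → Ecology"],
    "contact_email": "jane.doe@example.org",
}
print(convert_metadata_v2_to_v3(v2_metadata))
```

Applied to the fictional v2 record above, the result matches the title, creators, subjects, and contact_emails of the v3 example.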

PID management

| v2 | v3 |
|---|---|
| DOI and ePIC stored under metadata | Moved to pids section |
| Record UUID doubles as internal PID | Introduced b2rec (16-byte UUID) |
| OAI identifier stored under metadata.pids | Added pids.oai |

// v3 example
"pids": {
  "doi": {"identifier": "10.23728/b2share.0001abcd", "provider": "datacite"},
  "epic": {"identifier": "http://hdl.handle.net/11304/xyz", "provider": "epic"},
  "b2rec": {"identifier": "8f0fdd0163f044a082f8c2571205aaaa", "provider": "b2rec"}
}
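
Since a b2rec identifier is a 16-byte UUID, it appears in the example as 32 hexadecimal characters without hyphens. A minimal check for that shape, using the standard `uuid` module, might look like the sketch below; the helper name `is_valid_b2rec` is hypothetical, and the requirement that hyphens be absent is an assumption based on the example only.

```python
import uuid


def is_valid_b2rec(identifier: str) -> bool:
    """Return True if identifier is 32 hex characters, i.e. a 16-byte UUID."""
    if len(identifier) != 32:
        return False
    try:
        # uuid.UUID raises ValueError for anything that is not valid hex.
        return len(uuid.UUID(hex=identifier).bytes) == 16
    except ValueError:
        return False


print(is_valid_b2rec("8f0fdd0163f044a082f8c2571205aaaa"))  # True
print(is_valid_b2rec("abcd1-efg23"))                       # False
```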

File structure

v2

"files": [
  {"key": "data.csv", "size": 12345, "checksum": "md5:abc..."}
]

v3

"files": {
  "count": 1,
  "entries": {
    "data.csv": {
      "size": 12345,
      "checksum": "md5:abc...",
      "mimetype": "text/csv",
      "pids": {"epic": {"identifier": "http://hdl.handle.net/11304/1234"}},
      "links": {"content": "https://example.org/api/files/data.csv/content"}
    }
  }
}
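
The change from a list to a keyed object can be sketched as follows. This is an illustrative helper with a hypothetical name (`convert_files_v2_to_v3`); it only re-keys the fields that already exist in v2. The v3-only fields shown above (mimetype, pids, links) are not derivable from the v2 entry and are therefore not generated here.

```python
def convert_files_v2_to_v3(v2_files: list) -> dict:
    """Turn the v2 list of file dicts into the v3 count/entries object."""
    entries = {}
    for file_info in v2_files:
        # The v2 "key" becomes the entry name; all other fields are copied.
        entries[file_info["key"]] = {
            name: value for name, value in file_info.items() if name != "key"
        }
    return {"count": len(entries), "entries": entries}


v2_files = [{"key": "data.csv", "size": 12345, "checksum": "md5:abc..."}]
print(convert_files_v2_to_v3(v2_files))
```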


Access control

| v2 | v3 |
|---|---|
| Only for files: "open_access": true or false | Both files and records: "access": {"record": "public"} or "access": {"record": "restricted"} |
| Only for files: embargo_date | Both files and records: "embargo": {"until": "YYYY-MM-DD", "active": true} |
| Owners listed in metadata | Ownership under parent.access.owned_by |
| No access link management | Secret links via access_links API |
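
One way to picture the mapping is the sketch below. It rests on several assumptions that this page does not confirm: that the v3 access object has a `files` key next to `record` (as in upstream InvenioRDM), that `embargo` is nested inside `access`, that a missing `open_access` means open, and that the record itself stays public because the v2 flags applied to files only. The function name `convert_access_v2_to_v3` is hypothetical.

```python
from datetime import date


def convert_access_v2_to_v3(v2_metadata: dict, today: date) -> dict:
    """Derive a v3-style access object from v2 open_access / embargo_date."""
    # Assumption: a missing open_access flag means the files are open.
    open_access = v2_metadata.get("open_access", True)
    access = {
        "record": "public",  # assumption: v2 restrictions covered files only
        "files": "public" if open_access else "restricted",
    }
    embargo_date = v2_metadata.get("embargo_date")
    if embargo_date:
        # The embargo is active until the given ISO date has been reached.
        until = date.fromisoformat(embargo_date)
        access["embargo"] = {"until": embargo_date, "active": until > today}
    return access


v2_metadata = {"open_access": False, "embargo_date": "2025-12-31"}
print(convert_access_v2_to_v3(v2_metadata, today=date(2025, 11, 10)))
```

Passing `today` explicitly keeps the function deterministic, which makes the embargo logic easy to test.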

Community model

| v2 | v3 |
|---|---|
| metadata.community (UUID only) | parent.communities.entries (with slug, metadata, access policy) |
| metadata.community_specific | metadata.community_extension |
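
The renaming of community_specific.<uuid> to community_extension.<slug> needs a lookup from UUID to slug, which this page does not describe. The sketch below therefore takes that lookup as a caller-supplied dictionary; the function name `rename_community_specific`, the UUID, the slug, and the custom field are all fictional.

```python
def rename_community_specific(v2_metadata: dict, slug_by_uuid: dict) -> dict:
    """Re-key community_specific.<uuid> blocks as community_extension.<slug>."""
    extension = {}
    for block_uuid, fields in v2_metadata.get("community_specific", {}).items():
        # Raises KeyError if no slug is known for this UUID.
        extension[slug_by_uuid[block_uuid]] = fields
    return extension


# Fictional data for demonstration only.
v2_metadata = {
    "community_specific": {
        "aaaaaaaa-0000-4000-8000-000000000001": {"plot_id": "P-17"}
    }
}
slug_by_uuid = {"aaaaaaaa-0000-4000-8000-000000000001": "forest-lab"}
print(rename_community_specific(v2_metadata, slug_by_uuid))
```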

Summary

  • Creators must now include the family_name.
  • Subjects reference vocabulary IDs instead of full strings.
  • Instruments require pid, pid_type and name.
  • b2rec PID is always a 16-byte UUID.
  • In v2, the embargo model applied only to files and was simple (open_access + embargo_date); in v3 it is explicit and can be applied to both records and files.
  • Files, access, and community models are fully modular in v3.
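
Two of the rules above (family_name on creators; pid, pid_type, and name on instruments) lend themselves to a quick pre-submission check. The following is a hedged sketch rather than the validation B2SHARE v3 actually performs; the function name `check_v3_metadata` and the wording of the messages are hypothetical, and the family_name rule is applied to every creator exactly as stated in the list.

```python
def check_v3_metadata(metadata: dict) -> list:
    """Return a list of problems found in a v3 metadata block (empty if none)."""
    problems = []

    # Rule: creators must include family_name.
    for i, creator in enumerate(metadata.get("creators", [])):
        if not creator.get("person_or_org", {}).get("family_name"):
            problems.append(f"creators[{i}]: missing family_name")

    # Rule: instruments require pid, pid_type and name.
    for i, instrument in enumerate(metadata.get("instruments", [])):
        for field in ("pid", "pid_type", "name"):
            if not instrument.get(field):
                problems.append(f"instruments[{i}]: missing {field}")

    return problems


# A v2-style instrument entry fails the v3 rules.
print(check_v3_metadata({"instruments": [{"name": "xyz"}]}))
# ['instruments[0]: missing pid', 'instruments[0]: missing pid_type']
```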

Last update: 10.11.2025

Last review: 10.11.2025