gbif_registrar.register

Register datasets with GBIF.

Module Contents

Functions

initialize_registrations_file(file_path)

Writes a template registrations file to the given path.

register_dataset(local_dataset_id, registrations_file)

Registers a local dataset with GBIF and adds it to the registrations file.

complete_registration_records(registrations_file[, ...])

Completes registration records that have missing values.

gbif_registrar.register.initialize_registrations_file(file_path)

Writes a template registrations file to the given path.

The registrations file maps datasets from the local EDI data repository to the remote GBIF registry.

Parameters:

file_path (str) – Path of file to be written. A .csv file extension is expected.

Returns:

Writes the template registrations file to disk as a .csv.

Return type:

None

Notes

The registrations file columns and definitions are as follows:

  • local_dataset_id: The dataset identifier in the EDI repository. This is the primary key. The term ‘dataset’ as used here is synonymous with the term ‘data package’ in the EDI repository. Values in this column have the format: {scope}.{identifier}.{revision}.

  • local_dataset_group_id: The dataset group identifier in the EDI repository. One local_dataset_group_id often corresponds to many local_dataset_id values (a one-to-many relationship). The term ‘dataset group’ as used here is synonymous with the term ‘data package series’ in the EDI repository.

  • local_dataset_endpoint: The endpoint for downloading the dataset from the EDI repository. This forms a one-to-one relationship with local_dataset_id.

  • gbif_dataset_uuid: The registration identifier assigned by GBIF to the local_dataset_group_id. This forms a one-to-one relationship with local_dataset_group_id.

  • synchronized: The synchronization status of the local_dataset_id with GBIF. The value is True if the local dataset is synchronized with GBIF, and False otherwise. This forms a one-to-one relationship with local_dataset_id. Note that older dataset versions that were previously synchronized keep a True status, even though they are no longer hosted on GBIF.
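The column definitions above can be illustrated with a minimal sketch. It is not the package's implementation: the column order, the header-only layout of the template, and the helper names split_local_dataset_id and template_csv_text are assumptions made for illustration.

```python
import csv
import io

# Column names come from the definitions above; their order is an assumption.
COLUMNS = [
    "local_dataset_id",
    "local_dataset_group_id",
    "local_dataset_endpoint",
    "gbif_dataset_uuid",
    "synchronized",
]


def split_local_dataset_id(local_dataset_id):
    """Split '{scope}.{identifier}.{revision}' into its parts (hypothetical helper)."""
    scope, identifier, revision = local_dataset_id.split(".")
    return scope, int(identifier), int(revision)


def template_csv_text():
    """Return the text of a header-only CSV (hypothetical stand-in for the template)."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerow(COLUMNS)
    return buffer.getvalue()
```

Under these assumptions, split_local_dataset_id("edi.929.2") yields the scope "edi", the identifier 929, and the revision 2.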

Examples

>>> initialize_registrations_file("registrations.csv")

gbif_registrar.register.register_dataset(local_dataset_id, registrations_file)

Registers a local dataset with GBIF and adds it to the registrations file.

Parameters:
  • local_dataset_id (str) – The local dataset identifier.

  • registrations_file (str) – The path of the registrations file.

Returns:

Writes the updated registrations file back to the same path as a .csv.

Return type:

None

Notes

This function requires authentication with GBIF. Use the load_configuration function from the configure module to do this.

Examples

>>> register_dataset("edi.929.2", "registrations.csv")

gbif_registrar.register.complete_registration_records(registrations_file, local_dataset_id=None)

Completes registration records that have missing values.

This function can be run to repair one or more dataset registrations that have incomplete information in the local_dataset_group_id, local_dataset_endpoint, or gbif_dataset_uuid columns.

Parameters:
  • registrations_file (str) – The path of the registrations file.

  • local_dataset_id (str, optional) – The dataset identifier in the EDI repository. If provided, only the registration record for the specified local_dataset_id will be completed. If not provided, all registration records with incomplete information will be repaired.

Returns:

Writes the updated registrations file back to the same path as a .csv.

Return type:

None

Notes

This function requires authentication with GBIF. Use the load_configuration function from the configure module to do this.

Examples

>>> # Complete all registration records with missing values.
>>> complete_registration_records("registrations.csv")
>>> # Repair the registration record for a specific dataset.
>>> complete_registration_records("registrations.csv", "edi.929.2")
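The selection rule described for complete_registration_records (repair every incomplete record, or only the one named by local_dataset_id) can be sketched as follows. This is an illustrative sketch only, not the package's code: the helper name incomplete_ids is hypothetical, "incomplete" is assumed to mean an empty cell, and the sample values are made-up placeholders. It does not contact GBIF or EDI.

```python
import csv
import io

# Columns the function is documented to repair.
REPAIRABLE = ("local_dataset_group_id", "local_dataset_endpoint", "gbif_dataset_uuid")


def incomplete_ids(csv_text, local_dataset_id=None):
    """List local_dataset_id values whose repairable columns have empty cells.

    Hypothetical helper: if local_dataset_id is given, only that record is
    considered; otherwise every record is checked.
    """
    selected = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        if local_dataset_id is not None and row["local_dataset_id"] != local_dataset_id:
            continue
        # Treat a missing or whitespace-only cell as incomplete (an assumption).
        if any(not (row.get(column) or "").strip() for column in REPAIRABLE):
            selected.append(row["local_dataset_id"])
    return selected


# Placeholder data: the group, endpoint, and UUID values are invented.
SAMPLE = (
    "local_dataset_id,local_dataset_group_id,local_dataset_endpoint,"
    "gbif_dataset_uuid,synchronized\n"
    "edi.929.1,group-a,https://example.org/edi.929.1,uuid-a,True\n"
    "edi.929.2,,,,False\n"
)
```

With this sample, incomplete_ids(SAMPLE) selects only "edi.929.2", and incomplete_ids(SAMPLE, "edi.929.1") selects nothing because that record is already complete.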