:py:mod:`gbif_registrar.register`
=================================

.. py:module:: gbif_registrar.register

.. autoapi-nested-parse::

   Register datasets with GBIF.


Module Contents
---------------

Functions
~~~~~~~~~

.. autoapisummary::

   gbif_registrar.register.initialize_registrations_file
   gbif_registrar.register.register_dataset
   gbif_registrar.register.complete_registration_records


.. py:function:: initialize_registrations_file(file_path)

   Writes a template registrations file to the given path.

   The registrations file maps datasets from the local EDI data repository to
   the remote GBIF registry.

   :param file_path: Path of the file to be written. A .csv file extension is
                     expected.
   :type file_path: str

   :returns: Writes the template registrations file to disk as a .csv.
   :rtype: None

   .. rubric:: Notes

   The registrations file columns and definitions are as follows:

   - `local_dataset_id`: The dataset identifier in the EDI repository. This
     is the primary key. The term 'dataset', as used here, is synonymous with
     the term 'data package' in the EDI repository. Values in this column
     have the format: {scope}.{identifier}.{revision}.
   - `local_dataset_group_id`: The dataset group identifier in the EDI
     repository. This often forms a one-to-many relationship with
     `local_dataset_id`. The term 'dataset group', as used here, is
     synonymous with the term 'data package series' in the EDI repository.
   - `local_dataset_endpoint`: The endpoint for downloading the dataset from
     the EDI repository. This forms a one-to-one relationship with
     `local_dataset_id`.
   - `gbif_dataset_uuid`: The registration identifier assigned by GBIF to the
     `local_dataset_group_id`. This forms a one-to-one relationship with
     `local_dataset_group_id`.
   - `synchronized`: The synchronization status of the `local_dataset_id`
     with GBIF. Is `True` if the local dataset is synchronized with GBIF, and
     `False` if it is not. This forms a one-to-one relationship with
     `local_dataset_id`. Note that older dataset versions that have
     previously been synchronized will continue to have a `True` status, even
     though they are no longer hosted on GBIF.

   .. rubric:: Examples

   >>> initialize_registrations_file("registrations.csv")


.. py:function:: register_dataset(local_dataset_id, registrations_file)

   Registers a local dataset with GBIF and adds it to the registrations file.

   :param local_dataset_id: The local dataset identifier.
   :type local_dataset_id: str
   :param registrations_file: The path of the registrations file.
   :type registrations_file: str

   :returns: The registrations file, written back to itself as a .csv.
   :rtype: None

   .. rubric:: Notes

   This function requires authentication with GBIF. Use the
   load_configuration function from the configure module to do this.

   .. rubric:: Examples

   >>> register_dataset("edi.929.2", "registrations.csv")


.. py:function:: complete_registration_records(registrations_file, local_dataset_id=None)

   Returns a completed set of registration records.

   This function can be run to repair one or more dataset registrations that
   have incomplete information in the local_dataset_group_id,
   local_dataset_endpoint, or gbif_dataset_uuid columns.

   :param registrations_file: The path of the registrations file.
   :type registrations_file: str
   :param local_dataset_id: The dataset identifier in the EDI repository. If
                            provided, only the registration record for the
                            specified `local_dataset_id` will be completed.
                            If not provided, all registration records with
                            incomplete information will be repaired.
   :type local_dataset_id: str, optional

   :returns: The registrations file, written back to itself as a .csv.
   :rtype: None

   .. rubric:: Notes

   This function requires authentication with GBIF. Use the
   load_configuration function from the configure module to do this.

   .. rubric:: Examples

   >>> # Complete all registration records with missing values.
   >>> complete_registration_records("registrations.csv")
   >>> # Repair the registration record for a specific dataset.
   >>> complete_registration_records("registrations.csv", "edi.929.2")
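
The column layout and the notion of an "incomplete" record described above can be illustrated with a small standalone sketch. It is not the `gbif_registrar` implementation and does not contact GBIF. The helper names `split_local_dataset_id` and `incomplete_ids` are hypothetical, the group identifier, endpoint, and UUID values are placeholders, and treating an empty cell as "incomplete" is an assumption made for illustration.

```python
import csv
import io

# Column names as listed in the Notes for initialize_registrations_file.
COLUMNS = [
    "local_dataset_id",
    "local_dataset_group_id",
    "local_dataset_endpoint",
    "gbif_dataset_uuid",
    "synchronized",
]

# The three columns that complete_registration_records is documented to repair.
REPAIRABLE = COLUMNS[1:4]


def split_local_dataset_id(local_dataset_id):
    """Split '{scope}.{identifier}.{revision}' into parts (hypothetical helper)."""
    scope, identifier, revision = local_dataset_id.split(".")
    return scope, int(identifier), int(revision)


def incomplete_ids(csv_text):
    """Return local_dataset_id values with an empty repairable column.

    Assumption: "incomplete" means the cell is empty or whitespace only.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["local_dataset_id"]
        for row in reader
        if any(not (row[col] or "").strip() for col in REPAIRABLE)
    ]


# Illustrative data only: group id, endpoint, and UUID are placeholders.
example = "\n".join([
    ",".join(COLUMNS),
    "edi.929.1,example-group,https://example.org/edi.929.1,"
    "00000000-0000-0000-0000-000000000000,True",
    "edi.929.2,,,,False",
])

print(split_local_dataset_id("edi.929.2"))  # ('edi', 929, 2)
print(incomplete_ids(example))  # ['edi.929.2']
```

In this sketch the second row would be a candidate for `complete_registration_records`, since its group identifier, endpoint, and UUID cells are empty.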