Glossary
When consistent with usage in this document, definitions have been drawn from other resources, including Wikipedia, the Open Data Handbook glossary, the Duke Law EDRM glossary, and the Open Government Data Act definitions codified at 44 U.S.C. 3502. Other data management resources may offer alternative or contradictory definitions. The definitions provided here are those implied by each term’s usage in this document.
Accessibility: the degree to which the resource is obtainable by an interested party. Direct access without constraint would be the most accessible (e.g., resources that may be downloaded without requiring a login), whereas resources that require third-party intervention would be less accessible.
Archive Folder: a consistent file structure with use constraints and backup schedule that houses the definitive record of a project’s data resources. Products in the archive folder are the subject of metadata records and are the versions intended for use and dissemination. Contrast with working folder.
Data Catalog: database composed of metadata that allows for the discovery of data resources.
Data Custodian: individual responsible for the storage and security of a data resource.
Data Trustee: individual having the authority to: 1) ensure resources are available to implement the complete project and data lifecycle and 2) ensure compliance with all data governance policies.
Data Dictionary: provides information on the contents of a dataset to support data quality and use. Such information includes entity (i.e., variable) definitions and allowable values. In the case of databases, or a collection of datasets, relationships between tables are also defined in the data dictionary.
Data Integrity: property describing the foundational soundness of a data resource. Data with strong integrity have undergone quality control and assurance procedures throughout their lifespan, have permanence over a reasonable timeframe, and have changes appropriately documented.
Data Management: an administrative process that includes acquiring, validating, storing, and securing data to ensure the accessibility, integrity, and timeliness of the data for its users.
Data Management Plan: document that describes the data expected from the project, how such data will be handled throughout the project to protect data integrity, and how the data will be stored at the conclusion of the project to ensure security, discoverability, and accessibility.
Data Resources: data; that is, recorded information, regardless of form or the media on which it is recorded. Also known as products.
Data Steward: individual responsible for reviewing the quality and metadata of a resource.
Discoverability: the degree to which information about a data resource’s existence is readily obtained by searching an information system (e.g., Data.gov). Certain aspects of the metadata for the resource, such as keywords or spatial bounds, may be useful in enhancing discoverability. Data catalogs can enhance discoverability by providing a standard location for searching and organizing resources. A data resource may be discoverable (e.g., found in a search result) but not accessible (see accessibility).
ISO: the International Organization for Standardization. Entity that provides standards to ensure consistency in definitions, formats, and use.
mdEditor: a web application used to write archival-quality metadata for projects and data resources. See mdeditor.org.
Metadata: data that describes and provides additional information about other data to promote discoverability and proper use.
Open Format: data format that is platform independent, machine readable, and made available to the public without restrictions that would impede the re-use of that information.
Project: a discrete effort on a particular topic with defined objectives or goals.
Project Management: the practice of initiating, planning, executing, controlling, and closing the work of a team to achieve specific goals and meet specific success criteria at the specified time.
Quality Assurance: preventing errors. The maintenance of a desired level of quality in a product by means of attention to every stage of the process of acquisition, manipulation, and use.
Quality Control: identifying and correcting errors. Process of review to reduce or eliminate errors made during data acquisition and manipulation.
Reproducible (analyses, workflow, or research): structuring activities so that a product (e.g., a dataset, analysis result, or report) can be recreated and the same results achieved. Replication could be achieved either by the same person or team that created the original product or by a different team. Documentation and scripted workflows play a key role in reproducibility.
Tidy Data: standard way of relating the structure of a dataset to its meaning. Specifically, each row represents an observation and each column represents a variable recorded on an observation.
Working Folder: a file structure used by an individual, or a group in collaboration, to store data resources under production during the course of a project’s implementation. Contrast with archive folder.
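As an illustration of the Tidy Data entry above, the following is a minimal Python sketch of reshaping a table so that each row represents one observation and each column one variable. The site names, years, counts, and the to_tidy helper are all hypothetical and exist only for this example; they do not come from any project described in this document.

```python
# Hypothetical "wide" table: one row per site, one column per survey year.
# Here the year (a variable) is hidden in the column names, so it is not tidy.
wide_rows = [
    {"site": "A", "2019": 12, "2020": 15},
    {"site": "B", "2019": 7, "2020": 9},
]


def to_tidy(rows, id_field, variable_name, value_name):
    """Reshape wide rows so each output row is a single observation."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_field:
                continue  # the identifier is repeated on every output row
            tidy.append(
                {id_field: row[id_field], variable_name: key, value_name: value}
            )
    return tidy


# Each tidy row now holds one observation: a site, a year, and a count.
tidy_rows = to_tidy(wide_rows, "site", "year", "count")
for record in tidy_rows:
    print(record)
```

In the reshaped form, site, year, and count are each a column, so filtering or summarizing by year no longer depends on how the columns happen to be named.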