Data Curation and Quality Assurance in NEEShub

By Stanislav Pejša

Purdue University

Category

Seminars

Abstract

Earthquake engineering is a vibrant interdisciplinary area that brings together researchers from seismology and from structural, mechanical, and geotechnical engineering, whose efforts save lives and protect property during earthquakes and tsunamis. The NEES Data Repository serves the needs of the earthquake engineering community. Research teams carry out path-breaking research on small- and large-scale models using, among other equipment, shake tables, tsunami basins, and centrifuges. Such diversity makes it challenging to collect the data together with documentation sufficient to make the data and the conditions of their origin understandable, not only to the earthquake research community but also to practitioners and educators, now and in the future.

Data use and re-use are among the key goals of NEES. In this talk I will discuss some of the solutions that the NEEScomm IT team has implemented in the NEEShub to help manage research data and to collect the metadata and documentation necessary for correct understanding and interpretation of the archived data. High-quality, consistent, and predictable metadata are essential for successful stewardship of research data and for transferring knowledge from the research team to the repository with as little loss as possible. The tools that assist research teams with archiving their data need to be intuitive and non-intrusive so that metadata capture and the solicitation of documentation can be pushed upstream to the research teams, which are the best source of information. The NEES preservation framework is a blend of system-provided data, information extracted from the files themselves, and metadata solicited from the research teams. It is an environment that provides storage for long-term access to authentic, re-usable research data, along with preservation services and a virtual space for sharing and collaboration.

Bio

Stanislav Pejša is the Data Curator at the Network for Earthquake Engineering Simulation (NEES), located in Discovery Park at Purdue University. He is primarily responsible for the quality of data uploaded to the NEES data repository. He oversees the evolution of data from a mere aggregation of sensor measurements into fully curated research projects with the metadata and documentation necessary for re-use, long-term access, and preservation. He is also involved in developing workflows and metadata solutions that improve access to, and preservation of, the research data stored in the repository and delivered through the NEEShub platform. He is interested in exploring new ways of sharing research data effectively and making them interoperable, and in issues related to their preservation.

Cite this work

Researchers should cite this work as follows:

  • Stanislav Pejša (2012), "Data Curation and Quality Assurance in NEEShub," https://help.hubzero.org/resources/801.


Data Curation and Quality Assurance in NEEShub
  • 1. George E. Brown, Jr. Network for Earthquake Engineering Simulation
  • 2. DCC Curation Life-Cycle Model
  • 3. Untitled: Slide 3
  • 4. NEEScomm Data Goals
  • 5. Data Archiving at NEES
  • 6. What kind of data?
  • 7. Where are the data coming from?
  • 8. Quality assurance
  • 9. Understandable data
  • 10. CONTENT - Metadata
  • 11. Metadata - Autocomplete
  • 12. Metadata - Autocomplete
  • 13. Metadata - Checkboxes
  • 14. System-generated folders/Tabs
  • 15. Templates
  • 16. Completeness
  • 17. Technical quality
  • 18. Technical quality
  • 19. Technical quality
  • 20. Technical quality
  • 21. Curation - Path through SWAMP
  • 22. Curation - Path to SWAMP
  • 23. Photo credits
  • 24. Questions?