Database Technology at the HUB: Interactive Data Views for Community Shared Data
Abstract
A new "database" resource has been developed to serve the data sharing needs of hub communities. This resource was designed and implemented at cceHUB for the Cancer Care Engineering project, and is now operating across six hubs, offering sixteen community databases. The support infrastructure consists of a MySQL database together with components that provide interfaces for data contribution and data exploration. For simple data models, database tables are created automatically by a spreadsheet parser. More complex data models require manual creation of data tables that describe elements and relationships.
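As a rough illustration of how a spreadsheet parser might derive a table for a simple data model, the sketch below reads a header row and sample rows from CSV text and emits a MySQL CREATE TABLE statement. The function names, the type-inference rule, and the sample data are all assumptions made for illustration; they are not taken from the hub implementation.

```python
import csv
import io

def infer_sql_type(values):
    """Pick a MySQL column type that fits every non-empty value in a column."""
    non_empty = [v for v in values if v.strip()]
    if not non_empty:
        return "VARCHAR(255)"
    # Try the narrowest type first, then fall back to text.
    for sql_type, cast in (("INT", int), ("DOUBLE", float)):
        try:
            for v in non_empty:
                cast(v)
            return sql_type
        except ValueError:
            continue
    return "VARCHAR(255)"

def create_table_sql(table_name, csv_text):
    """Build a CREATE TABLE statement from a header row plus data rows."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, data = rows[0], rows[1:]
    columns = []
    for i, name in enumerate(header):
        col_values = [row[i] for row in data if i < len(row)]
        # Turn the header into a safe identifier (illustrative rule only).
        ident = "".join(c if c.isalnum() else "_" for c in name.strip().lower())
        columns.append(f"`{ident}` {infer_sql_type(col_values)}")
    return f"CREATE TABLE `{table_name}` (\n  " + ",\n  ".join(columns) + "\n);"

# Hypothetical spreadsheet content.
sample = "Sample ID,Weight (g),Site\n1,2.5,Lafayette\n2,3,Indianapolis\n"
print(create_table_sql("samples", sample))
```

A more complex data model, with relationships between tables, would not fit this one-sheet-to-one-table mapping, which is consistent with the manual table creation described above.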
Data is contributed to hub databases either via a spreadsheet parser operating on a standardized data format or through a sequence of web forms representing the application data flow. Web forms are managed by the "com_form" component, which provides automatic form generation and processing. A simple data definition file is created for each form, using a toolkit of constructs for form design. Completed forms are submitted in XML format to back-end Java parsers for data validation, processing, and insertion.
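The validation step can be pictured with a small sketch. The hub back end uses Java parsers; the Python below only illustrates the idea of checking a submitted XML form against a per-form definition. The definition format, the field names, and the function name are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical form definition: field name -> (required, type converter).
FORM_DEFINITION = {
    "patient_id": (True, int),
    "visit_date": (True, str),
    "notes": (False, str),
}

def validate_submission(xml_text, definition):
    """Return (record, errors) for one submitted form."""
    root = ET.fromstring(xml_text)
    record, errors = {}, []
    for name, (required, convert) in definition.items():
        node = root.find(name)
        text = (node.text or "").strip() if node is not None else ""
        if not text:
            if required:
                errors.append(f"missing required field: {name}")
            continue
        try:
            record[name] = convert(text)
        except ValueError:
            errors.append(f"bad value for {name}: {text!r}")
    return record, errors

# Hypothetical submission with the optional "notes" field left out.
submission = (
    "<form><patient_id>42</patient_id>"
    "<visit_date>2011-04-05</visit_date></form>"
)
record, errors = validate_submission(submission, FORM_DEFINITION)
print(record, errors)
```

Only a record that comes back with an empty error list would go on to processing and insertion.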
A powerful "com_dataview" component operates on hub database tables to provide spreadsheet-based browsing, filtering, searching, sorting, and downloading of data. Features include linking to documents and images, viewing photo galleries, launching hub tools, and even dynamic plots and integrated Google Maps. The core dataview code uses JavaScript and open source software (jQuery, jqPlot, DataTables, ExplorerCanvas). A simple data definition file is created for each view, identifying the columns and their features.
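To make the idea of a per-view definition concrete, here is a minimal sketch in which a definition lists the columns and their features, and a helper applies filtering and sorting only where the definition allows it. The actual component runs in JavaScript in the browser; the structure of the definition, the feature flags, and the function name here are assumptions for illustration.

```python
# Hypothetical view definition: which columns to show and what each supports.
VIEW_DEFINITION = {
    "columns": [
        {"field": "sample_id", "label": "Sample", "sortable": True},
        {"field": "site", "label": "Site", "filterable": True},
        {"field": "report", "label": "Report"},
    ]
}

def render_view(rows, definition, filters=None, sort_by=None):
    """Return display rows keyed by column label, filtered and sorted."""
    filters = filters or {}
    columns = definition["columns"]
    filterable = {c["field"] for c in columns if c.get("filterable")}
    sortable = {c["field"] for c in columns if c.get("sortable")}
    # Ignore filters on columns the definition does not mark as filterable.
    active = {f: v for f, v in filters.items() if f in filterable}
    selected = [r for r in rows if all(r.get(f) == v for f, v in active.items())]
    if sort_by in sortable:
        selected = sorted(selected, key=lambda r: r[sort_by])
    return [{c["label"]: r.get(c["field"]) for c in columns} for r in selected]

# Hypothetical table contents.
rows = [
    {"sample_id": 3, "site": "Lafayette", "report": "r3.pdf", "internal": "x"},
    {"sample_id": 1, "site": "Lafayette", "report": "r1.pdf", "internal": "y"},
    {"sample_id": 2, "site": "Indianapolis", "report": "r2.pdf", "internal": "z"},
]
view = render_view(rows, VIEW_DEFINITION, filters={"site": "Lafayette"}, sort_by="sample_id")
print(view)
```

Columns that the definition does not list, such as "internal" in the sample rows, never reach the view.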
A few community databases were created in a single day, while other databases have been continuously revised and advanced for nearly two years. The current database creation process requires some manual interaction, but development efforts are underway to automate the entire process from beginning to end.
Bio
Ann Christine Catlin is a research scientist in the Rosen Center for Advanced Computing at Purdue University. She received a B.S. in Mathematics from Seton Hill University and an M.S. in Mathematics from the University of Notre Dame. She has worked for companies both large (AT&T Information Systems) and small (Applied Data Research), and worked as a research scientist in the Computer Science Department at Purdue University before moving to the Rosen Center. Catlin worked on the design and development of problem-solving environments for partial differential equation-based applications on multi-computer platforms, and co-authored more than 30 peer-reviewed publications on this research over a ten-year period. She created a knowledge-based shipboard troubleshooting system for the U.S. Navy, and for this effort she won the School of Science Merit Award for Extraordinary Achievement in 2004 and the TechPoint Mira Award in 2005. Catlin's work at the Rosen Center has focused on designing and developing research environments based on HUBzero™ technology. She participated in work on pharmaHUB, and currently works on projects for NEEShub, thermalHUB and the health-exchange HUB. She is leading the design and implementation of cceHUB, an infrastructure for collaborative research that supports the cancer care engineering projects.