Use of Hierarchical Keywords for Easy Data Management on HUBzero

By Gaurav Nanda¹; Jonathan Tan; Peter Auyeung; Bill Gaskill; Christopher A Smoak¹; Mark Lehto¹

1. Purdue University

Category

Seminars

Published on

Abstract

Since its implementation, HUBzero has been well accepted as a knowledge management and collaboration platform for the reliability engineering (RE) division of a large consumer goods company. The organization uses various RE tools in the form of spreadsheets. Automated workflows have been developed to collect good-quality RE files and publish them on HUBzero as resources for benchmarking and reuse. The Web 2.0 features of HUBzero, such as tags, ratings, and reviews, help users browse efficiently through the various RE tool analyses. We are now working toward automatically assigning keywords to each reliability tool spreadsheet. These keywords will be displayed on HUBzero in a similar fashion to tags, but with scores.

To identify keywords in the RE spreadsheets, which contain data structured in a unique manner, we use a customized statistical approach to keyword extraction based on term frequency. The keywords assigned to a particular RE file (resource) serve two purposes: they help users find RE files, and they give users an idea of a file's content without opening it. Hence, we have two types of keywords. Global keywords direct the user from the top level to a group of RE files and facilitate browsing through related content. File keywords provide specific details of a particular RE file. Both types are determined using two scores associated with each word in an RE file: the file score and the global score. The file score of a word indicates the strength of its association with a particular file and will be displayed to the user along with the file information. The global score of a word indicates whether it is present across a group of files. We are in the process of implementing this approach.
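The two scores described above can be illustrated with a minimal sketch. The abstract does not give the scoring formulas, so the definitions here are assumptions made only for illustration: the file score is taken to be a word's relative term frequency within one file, and the global score is taken to be the fraction of files in a group that contain the word. The function names and the sample file contents are hypothetical.

```python
from collections import Counter


def file_scores(words):
    """Assumed file score: relative term frequency of each word in one file."""
    counts = Counter(w.lower() for w in words)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


def global_scores(files):
    """Assumed global score: fraction of files in the group containing each word."""
    files_containing = Counter()
    for words in files.values():
        # Use a set so each word is counted at most once per file.
        files_containing.update({w.lower() for w in words})
    n_files = len(files)
    if n_files == 0:
        return {}
    return {word: count / n_files for word, count in files_containing.items()}


def top_keywords(scores, k=3):
    """Return the k highest-scoring words; ties are broken alphabetically."""
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))[:k]


# Hypothetical words extracted from the cells of two RE spreadsheets.
files = {
    "fmea_pump.xlsx": ["pump", "seal", "leak", "pump", "failure"],
    "fmea_motor.xlsx": ["motor", "bearing", "failure", "failure"],
}

pump_file_keywords = top_keywords(file_scores(files["fmea_pump.xlsx"]), k=2)
group_keywords = top_keywords(global_scores(files), k=1)
```

Under these assumed definitions, a word such as "failure" that appears in every file of the group receives a high global score and could serve as a global keyword, while "pump" scores highest only within its own file and would be shown as a file keyword.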

Bio

Gaurav Nanda is a PhD student in the School of Industrial Engineering at Purdue University. He is working with Professor Mark Lehto in the area of text mining and collaborative knowledge management. Before joining the PhD program, he worked for five years at Infosys Technologies, designing and implementing large-scale software systems in the area of retail banking. He obtained his Bachelor's degree in Agricultural and Food Engineering and his Master's degree in Water Resource Development and Management from the Indian Institute of Technology Kharagpur. He worked in the area of non-conventional optimization for his Bachelor's and Master's theses.

Cite this work

Researchers should cite this work as follows:

  • Gaurav Nanda; Jonathan Tan; Peter Auyeung; Bill Gaskill; Christopher A Smoak; Mark Lehto (2013), "Use of Hierarchical Keywords for Easy Data Management on HUBzero," https://help.hubzero.org/resources/1052.

Submitter

Nikki Huang

Purdue University

Slides

  • 1. Use of Hierarchical Keywords for Easy Data Management on HUBzero
  • 2. Reliability Tools as Resources
  • 3. HUBzero
  • 4. HUBzero
  • 5. HUBzero
  • 6. HUBzero
  • 7. Keywords/Tags
  • 8. Keyword Extraction
  • 9. Keyword Extraction
  • 10. Keyword Extraction
  • 11. Keywords Display
  • 12. Keywords Display
  • 13. Future Work
  • 14. Thank You / Questions?