Use of Hierarchical Keywords for Easy Data Management on HUBzero
Abstract
Since its implementation, HUBzero has been well accepted as a knowledge management and collaboration platform for the reliability engineering (RE) division of a large consumer goods company. The organization uses various RE tools in the form of spreadsheets. Automated workflows have been developed to collect good-quality RE files and publish them on HUBzero as resources for benchmarking and reuse. The Web 2.0 features of HUBzero, such as tags, ratings, and reviews, help users browse efficiently through the various RE tool analyses on HUBzero. We are now working towards automatically and intelligently assigning keywords to each of the reliability tool spreadsheets. These keywords will be displayed on HUBzero in a manner similar to tags, but with scores.
We use a customized statistical approach, based on term frequency, to extract keywords from the RE spreadsheets, which contain data structured in a unique manner. The keywords assigned to a particular RE file (resource) serve two purposes: they help users find RE files, and they give users an idea of a file's content without having to open it. Accordingly, we define two types of keywords. Global keywords direct the user from the top level to a group of RE files and facilitate browsing through related content. File keywords provide specific details of a particular RE file. Both types are determined from two scores associated with each word in an RE file: a file score and a global score. The file score of a word indicates how strongly the word is associated with a particular file, and it will be displayed to the user along with the file information. The global score of a word indicates whether the word is present across a group of files. We are in the process of implementing this approach.
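The two scores described above can be illustrated with a minimal sketch. The abstract does not give the actual formulas, so everything here is an assumption made for illustration: the file score is taken to be a word's count divided by the count of the most frequent word in the file, the global score is taken to be the fraction of files in a group that contain the word, and plain strings stand in for spreadsheet contents. The function names and sample file names are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def file_scores(text):
    """Assumed file score: word count / count of the file's most frequent word."""
    counts = Counter(tokenize(text))
    if not counts:
        return {}
    top = max(counts.values())
    return {term: n / top for term, n in counts.items()}


def global_scores(files):
    """Assumed global score: fraction of files in the group containing the word."""
    doc_freq = Counter()
    for text in files.values():
        # Count each word at most once per file.
        doc_freq.update(set(tokenize(text)))
    total = len(files)
    return {term: n / total for term, n in doc_freq.items()}


def top_keywords(scores, k=3):
    """Return the k highest-scoring words, breaking ties alphabetically."""
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]


# Hypothetical stand-ins for the text content of three RE spreadsheets.
files = {
    "fmea_pump.xls": "pump seal leak pump failure seal pump",
    "fmea_motor.xls": "motor bearing failure motor overheating",
    "test_plan.xls": "pump test failure test schedule",
}

if __name__ == "__main__":
    for name, text in files.items():
        print(name, top_keywords(file_scores(text)))
    print("global", top_keywords(global_scores(files)))
```

In this sketch, a word such as "failure" appears in every file and so gets a high global score, which would make it a candidate global keyword, while "seal" scores well only within one file and is a candidate file keyword. A real implementation would also need stop-word filtering and handling of the spreadsheet structure, neither of which is shown here.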
Bio
Gaurav Nanda is a PhD student in the School of Industrial Engineering at Purdue University. He is working with Professor Mark Lehto in the area of text mining and collaborative knowledge management. Before joining the PhD program, he worked for five years with Infosys Technologies, designing and implementing large-scale software systems in the area of retail banking. He obtained his Bachelor's degree in Agricultural and Food Engineering and his Master's degree in Water Resource Development and Management from the Indian Institute of Technology Kharagpur. He worked in the area of non-conventional optimization for his Bachelor's and Master's theses.
Submitter
Nikki Huang
Purdue University
1. Use of Hierarchical Keywords f…
2. Reliability Tools as Resources
3. HUBzero
4. HUBzero
5. HUBzero
6. HUBzero
7. Keywords/Tags
8. Keyword Extraction
9. Keyword Extraction
10. Keyword Extraction
11. Keywords Display
12. Keywords Display
13. Future Work
14. Thank You Questions?