Managing Workflows Within HUBzero: How to Use Pegasus to Execute Computational Pipelines
Abstract
This talk will focus on constructing and executing computational pipelines (workflows) within the HUBzero environment using the Pegasus Workflow Management System. Pegasus is available today in NEES.org, DiaGrid.org, and other hubs. Pegasus allows users to develop workflows at a high level of abstraction, without worrying about the details of the execution environment. A workflow description includes information about the workflow steps and about the input data they consume and the output data they produce. Each hub is pre-configured for a particular execution environment, so that users can seamlessly launch their workflows on the available resources. Pegasus provides monitoring interfaces for following the progress of a workflow. When failures occur, it tries to recover from them; if recovery is not possible, it provides detailed failure information. The standalone version of Pegasus has been used in a variety of domains: astronomy, bioinformatics, earth science, physics, and others. Pegasus within the hub opens up its capabilities to a broader range of users.
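As a rough illustration of what "a high level of abstraction" can mean here, the sketch below describes a workflow only in terms of its steps and the logical files each step reads and writes, then derives a valid execution order from that data flow. This is not the Pegasus API: the `Step` class, the `execution_order` function, and the file names are all hypothetical, chosen only to show the idea in plain Python (3.9 or later, for `graphlib`).

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter


@dataclass(frozen=True)
class Step:
    """One workflow step: a name plus the logical files it reads and writes.

    Hypothetical structure for illustration; not a Pegasus class.
    """
    name: str
    inputs: frozenset = field(default_factory=frozenset)
    outputs: frozenset = field(default_factory=frozenset)


def execution_order(steps):
    """Return step names in an order that respects data dependencies."""
    # Map each logical file to the step that produces it.
    producer = {}
    for step in steps:
        for filename in step.outputs:
            producer[filename] = step.name

    # A step depends on every step that produces one of its inputs.
    # Inputs with no producer (e.g. raw.dat) are treated as pre-existing data.
    graph = {
        step.name: {producer[f] for f in step.inputs if f in producer}
        for step in steps
    }
    return list(TopologicalSorter(graph).static_order())


# Steps are listed out of order on purpose; the order is derived from the data.
steps = [
    Step("analyze", frozenset({"clean.dat"}), frozenset({"result.dat"})),
    Step("preprocess", frozenset({"raw.dat"}), frozenset({"clean.dat"})),
    Step("plot", frozenset({"result.dat"}), frozenset({"figure.png"})),
]

print(execution_order(steps))  # ['preprocess', 'analyze', 'plot']
```

Nothing in this description says where the steps run or where the files physically live. Under the assumptions of the sketch, that is the part a workflow system would fill in when mapping the description onto a hub's pre-configured resources.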
Bio
Ewa Deelman is a Research Associate Professor at the USC Computer Science Department and a Project Leader at the USC Information Sciences Institute. Dr. Deelman's research interests include the design and exploration of collaborative, distributed scientific environments, with particular emphasis on workflow management as well as the management of large amounts of data and metadata. At ISI, Dr. Deelman is leading the Pegasus project, which designs and implements workflow mapping techniques for large-scale applications running in distributed environments. Pegasus is being used today in a number of scientific disciplines, enabling researchers to formulate complex computations in a declarative way. Over the years, Dr. Deelman has worked with a number of application domains including astronomy, bioinformatics, earthquake science, gravitational-wave physics, and others. These collaborations have produced advances both in computer science and in the domain sciences. For example, the data-intensive workflows in LIGO (gravitational-wave physics) motivated new workflow analysis algorithms that minimize the workflow data footprint during execution. Conversely, improvements in the scalability of workflows enabled SCEC scientists (earthquake science) to develop new physics-based seismic hazard maps of Southern California. In 2007, Dr. Deelman edited a book on workflow research, "Workflows in e-Science: Scientific Workflows for Grids", published by Springer. She is also the founder of the annual Workshop on Workflows in Support of Large-Scale Science, which is held in conjunction with the Supercomputing conference. In 1997, Dr. Deelman received her PhD in Computer Science from the Rensselaer Polytechnic Institute. Her thesis was in the area of parallel discrete event simulation, where she applied parallel programming techniques to the simulation of the spread of Lyme disease in nature.
1. Managing Workflows Within HUBz…
2. Outline
3. Computational workflows
4. Workflow Management
5. Our Approach
6. Pegasus Workflow Management Sy…
7. Planning Process
8. Generating executable workflow…
9. Advanced features
10. HUBzero Integration Pegasus wi…
11. Pegasus Tutorial
12. Benefits of Pegasus for HUB Us…
13. Benefits of Pegasus for HUB Us…
14. Pegasus in HUBzero
15. Abstract Workflow (DAX)
16. Use of Pegasus with Submit Com…
17. Data and transformation info
18. Rappture (data definitions)
19. wrapper.py
20. Workflow generation
21. User provides inputs to the wo…
22. Workflow has completed. Output…
23. OpenSEES / NEEShub
24. The OpenSeesLab tool: http://n…
25. Rappture
26. Future Directions
27. Benefits of workflows in the H…
28. Benefits of the HUB to Pegasus
29. Further Information