The Partnership to Advance Throughput Computing (PATh) project is funded by the NSF to advance High Throughput Computing (HTC) and its impact on research and education. To achieve this goal, PATh operates a fabric of capacity services that ranges from Research Computing Facilitation to Access Points capable of managing distributed HTC workloads. Starting in January 2022, PATh will add support for capacity credit accounts to its fabric of services. These accounts will give PIs with HTC workloads access to the processing and storage capacity of dedicated resources managed by PATh. NSF will deposit credit into these accounts when funding a proposal.
A team of PATh Facilitators is available to guide PIs in effectively using the Access Points to manage their HTC workloads and in utilizing their credits; training and documentation materials are also available. Access Points provide a rich set of capabilities for managing workloads that consist of individual jobs, sets of jobs, or Directed Acyclic Graphs (DAGs) of jobs. Further, PIs can utilize Access Points to perform data placement and data caching through the Open Science Data Federation.
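To make these building blocks concrete, below is a minimal sketch of an HTCondor submit description for a set of jobs and a small DAGMan workflow; the file names, resource requests, and workflow shape are hypothetical illustrations, not a prescribed PATh configuration.

    # analyze.sub -- a minimal HTCondor submit description for a set of
    # jobs (executable, file names, and resource requests are illustrative)
    executable     = analyze.sh
    arguments      = input_$(Process).dat
    request_cpus   = 1
    request_memory = 2GB
    request_disk   = 4GB
    log            = analyze.log
    output         = analyze_$(Process).out
    error          = analyze_$(Process).err
    queue 10

    # workflow.dag -- a small DAGMan workflow: B and C run after A
    # completes, and D runs only after both B and C succeed
    JOB A prepare.sub
    JOB B analyze.sub
    JOB C analyze.sub
    JOB D combine.sub
    PARENT A CHILD B C
    PARENT B C CHILD D

The set of jobs would be placed with condor_submit analyze.sub, and the workflow with condor_submit_dag workflow.dag.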
The two PATh partners, the OSG Consortium and the UW-Madison Center for High Throughput Computing (CHTC), have a long track record of turning the potential of distributed HTC into scientific discovery in a broad range of domains. Information about how research efforts, ranging from single PIs to international collaborations, have leveraged the power of our HTC services is documented in our collection of science stories.
The dedicated PATh resources that power the capacity credit accounts are expected to consist of:
The dedicated PATh resources will be distributed across five sites, with a sixth location to be confirmed; credits for dedicated resources can be used from any PATh Access Point.
PATh dedicated resources are not the same as the Open Science Pool (OSPool). The OSPool is composed of resources, often opportunistic in nature, that PATh manages and shares among users through fairshare scheduling.
The dedicated resources are funded by NSF to be managed by PATh and accessible via the PATh credit account system. This gives PATh the ability to set policy. For example, users have more flexibility in their workloads: jobs can have much longer runtimes than on the OSPool, where long-running jobs are often preempted by the local site.
Workloads placed on the PATh Access Points can harness resource pools beyond the credit-based dedicated resources. For example, the Open Science Pool (OSPool) capacity consists of opportunistic resources aggregated across about 50 sites and shared on a fairshare basis. PIs can also utilize their XRAC or Frontera allocations through Access Points.
Requests should be for workloads that are amenable to the distributed high throughput computing services provided by PATh; to help the evaluation, descriptions of HTC workloads should include the following information:
The dedicated PATh resources support executing software within containers or as portable, self-contained applications.
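As one hedged illustration, recent HTCondor versions provide a container universe; the sketch below assumes a hypothetical script and a public Docker image, and older Access Points may instead rely on Singularity/Apptainer-specific attributes.

    # container-job.sub -- run a job inside a container (image and
    # script names are illustrative; requires an HTCondor version
    # that supports the container universe)
    universe        = container
    container_image = docker://python:3.10
    executable      = run_model.py
    request_cpus    = 1
    request_memory  = 2GB
    log             = model.log
    output          = model.out
    error           = model.err
    queue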
A strength of the PATh Access Points is that users can get started without any credits by using the OSPool's opportunistic resources. We encourage users to contact [email protected] to get started on an Access Point today.
Users place data at the PATh Access Point; from there, it can be moved to the computing resources by:
Given the distributed nature of the hardware, there is no global shared filesystem.
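Since jobs cannot assume a shared filesystem, inputs and outputs are declared explicitly and moved by HTCondor's file transfer mechanism, which also accepts OSDF-backed URLs. A minimal sketch follows, with hypothetical file names and an illustrative osdf:/// path:

    # transfer-job.sub -- explicit file transfer in place of a shared
    # filesystem (file names and the osdf:/// path are hypothetical)
    executable              = process.sh
    transfer_input_files    = params.conf, osdf:///ospool/data/example/dataset.tar.gz
    should_transfer_files   = YES
    when_to_transfer_output = ON_EXIT
    request_cpus            = 1
    request_memory          = 2GB
    log                     = process.log
    queue

Small inputs travel with the job from the Access Point, while larger or widely shared datasets are better served through OSDF caching.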
PATh Research Computing Facilitators are available to help explain the above concepts and discuss how a workload can be adapted to run on PATh. Please contact [email protected] with questions about PATh resources, using HTC, or estimating credit needs.
HTC resource management takes different approaches than many batch systems. For example, PATh has more scheduling flexibility for smaller jobs, so the charge in credits escalates for larger jobs.
Please see our 2022 charge listing for more details.
There are currently five locations that are expected to have dedicated hardware during 2022:
Additionally, there will be one location, to be confirmed, embedded in the nation’s R&E network infrastructure backbone.
Compute and GPU credits are valid at all sites. By default, jobs may go to any location, but users can add restrictions to target a single location. For example, jobs may be restricted to San Diego because they need to access a dataset stored at that location.
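As a hedged sketch of such a restriction (the ClassAd attribute and site value shown are assumptions; GLIDEIN_Site is the attribute commonly advertised in OSG-operated pools, and facilitators can confirm the exact names for PATh hardware), a submit file could include:

    # Pin a job to a single location via its requirements expression
    # (attribute and site names are assumptions, not confirmed values)
    requirements = (GLIDEIN_Site == "SDSC")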