Dear CICI PI: A Letter from the OSDF

The NSF PATh Project would like to help you enhance the experience of the authorized consumers of your data by bringing your dataset to a national-scale data fabric. The Open Science Data Federation (OSDF) that we operate provides remote access to your data through a unified namespace, and its network of caches across the nation and the globe limits the load this access places on the storage hosting your data. We look forward to having your consumers join the growing community of researchers who benefit from the more than 100 transfers per second delivered by the OSDF.

Bringing your data to the OSDF is easy. We can help you copy the data to the NSF-funded OSStore storage operated by PATh, or deploy a Pelican origin that serves as a gateway to the storage that hosts your data. That storage, the “Object Store”, can be local on your campus (a filesystem) or in the cloud (an AWS S3 bucket).

Once in the OSDF, your data can be processed seamlessly using capacity provided by the Open Science Pool (OSPool) that we operate or through NSF-funded services like the National Research Platform (NRP). The OSPool provides US researchers with compute capacity and automation. A large fraction of the more than 220M jobs served by the OSPool in the past year consumed objects provided by the OSDF. The high throughput computing capacity offered by the OSPool (over 0.5M jobs/day) is open to any US researcher; no allocation is needed.

Connecting to the OSDF

We offer two options to connect to the OSDF:

We host it

The OSDF team runs the “origin” service connecting the repository to the OSDF; you provide the S3 credentials or HTTP access to the immutable objects.
If your dataset is on a filesystem, the NRP project hosts a Pelican origin in your DMZ.

You host it

You install and run the Pelican origin at your institution, wherever the dataset is mounted.
This gives you control over the exported data and the authorization configuration.

If neither of the above works, a copy of your dataset can be temporarily hosted on storage contributed by CC* awards through the PATh-operated “OSStore” program.

You manage the access control policy of your data. The Pelican software interfaces with existing systems over standard protocols (such as OAuth2), or you can use CILogon to enable federated SSO and group access.

We will help you combine your dataset with higher-level services, such as the National Data Platform (NDP), to facilitate data discovery.

The NSF-funded OSDF is a service operated by the PATh project (#2030508) and powered by the Pelican platform (#2331480).
