OSDF Cache Selection
April 4th, 2025
Fabio Andrijauskas - University of California San Diego
The OSPool leverages the Open Science Data Federation (OSDF) to provide efficient data access for computing jobs and software, ensuring seamless and reliable data transfers. One of its most notable features is the ability to dynamically locate the closest cache for each data transfer, minimizing latency and improving overall performance. By optimizing data movement, the OSPool improves access to large datasets, reduces network congestion, and accelerates computational workflows. This distributed approach lets researchers and scientists access the data they need regardless of their physical location, enabling faster and more effective scientific discoveries.
Executive summary
This document presents the results of testing the OSDF cache selection mechanism, which was exercised by repeatedly requesting files from most of the geographic locations that serve the OSPool. Most of the tests were positive, delivering data from the expected cache location, but the OSDF cache selection did pick a suboptimal cache at a few geographical locations. The behavior was consistent in both low-load and stress-test setups.
Recommendations
There is one recommendation for the Pelican Platform Team:
· Create a tool that periodically checks each site and the cache selected for it. The same tool could also be used to measure transfer rates for sites and caches.
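As a minimal sketch of what such a tool might record (not an existing Pelican feature), the Python below stores one check per site and derives whether the expected cache was selected and what the transfer rate was. The names `CheckResult`, `run_check`, and `fake_probe` are hypothetical, and the probe is a stand-in for whatever would actually perform the transfer.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# A probe takes a site name and returns
# (selected cache host, bytes transferred, elapsed seconds).
# Hypothetical interface, assumed for this sketch only.
Probe = Callable[[str], Tuple[str, int, float]]


@dataclass
class CheckResult:
    site: str
    expected_cache: str
    selected_cache: str
    bytes_moved: int
    seconds: float

    @property
    def cache_ok(self) -> bool:
        # True when the cache that served the transfer is the expected one.
        return self.selected_cache == self.expected_cache

    @property
    def rate_mb_per_s(self) -> float:
        # Transfer rate in megabytes per second (1 MB = 10**6 bytes).
        if self.seconds <= 0:
            return 0.0
        return self.bytes_moved / self.seconds / 1e6


def run_check(site: str, expected_cache: str, probe: Probe) -> CheckResult:
    # Run one probe and package the outcome for later reporting.
    selected, nbytes, secs = probe(site)
    return CheckResult(site, expected_cache, selected, nbytes, secs)


def fake_probe(site: str) -> Tuple[str, int, float]:
    # Stand-in for a real transfer: 50 MB in 2 seconds from the UW cache.
    return ("osdf-uw-cache.svc.osg-htc.org", 50_000_000, 2.0)


result = run_check("chtc", "osdf-uw-cache.svc.osg-htc.org", fake_probe)
print(result.cache_ok, result.rate_mb_per_s)
```

A scheduler such as cron could call `run_check` at fixed intervals and store the results; that part is left out of the sketch.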
Next, the recommendations for the OSDF operations are:
· Check why the Pelican Platform cannot reach the proper cache in a small number of cases.
· Check the GeoIP positions of the sites that were given the wrong closest cache.
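One way to sanity-check a GeoIP position is to recompute which cache is nearest from coordinates. The sketch below assumes that "closest" means the shortest great-circle distance, which may not match how the director actually ranks caches. The coordinates are approximate city-level values inferred from the cache host names, not GeoIP data, and the function names are hypothetical.

```python
import math
from typing import Dict, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    # Great-circle distance between two points on a sphere of radius 6371 km.
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def nearest_cache(site: Position, caches: Dict[str, Position]) -> str:
    # Return the cache name with the smallest distance to the site.
    return min(caches, key=lambda name: haversine_km(*site, *caches[name]))


# Approximate positions, assumed from the host names (illustration only).
CACHES = {
    "dtn-pas.bois.nrp.internet2.edu": (43.6, -116.2),              # Boise, Idaho
    "dtn-pas.cinc.nrp.internet2.edu": (39.1, -84.5),               # Cincinnati, Ohio
    "osg-sunnyvale-stashcache.nrp.internet2.edu": (37.4, -122.0),  # Sunnyvale, California
}

palo_alto = (37.4, -122.1)  # approximate
print(nearest_cache(palo_alto, CACHES))
```

If the cache computed this way differs from the one the director selected, the GeoIP entry for the site is a natural first thing to inspect.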
Finally, the recommendations for the OSPool operations are:
· Create a pattern for OSPool site and node names (e.g., Country_State_City_baseInternetDomain_Project).
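A naming pattern like the one suggested above is easy to validate mechanically. The following sketch assumes five underscore-separated fields and that no field contains an underscore itself; the example name `US_WI_Madison_wisc.edu_CHTC` is made up for illustration, and the helper names are hypothetical.

```python
from typing import NamedTuple


class SiteName(NamedTuple):
    country: str
    state: str
    city: str
    domain: str
    project: str


def parse_site_name(name: str) -> SiteName:
    # Split on underscores and require exactly five non-empty fields.
    parts = name.split("_")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"site name does not match the pattern: {name!r}")
    return SiteName(*parts)


def format_site_name(site: SiteName) -> str:
    # Inverse of parse_site_name.
    return "_".join(site)


example = parse_site_name("US_WI_Madison_wisc.edu_CHTC")
print(example.state, example.project)
```

With names in this form, the expected region of a site can be read directly from its name instead of being looked up by hand.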
Detailed explanation
The tests were run from ap21.uc.osg-htc.org, using the list of all available OSPool sites obtained with
    condor_status -pool gfactory-1.osg-htc.org -any \
        -const 'MyType=="glidefactory"' \
        -af GLIDEIN_Supported_VOs GLIDEIN_Site \
      | grep OSGVO | sort -u | awk '{print $2}' | uniq > sites
and sites from the K6 testing tool. Each site was tested 15 times with a single script that issued HTTP requests to the OSDF Director and also used the Pelican client; the methodology consisted of comparing the site location with the location of the cache selected by the OSDF Director. Table 1 shows the site name, the selected cache and its location, and the site location for the low-load test. In addition, a stress test was run on each site, requesting 100 files in parallel and checking the location of the selected cache.
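The per-site check described above can be sketched as follows. It assumes the director answers each HTTP request with a redirect whose `Location` header points at the selected cache; the URL, port, and helper names are hypothetical, and the 15 responses are hard-coded rather than fetched from the network.

```python
from collections import Counter
from typing import Iterable
from urllib.parse import urlsplit


def cache_host(location: str) -> str:
    # Extract the host name of the cache from a redirect URL.
    host = urlsplit(location).hostname
    if host is None:
        raise ValueError(f"no host in redirect URL: {location!r}")
    return host


def tally(locations: Iterable[str]) -> Counter:
    # Count how many trials were redirected to each cache.
    return Counter(cache_host(loc) for loc in locations)


def is_consistent(counts: Counter) -> bool:
    # True when every trial was sent to the same cache.
    return len(counts) == 1


# Fifteen trials for one site; the redirect URL is made up for illustration.
trials = ["https://osdf-uw-cache.svc.osg-htc.org:8443/hypothetical/test/file"] * 15
counts = tally(trials)
print(counts.most_common(1), is_consistent(counts))
```

Comparing the most common cache with the site's known location then gives the yes/no answer recorded for each site.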
- Sixty-eight sites were used in the test; only four reported a wrong closest cache. Considering only OSG sites, only two showed the wrong location.
- Two sites reported different positions with the Pelican client than with the HTTP request.
- The stress test did not show any wrong locations for the transfers.
- A different cache was reported whenever the closest cache location was wrong.
Table 1: Site name, selected cache, and actual site location. A 'k6 site' is a host used by the K6 load-testing tool; these hosts run in the Amazon cloud.
Site name | Selected cache and location/state | Site location/state | Closest location (Pelican director) | Closest location (Pelican client)
singapore k6 site | singapore.nationalresearchplatform.org - Singapore | Singapore | yes | -
brazil k6 site | osdf-cache.sprace.org.br - Brazil | Brazil | yes | -
tiger-osg-backfill-prod | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Wisconsin | yes | yes
clemson.edu | dtn-pas.cinc.nrp.internet2.edu - Ohio | South Carolina | yes | yes
empest-epyc | dtn-pas.bois.nrp.internet2.edu - Idaho | Montana | yes | yes
lp126.lonepeak.peaks | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
mortimer.hpc.uwm.edu | dtn-pas.bois.nrp.internet2.edu - Idaho | Wisconsin | yes | yes
chpc.utah.edu | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
int.chpc.utah.edu | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
arkansas state university | unl-cache.nationalresearchplatform.org - Nebraska | Arkansas | yes | yes
amazon:us:columbus k6 site | osg-new-york-stashcache.nrp.internet2.edu - New York | Ohio | yes | -
amazon:ca:montreal k6 site | mghpcc-cache.nationalresearchplatform.org - Massachusetts | Montreal - Canada | yes | -
amazon:de:frankfurt k6 site | amst-fiona.nationalresearchplatform.org - Amsterdam - Netherlands | Frankfurt - Germany | yes | -
amazon:gb:london k6 site | lond-osdf-xcache01.es.net - London - UK | London - England | yes | -
amazon:fr:paris k6 site | lond-osdf-xcache01.es.net - London - UK | Paris - France | yes | -
amazon:us:palo alto k6 site | dtn-pas.cinc.nrp.internet2.edu - Ohio; dtn-pas.bois.nrp.internet2.edu - Idaho | California | no | -
amazon:us:portland k6 site | osg-chicago-stashcache.nrp.internet2.edu - Chicago; mghpcc-cache.nationalresearchplatform.org - Massachusetts | Portland | no | -
amazon:us:ashburn k6 site | osg-new-york-stashcache.nrp.internet2.edu - New York | Virginia | yes | -
alabama-chpc | dtn-pas.jack.nrp.internet2.edu - Florida | Alabama | yes | yes
amnh | osg-new-york-stashcache.nrp.internet2.edu - New York | New York | yes | yes
asu-sol | fdp-d3d-cache.nationalresearchplatform.org - California; ncar-cache.nationalresearchplatform.org - Colorado | Arizona | no | no
beocat-slate | unl-cache.nationalresearchplatform.org - Nebraska | Kansas | yes | yes
cameron university | unl-cache.nationalresearchplatform.org - Nebraska | Oklahoma | yes | yes
center for advanced research computing | fdp-d3d-cache.nationalresearchplatform.org - California | California | yes | yes
chtc | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Wisconsin | yes | yes
chtc-spark | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Wisconsin | yes | yes
colorado | osg-new-york-stashcache.nrp.internet2.edu (director) - New York | Colorado | no | no
duke-ncshare | dtn-pas.cinc.nrp.internet2.edu - Ohio | North Carolina | yes | yes
elsa | osg-new-york-stashcache.nrp.internet2.edu - New York | New Jersey | yes | yes
fandm-its | osg-new-york-stashcache.nrp.internet2.edu - New York | Philadelphia | yes | yes
fnal | osg-new-york-stashcache.nrp.internet2.edu - New York | Illinois | yes | yes
fnal_gpgrid | dtn-pas.bois.nrp.internet2.edu (director) - Idaho | Illinois | no | no
gatech | dtn-pas.jack.nrp.internet2.edu - Florida | Georgia | yes | yes
grid_ce2 | osg-chicago-stashcache.nrp.internet2.edu - Illinois | Michigan | yes | yes
gsu-acids | dtn-pas.jack.nrp.internet2.edu - Florida | Georgia | yes | yes
hawaii-koa | osg-sunnyvale-stashcache.nrp.internet2.edu - California | Hawaii | yes | yes
iitisi | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Michigan | yes | yes
langston-lion | osg-houston-stashcache.nrp.internet2.edu - Texas | Oklahoma | yes | yes
lsuhsc-tigerfish | dtn-pas.hous.nrp.internet2.edu - Texas | Louisiana | yes | yes
maine-acgmaine-penobscot | mghpcc-cache.nationalresearchplatform.org - Massachusetts | Maine | yes | yes
michigan | osg-chicago-stashcache.nrp.internet2.edu - Illinois | Michigan | yes | yes
mi-horus | dtn-pas.bois.nrp.internet2.edu - Idaho | Missouri | yes | yes
missouri-hellbender | osg-chicago-stashcache.nrp.internet2.edu - Illinois | Missouri | yes | yes
mtstate-tempest | dtn-pas.bois.nrp.internet2.edu - Idaho | Montana | yes | yes
numepodu-ubuntuoru-titan | unl-cache.nationalresearchplatform.org - Nebraska | Oklahoma | yes | yes
osg_us_fsu_hnpgrid | dtn-pas.jack.nrp.internet2.edu - Florida | Florida | yes | yes
pdx-coeus | dtn-pas.bois.nrp.internet2.edu - Idaho | Portland | yes | yes
psu-ligo | osg-new-york-stashcache.nrp.internet2.edu - New York | Pennsylvania | yes | yes
puertorico | dtn-pas.jack.nrp.internet2.edu - Florida | Puerto Rico | yes | yes
purdue-anvil | dtn-pas.bois.nrp.internet2.edu - Idaho | Purdue | yes | yes
rhodes-hpc | dtn-pas.cinc.nrp.internet2.edu - Ohio | Tennessee | yes | yes
siue-cc-production | osg-chicago-stashcache.nrp.internet2.edu - Illinois | Illinois | yes | yes
swan | unl-cache.nationalresearchplatform.org - Nebraska | Nebraska | yes | yes
swarthmore-firebird | osg-new-york-stashcache.nrp.internet2.edu - New York | Pennsylvania | yes | yes
tntech-warp1 | dtn-pas.cinc.nrp.internet2.edu - Ohio | Tennessee | yes | yes
denveruchicago | dtn-pas.denv.nrp.internet2.edu - Colorado | Denver | yes | yes
ucmerced-pinnacles | osg-sunnyvale-stashcache.nrp.internet2.edu - California | California | yes | yes
uconn | mghpcc-cache.nationalresearchplatform.org - Massachusetts | New York | yes | yes
uconn-hpc | mghpcc-cache.nationalresearchplatform.org - Massachusetts | New York | yes | yes
unespunr-cc | osg-sunnyvale-stashcache.nrp.internet2.edu - California | Brazil | no | no
usd-lawrence | unl-cache.nationalresearchplatform.org - Nebraska | South Dakota | yes | yes
utah-granite | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
utah-kingspeak | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
utah-lonepeak | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
utah-notchpeak | dtn-pas.bois.nrp.internet2.edu - Idaho | Utah | yes | yes
uw-it | unl-cache.nationalresearchplatform.org - Nebraska | UW | yes | yes
uwm-mortimer | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Wisconsin | yes | yes
emporia state university | unl-cache.nationalresearchplatform.org - Nebraska | Kansas | yes | yes
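Once the rows of Table 1 are written one per line with ` | ` between cells, the sites that did not get the closest cache can be listed mechanically. The sketch below is illustrative only: it uses three rows copied from the table, and the function names are hypothetical.

```python
from typing import Dict, List

# Three rows from Table 1, one per line, cells separated by " | ".
ROWS = """\
singapore k6 site | singapore.nationalresearchplatform.org - Singapore | Singapore | yes | -
chtc | osdf-uw-cache.svc.osg-htc.org - Wisconsin | Wisconsin | yes | yes
unespunr-cc | osg-sunnyvale-stashcache.nrp.internet2.edu - California | Brazil | no | no
"""


def parse_rows(text: str) -> List[Dict[str, str]]:
    # Turn each five-cell line into a dictionary keyed by column.
    rows = []
    for line in text.strip().splitlines():
        site, cache, location, director, client = [c.strip() for c in line.split(" | ")]
        rows.append({
            "site": site,
            "cache": cache,
            "location": location,
            "director": director,
            "client": client,
        })
    return rows


def wrong_cache_sites(rows: List[Dict[str, str]]) -> List[str]:
    # A site counts as wrong if either the director or the client column says "no".
    return [r["site"] for r in rows if "no" in (r["director"], r["client"])]


print(wrong_cache_sites(parse_rows(ROWS)))
```

The "-" entries (no Pelican client result for the k6 hosts) are neither "yes" nor "no", so they are ignored by this check.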