Some brief notes from plenaries
Virtualization, partly because of power requirements. The simplest and most powerful thing you can do. Protects the web apps and databases that are “lashed up”: a valid use case, but not enterprise-grade. Virtualization allows an enterprise-like environment that preserves this innovation without danger. Vital for science
Not coming soon – VMs for Grids and Clusters. Too much admin hassle
Storage – first 100 TB single-namespace project. Jobs lost over data loss. Data triage is a given. Examples: a single namespace on a Mac system has 80 TB; 1.1 PB on a Linux system
Users have no idea of the true cost of storage. $124 for 1 TB of hardware is misleading. Individual labs put in 100 TB+ systems
Unlimited data storage days are over – need triage. Cheaper to repeat the experiment than keep the data
Data loss – example – double disk failure in metadata – 10 TB lost at a government lab. You will get double disk failures. Need RAID6
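A rough sketch of why double failures are expected at scale (all numbers below are illustrative assumptions, not figures from the talk): under RAID5, data is lost if any second disk fails during the rebuild window after the first failure.

```python
# Back-of-the-envelope: probability a second disk fails during a RAID5 rebuild.
# All parameters are assumed for illustration.
disks = 12              # assumed array size
afr = 0.05              # assumed 5% annual failure rate per disk
rebuild_hours = 24      # assumed rebuild window for a large modern disk

hourly_rate = afr / (365 * 24)
# chance that at least one of the remaining disks fails during the rebuild
p_second_failure = 1 - (1 - hourly_rate) ** ((disks - 1) * rebuild_hours)
print(f"P(second failure during one rebuild) ~ {p_second_failure:.4f}")
```

Even at a fraction of a percent per rebuild, a lab running many large arrays for years will see double failures; RAID6 survives them by tolerating two concurrent disk losses.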
Backup is becoming a thing of the past, no “nightly full”.
Amazon, Google and MS can store for 80 cents / GB / year. Can you do that?
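A unit-conversion sketch of the cost comparison in these notes (the 3-year disk lifetime is an assumption for illustration): the gap between raw disk cost and the cloud price is roughly what power, admin, and replication must cost in-house.

```python
# Compare the cited cloud price to the cited raw-hardware price per GB per year.
cloud = 0.80                 # $/GB/year, figure cited for Amazon/Google/MS
raw_hw = 124 / 1000          # $124/TB -> $/GB one-time hardware cost, from the notes
life_years = 3               # assumed disk replacement cycle

raw_hw_per_year = raw_hw / life_years
gap = cloud - raw_hw_per_year
print(f"raw disks alone: ${raw_hw_per_year:.3f}/GB/year")
print(f"remainder for power, admin, replication: ${gap:.3f}/GB/year")
```

The point of the question in the notes: disks are a small slice of the $0.80, so matching the cloud price in-house means matching everything else too.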
IT cannot be the sole decision maker for triage or for storage optimization
Rate limits are chemistry, reagent costs and human factors
Problem is somewhat scary but most people are surviving
Amazon is the cloud – has a multi-year head start
Security in the cloud – don’t expect things that you don’t provide. Objections are often political
Compute power is easier than IO. He believes that Amazon is working on data ingestion.
Will be a big move of science data into the storage cloud. Science data will take a one-way trip: data will stay in the cloud, and only derived data will return.
McKinsey report on Cloud Computing is very good; also James Hamilton
Watch Amazon, Google and MS.
Best data practices are starting to trickle out. Google is now showing what it did 5 years ago, so they must be up to very exciting things now
Finally – federated data storage – something for the future