Video: Recent Results and Open Problems for Resilience at Scale

by Rich Brueckner, from High-Performance Computing News Analysis | insideHPC (#3WNRQ)

In this video from PASC18, Yves Robert from École normale supérieure de Lyon in France presents: Recent Results and Open Problems for Resilience at Scale. "The talk will address the following three questions: (i) fail-stop errors: checkpointing or replication or both? (ii) silent errors: application-specific detectors or plain old trustworthy replication? In terms of workflows: how to avoid checkpointing every task?"

The post Video: Recent Results and Open Problems for Resilience at Scale appeared first on insideHPC.
