Comments (9)
Excuse me, but what about the native CephFS, which is already production-ready and even has a more or less working Dokan interface for Windows?
Here http://www.dreamhoststatus.com/2016/10/11/dreamcompute-us-east-1-cluster-service-disruption/ some folks spent two days bringing a crashed Ceph cluster back up.
Your comment reads as sensationalist ("AAAA, Ceph is falling over (and kicking its legs), we're all going to die!!!"), but the text doesn't make clear what caused:
- the crash itself (though, indirectly, it looks like their network hardware decided to give up the ghost);
- such a long recovery (which could simply come down to a large volume of replication over a half-dead network).
It's also unclear which version/configuration they are running. So please provide more complete information if you want a constructive discussion.
Comment from Sage Weil (the creator of Ceph) on this situation:
These VMs were backed by an old ceph cluster and the cluster fell over
after a switch failed. Because it's a beta cluster that's due to be
decommissioned shortly it wasn't upgraded from firefly. And because it's
old the PGs were mistuned (way too many) and machines were
underprovisioned on RAM (32GB for 12 OSDs; normally probably enough but
not on a very large cluster with 1000+ OSDs and too many PGs). It fell
into the somewhat familiar pattern of OSDs OOMing because of large OSDMaps
due to a degraded cluster.
The recovery was a bit tedious (tune osdmap caches way down, get all OSDs
to catch up on maps and rejoin cluster) but it's a procedure that's been
described on this list before. Once the core issue was identified it came
back pretty quickly.
Had the nodes had more RAM or had the PG counts been better tuned it would
have been avoided, and had the cluster been upgraded it *might* have been
avoided (hammer+ is more memory efficient, and newer versions have lower
default map cache sizes).
This was one of the very first large-scale clusters we ever built, so
we've learned quite a bit since then. :)
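The mitigation Sage describes (turning the osdmap caches way down so OOMing OSDs can catch up on map epochs and rejoin) corresponds to Ceph's `osd map cache size` and `osd map message max` options. A minimal, purely illustrative sketch of such a ceph.conf fragment — the numeric values here are assumptions for illustration, not tuning advice:

```ini
# ceph.conf fragment (illustrative values only).
# Shrink the per-OSD map cache so that OSDs recovering into a degraded
# cluster with a large OSDMap backlog do not exhaust RAM while they
# catch up on map epochs and rejoin.
[osd]
osd map cache size = 50     ; down from the old default of 500
osd map message max = 10    ; hand out fewer map epochs per message
```

The same settings can usually be applied to a live cluster without a restart, e.g. via `ceph tell osd.* injectargs '--osd-map-cache-size 50'`, which avoids bouncing already-struggling daemons.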
> However, such a configuration has a number of properties that make it significantly harder to work with when clients use independent virtualized clusters.
Could you clarify what problems Lustre has in such a configuration?
Oh, how we suffered with Ceph… In the end we're migrating to NFS…
Building shared storage on top of Ceph RBD and GFS2