Any day you learn something new is a good day


Extending the CEPH cluster, things we've learnt

Posted: 20-Jun-2018 09:41

The previously mentioned Croit CEPH Cluster has been running well for a few weeks now, with a couple of issues found and addressed with help from Croit.

 

Things we’ve learnt:

  • Storage nodes really need 8GB of RAM.
  • A monitor node can be an old laptop with a 100Mb/s NIC, as long as the CPU is 64-bit and it has 2GB of RAM.
  • Bluestore is faster than Filestore, so we don't need dedicated journal drives anymore.
  • Tune the RAM cache to 512MB per HDD OSD:
    • Maintenance / cluster settings / bluestore cache size hdd / set to 512MB
  • You need *lots* of PGs if you’re going to have lots of OSDs. The default is 128, and you want >30 per disk or CEPH will complain, so start with 512 or 1024 PGs.
    • Pools / PGS / select cephfs_data / click edit / change number of PGs to 1024
  • Back up your master node data.  A script for that is available inside the container, and the blog post about it is here: (https://croit.io/2017/09/14/2017-09-14-how-to-backup)
  • Keep an eye on Croit’s blog (https://croit.io/blog) as they post useful stuff there. Croit pointed out to me that the latest upgrade was available (https://croit.io/2018/05/28/2018-05-28-release-v1805) and the upgrade to the new version was painless.  Upgrading the nodes to the latest image was one of the smoothest upgrades I’ve ever had to do.
    • Check each node’s image is set to default;
      • Servers / select a node / edit / check that Image = Default
    • Once they’re all set to default, and the status of the cluster is OK, initiate a ‘Rolling Reboot’ to upgrade all the cluster nodes to the latest version of Croit.
      • Servers / Images / click ‘Rolling Reboot’ button
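For the cache tip above, here's a rough sketch of the arithmetic. It assumes "512MB" means 512 MiB and that the underlying Ceph option (bluestore_cache_size_hdd) is expressed in bytes; the Croit UI may well accept a friendlier unit. The OSD counts per node are made up for illustration.

```python
MIB = 1024 * 1024  # bytes in one mebibyte


def cache_bytes(mib_per_osd: int) -> int:
    """Convert a per-OSD cache size in MiB to bytes."""
    return mib_per_osd * MIB


def node_cache_mib(osds_on_node: int, mib_per_osd: int = 512) -> int:
    """Total BlueStore cache (MiB) a node sets aside for its HDD OSDs."""
    return osds_on_node * mib_per_osd


if __name__ == "__main__":
    print("bluestore cache size hdd in bytes:", cache_bytes(512))
    # Hypothetical node sizes, compared against the 8GB RAM suggested above.
    for osds in (2, 4, 6):
        total = node_cache_mib(osds)
        print(f"{osds} HDD OSDs -> {total} MiB of cache out of 8192 MiB RAM")
```

The cache is only part of what each OSD process uses, so treat these totals as a floor rather than a full memory budget.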

 

We’ve now got 92 OSDs in the cluster, with total available disk space a hair under 50TB. We’re still able to push Veeam backups into it at 130MB/s.
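With 92 OSDs in play, the "more than 30 PGs per disk" advice from the list above can be sanity-checked with a little arithmetic. This is a minimal sketch that assumes a single replicated pool with 3 copies (the Ceph default, but an assumption here since the pool settings aren't stated) and only considers power-of-two PG counts.

```python
def pgs_per_osd(pg_num: int, replica_size: int, osd_count: int) -> float:
    """Average number of PG copies landing on each OSD for one pool."""
    return pg_num * replica_size / osd_count


def smallest_pg_num(osd_count: int, replica_size: int, min_per_osd: int = 30) -> int:
    """Smallest power-of-two pg_num that gives more than min_per_osd PGs per OSD."""
    pg_num = 1
    while pgs_per_osd(pg_num, replica_size, osd_count) <= min_per_osd:
        pg_num *= 2
    return pg_num


if __name__ == "__main__":
    osds, size = 92, 3  # size=3 is an assumption, not taken from the cluster
    for pg_num in (128, 512, 1024):
        print(pg_num, "PGs ->", round(pgs_per_osd(pg_num, size, osds), 1), "per OSD")
    print("smallest pg_num above the threshold:", smallest_pg_num(osds, size))
```

Under those assumptions 128 and 512 PGs both fall short of 30 per OSD at this cluster size, while 1024 clears it. Extra pools add their own PGs on top, so a real cluster will sit higher than this single-pool estimate.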

 

Currently we have two rooms set up in the crushmap, with about 11 storage nodes in each.  In time this will be split into three rooms, with the storage nodes balanced across all three.
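As a planning aid for the three-room split, here's a small sketch that spreads nodes across rooms round-robin and prints the matching Ceph CLI commands. The node and room names are hypothetical, the node count of 22 is just "about 11 in each of two rooms", and nothing is executed; in Croit the crushmap would normally be edited through the UI instead.

```python
def spread_nodes(nodes, rooms):
    """Assign nodes to rooms round-robin so room sizes differ by at most one."""
    layout = {room: [] for room in rooms}
    for i, node in enumerate(nodes):
        layout[rooms[i % len(rooms)]].append(node)
    return layout


def crush_commands(layout, root="default"):
    """Build (but do not run) the CLI commands that would create this layout."""
    cmds = []
    for room, hosts in layout.items():
        cmds.append(f"ceph osd crush add-bucket {room} room")
        cmds.append(f"ceph osd crush move {room} root={root}")
        for host in hosts:
            cmds.append(f"ceph osd crush move {host} room={room}")
    return cmds


if __name__ == "__main__":
    nodes = [f"node{n:02d}" for n in range(1, 23)]  # hypothetical host names
    rooms = ["room1", "room2", "room3"]             # hypothetical room names
    layout = spread_nodes(nodes, rooms)
    for room, hosts in layout.items():
        print(room, len(hosts), "nodes")
    print("\n".join(crush_commands(layout)))
```

Moving hosts between rooms triggers data movement, so on a live cluster it's gentler to move them a few at a time and let the cluster settle in between.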

 

The last thing to do now is plan for a disaster, so we’ll be spitballing the worst-case things that can happen and how we make sure we can still access the cluster if they do. Keeping a copy of the ceph.conf & admin.keyring in a very safe place is a start.
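A minimal sketch of that "copy to a safe place" step: copy the files, verify each copy with a checksum, and lock down the permissions. The demo below runs against throwaway files in a temp directory; on a real system the sources would be wherever your ceph.conf and admin keyring actually live (often /etc/ceph/, but that's an assumption and may differ under Croit).

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def backup_files(sources, dest_dir):
    """Copy each source file into dest_dir and verify the copy by checksum."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    copied = {}
    for src in map(Path, sources):
        target = dest / src.name
        shutil.copy2(src, target)
        digest = sha256_of(target)
        if digest != sha256_of(src):
            raise IOError(f"checksum mismatch for {src}")
        target.chmod(0o600)  # the keyring is a secret; owner-only access
        copied[src.name] = digest
    return copied


if __name__ == "__main__":
    # Demo with stand-in files so nothing real is touched.
    with tempfile.TemporaryDirectory() as tmp:
        conf = Path(tmp) / "ceph.conf"
        conf.write_text("[global]\n")
        keyring = Path(tmp) / "admin.keyring"
        keyring.write_text("[client.admin]\n")
        result = backup_files([conf, keyring], Path(tmp) / "safe-place")
        for name, digest in result.items():
            print(name, digest[:12])
```

Recording the digests alongside the backup makes it easy to confirm later that the copy you're about to rely on is still intact.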

nzsouthernman's profile

Dael 
Christchurch
New Zealand


This blog is mainly going to be for writing down things when I work them out so when I have to try and do it again I don't have to think too hard.  And also to comment on stuff.  Hopefully not too much rant /rant involved.
