EMC ScaleIO: What is it?

So, this may come as a shocker to you, but there are actually quite a few people around the world who don’t know what ScaleIO is, even though prominent people like Chad Sakac have mentioned it a myriad of times on their blogs. But just to make sure that everyone knows, here is another introduction.

ScaleIO is a Software Defined Storage product from EMC. Its primary focus is performance, and thus it’s actually quite low on features compared to many other products out there. So if you’re looking for something that does replication, look the other way. If you’re looking for something that does deduplication, look the other way. If you’re looking for something that does anything other than block, look the other way. If you’re looking for something that does compression… you get the point. However, if you’re looking for something that can be deployed on almost anything, is incredibly performant and will tolerate a giant beating before letting you down, ScaleIO might be something for you.


A ScaleIO-system consists of a few components:

  • SDC: ScaleIO Data Client
  • SDS: ScaleIO Data Server
  • LIA: Light Installation Agent
  • MDM: Metadata Manager
  • ScaleIO GUI
  • ScaleIO Gateway

There are a few more than these, but I’m skipping them since they are of no interest to me 🙂


So as you can probably guess, the SDC is a component that needs to be installed on whatever OS you want to map some storage to.


The SDS is installed on all the servers which make up the “storage array”. So basically all the servers that have spare capacity which you then want to present back to all the SDCs. And yes, an SDS can be an SDC as well (think HCI for instance).


The LIA is the agent that is installed on all nodes (MDM, SDC and SDS alike) in the ScaleIO system. This agent is used when you want to do something on the endpoints, e.g. collect logs or upgrade the system to a newer release.


The MDMs hold the keys to the castle. They contain information on where all data is at any given point. You would typically install the MDM in a 3-node active/passive/witness cluster or a 5-node active/passive/passive/witness/witness configuration. The MDMs can be standalone machines or can be installed on the SDSs. They can be deployed as masters or tie-breakers.


The ScaleIO GUI is the management interface for ScaleIO. Won’t say too much about this, a picture does a better job.

[Image: ScaleIO management GUI]


The ScaleIO Gateway is an application that you would typically install on a server separate from the SDS/SDC/MDM servers. This is because this component is used for deploying, reconfiguring and updating the system.

[Image: ScaleIO Gateway]


As for limits, they are probably much higher than you will ever need, but here are a few pointers.

  • Min/Max ScaleIO Installation Size: 300GB to 16PB
  • Individual Device Size: 100GB to 8TB
  • Volume Size: 8GB to 1PB
  • Max SDS Size: 96TB
  • Max SDSs per System: 1024
  • Max SDSs per Protection Domain: 128
  • Max Disks per Storage Pool: 300
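
If you want to sanity-check a planned layout against these numbers, something like the following rough Python sketch could do it. The limits are copied from the list above; everything else (the `LIMITS` keys, the `check_plan` function, the decimal units, and treating the installation size as the sum of all device sizes) is my own assumption for illustration and not anything ScaleIO ships. Volume size and disks per storage pool are left out to keep it short.

```python
# All sizes are in GB. Assumption: decimal units (1 TB = 1000 GB); the post does not say.
TB = 1000
PB = 1000 * TB

# Values copied from the list above; the key names are made up for this sketch.
LIMITS = {
    "system_size": (300, 16 * PB),
    "device_size": (100, 8 * TB),
    "max_sds_size": 96 * TB,
    "max_sds_per_system": 1024,
    "max_sds_per_protection_domain": 128,
}


def check_plan(devices_per_sds, sds_per_protection_domain):
    """Return a list of limit violations; an empty list means the plan fits.

    devices_per_sds: one list of device sizes (GB) per SDS.
    sds_per_protection_domain: number of SDSs in each protection domain.
    """
    problems = []
    dev_min, dev_max = LIMITS["device_size"]
    for i, devices in enumerate(devices_per_sds, start=1):
        for size in devices:
            if not dev_min <= size <= dev_max:
                problems.append(f"SDS {i}: {size} GB device is outside {dev_min}-{dev_max} GB")
        if sum(devices) > LIMITS["max_sds_size"]:
            problems.append(f"SDS {i}: more than {LIMITS['max_sds_size']} GB in total")

    if len(devices_per_sds) > LIMITS["max_sds_per_system"]:
        problems.append("too many SDSs in the system")

    # Assumption: "installation size" means the sum of all device sizes.
    sys_min, sys_max = LIMITS["system_size"]
    total = sum(sum(devices) for devices in devices_per_sds)
    if not sys_min <= total <= sys_max:
        problems.append(f"system size of {total} GB is outside {sys_min}-{sys_max} GB")

    for i, count in enumerate(sds_per_protection_domain, start=1):
        if count > LIMITS["max_sds_per_protection_domain"]:
            problems.append(f"protection domain {i}: {count} SDSs is too many")
    return problems
```

Calling `check_plan([[1000, 1000]] * 3, [3])` describes three SDSs with two 1TB devices each in a single protection domain, which fits within every limit checked here.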

Deployment Options

ScaleIO can be deployed in 3 different ways from a purely architectural point of view.

  • New and fancy HCI-mode. SDS + SDC on the same node.
  • Two-Layer mode. SDS and SDC on separate nodes, if you want to build it like you would any traditional storage system (although the deployment model is the only thing it has in common with one).
  • Hybrid. Some nodes have both SDS + SDC, some only have SDS, others only SDC. Giant mashup.
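
To make the difference between the three modes concrete, here is a small Python sketch that looks at which roles each node has and names the resulting mode. The `deployment_mode` function and the node names are hypothetical and only illustrate the definitions above; ScaleIO itself does not expose anything like this as far as I know.

```python
def deployment_mode(nodes):
    """Classify a layout as "HCI", "two-layer" or "hybrid".

    nodes: dict mapping a node name to the set of roles on that node.
    Only the "SDS" and "SDC" roles matter here; other roles are ignored.
    """
    combined = 0   # nodes running both SDS and SDC
    dedicated = 0  # nodes running exactly one of the two
    for roles in nodes.values():
        data_roles = roles & {"SDS", "SDC"}
        if len(data_roles) == 2:
            combined += 1
        elif len(data_roles) == 1:
            dedicated += 1

    if combined and not dedicated:
        return "HCI"
    if dedicated and not combined:
        return "two-layer"
    if combined and dedicated:
        return "hybrid"
    return "unknown"  # no SDS or SDC anywhere


# Hypothetical example layouts
hci = {"node1": {"SDS", "SDC"}, "node2": {"SDS", "SDC"}, "node3": {"SDS", "SDC"}}
two_layer = {"storage1": {"SDS"}, "storage2": {"SDS"}, "app1": {"SDC"}}
hybrid = {"node1": {"SDS", "SDC"}, "storage1": {"SDS"}, "app1": {"SDC"}}

print(deployment_mode(hci), deployment_mode(two_layer), deployment_mode(hybrid))
```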

Deployment Example

So now you’re thinking: “okay, fair enough.. but how are all these deployed into a storage system?”

To give you an example, let’s say that you have 7 servers on which you want to deploy ScaleIO. A deployment example would be like this:

  • Server 1
    • MDM
    • SDS
    • LIA
  • Server 2
    • MDM
    • SDS
    • LIA
  • Server 3
    • MDM
    • SDS
    • LIA
  • Server 4
    • MDM (TB)
    • SDS
    • LIA
  • Server 5
    • MDM (TB)
    • SDS
    • LIA
  • Server 6
    • SDS
    • LIA
  • Server 7
    • SDS
    • LIA

What this gives you is a 5-node MDM cluster (three MDMs plus two tie-breakers) managing a 7-node SDS cluster, all with LIAs installed so that the whole thing can be patched/managed from a single ScaleIO Gateway.
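
The same 7-server example can be written down as data and counted, which is a handy way to double-check a layout before you start installing. This is just a Python sketch; the `LAYOUT` dictionary, the `"TB"` role label for tie-breakers, and the `summarize` function are names I made up for the illustration.

```python
from collections import Counter

# The 7-server example from above. "TB" marks an MDM deployed as tie-breaker.
LAYOUT = {
    "server1": {"MDM", "SDS", "LIA"},
    "server2": {"MDM", "SDS", "LIA"},
    "server3": {"MDM", "SDS", "LIA"},
    "server4": {"TB", "SDS", "LIA"},
    "server5": {"TB", "SDS", "LIA"},
    "server6": {"SDS", "LIA"},
    "server7": {"SDS", "LIA"},
}


def summarize(layout):
    """Count the roles in a layout and list any nodes that lack a LIA."""
    counts = Counter(role for roles in layout.values() for role in roles)
    return {
        # MDMs and tie-breakers together form the MDM cluster.
        "mdm_cluster_size": counts["MDM"] + counts["TB"],
        "sds_nodes": counts["SDS"],
        "nodes_without_lia": sorted(
            name for name, roles in layout.items() if "LIA" not in roles
        ),
    }


summary = summarize(LAYOUT)
print(summary)
```

For the layout above, the summary reports an MDM cluster of 5, 7 SDS nodes and no node without a LIA.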

Wrap Up

So this was the extremely quick 40,000-foot view of ScaleIO. If asked about it, I would characterize it like this.

  • True software defined storage.
  • Very flexible.
  • High performance.
  • Very durable.

But I Want To Know More!

Might I then recommend reading the architecture guide (https://www.emc.com/collateral/white-papers/h14344-emc-scaleio-basic-architecture.pdf) or the user guide (https://community.emc.com/docs/DOC-45035). Both are very good and very detailed.
