A primary challenge for any database in the modern AI era is scalability.
Database demand can now fluctuate dramatically over time, which often leads organizations to over-provision capacity. It’s a challenge that open-source NoSQL database vendor ScyllaDB has been working to solve for some time with its scalable technology.
Today, with the new ScyllaDB 6.0 open-source release, the technology is taking the next step forward with an innovative replication architecture called ‘tablets’ that enables some impressive performance numbers for database scaling. The tablets architecture allows ScyllaDB to double or even quadruple a cluster’s throughput capacity within just 15 minutes. This is substantially faster than other NoSQL databases, which can only double capacity in 30 minutes.
ScyllaDB was originally designed as a drop-in replacement for the open-source Apache Cassandra database and is also compatible with Amazon DynamoDB. The company has raised over $100 million and includes social networking service Discord, travel site Expedia and media giant Comcast among its many brand name users.
“The largest value is the elasticity,” Dor Laor, CEO of ScyllaDB, told VentureBeat in an exclusive interview. “The ability to change the cluster size at the fastest speed in the industry allows customers to be always on the safe side, prepared for any spike.”
How tablets change the open source database replication model
In today’s cloud and AI-driven world, applications must be able to rapidly scale database capacity up or down to handle highly volatile usage patterns from millions of users. However, most databases struggle to take full advantage of cloud infrastructure elasticity due to the significant lag required to redistribute data across newly provisioned nodes.
In ScyllaDB 6.0, the new tablet technology enables dynamic data redistribution as the workload changes, allowing new nodes to start serving traffic rapidly instead of waiting for lengthy rebalancing.
Laor claimed that even before tablets, ScyllaDB was able to scale more quickly than Cassandra, and that tablets will accelerate that speed further.
“Tablets also improve usability,” Laor said. “Previously both Scylla and Cassandra needed to run an operation called cleanup on all of the nodes, after scaling. Since Scylla has transactional metadata, we do it automatically, without bothering the user.”
How tablets work to redefine database scaling
In an exclusive interview with VentureBeat, ScyllaDB CTO Avi Kivity described tablets as “just another layer of indirection”. Despite the understated description, tablets represent a major step forward in how ScyllaDB handles data distribution and rebalancing across nodes in a cluster.
Kivity explained that instead of statically assigning data to database nodes, tablets let that assignment change as the cluster scales.
“We can redirect ranges of [database] keys to different nodes based on needs,” explained Kivity.
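The idea of redirecting key ranges through a layer of indirection can be illustrated with a small sketch. This is not ScyllaDB's implementation; the `TabletMap` class, its methods, the integer tokens and the node names are all hypothetical, chosen only to show how a lookup table lets a key range move to another node without changing how keys are looked up.

```python
from bisect import bisect_right


class TabletMap:
    """Hypothetical indirection layer: token range -> tablet -> owning node."""

    def __init__(self, boundaries, owners):
        # boundaries[i] is the inclusive lower token bound of tablet i.
        if len(boundaries) != len(owners) or list(boundaries) != sorted(boundaries):
            raise ValueError("boundaries must be sorted and match owners in length")
        self.boundaries = list(boundaries)
        self.owners = list(owners)

    def tablet_for(self, token):
        # Find the last tablet whose lower bound is <= token.
        idx = bisect_right(self.boundaries, token) - 1
        if idx < 0:
            raise KeyError(f"token {token} is below the first tablet")
        return idx

    def node_for(self, token):
        return self.owners[self.tablet_for(token)]

    def migrate(self, tablet, new_node):
        # Only the mapping changes; other tablets keep their owners.
        self.owners[tablet] = new_node


# Illustrative cluster: four tablets spread over two nodes.
tablets = TabletMap([0, 100, 200, 300], ["node-a", "node-a", "node-b", "node-b"])
before = tablets.node_for(150)   # token 150 falls in tablet 1
tablets.migrate(1, "node-c")     # hand tablet 1 to a newly added node
after = tablets.node_for(150)
```

In this sketch, moving one tablet to a new node is a single update to the map, and lookups for every other range are unaffected.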
This dynamic remapping of data enables much more efficient and predictable scaling operations compared to previous versions. Kivity noted that in the past, scaling time was highly dependent on schema design, where many small key-value pairs would take significantly longer than fewer, larger ones.
With tablets, that schema dependency is eliminated.
“We achieve independence from the shape of the schema,” said Kivity. “And so you have predictability. If you know how much data you have in terms of bytes, and you know the bandwidth of your disks and your network, then you can predict how long it will take you to perform a scaling operation.”
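Kivity's point about predictability amounts to simple arithmetic: bytes to move divided by the slowest available bandwidth. The sketch below is a back-of-the-envelope model only; the function name and the example figures are assumptions for illustration, and it ignores parallelism across nodes, throttling and any other work the database performs while data is moving.

```python
def estimate_scaling_seconds(data_bytes, disk_bytes_per_sec, network_bytes_per_sec):
    """Rough lower bound on transfer time, limited by the slower resource."""
    if min(data_bytes, disk_bytes_per_sec, network_bytes_per_sec) <= 0:
        raise ValueError("all inputs must be positive")
    bottleneck = min(disk_bytes_per_sec, network_bytes_per_sec)
    return data_bytes / bottleneck


# Illustrative numbers, not measurements: 2 TB to move,
# 2 GB/s of disk bandwidth and 1 GB/s of network bandwidth.
seconds = estimate_scaling_seconds(2 * 10**12, 2 * 10**9, 1 * 10**9)
minutes = seconds / 60
```

With these made-up inputs the network is the bottleneck, so the estimate depends only on the byte count and the network rate, not on how many rows or partitions the data is split into.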
Provisioning changes will impact the bottom line for database operations
The improved scaling will have a dramatic impact on the costs of running the database as it will now require less overprovisioning.
Because ScyllaDB 6.0 can scale more rapidly, users can run their systems closer to maximum capacity. That means they can be more efficient and potentially more cost-optimized as well.
“Previously, we would recommend that you fill your disks to about 70% and even set the alerts a little earlier, so that if you need to scale your cluster, you have sufficient advance notice that you can begin the operation since you don’t know how long it will take,” Kivity said. “Now we can have clusters filled to 90 or 95% of the disk capacity.”
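To see why a higher fill target matters for cost, consider how many nodes a fixed data set needs at each threshold. The calculation below is a simplified sketch: the function name, the data set size and the per-node disk size are assumptions for illustration, and it leaves out replication and any other headroom a real deployment would plan for.

```python
import math


def nodes_needed(data_bytes, disk_bytes_per_node, max_fill):
    """Smallest node count that keeps every disk at or below max_fill."""
    if not 0 < max_fill <= 1:
        raise ValueError("max_fill must be in (0, 1]")
    usable_per_node = disk_bytes_per_node * max_fill
    return math.ceil(data_bytes / usable_per_node)


# Illustrative numbers: 100 TB of data on nodes with 10 TB of disk each.
data = 100 * 10**12
disk = 10 * 10**12
at_70 = nodes_needed(data, disk, 0.70)
at_90 = nodes_needed(data, disk, 0.90)
```

Under these assumed figures, raising the fill target from 70% to 90% reduces the cluster from 15 nodes to 12 for the same data.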
ScyllaDB 6.0 is currently available as an open-source release, with the company planning on rolling out enterprise and cloud support in the coming months.
The post ScyllaDB 6.0 advances open source database scalability appeared first on VentureBeat.