ClusterHQ makes your databases such as MongoDB, PostgreSQL, and Couchbase as easy to containerize as the stateless parts of your app. Get the benefits of containers—portability, flexibility, agility—for all your databases.
Flocker is the leading volume orchestrator for Docker. Used in production across hundreds of nodes, Flocker integrates with Docker Swarm, Kubernetes and Mesos to make container data volumes portable across hosts.
"The platform that ClusterHQ is building to enable entire DevOps teams to effectively manage and distribute their volumes across the entire application lifecycle will do for data management what Git did for code."
–Mike D. Kail, Chief Innovation Officer at Cybric and former CIO at Yahoo
"Loading test data into our CI pipeline is slow, reducing the efficiency of our development team. Being able to snapshot and load volumes directly into a containerized environment is exactly what we need to develop, test and deploy faster."
–Julien Maîtrehenry, IT Director, PetalMD
"Managing datasets for development and testing for fast moving products is tough. FlockerHub and Fli are the perfect developer and ops orientated tools to solve the problem."
–Lee Porte, System Administrator, Football Radar
FlockerHub & Fli in action
Just as it would be impossible to describe everything you can do with Git in a single use case, it is impossible to show everything you can build using FlockerHub & Fli on a single page. Let’s just scratch the surface with Tiffany, a DevOps engineer building a microservice using Docker.
New feature, new microservice
Tiffany is working on a new feature to increase sales on an e-commerce site, a 'You might also like…' widget to be displayed on every product page. Tiffany needs to build, test and deploy the microservice end-to-end and provide a version of the microservice to her teammates so they can test their new features against it.
Building locally using Docker Compose
Tiffany starts by spinning up the e-commerce environment on her laptop, using a Docker Compose file she got from her manager. She’ll test her feature first locally using this environment.
Starting her feature branch
Now that she’s got her development environment set up, she creates a new branch and writes an initial implementation of the feature.
Testing locally is incomplete without data
Tiffany’s got the feature more or less how she wants it, and she wants to test it locally on her machine. She can run her branch against the local environment spun up with Docker Compose, but there is no data in the Order History and Product Catalog databases that the widget needs, so she can’t see if the logic is correct.
Realistic data comes from production
Tiffany could put together some test data, but she would almost certainly miss edge cases that she’d find by testing against realistic data. Tiffany asks Paul, her colleague in Ops, to send her two anonymized datasets: the last 5 million entries from the Order History MongoDB database, and the entire Product Catalog PostgreSQL database, all 1.5 million products.
Snapshotting volumes with Fli
After receiving the data, Tiffany snapshots each database using Fli and pushes these volumes to FlockerHub. That way, not only can she use this test data, but any of her colleagues or staging/QA environments can too.
Test data reveals a bug (Shocking!)
With her realistic test data available locally, Tiffany previews her branch. Looks good, except for that wrapping issue on the long product name. Who knew there was something called The Really, Incredibly, Stupendously, Fantastically Good Stain Remover®?
A little tweaking and voilà...
Tiffany fixes the CSS for displaying long names and tries again. Bingo. Looks good.
And now for CI
Now that things are looking good locally, Tiffany needs to run her code through her company’s Jenkins environment. Their CI system kicks off a build for each commit in GitHub, so before pushing her branch, Tiffany makes sure that the Order History MongoDB and Product Catalog PostgreSQL volumes she pushed to FlockerHub are pulled into /home/test-fixtures on every Jenkins build slave. This way, no matter which slave her build executes on, it will run her tests against the realistic data.
When a build fails, capture the state
Tiffany runs her tests, but the build fails when an integration test notices that one of the three product suggestions fails to load. At first glance, she is not quite sure what happened, but she can figure it out. Her team has set up Jenkins to snapshot all test databases using Fli as soon as a build fails and then push these snapshots to FlockerHub. This way, Tiffany or any of her colleagues can debug in a completely consistent environment: code, container images and state.
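The snapshot-on-failure pattern can be sketched in a few lines of Python. This is only an illustration of the control flow, not how Jenkins or Fli are actually wired together: `run_build`, `snapshot` and `push` are hypothetical names, and the snapshot and push steps are passed in as plain callables so that no real CLI or API is assumed.

```python
from typing import Callable, Iterable, List


def run_build(run_tests: Callable[[], bool],
              volumes: Iterable[str],
              snapshot: Callable[[str], str],
              push: Callable[[str], None]) -> List[str]:
    """Run the tests; if they fail, snapshot every volume and push it.

    Returns the identifiers of the snapshots that were pushed
    (an empty list when the build passes).
    """
    if run_tests():
        return []  # build passed, nothing to capture

    pushed = []
    for volume in volumes:
        snapshot_id = snapshot(volume)  # capture the state at failure time
        push(snapshot_id)               # make it available to the team
        pushed.append(snapshot_id)
    return pushed


# Stand-ins for the real steps (hypothetical, for illustration only).
uploaded = []
result = run_build(
    run_tests=lambda: False,  # simulate a failing build
    volumes=["order-history", "product-catalog"],
    snapshot=lambda name: f"{name}@failed-build",
    push=uploaded.append,
)
print(result)
```

Keeping the snapshot and push steps as injected callables means the same wrapper works whether they shell out to a command-line tool or call a service directly.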
Debugging locally with actual data
Because Fli uses incremental snapshots, snapshotting and pushing the Mongo and Postgres data volumes from the Jenkins environment, and then pulling them down again, takes only a few seconds. In no time, Tiffany has an identical application environment running locally.
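Why incremental snapshots are fast can be shown with a toy model: only the blocks that changed since the previous snapshot need to be transferred. The Python sketch below illustrates that general idea and is not a description of how Fli implements it. The block size, function names and sample data are all made up for the example.

```python
import hashlib

BLOCK_SIZE = 4  # tiny on purpose; real systems use far larger blocks


def block_hashes(data: bytes) -> list:
    """Hash each fixed-size block of the volume contents."""
    return [hashlib.sha256(data[i:i + BLOCK_SIZE]).hexdigest()
            for i in range(0, len(data), BLOCK_SIZE)]


def changed_blocks(previous: bytes, current: bytes) -> dict:
    """Return {block_index: block_bytes} for blocks that are new or differ.

    Shrinking volumes are not handled; this sketch only covers
    in-place changes and growth.
    """
    old = block_hashes(previous)
    delta = {}
    for index, digest in enumerate(block_hashes(current)):
        if index >= len(old) or old[index] != digest:
            start = index * BLOCK_SIZE
            delta[index] = current[start:start + BLOCK_SIZE]
    return delta


base = b"AAAABBBBCCCCDDDD"        # state at the previous snapshot
after_test = b"AAAABBBBXXXXDDDD"  # one block modified by the test run
delta = changed_blocks(base, after_test)
print(delta)  # {2: b'XXXX'}
```

Here a 16-byte "volume" changes in one place, so only one 4-byte block has to travel. The same proportion applied to a multi-gigabyte database is what turns a long copy into a short one.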
Sharing a development environment, including data, with a colleague
Tiffany digs around a bit and can’t recreate the bug. She decides to ask her colleague Sam to take a look. She sends Sam a Docker Compose file, which references her data volume on FlockerHub. Sam pulls all the Docker images and data volumes and is running an identical environment in minutes.
Armed with data, Sam bags the bug
Sam steps through the integration test and notices the bug appears only when there are exactly 4,294,967,296 orders in the Order History database. The code works fine up to 4,294,967,295 orders, then something goes horribly awry. But Sam’s brain always has a background thread calculating base-2 logarithms, so the problem is elementary. She knows exactly where to tweak the code. When she reruns the test, it displays fine.
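The two order counts in the story sit on either side of 2^32, which is what you would expect if a count were stored in an unsigned 32-bit field. The story does not say what the actual defect was, so the following Python sketch is only an illustration under that assumption, using a hypothetical helper named `as_uint32`.

```python
import math

UINT32_MAX = 2**32 - 1  # 4,294,967,295, the largest unsigned 32-bit value


def as_uint32(n: int) -> int:
    """Hypothetical helper: keep only the low 32 bits of n,
    as a fixed-width unsigned field would."""
    return n & UINT32_MAX


last_good = 4_294_967_295
first_bad = 4_294_967_296

print(math.log2(first_bad))  # 32.0
print(as_uint32(last_good))  # 4294967295
print(as_uint32(first_bad))  # 0
```

Under this assumption, the count silently wraps to zero at exactly 4,294,967,296 orders, and widening the field to 64 bits would be the kind of small tweak that makes the same test pass against the same data.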
Build passes, time for the next feature!
Sam commits her changes, the Jenkins build runs, and the test passes, even though it runs against the exact same database state as before, with 4,294,967,296 orders in the Order History database. High fives all around.