This interview was done for our Microservices for Startups ebook. Be sure to check it out for practical advice on microservices. Thanks to Khash for his time and input!
For context, how big is your engineering team? Are you using Microservices and can you give a general overview of how you’re using them?
At the moment we have 14 engineers on the team and pretty much all of them are full-stack developers. Most of them focus on the backend side, which is mostly built around microservices.
We started Cloud66 in late 2012 and we rolled out our first product in early 2013. As you can imagine, back then microservices were not necessarily a big thing, and people were not really thinking about containers in general. Given our background, having seen the same challenges in both startups and large companies, we wanted to make it simple for people to get up and running in the cloud, bringing provisioning and configuration of servers into the cloud space.
What didn't change, however, was the time it would take, from that Linux terminal in front of you, to make that server do what you want it to do. So while we shrunk eight weeks to five minutes, we didn't really do much about what happens afterwards. And that's what we set out to do. We started with Ruby on Rails, a framework that a lot of developers use, and began to automate that -- you can think of it as Heroku-ing your own servers. That got us a lot of customers in that space, and around 2014 you started hearing, "Cloud66 is great for Rails, but I want to have my Node.js, or I want my API in Go, or I have this PHP thing." We started looking around and realized that we could do the same thing if we knew where to draw a very clear boundary between a developer's role and an operations role. And we thought anything inside a container -- if we chose containers as the technology to deliver the service to our customers -- would be a dev's responsibility, and anything outside the container would be an op's responsibility. That clear boundary, drawn by containers as the heart or the engine or the brain of microservices, was the first initiative for us to embark on that journey.
Did you start with a monolith and later adopt Microservices? What was the motivation to adopt microservices? How did you evaluate the tradeoffs?
We started as a monolith with a few auxiliary services around it.
We built this monolith, which on the surface is a monolith because it's the same code base and lives inside the same repository. But you have these kinds of boundaries between different layers of the architecture, as well as different microservices within the same monolith.
Because we didn't have the infrastructure or operations components that enable microservices, as we added more people to the team they started to deviate from that idea. We started with it, and we deviated from it. Then, once we had the tool set that enables microservices, we started going back.
How did you approach the topic of microservices as a team/engineering organization? Was there discussion on aligning around what a microservice is?
We are a small team, which means there is little formal discussion about how we're going to approach specific problems. Pretty much everybody is involved with every step. We are still more or less a couple of pizza-sized teams. So that didn't really happen in a formal way, as in there's a team that owns microservices architecture, or a microservices cop group that polices or dictates architectural guidelines from the top down.
[A]s the team grows, following that discipline becomes more and more difficult. As a startup you want to deliver fast, you want to move forward fast, and you're focused on growth and everything around that, which means the people you hire don't necessarily share your background; they're not necessarily familiar with even the principles of segregated service-oriented architecture.
Did you change the way your team(s) were organized or operated in response to adopting microservices?
As a function of being a small team, you end up in many cases almost siloing a developer or a couple of developers into one specific feature. The development manager commissions, say, one or two developers to just go and do discovery and design, present, implement and test -- and in many cases, if you are small, even operationally support a particular feature in the application. As much as you want to share knowledge about pretty much all aspects of the system amongst your small development team, so you can spread the knowledge and reduce your organizational risk, it naturally happens that this person is the one to go to if you want gateway support for networking, and that person is the one who knows the backup system fairly well, because they did the design, presented it, implemented it and are supporting it. Now these people or sub-teams basically end up having an organizational microservice between themselves. So the backup people go to the storage people and say, "We have these things we want you to store for a long time." And it's as if they are talking contract-based programming. The storage people would say, "Okay, give it to us in this format and drop it at this rendezvous. We'll pick it up, and this is how you request it back for a restore," for example. It might not literally turn into two separate microservices within a monolith, or even in a microservices architecture, but you can see that people respect each other's contracts and boundaries and develop along those lines.
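The kind of contract the backup and storage people agree on above could be sketched as an explicit interface. This is a minimal illustration, not Cloud66's code; all names here are hypothetical:

```python
from abc import ABC, abstractmethod

class LongTermStorage(ABC):
    """Hypothetical contract the storage team publishes to the backup team."""

    @abstractmethod
    def store(self, key: str, payload: bytes) -> str:
        """Accept a payload in the agreed format; return a receipt id."""

    @abstractmethod
    def restore(self, receipt_id: str) -> bytes:
        """Return the payload previously stored under this receipt id."""

class InMemoryStorage(LongTermStorage):
    """Toy implementation so the contract can be exercised."""

    def __init__(self):
        self._data = {}

    def store(self, key, payload):
        receipt = f"receipt-{key}"
        self._data[receipt] = payload
        return receipt

    def restore(self, receipt_id):
        return self._data[receipt_id]
```

Whether the two sides live in one repository or in separate services, the interface is the rendezvous point: each team can change its internals freely as long as the contract holds.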
How much freedom is there on technology choices? Did you all agree on sticking with one stack or is there flexibility to try new? How did you arrive at that decision?
That's a very interesting question, because you can see how adopting microservices, containers, cloud orchestration and all those things almost liberates developers or teams to choose whatever they want as the best tool for the job, and to implement it in a way that it doesn't leak into other things. On the other hand, perhaps we haven't been doing this long enough collectively as an industry to see the pitfalls.
I kind of know the pitfalls of that, and now I can see the ops people saying, "Here is a vending machine full of candy -- M&M's and whatever else you want. We just wrapped it up in the way we approve of. Developers come in, enter the number for the type of M&M's they want, and it drops out." And it works great; nobody is going to get fired. And you can also contribute to this vending machine: pack your own candy and put it in the slot for the next person.
Have you broken a monolithic application into smaller microservices? If so, can you take us through that process?
There is little value in putting a monolithic application into a single container and saying we containerized it. The value of containers comes from microservices: from breaking the application up into smaller pieces and imposing the disciplines that you need within your development team to roll this thing forward. And that's where we started to integrate microservices into Cloud66 ourselves.
How did you approach the task?
Our background is in service-oriented architecture -- the entire Martin Fowler domain-driven design, contract-based design, all the great things those theories bring into software engineering. We realized that the challenges of microservices that we needed to resolve, for ourselves as a company and also for our customers, are very similar to what we had to do 10 years earlier with service-oriented architecture -- and not only that, but slightly more.
Now in 2017 we run Cloud66 entirely in containers. It's a microservices-based architecture; Cloud66.com itself is running inside containers. We run on Kubernetes, with Docker as the runtime engine. And we have some very large components that are not microservices and are not inside containers. For those we also adopted the same approach that microservices have. An example is what we call BuildGrid, a set of bare-metal servers that build images. We couldn't do that inside a container; it couldn't happen inside of a service. We still thought about it, and we still built it, as a service. It might not be a microservice -- it's more of a macroservice -- but it's still the same thing. And we adopted exactly the same principles of service discovery and resilience that bring scalability, trying to achieve statelessness and all of that, with components that have nothing to do with containers as well. So our database, our storage, our key-value stores, our queuing system, our metrics and our build grid are not inside containers, but they follow microservices principles.
What were some unforeseen issues and lessons learned?
As I said, we built a monolith with boundaries between layers and, in effect, microservices within the same monolith, and following that discipline became more and more difficult as the team grew and new hires didn't necessarily share that background. And because that service-oriented architecture was very much superficial -- it lived in the way the code was written, not in the infrastructure that enables microservices -- it was very easy to deviate from the principle. An example: if you're running microservices, you need to have the infrastructure that supports it. Service discovery is a key feature of microservices, and in service-oriented architecture you have things like DIs that work around providing some sort of semi-service discovery, but you don't necessarily have to have it. Because we didn't have those infrastructure or operations components, as we added more people to the team they started to deviate from the idea. Then, when we got the tool set that enables microservices, both at the organizational level and at the operational level, we started going back.
How do you determine service boundaries?
For me, I think it's one of those things that has multiple answers, none of them necessarily wrong or right. It just depends on which principle you follow. For us it was very much us following the same principles of service-oriented architecture on the back of domain-driven design. So that was kind of the thing that we decided to do.
What was that discussion like within your team?
We adopted domain-driven design because we wanted to be able to talk to the business in a language that they would understand and that we would understand. We basically agree on a common vocabulary: when we say "database," what do we mean -- a server or a database? When we say network, what do we mean? When we say port? All sorts of things. And when I say business, I mean the ops people -- the people who run data centers and clouds. So we adopted domain-driven design, the Martin Fowler definition of it, because we wanted to have a common language with our customers. On the back of that you can arrive at services as the little things that manipulate a specific domain. So we have the concept of factories, the concept of repositories, and then the concept of a service, where a business function is operated from beginning to end on one or more domain objects. Now, that's one of the definitions. But the good thing about it is that it makes sense if you buy into the whole thing: if you go all the way with domain-driven design and repositories and factories and everything around that, then the definition of a service basically makes sense. You see everything start to fall into place. Some of our customers draw the boundaries of the system completely differently and sometimes have services that step on each other's toes. Sometimes that works, but for us the most successful implementations of microservices we've seen are either based on a software design principle -- like domain-driven design and service-oriented architecture -- or ones that reflect an organizational approach.
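The factory / repository / service triad described above can be sketched in a few lines. This is an editorial illustration of the DDD pattern, not Cloud66's code; the `Server` domain object and `FirewallService` are invented for the example:

```python
from dataclasses import dataclass, field

@dataclass
class Server:
    """A domain object, named in the shared vocabulary."""
    name: str
    open_ports: list = field(default_factory=list)

class ServerRepository:
    """Repository: the only place that stores and retrieves Server objects."""

    def __init__(self):
        self._servers = {}

    def add(self, server: Server):
        self._servers[server.name] = server

    def get(self, name: str) -> Server:
        return self._servers[name]

class FirewallService:
    """Service: one business function operated end to end on domain objects."""

    def __init__(self, repo: ServerRepository):
        self._repo = repo

    def open_port(self, server_name: str, port: int):
        server = self._repo.get(server_name)
        if port not in server.open_ports:
            server.open_ports.append(port)
```

Once the vocabulary is agreed, each service boundary falls out of which domain objects a business function manipulates.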
Can you give some examples?
The payments team, for example -- they have the payment service or the credit card validation service, and that's the service they provide to the outside world. So it's not necessarily anything about software; it's mostly about the business unit or team within the IT division that provides one or more services to the outside world. We've seen this in banks mostly, especially around quantitative analytics, where there are these services or libraries that provide some mathematical calculation to other services: what is the forward price of this contract? What's the predicted foreign exchange rate between USD and Sterling? Those services that, in the past in the Windows world, lived in DLLs and libraries -- modules that you would compile into your application -- are now provided as services, sometimes even as Lambda-style things. And that's another manifestation of it. So there's this team that has nothing to do with IT -- a bunch of business analysts who used to provide these through Excel spreadsheets, then as module libraries that you would compile into your code -- now providing a service endpoint that you can hit. So those are basically the three most successful ways we've seen of drawing boundaries in service-oriented or microservices architectures.
How have microservices impacted your development process? Your ops and deployment processes? What were some challenges that came up and how did you solve them? Can you give some examples?
A lot of times we draw a parallel between containers and VMs, and we say containers, to a great degree, are like smaller VMs when you want to simplify things. But the reality is, not only do you need buy-in from the ops people to adopt containers, you also need a big buy-in from developers, because this then starts to get into microservices. If you look at the challenges that service-oriented architecture had -- having strict discipline about contracts between services, versioning and backward compatibility of those contracts, and everything around that -- with microservices not only do you have all of those, but the boundary is no longer confined within the same system. It can be broken up across multiple systems, multiple teams, multiple code repositories, multiple infrastructure requirements, and so on. So fast-forward from 2014, when we started embarking on this journey: as I said, by 2017 we were running Cloud66 entirely in containers, with even the large components that can't live inside containers -- our database, storage, key-value stores, queuing system, metrics and build grid -- following the same microservices principles of service discovery, resilience and statelessness.
How have microservices impacted the way you approach testing?
I think microservices, from any aspect you think of, can only be good for testing. They will not necessarily be good for quality, but if you are into testing -- which everybody should be -- and you take testing seriously, then I think microservices can only be good for that. The reason is, we have traditional unit testing: abstracting away and mocking out the interactions between different parts of the software, and testing each part with mocked-out auxiliaries, in isolation. That's great and we should carry on doing it -- that's the whole idea of code coverage and so on, very much computer science 101.
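The mocked-out unit testing described above looks roughly like this, using Python's standard `unittest.mock`; the payment gateway here is an invented example dependency:

```python
from unittest import mock

def charge(gateway, amount):
    """Unit under test: depends on an external payment gateway."""
    return gateway.charge_card(amount)["status"]

# The gateway is mocked out, so the unit is tested in isolation:
fake_gateway = mock.Mock()
fake_gateway.charge_card.return_value = {"status": "ok"}
assert charge(fake_gateway, 42) == "ok"
fake_gateway.charge_card.assert_called_once_with(42)
```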
What are lessons learned or advice around this? Can you give some examples?
The next step is integration testing. And I think as an industry, as software engineers -- I don't want to use the word "struggle," but it's always been the sticking point of testing. Integration testing has always been the sticking point, because in production, where things actually have to work, you have this expensive Oracle database, or that specific storage system, or even in the cloud you might have specific ways of connecting things. And prior to microservices, we've never really been able to do it in the real world. Microservices -- as a byproduct of containers and all the orchestration around them -- allow us to build and fire up these mini environments, this multiverse of applications that run and actually are real. You don't need to mock out your system or your database or whatever else it is. If you're running MySQL in production, you're running MySQL in test as well, instead of SQLite. That's a small example, but you can tell there is a lot behind it. In many cases we've seen that, prior to microservices, test environments didn't use the real component -- RabbitMQ or a queuing system -- they used a mocked-out one. So I think that's one good thing about it.
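Firing up one of these mini environments with the real backing services typically comes down to something like `docker compose up` against the same images used in production. A minimal sketch -- the service names and project name are hypothetical:

```python
def compose_up_command(services, project="app-test"):
    """Build the docker compose command that starts a throwaway test
    environment running the real backing services (the same MySQL and
    RabbitMQ images as production) rather than mocks.

    Running it is left to the caller, e.g.:
        subprocess.run(compose_up_command(["mysql", "rabbitmq", "api"]),
                       check=True)
    """
    return ["docker", "compose", "-p", project, "up", "-d", *services]
```

The `-p` project flag keeps the test environment isolated, and `-d` detaches so the test suite can run against it and tear it down afterwards.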
How have microservices impacted security and controlling access to data?
I think microservices as a software engineering practice can be very beneficial to that, if used properly. I go back to domain-driven design, which we adopted, and the concept of repositories, where essentially you have a service that sits in front of your data. No one in the system talks directly to any file or any database; every access is regulated through a service called the repository. That's a DDD design principle. Now, it's not necessarily the best one, and it's certainly not the only one. But if you do that, you can imagine one of these microservices as the guardian, the gatekeeper to your data, and you can then benefit from a lot of other things, from IP access control lists (ACLs) to different levels of read and write access and so on. It becomes a wonderful way of gatekeeping access to data. I think that's a positive side you can benefit from when you're dealing with microservices: essentially having services that are gatekeepers to your data.
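The gatekeeper idea can be sketched as a repository that checks an ACL before every read or write. This is an editorial sketch of the pattern, not Cloud66's implementation; the ACL shape and caller names are assumptions:

```python
class GuardedRepository:
    """DDD-style repository acting as the single gatekeeper to the data:
    every read and write is checked against an ACL before it touches
    the underlying store."""

    def __init__(self, acl):
        self._acl = acl      # e.g. {"billing-svc": {"read"}}
        self._rows = {}      # stand-in for the real database

    def _check(self, caller, action):
        if action not in self._acl.get(caller, set()):
            raise PermissionError(f"{caller} may not {action}")

    def read(self, caller, key):
        self._check(caller, "read")
        return self._rows.get(key)

    def write(self, caller, key, value):
        self._check(caller, "write")
        self._rows[key] = value
```

Because the repository is the only path to the data, the ACL is enforced in exactly one place instead of being scattered across every consumer.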
What are lessons learned or advice around this? Can you give some examples?
While imagining a data access service is fairly easy to do, and it's fairly easy to implement in many cases, in reality, when it comes to writing that code, a lot of times you end up exposing the system to more security risk than you would have otherwise. The reason is that, at the end of the day, these services run somewhere. Most of the time nowadays that's a container: it's running inside a container engine and it comes from a Docker image in a very real way. And then you get to the issues of the security of that container image and where it's built. Who is the vendor providing the upstream for this thing? Are there any security vulnerabilities in it? If I were to talk to my database directly to get something, it's between myself and the database, mostly behind a private firewall. But if I make that access available through a service, and that service is running on a vulnerable Linux distro and is exposed through a gRPC endpoint that can itself be vulnerable, then I am multiplying that attack surface. So while microservices in theory can be really helpful in securing access to data, at a very technical, very low level they can actually compromise more -- so you have to have the right tools and practices in place to make sure you're aware of known vulnerabilities, that you can patch them, and so on. And the whole question of authorization and authentication comes into it as well.
So we have a product called Cloud66 Skycap, which we call a container delivery pipeline. It starts from your code repository and ends up at your orchestrator. What it does is basically take care of those things: I get your code and I build it in this way, you have to sign it off, and the upstream is checked for where it comes from. Part of it is access, part of it is security, part of it is best practices, part of it is workflow. It includes things like moving artifacts between different parts of the build so that developers don't, by mistake, check their SSH keys into the image, for example.
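A pipeline check like the SSH-key example above can be as simple as scanning the build context for private key material before the image is built. This is a hedged sketch of the idea, not Skycap's actual implementation; the marker list is illustrative:

```python
from pathlib import Path

# Byte markers that identify private key material; the list is illustrative.
KEY_MARKERS = (
    b"-----BEGIN RSA PRIVATE KEY-----",
    b"-----BEGIN OPENSSH PRIVATE KEY-----",
)

def find_leaked_keys(build_context: str):
    """Return paths under a Docker build context that look like private
    SSH keys, so the pipeline can fail the build before the keys end up
    baked into an image."""
    leaks = []
    for path in Path(build_context).rglob("*"):
        if path.is_file():
            head = path.read_bytes()[:4096]   # keys appear at the top of the file
            if any(marker in head for marker in KEY_MARKERS):
                leaks.append(str(path))
    return leaks
```

A pipeline step would call this on the build context and abort with a non-zero exit if the returned list is non-empty.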
Thanks again to Khash for his time and input! This interview was done for our Microservices for Startups ebook. Be sure to check it out for practical advice on microservices.