Generating ideas and discussion on the issues associated with DevOps for the enterprise.
While Development teams often embrace a DevOps platform or ecosystem, many DevOps efforts have difficulty engaging the Operations teams fully. Accommodating concerns and requests from Operations (let alone Security) can seem like unnecessary overhead to development teams, but doing so can lead to benefits when it is time for applications to graduate to production.
In this post, I’d like to share some common operational and security concerns related to DevOps and some thoughts on how to successfully address them.
When introducing DevOps at the enterprise level, I often describe the tools as “Enterprise Developer Support Services” (for example, a source code management service or a build service). I stress that such services need to be reliable and stable, something the enterprise can depend on as part of its software development process. To be provocative, I sometimes describe the services as production-level systems to “poke the bear” of Operations. I do that to get their attention: Operations often ignores DevOps, thinking that “it’s only for development.” When I describe the services as production-level, I suddenly get the attention I need from Operations and Security to start the engagement process.
Let me add a disclaimer here. Describing DevOps services as production systems carries risks—suddenly people start asking some hard questions:
If DevOps services are meant to be production systems, how are they built and tested? Is there a DevOps development and test environment for the DevOps tools themselves? (errrr, “Physician, heal thyself”?)
Do you have a system security plan for the DevOps services?
What are the SLAs for the DevOps services? Backup requirements? High availability configuration?
What’s the process for releasing new versions of DevOps tools? Patching them?
People who implement DevOps tools often shudder when they hear these questions. After all, aren’t these just development tools? Why do they need to be run like production systems? Unfortunately, not answering these questions makes people look hypocritical. “Well, we think developers should follow these code quality standards, but they don’t apply to the DevOps tools themselves.” Not a good place to be.
Alright, if you’ve made it this far and, as a DevOps professional, feel suitably chastened because you can’t confidently answer these questions, fear not. I’ve got some good ideas for you.
Development and Testing of DevOps Tools
Start by setting up environments where you try out new versions of DevOps tools. For example, you might have Jenkins 1.5 in production, and 1.6 in development. You maintain a backlog and release plan of how you are going to introduce new features to your developer teams that rely on your services. Having this discipline sets a good example to teams that you are trying to convince to use your tools.
Any mature IT organization has standards for installing and supporting COTS products in production. There may be security standards for user logins, logging standards, standard directory locations for software, monitoring agents and others. You should ask for those standards and try to adhere to them as much as possible—again, you are trying to set a good example. Here are some common scenarios and how I’ve seen them dealt with:
Approved software. At some companies, you can’t just download applications, especially if you are installing that software for use on the customer’s network, or even using it to build software that will eventually go on the customer’s network. Security (and sometimes licensing control) teams want to know where you got it, whether it’s licensed for use, whether it has been virus-scanned, and how you are keeping track of the version you are using. Fortunately, DevOps people know how to answer these questions: use a software repository like Nexus or Artifactory. That’s generally why I install this DevOps component first. I then load in all of the software I’ll be using to build out the rest of the DevOps services. That way I’m all set to answer the question: “Where did you get this software from and how are you keeping track of it?” It also sets me up to host all of the third-party libraries my developers will need, including Java jars, Ruby gems, Docker images, RPMs, etc., because you’ll probably end up mirroring those in as well (to satisfy the security and licensing processes).
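To make the “where did you get it and how are you tracking it” answer concrete, here is a minimal Python sketch of the kind of provenance record you might keep next to each package before loading it into the repository. It is an illustration only: the function names, the field names, and the example URL are all hypothetical, and Nexus and Artifactory have their own metadata features that this does not replace.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def provenance_record(path: Path, source_url: str, version: str, license_name: str) -> dict:
    """Build one manifest entry: what the file is, where it came from, which version."""
    return {
        "file": path.name,
        "version": version,
        "source_url": source_url,      # where the package was downloaded from
        "license": license_name,       # as confirmed with the licensing team
        "sha256": sha256_of(path),     # lets anyone verify the stored copy later
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


if __name__ == "__main__":
    # Demo with a throwaway file standing in for a downloaded installer.
    with tempfile.TemporaryDirectory() as tmp:
        artifact = Path(tmp) / "example-tool-1.0.tar.gz"   # hypothetical package
        artifact.write_bytes(b"placeholder contents")
        record = provenance_record(
            artifact,
            source_url="https://downloads.example.com/example-tool-1.0.tar.gz",
            version="1.0",
            license_name="Apache-2.0",
        )
        print(json.dumps(record, indent=2))
```

The checksum is the useful part: if Security later asks whether the copy in the repository is the one that was originally scanned, you can recompute the digest and compare.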
Access control. If possible, use the enterprise user directory service (Active Directory or LDAP) instead of setting up a special one for development. Ask if you can get a read-only copy/cache of the main directory. Understand how new user identities are created in the enterprise and how groups/roles are managed. Most DevOps tools can delegate authentication, and often group-based authorization, to a central directory service, so use the one that’s already in the enterprise whenever you can.
Software configuration and installation standards. This can be a tricky one. Many DevOps tools will want to use “standard” locations like C:\Program Files or /usr/bin and these directories may be locked down by policies and governance. While /opt may be a good location on Linux, some companies lock down /opt as it is part of the base O/S image. For companies that use virtual machines or cloud images, it’s often the case that the image is sacrosanct and can’t be modified—the machine may be re-imaged at any time, wiping out any changes that were put in place.
Back in the day when laptops had smaller hard drives, many people would put their data on a separate drive (D:\ drive) so that they could upgrade or re-image their C:\ drive without losing their data. A similar approach can be used to deal with hardened production images: attach a separate volume to the base O/S image for installation of service-specific software. (When I was an Oracle consultant, we used to recommend installing Oracle on a non-base mount point like /u01.)
Some RPM packages are even built to be relocatable, so you can “relocate” the base directory at install time (for example, with rpm’s --prefix option), although this is rare.
One of my clients has a standard for mounting volumes. They use /data/[1..N] as a naming pattern, where /data/1 is the first mounted volume, /data/2 the second, and so on. Making use of that pattern, you can install the DevOps tools under /data/1. I often “mimic” the original /-directories (/usr, /var, /opt) on the /data/1 mount point in order to make some software feel more “at home”, so I’ll install to /data/1/opt or /data/1/usr.
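The mapping is simple enough to sketch in a few lines of Python, which can be handy if you script your installs. This is only an illustration of the naming pattern described above; the relocate function and its parameters are hypothetical names of my own, not part of any tool.

```python
from pathlib import PurePosixPath


def relocate(path: str, volume: int = 1, data_root: str = "/data") -> PurePosixPath:
    """Map an absolute path such as /opt/jenkins to /data/<volume>/opt/jenkins."""
    original = PurePosixPath(path)
    if not original.is_absolute():
        raise ValueError(f"expected an absolute path, got {path!r}")
    if volume < 1:
        raise ValueError("volumes are numbered from 1")
    # Drop the leading "/" so the original tree is re-rooted under the volume.
    return PurePosixPath(data_root, str(volume)).joinpath(*original.parts[1:])


if __name__ == "__main__":
    print(relocate("/opt/jenkins"))
    print(relocate("/var/lib/nexus", volume=2))
```

With the defaults, relocate("/opt/jenkins") gives /data/1/opt/jenkins, and passing volume=2 re-roots the same tree under /data/2 instead.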
A benefit of this approach is that you’ve separated the duties of O/S management: the base image is managed by Operations, and the COTS software is managed by you. Another nice benefit is the ability to independently “snapshot” or back up your volume.
For what it’s worth, Docker provides a nice way to achieve similar separation of duties, but not every client’s security and operations team is comfortable with Docker in production. And even where they are, the above approach works in Docker too; it just takes a bit more effort.
Logging. You are providing an ELK service for your developers. You are using it to log the Jenkins, Nexus and other servers, right? No? Why not? You are also relaying your logs into the Operations central logging service too, right?
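The same “log to both places” habit applies to any glue scripts you write around these services. Here is a minimal sketch using Python’s standard logging module, where one logger fans out to two handlers. It assumes nothing about your actual log pipeline: the in-memory streams are stand-ins for the two destinations, and the script name is hypothetical. In a real setup you might swap in something like logging.handlers.SysLogHandler or a file that your log shipper tails.

```python
import io
import logging


def build_logger(name: str, destinations: list[logging.Handler]) -> logging.Logger:
    """Return a logger that sends every record to all of the given handlers."""
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # avoid duplicate lines via the root logger
    formatter = logging.Formatter("%(name)s %(levelname)s %(message)s")
    for handler in destinations:
        handler.setFormatter(formatter)
        logger.addHandler(handler)
    return logger


# In-memory streams stand in for the two real destinations in this demo.
team_stream = io.StringIO()  # stand-in for the developers' ELK pipeline
ops_stream = io.StringIO()   # stand-in for the Operations central log service

log = build_logger(
    "devops.jenkins-backup",  # hypothetical script name
    [logging.StreamHandler(team_stream), logging.StreamHandler(ops_stream)],
)
log.info("nightly backup finished")
```

Both streams end up with the same line, which is the point: Operations sees what your team sees, without anyone having to ask for it.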
High-Availability. You’ve put the services behind load balancers, right? And on multiple servers? Maybe multiple availability zones? And you are replicating repositories and job definitions, right?
Maintain a “central” software repository, but provide projects and teams with a software repository image or container that they can use to host their project’s software package artifacts. Jenkins build jobs can then place their output into the project teams’ private repository on a server or host that they are paying for. This has the added benefit of letting teams control who sees their output until it’s ready and also forces them to pay for the disk space they are consuming to store their artifacts.
Maintain a central Jenkins master server, but encourage each project team to have a set of private build slaves. Create images or containers that make it easy for each team to have their own set of slaves. Use the folder and RBAC plugins in Jenkins to organize build jobs by team. This way teams can have a private build server with the tools they need on it (Java, Ruby, etc.), they don’t hog the CPU on public ones, and they pay for their own resources depending on how often they build their code.