Scalable Certificate Monitoring
The enforcement of HTTPS by web browsers has introduced the pain of certificate management to small and medium businesses. Here are my rules of thumb to make your life much easier.
Corporations have substantial budgets for IT security management, and certificate management is one of the largest items. What has happened in the last five years is that small and medium businesses have started experiencing the same pain: everyone who wants to keep their internet presence now has to implement web encryption - HTTPS. Preventing certificate expiration has become an integral part of "keeping the business running".
As far as I can see, you can easily manage your certificates the way you want if you own a handful of servers and you're educated in IT management. You possibly remember the expiration dates of all your certificates by heart, just as you can name all your servers. I am pretty sure you have strong feelings about someone like me telling you how to do things. This piece is not for you.
I wrote this for people who are less lucky and have to deal with growing numbers of certificates without getting more people, a bigger budget, or a raise.
Most of us have better things to do than firefight missed certificate expirations. What we want is a simple approach that will make our lives easier. An approach that can turn the hassle of certificate management into one of the most powerful intelligence tools about your IT.
The silver lining is that there are a few simple rules that significantly help if you stick with them.
Wildcard Certificates
Wildcard certificates are a powerful tool. They can authenticate an unlimited number of domain names and services. The trouble with such powerful tools is that they can easily become just as powerfully dangerous.
Long-term wildcard certificates are also much more expensive than certificates where you have to list every name they can be used with. This is another incentive to use them to the limit.
Danger: the incorrect way of using wildcard certificates is to copy them (and their private keys) to many different locations. In a small company, you are likely to have a number of small servers or virtual machines that you've added over time for different purposes. You would need to copy your wildcard certificate to every single one of them. This will cause a problem a year later when you need to renew it. Getting a new certificate is easy. The hard bit is installing it. The success of the installation will depend on your spreadsheet listing all the locations where you need to install the new one, which is prone to errors.
Wildcard certificates are certificates that contain a domain name that starts with "*", e.g., "*.keychest.net". It is beautiful in the sense that you can use that one certificate for all direct subdomains: "a.keychest.net", "dev.keychest.net", etc. The "*" stands for exactly one label, so it will not cover a deeper name such as "mywork.dev.keychest.net". It will also not cover "keychest.net", as this main domain has no label in front of ".keychest.net".
Rule of thumb: A good use of wildcard certificates is on load balancers and servers that host a number of subdomains. This ensures that there is (ideally) just one location where the wildcard certificate is installed. Yet, you get all the benefits. It requires a good design of your networks. If you're not sure, stick with "SAN" certificates that contain a list of all domain names where they can be used.
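To make the matching rule concrete, here is a minimal sketch of how a wildcard name is compared with a hostname. It is a simplification for illustration, not the validation code of any browser or TLS library: the function name is made up, and it only handles a "*" that fills the whole leftmost label.

```python
def wildcard_covers(pattern: str, hostname: str) -> bool:
    """Return True if a certificate name such as '*.keychest.net' covers hostname.

    Simplified sketch: only a '*' that is the entire leftmost label is
    treated as a wildcard; anything else is compared literally.
    """
    pattern_labels = pattern.lower().rstrip(".").split(".")
    host_labels = hostname.lower().rstrip(".").split(".")

    # A wildcard stands for exactly one label, so the label counts must match.
    if len(pattern_labels) != len(host_labels):
        return False

    if pattern_labels[0] == "*":
        # The wildcard label must match something; the rest must be identical.
        return bool(host_labels[0]) and pattern_labels[1:] == host_labels[1:]

    # No wildcard: the names must be identical.
    return pattern_labels == host_labels


for name in ("a.keychest.net", "keychest.net", "mywork.dev.keychest.net"):
    print(name, wildcard_covers("*.keychest.net", name))
```

Only the first of the three names is covered. The bare domain and the two-level subdomain each need their own entry in the certificate.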
Unique Names
We can sometimes see a group of certificates, all with the same name and similar expiration dates. You often need multiple certificates when you have a failover or load-balanced service. Each of these certificates will be installed on a different server. The problem is that it is impossible to instantly find out which certificate is where.
It's a good idea to add markers that would help you quickly identify the source of potential incidents. As many of us use Let's Encrypt, and we should have as few rules as possible, the scope for "tags" is quite limited and we are basically left with an additional domain name.
Rule of thumb: create alias DNS records that you can add to your certificates simply to identify the services they are used with. The services can all be on the same server, i.e., the "functional" domain name is the same and you add a new one for each: "mq.keychest.net", "mx.keychest.net", "web.keychest.net". That way you can instantly see if a potential problem is related to your RabbitMQ, Postfix, or Apache.
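As a rough sketch of how such marker names can be used in monitoring, the snippet below maps the names found in a certificate to the service they identify. The mapping table and function names are hypothetical. The input dictionary follows the shape returned by Python's `ssl.SSLSocket.getpeercert()`, where `subjectAltName` is a sequence of `(type, value)` pairs.

```python
# Hypothetical marker names, modelled on the examples in the text.
SERVICE_MARKERS = {
    "mq.keychest.net": "RabbitMQ",
    "mx.keychest.net": "Postfix",
    "web.keychest.net": "Apache",
}


def dns_names(peer_cert: dict) -> list:
    """Extract DNS names from a dict shaped like ssl.SSLSocket.getpeercert()."""
    return [value for kind, value in peer_cert.get("subjectAltName", ()) if kind == "DNS"]


def services_for_certificate(peer_cert: dict) -> list:
    """Return the services identified by marker names in the certificate."""
    found = []
    for name in dns_names(peer_cert):
        service = SERVICE_MARKERS.get(name.lower().rstrip("."))
        if service and service not in found:
            found.append(service)
    return found


# A hand-written example certificate: the functional name plus one marker name.
example_cert = {"subjectAltName": (("DNS", "keychest.net"), ("DNS", "mx.keychest.net"))}
print(services_for_certificate(example_cert))
```

When an expiry alert arrives for this certificate, the marker name tells you straight away that it belongs to the mail server.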
Define Business Renewal Deadlines
The main criticism of certificates is that their expiration, while purely a technical "detail", has a direct impact on the services that use them. What it means is that a low-cost or even free technical detail can cause losses of tens of millions of dollars. Without any warning.
What we implemented in KeyChest early on was to de-couple the technical expiry from the "business expiry" (e.g., for Let's Encrypt certificates this difference is set to 2 weeks). This de-coupling, while a simple measure, has an immense business impact. You get rid of the cliff-edge caused by the technical expiry of certificates and get a gradual devaluation of certificates that can last weeks to months, or even years. This works well with the usual business processes based on incident escalation that are used for any other security incidents.
More importantly - when you get an alert of an end-of-life certificate on Saturday night, you can finish your pint, go home, enjoy a family Sunday and fix things on Monday. You get the confidence that certificates will become a part of your life, but will not take it over.
Rule of thumb: define business deadlines for all your certificate renewals. Take into account your availability on the worst day of the year, internal policies, and escalation processes. When your boss's key performance indicators (KPIs) show RED, you know you still have enough time to prevent any business impact.
Define your own deadlines for raising critical incidents. This will give you time to fix issues without business downtimes.
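The idea of a business deadline can be sketched in a few lines. This is an illustration under assumed values, not how KeyChest implements it: the 14-day margin mirrors the two-week example above, while the warning window, the status names, and the function name are my own placeholders.

```python
from datetime import datetime, timedelta, timezone

# Assumed policy values; pick margins that match your own escalation process.
BUSINESS_MARGIN = timedelta(days=14)  # business expiry = technical expiry - 14 days
WARNING_WINDOW = timedelta(days=14)   # start warning this long before business expiry


def renewal_status(not_after: datetime, now: datetime) -> str:
    """Classify a certificate by its business deadline, not its technical expiry."""
    business_expiry = not_after - BUSINESS_MARGIN
    if now >= not_after:
        return "EXPIRED"  # technical expiry: services are already failing
    if now >= business_expiry:
        return "RED"      # business deadline missed, but there is still time to fix it
    if now >= business_expiry - WARNING_WINDOW:
        return "AMBER"    # renewal should be scheduled now
    return "GREEN"


not_after = datetime(2024, 6, 30, tzinfo=timezone.utc)
for day in (1, 10, 20):
    now = datetime(2024, 6, day, tzinfo=timezone.utc)
    print(now.date(), renewal_status(not_after, now))
```

The point of the design is that RED fires while the certificate is still valid, so the alert starts an ordinary escalation process instead of an outage.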
Use DNS CAA Records and CT Logs
One of the biggest issues with free Let's Encrypt certificates is unauthorized internal use. In most cases, it will be justified as a "quick fix to get things working again". Sometimes it will be misused to bypass security policies by diverting traffic or allowing personal devices to pretend to be part of your company - to keep things up and running. Sometimes, however, it can be down to outright malicious actions.
There are two main technological solutions. The first one is to use your DNS records to limit the CAs that can issue certificates for your domains. All publicly trusted CAs have to check DNS CAA records before they issue a certificate, so you can specify which CAs are allowed to issue certificates for your domain. You can also add an "iodef" property with a contact address, which lets a CA report requests that violate your policy. Such reports are optional for CAs, so don't rely on them alone.
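For illustration, here is a small helper that prints CAA records in the usual zone-file notation. The function name and the contact address are made up for this example. "letsencrypt.org" is the identifier Let's Encrypt uses in CAA records; check your own CA's documentation for theirs, and check how your DNS provider expects the records to be entered.

```python
def caa_records(domain, allowed_cas, report_to=None):
    """Build zone-file style CAA lines for a domain.

    allowed_cas: CA identifiers permitted to issue, e.g. ["letsencrypt.org"].
    report_to:   optional e-mail address for the "iodef" reporting property.
    """
    name = domain.rstrip(".") + "."
    lines = []
    if allowed_cas:
        for ca in allowed_cas:
            lines.append(f'{name} IN CAA 0 issue "{ca}"')
    else:
        # An issue value of ";" tells every CA that issuance is not allowed.
        lines.append(f'{name} IN CAA 0 issue ";"')
    if report_to:
        lines.append(f'{name} IN CAA 0 iodef "mailto:{report_to}"')
    return lines


# Hypothetical contact address, used here only as an example.
for line in caa_records("keychest.net", ["letsencrypt.org"], "security@keychest.net"):
    print(line)
```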
The second tool is the use of CT (Certificate Transparency) logs. Those are global logs of all issued certificates, and if you search for "CT log search", you will find several services that let you search for certificates issued for your domain(s). Or you can use crt.sh to run quick searches.
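A regular check can be scripted. The sketch below queries crt.sh through its JSON output (`output=json`), which is a convenience interface rather than a documented, guaranteed API. Treat the field names `issuer_name` and `name_value` as assumptions about the response and verify them against a real reply. The function names are my own.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def crtsh_url(domain: str) -> str:
    """Build a crt.sh search URL for all subdomains of a domain."""
    # "%" is the crt.sh wildcard; urlencode escapes it as "%25".
    return "https://crt.sh/?" + urlencode({"q": f"%.{domain}", "output": "json"})


def fetch_entries(domain: str) -> list:
    """Download and parse the search results (requires network access)."""
    with urlopen(crtsh_url(domain), timeout=30) as response:
        return json.load(response)


def unexpected_issuers(entries, allowed_fragments):
    """Return (names, issuer) pairs whose issuer matches none of the allowed CAs."""
    rogue = []
    for entry in entries:
        issuer = entry.get("issuer_name", "")
        if not any(fragment.lower() in issuer.lower() for fragment in allowed_fragments):
            rogue.append((entry.get("name_value", ""), issuer))
    return rogue
```

A scheduled job could call `unexpected_issuers(fetch_entries("keychest.net"), ["Let's Encrypt"])` and raise an incident whenever the result is not empty. The search pattern covers subdomains only, so run a second query for the bare domain if you need it.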
Rule of thumb: Control and limit CAs that can issue certificates for your internet domains and set up mechanisms to receive notifications. Regularly check that there are no rogue certificates issued for your domains.
Extended Validation (EV) Certificates
It is now easy to demonstrate that users can hardly tell whether a website is protected with an EV certificate or an OV (organization validated) certificate. A year or two back, web browsers showed a big green bar with the company name when you opened a website with an EV certificate. Not any more.
Rule of thumb: save on EV certificates and use the leftover budget for useful gadgets that will make you look forward to getting to the office every morning.
I will regularly revisit this post and add new items as they become relevant. In the meantime, you can try our instant domain expiry audit - completely free at https://keychest.net/auditnow. (You can check how Microsoft Teams is getting on with improving its own certificate management after a recent downtime.)
Or create a KeyChest account to get most of the "policies" above out-of-the-box.