Recently I’ve been working in Google Cloud Platform (GCP) in a highly
distributed system that is partitioned into many different services with
various deployment strategies. A lot of it runs in Google Kubernetes Engine
(GKE), a lot of it runs in Google Cloud Functions (GCF), and a lot of it runs
in other bits and pieces of GCP infrastructure. Even the areas that are managed
via GKE and Cloud Functions are heterogeneous, with many idiosyncratic
infrastructure configurations.
This has made digging into the system as a whole quite difficult, as it can be
tricky even to know about the existence of a service, let alone where to start
looking for logging and other insights into its operation.
One tactic that has been quite effective so far is taking advantage of the
query syntax in the GCP Logging tool. This is a full-fledged query language
that lets you explore large amounts of logging data to identify what might be
useful for understanding a particular service.
A good starting point is to track down the logs for a particular GKE
container or a particular Cloud Function, which would look like one of these:
resource.labels.container_name="foobar-gke-container"
resource.labels.function_name="foobar-cloud-function"
Make use of the auto-complete as it might suggest a searchable attribute that
you weren’t previously aware of.
Moving on: the single most useful tool is the regex operator. The basic
equality operator is OK when you know the exact names you’re looking for, but
as that is often not the case on sprawling projects that you are unfamiliar
with, you need something more flexible. The regex operator lets you make much
wider-reaching log queries.
resource.labels.container_name=~"foobar"
This will find log entries whose container name contains "foobar".
This way you can try to find logs for containers or functions whose exact name
you can only guess at.
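To make the difference between the two operators concrete, here is a rough local sketch in Python. It is only an approximation: the container names are invented for illustration, and Python's re module is not the same regex engine as the one behind the Logging tool, although for a plain pattern like "foobar" an unanchored search should behave the same way.

```python
import re

# Hypothetical container names, invented for illustration only.
names = ["foobar-gke-container", "legacy-foobar-worker", "billing-api"]

pattern = "foobar"

# Exact equality only matches a name you already know in full.
exact = [n for n in names if n == pattern]

# re.search is unanchored, so it matches the pattern anywhere in the name.
partial = [n for n in names if re.search(pattern, n)]

print(exact)    # []
print(partial)  # ['foobar-gke-container', 'legacy-foobar-worker']
```

The equality check finds nothing because no container is called exactly "foobar", while the regex search surfaces both containers that mention it.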
You can combine this with the OR operator for even wider exploratory searches:
resource.labels.container_name=~"foobar" OR resource.labels.function_name=~"foobar"
Note that you can use parentheses to group terms and build arbitrary boolean queries:
(resource.labels.container_name=~"foobar" OR resource.labels.function_name=~"foobar")
AND "baz"
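If you find yourself retyping the same combined query for different names, it can help to assemble the text with a small script and paste the result into the Logging tool. The following is a minimal sketch; the helper names any_of and all_of are my own and not part of any GCP library.

```python
def any_of(*terms: str) -> str:
    """Join terms with OR and wrap the group in parentheses."""
    return "(" + " OR ".join(terms) + ")"


def all_of(*terms: str) -> str:
    """Join terms with AND."""
    return " AND ".join(terms)


name = "foobar"
query = all_of(
    any_of(
        f'resource.labels.container_name=~"{name}"',
        f'resource.labels.function_name=~"{name}"',
    ),
    '"baz"',
)
print(query)
```

This prints the same parenthesised query as above on a single line. Always wrapping the OR group in parentheses means you do not have to think about operator precedence when you add more terms.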
Another useful element of the syntax is exclusion, i.e. the NOT operator.
This can be written as the keyword NOT or as a - before the query term, e.g.
-resource.labels.container_name=~"baz"
This will exclude any log entry whose container name includes "baz".
You can start with a wide exploratory query and then narrow it down by adding
negative exclusions for each irrelevant thing you see in the search results.
This can be a good tactic for navigating your way to some specific logs that
might be useful for your investigations.
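The narrowing loop can be sketched as a small Python helper that keeps a list of exclusions and regenerates the query each time you spot more noise. This is only an illustration: the function name narrow and the "qux" term are hypothetical, and the sketch uses the keyword form NOT rather than the - prefix.

```python
def narrow(base: str, exclusions: list[str]) -> str:
    """Return the base query AND-ed with a NOT clause for each exclusion."""
    clauses = [f"({base})"] + [f"NOT {term}" for term in exclusions]
    return " AND ".join(clauses)


base = 'resource.labels.container_name=~"foobar"'

# Terms spotted in the results that turned out to be irrelevant.
noise = [
    'resource.labels.container_name=~"baz"',
    'resource.labels.container_name=~"qux"',
]

print(narrow(base, noise))
```

Each time an irrelevant source shows up in the results, append another term to the list and rerun the script to get the next, narrower query.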