Date created: March 10, 2018
Last updated: September 12, 2019
Google Stackdriver is a very good product for monitoring and logging your compute instances on Google Cloud, AWS, Azure, Alibaba, etc.
This article covers Stackdriver logging for Google Compute instances running Debian 9.
To make sure that Stackdriver is installed on each instance, I create instance templates that contain a script in the custom metadata section to automate Stackdriver installation and setup.
An important item to remember: startup scripts are executed every time an instance starts, not just on instance creation.
In my startup script for Debian 9 Stretch, I install Google Stackdriver logging and monitoring agents.
#!/bin/bash
sudo apt-get update

# Install Stackdriver logging
curl -sSO https://dl.google.com/cloudagents/install-logging-agent.sh
sudo bash install-logging-agent.sh

# Install Stackdriver monitoring
curl -sSO https://dl.google.com/cloudagents/install-monitoring-agent.sh
sudo bash install-monitoring-agent.sh
If you are manually creating a Compute instance, copy this script into the Automation -> Startup script section when creating the instance.
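Because the startup script runs on every boot, it can be worth guarding the install steps so that they run only once. The following is a minimal sketch, assuming a marker file is an acceptable way to record a finished install. The run_once helper and the marker paths in the usage comment are hypothetical names chosen for this example; they are not part of the Stackdriver installers.

```shell
#!/bin/bash
# Hypothetical helper: run a command only if its marker file does not exist yet.
# Usage: run_once MARKER_FILE COMMAND [ARGS...]
run_once() {
  local marker="$1"
  shift
  if [ -f "$marker" ]; then
    echo "skip: $marker exists"
    return 0
  fi
  # Create the marker only when the command succeeds, so that a failed
  # install is retried on the next boot.
  "$@" && touch "$marker"
}

# Example use inside the startup script (marker paths are assumptions):
#   run_once /var/lib/logging-agent.installed    bash install-logging-agent.sh
#   run_once /var/lib/monitoring-agent.installed bash install-monitoring-agent.sh
```

The marker is written only after a successful run, so an interrupted install is attempted again at the next start.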
Installing the Monitoring Agent
Sending a test Stackdriver log message
logger "Hello Stackdriver"
This message is sent to Stackdriver and can be found in Stackdriver Logging -> GCE VM Instance -> Instance Name. If you do not see this message after about 15 seconds, check for Stackdriver errors in the logfile on the instance.
Stackdriver logfile
To see the latest logs in the Stackdriver logfile for debugging:
tail /var/log/google-fluentd/google-fluentd.log
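When the test message does not arrive, a quick count of dropped-message warnings in the logfile can tell you whether the agent is failing to deliver. This is a rough sketch; the count_dropped function name is made up for this example, and the pattern is based only on the warning lines shown in this article.

```shell
#!/bin/bash
# Count "Dropping N log message(s)" warnings in a google-fluentd logfile.
# The default path is the logfile named above; pass another path to override.
count_dropped() {
  local logfile="${1:-/var/log/google-fluentd/google-fluentd.log}"
  # grep -c prints the number of matching lines (0 when none match).
  # grep exits non-zero when nothing matches; ignore that exit status so
  # callers running under "set -e" do not abort.
  grep -c 'Dropping [0-9][0-9]* log message' "$logfile" || true
}
```

Calling count_dropped with no argument reads the default logfile, which may require sudo depending on file permissions.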
Common Stackdriver errors
No service account assigned to the VM instance
2019-03-10 07:43:17 +0000 [warn]: #0 Dropping 1 log message(s) error="16:Getting metadata from plugin failed with error: #<Signet::AuthorizationError: Error code 404 trying to get security access token\nfrom Compute Engine metadata for the default service account. This\nmay be because the virtual machine instance does not have permission\nscopes specified.\n>" error_code="16"
Missing IAM permission to write to Stackdriver
2019-03-10 07:55:15 +0000 [warn]: #0 Dropping 1 log message(s) error="User unauthorized to access 123456789012" error_code="7"
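The two warnings above differ in their error_code field, which makes a first-pass triage easy to script. The sketch below maps only the two codes shown in this article to the causes named in the headings; the diagnose_line function is a hypothetical helper, and the same codes may well appear for other reasons.

```shell
#!/bin/bash
# Map the error_code field of a "Dropping ... log message(s)" warning to the
# two causes described above. Any other line is reported as unknown.
diagnose_line() {
  case "$1" in
    *'error_code="16"'*) echo "No service account assigned to the VM instance" ;;
    *'error_code="7"'*)  echo "Missing IAM permission to write to Stackdriver" ;;
    *)                   echo "unknown" ;;
  esac
}

# Example: diagnose the most recent line of the agent logfile.
#   diagnose_line "$(tail -n 1 /var/log/google-fluentd/google-fluentd.log)"
```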
Compute instances without a public IP address
For instances without external IP addresses, you must enable Private Google Access to allow the Stackdriver Logging agent to send logs.
Verify that your instance can resolve the following DNS hostnames:
-
-
- oauth2.googleapis.com
- monitoring.googleapis.com
- stackdriver.googleapis.com
-
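The DNS check can be scripted along these lines. This is a sketch that covers only the hostnames legible in the list above, so the list may be incomplete. The check_dns function and the RESOLVER variable are assumptions of this example; RESOLVER can be any command that exits 0 when a hostname resolves, and defaults to getent hosts.

```shell
#!/bin/bash
# Hostnames taken from the list above (that list may be incomplete).
HOSTS="oauth2.googleapis.com monitoring.googleapis.com stackdriver.googleapis.com"

# Print one ok/FAIL line per hostname; return 1 if any lookup failed.
check_dns() {
  local resolver="${RESOLVER:-getent hosts}"
  local host
  local failed=0
  for host in $HOSTS; do
    # $resolver is intentionally unquoted so "getent hosts" splits into words.
    if $resolver "$host" > /dev/null 2>&1; then
      echo "ok   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}
```

Run check_dns on the instance itself; a non-zero exit status means at least one hostname did not resolve.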
Google Stackdriver service account file location
The Stackdriver agent checks the following location and, if a credentials file is present there, uses it instead of the service account credentials from the metadata server.
/etc/google/auth/application_default_credentials.json
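To see which credentials an instance is likely to use, you can test for that file. This is a small sketch based only on the rule described above; the credential_source function name is made up for this example, and it does not validate the contents of the file.

```shell
#!/bin/bash
# Report which credential source applies, based on whether the
# application default credentials file exists.
credential_source() {
  local path="${1:-/etc/google/auth/application_default_credentials.json}"
  if [ -f "$path" ]; then
    echo "file: $path"
  else
    echo "metadata service account"
  fi
}
```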
IAM Permissions required for Stackdriver
Stackdriver Monitoring
Your VM instance needs permission to write metric data (for example monitoring.timeSeries.create), which can be added via the role roles/monitoring.metricWriter.
Stackdriver Logging
Your VM instance needs the permission logging.logEntries.create, which can be added via the role roles/logging.logWriter.
Stackdriver Error Reporting
Your VM instance needs the permission errorreporting.errorEvents.create, which can be added via the role roles/errorreporting.writer.
Stackdriver Profiler
Your VM instance needs the permissions cloudprofiler.profiles.create and cloudprofiler.profiles.update, which can be added via the role roles/cloudprofiler.agent.
Stackdriver Trace
Your VM instance needs the permission cloudtrace.traces.patch, which can be added via the role roles/cloudtrace.agent.
Stackdriver Debugger
You don’t directly give members permissions; instead, you grant them one or more roles on a GCP resource, which have one or more permissions bundled within them. Refer to this document.
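As an illustration of granting roles rather than individual permissions, the sketch below prints the gcloud commands that would bind the logging and monitoring writer roles to a service account. It only prints the commands, so you can review them before running anything. The print_grant_commands function is a hypothetical helper, and the project ID and service account email are placeholders you supply.

```shell
#!/bin/bash
# Print (do not run) the gcloud commands that would grant the logging and
# monitoring writer roles to a service account.
# Usage: print_grant_commands PROJECT_ID SERVICE_ACCOUNT_EMAIL
print_grant_commands() {
  local project="$1"
  local sa_email="$2"
  local role
  for role in roles/logging.logWriter roles/monitoring.metricWriter; do
    echo "gcloud projects add-iam-policy-binding $project --member=serviceAccount:$sa_email --role=$role"
  done
}
```

Printing first and executing later keeps an accidental IAM change from slipping through; pipe the output to bash once the commands look right.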
To determine the currently installed versions:
Stackdriver Monitoring
dpkg -l stackdriver-agent
Output:
stackdriver-agent 5.5.2-382.stre amd64 Stackdriver system metrics collection daemon
Stackdriver Logging
dpkg -l google-fluentd
Output:
google-fluentd 1.6.0-1 amd64 Google Fluentd: A data collector for Google
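If you only want the version string, for example to compare versions across instances, dpkg-query can print it directly. This is a minimal sketch; the agent_version function and the DPKG_QUERY override are assumptions of this example, with the override existing only so the function can be exercised without dpkg.

```shell
#!/bin/bash
# Print only the installed version of a package.
# DPKG_QUERY may be overridden for testing; it defaults to the real dpkg-query.
# Prints "not found" when dpkg-query fails (for example, an unknown package).
agent_version() {
  local pkg="$1"
  "${DPKG_QUERY:-dpkg-query}" -W -f='${Version}\n' "$pkg" 2>/dev/null || echo "not found"
}

# Example:
#   agent_version stackdriver-agent
#   agent_version google-fluentd
```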
Add a startup script remotely
You can add a startup script to a running instance from the CLI. Note: this command will replace the existing startup script.
Copy the following to a local file (startup.script in this example) and modify it to fit your requirements:
#!/bin/bash
apt-get update

# Install Stackdriver logging
curl -sSO https://dl.google.com/cloudagents/install-logging-agent.sh
bash install-logging-agent.sh

# Install Stackdriver monitoring
curl -sSO https://dl.google.com/cloudagents/install-monitoring-agent.sh
bash install-monitoring-agent.sh
Execute the following command from your desktop:
gcloud compute instances add-metadata INSTANCE_NAME --metadata-from-file startup-script=startup.script
You can also store your scripts in Google Storage:
gcloud compute instances add-metadata INSTANCE_NAME --metadata startup-script-url=gs://bucket/file
The startup script will be executed the next time the instance reboots. To reset the instance immediately:
gcloud compute instances reset INSTANCE_NAME --zone=ZONE
Restarting the Stackdriver agent
sudo service google-fluentd restart
Stackdriver agent status
sudo service google-fluentd status
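The status and restart commands can be combined into a small check that restarts the agent only when it is not running. This is a sketch that assumes the status command exits non-zero when the agent is stopped, as init scripts conventionally do. The ensure_fluentd_running function and the SERVICE_CMD override are hypothetical names for this example.

```shell
#!/bin/bash
# Restart google-fluentd only if "service google-fluentd status" reports failure.
# SERVICE_CMD may be overridden for testing; it defaults to "service".
# Run as root (or via sudo) so that the restart is permitted.
ensure_fluentd_running() {
  local svc="${SERVICE_CMD:-service}"
  if "$svc" google-fluentd status > /dev/null 2>&1; then
    echo "google-fluentd is running"
  else
    echo "google-fluentd is not running, restarting"
    "$svc" google-fluentd restart
  fi
}
```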
Upgrading the Stackdriver agent – Debian & Ubuntu systems
sudo apt-get install --only-upgrade google-fluentd
Note: This command does not change the agent’s configuration files. To get the latest default configuration and catch-all configuration files, run the following commands instead:
sudo apt-get install --only-upgrade -o Dpkg::Options::="--force-confnew" google-fluentd-catch-all-config
sudo apt-get install --only-upgrade google-fluentd
Uninstall the Stackdriver agent – Debian & Ubuntu systems
sudo service google-fluentd stop
sudo apt-get remove google-fluentd google-fluentd-catch-all-config
I design software for enterprise-class systems and data centers. My background is 30+ years in storage (SCSI, FC, iSCSI, disk arrays, imaging) virtualization. 20+ years in identity, security, and forensics.
For the past 14+ years, I have been working in the cloud (AWS, Azure, Google, Alibaba, IBM, Oracle) designing hybrid and multi-cloud software solutions. I am an MVP/GDE with several.