To get started, let’s quickly review the metrics to track for the Amazon DocumentDB primary (writer) node. The definitive monitoring reference is the monitoring chapter in the Amazon DocumentDB Developer Guide. That guide points out several types of metrics you should consider:
Note that the goal of monitoring is simply to provide information that you can act on. If you see concerning patterns in your database metrics, you have several options, including resizing the database, adding a caching layer, or limiting the number of concurrent connections.
As a quick review, Amazon CloudWatch stores metrics for up to 15 months. Metrics are time-ordered sets of data points. Metrics have a name, a namespace, and optionally dimensions. Dimensions are categories that help you refine metrics of interest.
The diagram below shows a view of a single Amazon CloudWatch metric, ReadLatency. You may choose to see this metric for a single database instance, for all the read replicas in a cluster, or for an entire cluster.
The Amazon CloudWatch Concepts documentation has more information on metrics and dimensions.
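To make the name, namespace, and dimensions split concrete, here is a minimal Python sketch of the three ReadLatency views described above. They differ only in their dimensions. It assumes the `AWS/DocDB` namespace and the `DBInstanceIdentifier`, `DBClusterIdentifier`, and `Role` dimension names. The identifiers `my-instance` and `my-cluster` and the `MetricRef` class are hypothetical placeholders. Nothing here calls AWS.

```python
from dataclasses import dataclass, field

NAMESPACE = "AWS/DocDB"  # assumed CloudWatch namespace for Amazon DocumentDB


@dataclass
class MetricRef:
    """Identifies one CloudWatch metric: namespace + name + optional dimensions."""
    name: str
    dimensions: dict = field(default_factory=dict)
    namespace: str = NAMESPACE

    def dimension_list(self):
        """Return dimensions in the Name/Value list form used by the CloudWatch API."""
        return [{"Name": k, "Value": v} for k, v in self.dimensions.items()]


# Hypothetical identifiers; substitute your own instance and cluster names.
single_instance = MetricRef("ReadLatency", {"DBInstanceIdentifier": "my-instance"})
read_replicas = MetricRef(
    "ReadLatency", {"DBClusterIdentifier": "my-cluster", "Role": "READER"}
)
whole_cluster = MetricRef("ReadLatency", {"DBClusterIdentifier": "my-cluster"})

for ref in (single_instance, read_replicas, whole_cluster):
    print(ref.namespace, ref.name, ref.dimension_list())
```

The same metric name with a different dimension set is a different series, which is why the console asks you to pick a dimension grouping before it lists any metrics.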
For this workshop, you will monitor several metrics on the primary node:
There are several other metrics available, but this set is a good basic start.
On the Amazon DocumentDB console, you will find a built-in monitoring dashboard for each cluster by navigating to the Clusters part of the console and clicking on the cluster identifier.
This built-in cluster dashboard shows several metrics in these categories:
The screenshot below shows the first two rows of the Resource utilization section for a cluster.
Similarly, you will find an instance monitoring dashboard by navigating to the Instances section of the console and clicking on an instance identifier. The instance metrics shown fall under Resource Utilization, Throughput, Latency, Operations, and System.
You will set up an Amazon CloudWatch dashboard for the primary node manually now, and see how to automate the process in a later chapter.
First, go to the Amazon CloudWatch console and make sure you are in the correct region. Now go to the Dashboards page, click
Give your dashboard a name.
Select a type of visualization (widget). For most metrics, the Line widget is a good place to start.
Our data source is
Now you can add a graph to the dashboard. On the next page, start by looking at All metrics, then enter DocDB in the search field to find all of the metrics available in the
DocDB > Cluster Metrics by Role. On the next page, find the CPUUtilization metric in the
On the next page, give the graph a title by clicking on the pencil icon near the top of the page. Then select the Graphed metrics tab and review the options for the metric. By default the widget displays the average over each 5-minute period, but you can choose a different period or a different statistic, such as the maximum.
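To see what the period and aggregation settings mean, the following Python sketch groups raw samples into fixed-length periods and reduces each group with a chosen statistic. It only illustrates the idea and is not how CloudWatch is implemented. The `aggregate` helper and the sample values are made up for the example.

```python
from collections import defaultdict
from statistics import mean


def aggregate(points, period_seconds=300, stat=mean):
    """Group (timestamp_seconds, value) samples into periods and reduce each group."""
    buckets = defaultdict(list)
    for timestamp, value in points:
        period_start = timestamp - timestamp % period_seconds
        buckets[period_start].append(value)
    return {start: stat(values) for start, values in sorted(buckets.items())}


# Made-up CPU samples: (seconds since start, percent utilization)
samples = [(0, 10.0), (60, 20.0), (240, 30.0), (300, 50.0), (360, 70.0)]

five_minute_average = aggregate(samples)
five_minute_maximum = aggregate(samples, stat=max)

print(five_minute_average)  # {0: 20.0, 300: 60.0}
print(five_minute_maximum)  # {0: 30.0, 300: 70.0}
```

An average smooths out short spikes, so a maximum or a shorter period can be a better choice when you are looking for brief bursts of load.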
Go back to the list of all dashboards and select the dashboard you just created.
You can adjust the time frame the dashboard shows, which defaults to 3 hours, by accessing the options menu in the top right corner. Feel free to explore the other options.
At this point you can click the Add Widget button and add graphs for the other metrics discussed earlier. You will see how to automate that process in a later chapter; for now, try adding just one or two additional metrics.
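A dashboard you build by hand is stored as a JSON document, which you can inspect in the console. As a rough sketch of that structure, the Python below builds a body with one line widget per metric, two widgets per row on the 24-column dashboard grid, and a 3-hour time range. The `AWS/DocDB` namespace, the `DBClusterIdentifier` and `Role` dimensions, the `WRITER` role value, and the `my-cluster` and `us-east-1` placeholders are assumptions to adapt. The `dashboard_body` helper is hypothetical.

```python
import json

GRID_COLUMNS = 24  # CloudWatch dashboards lay widgets out on a 24-column grid


def dashboard_body(metric_names, cluster_id, region, width=12, height=6):
    """Return a dashboard body (JSON string) with one line widget per metric."""
    per_row = GRID_COLUMNS // width
    widgets = []
    for index, name in enumerate(metric_names):
        widgets.append({
            "type": "metric",
            "x": (index % per_row) * width,
            "y": (index // per_row) * height,
            "width": width,
            "height": height,
            "properties": {
                "view": "timeSeries",  # line graph
                "title": f"Primary {name}",
                "region": region,
                "stat": "Average",
                "period": 300,
                # Assumed namespace and dimensions for the writer node
                "metrics": [
                    ["AWS/DocDB", name, "DBClusterIdentifier", cluster_id, "Role", "WRITER"]
                ],
            },
        })
    # "-PT3H" is a relative start time: the last 3 hours
    return json.dumps({"start": "-PT3H", "widgets": widgets}, indent=2)


# Hypothetical cluster name and region
body = dashboard_body(["CPUUtilization", "ReadLatency"], "my-cluster", "us-east-1")
print(body)
```

A string like this could be supplied as the `DashboardBody` argument of the CloudWatch `PutDashboard` API (for example, boto3's `put_dashboard`), though this sketch stops at producing the JSON.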
You can review the Amazon CloudWatch Metrics documentation to learn about other concepts such as metric math, which lets you produce metrics that are a combination of other metrics.
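As an illustration of metric math, the sketch below builds a widget definition whose graphed line is the sum of two other metrics. It assumes the `ReadIOPS` and `WriteIOPS` metrics in the `AWS/DocDB` namespace, neither of which is part of the walkthrough above, and the same assumed `DBClusterIdentifier` and `Role` dimensions. The `total_iops_widget` helper, cluster name, and region are hypothetical.

```python
import json


def total_iops_widget(cluster_id, region):
    """Build a widget that graphs ReadIOPS + WriteIOPS as one metric math line."""
    dims = ["DBClusterIdentifier", cluster_id, "Role", "WRITER"]  # assumed dimensions
    return {
        "type": "metric",
        "properties": {
            "view": "timeSeries",
            "title": "Primary total IOPS",
            "region": region,
            "stat": "Average",
            "period": 300,
            "metrics": [
                # The expression refers to the input metrics by their ids
                [{"expression": "m1 + m2", "label": "ReadIOPS + WriteIOPS", "id": "e1"}],
                # Inputs are hidden so only the combined line is drawn
                ["AWS/DocDB", "ReadIOPS", *dims, {"id": "m1", "visible": False}],
                ["AWS/DocDB", "WriteIOPS", *dims, {"id": "m2", "visible": False}],
            ],
        },
    }


widget = total_iops_widget("my-cluster", "us-east-1")
print(json.dumps(widget, indent=2))
```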