Integrating the Microsoft Azure Monitor REST API to Monitor MongoDB

1. Introduction

With the rapid growth of IT technology, more and more applications are built on diverse technology stacks, and application architectures and service call chains have grown increasingly complex, so the need for a robust monitoring system naturally follows. In real production environments, the objects to be monitored are rich and varied, and there are many tools and platforms to choose from: from early tools such as Smokeping, Cacti, Nagios, and Ganglia, to today's more popular and more capable Zabbix, Prometheus, and Open-Falcon, as well as application-level tracing platforms such as CAT and Zipkin. These are all solid tools; applied appropriately, they can cover monitoring needs across many scenarios and dimensions. But maintaining these platforms is not easy: from reliability and performance to custom development, a great deal of effort has to go into the underlying code, and putting them straight into production without that work carries significant risk. Against this background, Azure offers its own monitoring service, Azure Monitor, which frees customers from worrying about the reliability of the monitoring system itself so they can focus on the reliability of their business and on implementing their monitoring requirements. Azure Monitor supports monitoring from the infrastructure layer through the application layer and even containers. This article walks through collecting MongoDB metrics via the Azure Monitor REST API, simulating the scenario of building custom monitoring tooling on top of Azure Monitor.


2. Azure Monitor Overview

Azure Monitor is a comprehensive solution for collecting, analyzing, and acting on telemetry from cloud and on-premises environments, maximizing the availability and performance of your applications and services. It helps you understand how your applications are performing and proactively identifies issues affecting them and the resources they depend on. The diagram below gives a high-level view of Azure Monitor: at the center are the data stores for metrics and logs (the two fundamental data types Azure Monitor uses); on the left are the monitoring data sources that populate those stores; on the right are the functions Azure Monitor performs on the collected data, such as analysis, alerting, and streaming to external systems.

Azure Monitor supports many data types across many layers of monitoring; see the official documentation for details, which I won't repeat here. To query the collected data, Azure Monitor uses the same Kusto Query Language (KQL) used by Azure Data Explorer, including advanced capabilities such as aggregations, joins, and smart analytics (a sample query appears in section 3.3.3). Enough talk; let's get started.


3. Monitoring MongoDB with the Azure Monitor REST API

3.1 Create the CentOS Test Server

First, deploy a CentOS 7.7 instance with Terraform. The full .tf file is shown below; the variables it references (subscription_id, client_id, and so on) are assumed to be defined separately, for example in a variables.tf and terraform.tfvars.

# Configure the Microsoft Azure Provider
provider "azurerm" {
  subscription_id = "${var.subscription_id}"
  client_id       = "${var.client_id}"
  client_secret   = "${var.client_secret}"
  tenant_id       = "${var.tenant_id}"
}

# Create a resource group if it doesn't exist
resource "azurerm_resource_group" "myterraformgroup" {
  name     = "${var.lab_namespace}rg0001"
  location = "${var.location}"
  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create virtual network
resource "azurerm_virtual_network" "myterraformnetwork" {
  name                = "${var.lab_namespace}vnet0001"
  address_space       = ["10.11.0.0/16"]
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create Network Security Groups and rules
resource "azurerm_network_security_group" "publicnsg" {
  name                = "public-nsg"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

  security_rule {
    name                       = "SSH"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "RDP"
    priority                   = 1002
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "3389"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  tags = {
    environment = "Azure Terraform Automation"
  }
}

resource "azurerm_network_security_group" "opnsg" {
  name                = "op-nsg"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

  security_rule {
    name                       = "SSH"
    priority                   = 1001
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "22"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  security_rule {
    name                       = "RDP"
    priority                   = 1002
    direction                  = "Inbound"
    access                     = "Allow"
    protocol                   = "Tcp"
    source_port_range          = "*"
    destination_port_range     = "3389"
    source_address_prefix      = "*"
    destination_address_prefix = "*"
  }

  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create subnets
resource "azurerm_subnet" "publicsubnet" {
  name                      = "publicsubnet0001"
  resource_group_name       = "${azurerm_resource_group.myterraformgroup.name}"
  virtual_network_name      = "${azurerm_virtual_network.myterraformnetwork.name}"
  address_prefix            = "10.11.0.0/23"
  network_security_group_id = "${azurerm_network_security_group.publicnsg.id}"
}

resource "azurerm_subnet" "opsubnet" {
  name                      = "opsubnet0001"
  resource_group_name       = "${azurerm_resource_group.myterraformgroup.name}"
  virtual_network_name      = "${azurerm_virtual_network.myterraformnetwork.name}"
  address_prefix            = "10.11.2.0/23"
  network_security_group_id = "${azurerm_network_security_group.opnsg.id}"
}

# Create public IP
resource "azurerm_public_ip" "centospips" {
  name                = "centos0001pip"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"
  allocation_method   = "Static"

  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create network interface
resource "azurerm_network_interface" "centosnics" {
  name                = "centos0001nic0001"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.myterraformgroup.name}"

  ip_configuration {
    name                          = "centos0001nic0001"
    subnet_id                     = "${azurerm_subnet.publicsubnet.id}"
    private_ip_address_allocation = "Static"
    private_ip_address            = "10.11.0.4"
    public_ip_address_id          = "${azurerm_public_ip.centospips.id}"
  }

  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create storage account for boot diagnostics
resource "azurerm_storage_account" "mystorageaccount" {
  name                     = "${azurerm_resource_group.myterraformgroup.name}sa0001"
  resource_group_name      = "${azurerm_resource_group.myterraformgroup.name}"
  location                 = "${var.location}"
  account_kind             = "StorageV2"
  account_tier             = "Standard"
  account_replication_type = "LRS"

  tags = {
    environment = "Azure Terraform Automation"
  }
}

# Create virtual machine
resource "azurerm_virtual_machine" "centos0001" {
  name                  = "centos0001"
  location              = "${var.location}"
  resource_group_name   = "${azurerm_resource_group.myterraformgroup.name}"
  network_interface_ids = ["${azurerm_network_interface.centosnics.id}"]
  vm_size               = "Standard_D2s_v3"

  storage_os_disk {
    name              = "centos0001OsDisk"
    caching           = "ReadWrite"
    create_option     = "FromImage"
    managed_disk_type = "Standard_LRS"
  }

  storage_image_reference {
    id = "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/xxxxxx/providers/Microsoft.Compute/images/centos77image0001"
  }

  os_profile {
    computer_name  = "centos0001"
    admin_username = "${var.admin_username}"
  }

  os_profile_linux_config {
    disable_password_authentication = true
    ssh_keys {
      path     = "/home/${var.admin_username}/.ssh/authorized_keys"
      key_data = "${var.ssh_public_key_data}"
    }
  }

  boot_diagnostics {
    enabled     = "true"
    storage_uri = "${azurerm_storage_account.mystorageaccount.primary_blob_endpoint}"
  }

  tags = {
    environment = "Azure Terraform Automation"
  }
}

The whole deployment (terraform init, terraform plan, terraform apply) takes roughly ten minutes to complete.

3.2 Deploy a Single-Node MongoDB Server 4.2.0

3.2.1 Configure the MongoDB 4.x Yum Repository on CentOS 7.7

$ cd /etc/yum.repos.d/
$ vi mongodb.repo
[mongodb-org-4.2]
name=MongoDB Repository
baseurl=https://repo.mongodb.org/yum/redhat/$releasever/mongodb-org/4.2/x86_64/
gpgcheck=1
enabled=1
gpgkey=https://www.mongodb.org/static/pgp/server-4.2.asc

Save and exit; the Yum repository is now configured.

3.2.2 Install the Single-Node MongoDB Server 4.2.0

Install MongoDB Server 4.2.0:

$ yum install -y mongodb-org-4.2.0 mongodb-org-server-4.2.0 mongodb-org-shell-4.2.0 mongodb-org-mongos-4.2.0 mongodb-org-tools-4.2.0

Verify the installed packages and versions:

$ rpm -qa |grep mongodb
mongodb-org-tools-4.2.0-1.el7.x86_64
mongodb-org-mongos-4.2.0-1.el7.x86_64
mongodb-org-shell-4.2.0-1.el7.x86_64
mongodb-org-4.2.0-1.el7.x86_64
mongodb-org-server-4.2.0-1.el7.x86_64

Edit the MongoDB Server configuration file:

$ vi /etc/mongod.conf
# mongod.conf

# for documentation of all options, see:
# http://docs.mongodb.org/manual/reference/configuration-options/

# where to write logging data.
systemLog:
  destination: file
  logAppend: true
  path: /var/log/mongodb/mongod.log

# Where and how to store data.
storage:
  dbPath: /var/lib/mongo
  journal:
    enabled: true
#  engine:
#  wiredTiger:

# how the process runs
processManagement:
  fork: true  # fork and run in background
  pidFilePath: /var/run/mongodb/mongod.pid  # location of pidfile
  timeZoneInfo: /usr/share/zoneinfo

# network interfaces
net:
  port: 27017
  bindIp: 0.0.0.0  # Enter 0.0.0.0,:: to bind to all IPv4 and IPv6 addresses or, alternatively, use the net.bindIpAll setting.

#security:
#operationProfiling:
replication:
  oplogSizeMB: "20480"
  replSetName: repconfig

Start the MongoDB Server service:

systemctl start mongod && systemctl enable mongod

Initialize the replica set:

$ mongo --port 27017
> repconfig = { _id : "repconfig", members : [ {_id : 0, host : "10.11.0.4:27017" , priority: 1 } ] }
{
    "_id" : "repconfig",
    "members" : [
        {
            "_id" : 0,
            "host" : "10.11.0.4:27017",
            "priority" : 1
        }
    ]
}
> rs.initiate(repconfig);
{
    "ok" : 1,
    "$clusterTime" : {
        "clusterTime" : Timestamp(1579339466, 1),
        "signature" : {
            "hash" : BinData(0,"AAAAAAAAAAAAAAAAAAAAAAAAAAA="),
            "keyId" : NumberLong(0)
        }
    },
    "operationTime" : Timestamp(1579339466, 1)
}
repconfig:SECONDARY>
repconfig:PRIMARY>

After initialization completes, you can run rs.status() in the Mongo shell to inspect the replica set.

3.3 Monitor MongoDB Server 4.2.0 with Python and the Azure Monitor REST API

3.3.1 Create an Azure Log Analytics Workspace

The creation steps are straightforward and not repeated here; simply create the workspace step by step through the Portal or the CLI, as shown:

Once it is created, collect a few pieces of information, as shown:

The "WORKSPACE ID" maps to customer_id in the script, and the "PRIMARY KEY" maps to shared_key.

3.3.2 Send Data to the Log Analytics Workspace with Python via the Azure Monitor REST API

I won't go into the Azure Monitor REST API itself; see the official documentation for details. In this experiment we first collect MongoDB metrics with pymongo and then ship them to the Log Analytics workspace. Install pymongo first:

$ yum install -y python-pip
$ pip install pymongo requests
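
Optionally, before running the full script, you can verify connectivity to MongoDB with a minimal pymongo check. This is just a sketch, assuming the server from section 3.2 listening on 10.11.0.4:27017; adjust the host and port to your environment:

from pymongo import MongoClient

# Hypothetical connectivity smoke test, not part of the monitoring script.
# serverSelectionTimeoutMS caps how long we wait for an unreachable server.
client = MongoClient('mongodb://10.11.0.4:27017', serverSelectionTimeoutMS=5000)
status = client.admin.command('serverStatus')  # raises if the server is unreachable
print('MongoDB %s, uptime %ss' % (status['version'], status['uptime']))
client.close()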

The full Python script is as follows:

#!/usr/bin/env python
"""
Date: 01/18/2020
Author: Xinsheng Wang
Description: A custom script to get MongoDB metrics and send data to Azure Monitor
Requires: MongoClient in python
"""

from calendar import timegm
from time import gmtime

from pymongo import MongoClient, errors
from sys import exit

import json
import requests
import datetime
import hashlib
import hmac
import base64
import os
from glob import glob

# Update the customer ID to your Log Analytics workspace ID
customer_id = '86df0cbc-076c-4483-8a32-c59c6550a771'

# For the shared key, use either the primary or the secondary Connected Sources client authentication key
shared_key = "b3uLsEOXBFBqTiAHDGp9boTeKR6v86f/9cLPWWsWUvs+LcjBIqjDp9CDJL+7vxlKDDRxqXIf1jjjKcZbdV0H/Q=="

# The log type is the name of the event that is being submitted
log_type = 'MongoDBMonitorLog'


class MongoDB(object):
    """main script class"""
    # pylint: disable=too-many-instance-attributes

    def delete_temporary_files(self):
        """delete temporary files left over from a previous run"""
        for file in glob('/tmp/mongometrics000*'):
            os.remove(file)

    def __init__(self):
        self.mongo_host = "10.11.0.4"
        self.mongo_port = 27017
        self.mongo_db = ["admin", ]
        self.mongo_user = None
        self.mongo_password = None
        self.__conn = None
        self.__dbnames = None
        self.__metrics = []

    def connect(self):
        """Connect to MongoDB"""
        if self.__conn is None:
            if self.mongo_user is None:
                try:
                    self.__conn = MongoClient('mongodb://%s:%s' %
                                              (self.mongo_host,
                                               self.mongo_port))
                except errors.PyMongoError as py_mongo_error:
                    print('Error in MongoDB connection: %s' %
                          str(py_mongo_error))
            else:
                try:
                    self.__conn = MongoClient('mongodb://%s:%s@%s:%s' %
                                              (self.mongo_user,
                                               self.mongo_password,
                                               self.mongo_host,
                                               self.mongo_port))
                except errors.PyMongoError as py_mongo_error:
                    print('Error in MongoDB connection: %s' %
                          str(py_mongo_error))

    def add_metrics(self, k, v):
        """add each metric to the metrics list and rebuild the request body"""
        global body
        dict_metrics = {}
        dict_metrics["key"] = k
        dict_metrics["value"] = v
        self.__metrics.append(dict_metrics)
        # Serialize this metric as a JSON object followed by a comma, and
        # append it to the accumulation file.
        dic = json.dumps(dict_metrics, sort_keys=True, indent=4,
                         separators=(',', ':')).replace('}', '},')

        with open('/tmp/mongometrics0001.txt', 'a') as f:
            f.write(dic)

        # Turn the accumulated objects into a valid JSON array: prepend '['
        # on the first line and replace the trailing '},' with '}]'.
        os.system("cat /tmp/mongometrics0001.txt |sed '$s/\}\,/\}\]/g;1s/{/[{/' > /tmp/mongometrics0002.txt")

        with open('/tmp/mongometrics0002.txt', 'r') as src:
            body = src.read()
        print(body)

    def get_db_names(self):
        """get a list of DB names"""
        if self.__conn is None:
            self.connect()
        db_handler = self.__conn[self.mongo_db[0]]

        master = db_handler.command('isMaster')['ismaster']
        dict_metrics = {}
        dict_metrics['key'] = 'mongodb.ismaster'
        if master:
            dict_metrics['value'] = 1
            db_names = self.__conn.database_names()  # pymongo 3.x API
            self.__dbnames = db_names
        else:
            dict_metrics['value'] = 0
        self.__metrics.append(dict_metrics)

    def get_mongo_db_lld(self):
        """print DB list in json format, to be used for mongo db discovery"""
        if self.__dbnames is None:
            # get_db_names() populates self.__dbnames as a side effect
            self.get_db_names()
        db_names = self.__dbnames
        dict_metrics = {}
        db_list = []
        dict_metrics['key'] = 'mongodb.discovery'
        dict_metrics['value'] = {"data": db_list}
        if db_names is not None:
            for db_name in db_names:
                dict_lld_metric = {}
                dict_lld_metric['{#MONGODBNAME}'] = db_name
                db_list.append(dict_lld_metric)
            dict_metrics['value'] = '{"data": ' + json.dumps(db_list) + '}'
        self.__metrics.insert(0, dict_metrics)

    def get_oplog(self):
        """get replica set oplog information"""
        if self.__conn is None:
            self.connect()
        db_handler = self.__conn['local']

        coll = db_handler.oplog.rs

        op_first = (coll.find().sort('$natural', 1).limit(1))
        op_last = (coll.find().sort('$natural', -1).limit(1))

        # if host is not a member of replica set, without this check we will
        # raise StopIteration as guided in
        # http://api.mongodb.com/python/current/api/pymongo/cursor.html
        # (cursor.count() is a pymongo 3.x API, removed in pymongo 4.x)
        if op_first.count() > 0 and op_last.count() > 0:
            op_fst = (op_first.next())['ts'].time
            op_last_st = op_last[0]['ts']
            op_lst = (op_last.next())['ts'].time

            status = round(float(op_lst - op_fst), 1)
            self.add_metrics('mongodb.oplog', status)

            current_time = timegm(gmtime())
            oplog = int(((str(op_last_st).split('('))[1].split(','))[0])
            self.add_metrics('mongodb.oplog-sync', (current_time - oplog))

    def get_maintenance(self):
        """get replica set maintenance info"""
        if self.__conn is None:
            self.connect()
        db_handler = self.__conn

        fsync_locked = int(db_handler.is_locked)
        self.add_metrics('mongodb.fsync-locked', fsync_locked)

        try:
            config = db_handler.admin.command("replSetGetConfig", 1)
            connstring = (self.mongo_host + ':' + str(self.mongo_port))
            connstrings = list()

            for i in range(0, len(config['config']['members'])):
                host = config['config']['members'][i]['host']
                connstrings.append(host)

                if connstring in host:
                    priority = config['config']['members'][i]['priority']
                    hidden = int(config['config']['members'][i]['hidden'])

                    self.add_metrics('mongodb.priority', priority)
                    self.add_metrics('mongodb.hidden', hidden)
        except errors.PyMongoError:
            print('Error while fetching replica set configuration. '
                  'Not a member of replica set?')
        except UnboundLocalError:
            print('Cannot use this mongo host: must be one of ' +
                  ','.join(connstrings))
            exit(1)

    def get_server_status_metrics(self):
        """get server status"""
        if self.__conn is None:
            self.connect()
        db_handler = self.__conn[self.mongo_db[0]]
        ss = db_handler.command('serverStatus')

        # db info
        self.add_metrics('mongodb.version', ss['version'])
        self.add_metrics('mongodb.storageEngine', ss['storageEngine']['name'])
        self.add_metrics('mongodb.uptime', int(ss['uptime']))
        self.add_metrics('mongodb.okstatus', int(ss['ok']))

        # asserts
        for k, v in ss['asserts'].items():
            self.add_metrics('mongodb.asserts.' + k, v)

        # operations
        for k, v in ss['opcounters'].items():
            self.add_metrics('mongodb.operation.' + k, v)

        # connections
        for k, v in ss['connections'].items():
            self.add_metrics('mongodb.connection.' + k, v)

        # extra info
        self.add_metrics('mongodb.page.faults',
                         ss['extra_info']['page_faults'])

        # wired tiger
        if ss['storageEngine']['name'] == 'wiredTiger':
            self.add_metrics('mongodb.used-cache',
                             ss['wiredTiger']['cache']
                             ["bytes currently in the cache"])
            self.add_metrics('mongodb.total-cache',
                             ss['wiredTiger']['cache']
                             ["maximum bytes configured"])
            self.add_metrics('mongodb.dirty-cache',
                             ss['wiredTiger']['cache']
                             ["tracked dirty bytes in the cache"])

        # global lock
        lock_total_time = ss['globalLock']['totalTime']
        self.add_metrics('mongodb.globalLock.totalTime', lock_total_time)
        for k, v in ss['globalLock']['currentQueue'].items():
            self.add_metrics('mongodb.globalLock.currentQueue.' + k, v)
        for k, v in ss['globalLock']['activeClients'].items():
            self.add_metrics('mongodb.globalLock.activeClients.' + k, v)

    def get_db_stats_metrics(self):
        """get DB stats for each DB"""
        if self.__conn is None:
            self.connect()
        if self.__dbnames is None:
            self.get_db_names()
        if self.__dbnames is not None:
            for mongo_db in self.__dbnames:
                db_handler = self.__conn[mongo_db]
                dbs = db_handler.command('dbstats')
                for k, v in dbs.items():
                    if k in ['storageSize', 'ok', 'avgObjSize', 'indexes',
                             'objects', 'collections', 'fileSize',
                             'numExtents', 'dataSize', 'indexSize',
                             'nsSizeMB']:
                        self.add_metrics('mongodb.stats.' + k +
                                         '[' + mongo_db + ']', int(v))

    def close(self):
        """close connection to mongo"""
        if self.__conn is not None:
            self.__conn.close()


# Build the API signature: an HMAC-SHA256 over the canonical request string,
# keyed with the base64-decoded workspace shared key
def build_signature(customer_id, shared_key, date, content_length, method,
                    content_type, resource):
    x_headers = 'x-ms-date:' + date
    string_to_hash = (method + "\n" + str(content_length) + "\n" +
                      content_type + "\n" + x_headers + "\n" + resource)
    bytes_to_hash = string_to_hash.encode('utf-8')
    decoded_key = base64.b64decode(shared_key)
    encoded_hash = base64.b64encode(
        hmac.new(decoded_key, bytes_to_hash,
                 digestmod=hashlib.sha256).digest()).decode('utf-8')
    authorization = "SharedKey {}:{}".format(customer_id, encoded_hash)
    return authorization


# Build and send a request to the POST API
def mongodb_azuremonitor_loganalysis_post_data(customer_id, shared_key, body, log_type):
    method = 'POST'
    content_type = 'application/json'
    resource = '/api/logs'
    rfc1123date = datetime.datetime.utcnow().strftime('%a, %d %b %Y %H:%M:%S GMT')
    content_length = len(body)
    signature = build_signature(customer_id, shared_key, rfc1123date,
                                content_length, method, content_type, resource)
    uri = ('https://' + customer_id + '.ods.opinsights.azure.com' +
           resource + '?api-version=2016-04-01')

    headers = {
        'content-type': content_type,
        'Authorization': signature,
        'Log-Type': log_type,
        'x-ms-date': rfc1123date
    }

    response = requests.post(uri, data=body, headers=headers)
    if 200 <= response.status_code <= 299:
        print('Accepted')
    else:
        print("Response code: {}".format(response.status_code))


if __name__ == '__main__':
    mongodb = MongoDB()
    mongodb.delete_temporary_files()
    mongodb.get_db_names()
    mongodb.get_mongo_db_lld()
    mongodb.get_oplog()
    mongodb.get_maintenance()
    mongodb.get_server_status_metrics()
    mongodb.get_db_stats_metrics()
    mongodb.close()
    mongodb_azuremonitor_loganalysis_post_data(customer_id, shared_key,
                                               body, log_type)

Remember to replace customer_id and shared_key with the values from your own workspace (shown in the figure above); log_type is the name (namespace) under which the records are collected in the Log Analytics workspace. Save the script as AzureMonitor_LogAnalysis_MongodbMetrics.py and run it:

python AzureMonitor_LogAnalysis_MongodbMetrics.py

The print statements output some progress information you can use to check whether the data has reached the Azure Log Analytics workspace.

3.3.3 Query the MongoDB Custom Logs with Kusto

In the Azure portal, view and query the MongoDB data with KQL:

As the figure shows, the data has been delivered to the Azure Log Analytics workspace. From here you can write more sophisticated Kusto queries against it, and pair the script with a scheduler such as Linux crontab for periodic collection and alerting; a sample query is sketched below.
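
For example, a minimal KQL sketch might look like the following. It assumes the naming the HTTP Data Collector API normally applies to custom logs: the log_type MongoDBMonitorLog surfaces as a table named MongoDBMonitorLog_CL, and the key/value fields surface with type suffixes such as key_s (string) and value_d (number); verify the actual names in your workspace schema before using it.

// Average current connections over the last hour, in 5-minute buckets
MongoDBMonitorLog_CL
| where TimeGenerated > ago(1h)
| where key_s == "mongodb.connection.current"
| summarize avg(value_d) by bin(TimeGenerated, 5m)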


4. Summary

That completes this example of monitoring MongoDB through the Azure Monitor REST API; I hope it serves as a useful reference for anyone looking to build custom monitoring tooling on top of Azure Monitor. There is still room for improvement: the Python script does little logging and has no retry logic when a request fails. For production use, please think these paths through carefully before going live; a minimal retry wrapper is sketched below as a starting point, but the rest is left to you.
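
Here, post_once is a hypothetical zero-argument callable, not part of the script above; it is assumed to be a thin wrapper around mongodb_azuremonitor_loganalysis_post_data that returns True on a 2xx response and raises or returns False otherwise:

import time

def post_with_retry(post_once, retries=3, backoff_seconds=5):
    """Call post_once() up to `retries` times, sleeping between attempts."""
    for attempt in range(1, retries + 1):
        try:
            if post_once():
                return True
        except Exception as exc:  # e.g. requests.RequestException in practice
            print('Attempt %d failed: %s' % (attempt, exc))
        time.sleep(backoff_seconds * attempt)  # simple linear backoff
    return False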