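Quickly capturing database connection information with Zabbix, plus a few extensions
Background
As the number of application systems kept growing, the active threads item, which had never alerted before, began alerting frequently (about twice a day). The alerts are not numerous and the threshold set for the item is not high (no more than 50), but for operations staff a database's threads-running value is something that must be taken seriously.
The alerts usually fire in the middle of the night, so nobody can manually record which connections are actually doing work once threads-running crosses the line. I have not found a good off-the-shelf tool that records this data automatically, which is where Zabbix's action feature comes in.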
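Zabbix configuration
1. Define the item
Taking a shortcut here, I use the Threads Running item from the Percona template directly:
2. Define the trigger
Again, use the trigger settings from the Percona template:
3. Create the action
Create the action in the order shown in the figure below:
4. Action conditions
The action fires only when conditions A, B, C and D are all met. Make the filter as specific as possible so that Zabbix does not invoke the script by mistake.
5. Fill in the operation details
SSH is used here. You can also choose the custom script option in the Type field, but then the Zabbix agent has to be granted additional sudo privileges.
The Commands field contains /bin/sh /opt/connect.sh, which simply calls the connect.sh script. The script itself is given later in this article.
6. Modify the zabbix-agent configuration
Log in to the monitored server: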
vim /etc/zabbix/zabbix_agentd.conf
EnableRemoteCommands=1    # add this parameter; it allows remote commands from the Zabbix server
service zabbix-agent restart
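If the same change has to be made on several monitored servers, editing the file by hand gets tedious. The following is a minimal sketch of an idempotent helper, assuming GNU sed; the function name enable_remote_commands is hypothetical and not part of Zabbix.

```shell
#!/bin/sh
# Hypothetical helper: make sure exactly one active EnableRemoteCommands=1
# line exists in the given agent configuration file.
enable_remote_commands() {
    conf=$1
    if grep -q '^[[:space:]]*EnableRemoteCommands=' "$conf"; then
        # An active (uncommented) setting exists: rewrite it in place.
        sed -i 's/^[[:space:]]*EnableRemoteCommands=.*/EnableRemoteCommands=1/' "$conf"
    else
        # Only commented-out or missing: append a fresh line.
        printf 'EnableRemoteCommands=1\n' >> "$conf"
    fi
}

# Example call (followed by a restart of the agent):
#   enable_remote_commands /etc/zabbix/zabbix_agentd.conf
#   service zabbix-agent restart
```

Running it twice leaves the file unchanged after the first run, so it is safe to include in a provisioning script.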
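That completes the Zabbix-side configuration. All that remains is to place the handling script in the /opt directory.
Feature script
The goal this time is, whenever connections exceed 50, to record which account, from which IP, is executing which SQL. The script is as follows: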
#!/bin/sh
export PATH=$PATH:/usr/bin

# da: date stamp for the log file name; dc: timestamp for the divider line
da=`date +%Y%m%d`
dc=`date +%Y-%m-%d" "%H:%M:%S`

echo $dc"-------------------------------divider------------------------------------" >> /tmp/ok_$da.log
# Record every connection that is not idle, longest-running first
/usr/local/mysql/bin/mysql -uroot -pXXX -e "select * from information_schema.PROCESSLIST where COMMAND != 'Sleep' order by TIME DESC;" >> /tmp/ok_$da.log
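The script above passes the password with -p on the command line, where other local users may see it in the process list. One possible alternative is a client option file readable only by its owner, passed to mysql through --defaults-extra-file. The sketch below assumes GNU coreutils; the function name write_mysql_cnf and the file path in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write a [client] option file that only the owner can read.
write_mysql_cnf() {
    cnf=$1
    user=$2
    password=$3
    (
        # Restrict permissions before the file is created.
        umask 077
        printf '[client]\nuser=%s\npassword=%s\n' "$user" "$password" > "$cnf"
    )
    # Also tighten permissions if the file already existed.
    chmod 600 "$cnf"
}

# Example (the path is an assumption); --defaults-extra-file must be the first option:
#   write_mysql_cnf /home/zabbix/.connect.cnf root 'XXX'
#   mysql --defaults-extra-file=/home/zabbix/.connect.cnf -e "select 1;"
```

Passwords containing characters such as # may need quoting inside the option file, so treat this as a starting point rather than a complete solution.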
Extensions
Since Zabbix can call a script when an alert fires, can we have it handle somewhat more complex work?
Database connection, lock, and storage engine information
#!/bin/sh
export PATH=$PATH:/usr/bin

da=`date +%Y%m%d`
dc=`date +%Y-%m-%d" "%H:%M:%S`

# 1. InnoDB storage engine status
echo $dc"-------------------------------divider------------------------------------" >> /home/zabbix/engine_log/engine_log_$da.log
/usr/bin/mysql -hlocalhost -uroot -pXXX -e "show engine innodb status \G" >> /home/zabbix/engine_log/engine_log_$da.log
printf '\n\n\n' >> /home/zabbix/engine_log/engine_log_$da.log

# 2. Connections that are not idle, longest-running first
echo $dc"-------------------------------divider------------------------------------" >> /home/zabbix/processlist/processlist_$da.log
/usr/bin/mysql -hlocalhost -uroot -pXXX -e "select * from information_schema.processlist where time>=0 and command !='sleep' order by time desc \G" >> /home/zabbix/processlist/processlist_$da.log
printf '\n\n\n' >> /home/zabbix/processlist/processlist_$da.log

# 3. Blocking and blocked transactions
# (information_schema.innodb_locks and innodb_lock_waits exist up to MySQL 5.7; they were removed in 8.0)
echo $dc"-------------------------------divider------------------------------------" >> /home/zabbix/lock/lock_$da.log
/usr/bin/mysql -hlocalhost -uroot -pXXX -e "
select 'Blocker' role, p.id, p.user, left(p.host, locate(':', p.host) - 1) host,
       tx.trx_id, tx.trx_state, tx.trx_started,
       timestampdiff(second, tx.trx_started, now()) duration,
       lo.lock_mode, lo.lock_type, lo.lock_table, lo.lock_index,
       tx.trx_query, tx.trx_tables_in_use, tx.trx_tables_locked, tx.trx_rows_locked
  from information_schema.innodb_trx tx, information_schema.innodb_lock_waits lw,
       information_schema.innodb_locks lo, information_schema.processlist p
 where lw.blocking_trx_id = tx.trx_id
   and p.id = tx.trx_mysql_thread_id
   and lo.lock_trx_id = tx.trx_id
union all
select 'Blockee' role, p.id, p.user, left(p.host, locate(':', p.host) - 1) host,
       tx.trx_id, tx.trx_state, tx.trx_started,
       timestampdiff(second, tx.trx_started, now()) duration,
       lo.lock_mode, lo.lock_type, lo.lock_table, lo.lock_index,
       tx.trx_query, tx.trx_tables_in_use, tx.trx_tables_locked, tx.trx_rows_locked
  from information_schema.innodb_trx tx, information_schema.innodb_lock_waits lw,
       information_schema.innodb_locks lo, information_schema.processlist p
 where lw.requesting_trx_id = tx.trx_id
   and p.id = tx.trx_mysql_thread_id
   and lo.lock_trx_id = tx.trx_id \G" >> /home/zabbix/lock/lock_$da.log
printf '\n\n\n' >> /home/zabbix/lock/lock_$da.log

# 4. Turn on the general log if it is currently off
val=`/usr/bin/mysql -hlocalhost -uroot -pXXX -N -e "show variables like 'general_log'" | awk '{print $2}'`
if [ "$val" = 'OFF' ]; then
    /usr/bin/mysql -hlocalhost -uroot -pXXX -e "set global general_log=1;"
else
    exit 0
fi
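When the database is under heavy load, this script records the various lock details, connection details, and storage engine status. As for what counts as heavy load, the following items can be used as triggers:
- threads-running: too many running connections
- Innodb Row Lock Waits: lock waits lasting too long
- Com Select\Update\Insert\Delete: too many select, update, insert, or delete statements
- Incoming\Outgoing network traffic: abnormal inbound or outbound traffic
Any of these can serve as a trigger condition.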
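The three logging steps in the script above all follow the same pattern: write a timestamped divider, append the command output, then add blank lines. As a sketch only, that pattern could be factored into one function. The name log_snapshot is hypothetical, and the directories in the usage comment are the ones used in the script above.

```shell
#!/bin/sh
# Hypothetical helper: append a timestamped divider, the output of a command,
# and three blank lines to a per-day log file.
# Usage: log_snapshot <log_dir> <prefix> <command> [args...]
log_snapshot() {
    log_dir=$1
    prefix=$2
    shift 2
    log_file="$log_dir/${prefix}_$(date +%Y%m%d).log"
    {
        printf '%s-------------------------------divider------------------------------------\n' \
            "$(date '+%Y-%m-%d %H:%M:%S')"
        # Run whatever command was passed in; its stderr goes to the log as well.
        "$@"
        printf '\n\n\n'
    } >> "$log_file" 2>&1
}

# Example calls mirroring the script above:
#   log_snapshot /home/zabbix/engine_log engine_log \
#       /usr/bin/mysql -hlocalhost -uroot -pXXX -e "show engine innodb status \G"
#   log_snapshot /home/zabbix/processlist processlist \
#       /usr/bin/mysql -hlocalhost -uroot -pXXX -e "select * from information_schema.processlist where command != 'Sleep' \G"
```

Because the command is passed as arguments, the same function can be reused for each trigger listed above without copying the redirection lines.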
Killing long-running SQL
#!/bin/sh
export PATH=$PATH:/usr/bin

da=`date +%Y%m%d`
dc=`date +%Y-%m-%d" "%H:%M:%S`
user="root"
password="XXX"

# Count queries from the job/report accounts that have been running for 180 seconds or more
val=`mysql -u$user -p$password -N -e "select count(*) from information_schema.processlist where time>=180 and command ='Query' and user in ('job_name','report')" | awk '{print $1}'`

if [ "$val" -gt 0 ]; then
    # Log the offending queries before killing them
    echo $dc"-------------------------------divider------------------------------------" >> /home/zabbix/kill_log/long_query_$da.log
    mysql -u$user -p$password -e "select * from information_schema.processlist where time>=180 and command ='Query' and user in ('job_name','report') order by time desc \G" >> /home/zabbix/kill_log/long_query_$da.log
    printf '\n\n\n' >> /home/zabbix/kill_log/long_query_$da.log

    # Collect the connection ids and kill them one by one
    ids=`mysql -u$user -p$password -N -e "select id from information_schema.processlist where time>=180 and command ='Query' and user in ('job_name','report')"`
    for id in $ids; do
        mysql -u$user -p$password -e "kill $id"
    done
else
    exit 0
fi
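Sometimes a job or reporting query runs for a long time and ends up affecting other workloads. A simple rule of thumb applies: when database connections surge, the cause is usually a large SQL statement stuck in place. In that case Zabbix can call this script to kill SQL that has been running for more than 180 seconds under specific accounts. The kill conditions can be customized in the script.
Of course, SQL that is known in advance to take a long time should be run on a replica instead.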
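The kill loop in the script above opens one mysql connection per id. A possible variation, shown here only as a sketch, turns the id list into KILL statements and sends them in a single session, skipping anything that is not a plain number. The function name build_kill_sql is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: read connection ids from stdin, one per line,
# and print one KILL statement per numeric id.
build_kill_sql() {
    while read -r id; do
        case $id in
            # Skip empty lines and anything containing a non-digit.
            ''|*[!0-9]*) continue ;;
            *) printf 'KILL %s;\n' "$id" ;;
        esac
    done
}

# Example pipeline using the same query as the script above:
#   mysql -u$user -p$password -N -e "select id from information_schema.processlist
#       where time>=180 and command ='Query' and user in ('job_name','report')" \
#     | build_kill_sql \
#     | mysql -u$user -p$password
```

The numeric check is a small safeguard: if the first query fails and prints an error string, nothing is passed on to the second mysql call.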
Deleting unneeded logs
#!/bin/sh
logdir='/mysql/logs'
binlog='/mysql/binlog'

# Used percentage of the partition being watched; do nothing below 90%
var_percent=`df -h | grep var | grep dev/sda | awk '{print $5}' | awk -F% '{print $1}'`
if [ -z "$var_percent" ] || [ "$var_percent" -lt 90 ]; then
    echo "never mind"
    exit 0
fi

date >> $binlog/del_list.txt

# One entry per binlog file, formatted as "date,time,filename"
list=`ls -l --time-style='+%Y-%m-%d %H:%M:%S' $binlog/mysql-bin.0????? | awk '{print $6","$7","$8}'`
for i in $list
do
    filetime=`echo $i | awk -F "," '{print $1,$2}'`
    filetimestamp=`date -d "$filetime" +%s`
    cur_time=`date +%s`
    # Delete binlogs last modified more than 3 days ago
    if [ $(($cur_time - $filetimestamp)) -gt $((3*24*3600)) ]; then
        filename=`echo $i | awk -F, '{print $3}'`
        echo "$filename will delete" >> $binlog/del_list.txt
        /bin/rm $filename
    fi
done

# Delete the slow log once it grows past 2 GB
if [ -f $logdir/mysql-slow.log ]; then
    slow_log_size=`stat -c %s $logdir/mysql-slow.log`
    if [ $slow_log_size -gt $((2*1024*1024*1024)) ]; then
        echo "$logdir/mysql-slow.log" >> $logdir/del_list.txt
        /bin/rm $logdir/mysql-slow.log
    fi
fi
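Sometimes the binlog and the slow log are not configured for automatic cleanup, and over time they fill up the disk. In that case we can tie the script to the Free disk space on /mysql item and, once the threshold is exceeded, call the script above to clean up unneeded binlogs and slow logs.
Use it with caution in situations that depend on the binlog, for example when replication has broken and still needs to be restored.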
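Given that caution, it can help to preview which files would be removed before deleting anything. The sketch below assumes GNU find and POSIX df -P; the function names disk_used_percent and list_old_binlogs are hypothetical. It reads the usage of the file system holding a given path and lists, without deleting, the binlogs older than a given number of days.

```shell
#!/bin/sh
# Hypothetical helper: print the used percentage (digits only) of the
# file system that contains the given path.
disk_used_percent() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Hypothetical helper: list binlog files in a directory that were last
# modified more than <days> days ago. Nothing is deleted here.
list_old_binlogs() {
    dir=$1
    days=$2
    find "$dir" -maxdepth 1 -type f -name 'mysql-bin.0?????' -mmin "+$((days * 1440))"
}

# Example dry run with the paths and limits used in the script above:
#   if [ "$(disk_used_percent /mysql)" -ge 90 ]; then
#       list_old_binlogs /mysql/binlog 3
#   fi
```

Checking the path directly with df -P avoids depending on the device name, and the file list can be reviewed or logged before any rm is run.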
Reference link:
Capturing database connection information with Zabbix, plus a few extensions: https://www.jb51.net/article/207412.htm