Using saltstack events to implement your own functionality

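A saltstack master usually has many minions connected to it. The program below works out which minions executed a task successfully, which failed, and which did not return.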

About the script:

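1. The script first prints the job id, the command name, and other related information for the job. It then prints the execution progress and results of the job, which match what you see when running the salt command on its own. Finally, it prints every task that failed and every task that did not return, and runs them again. Two caveats: because I have not examined the relevant failure scenarios, the detection of failed tasks is not done well; and I also count a minion that is not connected as "task did not return" and rerun it, whereas a task should not be rerun when the minion is not connected.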
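The bookkeeping described above can be tried without a running master. This is a minimal sketch, assuming events arrive as dicts with `tag` and `data` keys shaped like the ones the script reads (`salt/job/<jid>/new` carrying `minions`, `salt/job/<jid>/ret/<minion>` carrying `success`). The function name `summarize_job` and the sample events are hypothetical and exist only for illustration.

```python
import re

NEW_TAG = re.compile(r'^salt/job/([0-9]+)/new$')


def summarize_job(events):
    """Return (jid, not_returned, failed) for the first job seen in events."""
    jid = None
    ret_tag = None
    pending = []   # minions that have not returned yet
    failed = []    # minions that returned with success == False
    for ev in events:
        tag, data = ev['tag'], ev['data']
        if jid is None:
            m = NEW_TAG.match(tag)
            if m:
                jid = m.group(1)
                pending = list(data['minions'])
                # Accept any minion id after /ret/, not only IP-like ones.
                ret_tag = re.compile('^salt/job/' + jid + '/ret/(.+)$')
            continue
        m = ret_tag.match(tag)
        if m:
            minion = m.group(1)
            if minion in pending:
                pending.remove(minion)
            if not data.get('success', False):
                failed.append(minion)
    return jid, pending, failed


# Hypothetical sample events, shaped like the ones the script consumes.
sample = [
    {'tag': 'salt/job/20151208025319005896/new',
     'data': {'minions': ['172.18.1.211', '172.18.1.212', '172.18.1.214']}},
    {'tag': 'salt/job/20151208025319005896/ret/172.18.1.211',
     'data': {'success': True}},
    {'tag': 'salt/job/20151208025319005896/ret/172.18.1.212',
     'data': {'success': False}},
]
jid, not_returned, failed = summarize_job(sample)
print(jid, not_returned, failed)
```

Here the minion that never sent a `ret` event stays in the pending list, and the one that returned with `success` set to false lands in the failed list; both lists are the candidates for a second run.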

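2. The program first forks a child process to run the salt command. Once the salt command has finished, the program runs the task a second time on the minions that failed or did not return.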
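The fork-then-retry flow can be sketched in isolation. This is a simplified variant under stated assumptions, not the script's exact mechanism: it waits for the child with `os.waitpid` instead of a signal handler, runs the retries one after another, and uses the Python interpreter as a stand-in for `/usr/bin/salt`. The helper names `run_in_child` and `run_then_retry` are hypothetical.

```python
import os
import sys


def run_in_child(argv):
    """Fork, exec argv in the child, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the command.
        try:
            os.execv(argv[0], argv)
        except OSError:
            os._exit(127)  # exec itself failed
    # Parent: block until the child exits.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1  # child was terminated by a signal


def run_then_retry(first_argv, retry_argvs):
    """Run the first command, then each retry command after it has finished."""
    codes = [run_in_child(first_argv)]
    for argv in retry_argvs:
        codes.append(run_in_child(argv))
    return codes


# Stand-in commands: the Python interpreter instead of /usr/bin/salt.
py = sys.executable
codes = run_then_retry([py, '-c', 'raise SystemExit(0)'],
                       [[py, '-c', 'raise SystemExit(3)']])
print(codes)
```

In a real run, each entry of `retry_argvs` would be a salt invocation targeting one minion, built from the failed and not-returned lists.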

3. Writing the script

import salt.utils.event
import re
import signal, time
import sys
import os


def single_handler(target):
    # Replace the current (child) process with a salt run for one minion.
    os.execl('/usr/bin/salt', 'salt', target, 'state.sls', 'os')


def handler(num1, num2):
    # Runs when the child that executed the first salt command exits.
    #signal.signal(signal.SIGCLD,signal.SIG_IGN)
    print('We are in signal handler')
    print('Job Not Ret: ' + str(record[jid]))
    print(' Job Failed: ' + str(failedrecord[jid]))
    print('all done...')
    # Rerun the task on every minion that reported a failure.
    for item in failedrecord[jid]:
        #print item
        try:
            pid = os.fork()
            if pid == 0:
                single_handler(item)
        except OSError:
            print('we exec. ' + item + ' error!')
    # Rerun the task on every minion that never returned.
    for item in record[jid]:
        #print item
        try:
            print('fork ok ' + item)
            pid = os.fork()
            if pid == 0:
                single_handler(item)
        except OSError:
            print('we exec. ' + item + ' error!')
    sys.stdout.flush()
    os._exit(0)


fd = open('/tmp/record', 'w+')
#sys.stdout = fd
#sys.stderr = fd

# SIGCHLD is the portable name for SIGCLD.
signal.signal(signal.SIGCHLD, handler)

#fd = open('/var/log/record', 'w+')
# Send everything written to stdout and stderr into /tmp/record.
os.dup2(fd.fileno(), sys.stdout.fileno())
os.dup2(fd.fileno(), sys.stderr.fileno())

#sys.stdout = fd
#sys.stderr = fd

try:
    pid = os.fork()
    if pid == 0:
        # Child: wait so the parent is already listening, then run salt.
        time.sleep(2)
        try:
            os.execl('/usr/bin/salt', 'salt', '*', 'state.sls', 'os')
        except OSError:
            print('exec error!')
            os._exit(1)
except OSError:
    print('first fork error!')
    os._exit(1)

event = salt.utils.event.MasterEvent('/var/run/salt/master')
flag = False
reg = re.compile('salt/job/([0-9]+)/new')
reg1 = reg
#a process to exec. command, but will sleep some time
#another process listen the event
#if we use this method, we can filter the event through func. name
record = {}
failedrecord = {}
jid = 0

#try:
for eachevent in event.iter_events(tag='salt/job', full=True):
    eachevent = dict(eachevent)
    result = reg.findall(eachevent['tag'])
    if not flag and result:
        # First "new job" event: remember the jid and the targeted minions.
        flag = True
        jid = result[0]
        print("   job_id: " + jid)
        print("  Command: " + dict(eachevent['data'])['fun'] + ' ' + str(dict(eachevent['data'])['arg']))
        print("    RunAs: " + dict(eachevent['data'])['user'])
        print("exec_time: " + dict(eachevent['data'])['_stamp'])
        print("host_list: " + str(dict(eachevent['data'])['minions']))
        sys.stdout.flush()
        record[jid] = eachevent['data']['minions']
        failedrecord[jid] = []
        # Note: this pattern only matches minion ids made of digits and dots.
        reg1 = re.compile('salt/job/' + jid + '/ret/([0-9.]+)')
    else:
        result = reg1.findall(eachevent['tag'])
        if result:
            # A minion returned: it is no longer pending.
            if result[0] in record[jid]:
                record[jid].remove(result[0])
            if not dict(eachevent['data'])['success']:
                failedrecord[jid].append(result[0])
#except:
#   print 'we in except'
"""
   print 'Job Not Ret: '+str(record[jid])
   print ' Job Failed: '+str(failedrecord[jid])
   for item in failedrecord[jid]:
       os.system('salt '+ str(item) + ' state.sls os')
   for item in record[jid]:
       os.system('salt '+ str(item) + ' state.sls os')
   os._exit(0)
"""

Execution result:

   job_id: 20151208025319005896
  Command: state.sls ['os']
    RunAs: root
exec_time: 2015-12-08T02:53:19.006284
host_list: ['172.18.1.212', '172.18.1.214', '172.18.1.213', '172.18.1.211']
172.18.1.213:
----------
          ID: configfilecopy
    Function: file.managed
        Name: /root/node3
      Result: True
     Comment: File /root/node3 is in the correct state
     Started: 02:53:19.314015
    Duration: 13.033 ms
     Changes:
----------
          ID: commonfile
    Function: file.managed
        Name: /root/commonfile
      Result: True
     Comment: File /root/commonfile is in the correct state
     Started: 02:53:19.327173
    Duration: 1.993 ms
     Changes:

Summary
------------
Succeeded: 2
Failed:    0
------------
Total states run:     2
172.18.1.212:
----------
          ID: configfilecopy
    Function: file.managed
        Name: /root/node2
      Result: True
     Comment: File /root/node2 is in the correct state
     Started: 02:53:19.337325
    Duration: 8.327 ms
     Changes:
----------
          ID: commonfile
    Function: file.managed
        Name: /root/commonfile
      Result: True
     Comment: File /root/commonfile is in the correct state
     Started: 02:53:19.345787
    Duration: 1.996 ms
     Changes:

Summary
------------
Succeeded: 2
Failed:    0
------------
Total states run:     2
172.18.1.211:
----------
          ID: configfilecopy
    Function: file.managed
        Name: /root/node1
      Result: True
     Comment: File /root/node1 is in the correct state
     Started: 02:53:19.345017
    Duration: 12.741 ms
     Changes:
----------
          ID: commonfile
    Function: file.managed
        Name: /root/commonfile
      Result: True
     Comment: File /root/commonfile is in the correct state
     Started: 02:53:19.357873
    Duration: 1.948 ms
     Changes:

Summary
------------
Succeeded: 2
Failed:    0
------------
Total states run:     2
172.18.1.214:
    Minion did not return. [Not connected]
We are in signal handler
Job Not Ret: ['172.18.1.214']
 Job Failed: []
all done...
fork ok 172.18.1.214
172.18.1.214:
    Minion did not return. [Not connected]
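As noted in the script description, a minion that is not connected should not really be rerun. One possible way to honor that, sketched here under assumptions, is to scan the recorded output for the `Minion did not return. [Not connected]` marker and drop those minions from the retry list. The helper names `not_connected_minions` and `retry_targets` are hypothetical, the parsing relies only on the output layout shown in this post, and the minion `172.18.1.215` in the sample call is made up.

```python
MARKER = 'Minion did not return. [Not connected]'


def not_connected_minions(output):
    """Return minion ids whose output block contains the not-connected marker."""
    found = []
    current = None
    for line in output.splitlines():
        stripped = line.rstrip()
        if stripped and not stripped[0].isspace() and stripped.endswith(':'):
            # An unindented "<minion id>:" line starts a new minion block.
            current = stripped[:-1]
        elif current and MARKER in stripped:
            if current not in found:
                found.append(current)
    return found


def retry_targets(not_returned, failed, output):
    """Failed and not-returned minions, minus the ones that are not connected."""
    skip = set(not_connected_minions(output))
    targets = []
    for minion in failed + not_returned:
        if minion not in skip and minion not in targets:
            targets.append(minion)
    return targets


sample_output = (
    "172.18.1.213:\n"
    "----------\n"
    "          ID: commonfile\n"
    "      Result: True\n"
    "172.18.1.214:\n"
    "    Minion did not return. [Not connected]\n"
)
targets = retry_targets(['172.18.1.214', '172.18.1.215'],
                        ['172.18.1.212'], sample_output)
print(targets)
```

With this filter, the second attempt on 172.18.1.214 seen at the end of the output above would be skipped.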

Original source: cnblogs -> https://www.cnblogs.com/nulige/p/9219086.html
