Task #29642
Scenario #29652: MEN express processing (10-12)
z_stats and dhcp coexistence issues
100%
Related issues
Associated revisions
zstats.cfg : add "activer_dhcp"
Ref: #29642
agentmanager/config.py : add ACTIVER_DHCP
Ref: #29642
Do not load the agent if dhcp is not activated
Ref: #29642
History
#1 Updated by Thierry Bertrand over 3 years ago
On seth 2.7.1 servers, z_stats errors occur whether or not the dhcpd feature is enabled.
Beyond the errors themselves, this causes communication outages with zephir.
Server 1:
root@set-30-09:~# systemctl status z_stats.service
● z_stats.service - Agent zephir
   Loaded: loaded (/lib/systemd/system/z_stats.service; enabled; vendor preset: enabled)
   Active: active (running) since Wed 2020-02-19 11:26:06 CET; 1 day 6h ago
 Main PID: 11397 (twistd)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/z_stats.service
           └─11397 /usr/bin/python2 /usr/bin/twistd --nodaemon --no_save --pidfile /run/z_stats.pid zephiragents --tmp=data --data=stats --archive=/tmp --static=static --actions=actions

févr. 20 17:23:10 set-30-09 zephiragents[11397]:
févr. 20 17:23:10 set-30-09 zephiragents[11397]: [-] Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
févr. 20 17:23:10 set-30-09 zephiragents[11397]: [-] Unhandled error in Deferred:
févr. 20 17:23:10 set-30-09 zephiragents[11397]: [-] Unhandled Error
févr. 20 17:23:10 set-30-09 zephiragents[11397]: [-] Traceback (most recent call last):
févr. 20 17:23:10 set-30-09 zephiragents[11397]: [-] Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
févr. 20 17:26:12 set-30-09 zephiragents[11397]: 2020-02-20T17:26:12+0100 [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 20 17:26:12 set-30-09 zephiragents[11397]: 2020-02-20T17:26:12+0100 [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>\n ranges = get_ranges()\n File "/usr/shar
févr. 20 17:26:12 set-30-09 zephiragents[11397]: [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 20 17:26:12 set-30-09 zephiragents[11397]: [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>\n ranges = get_ranges()\n File "/usr/share/eole/sbin/dhcp-tool", l

root@set-30-09:~# CreoleGet eole_release
2.7.1
root@set-30-09:~# CreoleGet activer_dhcp
oui
root@set-30-09:~# dhcp-tool
ddtm30 30 51
Server 2:
root@seth-49-04:/usr/lib/nagios/plugins# systemctl status z_stats.service
● z_stats.service - Agent zephir
   Loaded: loaded (/lib/systemd/system/z_stats.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-02-20 05:33:54 CET; 11h ago
 Main PID: 1765 (twistd)
    Tasks: 1 (limit: 4915)
   CGroup: /system.slice/z_stats.service
           └─1765 /usr/bin/python2 /usr/bin/twistd --nodaemon --no_save --pidfile /run/z_stats.pid zephiragents --tmp=data --data=stats --archive=/tmp --static=static --actions=actions

févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Traceback (most recent call last):
févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Unhandled error in Deferred:
févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Unhandled Error
févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Traceback (most recent call last):
févr. 20 17:23:07 seth-49-04 zephiragents[1765]: [-] Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
févr. 20 17:23:59 seth-49-04 zephiragents[1765]: 2020-02-20T17:23:59+0100 [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 20 17:23:59 seth-49-04 zephiragents[1765]: 2020-02-20T17:23:59+0100 [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>\n ranges = g
févr. 20 17:23:59 seth-49-04 zephiragents[1765]: [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 20 17:23:59 seth-49-04 zephiragents[1765]: [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>\n ranges = get_ranges()\n File "/usr

root@seth-49-04:/usr/lib/nagios/plugins# CreoleGet eole_release
2.7.1
root@seth-49-04:/usr/lib/nagios/plugins# CreoleGet activer_dhcp
non
root@seth-49-04:/usr/lib/nagios/plugins# dhcp-tool
Traceback (most recent call last):
  File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>
    ranges = get_ranges()
  File "/usr/share/eole/sbin/dhcp-tool", line 26, in get_ranges
    for ip, netmask in zip(list(cfg.creole.dhcp.adresse_network_dhcp.adresse_network_dhcp),
  File "/usr/lib/python2.7/dist-packages/tiramisu/config.py", line 248, in __getattr__
    return self.getattr(name)
  File "/usr/lib/python2.7/dist-packages/tiramisu/config.py", line 316, in getattr
    raise props
tiramisu.error.PropertiesOptionError: ne peut accéder à l'optiondescription "dhcp" a cause de la propriété disabled (la valeur de "Activer le serveur DHCP" est "non")
#2 Updated by Thierry Bertrand over 3 years ago
- Tracker changed from Scenario to Scenario Proposal
#3 Updated by Philippe Carre over 3 years ago
According to my tests, enabling DHCP does make a difference:
On an eSSL 2.7.1:
Demande de synchronisation auprès du service z_stats : Traceback (most recent call last):
  File "/usr/bin/synchro_zephir", line 66, in <module>
    sys.stdout.write(z_stats_proxy.archive_for_upload())
  File "/usr/lib/python2.7/xmlrpclib.py", line 1243, in __call__
    return self.__send(self.__name, args)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1602, in __request
    verbose=self.__verbose
  File "/usr/lib/python2.7/xmlrpclib.py", line 1283, in request
    return self.single_request(host, handler, request_body, verbose)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1316, in single_request
    return self.parse_response(response)
  File "/usr/lib/python2.7/xmlrpclib.py", line 1493, in parse_response
    return u.close()
  File "/usr/lib/python2.7/xmlrpclib.py", line 800, in close
    raise Fault(**self._stack[0])
xmlrpclib.Fault: <Fault 8002: 'error'>
root@essl-271-6371:~# dhcp-tool
Traceback (most recent call last):
  File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>
    ranges = get_ranges()
  File "/usr/share/eole/sbin/dhcp-tool", line 26, in get_ranges
    for ip, netmask in zip(list(cfg.creole.dhcp.adresse_network_dhcp.adresse_network_dhcp),
  File "/usr/lib/python2.7/dist-packages/tiramisu/config.py", line 248, in __getattr__
    return self.getattr(name)
  File "/usr/lib/python2.7/dist-packages/tiramisu/config.py", line 316, in getattr
    raise props
tiramisu.error.PropertiesOptionError: ne peut accéder à l'optiondescription "dhcp" a cause de la propriété disabled (la valeur de "Activer le serveur DHCP" est "non")
I then enabled DHCP and ran reconfigure.
Everything is OK.
root@essl-271-6371:~# synchro_zephir
Demande de synchronisation auprès du service z_stats : ok
root@essl-271-6371:~# dhcp-tool
plage0 10 10
lugdunum 767 767
root@essl-271-6371:~# systemctl status z_stats.service
● z_stats.service - Agent zephir
   Loaded: loaded (/lib/systemd/system/z_stats.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-02-27 11:20:31 CET; 21min ago
If I disable DHCP again, it breaks again.
root@essl-271-6371:~# systemctl status z_stats.service
● z_stats.service - Agent zephir
   Loaded: loaded (/lib/systemd/system/z_stats.service; enabled; vendor preset: enabled)
   Active: active (running) since Thu 2020-02-27 11:50:10 CET; 2h 41min ago
 Main PID: 2923 (twistd)
    Tasks: 1 (limit: 3546)
   CGroup: /system.slice/z_stats.service
           └─2923 /usr/bin/python2 /usr/bin/twistd --nodaemon --no_save --pidfile /run/z_stats.pid zephiragents --tmp=data --data=stats --archive=/tmp --static=static

févr. 27 14:27:27 essl-271-6371 zephiragents[2923]: [-] Traceback (most recent call last):
févr. 27 14:27:27 essl-271-6371 zephiragents[2923]: [-] Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition:
févr. 27 14:28:27 essl-271-6371 zephiragents[2923]: 2020-02-27T14:28:27+0100 [-] KernelMaintenance : pas de dernière mesure disponible.
févr. 27 14:28:27 essl-271-6371 zephiragents[2923]: [-] KernelMaintenance : pas de dernière mesure disponible.
févr. 27 14:30:27 essl-271-6371 zephiragents[2923]: 2020-02-27T14:30:27+0100 [-] KernelMaintenance : pas de dernière mesure disponible.
févr. 27 14:30:27 essl-271-6371 zephiragents[2923]: [-] KernelMaintenance : pas de dernière mesure disponible.
févr. 27 14:30:30 essl-271-6371 zephiragents[2923]: 2020-02-27T14:30:30+0100 [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 27 14:30:30 essl-271-6371 zephiragents[2923]: [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
févr. 27 14:30:30 essl-271-6371 zephiragents[2923]: 2020-02-27T14:30:30+0100 [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-too
févr. 27 14:30:30 essl-271-6371 zephiragents[2923]: [-] got stderr: 'Traceback (most recent call last):\n File "/usr/share/eole/sbin/dhcp-tool", line 102, in <module>
root@essl-271-6371:~# systemctl start z_stats.service
No error message,
but the service is still down.
If I re-enable DHCP, everything works again.
In the end, it does look like dhcp-tool (a new feature present only in 2.7.1) is causing the error, depending on whether DHCP is enabled.
Also, this only affects the zephir statistics; sending the configuration or backing up to zephir works fine.
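The failure mode described above can be reproduced in isolation. The following is a minimal, self-contained sketch, not the actual dhcp-tool or tiramisu code: `FakeConfig`, `get_ranges_unguarded` and `get_ranges_guarded` are hypothetical names, and the local `PropertiesOptionError` class only stands in for `tiramisu.error.PropertiesOptionError`. It assumes, as the traceback suggests, that reading a disabled option group raises, and shows how checking the activation flag first would avoid the crash.

```python
class PropertiesOptionError(Exception):
    """Simplified stand-in for tiramisu.error.PropertiesOptionError."""


class FakeConfig(object):
    """Hypothetical model of a configuration holding a 'dhcp' option group."""

    def __init__(self, activer_dhcp, networks):
        self.activer_dhcp = activer_dhcp  # 'oui' or 'non', as returned by CreoleGet
        self._networks = networks

    @property
    def dhcp_networks(self):
        # Assumption: a disabled option group refuses any access.
        if self.activer_dhcp != 'oui':
            raise PropertiesOptionError('option group "dhcp" is disabled')
        return self._networks


def get_ranges_unguarded(cfg):
    """Read the networks directly; raises when DHCP is disabled."""
    return list(cfg.dhcp_networks)


def get_ranges_guarded(cfg):
    """Check the activation flag first and return no ranges when disabled."""
    if cfg.activer_dhcp != 'oui':
        return []
    return list(cfg.dhcp_networks)
```

With `FakeConfig('non', [...])`, the unguarded function raises while the guarded one returns an empty list; with `'oui'`, both return the configured networks.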
#4 Updated by équipe eole Academie d'Orléans-Tours over 3 years ago
Hello,
I second this request, on up-to-date AMON 2.7.1 servers.
I can reproduce this problem on my test setup. It appears on all our AMON servers deployed in private schools, none of which has DHCP. Nothing to report on our AMON servers in public EPLE schools, which all have DHCP.
On the test setup, disabling the dhcp service (no reconfigure needed, just CreoleSet + restart z_stats) produces the following in /var/log/rsyslog/local/zephiragents/zephiragents.info.log:
2020-02-28T17:00:06.941839+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:00:06+0100 [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
2020-02-28T17:00:06.942239+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents: [-] Erreur remontée par /usr/share/eole/sbin/dhcp-tool
2020-02-28T17:00:06.942606+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:00:06+0100 [-] got stderr: 'Traceback (most recent call last):\n'
2020-02-28T17:00:06.942807+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents: [-] got stderr: 'Traceback (most recent call last):\n'
2020-02-28T17:02:51.171601+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:02:51+0100 [twisted.internet.defer#critical] Unhandled error in Deferred:
2020-02-28T17:02:51.173427+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:02:51+0100 [twisted.internet.defer#critical]
2020-02-28T17:02:51.173608+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011Traceback (most recent call last):
2020-02-28T17:02:51.173806+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011Failure: twisted.internet.error.ProcessTerminated: A process has ended with a probable error condition: process ended with exit code 1.
2020-02-28T17:02:51.173937+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011
2020-02-28T17:02:58.268781+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:02:58+0100 [_GenericHTTPChannelProtocol,373,127.0.0.1] Unhandled Error
2020-02-28T17:02:58.269502+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011Traceback (most recent call last):
2020-02-28T17:02:58.269627+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/twisted/web/server.py", line 195, in process
2020-02-28T17:02:58.269746+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 self.render(resrc)
2020-02-28T17:02:58.269862+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/twisted/web/server.py", line 255, in render
2020-02-28T17:02:58.270001+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 body = resrc.render(self)
2020-02-28T17:02:58.270144+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/twisted/web/resource.py", line 250, in render
2020-02-28T17:02:58.270282+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 return m(request)
2020-02-28T17:02:58.270404+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/twisted/web/xmlrpc.py", line 172, in render_POST
2020-02-28T17:02:58.270540+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 d = defer.maybeDeferred(function, *args)
2020-02-28T17:02:58.270672+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011--- <exception caught here> ---
2020-02-28T17:02:58.270797+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/twisted/internet/defer.py", line 150, in maybeDeferred
2020-02-28T17:02:58.270912+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 result = f(*args, **kw)
2020-02-28T17:02:58.271038+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/zephir/monitor/agentmanager/zephirservice.py", line 456, in xmlrpc_archive_for_upload
2020-02-28T17:02:58.271172+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 self.wakeup_for_upload(False)
2020-02-28T17:02:58.271304+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/zephir/monitor/agentmanager/zephirservice.py", line 281, in wakeup_for_upload
2020-02-28T17:02:58.271428+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 agent.archive()
2020-02-28T17:02:58.271547+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/zephir/monitor/agentmanager/agent.py", line 415, in archive
2020-02-28T17:02:58.271678+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 self.ensure_data_uptodate()
2020-02-28T17:02:58.271792+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/zephir/monitor/agentmanager/agent.py", line 426, in ensure_data_uptodate
2020-02-28T17:02:58.271916+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 self.write_data()
2020-02-28T17:02:58.272029+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 File "/usr/lib/python2.7/dist-packages/zephir/monitor/agents/dhcp.py", line 49, in write_data
2020-02-28T17:02:58.272141+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011 self.table.table_data = self.last_measure.value['leases']
2020-02-28T17:02:58.272253+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011exceptions.TypeError: 'NoneType' object has no attribute '__getitem__'
2020-02-28T17:02:58.272367+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: #011
2020-02-28T17:02:58.278959+01:00 amon36-0999999f.etab-maquette36-dsi.lan zephiragents[2472]: 2020-02-28T17:02:58+0100 [twisted.python.log#info] 127.0.0.1 - - [28/Feb/2020:16:02:56 +0000] "POST /xmlrpc HTTP/1.1" 200 263 "-" "xmlrpclib.py/1.0.1 (by www.pythonware.com)"
And synchro_zephir then stops working.
Nicolas
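The final `TypeError` in the log above is what Python 2 raises when `None` is subscripted, which suggests that `self.last_measure.value` was `None` at dhcp.py line 49, presumably because dhcp-tool had failed and no measure was recorded. The sketch below illustrates a defensive `write_data` under that assumption. `Measure`, `Table` and `DhcpAgentSketch` are hypothetical stand-ins, not the zephir agent classes, and this is not the fix that was eventually applied to this issue.

```python
class Measure(object):
    """Hypothetical container for one measurement (value may be None)."""

    def __init__(self, value):
        self.value = value


class Table(object):
    """Hypothetical table holding the rows to archive."""

    def __init__(self):
        self.table_data = []


class DhcpAgentSketch(object):
    """Hypothetical agent showing a None-safe write_data."""

    def __init__(self):
        self.table = Table()
        self.last_measure = None

    def write_data(self):
        """Copy the leases into the table; return False if nothing was measured."""
        measure = self.last_measure
        if measure is None or measure.value is None:
            # No usable measure (e.g. the external tool failed): keep the
            # previous table content instead of raising TypeError.
            return False
        self.table.table_data = measure.value.get('leases', [])
        return True
```

An agent whose measure holds `None` then returns `False` from `write_data()` rather than raising, so an archive request would not fail on this agent alone.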
#5 Updated by Joël Cuissinat over 3 years ago
- Related to Scenario #20753: Visibility of remaining DHCP leases (EAD3 and agent) added
#6 Updated by Joël Cuissinat over 3 years ago
- Subject changed from soucis de cohabitation z_stats et dhcp to Soucis de cohabitation z_stats et dhcp
- Parent task set to #29652
#7 Updated by Joël Cuissinat over 3 years ago
- Status changed from New to In progress
- Assigned To set to Joël Cuissinat
- Start date set to 03/04/2020
The "dhcp" agent added in 2.7.1 by #20753 should not have been loaded when the DHCP service is not enabled.
=> 3 packages rebuilt and added to the current release candidate
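Taken together with the associated revisions (adding "activer_dhcp" to zstats.cfg and ACTIVER_DHCP to agentmanager/config.py), the fix amounts to loading the agent conditionally. The sketch below only illustrates that idea: `select_agents`, `AGENT_REQUIREMENTS`, the `sysinfo` agent name and the dictionary layout are hypothetical and do not reflect the actual agentmanager code.

```python
# Hypothetical mapping: agent name -> configuration variable that must be 'oui'.
AGENT_REQUIREMENTS = {'dhcp': 'activer_dhcp'}


def select_agents(available, settings, requirements=AGENT_REQUIREMENTS):
    """Return the agents to load, skipping those whose service is disabled.

    available    -- list of agent names found on the server
    settings     -- dict of configuration values, e.g. {'activer_dhcp': 'non'}
    requirements -- dict mapping an agent name to the variable it depends on
    """
    selected = []
    for name in available:
        variable = requirements.get(name)
        # Agents without a requirement are always loaded; the others only
        # when their variable is explicitly set to 'oui'.
        if variable is None or settings.get(variable) == 'oui':
            selected.append(name)
    return selected
```

For example, `select_agents(['sysinfo', 'dhcp'], {'activer_dhcp': 'non'})` keeps only `sysinfo`, so the dhcp agent never gets a chance to call dhcp-tool on a server without DHCP.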
#8 Updated by Joël Cuissinat over 3 years ago
- Status changed from In progress to Resolved
- % Done changed from 0 to 100
#9 Updated by Daniel Dehennin over 3 years ago
On an Amon 2.7.1 with up-to-date packages, this works correctly:
eole-server 2.7.1-59
eole-dhcp 2.7.1-10
zephir-client 2.7.1-16
#10 Updated by Daniel Dehennin over 3 years ago
- Status changed from Resolved to Closed
- Remaining (hours) set to 0.0
#11 Updated by Thierry Bertrand almost 3 years ago
- Status changed from Closed to New
- Assigned To deleted (Joël Cuissinat)
- Estimated time set to 0.00 h
#12 Updated by Thierry Bertrand almost 3 years ago
- Status changed from New to Closed