Task #19403
Scenario #18341: Bacula/Bareos backups must not depend on creoled
Horus 2.5.2: Bareos error following a creoled crash
Description
Hello,
We are seeing Bareos backup errors following a crash of the creoled service:
Excerpt from /var/log/rsyslog/local/bareos-dir/bareos-dir.err.log
2017-02-28T00:00:09.095014+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: #033[31mErreur au test de montage : HTTP error: socket.error: [Errno 111] ECONNREFUSED
2017-02-28T00:00:09.095054+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: Please check creoled's log (/var/log/creoled.log) and restart service with command 'service creoled start'
2017-02-28T00:00:09.112261+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: #033[39;49m
2017-02-28T00:00:09.112305+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: Error: Runscript: BeforeJob returned non-zero status=1. ERR=Child exited with code 1
Excerpt from /var/log/creoled.log
2017-02-27 23:01:47,712: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:47] "GET /get/creole?variable=container_ip_fichier HTTP/1.1" 200 38 "" "restkit/4.2.2"
2017-02-27 23:01:47,715: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:47] "GET /get/creole?variable=adresse_ip_gw HTTP/1.1" 200 41 "" "restkit/4.2.2"
2017-02-27 23:01:48,472: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:48] "GET /get/creole?variable=container_path_mail HTTP/1.1" 200 29 "" "restkit/4.2.2"
2017-02-27 23:09:25,065: cherrypy.error - ENGINE Listening for SIGHUP.
2017-02-27 23:09:25,086: cherrypy.error - ENGINE Listening for SIGTERM.
2017-02-27 23:09:25,086: cherrypy.error - ENGINE Listening for SIGUSR1.
2017-02-27 23:09:25,086: cherrypy.error - ENGINE Listening for SIGINT.
2017-02-27 23:09:25,129: cherrypy.error - ENGINE Bus STARTING
2017-02-27 23:09:25,130: cherrypy.error - ENGINE Forking once.
2017-02-27 23:09:25,130: cherrypy.error - ENGINE Forking twice.
2017-02-27 23:09:25,130: cherrypy.error - ENGINE Daemonized to PID: 1531
2017-02-27 23:09:25,131: cherrypy.error - ENGINE Start InotifyMonitor thread
2017-02-27 23:09:25,131: cherrypy.error - ENGINE Started monitor thread '_TimeoutMonitor'.
2017-02-27 23:09:25,131: cherrypy.error - ENGINE PID 1531 written to '/run/creoled.pid'.
2017-02-27 23:09:25,132: cherrypy.error - ENGINE Started monitor thread 'Autoreloader'.
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Error in 'start' listener <bound method Server.start of <cherrypy._cpserver.Server object at 0x7ff578f97590>>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 197, in publish
    output.append(listener(*args, **kwargs))
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpserver.py", line 151, in start
    ServerAdapter.start(self)
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 174, in start
    self.wait()
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 214, in wait
    wait_for_occupied_port(host, port)
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 427, in wait_for_occupied_port
    raise IOError("Port %r not bound on %r" % (port, host))
IOError: Port 8000 not bound on '127.0.0.1'
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Shutting down due to error in start listener:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 235, in start
    self.publish('start')
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 215, in publish
    raise exc
ChannelFailures: IOError("Port 8000 not bound on '127.0.0.1'",)
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Bus STOPPING
2017-02-27 23:11:05,502: cherrypy.error - ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8000)) already shut down
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Stopped thread '_TimeoutMonitor'.
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Stopped thread 'Autoreloader'.
2017-02-27 23:11:05,502: cherrypy.error - ENGINE Stop InotifyMonitor thread
2017-02-27 23:11:05,503: cherrypy.error - ENGINE Bus STOPPED
2017-02-27 23:11:05,503: cherrypy.error - ENGINE Bus EXITING
2017-02-27 23:11:05,503: cherrypy.error - ENGINE PID file removed: '/run/creoled.pid'.
2017-02-27 23:11:05,503: cherrypy.error - ENGINE Bus EXITED
2017-02-28 07:00:07,593: cherrypy.error - ENGINE Listening for SIGHUP.
2017-02-28 07:00:07,594: cherrypy.error - ENGINE Listening for SIGTERM.
2017-02-28 07:00:07,594: cherrypy.error - ENGINE Listening for SIGUSR1.
2017-02-28 07:00:07,594: cherrypy.error - ENGINE Listening for SIGINT.
2017-02-28 07:00:07,604: cherrypy.error - ENGINE Bus STARTING
2017-02-28 07:00:07,606: cherrypy.error - ENGINE Forking once.
2017-02-28 07:00:07,607: cherrypy.error - ENGINE Forking twice.
2017-02-28 07:00:07,607: cherrypy.error - ENGINE Daemonized to PID: 6336
2017-02-28 07:00:07,607: cherrypy.error - ENGINE Start InotifyMonitor thread
2017-02-28 07:00:07,608: cherrypy.error - ENGINE Started monitor thread '_TimeoutMonitor'.
2017-02-28 07:00:07,609: cherrypy.error - ENGINE PID 6336 written to '/run/creoled.pid'.
2017-02-28 07:00:07,609: cherrypy.error - ENGINE Started monitor thread 'Autoreloader'.
2017-02-28 07:00:07,736: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=eole_module HTTP/1.1" 200 34 "" "restkit/4.2.2"
2017-02-28 07:00:07,737: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=eole_release HTTP/1.1" 200 34 "" "restkit/4.2.2"
2017-02-28 07:00:07,815: cherrypy.error - ENGINE Serving on 127.0.0.1:8000
2017-02-28 07:00:07,815: cherrypy.error - ENGINE Bus STARTED
2017-02-28 07:00:07,972: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=mode_conteneur_actif HTTP/1.1" 200 32 "" "restkit/4.2.2"
2017-02-28 07:00:08,003: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers/containers HTTP/1.1" 200 2917 "" "restkit/4.2.2"
2017-02-28 07:00:08,782: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers?withoption=real_container&withvalue=root HTTP/1.1" 200 160026 "" "restkit/4.2.2"
2017-02-28 07:00:08,815: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers/services HTTP/1.1" 200 13086 "" "restkit/4.2.2"
2017-02-28 07:00:09,097: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,239: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,265: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,292: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,320: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,346: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,372: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:09,402: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2"
2017-02-28 07:00:24,912: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:24] "GET /get/creole?variable=adresse_ip_gw HTTP/1.1" 200 41 "" "restkit/4.2.2"
2017-02-28 07:00:24,913: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:24] "GET /get/creole?variable=container_ip_fichier HTTP/1.1" 200 38 "" "restkit/4.2.2
Would it be possible to check the state of the creoled service at the JobSchedulePre stage, and restart it if needed?
Thank you for your help.
Regards,
Yoni
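The check requested in the description could be sketched roughly as follows. This is only an illustration, not the project's code: the function names `port_is_open` and `ensure_creoled`, the retry parameters, and the idea of probing the TCP port are assumptions. Only the address 127.0.0.1:8000 and the command `service creoled start` come from the logs above. It is written for Python 3, whereas the traceback shows the host running Python 2.7.

```python
import socket
import subprocess
import time


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ECONNREFUSED (the error seen in the Bareos log) and timeouts.
        return False


def ensure_creoled(host="127.0.0.1", port=8000, runner=subprocess.call,
                   attempts=5, delay=1.0):
    """Start creoled if its port is closed; return True once it is reachable.

    `runner` is a hypothetical hook so the start command can be replaced
    in tests; by default it runs the command suggested in the log.
    """
    if port_is_open(host, port):
        return True
    runner(["service", "creoled", "start"])
    # Give the daemon a little time to bind its port.
    for _ in range(attempts):
        if port_is_open(host, port):
            return True
        time.sleep(delay)
    return False
```

A pre-job script could call `ensure_creoled()` and exit non-zero when it returns False. In the incident logged above, creoled itself failed to bind its port at startup, so a restart attempt of this kind is not guaranteed to succeed.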
Associated revisions
Generate a configuration file to avoid calls to Creole.
Ref #19403
Do not call CreoleClient at backup time.
Ref #19403
Make CreoleRun independent of creoled.
Ref #19403
Create a template containing the container information.
Ref #19403
Complete support for passing container information as arguments.
Ref #19403
Try to read the files that cache the container information.
Ref #19403
Generate additional cache files for the container information.
Ref #19403
Remove the direct dependency on creoled in the schedule scripts.
Ref #19403
Remove the direct dependency on creoled for the schedule script.
Ref #19403
Map the variable names to one another.
Ref #19403
Ref #19403
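The idea behind the revisions above (read cached container information instead of querying creoled at backup time) might look something like the sketch below. The function name `load_container_info`, the JSON format, and the optional `fallback` callable are all assumptions made for illustration; the commits themselves are not shown in this ticket.

```python
import json


def load_container_info(cache_path, fallback=None):
    """Return container information from a cache file.

    `fallback` is a hypothetical callable standing in for a live query to
    creoled. Passing None keeps the lookup fully independent of the daemon:
    a missing or corrupt cache then raises instead of contacting it.
    """
    try:
        with open(cache_path) as handle:
            return json.load(handle)
    except (IOError, ValueError):
        # IOError: file missing or unreadable; ValueError: invalid JSON.
        if fallback is None:
            raise
        return fallback()
```

With `fallback=None`, a backup job fails fast on a missing cache rather than on an unreachable daemon, which makes the failure independent of the state of creoled.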
History
#1 Updated by Daniel Dehennin over 6 years ago
- Parent task set to #18341
#2 Updated by Benjamin Bohard over 6 years ago
- Project changed from Distribution EOLE to eole-bacula
- Status changed from New to In progress
#3 Updated by Benjamin Bohard over 6 years ago
- Assigned To set to Benjamin Bohard
#4 Updated by Benjamin Bohard over 6 years ago
- Estimated time set to 6.00 h
- Remaining (hours) set to 2.0
The pyeole/bareos.py library has been reworked to separate what relates to configuration (calls to creoled are kept there, notably for templating) from what only concerns running the backup.
#5 Updated by Benjamin Bohard over 6 years ago
Major impact, because the schedule scripts are involved.
Generating the variable cache takes a long time and noticeably lengthens reconfigure (+30 s):
3 s to generate the container variables file (sourced by the bash scripts).
27 s to generate the container information file (used by the CreoleService Python script).
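For illustration, the two kinds of cache file mentioned above (one sourced by bash scripts, one read by a Python script) could be produced along these lines. The actual formats are not given in this ticket: the `KEY=value` layout, the use of JSON, and the function names `write_shell_cache` and `write_json_cache` are assumptions, and the values in the usage line are placeholders.

```python
import json
import os
import shlex
import tempfile


def _atomic_write(path, content):
    """Write to a temporary file, then rename it over `path`."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(content)
    # A reader never sees a half-written cache file.
    os.replace(tmp_path, path)


def write_shell_cache(variables, path):
    """Write KEY=value lines that a bash script can load with `. path`."""
    lines = ["{}={}".format(name, shlex.quote(str(value)))
             for name, value in sorted(variables.items())]
    _atomic_write(path, "\n".join(lines) + "\n")


def write_json_cache(data, path):
    """Write structured container information for Python consumers."""
    _atomic_write(path, json.dumps(data, indent=2, sort_keys=True))
```

For example, `write_shell_cache({"container_ip_fichier": "192.0.2.10"}, "vars.sh")` writes a file that bash can source. Quoting with `shlex.quote` keeps values containing spaces or shell metacharacters safe to source.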
#6 Updated by Benjamin Bohard over 6 years ago
- Status changed from In progress to Closed
- % Done changed from 0 to 100
- Remaining (hours) changed from 2.0 to 0.0
Changes not merged: no guarantee of stability (newly added schedule scripts could undermine the whole approach, etc.)
See the conclusion of task https://dev-eole.ac-dijon.fr/issues/19915.