Task #19403

Scenario #18341: Bacula/Bareos backups must not depend on creoled

Horus 2.5.2: Bareos error after a creoled crash

Added by Yoni Baude about 3 years ago. Updated almost 3 years ago.

Status: Closed
Priority: Normal
Assigned To:
Start date: 02/28/2017
Due date:
% Done: 100%
Estimated time: 6.00 h
Spent time:
Remaining (hours): 0.0

Description

Hello,

We are seeing Bareos backup errors following a crash of the creoled service:

Excerpt from /var/log/rsyslog/local/bareos-dir/bareos-dir.err.log

2017-02-28T00:00:09.095014+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: #033[31mErreur au test de montage : HTTP error: socket.error: [Errno 111] ECONNREFUSED
2017-02-28T00:00:09.095054+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: Please check creoled's log (/var/log/creoled.log) and restart service with command 'service creoled start'
2017-02-28T00:00:09.112261+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: BeforeJob: #033[39;49m
2017-02-28T00:00:09.112305+01:00 ... bareos-dir: 127.0.0.1-dir JobId 25: Error: Runscript: BeforeJob returned non-zero status=1. ERR=Child exited with code 1

Excerpt from /var/log/creoled.log

2017-02-27 23:01:47,712: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:47] "GET /get/creole?variable=container_ip_fichier HTTP/1.1" 200 38 "" "restkit/4.2.2" 
2017-02-27 23:01:47,715: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:47] "GET /get/creole?variable=adresse_ip_gw HTTP/1.1" 200 41 "" "restkit/4.2.2" 
2017-02-27 23:01:48,472: cherrypy.access.140188878139664 - 127.0.0.1 - - [27/Feb/2017:23:01:48] "GET /get/creole?variable=container_path_mail HTTP/1.1" 200 29 "" "restkit/4.2.2" 
2017-02-27 23:09:25,065: cherrypy.error -  ENGINE Listening for SIGHUP.
2017-02-27 23:09:25,086: cherrypy.error -  ENGINE Listening for SIGTERM.
2017-02-27 23:09:25,086: cherrypy.error -  ENGINE Listening for SIGUSR1.
2017-02-27 23:09:25,086: cherrypy.error -  ENGINE Listening for SIGINT.
2017-02-27 23:09:25,129: cherrypy.error -  ENGINE Bus STARTING
2017-02-27 23:09:25,130: cherrypy.error -  ENGINE Forking once.
2017-02-27 23:09:25,130: cherrypy.error -  ENGINE Forking twice.
2017-02-27 23:09:25,130: cherrypy.error -  ENGINE Daemonized to PID: 1531
2017-02-27 23:09:25,131: cherrypy.error -  ENGINE Start InotifyMonitor thread
2017-02-27 23:09:25,131: cherrypy.error -  ENGINE Started monitor thread '_TimeoutMonitor'.
2017-02-27 23:09:25,131: cherrypy.error -  ENGINE PID 1531 written to '/run/creoled.pid'.
2017-02-27 23:09:25,132: cherrypy.error -  ENGINE Started monitor thread 'Autoreloader'.
2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Error in 'start' listener <bound method Server.start of <cherrypy._cpserver.Server object at 0x7ff578f97590>>
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 197, in publish
    output.append(listener(*args, **kwargs))
  File "/usr/lib/python2.7/dist-packages/cherrypy/_cpserver.py", line 151, in start
    ServerAdapter.start(self)
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 174, in start
    self.wait()
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 214, in wait
    wait_for_occupied_port(host, port)
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/servers.py", line 427, in wait_for_occupied_port
    raise IOError("Port %r not bound on %r" % (port, host))
IOError: Port 8000 not bound on '127.0.0.1'

2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Shutting down due to error in start listener:
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 235, in start
    self.publish('start')
  File "/usr/lib/python2.7/dist-packages/cherrypy/process/wspbus.py", line 215, in publish
    raise exc
ChannelFailures: IOError("Port 8000 not bound on '127.0.0.1'",)

2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Bus STOPPING
2017-02-27 23:11:05,502: cherrypy.error -  ENGINE HTTP Server cherrypy._cpwsgi_server.CPWSGIServer(('127.0.0.1', 8000)) already shut down
2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Stopped thread '_TimeoutMonitor'.
2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Stopped thread 'Autoreloader'.
2017-02-27 23:11:05,502: cherrypy.error -  ENGINE Stop InotifyMonitor thread
2017-02-27 23:11:05,503: cherrypy.error -  ENGINE Bus STOPPED
2017-02-27 23:11:05,503: cherrypy.error -  ENGINE Bus EXITING
2017-02-27 23:11:05,503: cherrypy.error -  ENGINE PID file removed: '/run/creoled.pid'.
2017-02-27 23:11:05,503: cherrypy.error -  ENGINE Bus EXITED
2017-02-28 07:00:07,593: cherrypy.error -  ENGINE Listening for SIGHUP.
2017-02-28 07:00:07,594: cherrypy.error -  ENGINE Listening for SIGTERM.
2017-02-28 07:00:07,594: cherrypy.error -  ENGINE Listening for SIGUSR1.
2017-02-28 07:00:07,594: cherrypy.error -  ENGINE Listening for SIGINT.
2017-02-28 07:00:07,604: cherrypy.error -  ENGINE Bus STARTING
2017-02-28 07:00:07,606: cherrypy.error -  ENGINE Forking once.
2017-02-28 07:00:07,607: cherrypy.error -  ENGINE Forking twice.
2017-02-28 07:00:07,607: cherrypy.error -  ENGINE Daemonized to PID: 6336
2017-02-28 07:00:07,607: cherrypy.error -  ENGINE Start InotifyMonitor thread
2017-02-28 07:00:07,608: cherrypy.error -  ENGINE Started monitor thread '_TimeoutMonitor'.
2017-02-28 07:00:07,609: cherrypy.error -  ENGINE PID 6336 written to '/run/creoled.pid'.
2017-02-28 07:00:07,609: cherrypy.error -  ENGINE Started monitor thread 'Autoreloader'.
2017-02-28 07:00:07,736: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=eole_module HTTP/1.1" 200 34 "" "restkit/4.2.2" 
2017-02-28 07:00:07,737: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=eole_release HTTP/1.1" 200 34 "" "restkit/4.2.2" 
2017-02-28 07:00:07,815: cherrypy.error -  ENGINE Serving on 127.0.0.1:8000
2017-02-28 07:00:07,815: cherrypy.error -  ENGINE Bus STARTED
2017-02-28 07:00:07,972: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:07] "GET /get/creole?variable=mode_conteneur_actif HTTP/1.1" 200 32 "" "restkit/4.2.2" 
2017-02-28 07:00:08,003: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers/containers HTTP/1.1" 200 2917 "" "restkit/4.2.2" 
2017-02-28 07:00:08,782: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers?withoption=real_container&withvalue=root HTTP/1.1" 200 160026 "" "restkit/4.2.2" 
2017-02-28 07:00:08,815: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:08] "GET /get/containers/services HTTP/1.1" 200 13086 "" "restkit/4.2.2" 
2017-02-28 07:00:09,097: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,239: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,265: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,292: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,320: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,346: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,372: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:09,402: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:09] "GET /get/creole HTTP/1.1" 200 18945 "" "restkit/4.2.2" 
2017-02-28 07:00:24,912: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:24] "GET /get/creole?variable=adresse_ip_gw HTTP/1.1" 200 41 "" "restkit/4.2.2" 
2017-02-28 07:00:24,913: cherrypy.access.140206337360144 - 127.0.0.1 - - [28/Feb/2017:07:00:24] "GET /get/creole?variable=container_ip_fichier HTTP/1.1" 200 38 "" "restkit/4.2.2"

Would it be possible to check the state of the creoled service in JobSchedulePre and restart it if needed?
Thank you for your help.

Regards,
Yoni
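For illustration only, the check requested above could look roughly like the following. This is a minimal sketch, not the code that was committed for this task: it assumes Python 3, assumes creoled is considered healthy when something accepts TCP connections on 127.0.0.1:8000 (the address shown in creoled.log), and reuses the 'service creoled start' command suggested by the BeforeJob message. The function names `is_listening` and `ensure_creoled` are hypothetical.

```python
import socket
import subprocess
import time


def is_listening(host="127.0.0.1", port=8000, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ECONNREFUSED (as in the Bareos log) and timeouts.
        return False


def ensure_creoled(check=is_listening, run=subprocess.call, retries=5, delay=1.0):
    """Start creoled if it does not answer; return True once it answers.

    `check` and `run` are injectable so the logic can be exercised
    without touching a real service (hypothetical design choice).
    """
    if check():
        return True
    # Command taken from the hint printed in the BeforeJob output.
    run(["service", "creoled", "start"])
    # The daemon needs a moment to bind its port after starting.
    for _ in range(retries):
        if check():
            return True
        time.sleep(delay)
    return False
```

A pre-job script could call `ensure_creoled()` and exit with a non-zero status when it returns False. Note that this only works around the crash; the revisions attached to this task take the other route and remove the dependency on creoled at backup time.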

Associated revisions

Revision 060bfe2f (diff)
Added by Benjamin Bohard almost 3 years ago

Generate a configuration file to avoid calls to Creole.

Ref #19403

Revision 34246c72 (diff)
Added by Benjamin Bohard almost 3 years ago

Do not call CreoleClient at backup time.

Ref #19403

Revision d88db7c6 (diff)
Added by Benjamin Bohard almost 3 years ago

Make CreoleRun independent of creoled.

Ref #19403

Revision b7697852 (diff)
Added by Benjamin Bohard almost 3 years ago

Create a template containing the container information.

Ref #19403

Revision 9520835f (diff)
Added by Benjamin Bohard almost 3 years ago

Complete support for passing container information as arguments.

Ref #19403

Revision eff5de02 (diff)
Added by Benjamin Bohard almost 3 years ago

Try to read the files that cache the container information.

Ref #19403

Revision 19f5e63e (diff)
Added by Benjamin Bohard almost 3 years ago

Generate additional cache files for the container information.

Ref #19403

Revision 75df6824 (diff)
Added by Benjamin Bohard almost 3 years ago

Remove the direct dependency on creoled in the schedule scripts.

Ref #19403

Revision 66a55d6c (diff)
Added by Benjamin Bohard almost 3 years ago

Remove the direct dependency on creoled for the schedule script.

Ref #19403

Revision 9d221a6a (diff)
Added by Benjamin Bohard almost 3 years ago

Map the variable names to one another.

Ref #19403

History

#1 Updated by Daniel Dehennin about 3 years ago

  • Parent task set to #18341

#2 Updated by Benjamin Bohard almost 3 years ago

  • Project changed from Distribution EOLE to eole-bacula
  • Status changed from New to In progress

#3 Updated by Benjamin Bohard almost 3 years ago

  • Assigned To set to Benjamin Bohard

#4 Updated by Benjamin Bohard almost 3 years ago

  • Estimated time set to 6.00 h
  • Remaining (hours) set to 2.0

The pyeole/bareos.py library has been reworked to separate what relates to configuration (calls to creoled are kept there, notably for templating) from what only concerns running the backup.
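As a rough picture of that split, the sketch below writes values to a cache file at configuration time and reads them back at backup time without contacting creoled. It is an assumption-laden illustration, not the content of pyeole/bareos.py: it assumes Python 3 and a JSON cache, and the names `write_cache` and `read_cache`, as well as any file path passed to them, are hypothetical.

```python
import json
import os
import tempfile


def write_cache(variables, path):
    """Configuration time: store already-fetched values in a JSON file.

    The file is written to a temporary name and renamed, so a backup
    job never reads a half-written cache.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(variables, handle, sort_keys=True)
        os.replace(tmp, path)
    except BaseException:
        # Clean up the temporary file if writing or renaming failed.
        os.unlink(tmp)
        raise


def read_cache(path):
    """Backup time: read the values from disk; creoled is not involved."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)
```

With this layout, only the step that calls `write_cache` needs creoled to be running; a job that only calls `read_cache` keeps working after a crash such as the one in the logs above, at the cost of possibly reading values that are out of date until the next reconfigure.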

#5 Updated by Benjamin Bohard almost 3 years ago

Large impact, because the schedule scripts are involved.

Generating the variable cache takes a long time and noticeably lengthens reconfigure (+30 s):

  • 3 s to generate the container variables file (sourced by the bash scripts).
  • 27 s to generate the container information file (used by the Python script CreoleService).
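To make the first of these two files more concrete, here is a minimal sketch of how a variables file that bash scripts can source might be produced. It is not the template used in the revisions above: it assumes Python 3, the function name `to_shell_assignments` is hypothetical, and the values in the usage note are made-up documentation addresses.

```python
import re
import shlex

# Shell variable names: ASCII letters, digits and underscores,
# not starting with a digit.
_SHELL_NAME = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def to_shell_assignments(variables):
    """Render a dict as NAME=value lines that bash can source safely."""
    lines = []
    for name in sorted(variables):
        if not _SHELL_NAME.match(name):
            raise ValueError("not a valid shell variable name: %r" % name)
        # shlex.quote protects spaces, quotes and other shell metacharacters.
        lines.append("%s=%s" % (name, shlex.quote(str(variables[name]))))
    return "\n".join(lines) + "\n"
```

For example, `to_shell_assignments({"container_ip_fichier": "192.0.2.10"})` returns the single line `container_ip_fichier=192.0.2.10`, which a bash script can load with `. /path/to/file` instead of querying creoled.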

#6 Updated by Benjamin Bohard almost 3 years ago

  • Status changed from In progress to Closed
  • % Done changed from 0 to 100
  • Remaining (hours) changed from 2.0 to 0.0

Changes not merged: there is no guarantee of stability (schedule scripts added later could undermine the whole approach, etc.).

See the conclusion of task https://dev-eole.ac-dijon.fr/issues/19915.
