Task #21911

Scenario #22741: MEN express handling (05-07)

The high-availability diagnose output must remain readable even when an error occurs

Added by Joël Cuissinat almost 6 years ago. Updated over 5 years ago.

Status: Closed
Priority: Normal
Start date: 02/09/2018
Due date:
% Done: 100%
Estimated time: 0.25 h
Spent time:
Remaining (hours): 0.0

Description

Fix for SP-T04-003 - Cluster behavior during reconfigure of the slave Sphynx node, 2.6.2beta5
http://squash-tm.eole.lan/squash/executions/6138

crm_mon:

Last updated: Mon Nov  6 15:33:31 2017          Last change: Mon Nov  6 15:31:33 2017 by root via cibadmin on sphynxb
Stack: corosync
Current DC: sphynx (version 1.1.14-70404b0) - partition with quorum
2 nodes and 6 resources configured

Online: [ sphynx sphynxb ]

 Resource Group: VIPCluster
     VIP_externe        (ocf::heartbeat:IPaddr2):       Started sphynx
     VIP_interne        (ocf::heartbeat:IPaddr2):       Started sphynx
     ipsec_rsc  (service:strongswan):   FAILED[ sphynx sphynxb ]
     arv_rsc    (service:arv):  FAILED[ sphynx sphynxb ]
 Clone Set: gw_pingd_clone [gw_pingd]
     gw_pingd   (ocf::pacemaker:ping):  FAILED sphynxb
     Started: [ sphynx ]

Failed Actions:
* gw_pingd_monitor_0 on sphynxb 'unknown error' (1): call=-1, status=Timed Out, exitreason='none',
    last-rc-change='Mon Nov  6 15:32:40 2017', queued=0ms, exec=0ms

diagnose:

*** Haute disponibilité
.            Service Corosync => OK
.                       Noeud sphynx => OK
.                       Noeud sphynxb => OK
.                      Update => 6/Nov/2017 15:35:24
.                   Ressource ipsec_rsc => $3 (sphynx)
.                   Ressource gw_pingd => OK (sphynx)
.                   Ressource VIP_interne => OK (sphynx)
.                   Ressource arv_rsc => $3 (sphynx)
.                   Ressource VIP_externe => OK (sphynx)
.                   Ressource gw_pingd => $3 (sphynxb)
.                   Ressource ipsec_rsc => $3 (sphynxb)
.                   Ressource arv_rsc => $3 (sphynxb)

Associated revisions

Revision c28cea6a (diff)
Added by Fabrice Barconnière over 5 years ago

The status of a resource was not displayed correctly when that resource had a problem

ref #21911

History

#1 Updated by Scrum Master almost 6 years ago

  • Parent task deleted (#21800)

#2 Updated by Scrum Master almost 6 years ago

  • Tracker changed from Task to Scenario Proposal
  • Subject changed from Fix for SP-T04-003 - Cluster behavior during reconfigure of the slave Sphynx node, 2.6.2beta5 to The high-availability diagnose output must remain readable even when an error occurs
  • Description updated (diff)

#3 Updated by Scrum Master almost 6 years ago

  • Target version deleted (sprint 2017 43-45 Equipe MENSR)

#4 Updated by Gilles Grandgérard over 5 years ago

  • Tracker changed from Scenario Proposal to Task
  • Parent task set to #22741

#5 Updated by Gilles Grandgérard over 5 years ago

For 2.6.2

#6 Updated by Fabrice Barconnière over 5 years ago

  • Status changed from New to In progress
  • Start date set to 02/09/2018

#7 Updated by Fabrice Barconnière over 5 years ago

  • Assigned To set to Fabrice Barconnière

#8 Updated by Fabrice Barconnière over 5 years ago

  • Status changed from In progress to Resolved
  • % Done changed from 0 to 100
  • Estimated time set to 0.25 h
  • Remaining (hours) set to 0.25

To test:
  • Create an executable file /root/mon_crm_mon containing:
    echo "Last updated: Fri Feb  9 13:26:18 2018        Last change: Fri Feb  9 13:26:13 2018 by hacluster via crmd on sphynx
    Stack: corosync
    Current DC: sphynx (version 1.1.14-70404b0) - partition with quorum
    2 nodes and 6 resources configured
    
    Node sphynx: online
        gw_pingd    (ocf::pacemaker:ping):  Started
        ipsec_rsc   (service:strongswan):   Started
        arv_rsc (service:arv):  FAILED
        VIP_interne (ocf::heartbeat:IPaddr2):   Started
        VIP_externe (ocf::heartbeat:IPaddr2):   Started
    Node sphynxslave: online
        gw_pingd    (ocf::pacemaker:ping):  Started
    
    Failed Actions:
    * arv_rsc_start_0 on sphynx 'not running' (7): call=382, status=complete, exitreason='none',
        last-rc-change='Fri Feb  9 13:26:13 2018', queued=0ms, exec=2021ms
    " 
    
  • Edit /usr/share/eole/diagnose/150-ha so that it uses the fake crm_mon output:
    ....
    ....
    function _crm_mon_info_get()
    {
    
      /root/mon_crm_mon | \
      #crm_mon -1 -n 2>&1 | \
        awk -v len_pf=${len_pf} \
    ....
    ....
    
  • Run diagnose and check the result
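
The first test step can also be scripted. The sketch below is a convenience under stated assumptions, not part of the ticket: `write_fake_crm_mon` is a hypothetical helper name, the canned report is shortened to the per-node lines that matter (one FAILED resource), and the fields are written with explicit tabs via `printf` so the separators are unambiguous.

```shell
# Hypothetical helper (not from the ticket): writes an executable stub that
# prints a shortened, canned `crm_mon -1 -n` report with one FAILED resource.
write_fake_crm_mon() {
    target="$1"
    # Quoted heredoc delimiter: the stub below is written verbatim.
    cat > "$target" <<'EOF'
#!/bin/sh
# Fields are separated with explicit tabs (\t) rather than spaces.
printf 'Node sphynx: online\n'
printf '\tipsec_rsc\t(service:strongswan):\tStarted\n'
printf '\tarv_rsc\t(service:arv):\tFAILED\n'
printf 'Node sphynxslave: online\n'
printf '\tgw_pingd\t(ocf::pacemaker:ping):\tStarted\n'
EOF
    chmod +x "$target"
}
```

Calling `write_fake_crm_mon /root/mon_crm_mon` would produce a stub at the path used in the steps above; the full report from the ticket can be substituted for the shortened one if the header and "Failed Actions" lines also need to be exercised.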

#9 Updated by Emmanuel GARETTE over 5 years ago

  • Status changed from Resolved to Closed
  • Remaining (hours) changed from 0.25 to 0.0

Watch out for "\t" (tabs) in place of spaces.

root@sphynx:~# /usr/share/eole/diagnose/150-ha
*** Haute disponibilité
.            Service Corosync => OK
.                       Noeud sphynx => OK
.                      Update => 9/Feb/2018 16:00:58
.                   Ressource ipsec_rsc => OK (sphynx)
.                   Ressource VIP_interne => OK (sphynx)
.                   Ressource arv_rsc => FAILED (sphynx)
.                   Ressource VIP_externe => OK (sphynx)
.                   Ressource gw_pingd => OK (sphynx)
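
The resource, status and node shown on each line above can be recovered from a `crm_mon -1 -n` style report with a short awk filter. The following is only a sketch of that idea, not the code of 150-ha: `parse_crm_mon` is a hypothetical name, and it assumes the per-node layout ("Node NAME: online" followed by indented resource lines) seen in the test report.

```shell
# Hypothetical filter: reads a `crm_mon -1 -n` style report on stdin and
# prints one "resource status node" line per resource.
parse_crm_mon() {
    awk '
        # "Node sphynx: online" opens a per-node section.
        $1 == "Node" { node = $2; sub(/:$/, "", node); next }
        # Any other non-indented line, or a blank line, closes the section.
        /^[^ \t]/ || NF == 0 { node = ""; next }
        # Indented resource line: first field is the name, last is the status.
        node != "" && NF >= 3 { print $1, $NF, node }
    '
}
```

Because awk's default field splitting treats runs of tabs and spaces alike, the filter gives the same result whichever separator the report uses. For example, `/root/mon_crm_mon | parse_crm_mon` would list `arv_rsc FAILED sphynx` among its lines for the test report in this ticket.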
