https://dev-eole.ac-dijon.fr/
https://dev-eole.ac-dijon.fr/favicon.ico
2020-08-28T14:44:23Z
Ensemble Ouvert Libre Évolutif
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146303
2020-08-28T14:44:23Z
Benjamin Bohard
bbohard@cadoles.com
<ul></ul><p>By keeping the content escaped for as long as possible and calling the unquote method only at the last moment, once we know we are looking for the be1d file headers, the encoding can be determined more reliably.</p>
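<p>A minimal sketch of the idea, not the change that was actually committed: the escaped content is converted back to raw bytes with <code>unquote_to_bytes</code>, and the encoding is chosen only at that point. The helper name, the sample header and the strict UTF-8 test with an ISO-8859-1 fallback are assumptions made for illustration; the two encodings are the ones mentioned elsewhere in this issue.</p>

```python
from urllib.parse import quote, unquote_to_bytes


def unquote_with_detected_encoding(quoted_text):
    """Return (text, encoding) for percent-encoded content.

    Hypothetical helper: the escaped content is turned back into raw bytes
    first, and the encoding is only decided at that point.
    """
    raw = unquote_to_bytes(quoted_text)
    try:
        # ISO-8859-1 bytes such as 0xE8 ("è") followed by an ASCII letter
        # are not valid UTF-8, so strict decoding rejects them.
        return raw.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        # ISO-8859-1 maps every byte to a character, so this cannot fail.
        return raw.decode("iso-8859-1"), "iso-8859-1"


# Made-up header line, escaped once per candidate encoding.
utf8_sample = quote("Nom élève;Prénom élève", encoding="utf-8")
latin1_sample = quote("Nom élève;Prénom élève", encoding="iso-8859-1")
```

<p>Because the fallback accepts any byte sequence, this sketch never reports an unknown encoding; a stricter variant would have to validate the decoded headers.</p>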
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146304
2020-08-28T14:44:55Z
Benjamin Bohard
bbohard@cadoles.com
<ul><li><strong>Status</strong> changed from <i>New</i> to <i>In progress</i></li></ul>
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146325
2020-08-31T12:00:28Z
Joël Cuissinat
joel.cuissinat@ac-dijon.fr
<ul></ul><p>That's better, but the test still fails:<br /><a class="external" href="https://dev-eole.ac-dijon.fr/jenkins/job/2.8.0/job/test-importation-acascribe-special-2.8.0-amd64/13/console">https://dev-eole.ac-dijon.fr/jenkins/job/2.8.0/job/test-importation-acascribe-special-2.8.0-amd64/13/console</a><br /><pre>
INFO:scribe.importation:## Lecture des élèves... ##
ERROR:scribe.importation:local variable 'encoding' referenced before assignment
AUTOMATE : Traceback dans la sortie console!
Traceback (most recent call last):
  File "/usr/share/ead2/backend/bin/importation.py", line 257, in do_parse_be1d
    be1d.parse_be1d_eleves(self.store, eleve)
  File "/usr/lib/python3/dist-packages/scribe/parsing/be1d.py", line 49, in parse_be1d_eleves
    unquoted_csv = unquote(quoted_csv, encoding=encoding)
UnboundLocalError: local variable 'encoding' referenced before assignment
</pre></p>
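<p>For readers unfamiliar with this error, here is a generic reproduction. It is not the code that was at line 49 of <code>be1d.py</code> during that run, which is not shown in this issue. The function names are hypothetical and the marker strings are borrowed from the diff quoted later in this thread. The error appears whenever no branch assigns the variable before it is read; raising explicitly in the fallback case avoids it.</p>

```python
def pick_encoding_unsafe(marker):
    # No branch assigns `encoding` when the marker is unexpected,
    # so the return statement reads an unbound local variable.
    if marker == "%C3%A8":
        encoding = "utf-8"
    elif marker == "%E8":
        encoding = "iso-8859-1"
    return encoding


def pick_encoding(marker):
    # Every path either returns a value or raises an explicit error.
    if marker == "%C3%A8":
        return "utf-8"
    if marker == "%E8":
        return "iso-8859-1"
    raise ValueError(f"unrecognised encoding marker: {marker!r}")
```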
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146365
2020-09-02T08:40:55Z
Daniel Dehennin
<ul></ul><p>I think using <a href="https://github.com/chardet/chardet" class="external">chardet</a> could make the code neater and, as a result, handle more cases:</p>
<pre><code class="diff syntaxhl"><span class="CodeRay"><span class="line comment">diff --git a/scribe/parsing/be1d.py b/scribe/parsing/be1d.py</span>
<span class="line comment">index d0561b1..2ddf9aa 100644</span>
<span class="line head"><span class="head">--- </span><span class="filename">a/scribe/parsing/be1d.py</span></span>
<span class="line head"><span class="head">+++ </span><span class="filename">b/scribe/parsing/be1d.py</span></span>
<span class="line change"><span class="change">@@</span> -16,6 +16,8 <span class="change">@@</span></span>
<span class="keyword">from</span> <span class="include">csv</span> <span class="keyword">import</span> <span class="include">DictReader</span>, <span class="include">reader</span>
<span class="keyword">from</span> <span class="include">urllib.parse</span> <span class="keyword">import</span> <span class="include">unquote</span>
<span class="keyword">from</span> <span class="include">sqlalchemy</span> <span class="keyword">import</span> <span class="include">and_</span>
<span class="line insert"><span class="insert">+</span><span class="keyword">from</span> <span class="include">chardet</span> <span class="keyword">import</span> <span class="include">detect</span></span>
<span class="line insert"><span class="insert">+</span></span>
<span class="keyword">from</span> <span class="include">scribe.eoletools</span> <span class="keyword">import</span> <span class="include">replace_cars</span>, <span class="include">convert_file</span>, <span class="include">formate_date</span>, <span class="error">\</span>
formate_civilite, replace_more_cars
<span class="keyword">from</span> <span class="include">scribe.importation</span> <span class="keyword">import</span> <span class="include">log</span>
<span class="change"><span class="change">@@</span> -34,23 +36,22 <span class="change">@@</span></span> <span class="keyword">def</span> <span class="function">parse_be1d_eleves</span>(store, csvfile):
<span class="docstring"><span class="delimiter">"""</span><span class="content"> </span></span>
num = <span class="integer">0</span>
log.infolog(<span class="string"><span class="delimiter">"</span><span class="content">Lecture des élèves...</span><span class="delimiter">"</span></span>, title=<span class="predefined-constant">True</span>)
<span class="line delete"><span class="delete">-</span> <span class="comment"># fichier en UTF-8 ?</span></span>
<span class="line delete"><span class="delete">-</span> <span class="comment"># le caractère è est encodé avec %C3%A9 en utf-8, %E8 en iso-8859-1</span></span>
<span class="line delete"><span class="delete">-</span> <span class="comment"># ce qui se trouve entre le premier l et le premier v permet de déduire l’encodage.</span></span>
<span class="line delete"><span class="delete">-</span> <span class="keyword">with</span> <span class="predefined">open</span>(csvfile, <span class="string"><span class="delimiter">'</span><span class="content">r</span><span class="delimiter">'</span></span>) <span class="keyword">as</span> csvstream:</span>
<span class="line delete"><span class="delete">-</span> quoted_csv = csvstream.read()</span>
<span class="line delete"><span class="delete">-</span> first_l_index = quoted_csv.index(<span class="string"><span class="delimiter">'</span><span class="content">l</span><span class="delimiter">'</span></span>)</span>
<span class="line delete"><span class="delete">-</span> first_v_index = quoted_csv.index(<span class="string"><span class="delimiter">'</span><span class="content">v</span><span class="delimiter">'</span></span>)</span>
<span class="line delete"><span class="delete">-</span> marker = quoted_csv[first_l_index + <span class="integer">1</span>: first_v_index].upper()</span>
<span class="line delete"><span class="delete">-</span> <span class="keyword">if</span> marker == <span class="string"><span class="delimiter">'</span><span class="content">%C3%A8</span><span class="delimiter">'</span></span>:</span>
<span class="line delete"><span class="delete">-</span> encoding = <span class="string"><span class="delimiter">'</span><span class="content">utf-8</span><span class="delimiter">'</span></span></span>
<span class="line delete"><span class="delete">-</span> <span class="keyword">elif</span> marker == <span class="string"><span class="delimiter">'</span><span class="content">%E8</span><span class="delimiter">'</span></span>:</span>
<span class="line delete"><span class="delete">-</span> encoding = <span class="string"><span class="delimiter">'</span><span class="content">iso-8859-1</span><span class="delimiter">'</span></span></span>
<span class="line delete"><span class="delete">-</span> <span class="keyword">else</span>:</span>
<span class="line delete"><span class="delete">-</span> <span class="keyword">raise</span> <span class="exception">Exception</span>(<span class="string"><span class="delimiter">"</span><span class="content">L’encodage du fichier source n’a pas pu être déterminé</span><span class="delimiter">"</span></span>)</span>
<span class="line delete"><span class="delete">-</span> unquoted_csv = unquote(quoted_csv, encoding=encoding)</span>
<span class="line insert"><span class="insert">+</span></span>
<span class="line insert"><span class="insert">+</span> <span class="comment"># Detect file encoding</span></span>
<span class="line insert"><span class="insert">+</span> <span class="keyword">with</span> <span class="predefined">open</span>(csvfile, <span class="string"><span class="delimiter">'</span><span class="content">rb</span><span class="delimiter">'</span></span>) <span class="keyword">as</span> csvstream:</span>
<span class="line insert"><span class="insert">+</span> sample = csvstream.read(<span class="integer">4096</span>)</span>
<span class="line insert"><span class="insert">+</span> encoding_detection = detect(sample)</span>
<span class="line insert"><span class="insert">+</span> <span class="keyword">if</span> encoding_detection[<span class="string"><span class="delimiter">'</span><span class="content">confidence</span><span class="delimiter">'</span></span>] < <span class="float">0.5</span>:</span>
<span class="line insert"><span class="insert">+</span> <span class="keyword">raise</span> <span class="exception">Exception</span>(f<span class="string"><span class="delimiter">"</span><span class="content">L’encodage du fichier source n’a pas pu être déterminé</span><span class="delimiter">"</span></span>)</span>
<span class="line insert"><span class="insert">+</span></span>
<span class="line insert"><span class="insert">+</span> <span class="keyword">with</span> <span class="predefined">open</span>(csvfile, <span class="string"><span class="delimiter">'</span><span class="content">r</span><span class="delimiter">'</span></span>, encoding=encoding_detection[<span class="string"><span class="delimiter">'</span><span class="content">encoding</span><span class="delimiter">'</span></span>]) <span class="keyword">as</span> csvstream:</span>
<span class="line insert"><span class="insert">+</span> quoted_csv = csvstream.read()</span>
<span class="line insert"><span class="insert">+</span></span>
<span class="line insert"><span class="insert">+</span> unquoted_csv = unquote(quoted_csv, encoding=encoding_detection[<span class="string"><span class="delimiter">'</span><span class="content">encoding</span><span class="delimiter">'</span></span>])</span>
<span class="line insert"><span class="insert">+</span></span>
<span class="keyword">with</span> <span class="predefined">open</span>(csvfile, <span class="string"><span class="delimiter">'</span><span class="content">w</span><span class="delimiter">'</span></span>) <span class="keyword">as</span> csvstream:
csvstream.write(unquoted_csv)
<span class="line insert"><span class="insert">+</span></span>
source = <span class="string"><span class="delimiter">'</span><span class="content">onde</span><span class="delimiter">'</span></span>
<span class="comment"># champs obligatoires</span>
onde_fields = [<span class="string"><span class="delimiter">'</span><span class="content">Nom élève</span><span class="delimiter">'</span></span>, <span class="string"><span class="delimiter">'</span><span class="content">Prénom élève</span><span class="delimiter">'</span></span>, <span class="string"><span class="delimiter">'</span><span class="content">Date naissance</span><span class="delimiter">'</span></span>, <span class="string"><span class="delimiter">'</span><span class="content">Sexe</span><span class="delimiter">'</span></span>, <span class="string"><span class="delimiter">'</span><span class="content">INE</span><span class="delimiter">'</span></span>,
</span></code></pre>
<p>The issue is still in progress; may I push my change?</p>
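<p>One point that may be worth checking before relying on byte-level detection: the output of <code>urllib.parse.quote</code> is ASCII-only, so if the file on disk is still percent-encoded when it is sampled, a detector reading its raw bytes would only see ASCII. The sketch below uses a made-up sample line and standard-library calls only. It suggests that the encoding information appears once the content has gone through <code>unquote_to_bytes</code>, so a detector would presumably need to be fed those bytes. This is an assumption that has not been checked against real be1d exports.</p>

```python
from urllib.parse import quote, unquote_to_bytes

# Made-up sample line, escaped once per candidate encoding.
original = "Nom élève;Prénom élève\n"
quoted_utf8 = quote(original, encoding="utf-8")
quoted_latin1 = quote(original, encoding="iso-8859-1")

# Both escaped forms contain ASCII characters only.
both_ascii = quoted_utf8.isascii() and quoted_latin1.isascii()

# The difference between the encodings is visible in the unescaped bytes.
raw_utf8 = unquote_to_bytes(quoted_utf8)
raw_latin1 = unquote_to_bytes(quoted_latin1)
```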
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146397
2020-09-03T11:54:26Z
Daniel Dehennin
<ul></ul><p>This encoding is done by our own code in the EAD, to pass the content from the web service to the EAD backend over XMLRPC.</p>
<p>In the end I replaced <strong><code>quote/unquote</code></strong> from <strong><code>urllib</code></strong>, which is really meant for URL encoding, with <strong><code>base64</code></strong> encoding.</p>
<p>All of this happens inside the EAD and should not affect the Scribe import routines.</p>
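<p>A minimal sketch of a base64 round trip, assuming the file content is handled as raw bytes on both sides. The helper names and sample payloads are hypothetical and the actual EAD code is not shown in this issue. The point illustrated is that base64 returns the original bytes unchanged, so the transport step does not need to know or guess the text encoding.</p>

```python
import base64


def encode_for_transport(raw):
    """Encode raw file bytes as ASCII text (hypothetical helper name)."""
    return base64.b64encode(raw).decode("ascii")


def decode_from_transport(text):
    """Recover the exact original bytes on the receiving side."""
    return base64.b64decode(text)


# The same made-up text in two encodings: different bytes, same handling.
payload_latin1 = "Nom élève".encode("iso-8859-1")
payload_utf8 = "Nom élève".encode("utf-8")
```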
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146482
2020-09-08T08:34:45Z
Joël Cuissinat
joel.cuissinat@ac-dijon.fr
<ul><li><strong>Assignee</strong> set to <i>Daniel Dehennin</i></li></ul>
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146484
2020-09-08T08:34:47Z
Joël Cuissinat
joel.cuissinat@ac-dijon.fr
<ul><li><strong>Status</strong> changed from <i>In progress</i> to <i>Resolved</i></li></ul>
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146486
2020-09-08T08:58:30Z
Joël Cuissinat
joel.cuissinat@ac-dijon.fr
<ul><li><strong>Status</strong> changed from <i>Resolved</i> to <i>Closed</i></li><li><strong>Remaining (hours)</strong> set to <i>0.0</i></li></ul><p>The Jenkins test still crashes, but further along :o (<a class="issue tracker-6 status-5 priority-4 priority-default closed child" title="Task: There is still an error in the ONDE/Be1D import (Closed)" href="https://dev-eole.ac-dijon.fr/issues/30584">#30584</a>)<br />I tested a file import through the EAD and it completed without errors.</p>
Distribution EOLE - Task #30534: "Decoding" error on the content of the file uploaded for the be1d import
https://dev-eole.ac-dijon.fr/issues/30534?journal_id=146490
2020-09-08T09:03:23Z
Joël Cuissinat
joel.cuissinat@ac-dijon.fr
<ul><li><strong>% Done</strong> changed from <i>0</i> to <i>100</i></li><li><strong>Estimated time</strong> set to <i>0.00 h</i></li></ul>