View Issue Details

ID: 0007390
Project: Taler
Category: wallet (TS core)
View Status: public
Last Update: 2022-11-04 20:52
Reporter: ttn
Assigned To: Christian Grothoff
Priority: normal
Severity: minor
Reproducibility: have not tried
Status: closed
Resolution: fixed
Product Version: git (master)
Target Version: 0.9
Fixed in Version: 0.9
Summary: 0007390: CS 4.1.1 (LEVEL A) -- Parsing
Description: In content implemented using markup languages, elements have complete start
and end tags, elements are nested according to their specifications, elements do
not contain duplicate attributes, and any IDs are unique, except where the
specifications allow these features.

The following page has an HTML attribute and an element whose start or
end tags do not comply with the HTML specification:
https://shop.demo.taler.net/en/ (See the extra quote in the
attribute name at line 203, column 114, and the end tag with no
open element, from line 1941, column 133 to line 1941, column 136.)
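
The two kinds of error reported here (an end tag with no open element, and malformed or duplicate attributes) can be roughly illustrated with Python's standard-library `html.parser`. This is only a minimal sketch: `ParseChecker` and `check` are hypothetical names, not part of the Taler code base, and the check ignores HTML's implied end tags, so it is no substitute for a real validator.

```python
from html.parser import HTMLParser

# Elements that never take an end tag in HTML.
VOID_ELEMENTS = {"area", "base", "br", "col", "embed", "hr", "img",
                 "input", "link", "meta", "source", "track", "wbr"}


class ParseChecker(HTMLParser):
    """Collects stray end tags and duplicate attributes (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.open_tags = []   # stack of currently open element names
        self.problems = []    # (line, column, message) tuples

    def _report(self, message):
        # getpos() gives the position of the tag currently being handled.
        line, column = self.getpos()
        self.problems.append((line, column, message))

    def handle_starttag(self, tag, attrs):
        names = [name for name, _ in attrs]
        for name in sorted(set(names)):
            if names.count(name) > 1:
                self._report(f"duplicate attribute {name!r} on <{tag}>")
        if tag not in VOID_ELEMENTS:
            self.open_tags.append(tag)

    def handle_endtag(self, tag):
        if tag in VOID_ELEMENTS:
            return
        if tag not in self.open_tags:
            self._report(f"end tag </{tag}> with no open element")
            return
        # Close everything up to and including the matching open element.
        while self.open_tags.pop() != tag:
            pass


def check(markup):
    """Return the list of problems found in the given markup string."""
    checker = ParseChecker()
    checker.feed(markup)
    checker.close()
    return checker.problems
```

Calling `check(page_source)` on a downloaded page returns one tuple per problem, which can be compared against the line and column numbers the validator reports.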

Tags: No tags attached.

Activities

Christian Grothoff

2022-10-10 22:22

manager   ~0019214

I think I've fixed the specific parse errors, but we should double-check once the demo has been updated AND probably re-check "all" pages.

sebasjm

2022-10-19 06:31

developer   ~0019236

I'm using https://validator.w3.org/ to check

The blog has a problem with some articles, in particular with "Only the Free World Can Stand Up to Microsoft" but it may happen with others:

The Python script reads the HTML article and tries to get the subtitle from a paragraph.
The first paragraph of this article contains an email address written between "less than" and "greater than" signs; the text is pasted verbatim and sent to the browser, which then treats the email address as an invalid HTML tag.
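
To make the failure concrete, here is a minimal sketch of the problem and of one common remedy, escaping the extracted text before embedding it in a page. The function names and the sample address are assumptions for illustration only; this is not the actual code of the blog's Python script.

```python
import html


def render_subtitle_unsafe(text: str) -> str:
    # Pastes the extracted text verbatim: "<user@example.org>" ends up
    # in the output as if it were an HTML tag.
    return f"<p>{text}</p>"


def render_subtitle(text: str) -> str:
    # Escapes &, <, > and quotes so the text is shown literally.
    return f"<p>{html.escape(text)}</p>"


# Hypothetical subtitle text as extracted from an article paragraph.
subtitle = "Write to <user@example.org> for details"

print(render_subtitle_unsafe(subtitle))
print(render_subtitle(subtitle))
```

The first call emits markup that a validator rejects, while the second emits `&lt;user@example.org&gt;`, which the browser displays as plain text.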

Christian Grothoff

2022-10-19 08:28

manager   ~0019244

I've modified the content.py extraction logic to go for the first li or p element in the body, which should get the proper teaser for the free-world-microsoft article. Furthermore, I've changed the code to use '.prettify()', which should mean it now outputs HTML instead of text with <>. Not yet tested, got to run ;-).

Christian Grothoff

2022-10-21 00:21

manager   ~0019262

Parses fine now.

Issue History

Date Modified Username Field Change
2022-10-10 21:55 ttn New Issue
2022-10-10 21:55 ttn Status new => assigned
2022-10-10 21:55 ttn Assigned To => sebasjm
2022-10-10 22:22 Christian Grothoff Note Added: 0019214
2022-10-19 06:31 sebasjm Note Added: 0019236
2022-10-19 08:04 sebasjm Assigned To sebasjm => Christian Grothoff
2022-10-19 08:28 Christian Grothoff Note Added: 0019244
2022-10-20 11:39 Christian Grothoff Target Version git (master) => 0.9
2022-10-21 00:21 Christian Grothoff Status assigned => resolved
2022-10-21 00:21 Christian Grothoff Resolution open => fixed
2022-10-21 00:21 Christian Grothoff Fixed in Version => 0.9
2022-10-21 00:21 Christian Grothoff Note Added: 0019262
2022-11-04 20:52 Christian Grothoff Status resolved => closed