View Issue Details
| ID | Project | Category | View Status | Date Submitted | Last Update |
|---|---|---|---|---|---|
| 0010570 | Taler | Web site(s) | public | 2025-11-09 23:19 | 2025-11-09 23:24 |
| Reporter | hga3 | Assigned To | hga3 | ||
| Priority | normal | Severity | minor | Reproducibility | always |
| Status | assigned | Resolution | open | ||
| Product Version | git (master) | ||||
| Summary | 0010570: Site gen failed when parsing news in function cut_text | ||||
| Description | env "BASEURL=" ./inc/build-site Traceback (most recent call last): File "/home/user/www.taler.net/./inc/build-site", line 27, in <module> main() File "/home/user/www.taler.net/./inc/build-site", line 24, in main x.run() File "/home/user/www.taler.net/inc/sitegen/site.py", line 307, in run self.run_localized(locale, tr) File "/home/user/www.taler.net/inc/sitegen/site.py", line 225, in run_localized content = tmpl.render( ^^^^^^^^^^^^ File "/home/user/www.taler.net/.venv/lib/python3.12/site-packages/jinja2/environment.py", line 1295, in render self.environment.handle_exception() File "/home/user/www.taler.net/.venv/lib/python3.12/site-packages/jinja2/environment.py", line 942, in handle_exception raise rewrite_traceback_stack(source=source) File "/home/user/www.taler.net/template/rss.xml.j2", line 39, in top-level template code {{ get_abstract('news/' + newspostitem['page'], 1000) }} ^^^^^^^^^^^^^^^^ File "/home/user/www.taler.net/inc/sitegen/site.py", line 152, in get_abstract return cut_text(root / "template" / (name + ".j2"), length) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ File "/home/user/www.taler.net/inc/sitegen/site.py", line 62, in cut_text for i in soup.findAll("p")[1]: ~~~~~~~~~~~~~~~~~^^^ IndexError: list index out of range make: *** [Makefile:11: site] Error 1 | ||||
| Tags | No tags attached. | ||||
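
The traceback ends at `soup.findAll("p")[1]`, which raises `IndexError` whenever the parsed news template contains fewer than two `<p>` elements. The following is a minimal sketch of that failure mode, not the site generator's code: it uses the standard-library `html.parser` as a stand-in for BeautifulSoup, and `ParagraphCollector`, `collect_paragraphs`, `second_paragraph_unguarded`, and the sample markup are hypothetical names introduced only for illustration.

```python
from html.parser import HTMLParser


class ParagraphCollector(HTMLParser):
    """Collect the text of each <p> element (rough stand-in for soup.find_all("p"))."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_p = False  # True while the parser is inside a <p>

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
            self.paragraphs.append("")

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        if self._in_p:
            self.paragraphs[-1] += data


def collect_paragraphs(html):
    """Return the text of every <p> element found in the markup."""
    parser = ParagraphCollector()
    parser.feed(html)
    parser.close()
    return parser.paragraphs


def second_paragraph_unguarded(html):
    # Mirrors the pre-patch indexing: assumes at least two <p> elements exist.
    return collect_paragraphs(html)[1]


if __name__ == "__main__":
    try:
        second_paragraph_unguarded("<h1>Title</h1><p>Only one paragraph.</p>")
    except IndexError as exc:
        print("IndexError:", exc)
```

Any news post whose template has zero or one `<p>` element would take this path, which matches the "always" reproducibility reported above.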
Resolved with the attached patch. After successful local testing, the patch was applied and pushed upstream onto "c76de6d942874529eaba97bc3d065863bf1b27d1" of git://git.gnunet.org/www_shared.git.

sitegen.site.py.patch (1,248 bytes)
diff --git a/sitegen/site.py b/sitegen/site.py
index b7fbf6b..4aaddd3 100644
--- a/sitegen/site.py
+++ b/sitegen/site.py
@@ -58,13 +58,25 @@ def cut_text(filename, count):
     soup = BeautifulSoup(html, features="lxml")
     for script in soup(["script", "style"]):
         script.extract()
-    k = []
-    for i in soup.findAll("p")[1]:
-        k.append(i)
-    b = "".join(str(e) for e in k)
-    text = html2text(b.replace("\n", " "))
-    textreduced = (text[:count] + " [...]") if len(text) > count else (text)
-    return textreduced
+    paragraphs = soup.find_all("p")
+
+    # No <p> tags at all → return empty string
+    if not paragraphs:
+        return ""
+
+    # If only one <p>, use that one; otherwise use the second
+    target = paragraphs[1] if len(paragraphs) > 1 else paragraphs[0]
+
+    # Convert contents of the <p> to HTML string
+    b = "".join(str(e) for e in target.contents)
+
+    # Convert to text
+    text = html2text(b.replace("\n", " ")).strip()
+
+    # Truncate
+    if len(text) > count:
+        return text[:count] + " [...]"
+    return text
 
 
 def extract_body(text, content_id="newspost-content"):
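
The selection and truncation rules in the patch can be exercised in isolation. The sketch below assumes the paragraphs have already been extracted as plain strings. `pick_abstract` is a hypothetical helper, not a function in `sitegen/site.py`, and it collapses whitespace with `str.split` instead of calling `html2text`.

```python
def pick_abstract(paragraphs, count):
    """Return a truncated abstract from a list of paragraph strings.

    Hypothetical helper: follows the patch's fallback order
    (second paragraph, else first, else empty string).
    """
    # No paragraphs at all: nothing to summarize.
    if not paragraphs:
        return ""

    # Prefer the second paragraph; fall back to the only one available.
    target = paragraphs[1] if len(paragraphs) > 1 else paragraphs[0]

    # Collapse newlines and runs of whitespace (stand-in for html2text + strip).
    text = " ".join(target.split())

    # Truncate and mark the cut, as the patch does.
    if len(text) > count:
        return text[:count] + " [...]"
    return text


if __name__ == "__main__":
    print(repr(pick_abstract([], 1000)))
    print(repr(pick_abstract(["Only paragraph."], 1000)))
    print(repr(pick_abstract(["Lead.", "Second\nparagraph text."], 10)))
```

Returning an empty string for a post without any `<p>` keeps the RSS template rendering instead of aborting the whole site build. Whether an empty abstract is the desired feed output is a separate question for the template.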