fix: NullPointerException was raised on some data URLs
A NullPointerException was raised on `data` URLs ending with a query-like
string.
This was caused by two compounding bugs in Galimatias:
- removing the query-like string (as we do when registering references)
  with `URL#withQuery(null)` resulted in a hierarchical URL, even though
  `data` URLs are non-hierarchical.
- such hybrid `data` URLs caused an NPE to be raised when canonicalized.

We now check that a URL is hierarchical before removing its query
component, so non-hierarchical URLs are left untouched.

Also, `OCFContainer#isRemote(URL)` now returns false without further checks
for `data` URLs.
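
The guard described above can be sketched with the JDK alone. This is only an illustration, not the code in this commit: it uses `java.net.URI` as a stand-in for the Galimatias `URL` class, treats `URI#isOpaque()` as a rough counterpart of "non-hierarchical", and the `DataUrlQueryGuard` class and `stripQuery` helper are hypothetical names.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical sketch: java.net.URI stands in for the Galimatias URL class.
final class DataUrlQueryGuard {

    // Removes the query component, but only from hierarchical URIs.
    // Opaque URIs (such as data: URIs) are returned unchanged, which mirrors
    // the "do not touch non-hierarchical URLs" check added by this commit.
    public static URI stripQuery(URI uri) {
        if (uri.isOpaque() || uri.getRawQuery() == null) {
            return uri;
        }
        try {
            // Rebuild the URI from its components, passing null as the query.
            return new URI(uri.getScheme(), uri.getAuthority(), uri.getPath(),
                    null, uri.getFragment());
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Cannot rebuild URI: " + uri, e);
        }
    }

    public static void main(String[] args) {
        // Same data URL as in the test fixture: the "??aaa" tail is left alone.
        System.out.println(stripQuery(URI.create("data:image/gif;base64,XXX??aaa")));
        // A local, hierarchical reference loses its query.
        System.out.println(stripQuery(URI.create("content_001.xhtml?x=1")));
    }
}
```

Returning the input untouched for opaque URIs means the sketch never has to reinterpret a `data` URL as a path plus query, which is the step that went wrong in the bug described above.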

Fixes #1536
rdeltour committed Dec 23, 2024
1 parent d60da30 commit c50536d
Showing 8 changed files with 61 additions and 2 deletions.
2 changes: 1 addition & 1 deletion src/main/java/com/adobe/epubcheck/ocf/OCFContainer.java
@@ -131,7 +131,7 @@ public String relativize(URL url)
   public boolean isRemote(URL url)
   {
     Preconditions.checkArgument(url != null, "URL is null");
-    if (contains(url))
+    if (!url.isHierarchical() || contains(url))
     {
       return false;
     }
@@ -49,7 +49,10 @@ public void registerReference(URL url, Type type, EPUBLocation location,
     if (url == null) return;

     // Remove query component of local URLs
-    if (url.query() != null && !container.isRemote(url))
+    // Note: we only do this for hierarchical URLs, to work around a bug
+    // in Galimatias that would transform a non-hierarchical URL into a
+    // hierarchical one. Queries for data URLs can safely be ignored here.
+    if (url.isHierarchical() && url.query() != null && !container.isRemote(url))
     {
       try
       {
@@ -0,0 +1,13 @@
<!DOCTYPE html>
<html xmlns:epub="http://www.idpf.org/2007/ops"
xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta charset="utf-8" />
<title>Minimal EPUB</title>
</head>
<body>
<h1>Loomings</h1>
<p>Call me Ishmael.</p>
<img src="data:image/gif;base64,XXX??aaa" alt="" />
</body>
</html>
@@ -0,0 +1,14 @@
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops" xml:lang="en" lang="en">
<head>
<meta charset="utf-8"/>
<title>Minimal Nav</title>
</head>
<body>
<nav epub:type="toc">
<ol>
<li><a href="content_001.xhtml">content 001</a></li>
</ol>
</nav>
</body>
</html>
@@ -0,0 +1,16 @@
<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="3.0" xml:lang="en" unique-identifier="q">
<metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
<dc:title id="title">Minimal EPUB 3.0</dc:title>
<dc:language>en</dc:language>
<dc:identifier id="q">NOID</dc:identifier>
<meta property="dcterms:modified">2017-06-14T00:00:01Z</meta>
</metadata>
<manifest>
<item id="content_001" href="content_001.xhtml" media-type="application/xhtml+xml"/>
<item id="nav" href="nav.xhtml" media-type="application/xhtml+xml" properties="nav"/>
</manifest>
<spine>
<itemref idref="content_001" />
</spine>
</package>
@@ -0,0 +1,6 @@
<?xml version="1.0" encoding="UTF-8" ?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
<rootfiles>
<rootfile full-path="EPUB/package.opf" media-type="application/oebps-package+xml"/>
</rootfiles>
</container>
@@ -0,0 +1 @@
application/epub+zip
6 changes: 6 additions & 0 deletions src/test/resources/epub3/03-resources/resources.feature
@@ -549,6 +549,12 @@
Then error RSC-032 is reported
And no other errors or warnings are reported

@spec @xref:sec-data-urls
Scenario: Verify a data URL having an unescaped query-like component
See https://github.com/w3c/epubcheck/issues/1536
When checking EPUB 'data-url-with-unescaped-query-valid'
Then no other errors or warnings are reported

## 3.8 File URLs

@spec @xref:sec-file-urls