commit 134a1ac3372fe1eae6bc5c6acd12666c17e82696
parent 6a7229149f03a54d7d63241c4cbc1c83aa9831f0
Author: Hiltjo Posthuma <hiltjo@codemadness.org>
Date: Sat, 31 Oct 2020 19:51:17 +0100
sfeed_web: improve parsing a <link> if it has no type attribute
This happens because the link type of the previous <link> is not reset when a
new <link> tag starts; it is only reset when a type attribute starts.
Found on the Spanish newspaper site: elpais.com
Input:
<link rel="alternate" href="https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada" type="application/rss+xml" title="RSS de la portada de El País"/>
<link rel="canonical" href="https://elpais.com"/>
Would print (second line is incorrect):
https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml
https://elpais.com/ application/rss+xml
Now prints:
https://feeds.elpais.com/mrss-s/pages/ep/site/elpais.com/portada application/rss+xml
Fix: also reset the link type at the start of a <link> tag (for <base href />
resetting is still not wanted).
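
The stale-state problem described above can be reproduced with a minimal sketch. This is not the sfeed_web source: tag_start(), attr(), tag_end() and discover() are hypothetical stand-ins for the parser callbacks, the URLs are placeholders, and the rule "print when both href and type are set" is an assumption made only for illustration. The reset_type flag toggles the one-line fix.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <strings.h> /* strcasecmp (POSIX) */

/* Hypothetical, simplified parser state. The variable names mirror the
 * diff, but this is an illustration, not the sfeed_web code. */
static char linkhref[256], linktype[64];
static int islinktag;

/* Called when a tag opens. reset_type != 0 applies the fix. */
static void
tag_start(const char *t, int reset_type)
{
	islinktag = 0;
	if (!strcasecmp(t, "link")) {
		islinktag = 1;
		linkhref[0] = '\0';
		if (reset_type)
			linktype[0] = '\0'; /* forget the previous <link>'s type */
	}
}

/* Called once per attribute of the current tag. */
static void
attr(const char *name, const char *value)
{
	if (!islinktag)
		return;
	if (!strcasecmp(name, "href"))
		snprintf(linkhref, sizeof(linkhref), "%s", value);
	else if (!strcasecmp(name, "type"))
		snprintf(linktype, sizeof(linktype), "%s", value);
}

/* Called when the tag is complete: append "href type\n" to out
 * (assumed rule: only when both values are non-empty). */
static void
tag_end(char *out, size_t outsz)
{
	size_t len = strlen(out);

	if (islinktag && linkhref[0] && linktype[0])
		snprintf(out + len, outsz - len, "%s %s\n", linkhref, linktype);
}

/* Feed two <link> tags through the callbacks: one feed link with a
 * type attribute, then a canonical link without one. */
static void
discover(int reset_type, char *out, size_t outsz)
{
	assert(out != NULL && outsz > 0);
	out[0] = '\0';
	linkhref[0] = linktype[0] = '\0';

	tag_start("link", reset_type);
	attr("rel", "alternate");
	attr("href", "https://example.org/feed");
	attr("type", "application/rss+xml");
	tag_end(out, outsz);

	tag_start("link", reset_type);
	attr("rel", "canonical");
	attr("href", "https://example.org/");
	tag_end(out, outsz);
}
```

Calling discover(0, buf, sizeof(buf)) leaves both links in buf, the second one carrying the stale application/rss+xml type; discover(1, buf, sizeof(buf)) leaves only the feed link.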
Diffstat:
1 file changed, 1 insertion(+), 0 deletions(-)
diff --git a/sfeed_web.c b/sfeed_web.c
@@ -32,6 +32,7 @@ xmltagstart(XMLParser *p, const char *t, size_t tl)
 	} else if (!strcasecmp(t, "link")) {
 		islinktag = 1;
 		linkhref[0] = '\0';
+		linktype[0] = '\0';
 	}
 }