Thanks for the quick reply, James, but doesn't your approach imply that the tokenisation into sentences has already been done? I'm trying to avoid a two-pass solution, as I expect to be doing this hundreds of times.
reluctantly using Outlook for Android
From: James Cummings <james@blushingbunny.net>
Sent: Monday, November 5, 2018 1:10:02 PM
To: Lou Burnard
Cc: oxygen-user@oxygenxml.com
Subject: Re: [oXygen-user] an xslt challenge
Hi Lou,
Would it make sense to use xsl:for-each-group to group the sentences into <s> units to make this easier? Then I'd probably recursively call a template or function, passing the current collection of <s> units as an item()* variable and testing whether its tokenised word count is above or below $maxWords.
Not got time to write that out as a solution atm, and I'm sure it can be done without the recursion as well, but that is the approach that would have occurred to me at least.
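A rough sketch of that recursive idea, working on plain strings from tokenize() rather than on <s> elements, might look something like the following. It assumes XSLT 2.0, that the t: prefix is bound to the TEI namespace, and that $maxWords is a stylesheet parameter. The function name f:take, its namespace URI, and the default limit of 50 are made up for illustration.

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    xmlns:f="urn:example:local-functions"
    exclude-result-prefixes="xs t f">

  <!-- Assumed parameter: the word limit (default chosen arbitrarily) -->
  <xsl:param name="maxWords" as="xs:integer" select="50"/>

  <!-- Hypothetical helper: returns the leading sentences whose combined
       word count stays below $budget -->
  <xsl:function name="f:take" as="xs:string*">
    <xsl:param name="sentences" as="xs:string*"/>
    <xsl:param name="budget" as="xs:integer"/>
    <xsl:variable name="head" select="$sentences[1]"/>
    <!-- Words in the first sentence; 0 when the sequence is empty -->
    <xsl:variable name="n"
        select="count(tokenize(normalize-space($head), ' '))"/>
    <xsl:if test="exists($head) and $n lt $budget">
      <!-- Keep this sentence, then recurse on the rest with a smaller budget -->
      <xsl:sequence
          select="$head, f:take(subsequence($sentences, 2), $budget - $n)"/>
    </xsl:if>
  </xsl:function>

  <xsl:template match="t:p">
    <!-- Tokenise once and hand the whole sequence to the recursive helper -->
    <xsl:for-each
        select="f:take(tokenize(normalize-space(.), '\.\s'), $maxWords)">
      <s n="{position()}"><xsl:value-of select="."/></s>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

The paragraph is tokenised only once, inside the template, so no separate pre-pass over the document is needed. Note that tokenize() discards the ". " separators, so every sentence except possibly the last comes out without its full stop.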
-James
On Mon, 5 Nov 2018 at 12:03, Lou Burnard <lou.burnard@retired.ox.ac.uk> wrote:
I hope I am not abusing this list in asking occasionally for advice on the best way to hack something in xslt.
Today's problem is to output only the first x sentences (strings terminated by a full stop) of a paragraph such that the total number of words (space-delimited strings) is less than some limit (call it $maxWords). Since the sentences are of variable length, obviously I don't know what x is.
Here's where I got to so far:
<xsl:template match="t:p">
  <xsl:variable name="pString">
    <xsl:value-of select="."/>
  </xsl:variable>
  <xsl:for-each select="tokenize($pString, '\.\s')">
    <xsl:variable name="seq">
      <xsl:value-of select="string(position())"/>
    </xsl:variable>
    <xsl:variable name="wordsSoFar">
      <xsl:value-of select="string-length(translate(normalize-space(preceding-sibling::text()), ' ', '')) + 1"/>
    </xsl:variable>
    <xsl:if test="$wordsSoFar &lt; $maxWords">
      <s n="{$seq}">
        <xsl:value-of select="."/>
      </s>
    </xsl:if>
  </xsl:for-each>
</xsl:template>
But this is not valid because preceding-sibling:: wants a node() not a string (even though "text()" *is* a node imho).
Am I going about this entirely the wrong way?
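On the error itself: inside an xsl:for-each over the result of tokenize(), the context item is an atomic string rather than a node, so axes such as preceding-sibling:: have nothing to navigate. One possible way round it, offered as a sketch rather than a definitive answer, is to hold the tokens in a variable and count the words in subsequence($sentences, 1, position()). This assumes XSLT 2.0, that t: is bound to the TEI namespace, and that $maxWords is declared as a stylesheet parameter (the default of 50 is invented).

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:t="http://www.tei-c.org/ns/1.0"
    exclude-result-prefixes="xs t">

  <!-- Assumed parameter: the word limit (default chosen arbitrarily) -->
  <xsl:param name="maxWords" as="xs:integer" select="50"/>

  <xsl:template match="t:p">
    <!-- Tokenise the paragraph once and keep the sentences as a sequence -->
    <xsl:variable name="sentences" as="xs:string*"
        select="tokenize(normalize-space(.), '\.\s')"/>
    <xsl:for-each select="$sentences">
      <xsl:variable name="seq" select="position()"/>
      <!-- Words in sentences 1..$seq, i.e. including the current one -->
      <xsl:variable name="wordsSoFar" as="xs:integer"
          select="count(tokenize(normalize-space(
                    string-join(subsequence($sentences, 1, $seq), ' ')), ' '))"/>
      <!-- The running total only grows, so this keeps a leading run of sentences -->
      <xsl:if test="$wordsSoFar lt $maxWords">
        <s n="{$seq}"><xsl:value-of select="."/></s>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

This recounts the earlier sentences at every step, so the work grows quadratically with the number of sentences in a paragraph, which is unlikely to matter for paragraph-sized input. Using lt in the test also sidesteps having to escape the less-than sign inside the attribute.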
oXygen-user mailing list
oXygen-user@oxygenxml.com
https://www.oxygenxml.com/mailman/listinfo/oxygen-user