This is the mail archive of the xsl-list@mulberrytech.com mailing list .



RE: need help on stylesheet efficiency


Ouch. The title says it all.

<xsl:for-each select="/document/record/Flags[not(preceding::Flags=.)]">
  <xsl:variable name="Flags_1" select="."/>
  <xsl:if test="/document[record[Flags=$Flags_1]]">
    <xsl:for-each
select="/document/record/SPM_RegioId[not(preceding::SPM_RegioId=.)]">
      <xsl:variable name="SPM_RegioId_2" select="."/>
      <xsl:if
test="/document[record[Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]]">
        <xsl:for-each select="/document/record/SPM_DeviceId
                      [not(preceding::SPM_DeviceId=.)]">

and lots more of the same.

Your first for-each is processing every Flags element whose value has
not appeared in an earlier Flags element, that is, the first occurrence
of each distinct value. The first improvement you can make is to
change it to:

select="/document/record[not(Flags =
preceding-sibling::record/Flags)]/Flags">

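preceding-sibling::record only has to look at the earlier record
elements, whereas preceding:: has to look at every earlier node in the
document, so it involves a much shorter search.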

But you can do much better than this using keys. Look up Muenchian
grouping to see how you can select the distinct Flags values with the
help of a key.
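
As a rough sketch of what that looks like for the Flags values alone
(the key name flags-by-value and the distinct-flags wrapper element are
invented for this example, they are not from your stylesheet):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index every Flags element by its string value.
       The key name is hypothetical. -->
  <xsl:key name="flags-by-value" match="record/Flags" use="."/>

  <xsl:template match="/">
    <distinct-flags>
      <!-- Keep a Flags element only if it is the first node that the
           key returns for its own value (Muenchian grouping). -->
      <xsl:for-each select="/document/record/Flags
          [generate-id() = generate-id(key('flags-by-value', .)[1])]">
        <Flags><xsl:value-of select="."/></Flags>
      </xsl:for-each>
    </distinct-flags>
  </xsl:template>
</xsl:stylesheet>
```

The processor typically builds the key index once, so each lookup no
longer needs a scan of the preceding nodes.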

Now look at the xsl:if. This is saying "if the document contains a
record whose Flags value is equal to this one". Well of course it does,
but the poor old XSLT processor is having to do a lot of work to prove
it.
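
For illustration, a minimal sketch of the outer loop using the
preceding-sibling test and no xsl:if might look like this. The Flags
output element is only there so that the sketch produces something
visible; it is an assumption, not part of your stylesheet.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <document>
      <xsl:for-each select="/document/record
          [not(Flags = preceding-sibling::record/Flags)]/Flags">
        <xsl:variable name="Flags_1" select="."/>
        <!-- No xsl:if here: $Flags_1 was selected from a record, so a
             record with that Flags value necessarily exists. -->
        <Flags><xsl:value-of select="$Flags_1"/></Flags>
      </xsl:for-each>
    </document>
  </xsl:template>
</xsl:stylesheet>
```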

Now look at the second for-each. Remember that you are executing this
once for every distinct Flags value. This is saying "for every distinct
SPM_RegioId in the document..." As before, you can find these much more
efficiently using a key. But more to the point, you don't need to find
them afresh for each distinct Flags value, because you'll get the same
answer each time. Basically, you don't need nested for-each constructs
at all, because the select expression in each one is absolute rather
than relative.
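
If I read the stylesheet correctly, its net effect is to output one
record for each distinct combination of field values that actually
occurs in the document. Under that assumption, a single pass with a
composite key should do the same job. This is an untested sketch: the
key name record-by-fields and the '|' separator are my own choices (the
separator must not occur in the data), it assumes every record contains
exactly the 17 child elements shown in your sample, and the records
come out in document order of first occurrence, which may differ from
the order your nested loops produce.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- Index each record by the concatenation of all its field values.
       The key name and the '|' separator are hypothetical. -->
  <xsl:key name="record-by-fields" match="record"
      use="concat(Flags, '|', SPM_RegioId, '|', SPM_DeviceId, '|',
                  SUB_Instance, '|', SPM_SubId, '|', SPM_IspId, '|',
                  TimeStamp, '|', SPM_TRUNKID, '|', IFI_IPACKETS, '|',
                  IFI_OPACKETS, '|', IFI_IBYTES, '|', IFI_OBYTES, '|',
                  IFI_IQDROPS, '|', IFI_OQDROPS, '|', PKTS_DROP_ERR, '|',
                  MULTICAST_IN_PKTS, '|', MULTICAST_OUT_PKTS)"/>

  <xsl:template match="/">
    <document>
      <xsl:for-each select="/document/record">
        <!-- Must be the same expression as the key's use attribute. -->
        <xsl:variable name="fields"
            select="concat(Flags, '|', SPM_RegioId, '|', SPM_DeviceId, '|',
                    SUB_Instance, '|', SPM_SubId, '|', SPM_IspId, '|',
                    TimeStamp, '|', SPM_TRUNKID, '|', IFI_IPACKETS, '|',
                    IFI_OPACKETS, '|', IFI_IBYTES, '|', IFI_OBYTES, '|',
                    IFI_IQDROPS, '|', IFI_OQDROPS, '|', PKTS_DROP_ERR, '|',
                    MULTICAST_IN_PKTS, '|', MULTICAST_OUT_PKTS)"/>
        <!-- Output a record only if it is the first one with this
             combination of values. -->
        <xsl:if test="generate-id() =
                      generate-id(key('record-by-fields', $fields)[1])">
          <record>
            <xsl:copy-of select="*"/>
          </record>
        </xsl:if>
      </xsl:for-each>
    </document>
  </xsl:template>
</xsl:stylesheet>
```

Because the select expression is evaluated once rather than once per
enclosing value, the cost should grow roughly in line with the number
of records instead of as a high power of it.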

And so it goes on, to 17 levels of nesting. Since each level of for-each
costs O(n^2) with respect to the size of the document, and the xsl:if
adds another O(n), I think the final complexity is O(n^51) (17 levels
times 3), which I think must be some kind of record. It means that if
you double the size of the source document, processing will take about
2^51 times as long, which is roughly two million billion (2 x 10^15).
If Xalan finished after 3 minutes, it was doing remarkably well.

Michael Kay
Software AG
home: Michael.H.Kay@ntlworld.com
work: Michael.Kay@softwareag.com 

> -----Original Message-----
> From: owner-xsl-list@lists.mulberrytech.com 
> [mailto:owner-xsl-list@lists.mulberrytech.com] On Behalf Of 
> Malia Zaheer
> Sent: 25 July 2002 17:53
> To: XSL-List@lists.mulberrytech.com
> Subject: [xsl] need help on stylesheet efficiency
> 
> 
> Hi,
> 
> I have a stylesheet that I use to process large XML files, 
> larger than 1MB.  Using Xalan, it takes 3 minutes and 40 
> seconds to transform only 75KB of XML.  I was wondering 
> if people on this list can help me with improving the 
> efficiency of my stylesheet.  Here it is:
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
> xmlns:java="http://xml.apache.org/xslt/java"
> exclude-result-prefixes="java" version="1.0">
> <xsl:output indent="yes" method="xml"/>
> <xsl:template match="/">
> <xsl:element name="document">
> <xsl:call-template name="template_1"/>
> </xsl:element>
> </xsl:template>
> 
> <xsl:template name="template_1">
> <xsl:for-each 
> select="/document/record/Flags[not(preceding::Flags=.)]">
> <xsl:variable name="Flags_1" select="."/>
> <xsl:if test="/document[record[Flags=$Flags_1]]">
> <xsl:for-each 
> select="/document/record/SPM_RegioId[not(preceding::SPM_RegioId=.)]">
> <xsl:variable name="SPM_RegioId_2" select="."/>
> <xsl:if 
> test="/document[record[Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each
> select="/document/record/SPM_DeviceId[not(preceding::SPM_DeviceId=.)]">
> <xsl:variable name="SPM_DeviceId_3" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each
> select="/document/record/SUB_Instance[not(preceding::SUB_Instance=.)]">
> <xsl:variable name="SUB_Instance_4" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SUB_Instance=$SUB_Instance_4][SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each
> select="/document/record/SPM_SubId[not(preceding::SPM_SubId=.)]">
> <xsl:variable name="SPM_SubId_5" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5]
>   [SPM_RegioId=$SPM_RegioId_2]]">
> <xsl:for-each
> select="/document/record/SPM_IspId[not(preceding::SPM_IspId=.)]">
> <xsl:variable name="SPM_IspId_6" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5]
>   [SPM_RegioId=$SPM_RegioId_2][SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each
> select="/document/record/TimeStamp[not(preceding::TimeStamp=.)]">
> <xsl:variable name="TimeStamp_7" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SUB_Instance=$SUB_Instance_4][SPM_SubId=$SPM_SubId_5]
>   [SPM_RegioId=$SPM_RegioId_2][TimeStamp=$TimeStamp_7]
>   [SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each
> select="/document/record/SPM_TRUNKID[not(preceding::SPM_TRUNKID=.)]">
> <xsl:variable name="SPM_TRUNKID_8" select="."/>
> <xsl:if test="/document[record
>   [SPM_DeviceId=$SPM_DeviceId_3][Flags=$Flags_1]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_SubId=$SPM_SubId_5][SPM_RegioId=$SPM_RegioId_2]
>   [TimeStamp=$TimeStamp_7][SPM_IspId=$SPM_IspId_6]]">
> <xsl:for-each
> select="/document/record/IFI_IPACKETS[not(preceding::IFI_IPACKETS=.)]">
> <xsl:variable name="IFI_IPACKETS_9" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7]]">
> <xsl:for-each
> select="/document/record/IFI_OPACKETS[not(preceding::IFI_OPACKETS=.)]">
> <xsl:variable name="IFI_OPACKETS_10" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1]
>   [SPM_RegioId=$SPM_RegioId_2][SPM_TRUNKID=$SPM_TRUNKID_8]
>   [SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3]
>   [SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each
> select="/document/record/IFI_IBYTES[not(preceding::IFI_IBYTES=.)]">
> <xsl:variable name="IFI_IBYTES_11" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1]
>   [SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7]]">
> <xsl:for-each
> select="/document/record/IFI_OBYTES[not(preceding::IFI_OBYTES=.)]">
> <xsl:variable name="IFI_OBYTES_12" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10]
>   [Flags=$Flags_1][SPM_RegioId=$SPM_RegioId_2]
>   [IFI_IBYTES=$IFI_IBYTES_11][SPM_TRUNKID=$SPM_TRUNKID_8]
>   [SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3]
>   [SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]]">
> <xsl:for-each
> select="/document/record/IFI_IQDROPS[not(preceding::IFI_IQDROPS=.)]">
> <xsl:variable name="IFI_IQDROPS_13" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10]
>   [Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13]
>   [SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7]]">
> <xsl:for-each
> select="/document/record/IFI_OQDROPS[not(preceding::IFI_OQDROPS=.)]">
> <xsl:variable name="IFI_OQDROPS_14" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10]
>   [Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13]
>   [SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each
> select="/document/record/PKTS_DROP_ERR[not(preceding::PKTS_DROP_ERR=.)]">
> <xsl:variable name="PKTS_DROP_ERR_15" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [IFI_OBYTES=$IFI_OBYTES_12][IFI_OPACKETS=$IFI_OPACKETS_10]
>   [Flags=$Flags_1][IFI_IQDROPS=$IFI_IQDROPS_13]
>   [SPM_RegioId=$SPM_RegioId_2][IFI_IBYTES=$IFI_IBYTES_11]
>   [PKTS_DROP_ERR=$PKTS_DROP_ERR_15][SPM_TRUNKID=$SPM_TRUNKID_8]
>   [SUB_Instance=$SUB_Instance_4][SPM_DeviceId=$SPM_DeviceId_3]
>   [SPM_IspId=$SPM_IspId_6][TimeStamp=$TimeStamp_7]
>   [IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each
> select="/document/record/MULTICAST_IN_PKTS
>   [not(preceding::MULTICAST_IN_PKTS=.)]">
> <xsl:variable name="MULTICAST_IN_PKTS_16" select="."/>
> <xsl:if test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [MULTICAST_IN_PKTS=$MULTICAST_IN_PKTS_16][IFI_OBYTES=$IFI_OBYTES_12]
>   [IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1]
>   [IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2]
>   [IFI_IBYTES=$IFI_IBYTES_11][PKTS_DROP_ERR=$PKTS_DROP_ERR_15]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:for-each
> select="/document/record/MULTICAST_OUT_PKTS
>   [not(preceding::MULTICAST_OUT_PKTS=.)]">
> <xsl:variable name="MULTICAST_OUT_PKTS_17" select="."/>
> <xsl:choose>
> <xsl:when test="/document[record
>   [SPM_SubId=$SPM_SubId_5][IFI_IPACKETS=$IFI_IPACKETS_9]
>   [MULTICAST_IN_PKTS=$MULTICAST_IN_PKTS_16][IFI_OBYTES=$IFI_OBYTES_12]
>   [IFI_OPACKETS=$IFI_OPACKETS_10][Flags=$Flags_1]
>   [MULTICAST_OUT_PKTS=$MULTICAST_OUT_PKTS_17]
>   [IFI_IQDROPS=$IFI_IQDROPS_13][SPM_RegioId=$SPM_RegioId_2]
>   [IFI_IBYTES=$IFI_IBYTES_11][PKTS_DROP_ERR=$PKTS_DROP_ERR_15]
>   [SPM_TRUNKID=$SPM_TRUNKID_8][SUB_Instance=$SUB_Instance_4]
>   [SPM_DeviceId=$SPM_DeviceId_3][SPM_IspId=$SPM_IspId_6]
>   [TimeStamp=$TimeStamp_7][IFI_OQDROPS=$IFI_OQDROPS_14]]">
> <xsl:element name="record">
> <xsl:element name="Flags"><xsl:value-of 
> select="$Flags_1"/></xsl:element>
> <xsl:element name="SPM_RegioId"><xsl:value-of 
> select="$SPM_RegioId_2"/></xsl:element>
> <xsl:element name="SPM_DeviceId"><xsl:value-of 
> select="$SPM_DeviceId_3"/></xsl:element>
> <xsl:element name="SUB_Instance"><xsl:value-of 
> select="$SUB_Instance_4"/></xsl:element>
> <xsl:element name="SPM_SubId"><xsl:value-of 
> select="$SPM_SubId_5"/></xsl:element>
> <xsl:element name="SPM_IspId"><xsl:value-of 
> select="$SPM_IspId_6"/></xsl:element>
> <xsl:element name="TimeStamp"><xsl:value-of 
> select="$TimeStamp_7"/></xsl:element>
> <xsl:element name="SPM_TRUNKID"><xsl:value-of 
> select="$SPM_TRUNKID_8"/></xsl:element>
> <xsl:element name="IFI_IPACKETS"><xsl:value-of 
> select="$IFI_IPACKETS_9"/></xsl:element>
> <xsl:element name="IFI_OPACKETS"><xsl:value-of 
> select="$IFI_OPACKETS_10"/></xsl:element>
> <xsl:element name="IFI_IBYTES"><xsl:value-of 
> select="$IFI_IBYTES_11"/></xsl:element>
> <xsl:element name="IFI_OBYTES"><xsl:value-of 
> select="$IFI_OBYTES_12"/></xsl:element>
> <xsl:element name="IFI_IQDROPS"><xsl:value-of 
> select="$IFI_IQDROPS_13"/></xsl:element>
> <xsl:element name="IFI_OQDROPS"><xsl:value-of 
> select="$IFI_OQDROPS_14"/></xsl:element>
> <xsl:element name="PKTS_DROP_ERR"><xsl:value-of 
> select="$PKTS_DROP_ERR_15"/></xsl:element>
> <xsl:element name="MULTICAST_IN_PKTS"><xsl:value-of 
> select="$MULTICAST_IN_PKTS_16"/></xsl:element>
> <xsl:element name="MULTICAST_OUT_PKTS"><xsl:value-of 
> select="$MULTICAST_OUT_PKTS_17"/></xsl:element>
> </xsl:element>
> </xsl:when>
> </xsl:choose>
> </xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:if></xsl:for-each>
> </xsl:template></xsl:stylesheet>
> 
> 
> And sample data record is:
> 
> <record>
> <Flags>0</Flags>
> <SPM_RegioId>1</SPM_RegioId>
> <SPM_DeviceId>1</SPM_DeviceId>
> <SUB_Instance>-1</SUB_Instance>
> <SPM_SubId>-1</SPM_SubId>
> <SPM_IspId>6</SPM_IspId>
> <TimeStamp>Thu Jul 04 17:40:30 EDT 2002</TimeStamp>
> <SPM_TRUNKID>45</SPM_TRUNKID>
> <IFI_IPACKETS>113</IFI_IPACKETS>
> <IFI_OPACKETS>219</IFI_OPACKETS>
> <IFI_IBYTES>7002</IFI_IBYTES>
> <IFI_OBYTES>13038</IFI_OBYTES>
> <IFI_IQDROPS>0</IFI_IQDROPS>
> <IFI_OQDROPS>0</IFI_OQDROPS>
> <PKTS_DROP_ERR>0</PKTS_DROP_ERR>
> <MULTICAST_IN_PKTS>6760</MULTICAST_IN_PKTS>
> <MULTICAST_OUT_PKTS>0</MULTICAST_OUT_PKTS>
> </record>
> 
> I know that the stylesheet is not efficient. That is because I 
> am generating it programmatically, not by hand, so that I can 
> customize it to each type of input.  Any help on making it 
> more efficient would be greatly appreciated. What can I use 
> instead of the preceding:: axis? 
> 
> Thank you so much!
> Malia
> 
> 
>  XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
> 
> 



