Merge branch 'scour-project:master' into master

This commit is contained in:
doom-goober 2021-12-04 08:40:28 -08:00 committed by GitHub
commit 7dea6c7aa1
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
6 changed files with 175 additions and 183 deletions

View file

@ -4,7 +4,7 @@ Contributions to Scour are welcome, feel free to create a pull request!
In order to be able to merge your PR as fast as possible please try to stick to the following guidelines.
> _**TL;DR** (if you now what you're doing) Always run [`make check`](https://github.com/scour-project/scour/blob/master/Makefile) before creating a PR to check for common problems._
> _**TL;DR** (if you now what you're doing) Always run [`make check`](https://github.com/scour-project/scour/blob/master/Makefile) before creating a PR to check for common problems._
## Code Style

View file

@ -47,7 +47,7 @@
## Version 0.35 (2016-09-14)
* Drop official support for Python 2.6. (While it will probably continue to work for a while compatibility is not guaranteed anymore. If you continue to use Scour with Python 2.6 and should find/fix any compatibility issues pull requests are welcome, though.)
* Fix: Unused IDs were not shortended when `--shorten-ids` was used. ([#19](https://github.com/scour-project/scour/issues/62))
* Fix: Unused IDs were not shortened when `--shorten-ids` was used. ([#19](https://github.com/scour-project/scour/issues/62))
* Fix: Most elements were still removed from `<defs>` when `--keep-unreferenced-defs` was used. ([#62](https://github.com/scour-project/scour/issues/62))
* Improve escaping of single/double quotes ('/") in attributes. ([#64](https://github.com/scour-project/scour/issues/64))
* Print usage information if no input file was specified (and no data is available from `stdin`). ([#65](https://github.com/scour-project/scour/issues/65))
@ -56,13 +56,13 @@
* Improve code to remove default attribute values and add a lot of new default values. ([#70](https://github.com/scour-project/scour/issues/70))
* Fix: Only attempt to group elements that the content model allows to be children of a `<g>` when `--create-groups` is specified. ([#98](https://github.com/scour-project/scour/issues/98))
* Fix: Update list of SVG presentation attributes allowing more styles to be converted to attributes and remove two entries (`line-height` and `visibility`) that were actually invalid. ([#99](https://github.com/scour-project/scour/issues/99))
* Add three options that work analoguous to `--remove-metadata` (removes `<metadata>` elements) ([#102](https://github.com/scour-project/scour/issues/102))
* Add three options that work analogous to `--remove-metadata` (removes `<metadata>` elements) ([#102](https://github.com/scour-project/scour/issues/102))
* `--remove-titles` (removes `<title>` elements)
* `--remove-descriptions` (removes `<desc>` elements)
* `--remove-descriptive-elements` (removes all of the descriptive elements, i.e. `<title>`, `<desc>` and `<metadata>`)
* Fix removal rules for the `overflow` attribute. ([#104](https://github.com/scour-project/scour/issues/104))
* Improvement: Automatically order all attributes ([#105](https://github.com/scour-project/scour/issues/105)), as well as `style` declarations ([#107](https://github.com/scour-project/scour/issues/107)) allowing for a constant output across multiple runs of Scour. Before order could change arbitrarily.
* Improve path scouring. ([#108](https://github.com/scour-project/scour/issues/108))<br>Notably Scour performs all caculations with enhanced precision now, guaranteeing maximum accuracy when optimizing path data. Numerical precision is reduced as a last step of the optimization according to the `--precision` option.
* Improve path scouring. ([#108](https://github.com/scour-project/scour/issues/108))<br>Notably Scour performs all calculations with enhanced precision now, guaranteeing maximum accuracy when optimizing path data. Numerical precision is reduced as a last step of the optimization according to the `--precision` option.
* Fix replacement of removed duplicate gradients if the `fill`/`stroke` properties contained a fallback. ([#109](https://github.com/scour-project/scour/issues/109))
* Fix conversion of cubic Bézier "curveto" commands into "shorthand/smooth curveto" commands. ([#110](https://github.com/scour-project/scour/issues/110))
* Fix some issues due to removal of properties without considering inheritance rules. ([#111](https://github.com/scour-project/scour/issues/111))
@ -74,7 +74,7 @@
* Input/output file can now be specified as positional arguments (e.g. `scour input.svg output.svg`). ([#46](https://github.com/scour-project/scour/issues/46))
* Improve `--help` output by intuitively arranging options in groups. ([#46](https://github.com/scour-project/scour/issues/46))
* Add option `--error-on-flowtext` to raise an exception whenever a non-standard `<flowText>` element is found (which is only supported in Inkscape). If this option is not specified a warning will be shown. ([#53](https://github.com/scour-project/scour/issues/53))
* Automate tests with continouous integration via Travis. ([#52](https://github.com/scour-project/scour/issues/52))
* Automate tests with continuous integration via Travis. ([#52](https://github.com/scour-project/scour/issues/52))
## Version 0.33 (2016-01-29)
@ -104,7 +104,7 @@
## Version 0.30 (2014-08-05)
* Fix ingoring of additional args when invoked from scons.
* Fix ignoring of additional args when invoked from scons.
## Version 0.29 (2014-07-26)

View file

@ -7,11 +7,11 @@
---
Scour is an SVG optimizer/cleaner that reduces the size of scalable vector graphics by optimizing structure and removing unnecessary data written in Python.
Scour is an SVG optimizer/cleaner written in Python that reduces the size of scalable vector graphics by optimizing structure and removing unnecessary data.
It can be used to create streamlined vector graphics suitable for web deployment, publishing/sharing or further processing.
The goal of Scour is to output a file that renderes identically at a fraction of the size by removing a lot of redundant information created by most SVG editors. Optimization options are typically lossless but can be tweaked for more agressive cleaning.
The goal of Scour is to output a file that renders identically at a fraction of the size by removing a lot of redundant information created by most SVG editors. Optimization options are typically lossless but can be tweaked for more aggressive cleaning.
Scour is open-source and licensed under [Apache License 2.0](https://github.com/codedread/scour/blob/master/LICENSE).

View file

@ -63,6 +63,7 @@ from decimal import Context, Decimal, InvalidOperation, getcontext
import six
from six.moves import range, urllib
from scour.stats import ScourStats
from scour.svg_regex import svg_parser
from scour.svg_transform import svg_transform_parser
from scour.yocto_css import parseCssString
@ -666,14 +667,13 @@ def removeUnusedDefs(doc, defElem, elemsToRemove=None, referencedIDs=None):
return elemsToRemove
def removeUnreferencedElements(doc, keepDefs):
def remove_unreferenced_elements(doc, keepDefs, stats):
"""
Removes all unreferenced elements except for <svg>, <font>, <metadata>, <title>, and <desc>.
Also vacuums the defs of any non-referenced renderable elements.
Returns the number of unreferenced elements removed from the document.
"""
global _num_elements_removed
num = 0
# Remove certain unreferenced elements outside of defs
@ -688,8 +688,8 @@ def removeUnreferencedElements(doc, keepDefs):
elemsToRemove = removeUnusedDefs(doc, aDef, referencedIDs=referencedIDs)
for elem in elemsToRemove:
elem.parentNode.removeChild(elem)
_num_elements_removed += 1
num += 1
stats.num_elements_removed += len(elemsToRemove)
num += len(elemsToRemove)
for id in identifiedElements:
if id not in referencedIDs:
@ -699,7 +699,7 @@ def removeUnreferencedElements(doc, keepDefs):
and goner.parentNode.tagName != 'defs'):
goner.parentNode.removeChild(goner)
num += 1
_num_elements_removed += 1
stats.num_elements_removed += 1
return num
@ -937,20 +937,18 @@ def unprotected_ids(doc, options):
return identifiedElements
def removeUnreferencedIDs(referencedIDs, identifiedElements):
def remove_unreferenced_ids(referencedIDs, identifiedElements):
"""
Removes the unreferenced ID attributes.
Returns the number of ID attributes removed
"""
global _num_ids_removed
keepTags = ['font']
num = 0
for id in identifiedElements:
node = identifiedElements[id]
if id not in referencedIDs and node.nodeName not in keepTags:
node.removeAttribute('id')
_num_ids_removed += 1
num += 1
return num
@ -994,7 +992,7 @@ def removeNamespacedElements(node, namespaces):
return num
def removeDescriptiveElements(doc, options):
def remove_descriptive_elements(doc, options):
elementTypes = []
if options.remove_descriptive_elements:
elementTypes.extend(("title", "desc", "metadata"))
@ -1006,29 +1004,38 @@ def removeDescriptiveElements(doc, options):
if options.remove_metadata:
elementTypes.append("metadata")
if not elementTypes:
return
return 0
global _num_elements_removed
num = 0
elementsToRemove = []
for elementType in elementTypes:
elementsToRemove.extend(doc.documentElement.getElementsByTagName(elementType))
for element in elementsToRemove:
element.parentNode.removeChild(element)
num += 1
_num_elements_removed += 1
return num
return len(elementsToRemove)
def removeNestedGroups(node):
def g_tag_is_mergeable(node):
"""Check if a <g> tag can be merged or not
<g> tags with a title or descriptions should generally be left alone.
"""
if any(
True for n in node.childNodes
if n.nodeType == Node.ELEMENT_NODE and n.nodeName in ('title', 'desc')
and n.namespaceURI == NS['SVG']
):
return False
return True
def remove_nested_groups(node, stats):
"""
This walks further and further down the tree, removing groups
which do not have any attributes or a title/desc child and
promoting their children up one level
"""
global _num_elements_removed
num = 0
groupsToRemove = []
@ -1038,24 +1045,21 @@ def removeNestedGroups(node):
for child in node.childNodes:
if child.nodeName == 'g' and child.namespaceURI == NS['SVG'] and len(child.attributes) == 0:
# only collapse group if it does not have a title or desc as a direct descendant,
for grandchild in child.childNodes:
if grandchild.nodeType == Node.ELEMENT_NODE and grandchild.namespaceURI == NS['SVG'] and \
grandchild.nodeName in ['title', 'desc']:
break
else:
if g_tag_is_mergeable(child):
groupsToRemove.append(child)
for g in groupsToRemove:
while g.childNodes.length > 0:
g.parentNode.insertBefore(g.firstChild, g)
g.parentNode.removeChild(g)
_num_elements_removed += 1
num += 1
num += len(groupsToRemove)
stats.num_elements_removed += len(groupsToRemove)
# now recurse for children
for child in node.childNodes:
if child.nodeType == Node.ELEMENT_NODE:
num += removeNestedGroups(child)
num += remove_nested_groups(child, stats)
return num
@ -1169,11 +1173,7 @@ def mergeSiblingGroupsWithCommonAttributes(elem):
if nextNode.nodeName != 'g' or nextNode.namespaceURI != NS['SVG']:
break
nextAttributes = {a.nodeName: a.nodeValue for a in nextNode.attributes.values()}
hasNoMergeTags = (True for n in nextNode.childNodes
if n.nodeType == Node.ELEMENT_NODE
and n.nodeName in ('title', 'desc')
and n.namespaceURI == NS['SVG'])
if attributes != nextAttributes or any(hasNoMergeTags):
if attributes != nextAttributes or not g_tag_is_mergeable(nextNode):
break
else:
runElements += 1
@ -1214,7 +1214,7 @@ def mergeSiblingGroupsWithCommonAttributes(elem):
return num
def createGroupsForCommonAttributes(elem):
def create_groups_for_common_attributes(elem, stats):
"""
Creates <g> elements to contain runs of 3 or more
consecutive child elements having at least one common attribute.
@ -1226,8 +1226,6 @@ def createGroupsForCommonAttributes(elem):
This function acts recursively on the given element.
"""
num = 0
global _num_elements_removed
# TODO perhaps all of the Presentation attributes in http://www.w3.org/TR/SVG/struct.html#GElement
# could be added here
@ -1327,9 +1325,8 @@ def createGroupsForCommonAttributes(elem):
# Include the group in elem's children.
elem.childNodes.insert(runStart, group)
group.parentNode = elem
num += 1
curChild = runStart - 1
_num_elements_removed -= 1
stats.num_elements_removed -= 1
else:
curChild -= 1
else:
@ -1338,9 +1335,7 @@ def createGroupsForCommonAttributes(elem):
# each child gets the same treatment, recursively
for childNode in elem.childNodes:
if childNode.nodeType == Node.ELEMENT_NODE:
num += createGroupsForCommonAttributes(childNode)
return num
create_groups_for_common_attributes(childNode, stats)
def removeUnusedAttributesOnParent(elem):
@ -1380,12 +1375,11 @@ def removeUnusedAttributesOnParent(elem):
unusedAttrs[attr.nodeName] = attr.nodeValue
# for each child, if at least one child inherits the parent's attribute, then remove
for childNum in range(len(childElements)):
child = childElements[childNum]
for child in childElements:
inheritedAttrs = []
for name in unusedAttrs:
val = child.getAttribute(name)
if val == '' or val is None or val == 'inherit':
if val == '' or val == 'inherit':
inheritedAttrs.append(name)
for a in inheritedAttrs:
del unusedAttrs[a]
@ -1398,8 +1392,7 @@ def removeUnusedAttributesOnParent(elem):
return num
def removeDuplicateGradientStops(doc):
global _num_elements_removed
def remove_duplicate_gradient_stops(doc, stats):
num = 0
for gradType in ['linearGradient', 'radialGradient']:
@ -1432,15 +1425,13 @@ def removeDuplicateGradientStops(doc):
for stop in stopsToRemove:
stop.parentNode.removeChild(stop)
num += 1
_num_elements_removed += 1
num += len(stopsToRemove)
stats.num_elements_removed += len(stopsToRemove)
# linear gradients
return num
def collapseSinglyReferencedGradients(doc):
global _num_elements_removed
def collapse_singly_referenced_gradients(doc, stats):
num = 0
identifiedElements = findElementsWithId(doc.documentElement)
@ -1493,7 +1484,7 @@ def collapseSinglyReferencedGradients(doc):
if target_href:
# If the elem node had an xlink:href, then the
# refElem have to point to it as well to
# perserve the semantics of the image.
# preserve the semantics of the image.
refElem.setAttributeNS(NS['XLINK'], 'href', target_href)
else:
# The elem node had no xlink:href reference,
@ -1502,8 +1493,9 @@ def collapseSinglyReferencedGradients(doc):
# now delete elem
elem.parentNode.removeChild(elem)
_num_elements_removed += 1
stats.num_elements_removed += 1
num += 1
return num
@ -1903,7 +1895,7 @@ def styleInheritedByChild(node, style, nodeIsChild=False):
'missing-glyph', 'pattern', 'svg', 'switch', 'symbol']:
return False
# in all other cases we have to assume the inherited value of 'style' is meaningfull and has to be kept
# in all other cases we have to assume the inherited value of 'style' is meaningful and has to be kept
# (e.g nodes without children at the end of the DOM tree, text nodes, ...)
return True
@ -2129,7 +2121,7 @@ def removeDefaultAttributeValue(node, attribute):
"""
Removes the DefaultAttribute 'attribute' from 'node' if specified conditions are fulfilled
Warning: Does NOT check if the attribute is actually valid for the passed element type for increased preformance!
Warning: Does NOT check if the attribute is actually valid for the passed element type for increased performance!
"""
if not node.hasAttribute(attribute.name):
return 0
@ -2154,7 +2146,7 @@ def removeDefaultAttributeValue(node, attribute):
return 0
def removeDefaultAttributeValues(node, options, tainted=set()):
def removeDefaultAttributeValues(node, options, tainted=None):
u"""'tainted' keeps a set of attributes defined in parent nodes.
For such attributes, we don't delete attributes with default values."""
@ -2162,6 +2154,9 @@ def removeDefaultAttributeValues(node, options, tainted=set()):
if node.nodeType != Node.ELEMENT_NODE:
return 0
if tainted is None:
tainted = set()
# Conditionally remove all default attributes defined in 'default_attributes' (a list of 'DefaultAttribute's)
#
# For increased performance do not iterate the whole list for each element but run only on valid subsets
@ -2289,12 +2284,10 @@ def convertColors(element):
# reusing data structures, etc
def cleanPath(element, options):
def clean_path(element, options, stats):
"""
Cleans the path string (d attribute) of the element
"""
global _num_bytes_saved_in_path_data
global _num_path_segments_removed
# this gets the parser object from svg_regex.py
oldPathStr = element.getAttribute('d')
@ -2445,34 +2438,34 @@ def cleanPath(element, options):
while i < len(data):
if data[i] == data[i + 1] == 0:
del data[i:i + 2]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
i += 2
elif cmd == 'c':
while i < len(data):
if data[i] == data[i + 1] == data[i + 2] == data[i + 3] == data[i + 4] == data[i + 5] == 0:
del data[i:i + 6]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
i += 6
elif cmd == 'a':
while i < len(data):
if data[i + 5] == data[i + 6] == 0:
del data[i:i + 7]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
i += 7
elif cmd == 'q':
while i < len(data):
if data[i] == data[i + 1] == data[i + 2] == data[i + 3] == 0:
del data[i:i + 4]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
i += 4
elif cmd in ['h', 'v']:
oldLen = len(data)
path[pathIndex] = (cmd, [coord for coord in data if coord != 0])
_num_path_segments_removed += len(path[pathIndex][1]) - oldLen
stats.num_path_segments_removed += len(path[pathIndex][1]) - oldLen
# remove no-op commands
pathIndex = len(path)
@ -2493,7 +2486,7 @@ def cleanPath(element, options):
# continue a draw on the same subpath after a
# "z").
del path[pathIndex]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
# it is not safe to rewrite "m0 0 ..." to "l..."
# because of this "z" command.
@ -2503,7 +2496,7 @@ def cleanPath(element, options):
# Ends with an empty move (but no line/draw
# following it)
del path[pathIndex]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
continue
if subpath_needs_anchor:
subpath_needs_anchor = False
@ -2511,7 +2504,7 @@ def cleanPath(element, options):
# unanchored, i.e. we can replace "m0 0 ..." with
# "l..." as there is no "z" after it.
path[pathIndex] = ('l', data[2:])
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
# fixup: Delete subcommands having no coordinates.
path = [elem for elem in path if len(elem[1]) > 0 or elem[0] == 'z']
@ -2599,14 +2592,14 @@ def cleanPath(element, options):
lineTuples = []
# append the v and then the remaining line coords
newPath.append(('v', [data[i + 1]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
elif data[i + 1] == 0:
if lineTuples:
# flush the line command, then append the h and then the remaining line coords
newPath.append(('l', lineTuples))
lineTuples = []
newPath.append(('h', [data[i]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
lineTuples.extend(data[i:i + 2])
i += 2
@ -2626,7 +2619,7 @@ def cleanPath(element, options):
cmd = 'l' # dealing with linetos now
# append the v and then the remaining line coords
newPath.append(('v', [data[i + 1]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
elif data[i + 1] == 0:
if lineTuples:
# flush the m/l command, then append the h and then the remaining line coords
@ -2634,7 +2627,7 @@ def cleanPath(element, options):
lineTuples = []
cmd = 'l' # dealing with linetos now
newPath.append(('h', [data[i]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
lineTuples.extend(data[i:i + 2])
i += 2
@ -2663,7 +2656,7 @@ def cleanPath(element, options):
curveTuples = []
# append the s command
newPath.append(('s', [data[i + 2], data[i + 3], data[i + 4], data[i + 5]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
j = 0
while j <= 5:
@ -2688,7 +2681,7 @@ def cleanPath(element, options):
curveTuples = []
# append the t command
newPath.append(('t', [data[i + 2], data[i + 3]]))
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
j = 0
while j <= 3:
@ -2721,7 +2714,7 @@ def cleanPath(element, options):
if is_same_sign(data[coordIndex], data[coordIndex+1]):
data[coordIndex] += data[coordIndex+1]
del data[coordIndex+1]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
coordIndex += 1
@ -2734,7 +2727,7 @@ def cleanPath(element, options):
data[coordIndex+1] += data[coordIndex+3]
del data[coordIndex+2] # delete the next two elements
del data[coordIndex+2]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
coordIndex += 2
@ -2747,7 +2740,7 @@ def cleanPath(element, options):
data[coordIndex+1] += data[coordIndex+3]
del data[coordIndex+2] # delete the next two elements
del data[coordIndex+2]
_num_path_segments_removed += 1
stats.num_path_segments_removed += 1
else:
coordIndex += 2
@ -2782,7 +2775,7 @@ def cleanPath(element, options):
# if for whatever reason we actually made the path longer don't use it
# TODO: maybe we could compare path lengths after each optimization step and use the shortest
if len(newPathStr) <= len(oldPathStr):
_num_bytes_saved_in_path_data += (len(oldPathStr) - len(newPathStr))
stats.num_bytes_saved_in_path_data += (len(oldPathStr) - len(newPathStr))
element.setAttribute('d', newPathStr)
@ -2846,11 +2839,11 @@ def parseListOfPoints(s):
return nums
def cleanPolygon(elem, options):
def clean_polygon(elem, options):
"""
Remove unnecessary closing point of polygon points attribute
"""
global _num_points_removed_from_polygon
num_points_removed_from_polygon = 0
pts = parseListOfPoints(elem.getAttribute('points'))
N = len(pts) / 2
@ -2859,8 +2852,9 @@ def cleanPolygon(elem, options):
(endx, endy) = pts[-2:]
if startx == endx and starty == endy:
del pts[-2:]
_num_points_removed_from_polygon += 1
num_points_removed_from_polygon += 1
elem.setAttribute('points', scourCoordinates(pts, options, True))
return num_points_removed_from_polygon
def cleanPolyline(elem, options):
@ -2943,7 +2937,7 @@ def scourCoordinates(data, options, force_whitespace=False, control_points=[], f
# - this number starts with a dot but the previous number had *no* dot or exponent
# i.e. '1.3 0.5' -> '1.3.5' or '1e3 0.5' -> '1e3.5' is fine but '123 0.5' -> '123.5' is obviously not
# - 'force_whitespace' is explicitly set to 'True'
# we never need a space after flags (occuring in elliptical arcs), but librsvg struggles without it
# we never need a space after flags (occurring in elliptical arcs), but librsvg struggles without it
if (c > 0
and (force_whitespace
or scouredCoord[0].isdigit()
@ -3300,31 +3294,27 @@ def optimizeTransforms(element, options):
return num
def removeComments(element):
def remove_comments(element, stats):
"""
Removes comments from the element and its children.
"""
global _num_bytes_saved_in_comments
num = 0
if isinstance(element, xml.dom.minidom.Comment):
_num_bytes_saved_in_comments += len(element.data)
stats.num_bytes_saved_in_comments += len(element.data)
stats.num_comments_removed += 1
element.parentNode.removeChild(element)
num += 1
else:
for subelement in element.childNodes[:]:
num += removeComments(subelement)
return num
remove_comments(subelement, stats)
def embedRasters(element, options):
def embed_rasters(element, options):
import base64
"""
Converts raster references to inline images.
NOTE: there are size limits to base64-encoding handling in browsers
"""
global _num_rasters_embedded
num_rasters_embedded = 0
href = element.getAttributeNS(NS['XLINK'], 'href')
@ -3392,8 +3382,9 @@ def embedRasters(element, options):
element.setAttributeNS(NS['XLINK'], 'href',
'data:image/' + ext + ';base64,' + b64eRaster.decode())
_num_rasters_embedded += 1
num_rasters_embedded += 1
del b64eRaster
return num_rasters_embedded
def properlySizeDoc(docElement, options):
@ -3413,7 +3404,6 @@ def properlySizeDoc(docElement, options):
# parse viewBox attribute
vbSep = RE_COMMA_WSP.split(docElement.getAttribute('viewBox'))
# if we have a valid viewBox we need to check it
vbWidth, vbHeight = 0, 0
if len(vbSep) == 4:
try:
# if x or y are specified and non-zero then it is not ok to overwrite it
@ -3448,7 +3438,6 @@ def remapNamespacePrefix(node, oldprefix, newprefix):
parent = node.parentNode
# create a replacement node
newNode = None
if newprefix != '':
newNode = doc.createElementNS(namespace, newprefix + ":" + localName)
else:
@ -3643,10 +3632,14 @@ def serializeXML(element, options, indent_depth=0, preserveWhitespace=False):
# this is the main method
# input is a string representation of the input XML
# returns a string representation of the output XML
def scourString(in_string, options=None):
def scourString(in_string, options=None, stats=None):
# sanitize options (take missing attributes from defaults, discard unknown attributes)
options = sanitizeOptions(options)
if stats is None:
# This is easier than doing "if stats is not None:" checks all over the place
stats = ScourStats()
# default or invalid value
if(options.cdigits < 0):
options.cdigits = options.digits
@ -3659,37 +3652,6 @@ def scourString(in_string, options=None):
scouringContext = Context(prec=options.digits)
scouringContextC = Context(prec=options.cdigits)
# globals for tracking statistics
# TODO: get rid of these globals...
global _num_elements_removed
global _num_attributes_removed
global _num_ids_removed
global _num_comments_removed
global _num_style_properties_fixed
global _num_rasters_embedded
global _num_path_segments_removed
global _num_points_removed_from_polygon
global _num_bytes_saved_in_path_data
global _num_bytes_saved_in_colors
global _num_bytes_saved_in_comments
global _num_bytes_saved_in_ids
global _num_bytes_saved_in_lengths
global _num_bytes_saved_in_transforms
_num_elements_removed = 0
_num_attributes_removed = 0
_num_ids_removed = 0
_num_comments_removed = 0
_num_style_properties_fixed = 0
_num_rasters_embedded = 0
_num_path_segments_removed = 0
_num_points_removed_from_polygon = 0
_num_bytes_saved_in_path_data = 0
_num_bytes_saved_in_colors = 0
_num_bytes_saved_in_comments = 0
_num_bytes_saved_in_ids = 0
_num_bytes_saved_in_lengths = 0
_num_bytes_saved_in_transforms = 0
doc = xml.dom.minidom.parseString(in_string)
# determine number of flowRoot elements in input document
@ -3704,13 +3666,13 @@ def scourString(in_string, options=None):
print("WARNING: {}".format(errmsg), file=sys.stderr)
# remove descriptive elements
removeDescriptiveElements(doc, options)
stats.num_elements_removed += remove_descriptive_elements(doc, options)
# remove unneeded namespaced elements/attributes added by common editors
if options.keep_editor_data is False:
_num_elements_removed += removeNamespacedElements(doc.documentElement,
stats.num_elements_removed += removeNamespacedElements(doc.documentElement,
unwanted_ns)
_num_attributes_removed += removeNamespacedAttributes(doc.documentElement,
stats.num_attributes_removed += removeNamespacedAttributes(doc.documentElement,
unwanted_ns)
# remove the xmlns: declarations now
@ -3722,7 +3684,7 @@ def scourString(in_string, options=None):
for attr in xmlnsDeclsToRemove:
doc.documentElement.removeAttribute(attr)
_num_attributes_removed += 1
stats.num_attributes_removed += len(xmlnsDeclsToRemove)
# ensure namespace for SVG is declared
# TODO: what if the default namespace is something else (i.e. some valid namespace)?
@ -3757,28 +3719,28 @@ def scourString(in_string, options=None):
for attrName in xmlnsDeclsToRemove:
doc.documentElement.removeAttribute(attrName)
_num_attributes_removed += 1
stats.num_attributes_removed += len(xmlnsDeclsToRemove)
for prefix in redundantPrefixes:
remapNamespacePrefix(doc.documentElement, prefix, '')
if options.strip_comments:
_num_comments_removed = removeComments(doc)
remove_comments(doc, stats)
if options.strip_xml_space_attribute and doc.documentElement.hasAttribute('xml:space'):
doc.documentElement.removeAttribute('xml:space')
_num_attributes_removed += 1
stats.num_attributes_removed += 1
# repair style (remove unnecessary style properties and change them into XML attributes)
_num_style_properties_fixed = repairStyle(doc.documentElement, options)
stats.num_style_properties_fixed = repairStyle(doc.documentElement, options)
# convert colors to #RRGGBB format
if options.simple_colors:
_num_bytes_saved_in_colors = convertColors(doc.documentElement)
stats.num_bytes_saved_in_colors = convertColors(doc.documentElement)
# remove unreferenced gradients/patterns outside of defs
# and most unreferenced elements inside of defs
while removeUnreferencedElements(doc, options.keep_defs) > 0:
while remove_unreferenced_elements(doc, options.keep_defs, stats) > 0:
pass
# remove empty defs, metadata, g
@ -3796,29 +3758,30 @@ def scourString(in_string, options=None):
removeElem = True
if removeElem:
elem.parentNode.removeChild(elem)
_num_elements_removed += 1
stats.num_elements_removed += 1
if options.strip_ids:
referencedIDs = findReferencedElements(doc.documentElement)
identifiedElements = unprotected_ids(doc, options)
removeUnreferencedIDs(referencedIDs, identifiedElements)
stats.num_ids_removed += remove_unreferenced_ids(referencedIDs,
identifiedElements)
while removeDuplicateGradientStops(doc) > 0:
while remove_duplicate_gradient_stops(doc, stats) > 0:
pass
# remove gradients that are only referenced by one other gradient
while collapseSinglyReferencedGradients(doc) > 0:
while collapse_singly_referenced_gradients(doc, stats) > 0:
pass
# remove duplicate gradients
_num_elements_removed += removeDuplicateGradients(doc)
stats.num_elements_removed += removeDuplicateGradients(doc)
if options.group_collapse:
_num_elements_removed += mergeSiblingGroupsWithCommonAttributes(doc.documentElement)
stats.num_elements_removed += mergeSiblingGroupsWithCommonAttributes(doc.documentElement)
# create <g> elements if there are runs of elements with the same attributes.
# this MUST be before moveCommonAttributesToParentGroup.
if options.group_create:
createGroupsForCommonAttributes(doc.documentElement)
create_groups_for_common_attributes(doc.documentElement, stats)
# move common attributes to parent group
# NOTE: the if the <svg> element's immediate children
@ -3827,20 +3790,20 @@ def scourString(in_string, options=None):
# doesn't accept fill=, stroke= etc.!
referencedIds = findReferencedElements(doc.documentElement)
for child in doc.documentElement.childNodes:
_num_attributes_removed += moveCommonAttributesToParentGroup(child, referencedIds)
stats.num_attributes_removed += moveCommonAttributesToParentGroup(child, referencedIds)
# remove unused attributes from parent
_num_attributes_removed += removeUnusedAttributesOnParent(doc.documentElement)
stats.num_attributes_removed += removeUnusedAttributesOnParent(doc.documentElement)
# Collapse groups LAST, because we've created groups. If done before
# moveAttributesToParentGroup, empty <g>'s may remain.
if options.group_collapse:
while removeNestedGroups(doc.documentElement) > 0:
while remove_nested_groups(doc.documentElement, stats) > 0:
pass
# remove unnecessary closing point of polygons and scour points
for polygon in doc.documentElement.getElementsByTagName('polygon'):
cleanPolygon(polygon, options)
stats.num_points_removed_from_polygon += clean_polygon(polygon, options)
# scour points of polyline
for polyline in doc.documentElement.getElementsByTagName('polyline'):
@ -3851,11 +3814,11 @@ def scourString(in_string, options=None):
if elem.getAttribute('d') == '':
elem.parentNode.removeChild(elem)
else:
cleanPath(elem, options)
clean_path(elem, options, stats)
# shorten ID names as much as possible
if options.shorten_ids:
_num_bytes_saved_in_ids += shortenIDs(doc, options.shorten_ids_prefix, options)
stats.num_bytes_saved_in_ids += shortenIDs(doc, options.shorten_ids_prefix, options)
# scour lengths (including coordinates)
for type in ['svg', 'image', 'rect', 'circle', 'ellipse', 'line',
@ -3872,18 +3835,18 @@ def scourString(in_string, options=None):
doc.documentElement.setAttribute('viewBox', ' '.join(lengths))
# more length scouring in this function
_num_bytes_saved_in_lengths = reducePrecision(doc.documentElement)
stats.num_bytes_saved_in_lengths = reducePrecision(doc.documentElement)
# remove default values of attributes
_num_attributes_removed += removeDefaultAttributeValues(doc.documentElement, options)
stats.num_attributes_removed += removeDefaultAttributeValues(doc.documentElement, options)
# reduce the length of transformation attributes
_num_bytes_saved_in_transforms = optimizeTransforms(doc.documentElement, options)
stats.num_bytes_saved_in_transforms = optimizeTransforms(doc.documentElement, options)
# convert rasters references to base64-encoded strings
if options.embed_rasters:
for elem in doc.documentElement.getElementsByTagName('image'):
embedRasters(elem, options)
stats.num_rasters_embedded += embed_rasters(elem, options)
# properly size the SVG document (ideally width/height should be 100% with a viewBox)
if options.enable_viewboxing:
@ -3917,7 +3880,7 @@ def scourString(in_string, options=None):
# used mostly by unit tests
# input is a filename
# returns the minidom doc representation of the SVG
def scourXmlFile(filename, options=None):
def scourXmlFile(filename, options=None, stats=None):
# sanitize options (take missing attributes from defaults, discard unknown attributes)
options = sanitizeOptions(options)
# we need to make sure infilename is set correctly (otherwise relative references in the SVG won't work)
@ -3926,7 +3889,7 @@ def scourXmlFile(filename, options=None):
# open the file and scour it
with open(filename, "rb") as f:
in_string = f.read()
out_string = scourString(in_string, options)
out_string = scourString(in_string, options, stats=stats)
# prepare the output xml.dom.minidom object
doc = xml.dom.minidom.parseString(out_string.encode('utf-8'))
@ -4176,22 +4139,22 @@ def getInOut(options):
return [infile, outfile]
def getReport():
def generate_report(stats):
return (
' Number of elements removed: ' + str(_num_elements_removed) + os.linesep +
' Number of attributes removed: ' + str(_num_attributes_removed) + os.linesep +
' Number of unreferenced IDs removed: ' + str(_num_ids_removed) + os.linesep +
' Number of comments removed: ' + str(_num_comments_removed) + os.linesep +
' Number of style properties fixed: ' + str(_num_style_properties_fixed) + os.linesep +
' Number of raster images embedded: ' + str(_num_rasters_embedded) + os.linesep +
' Number of path segments reduced/removed: ' + str(_num_path_segments_removed) + os.linesep +
' Number of points removed from polygons: ' + str(_num_points_removed_from_polygon) + os.linesep +
' Number of bytes saved in path data: ' + str(_num_bytes_saved_in_path_data) + os.linesep +
' Number of bytes saved in colors: ' + str(_num_bytes_saved_in_colors) + os.linesep +
' Number of bytes saved in comments: ' + str(_num_bytes_saved_in_comments) + os.linesep +
' Number of bytes saved in IDs: ' + str(_num_bytes_saved_in_ids) + os.linesep +
' Number of bytes saved in lengths: ' + str(_num_bytes_saved_in_lengths) + os.linesep +
' Number of bytes saved in transformations: ' + str(_num_bytes_saved_in_transforms)
' Number of elements removed: ' + str(stats.num_elements_removed) + os.linesep +
' Number of attributes removed: ' + str(stats.num_attributes_removed) + os.linesep +
' Number of unreferenced IDs removed: ' + str(stats.num_ids_removed) + os.linesep +
' Number of comments removed: ' + str(stats.num_comments_removed) + os.linesep +
' Number of style properties fixed: ' + str(stats.num_style_properties_fixed) + os.linesep +
' Number of raster images embedded: ' + str(stats.num_rasters_embedded) + os.linesep +
' Number of path segments reduced/removed: ' + str(stats.num_path_segments_removed) + os.linesep +
' Number of points removed from polygons: ' + str(stats.num_points_removed_from_polygon) + os.linesep +
' Number of bytes saved in path data: ' + str(stats.num_bytes_saved_in_path_data) + os.linesep +
' Number of bytes saved in colors: ' + str(stats.num_bytes_saved_in_colors) + os.linesep +
' Number of bytes saved in comments: ' + str(stats.num_bytes_saved_in_comments) + os.linesep +
' Number of bytes saved in IDs: ' + str(stats.num_bytes_saved_in_ids) + os.linesep +
' Number of bytes saved in lengths: ' + str(stats.num_bytes_saved_in_lengths) + os.linesep +
' Number of bytes saved in transformations: ' + str(stats.num_bytes_saved_in_transforms)
)
@ -4200,10 +4163,11 @@ def start(options, input, output):
options = sanitizeOptions(options)
start = time.time()
stats = ScourStats()
# do the work
in_string = input.read()
out_string = scourString(in_string, options).encode("UTF-8")
out_string = scourString(in_string, options, stats=stats).encode("UTF-8")
output.write(out_string)
# Close input and output files (but do not attempt to close stdin/stdout!)
@ -4229,7 +4193,7 @@ def start(options, input, output):
oldsize,
sizediff), file=options.ensure_value("stdout", sys.stdout))
if options.verbose:
print(getReport(), file=options.ensure_value("stdout", sys.stdout))
print(generate_report(stats), file=options.ensure_value("stdout", sys.stdout))
def run():

28
scour/stats.py Normal file
View file

@ -0,0 +1,28 @@
class ScourStats(object):
__slots__ = (
'num_elements_removed',
'num_attributes_removed',
'num_style_properties_fixed',
'num_bytes_saved_in_colors',
'num_ids_removed',
'num_comments_removed',
'num_style_properties_fixed',
'num_rasters_embedded',
'num_path_segments_removed',
'num_points_removed_from_polygon',
'num_bytes_saved_in_path_data',
'num_bytes_saved_in_colors',
'num_bytes_saved_in_comments',
'num_bytes_saved_in_ids',
'num_bytes_saved_in_lengths',
'num_bytes_saved_in_transforms',
)
def __init__(self):
self.reset()
def reset(self):
# Set all stats to 0
for attr in self.__slots__:
setattr(self, attr, 0)

View file

@ -28,10 +28,10 @@ vector graphics by optimizing structure and removing unnecessary data.
It can be used to create streamlined vector graphics suitable for web
deployment, publishing/sharing or further processing.
The goal of Scour is to output a file that renderes identically at a
The goal of Scour is to output a file that renders identically at a
fraction of the size by removing a lot of redundant information created
by most SVG editors. Optimization options are typically lossless but can
be tweaked for more agressive cleaning.
be tweaked for more aggressive cleaning.
Website
- http://www.codedread.com/scour/ (original website)