Commit graph

320 commits

Author SHA1 Message Date
Niels Thykier
21f1262bcb
Avoid creating single-use-throw-away lists for string join
There is no need to create a list only to discard it after a
single use with join (which gladly accepts an iterator/generator
instead).

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:07:29 +00:00
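A minimal sketch of the pattern 21f1262bcb describes, using made-up values rather than scour's own data: `str.join` accepts any iterable of strings, so the list brackets can simply be dropped.

```python
# Hypothetical values, for illustration only.
values = [1.5, 2.0, 3.25]

# Before: a throw-away list is built just to be handed to join.
joined_with_list = " ".join([str(v) for v in values])

# After: join consumes the generator expression directly.
joined_with_generator = " ".join(str(v) for v in values)
```

Both spellings produce the same string; only the explicit intermediate list disappears from the source.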
Niels Thykier
47e8b15315
convertColors: Fix bug in computation of how many bytes are saved (#245)
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:35:46 +02:00
Niels Thykier
a15acb3e4e
Rename testX.py to test_X.py to make py.test work out of the box (#181)
This rename makes py.test/py.test-3 find the test suite out of the
box.  Example command lines:

       # Running the test suite (optionally include "-v")
       $ py.test-3
       # Running the test suite with coverage enabled (and branch
       # coverage).
       $ py.test-3 --cov=scour --cov-report=html --cov-branch

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 19:55:24 +02:00
Niels Thykier
dd2155e576 Merge sibling <g> nodes with identical attributes
In some cases, gnuplot generates very suboptimal SVG content with the
following pattern:

        <g color="black" fill="none" stroke="currentColor">
        <path d="m82.5 323.3v-4.1" stroke="#000"/>
        </g>
        <g color="black" fill="none" stroke="currentColor">
        <path d="m116.4 323.3v-4.1" stroke="#000"/>
        </g>
        ... repeated 10+ more times here ...
        <g color="black" fill="none" stroke="currentColor">
        <path d="m65.4 72.8v250.5h420v-250.5h-420z" stroke="#000"/>
        </g>

A more optimal pattern would be:

        <g color="black" fill="none" stroke="#000">
        <path d="m82.5 323.3v-4.1"/>
        <path d="m116.4 323.3v-4.1"/>
        ... 10+ more paths here ...
        <path d="m65.4 72.8v250.5h420v-250.5h-420z"/>
        </g>

This patch enables that optimization by handling the merging of two
sibling <g> entries that have identical attributes.  In the above
example that does not solve the rewrite from "currentColor" to "#000"
for the stroke attribute.  However, the existing code already handles
that automatically after the <g> elements have been merged.

This change provides comparable results to --create-groups as shown by
the following table while being a distinct optimization:

       +----------------------------+-------+--------+
       |           Test             | Size  |  in %  |
       +----------------------------+-------+--------+
       | baseline                   | 17961 |  100%  |
       | baseline + --create-groups | 17418 |  97.0% |
       | patched                    | 16939 |  94.3% |
       | patched + --create-groups  | 16855 |  93.8% |
       +----------------------------+-------+--------+

The image used in the size table above was generated based on the
instructions from https://bugs.debian.org/858039#10 with gnuplot 5.2
patchlevel 2.  Beyond the test-based "--create-groups", the following
scour command-line parameters were used:
      --enable-id-stripping --enable-comment-stripping \
      --shorten-ids --indent=none

Note that the baseline was scour'ed repeatedly to stabilize the image
size.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 19:37:32 +02:00
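The merge described in dd2155e576 can be sketched roughly as follows. This is an illustration built on `xml.dom.minidom`, not scour's actual code; the helper names `attributes_of` and `merge_sibling_groups` are hypothetical, and the sketch assumes that whitespace-only text between two groups may be ignored.

```python
from xml.dom import minidom


def attributes_of(node):
    """Return the element's attributes as a plain dict for comparison."""
    return dict(node.attributes.items())


def merge_sibling_groups(parent):
    """Fold each <g> into the preceding sibling <g> if attributes match.

    Returns the number of groups that were merged away.
    """
    merged = 0
    previous = None  # the last <g> seen with no other node in between
    for child in list(parent.childNodes):
        if child.nodeType == child.TEXT_NODE and not child.data.strip():
            continue  # assumption: whitespace-only text does not matter
        is_group = (child.nodeType == child.ELEMENT_NODE
                    and child.tagName == "g")
        if (is_group and previous is not None
                and attributes_of(previous) == attributes_of(child)):
            # Move the children over, then drop the now-empty group.
            while child.firstChild is not None:
                previous.appendChild(child.firstChild)
            parent.removeChild(child)
            merged += 1
        else:
            previous = child if is_group else None
    return merged


doc = minidom.parseString(
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<g fill="none" stroke="currentColor"><path d="m82.5 323.3v-4.1"/></g>'
    '<g fill="none" stroke="currentColor"><path d="m116.4 323.3v-4.1"/></g>'
    '<g fill="red"><path d="m0 0h1"/></g>'
    '</svg>')
merged_count = merge_sibling_groups(doc.documentElement)
```

Here the first two groups collapse into one while the third, which has different attributes, stays separate.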
Patrick Storz
40753af88a Fix whitespace handling for SVG 1.2 flowed text
See 718748ff22

Fixes https://github.com/scour-project/scour/issues/235
2020-05-17 17:33:50 +02:00
Patrick Storz
f65ca60809 Fix deprecation warning 2020-05-17 17:10:26 +02:00
Patrick Storz
4fe2655f86
Merge pull request #187 from nthykier/fix-gh-186-shorten-id-recycle-used-ids
Enable shortenIDs to recycle existing IDs
2020-05-17 16:48:18 +02:00
Niels Thykier
58b75c314a Add test case for #198/#202
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:29:08 +02:00
Niels Thykier
6846e0c9ee Preserve xlink:href attr when collapsing referenced gradients
Closes: #198
Closes: #202
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:29:08 +02:00
Niels Thykier
f61b4d36d6 Add test case for #203
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:13:45 +02:00
Niels Thykier
09a656287d Avoid picking an id-less gradient to replace one with an id
Closes: #203
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:13:45 +02:00
Patrick Storz
695676e3a5 Run tests with Python 3.7 / 3.8 2020-05-17 16:03:06 +02:00
Eduard Braun
049264eba6 Scour v0.37 2018-07-04 19:16:55 +02:00
Eduard Braun
5ccba31ff9 Update HISTORY.md 2018-07-04 19:05:25 +02:00
Patrick Storz
718748ff22
Merge pull request #199 from Ede123/newline_handling
Several improvements for handling whitespace including newlines, especially in text nodes
2018-07-03 22:56:36 +02:00
Eduard Braun
651694a6c0 Add unittests for whitespace handling in text node
Also expand/fix the test for line endings
2018-07-03 22:53:05 +02:00
Eduard Braun
703122369e Strip newlines from text nodes and be done with it
Follow the spec "blindly" as it turns out covering all the border
cases and getting reasonably styled output is just too cumbersome.
This way at least scour output is consistent and it also saves us
some bytes (a lot in some cases as we do not indent <tspan>s etc.
anymore)
2018-07-02 22:14:14 +02:00
Eduard Braun
2200f8dc81 temp 2018-07-02 01:05:54 +02:00
Eduard Braun
e1c2699f07 Improve whitespace handling in text content elements
SVG specifies special logic for handling whitespace, see
   https://www.w3.org/TR/SVG/text.html#WhiteSpace
by implementing it we can even shave off some unneeded bytes here
and there (e.g. consecutive spaces).

Unfortunately handling of newlines by renderers is inconsistent:
Sometimes they are replaced by a single space, sometimes they
are removed in the output.
As we cannot know the expected behavior, work around this by keeping
newlines inside text content elements intact.

Fixes #160.
2018-07-01 20:19:58 +02:00
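For reference, the default whitespace rules that e1c2699f07 points to can be sketched as below. This follows the SVG 1.1 steps for `xml:space="default"` as I read them and is not scour's implementation; the function name is hypothetical. Note that the commit above deliberately deviates from the first step by keeping newlines.

```python
import re


def normalize_default_whitespace(text):
    """Apply the xml:space="default" steps to the content of a text node."""
    # 1. Remove all newline characters.
    text = text.replace("\r\n", "").replace("\n", "").replace("\r", "")
    # 2. Convert tabs into spaces.
    text = text.replace("\t", " ")
    # 3. Strip leading and trailing spaces.
    text = text.strip(" ")
    # 4. Consolidate runs of spaces into a single space.
    return re.sub(" {2,}", " ", text)


normalized = normalize_default_whitespace("  Hello\t\tworld \n  again  ")
```

Step 4 is where the "consecutive spaces" savings mentioned in the commit message come from.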
Eduard Braun
7d28f5e051 Improve handling of newlines
Previously we added way too many and removed empty lines afterwards
(potentially destructive if xml:space="preserve")

Also adds proper indentation for comment nodes
2018-07-01 19:48:18 +02:00
Eduard Braun
06ea23d0e1 fix typo 2018-07-01 13:52:51 +02:00
Patrick Storz
8c95d950af
Merge pull request #192 from nthykier/gh-189-order-vs-SVGLength
Work around an exception in removeDefaultAttributeValue() caused by some rarely used filter attributes that allow an optional second value which SVGLength does not handle properly
2018-06-30 19:03:15 +02:00
Patrick Storz
5d579f8927
Also special-case baseFrequency and add 'radius' 2018-06-30 18:58:36 +02:00
Eduard Braun
3c64623a12 Discontinue official support for Python 3.3
(testing failed due to wheel now requiring Python >= 3.4)

Also run flake8 in latest Python 3.6
(3.7 is not supported on Travis yet)
2018-06-29 19:29:09 +02:00
Patrick Storz
9f4a707bb7
Merge pull request #178 from nthykier/gh-163-path-rewrite
Correct handling of "m0 0" vs. "z" commands
2018-06-29 19:11:53 +02:00
Niels Thykier
8a2892b458 Avoid crashing on stdDeviation attribute
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-21 06:39:08 +00:00
Niels Thykier
c504891bd7 test: Use number-optional-number variant of kernelUnitLength
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-21 06:19:38 +00:00
Tobias Oberstein
47f918e696
Merge pull request #191 from nthykier/gh-190-optimizeTransform-IndexError
Avoid crashing on "scale(1)" (short for "scale(1, 1)")
2018-04-18 19:25:48 +02:00
Niels Thykier
18e57cddae Avoid crashing on "scale(1)" (short for "scale(1, 1)")
The scale function on the transform attribute has a short form, where
only the first argument is used.  But optimizeTransform would always
assume that there were two when checking for the identity scale.

Closes: #190
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-18 05:41:35 +00:00
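A hedged sketch of the identity check that 18e57cddae is about. The function name and the list-of-numbers argument format are assumptions made for illustration and do not mirror optimizeTransform itself.

```python
def is_identity_scale(args):
    """Return True if scale(*args) leaves coordinates unchanged.

    args holds the numeric arguments of scale(): [sx] or [sx, sy].
    With a single argument, sy defaults to sx.
    """
    if len(args) == 1:
        sx = sy = args[0]
    elif len(args) == 2:
        sx, sy = args
    else:
        raise ValueError("scale() takes one or two arguments")
    return sx == 1 and sy == 1
```

Indexing `args[1]` unconditionally is the kind of access that would raise an IndexError for the short form `scale(1)`.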
Niels Thykier
a459d629c1 removeDefaultAttributeValue: Special-case order attribute
Scour tried to handle "order" attribute as a SVGLength.  However, the
"order" attribute *can* consist of two integers according to the
[SVG 1.1 Specification] and SVGLength is not designed to handle that.

With this change, we now pretend that "order" is a string, which side
steps this issue.

[SVG 1.1 Specification]: https://www.w3.org/TR/SVG11/single-page.html#filters-feConvolveMatrixElementOrderAttribute

Closes: #189
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-17 19:48:37 +00:00
Niels Thykier
039022ee9d shortenID: Improve tracking of optimal ID lengths
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-16 18:52:12 +00:00
Niels Thykier
e25b0dae73 Remove a (now) unused parameter to renameID
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 17:36:07 +00:00
Niels Thykier
91503c6d7e renameID: Replace referencedIDs with referringNodes
This change pushes the responsibility of updating referencedIDs to its
callers where needed.  The only caller of renameIDs is shortenIDs and
that works perfectly fine without updating its copy of referencedIDs.
In shortenIDs, we need to be able to lookup which nodes referenced the
"original ID" (and not the "new ID").

While shortenIDs *could* update referencedIDs so it remained valid, it
is extra complexity for no gain.  As an example of this complexity,
imagine if two or more IDs are "rotated" like so:

      Original IDs: a, bb, ccc, dddd
      Mapping:
        dddd -> ccc
        ccc  -> bb
        bb   -> a
        a    -> dddd

While doable within reasonable performance, we do not need to support
it at the moment, so there is no reason to handle that complexity.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 17:35:05 +00:00
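To illustrate why the "rotated" mapping in 91503c6d7e needs care, here is a small sketch with hypothetical helpers. It does not reflect how renameID works; it only shows that applying such a mapping one entry at a time clobbers IDs, whereas a single pass over the original IDs does not.

```python
def apply_in_one_pass(ids, mapping):
    """Look every original ID up exactly once."""
    return [mapping.get(current, current) for current in ids]


def apply_entry_by_entry(ids, mapping):
    """Naive variant: rewrite the whole list for each mapping entry."""
    ids = list(ids)
    for old, new in mapping.items():
        ids = [new if current == old else current for current in ids]
    return ids


original_ids = ["a", "bb", "ccc", "dddd"]
rotation = {"dddd": "ccc", "ccc": "bb", "bb": "a", "a": "dddd"}

rotated = apply_in_one_pass(original_ids, rotation)
clobbered = apply_entry_by_entry(original_ids, rotation)
```

In the naive variant every ID ends up as "dddd", because each step also renames IDs produced by the previous step.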
Niels Thykier
d6406a3470 shortenIDs: Avoid pointless renames of IDs
With the current code, scour could do a pointless remap of an ID,
where there is no benefit in it.  Consider:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<svg xmlns="http://www.w3.org/2000/svg" xmlns:xlink="http://www.w3.org/1999/xlink">
 <defs>
  <rect id="a" width="80" height="50" fill="red"/>
  <rect id="b" width="80" height="50" fill="blue"/>
 </defs>
 <use xlink:href="#a"/>
 <use xlink:href="#b"/>
 <use xlink:href="#b"/>
</svg>
```

In this example, there is no point in swapping the IDs - even if "#b"
is used more often than "#a", they have the same length.  Besides a
performance win on an already scour'ed image, it also means scour will
behave like a function with a fixed-point (i.e. scour eventually stops
altering the image).

To solve this, we no longer check whether we find exactly the same
ID.  Instead, we look at the length of the new ID compared to the
original.  This gives us a slight complication as we can now "reserve"
a "future" ID to avoid the rename.

Thanks to Eduard "Ede_123" Braun for providing the test case.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 17:34:24 +00:00
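The length-based rule from d6406a3470 might look roughly like the sketch below. All names here (`candidate_ids`, `plan_renames`, the `usage` dict) are hypothetical, and the sketch assumes every ID in the document appears in `usage`; it is not the shortenIDs implementation.

```python
from itertools import count, product
from string import ascii_lowercase


def candidate_ids():
    """Yield a, b, ..., z, aa, ab, ... (shortest IDs first)."""
    for length in count(1):
        for letters in product(ascii_lowercase, repeat=length):
            yield "".join(letters)


def plan_renames(usage):
    """Map old ID -> new ID, skipping renames that save nothing.

    usage is a dict of ID -> number of references.
    """
    renames = {}
    assigned = set()  # new IDs already handed out
    reserved = set()  # existing IDs that were kept as they are
    candidates = candidate_ids()
    candidate = next(candidates)
    # The most-referenced IDs get first pick of the short names.
    for old in sorted(usage, key=lambda i: (-usage[i], i)):
        while candidate in reserved:
            candidate = next(candidates)
        if len(old) <= len(candidate) and old not in assigned:
            reserved.add(old)  # no gain: keep it and reserve the name
            continue
        renames[old] = candidate
        assigned.add(candidate)
        candidate = next(candidates)
    return renames


no_change = plan_renames({"a": 1, "b": 2})
shortened = plan_renames({"gradient1": 3, "b": 1})
```

For the SVG in the commit message (`a` used once, `b` used twice) nothing is renamed, so a second run over the output has nothing left to do. The `reserved` set is the sketch's take on "reserving a future ID".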
Eduard Braun
8ddb7d8913 Add valid elements for 'spreadMethod' attribute
Turns out 'default_attributes_universal' is actually empty right now
so we might consider removing it altogether...
2018-04-15 18:40:06 +02:00
Eduard Braun
0ec0732447 Simplify 'default_attributes' handling a bit 2018-04-15 18:33:46 +02:00
Eduard Braun
20dcbcbe64 'default_attributes': make sure 'elements' is a list 2018-04-15 18:31:51 +02:00
Niels Thykier
1650f91ea4 Optimize removeDefaultAttributeValues
Avoid looping over DefaultAttribute(s) that are not relevant for a
given node.  This skips a lot of calls to removeDefaultAttributeValue
but more importantly, it avoids the "node.nodeName not in attribute.elements"
line in removeDefaultAttributeValue.  As attribute.elements is a list, this
becomes expensive for "larger lists" (or in this case when there are a lot
of attributes).

This seems to remove about 1½-2 minutes of runtime (out of ~8) on the
1_42_polytope_7-cube.svg test case provided in #184.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 18:29:58 +02:00
Niels Thykier
5dc1b7a820 scour: Make optimized default_attribute data structures
There are a lot of "DefaultAttribute"s and for a given tag, most of
the "DefaultAttribute"s are not applicable.  Therefore, we create two
data structures to assist us with only dealing with the attributes
that matter.

Here there are two cases:

 * Those that always matter.  These go into
   default_attributes_unrestricted list.
 * Those that matter only based on the node name.  These go into the
   default_attributes_restricted_by_tag with the node name as key
   (with the value being a list of matching attributes).

In the next commit, we will use those for optimizing the removal of
default attributes.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 18:29:58 +02:00
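The two lookup structures from 5dc1b7a820, and the way 1650f91ea4 uses them, can be sketched like this. The `DefaultAttribute` fields and the sample entries are assumptions for illustration; only the names `default_attributes_unrestricted` and `default_attributes_restricted_by_tag` come from the commit message.

```python
from collections import defaultdict, namedtuple

# Hypothetical shape: elements is None when the default applies everywhere.
DefaultAttribute = namedtuple("DefaultAttribute", "name value elements")

default_attributes = [
    DefaultAttribute("opacity", "1", None),
    DefaultAttribute("spreadMethod", "pad",
                     ["linearGradient", "radialGradient"]),
    DefaultAttribute("x", "0", ["svg", "rect"]),
]

default_attributes_unrestricted = []
default_attributes_restricted_by_tag = defaultdict(list)
for attribute in default_attributes:
    if attribute.elements is None:
        default_attributes_unrestricted.append(attribute)
    else:
        for tag in attribute.elements:
            default_attributes_restricted_by_tag[tag].append(attribute)


def applicable_defaults(tag):
    """Return only the defaults worth checking for a node named tag."""
    return (default_attributes_unrestricted
            + default_attributes_restricted_by_tag.get(tag, []))


rect_defaults = [a.name for a in applicable_defaults("rect")]
```

A dict lookup by node name replaces one list membership test per attribute, which is the cost the optimization commit sets out to avoid.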
Niels Thykier
00cf42b554 Rename function to match PEP8 conventions
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-15 16:22:00 +00:00
Niels Thykier
0254014e06 Enable shortenIDs to recycle existing IDs
This patch enables shortenIDs to remap IDs currently in use.  This is
very helpful to ensure that scour does not change between "optimal" and
"suboptimal" choices for IDs as observed in GH#186.

Closes: #186
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-04-13 21:00:35 +00:00
Eduard Braun
3283d6d5ec Simplify control point detection logic
- make controlPoints() return a consistent type like flags()
- rename the ambiguous "reduce_precision" to "is_control_point"
2018-04-08 16:48:33 +02:00
Eduard Braun
103dcc0a48
Fix handling of boolean flags in elliptical path commands (#183)
* properly parse paths without space after boolean flags (fixes #161)
* omit space after boolean flag to shave off a few bytes when not using renderer workarounds
2018-04-08 15:32:47 +02:00
Niels Thykier
ba7f4b5f18 Remove more redundant uses of .keys()
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-26 22:36:19 +02:00
Niels Thykier
f8d5af0e56 Remove now unused variable
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-26 22:36:19 +02:00
Eduard Braun
d508f59aa6 Completely remove "walltime" variable and use time.time() directly 2018-03-26 22:34:11 +02:00
Niels Thykier
b622642aa1 Simplify timer selection to always use time.time() (#175)
In python2.7 and python3.3, time.time() is sufficiently accurate for our
purpose and avoids going through hoops to select the best available
function.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-26 22:30:25 +02:00
Niels Thykier
38274f75bc Implement a basic rewrite of redundant commands
This basic implementation can drop and rewrite some cases of "m0 0"
and "z" without triggering the issues experienced in #163.  It works
by analysing the path backwards and tracking "z" and "m" commands.

Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-11 08:33:50 +00:00
Niels Thykier
a2c94c96fb Disable the "m0 0"-optimization as it is wrong in some cases
The "m0 0" rewrite gets some cases wrong, like:

         m150 240h200m0 0 150 150v-300z

Scour rewrote that into the following
         m150 240h200l150 150v-300z

However, these two paths do not produce an identical figure at all.
The first is a line followed by a triangle while the second is a
quadrilateral.

While there are some instances we can rewrite (that scour will no
longer rewrite), these will require an analysis over multiple commands
to determine whether the rewrite is safe.  This will reappear in the
next commit.

Closes: #163
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-11 08:25:46 +00:00
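The difference between the two paths in a2c94c96fb can be checked with a small sketch that expands relative commands into absolute points. The `subpaths` helper is hypothetical and intentionally limited: it only understands relative m, l, h, v and z with plain decimal numbers, and it assumes every z is followed by an m or the end of the path.

```python
import re


def subpaths(d):
    """Return each subpath of d as a list of absolute (x, y) points."""
    x = y = 0.0
    start = (x, y)
    result = []
    for command, raw_args in re.findall(r"([mlhvz])([^mlhvz]*)", d):
        args = [float(n) for n in re.findall(r"-?\d+(?:\.\d+)?", raw_args)]
        if command == "m":
            # The first pair moves the pen and opens a new subpath.
            x, y = x + args[0], y + args[1]
            start = (x, y)
            result.append([start])
            # Any further pairs are implicit relative lineto commands.
            deltas = list(zip(args[2::2], args[3::2]))
        elif command == "l":
            deltas = list(zip(args[0::2], args[1::2]))
        elif command == "h":
            deltas = [(dx, 0.0) for dx in args]
        elif command == "v":
            deltas = [(0.0, dy) for dy in args]
        else:  # "z" closes the subpath by returning to its start
            result[-1].append(start)
            x, y = start
            deltas = []
        for dx, dy in deltas:
            x, y = x + dx, y + dy
            result[-1].append((x, y))
    return result


original = subpaths("m150 240h200m0 0 150 150v-300z")
rewritten = subpaths("m150 240h200l150 150v-300z")
```

The original yields two subpaths, an open line and a triangle closed at (350, 240). The rewritten path yields a single subpath closed at (150, 240), i.e. a quadrilateral, because "z" returns to the start of the most recent "m".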
Niels Thykier
6ea126d290 Gracefully handle unreferenced gradients with --keep-unreferenced-defs (#173)
Closes: #156
Signed-off-by: Niels Thykier <niels@thykier.net>
2018-03-10 16:06:50 +01:00