Commit graph

198 commits

Author SHA1 Message Date
Nikita Karamov
56910e6e9c
Fix formatting 2023-01-23 07:26:43 +01:00
Nikita Karamov
24c297cfe7
Remove six dependency 2023-01-23 07:26:43 +01:00
Nikita Karamov
9141483527
Use dict literals 2023-01-23 06:51:29 +01:00
Nikita Karamov
c3b305cc5d
Remove unneeded six stuff 2023-01-23 06:51:29 +01:00
Nikita Karamov
1cc86cc3c8
Use str.format() 2023-01-23 06:51:29 +01:00
Nikita Karamov
d1fd32fd5b
Do not explicitly extend object 2023-01-23 06:51:29 +01:00
Nikita Karamov
1ceeaf11e6
Remove __future__ imports 2023-01-23 06:51:18 +01:00
Nikita Karamov
a284dec2f9
Remove "coding" pragma 2023-01-23 06:51:06 +01:00
Nikita Karamov
27bb4cef92
Do not use u-strings 2023-01-23 06:44:41 +01:00
a1346054
0609c59676
Fix spelling (#284) 2021-08-30 19:17:00 +02:00
Niels Thykier
fbf0c06e84 Avoid mutating a mutable kwarg
Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
841ad54e7f Refactor function to avoid double negative
Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
68c1e545da Replace global stats vars with a ScourStats object
This enables us to get rid of all the global variables.

I used the opportunity to update function names where call sites were
affected to move scour a step towards a more pythonic style in
general.

Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
a7a16799a2 Remove some dead assignments
Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
7b9c4ee935 Simplify loop logic
Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
aa9796ea87 Refactor: Create a g_tag_is_unmergeable
Both `mergeSiblingGroupsWithCommonAttributes` and `removeNestedGroups`
used the same code in different forms.  Extract it into its own
function.

Signed-off-by: Niels Thykier <niels@thykier.net>
2021-02-23 20:00:20 +01:00
Niels Thykier
b8a071f995
scour: Fix another variant of the crash from #260 (#264)
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-11-22 15:00:43 +01:00
Niels Thykier
f56843acc0
mergeSiblingGroupsWithCommonAttributes: Avoid creating "empty" <g>-tags (#261)
Closes: #260
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-09-02 17:03:36 +00:00
Niels Thykier
f0788d5c0d
renameID: Fix bug when swapping two IDs
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-27 10:26:20 +00:00
Niels Thykier
9a1286132f
remapNamespacePrefix: Preserve prefix of attribute names (#255)
Preserve prefix of attribute names when copying them over to the new
node.  This fixes an unintentional rewrite of `xml:space` to `space`
that also caused scour to strip whitespace that should have been
preserved.

Closes: #239
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-10 20:18:21 +02:00
Niels Thykier
ca2b32c0b3
removeDuplicateGradients: Maintain referenced_ids
This avoids calling `findReferencedElements` more than once per
removeDuplicateGradients.  This is good for performance as
`findReferencedElements` is one of the slowest functions in scour.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:12 +00:00
Niels Thykier
3d29029c72
findReferencedElements: Use a set instead of list for tracking nodes
Except for one caller, nothing cares what kind of collection is used.
By migrating to a set, we can enable a future rewrite.
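
A minimal sketch of the idea, assuming the tracked references are kept in a dict that maps an ID to the collection of nodes referring to it. The helper name `record_reference` and the string stand-ins for DOM nodes are hypothetical, not scour's code:

```python
def record_reference(referenced, element_id, node):
    """Remember that `node` refers to `element_id`, ignoring repeats."""
    # A set keeps each referring node once, and membership checks do not
    # scan the whole collection the way they would with a list.
    referenced.setdefault(element_id, set()).add(node)


refs = {}
record_reference(refs, 'grad1', 'rect-node')
record_reference(refs, 'grad1', 'rect-node')  # duplicate, no effect
record_reference(refs, 'grad1', 'path-node')
```

A caller that depends on ordering would still have to sort or copy the set explicitly.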

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:11 +00:00
Niels Thykier
0e82b8dcad
Refactor removeDuplicateGradients to loop until it reaches a fixed point
This commit enables a future optimization (but is not a notable
optimization in itself).
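
Looping until a fixed point can be sketched roughly as follows, assuming each pass reports how many items it removed. `run_until_fixed_point` and the canned pass results are hypothetical and only illustrate the control flow:

```python
def run_until_fixed_point(step):
    """Call `step` repeatedly until it reports that nothing changed."""
    total = 0
    while True:
        removed = step()
        if removed == 0:
            # A pass that removes nothing means a fixed point was reached.
            return total
        total += removed


# Stand-in for a real pass: three passes that remove 3, 1 and 0 items.
passes = iter([3, 1, 0])
total_removed = run_until_fixed_point(lambda: next(passes))
```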

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:10 +00:00
Niels Thykier
a3f761f40c
Refactor some code out of removeDuplicateGradients
This commit enables a future optimization (but is not a notable
optimization in itself).

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:09 +00:00
Niels Thykier
36ee0932a4
removeDuplicateGradients: Compile at most one regex per master gradient
Regex compilation is by far the most expensive part of
removeDuplicateGradients.  This commit reduces the pain a bit by
trading "many small regexes" for "few larger regexes", which avoids some
of the compilation overhead.
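
One way to trade many small regexes for a single larger one is an alternation over all duplicate IDs. This is a sketch under assumptions: the helper name `build_duplicate_pattern`, the `url(#...)` reference form and the sample IDs are illustrative and are not taken from scour's source:

```python
import re


def build_duplicate_pattern(duplicate_ids):
    """Compile one regex that matches a url(#id) reference to any duplicate."""
    # One alternation over all duplicate IDs instead of one regex per ID.
    alternation = '|'.join(re.escape(dup_id) for dup_id in duplicate_ids)
    return re.compile(r'url\(#(?:' + alternation + r')\)')


pattern = build_duplicate_pattern(['grad2', 'grad3'])
style = 'fill:url(#grad2);stroke:url(#grad3);marker:url(#grad20)'
# Point every reference to a duplicate at the master gradient instead.
rewritten = pattern.sub('url(#grad1)', style)
```

Because the closing parenthesis is part of the pattern, an ID that merely starts with a duplicate's name (`grad20` here) is left untouched.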

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:08 +00:00
Niels Thykier
9e3a5f2e40
removeDuplicateGradients: Refactor how duplicates are passed around
This commit is mostly to enable the following commit to make
improvements.  It does reduce the number of duplicate getAttribute
calls by a tiny bit but it is unlikely to matter in practice.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:07 +00:00
Niels Thykier
ace24df5c3
removeDuplicateGradients: Avoid compiling regex unless we need it
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-09 16:45:05 +00:00
Patrick Storz
985cb58a26 Remove outdated comment
originally added in
  879300373f
and fixed shortly after in
  2dc788aa3f
2020-06-08 19:45:48 +02:00
Niels Thykier
fd2daf44b4
Avoid compiling "the same" regex multiple times
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:06:51 +00:00
Niels Thykier
045f1f0ad5
removeNamespacedElements: Avoid calling it twice as it is idempotent
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:04:43 +00:00
Niels Thykier
29a7474f74
removeNamespacedAttributes: Avoid calling it twice as it is idempotent
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:04:42 +00:00
Niels Thykier
528ad91418
removeUnusedDefs: Call getAttribute at most once per element
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:04:41 +00:00
Niels Thykier
c5362743c3
_getStyle: Avoid calling getAttribute twice for no reason
_getStyle accounted for ~8.9% (~17700) of all calls to getAttribute on
devices/hidef/secure-card.svgz file from the Oxygen icon theme.  This
commit removes this part of the dead weight.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:04:40 +00:00
Niels Thykier
5881890e44
removeUnreferencedElements: Remove defs before unref elements
The `removeUnusedDefs` function can take `referencedIDs` as parameter
and its work does not invalidate it.  By moving it up in
`removeUnreferencedElements` we can save a call to
`findReferencedElements` per call to `removeUnreferencedElements`.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 19:04:39 +00:00
Niels Thykier
397ffc5529
make_well_formed: Optimize for the common case of nothing needs to be escaped
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:07:32 +00:00
Niels Thykier
9656569a72
serializeXML: Refactor the attribute ordering code
Rewrite the code for ordering attributes in the output and extract it
into a function.  As a side-effect, we ensure we only use the
`.item(index)` method once per attribute because it is inefficient
(see https://bugs.python.org/issue40689).
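
A rough illustration of fetching each attribute with `.item(index)` exactly once and ordering the results afterwards, using `xml.dom.minidom`. The function name `ordered_attributes` and the preferred order `('id', 'd')` are assumptions made for this sketch, not scour's actual ordering rules:

```python
from xml.dom import minidom


def ordered_attributes(element, preferred=('id', 'd')):
    """Return the element's attribute nodes, preferred names first."""
    attrs = element.attributes
    # Call .item(index) exactly once per attribute and keep the result.
    collected = [attrs.item(index) for index in range(attrs.length)]
    rank = {name: position for position, name in enumerate(preferred)}
    # Preferred names sort by their rank, everything else alphabetically.
    return sorted(
        collected,
        key=lambda attr: (rank.get(attr.nodeName, len(rank)), attr.nodeName),
    )


element = minidom.parseString(
    '<path stroke="#000" d="m1 1" id="p1" fill="none"/>'
).documentElement
names = [attr.nodeName for attr in ordered_attributes(element)]
```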

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:07:31 +00:00
Niels Thykier
5be6b03d7c
Serialization: Avoid creating a single-use dict in each call to make_well_formed
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:07:30 +00:00
Niels Thykier
21f1262bcb
Avoid creating single-use-throw-away lists for string join
There is no need to create a list of it only to discard it after a
single use with join (which gladly accepts an iterator/generator
instead).
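
As a small illustration, `str.join` can consume a generator expression directly. The helper `format_attributes` is hypothetical and does not appear in scour:

```python
def format_attributes(attrs):
    """Render (name, value) pairs as a space-separated attribute string."""
    # The generator expression is handed straight to join; the caller
    # never builds a throw-away list of its own.
    return ' '.join('{}="{}"'.format(name, value) for name, value in attrs)


formatted = format_attributes([('fill', 'none'), ('stroke', '#000')])
```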

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:07:29 +00:00
Niels Thykier
47e8b15315
convertColors: Fix bug in computation in how many bytes are saved (#245)
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-06-07 18:35:46 +02:00
Niels Thykier
dd2155e576 Merge sibling <g> nodes with identical attributes
In some cases, gnuplot generates very suboptimal SVG content of the
following pattern:

        <g color="black" fill="none" stroke="currentColor">
        <path d="m82.5 323.3v-4.1" stroke="#000"/>
        </g>
        <g color="black" fill="none" stroke="currentColor">
        <path d="m116.4 323.3v-4.1" stroke="#000"/>
        </g>
        ... repeated 10+ more times here ...
        <g color="black" fill="none" stroke="currentColor">
        <path d="m65.4 72.8v250.5h420v-250.5h-420z" stroke="#000"/>
        </g>

A more optimal pattern would be:

        <g color="black" fill="none" stroke="#000">
        <path d="m82.5 323.3v-4.1"/>
        <path d="m116.4 323.3v-4.1"/>
        ... 10+ more paths here ...
        <path d="m65.4 72.8v250.5h420v-250.5h-420z"/>
        </g>

This patch enables that optimization by handling the merging of two
sibling <g> entries that have identical attributes.  In the above
example that does not solve the rewrite from "currentColor" to "#000"
for the stroke attribute.  However, the existing code already handles
that automatically after the <g> elements have been merged.
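
The merging step can be sketched with `xml.dom.minidom` as below. This is a simplified illustration, not the patch itself: the names `attributes_of` and `merge_sibling_groups` are hypothetical, only directly adjacent `<g>` elements are considered, and any checks for groups that must not be merged are left out:

```python
from xml.dom import minidom


def attributes_of(node):
    """Return the element's attributes as a plain dict for comparison."""
    return dict(node.attributes.items())


def merge_sibling_groups(parent):
    """Merge adjacent <g> children of `parent` with identical attributes."""
    merged = 0
    previous = None
    for child in list(parent.childNodes):
        is_group = child.nodeType == child.ELEMENT_NODE and child.tagName == 'g'
        if (is_group and previous is not None
                and attributes_of(child) == attributes_of(previous)):
            # Move the children into the earlier group, then drop the empty one.
            while child.firstChild is not None:
                previous.appendChild(child.firstChild)
            parent.removeChild(child)
            merged += 1
        else:
            # Any other node in between breaks the run of mergeable groups.
            previous = child if is_group else None
    return merged


doc = minidom.parseString(
    '<svg><g fill="none"><path d="m1 1"/></g>'
    '<g fill="none"><path d="m2 2"/></g>'
    '<g fill="red"><path d="m3 3"/></g></svg>'
)
root = doc.documentElement
merged_count = merge_sibling_groups(root)
```

In this example the first two groups share the same attributes and are merged, while the third keeps its own `<g>` because its `fill` differs.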

This change provides comparable results to --create-groups as shown by
the following table while being a distinct optimization:

       +----------------------------+-------+--------+
       |           Test             | Size  |  in %  |
       +----------------------------+-------+--------+
       | baseline                   | 17961 |  100%  |
       | baseline + --create-groups | 17418 |  97.0% |
       | patched                    | 16939 |  94.3% |
       | patched + --create-groups  | 16855 |  93.8% |
       +----------------------------+-------+--------+

The image used in the size table above was generated based on the
instructions from https://bugs.debian.org/858039#10 with gnuplot 5.2
patchlevel 2.  Beyond the test-based "--create-groups", the following
scour command-line parameters were used:
      --enable-id-stripping --enable-comment-stripping \
      --shorten-ids --indent=none

Note that the baseline was scour'ed repeatedly to stabilize the image
size.

Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 19:37:32 +02:00
Patrick Storz
40753af88a Fix whitespace handling for SVG 1.2 flowed text
See 718748ff22

Fixes https://github.com/scour-project/scour/issues/235
2020-05-17 17:33:50 +02:00
Patrick Storz
4fe2655f86
Merge pull request #187 from nthykier/fix-gh-186-shorten-id-recycle-used-ids
Enable shortenIDs to recycle existing IDs
2020-05-17 16:48:18 +02:00
Niels Thykier
6846e0c9ee Preserve xlink:href attr when collapsing referenced gradients
Closes: #198
Closes: #202
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:29:08 +02:00
Niels Thykier
09a656287d Avoid picking an id-less gradient to replace one with an id
Closes: #203
Signed-off-by: Niels Thykier <niels@thykier.net>
2020-05-17 16:13:45 +02:00
Eduard Braun
703122369e Strip newlines from text nodes and be done with it
Follow the spec "blindly" as it turns out covering all the border
cases and getting reasonably styled output is just too cumbersome.
This way at least scour output is consistent and it also saves us
some bytes (a lot in some cases as we do not indent <tspan>s etc.
anymore)
2018-07-02 22:14:14 +02:00
Eduard Braun
2200f8dc81 temp 2018-07-02 01:05:54 +02:00
Eduard Braun
e1c2699f07 Improve whitespace handling in text content elements
SVG specifies special logic for handling whitespace, see
   https://www.w3.org/TR/SVG/text.html#WhiteSpace
by implementing it we can even shave off some unneeded bytes here
and there (e.g. consecutive spaces).

Unfortunately handling of newlines by renderers is inconsistent:
Sometimes they are replaced by a single space, sometimes they
are removed in the output.
As we cannot know the expected behavior, work around this by keeping
newlines inside text content elements intact.

Fixes #160.
2018-07-01 20:19:58 +02:00
Eduard Braun
7d28f5e051 Improve handling of newlines
Previously we added way too many and removed empty lines afterwards
(potentially destructive if xml:space="preserve")

Also adds proper indentation for comment nodes
2018-07-01 19:48:18 +02:00
Eduard Braun
06ea23d0e1 fix typo 2018-07-01 13:52:51 +02:00
Patrick Storz
8c95d950af
Merge pull request #192 from nthykier/gh-189-order-vs-SVGLength
Work around an exception in removeDefaultAttributeValue() caused by some rarely used filter attributes that allow an optional second value which SVGLength does not handle properly
2018-06-30 19:03:15 +02:00