* src/smooth/ftgrays.c (gray_render_conic): Move variable and
structure declarations to beginning of function. Inspite of C99
compliance we still do this for the sake of backward compatibility.
This also avoids a shadowing declaration of `count`.
(gray_convert_glyph_inner): Fix typo.
Guard inclusion of emmintrin.h with "#ifdef __SSE2__". The gcc version
of this header, xmmintrin.h, and mmintrin.h check that the appropriate
defines are set before defining anything (are internally guarded).
However the clang versions of these includes are not internally guarded.
As a result of this, externally guard the inclusion of these headers.
Benchmarking shows that this provides a very slighty performance
boost when rendering fonts with lots of quadratic bezier arcs,
compared to the recursive arc splitting, but only when SSE2 is
available, or on 64-bit CPUs.
On a 2017 Core i5-7300U CPU on Linux/x86_64:
./ftbench -p -s10 -t5 -cb .../DroidSansFallbackFull.ttf
Before: 4.033 us/op (best of 5 runs for all numbers)
After: 3.876 us/op
./ftbench -p -s60 -t5 -cb .../DroidSansFallbackFull.ttf
Before: 13.467 us/op
After: 13.385 us/op
This speeds up the smooth rasterizer by avoiding a
conditional branches in the hot path. Namely:
- Define a fixed "null cell" which will be pointed
to whenever the current cell is outside of the current
target region. This avoids a "ras.cell != NULL"
check in the FT_INTEGRATE() macro.
- Also use the null cell as a sentinel at the end of
all ycells[] linked-lists, by setting its x coordinate
to INT_MAX. This avoids a 'if (!cell)' check in
gray_set_cell() as well.
- Slightly change the worker struct fields to perform
a little less operations during rendering.
Example results (on a 2013 Corei5-3337U CPU)
out/ftbench -p -s10 -t5 -bc /usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf
Before: 5.472 us/op
After: 5.275 us/op
out/ftbench -p -s60 -t5 -bc /usr/share/fonts/truetype/droid/DroidSansFallbackFull.ttf
Before: 17.988 us/op
After: 17.389 us/op
FT_Render_Glyph picked up FAILURE or 1 returned from the raster
function, which became a confusing error code. Instead, return
Raster_Overflow in the unlikely event that banding does not help or
another meaningful error.
* src/smooth/ftgrays.c (gray_convert_glyph_inner, gray_convert_glyph):
Use Raster_Overflow when the rendering pool is exhausted and return it
if banding does not help.
(gray_raster_render): Use Smooth_Err_Ok.
* src/raster/ftraster.c (Render_Single_Pass): Return Raster_Overflow
if banding does not help or another error code.
Selecting the fill rule or checking the direct mode each time we call
`gray_hline' is sub-optimal. This effectively splits the direct mode
into a separate code path while inlining `gray_hline' and saving 5-7%
of rendering time.
* src/smooth/ftgrays.c (gray_hline): Eliminated in favor of...
(FT_FILL_RULE, FT_GRAY_SET): ... these new macros...
(gray_sweep): ... inlined here.
(gray_sweep_direct): New function that handles the direct span buffer.
(gray_TWorker): Remove the span buffer.
(gray_raster_render, gray_convert_glyph): Updated.
We now record `cover' and `area' directly into the linked list. This
makes rendering faster by 10% or even more at larger sizes.
* src/smooth/ftgrays.c (FT_INTEGRATE): Write directly.
(gray_TWorker): Add direct cell reference and remove unused fields.
(gray_set_cell): Consolidate the linked list management and pointers.
(gray_convert_glyph, gray_convert_glyph_inner): Updated.
We no longer have to take care of the 8.3 file name limit; this
allows us (a) to introduce longer, meaningful file names, and (b) to
avoid macro names in `#include' lines altogether since some
compilers (most notably Visual C++) doesn't support this properly.
*/*: Replace
#include FOO_H
with
#include <freetype/foo.h>
or something similar. Also update the documentation.
The buffer size FT_MAX_GRAY_SPANS is set to 10 spans, which should be
enough to cover the entire scanline for simple glyphs in most cases:
each slightly slanted edge needs up to two spans, plus a filling span
in-between. This is not new, we used to do it before cb4388783c.
* src/smooth/ftgrays.c (gray_TWorker): Add `spans' and `num_spans'.
(gray_hline, gray_sweep): Implement the span buffering.
(gray_raster_render): Use negative `num_spans' to avoid the direct
mode.
The previous implementation is correct but it is too complex.
The revised algorithm is based on the fact that each split moves
the control points closer to the trisection points on the chord.
The corresponding distances are good surrogates for the curve
deviation from the straight line.
This cubic flattening algorithm is somewhat similar to the conic
algorithm based the distance from the control point to the middle of
the chord. The cubic distances, however, decrease less predictably
but are easy enough to calculate on each step.
* src/smooth/ftgrays.c (gray_render_cubic): Replace the split
condition.
* src/raster/ftraster.c [STANDALONE] (FT_Outline_Get_CBox): Add
function.
[!STANDALONE]: Include FT_OUTLINE_H.
(ft_black_render): Compute CBox and reject glyphs larger than
0xFFFF x 0xFFFF.
* src/smooth/ftgrays.c (gray_raster_render): Reject glyphs larger
than 0xFFFF x 0xFFFF.
This monster commit was created by applying Nikhil's scripts
`docconverter.py' and `markify.py' to all C header and source files,
followed up by minor manual clean-up.
No change in functionality, of course.
I used commit f7419907bc6044b9b7057f9789866426c804ba82 from
https://github.com/nikramakrishnan/freetype-docs.git.
The robust rendering of estra large glyphs came with unbearable cost.
The old way of bisecting should fail but fail faster.
* src/smooth/ftgrays.c (gray_convert_glyph): Switch back to bisecting
in y-direction.
*/* [FT_CONFIG_OPTION_PIC]: Remove all code guarded by this
preprocessor symbol.
*/*: Replace `XXX_GET' macros (which could be either a function in
PIC mode or an array in non-PIC mode) with `xxx' arrays.
* include/freetype/internal/ftpic.h, src/autofit/afpic.c,
src/autofit/afpic.h, src/base/basepic.c, src/base/basepic.h,
src/base/ftpic.c, src/cff/cffpic.c, src/cff/cffpic.h,
src/pshinter/pshpic.c, src/pshinter/pshpic.h, src/psnames/pspic.c,
src/psnames/pspic.h, src/raster/rastpic.c, src/raster/rastpic.h,
src/sfnt/sfntpic.c, src/sfnt/sfntpic.h, src/smooth/ftspic.c,
src/smooth/ftspic.h, src/truetype/ttpic.c, src/truetype/ttpic.h:
Removed.
We used to split large glyphs into horizontal bands and continue
bisecting them still horizontally if that was not enough. This is
guaranteed to fail when a single scanline cannot fit into the
rendering memory pool. Now we bisect the bands vertically so that
the smallest unit is a column of the band height, which is guranteed
to fit into memory.
* src/smooth/ftgrays.c (gray_convert_glyph): Implement it.
At large sizes almost but not exactly horizontal segments can quickly
drain the rendering pool. This patch at least avoids filling the pool
with trivial cells. Beyond this, we can only increase the pool size.
Reported, analyzed, and tested by Colin Fahey.
* src/smooth/ftgrays.c (gray_set_cell): Do not record trivial cells.