Use ryu and fastfloat for double serialization/deserialization #49

Merged: 20 commits, Sep 7, 2023

@paleolimbot (Contributor) commented Sep 5, 2023

Improves WKT parsing performance by ~1.5x and WKT printing performance by >6x. More importantly, this makes both locale-independent.
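
Why locale-independence matters can be sketched with a toy model of a locale-aware C parser. The `strtod_like` helper below is hypothetical and heavily simplified (no `inf`, `nan`, hex floats, or leading-dot forms). It assumes only that a `strtod`-style function parses the longest numeric prefix using the active locale's decimal separator. It is not code taken from this PR.

```python
import re
from typing import Tuple


def strtod_like(text: str, decimal_point: str = ".") -> Tuple[float, str]:
    """Toy model of C strtod(): parse the longest numeric prefix.

    `decimal_point` stands in for the separator of the active LC_NUMERIC
    locale (hypothetical parameter, for illustration only).
    """
    pattern = rf"[+-]?\d+(?:{re.escape(decimal_point)}\d*)?(?:[eE][+-]?\d+)?"
    match = re.match(pattern, text)
    if match is None:
        return 0.0, text  # nothing parsed, input left untouched
    number = match.group(0).replace(decimal_point, ".")
    return float(number), text[match.end():]


# "C" locale: the whole coordinate is consumed.
print(strtod_like("1.5 2.5", decimal_point="."))  # (1.5, ' 2.5')
# Comma-decimal locale: parsing stops at the '.', truncating the value.
print(strtod_like("1.5 2.5", decimal_point=","))  # (1.0, '.5 2.5')
```

Under a comma-decimal locale the same WKT coordinate text is silently truncated to `1.0`. A parser that ignores the process locale and always treats `.` as the separator does not have this failure mode.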

import geoarrow.pyarrow as ga
import pyarrow as pa
import numpy as np

n = int(1e6)
xs = np.random.random(n)
ys = np.random.random(n)
# One million random points as a GeoArrow point array
points = ga.point().wrap_array(pa.StructArray.from_arrays([xs, ys], names=["x", "y"]))
# The same points serialized to WKT, used as parser input below
points_wkt = ga.as_wkt(points)

# WKT parsing (text to doubles)
%timeit ga.as_geoarrow(points_wkt)
#> Before:
#> 206 ms ± 713 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
#> After:
#> 140 ms ± 351 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

# WKT printing (doubles to text)
%timeit ga.as_wkt(points)
#> Before:
#> 634 ms ± 1.37 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
#> After:
#> 83 ms ± 339 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
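
On the printing side, Ryu is known for emitting the shortest decimal string that parses back to the identical double. The brute-force sketch below illustrates that guarantee only. It is not Ryu's algorithm, and `shortest_roundtrip` is a hypothetical helper name. It also shows why a fixed 17-significant-digit format, which always round-trips, tends to produce longer text.

```python
def shortest_roundtrip(x: float) -> str:
    """Shortest '%g'-style string that parses back to exactly x (brute force)."""
    for digits in range(1, 18):  # 17 significant digits always round-trip a double
        text = f"{x:.{digits}g}"
        if float(text) == x:
            return text
    return repr(x)  # only reached for NaN, which never compares equal


print(shortest_roundtrip(0.1))    # 0.1
print(f"{0.1:.17g}")              # 0.10000000000000001
print(shortest_roundtrip(1 / 3))  # 0.3333333333333333
```

Shorter output means fewer bytes of WKT per coordinate while still reading back to the same value.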

codecov bot commented Sep 5, 2023

Codecov Report

Merging #49 (960e143) into main (4dac0bd) will increase coverage by 1.07%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##             main      #49      +/-   ##
==========================================
+ Coverage   92.48%   93.56%   +1.07%     
==========================================
  Files          25       30       +5     
  Lines        4271     5003     +732     
  Branches       59        0      -59     
==========================================
+ Hits         3950     4681     +731     
- Misses        288      322      +34     
+ Partials       33        0      -33     
| Files Changed | Coverage Δ |
|---|---|
| python/src/geoarrow/pandas.py | 93.46% <100.00%> (ø) |
| python/src/geoarrow/pyarrow/_compute.py | 98.04% <100.00%> (ø) |
| python/src/geoarrow/pyarrow/_kernel.py | 94.25% <100.00%> (ø) |
| src/geoarrow/double_parse_fast_float.cc | 100.00% <100.00%> (ø) |
| src/geoarrow/double_print.c | 100.00% <100.00%> (ø) |
| src/geoarrow/kernel.c | 94.49% <100.00%> (+2.58%) ⬆️ |
| src/geoarrow/util.c | 77.77% <100.00%> (+2.77%) ⬆️ |
| src/geoarrow/wkt_reader.c | 97.58% <100.00%> (+1.42%) ⬆️ |
| src/geoarrow/wkt_writer.c | 98.50% <100.00%> (+0.69%) ⬆️ |
| src/geoarrow/wkx_testing.hpp | 94.11% <100.00%> (ø) |

... and 11 files with indirect coverage changes

@paleolimbot paleolimbot marked this pull request as ready for review September 7, 2023 12:57
@paleolimbot paleolimbot merged commit 73a5d09 into geoarrow:main Sep 7, 2023
5 checks passed
@paleolimbot paleolimbot deleted the parsing-libs branch September 7, 2023 13:17