Commit

Separate Python packages (#50)
* start separating python packages

* maybe CI

* maybe docs

* move import to geoarrow.c

* with passing test

* readme

* fix a few references

* maybe fix doctests

* fix package name

* fix install instructions

* maybe fix doctest

* move doctests to the end

* maybe fix wheel name

* maybe fix

* actions

* maybe fix doc

* also ignore opt/ for coverage

* maybe actually get doctests to run

* maybe more portable doctest
paleolimbot authored Sep 7, 2023
1 parent 73a5d09 commit a50551d
Showing 35 changed files with 157 additions and 194 deletions.
6 changes: 0 additions & 6 deletions .env

This file was deleted.

32 changes: 19 additions & 13 deletions .github/workflows/python.yaml
@@ -32,30 +32,22 @@ jobs:

- name: Install geoarrow
run: |
pushd python
pushd python/geoarrow-c
pip install .[test]
popd
pip list
- name: Run tests
run: |
pytest python/tests -v -s
- name: Run doctests
if: success() && matrix.python-version == '3.10'
run: |
pytest --pyargs geoarrow --doctest-modules
# No Cython docs yet
# pip install pytest-cython
# pytest --pyargs geoarrow --doctest-cython
pytest python/geoarrow-c/tests -v -s
- name: Coverage
if: success() && matrix.python-version == '3.10'
run: |
sudo apt-get install -y lcov
pip uninstall --yes geoarrow
pip install pytest-cov Cython
pushd python
pushd python/geoarrow-c
# Build with Cython + gcc coverage options
pip install -e .[test]
@@ -69,9 +61,10 @@ jobs:
lcov \
--capture --directory build \
--exclude "/usr/*" \
--exclude "/opt/*" \
--exclude "/Library/*" \
--exclude "*/_lib.cpp" \
--exclude "*/src/geoarrow/geoarrow/*" \
--exclude "*/src/geoarrow/c/geoarrow/*" \
--output-file=coverage.info
lcov --list coverage.info
@@ -80,4 +73,17 @@ jobs:
if: success() && matrix.python-version == '3.10'
uses: codecov/codecov-action@v2
with:
files: 'python/coverage.info,python/coverage.xml'
files: 'python/geoarrow-coverage.info,python/geoarrow-c/coverage.xml'

- name: Run doctests
if: success() && matrix.python-version == '3.10'
run: |
# Because of namespace packaging we have to add this here and
# rebuild to avoid confusing pytest
touch python/geoarrow-c/src/geoarrow/__init__.py
pip install python/geoarrow-c
pytest --pyargs geoarrow.c --doctest-modules
# No Cython docs yet
# pip install pytest-cython
# pytest --pyargs geoarrow --doctest-cython
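
The doctest step above refers to namespace packaging: after this commit, `geoarrow` is a namespace without its own `__init__.py`, and `geoarrow.c` is one package inside it. As a rough illustration of that mechanism (not part of this commit), the sketch below builds two throwaway source trees that both contribute to a single namespace. The names `demo_ns`, `other`, `make_distribution`, and `demo` are hypothetical and were chosen so the example cannot collide with an installed `geoarrow`.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def make_distribution(root: Path, namespace: str, subpackage: str) -> None:
    """Create root/<namespace>/<subpackage>/__init__.py.

    No __init__.py is written at the namespace level, which is what
    makes <namespace> an implicit namespace package.
    """
    pkg_dir = root / namespace / subpackage
    pkg_dir.mkdir(parents=True)
    (pkg_dir / "__init__.py").write_text(f"NAME = '{namespace}.{subpackage}'\n")


def demo():
    """Import two subpackages that live in separate source trees."""
    with tempfile.TemporaryDirectory() as tmp:
        dist_a = Path(tmp) / "dist_a"
        dist_b = Path(tmp) / "dist_b"
        make_distribution(dist_a, "demo_ns", "c")
        make_distribution(dist_b, "demo_ns", "other")

        added = [str(dist_a), str(dist_b)]
        sys.path[:0] = added
        try:
            importlib.invalidate_caches()
            c = importlib.import_module("demo_ns.c")
            other = importlib.import_module("demo_ns.other")
            namespace = sys.modules["demo_ns"]
            # A namespace package has no file of its own.
            has_no_file = getattr(namespace, "__file__", None) is None
            return c.NAME, other.NAME, has_no_file
        finally:
            # Undo the changes to interpreter state made for the demo.
            for entry in added:
                sys.path.remove(entry)
            for name in ("demo_ns.c", "demo_ns.other", "demo_ns"):
                sys.modules.pop(name, None)


if __name__ == "__main__":
    print(demo())
```

Both subpackages import even though they sit in different directories, because neither tree defines `demo_ns/__init__.py`. The workflow step temporarily adds such a file for `geoarrow` before running the doctests; the commit does not spell out exactly how pytest is confused without it.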
2 changes: 1 addition & 1 deletion README.md
@@ -18,7 +18,7 @@ release of the geoarrow specification.
## Get started in Python

```python
import geoarrow.pyarrow as ga
import geoarrow.c.pyarrow as ga

ga.point()
# PointType(geoarrow.point)
4 changes: 2 additions & 2 deletions ci/scripts/build-docs.sh
@@ -51,10 +51,10 @@ main() {
# pip install . doesn't quite work with the setuptools available on the
# ubuntu docker image...python -m build works I think because it sets up
# a virtualenv
pushd python
pushd python/geoarrow-c
rm -rf dist
python3 -m build --wheel
pip3 install dist/geoarrow-*.whl
pip3 install dist/geoarrow*.whl
popd

pushd docs
9 changes: 1 addition & 8 deletions docker-compose.yml
@@ -4,14 +4,7 @@ version: '3.5'
services:

docs:
image: ${REPO}:ubuntu-${GEOARROW_ARCH}
build:
context: .
cache_from:
- ${REPO}:ubuntu-${GEOARROW_ARCH}
dockerfile: ci/docker/ubuntu.dockerfile
args:
GEOARROW_ARCH: ${GEOARROW_ARCH}
image: ghcr.io/apache/arrow-nanoarrow:ubuntu
volumes:
- .:/geoarrow-c
command: "/bin/bash /geoarrow-c/ci/scripts/build-docs.sh"
2 changes: 1 addition & 1 deletion docs/source/conf.py
@@ -14,7 +14,7 @@
import sys
import datetime

import geoarrow
import geoarrow.c

sys.path.insert(0, os.path.abspath(".."))

2 changes: 1 addition & 1 deletion docs/source/python/geoarrow.rst
@@ -2,7 +2,7 @@
Core API
========

.. automodule:: geoarrow
.. automodule:: geoarrow.c
:members:

Constants
6 changes: 3 additions & 3 deletions docs/source/python/pyarrow.rst
@@ -2,7 +2,7 @@
Integration with pyarrow
========================

.. automodule:: geoarrow.pyarrow
.. automodule:: geoarrow.c.pyarrow

Array constructors
------------------
@@ -100,8 +100,8 @@ Integration with pyarrow
.. autoclass:: MultiPolygonType
:members:

.. autoclass:: geoarrow.pyarrow._dataset.GeoDataset
.. autoclass:: geoarrow.c.pyarrow._dataset.GeoDataset
:members:

.. autoclass:: geoarrow.pyarrow._dataset.ParquetRowGroupGeoDataset
.. autoclass:: geoarrow.c.pyarrow._dataset.ParquetRowGroupGeoDataset
:members:
File renamed without changes.
4 changes: 2 additions & 2 deletions python/.gitignore → python/geoarrow-c/.gitignore
@@ -16,8 +16,8 @@
# specific language governing permissions and limitations
# under the License.

src/geoarrow/geoarrow
src/geoarrow/_lib.cpp
src/geoarrow/c/geoarrow
src/geoarrow/c/_lib.cpp

# Byte-compiled / optimized / DLL files
__pycache__/
5 changes: 3 additions & 2 deletions python/MANIFEST.in → python/geoarrow-c/MANIFEST.in
@@ -16,5 +16,6 @@
# under the License.

exclude bootstrap.py
include src/geoarrow/geoarrow/*.h
include src/geoarrow/geoarrow/*.hpp
include src/geoarrow/c/**/**/*.h
include src/geoarrow/c/**/*.h
include src/geoarrow/c/**/*.hpp
42 changes: 13 additions & 29 deletions python/README.ipynb → python/geoarrow-c/README.ipynb
@@ -14,7 +14,7 @@
"Python bindings for nanoarrow are not yet available on PyPI. You can install via URL (requires a C++ compiler):\n",
"\n",
"```bash\n",
"python -m pip install \"https://github.com/geoarrow/geoarrow-cpp/archive/refs/heads/main.zip#egg=geoarrow&subdirectory=python\"\n",
"python -m pip install \"https://github.com/geoarrow/geoarrow-c/archive/refs/heads/main.zip#egg=geoarrow-c&subdirectory=python/geoarrow-c\"\n",
"```\n",
"\n",
"If you can import the namespace, you're good to go! The only reasonable interface to geoarrow currently depends on `pyarrow`, which you can import with:"
@@ -26,7 +26,7 @@
"metadata": {},
"outputs": [],
"source": [
"import geoarrow.pyarrow as ga"
"import geoarrow.c.pyarrow as ga"
]
},
{
@@ -48,7 +48,7 @@
"data": {
"text/plain": [
"PointArray:PointType(geoarrow.point)[1]\n",
"<POINT (0 1)>\n"
"<POINT (0 1)>"
]
},
"execution_count": 2,
@@ -88,7 +88,7 @@
"PointArray:PointType(geoarrow.point)[3]\n",
"<POINT (1 3)>\n",
"<POINT (2 4)>\n",
"<POINT (3 5)>\n"
"<POINT (3 5)>"
]
},
"execution_count": 3,
@@ -117,7 +117,7 @@
"PointArray:PointType(interleaved geoarrow.point)[3]\n",
"<POINT (1 2)>\n",
"<POINT (3 4)>\n",
"<POINT (5 6)>\n"
"<POINT (5 6)>"
]
},
"execution_count": 4,
@@ -137,7 +137,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Importing `geoarrow.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository."
"Importing `geoarrow.c.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository."
]
},
{
@@ -189,22 +189,6 @@
"execution_count": 6,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/geopandas/_compat.py:124: UserWarning: The Shapely GEOS version (3.11.1-CAPI-1.17.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.1-CAPI-1.16.0). Conversions between both will be slow.\n",
" warnings.warn(\n",
"/var/folders/gt/l87wjg8s7312zs9s7c1fgs900000gn/T/ipykernel_81348/2107898165.py:1: DeprecationWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas still uses PyGEOS by default. However, starting with version 0.14, the default will switch to Shapely. To force to use Shapely 2.0 now, you can either uninstall PyGEOS or set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:\n",
"\n",
"import os\n",
"os.environ['USE_PYGEOS'] = '0'\n",
"import geopandas\n",
"\n",
"In the next release, GeoPandas will switch to using Shapely by default, even if PyGEOS is installed. If you only have PyGEOS installed to get speed-ups, this switch should be smooth. However, if you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).\n",
" import geopandas\n"
]
},
{
"data": {
"text/plain": [
@@ -216,9 +200,9 @@
"<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>\n",
"...245 values...\n",
"<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>\n",
"<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>"
]
},
@@ -381,9 +365,9 @@
"<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>\n",
"...245 values...\n",
"<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>\n",
"<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>\n",
"<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>\n",
"<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>"
]
},
@@ -444,7 +428,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.9.6"
"version": "3.11.2"
},
"orig_nbformat": 4
},
39 changes: 12 additions & 27 deletions python/README.md → python/geoarrow-c/README.md
@@ -1,20 +1,20 @@
# geoarrow for Python

The geoarrow Python package provides bindings to the geoarrow-c implementation of the [GeoArrow specification](https://github.com/geoarrow/geoarrow). The geoarrow Python bindings provide input/output to/from Arrow-friendly formats (e.g., Parquet, Arrow Stream, Arrow File) and general-purpose coordinate shuffling tools among GeoArrow, WKT, and WKB encodings.
The geoarrow Python package provides bindings to the geoarrow-c implementation of the [GeoArrow specification](https://github.com/geoarrow/geoarrow). The geoarrow Python bindings provide input/output to/from Arrow-friendly formats (e.g., Parquet, Arrow Stream, Arrow File) and general-purpose coordinate shuffling tools among GeoArrow, WKT, and WKB encodings.

## Installation

Python bindings for nanoarrow are not yet available on PyPI. You can install via URL (requires a C++ compiler):

```bash
python -m pip install "https://github.com/geoarrow/geoarrow-cpp/archive/refs/heads/main.zip#egg=geoarrow&subdirectory=python"
python -m pip install "https://github.com/geoarrow/geoarrow-c/archive/refs/heads/main.zip#egg=geoarrow-c&subdirectory=python/geoarrow-c"
```

If you can import the namespace, you're good to go! The only reasonable interface to geoarrow currently depends on `pyarrow`, which you can import with:


```python
import geoarrow.pyarrow as ga
import geoarrow.c.pyarrow as ga
```

## Examples
@@ -34,7 +34,6 @@ ga.as_geoarrow(["POINT (0 1)"])




This will work with:

- An existing array created by geoarrow
@@ -51,7 +50,7 @@ Alternatively, you can construct GeoArrow arrays directly from a series of buffe
import numpy as np

ga.point().from_geobuffers(
None,
None,
np.array([1.0, 2.0, 3.0]),
np.array([3.0, 4.0, 5.0])
)
@@ -68,7 +67,6 @@ ga.point().from_geobuffers(




```python
ga.point().with_coord_type(ga.CoordType.INTERLEAVED).from_geobuffers(
None,
@@ -86,8 +84,7 @@ ga.point().with_coord_type(ga.CoordType.INTERLEAVED).from_geobuffers(




Importing `geoarrow.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository.
Importing `geoarrow.c.pyarrow` will register the geoarrow extension types with pyarrow such that you can read/write Arrow streams, Arrow files, and Parquet that contains Geoarrow extension types. A number of these files are available from the [geoarrow-data](https://github.com/geoarrow/geoarrow-data) repository.


```python
@@ -135,18 +132,6 @@ array = ga.as_geoarrow(df.geometry)
array
```

/Library/Frameworks/Python.framework/Versions/3.9/lib/python3.9/site-packages/geopandas/_compat.py:124: UserWarning: The Shapely GEOS version (3.11.1-CAPI-1.17.1) is incompatible with the GEOS version PyGEOS was compiled with (3.10.1-CAPI-1.16.0). Conversions between both will be slow.
warnings.warn(
/var/folders/gt/l87wjg8s7312zs9s7c1fgs900000gn/T/ipykernel_81348/2107898165.py:1: DeprecationWarning: Shapely 2.0 is installed, but because PyGEOS is also installed, GeoPandas still uses PyGEOS by default. However, starting with version 0.14, the default will switch to Shapely. To force to use Shapely 2.0 now, you can either uninstall PyGEOS or set the environment variable USE_PYGEOS=0. You can do this before starting the Python process, or in your code before importing geopandas:

import os
os.environ['USE_PYGEOS'] = '0'
import geopandas

In the next release, GeoPandas will switch to using Shapely by default, even if PyGEOS is installed. If you only have PyGEOS installed to get speed-ups, this switch should be smooth. However, if you are using PyGEOS directly (calling PyGEOS functions on geometries from GeoPandas), this will then stop working and you are encouraged to migrate from PyGEOS to Shapely 2.0 (https://shapely.readthedocs.io/en/latest/migration_pygeos.html).
import geopandas





@@ -158,9 +143,9 @@ array
<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>
...245 values...
<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>
<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>


@@ -180,7 +165,7 @@ geopandas.GeoSeries.from_wkb(ga.as_wkb(array))
2 MULTILINESTRING ((631355.519 5122892.285, 6313...
3 MULTILINESTRING ((665166.020 5138641.982, 6651...
4 MULTILINESTRING ((673606.020 5162961.982, 6736...
...
...
250 MULTILINESTRING ((681672.620 5078601.582, 6818...
251 MULTILINESTRING ((414867.917 5093040.881, 4147...
252 MULTILINESTRING ((414867.917 5093040.881, 4148...
@@ -280,9 +265,9 @@ geoarrow_array2
<MULTILINESTRING ((673606.0199999996 5162961.9823, 673606.01999999...>
...245 values...
<MULTILINESTRING ((681672.6200000001 5078601.5823, 681866.01999999...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414793.81699999...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414829.71700000...>
<MULTILINESTRING ((414867.9170000004 5093040.8807, 414937.21700000...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414793.8169999...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414829.7170000...>
<MULTILINESTRING ((414867.91700000037 5093040.8807, 414937.2170000...>
<MULTILINESTRING ((648686.0197000001 5099181.984099999, 648866.019...>

