#139 Added the doc for SaaS and PathLike
ahsimb committed Jul 4, 2024
1 parent 3e60237 commit bcb5695
Showing 12 changed files with 225 additions and 11 deletions.
18 changes: 18 additions & 0 deletions doc/api.rst
@@ -16,6 +16,24 @@ exasol.bucketfs.Bucket
   :undoc-members:
   :show-inheritance:

exasol.bucketfs.SaaSBucket
--------------------------
.. autoclass:: exasol.bucketfs.SaaSBucket
   :members:
   :undoc-members:
   :show-inheritance:

exasol.bucketfs.path.PathLike
-----------------------------
.. autoclass:: exasol.bucketfs.path.PathLike
   :members:
   :undoc-members:
   :show-inheritance:

exasol.bucketfs.path.build_path
-------------------------------
.. autofunction:: exasol.bucketfs.path.build_path

exasol.bucketfs.as_bytes
------------------------
.. autofunction:: exasol.bucketfs.as_bytes
1 change: 1 addition & 0 deletions doc/changes/unreleased.md
@@ -11,3 +11,4 @@ The current release adds a dependency to plugin `pytest_exasol_saas` and replace
## Documentation

* #144: Added comment on using fixtures from pytest-plugin `pytest-exasol-saas`
* #147: Added documentation for the SaaS and the PathLike interface.
16 changes: 16 additions & 0 deletions doc/examples/bucket_saas.py
@@ -0,0 +1,16 @@
"""
This tutorial is relevant for the Exasol SaaS database.
It demonstrates the creation of a bucket object for a SaaS database.
"""
import os

from exasol.bucketfs import SaaSBucket

# Let's assume that the required SaaS connection parameters
# are stored in environment variables.
bucket = SaaSBucket(
    url=os.environ.get('SAAS_URL'),
    account_id=os.environ.get('SAAS_ACCOUNT_ID'),
    database_id=os.environ.get('SAAS_DATABASE_ID'),
    pat=os.environ.get('SAAS_PAT'),
)
3 changes: 3 additions & 0 deletions doc/examples/delete.py
@@ -1,3 +1,6 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""
from exasol.bucketfs import Service

URL = "http://localhost:6666"
3 changes: 3 additions & 0 deletions doc/examples/download.py
@@ -1,3 +1,6 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""
from exasol.bucketfs import (
    Service,
    as_bytes,
3 changes: 3 additions & 0 deletions doc/examples/list.py
@@ -1,3 +1,6 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""
from exasol.bucketfs import Service

URL = "http://localhost:6666"
138 changes: 138 additions & 0 deletions doc/examples/path_like.py
@@ -0,0 +1,138 @@
"""
We will demonstrate the usage of the PathLike interface with an example of handling
customer reviews.
"""
from typing import ByteString
import tempfile
import os

import exasol.bucketfs as bfs

# First, we need to get a path in the BucketFS where we will store reviews.
# We will use the build_path() function for that. This function takes different
# input parameters depending on the backend in use. We will set the type of
# backend in the variable below. Please change it to bfs.path.StorageBackend.saas
# if needed.
backend = bfs.path.StorageBackend.onprem

if backend == bfs.path.StorageBackend.onprem:
    # The parameters below are the default BucketFS parameters of the Docker-DB
    # running on a local machine. Please change them according to the settings of the
    # On-Prem database being used. For better security, consider storing the password
    # in an environment variable.
    reviews = bfs.path.build_path(
        backend=backend,
        url="http://localhost:6666",
        bucket_name='default',
        service_name='bfsdefault',
        path='reviews',
        username='w',
        password='write',
        verify=False
    )
elif backend == bfs.path.StorageBackend.saas:
    # In case of a SaaS database we will assume that the required SaaS connection
    # parameters are stored in environment variables.
    reviews = bfs.path.build_path(
        backend=backend,
        url=os.environ.get('SAAS_URL'),
        account_id=os.environ.get('SAAS_ACCOUNT_ID'),
        database_id=os.environ.get('SAAS_DATABASE_ID'),
        pat=os.environ.get('SAAS_PAT'),
        path='reviews',
    )
else:
    raise RuntimeError(f'Unknown backend {backend}')

# Let's create a path for good reviews and write some reviews there,
# each into a separate file.
good_reviews = reviews / 'good'

john_h_review = good_reviews / 'John-H.review'
john_h_review.write(
    b'I had an amazing experience with this company! '
    b'The customer service was top-notch, and the product exceeded my expectations. '
    b'I highly recommend them to anyone looking for quality products and excellent service.'
)

sarah_l_review = good_reviews / 'Sarah-L.review'
sarah_l_review.write(
    b'I am a repeat customer of this business, and they never disappoint. '
    b'The team is always friendly and helpful, and their products are outstanding. '
    b'I have recommended them to all my friends and family, and I will continue to do so!'
)

david_w_review = good_reviews / 'David-W.review'
david_w_review.write(
    b'After trying several other companies, I finally found the perfect fit with this one. '
    b'Their attention to detail and commitment to customer satisfaction is unparalleled. '
    b'I will definitely be using their services again in the future.'
)

# Now let's write some bad reviews into a different subdirectory.
bad_reviews = reviews / 'bad'

# Previously we provided content as a ByteString. But we can also use a file object,
# as shown here.
with tempfile.TemporaryFile() as file_obj:
    file_obj.write(
        b'I first began coming here because of their amazing reviews. '
        b'Unfortunately, my experiences have been overwhelmingly negative. '
        b'I was billed more than 2,600 euros, the vast majority of which '
        b'I did not consent to and were never carried out.'
    )
    file_obj.seek(0)
    mike_s_review = bad_reviews / 'Mike-S.review'
    mike_s_review.write(file_obj)


# A PathLike object supports an interface similar to that of pathlib.PurePosixPath.
for path_obj in [reviews, good_reviews, john_h_review]:
    print(path_obj)
    print('\tname:', path_obj.name)
    print('\tsuffix:', path_obj.suffix)
    print('\tparent:', path_obj.parent)
    print('\texists:', path_obj.exists())
    print('\tis_dir:', path_obj.is_dir())
    print('\tis_file:', path_obj.is_file())

# The as_udf_path() function returns the corresponding path, as seen from a UDF.
print("A UDF can find John's review at", john_h_review.as_udf_path())


# The read() method returns an iterator over chunks of content.
# The function below reads the whole content of the specified file.
def read_content(bfs_path: bfs.path.PathLike) -> ByteString:
    return b''.join(bfs_path.read())


# Like the pathlib.Path class, the BucketFS PathLike object provides methods
# to iterate over the content of a directory.
# Let's use the iterdir() method to print all good reviews.
for item in good_reviews.iterdir():
    if item.is_file():
        print(item.name, 'said:')
        print(read_content(item))


# The walk method allows traversing subdirectories.
# Let's use this method to create a list of all review paths.
all_reviews = [node / file for node, _, files in reviews.walk() for file in files]
for review in all_reviews:
    print(review)


# A file can be deleted using the rm() method. Please note that once the file is
# deleted, it won't be possible to write another file to the same path for a certain
# period of time, due to an internal inter-node synchronisation procedure.
mike_s_review.rm()

# A directory can be deleted using the rmdir() method. If it is not empty we need
# to use the recursive=True option to delete the directory with all its content.
good_reviews.rmdir(recursive=True)

# Now all reviews should be deleted.
print('Are any reviews left?', reviews.exists())

# In BucketFS a directory doesn't exist as a physical object. Therefore, the
# exists() function called on a path for an empty directory returns False.
14 changes: 9 additions & 5 deletions doc/examples/quickstart.py
@@ -1,3 +1,7 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""

from exasol.bucketfs import (
    Service,
    as_bytes,
@@ -11,17 +15,17 @@
# 0. List buckets
buckets = [bucket for bucket in bucketfs]

# 2. Get a bucket
# 1. Get a bucket
default_bucket = bucketfs["default"]

# 3. List files in bucket
# 2. List files in bucket
files = [file for file in default_bucket]

# 4. Upload a file to the bucket
# 3. Upload a file to the bucket
default_bucket.upload("MyFile.txt", b"File content")

# 5. Download a file/content
# 4. Download a file/content
data = as_bytes(default_bucket.download("MyFile.txt"))

# 6. Delete a file from a bucket
# 5. Delete a file from a bucket
default_bucket.delete("MyFile.txt")
4 changes: 4 additions & 0 deletions doc/examples/service.py
@@ -1,3 +1,7 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""

# List buckets
from exasol.bucketfs import Service

3 changes: 3 additions & 0 deletions doc/examples/upload.py
@@ -1,3 +1,6 @@
"""
This tutorial is relevant for the On-Prem Exasol database.
"""
import io

from exasol.bucketfs import Service
29 changes: 25 additions & 4 deletions doc/user_guide/basics.rst
@@ -3,8 +3,8 @@ Basic's

The Bucketfs Service
--------------------
A single bucketfs service can host multiple buckets. In order to interact with a bucketfs service one
can use the :ref:`exasol.bucketfs.Service <api:exasol.bucketfs.Service>` class.
In the On-Prem database, a single bucketfs service can host multiple buckets. In order to interact with a
bucketfs service one can use the :ref:`exasol.bucketfs.Service <api:exasol.bucketfs.Service>` class.

List buckets
++++++++++++
@@ -25,8 +25,14 @@ Get a Bucket reference
Bucket class
-------------
A Bucket contains a set of files which may be restricted, depending on the credentials of the requester.
Using :ref:`exasol.bucketfs.Bucket <api:exasol.bucketfs.Bucket>` class the user can interact (download, upload, list and delete) files.
with the files in the bucket.
The Bucket class for an On-Prem database is :ref:`exasol.bucketfs.Bucket <api:exasol.bucketfs.Bucket>`.
The corresponding class for a SaaS database is :ref:`exasol.bucketfs.SaaSBucket <api:exasol.bucketfs.SaaSBucket>`.
Using these classes, the user can interact with the files in the bucket (download, upload, list and delete them).

Most of the examples below are based on the On-Prem implementation of BucketFS. In the SaaS implementation
there is only one BucketFS service, which provides a single bucket. To access BucketFS in SaaS, the Bucket
object should be created directly, as demonstrated in the last example. The interface of the Bucket
object for the SaaS database is identical to that of the On-Prem database.

List files in a Bucket
++++++++++++++++++++++
@@ -73,6 +79,21 @@ Delete files from Bucket
   :language: python3
   :end-before: # Expert/Mapped bucket API

Create bucket object in SaaS
++++++++++++++++++++++++++++

.. literalinclude:: /examples/bucket_saas.py
   :language: python3
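
Note that ``os.environ.get`` returns ``None`` for a variable that is not set, so a missing parameter would only surface later, when the bucket is first used. A minimal sketch of a fail-early check follows. It assumes the four environment variable names used in the example above; the helper ``read_saas_params`` is hypothetical and not part of ``exasol.bucketfs``.

```python
import os
from typing import Dict, Mapping

# Environment variable names taken from the example above. The keys are the
# keyword arguments that the example passes to SaaSBucket.
SAAS_ENV_VARS = {
    "url": "SAAS_URL",
    "account_id": "SAAS_ACCOUNT_ID",
    "database_id": "SAAS_DATABASE_ID",
    "pat": "SAAS_PAT",
}


def read_saas_params(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Hypothetical helper: collect the SaaS parameters or fail early."""
    # Treat unset and empty variables alike as missing.
    missing = [var for var in SAAS_ENV_VARS.values() if not env.get(var)]
    if missing:
        raise RuntimeError(f"Missing environment variables: {', '.join(missing)}")
    return {arg: env[var] for arg, var in SAAS_ENV_VARS.items()}
```

Under these assumptions, the result could be passed on as ``SaaSBucket(**read_saas_params())``.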

PathLike interface
------------------
PathLike is an interface similar to ``pathlib.Path`` and should feel familiar to most Python users.
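
For comparison, these are the standard-library operations that the PathLike interface resembles. The sketch uses only ``pathlib.PurePosixPath`` and does not touch BucketFS; the path names are chosen for illustration.

```python
from pathlib import PurePosixPath

# Paths are composed with the "/" operator.
reviews = PurePosixPath("reviews")
john = reviews / "good" / "John-H.review"

print(john.name)    # John-H.review
print(john.suffix)  # .review
print(john.parent)  # reviews/good
```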

Using the PathLike interface
++++++++++++++++++++++++++++

.. literalinclude:: /examples/path_like.py
   :language: python3
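
The example ends with the remark that a directory does not exist as a physical object in BucketFS. One way to picture this is a flat set of file names in which a directory is implied by a common prefix. The sketch below is a toy model under that assumption, not the library's implementation; ``dir_exists`` is a hypothetical function.

```python
from typing import Set


def dir_exists(files: Set[str], directory: str) -> bool:
    """Toy model: a directory exists only while some file lives under it."""
    prefix = directory.rstrip("/") + "/"
    return any(name.startswith(prefix) for name in files)


files = {"reviews/good/John-H.review", "reviews/bad/Mike-S.review"}
print(dir_exists(files, "reviews/good"))  # True

# Removing the last file under a prefix makes the directory disappear.
files.discard("reviews/good/John-H.review")
print(dir_exists(files, "reviews/good"))  # False
```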

Configure logging
+++++++++++++++++
4 changes: 2 additions & 2 deletions doc/user_guide/user_guide.rst
@@ -6,10 +6,10 @@ Bucketfs
Depending on the database configuration, the bucketfs setup can range from straightforward to fairly complex.
This is due to the fact that:

* Each database can have one or more BucketFS services
* Each database can have one or more BucketFS services (in the On-Prem database)
* Each BucketFS service is available on all worker clusters of a database
* Each BucketFS service runs on all data nodes of a database
* Each BucketFS service can have one or more Buckets
* Each BucketFS service can have one or more Buckets (in the On-Prem database)
* Each Bucket can hold one or more files

The overview below tries to illustrate this in a more tangible manner.