Merge pull request #164 from raphaelrpl/patch-opendata-format
📚 Review docs and setup (close #163)
raphaelrpl authored Sep 23, 2022
2 parents 951449b + bb4f8ee commit 27de9ee
Showing 3 changed files with 120 additions and 68 deletions.
123 changes: 97 additions & 26 deletions DEPLOY.rst
@@ -6,57 +6,128 @@
under the terms of the MIT License; see LICENSE file for more details.


Deploy
======

This section explains how to get ``Cube-Builder-AWS`` up and running on `Amazon Web Services <https://aws.amazon.com/>`_.
If you have not yet read the :doc:`installation` guide, take a look at that tutorial on how to install the package in
development mode and become familiar with the Python module.

Create infrastructure
---------------------

.. warning::

    Make sure to identify in which region the dataset is available.
    For example, most `GEO Earth datasets <https://aws.amazon.com/earth/>`_, like ``Sentinel-2`` and ``Landsat-8``, are
    stored in ``Oregon`` (``us-west-2``). In this tutorial, we are going to use ``us-west-2``.

    If you generate data cubes in a region different from the one hosting the BDC services, you may face high charges in the billing.



.. _requirements:
Requirements
------------

- `RDS PostgreSQL <https://aws.amazon.com/rds/postgresql/>`_: A minimal instance of a PostgreSQL database with PostGIS support.
  The ``instance_type`` depends essentially on how many parallel ``Lambdas`` are running. For this example,
  we can use the minimal instance ``db.t2.micro``. For the Brazilian territory, consider a more robust instance like ``db.t2.large``,
  which supports around ``600`` concurrent connections.

  After the instance is up and running, you must initialize `BDC-Catalog <https://github.com/brazil-data-cube/bdc-catalog>`_.
  Please refer to the ``Compatibility Table`` in :doc:`installation` for supported versions.

- `S3 - Simple Storage Service <https://aws.amazon.com/s3/>`_: A bucket to store the ``Lambda code`` and another bucket for ``data storage``.

- `Kinesis <https://aws.amazon.com/kinesis/>`_: a Kinesis stream used to transfer data cube step metadata between ``Lambdas`` and ``DynamoDB``.
  For this example, a minimal instance supporting ``1000`` records (the default number of parallel Lambda executions) is enough.

- `DynamoDB <https://aws.amazon.com/dynamodb/>`_: a set of DynamoDB tables to store data cube metadata.


Prepare environment
-------------------

The ``Cube-Builder-AWS`` command utilities use a `NodeJS <https://nodejs.org/en/>`_ module named `serverless <https://www.serverless.com/>`_
to deploy the data cube stack on Amazon Web Services.
First you need to install ``NodeJS``. We recommend `nvm <https://github.com/nvm-sh/nvm>`_, which can be installed with a
single command line and supports keeping multiple versions of NodeJS installed. You can install it with the command::

    curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.39.1/install.sh | bash

Set the following entry into ``~/.bashrc``::

    export NVM_DIR="$([ -z "${XDG_CONFIG_HOME-}" ] && printf %s "${HOME}/.nvm" || printf %s "${XDG_CONFIG_HOME}/nvm")"
    [ -s "$NVM_DIR/nvm.sh" ] && \. "$NVM_DIR/nvm.sh"  # This loads nvm


Install ``NodeJS 12+``::

    nvm install 12
    nvm use 12  # Activate the version as current


After that, use the following command to install ``serverless`` and its dependencies::

    npm install -g serverless


The second part is to have `AWS Identity and Access Management (IAM) <https://aws.amazon.com/iam/>`_ credentials with the
right access to deploy the resources listed in the `requirements`_ section.


Prepare the infrastructure
--------------------------

We have prepared a script to get an RDS PostgreSQL instance up and running. Use the following commands::

    cd deploy/step_1/
    sh start.sh


The AWS RDS database takes around 10 minutes to launch. You can monitor the status at
https://console.aws.amazon.com/rds/home.

.. note::

    Make sure you are in region ``us-west-2 (Oregon)``.


Create database structure
-------------------------

Once the RDS database is up and running, we need to create the ``BDC-Catalog`` model, the initial database structure used to
catalog the cubes to be generated::

    cd ../../deploy/step_2/
    sh start.sh


Deploy Lambda service
---------------------

Before proceeding with the ``Cube-Builder`` service, we need to create a ``cube-builder-aws/.env`` file.
We have prepared a minimal example in ``cube-builder-aws/example.env``, and the following variables are available:

- ``PROJECT_NAME``: A name for the given project setup. This name will be used as a ``prefix`` in the Lambda names.
- ``STAGE``: The type of service environment context. Use ``dev`` or ``prod``.
- ``REGION``: AWS region to launch services in.
- ``KEY_ID``: AWS Access Key.
- ``SECRET_KEY``: AWS Secret Access Key.
- ``SQLALCHEMY_DATABASE_URI``: URI for the PostgreSQL instance. It has the following structure: ``postgresql://USER:PASSWD@HOST/DB_NAME``
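A hypothetical ``cube-builder-aws/.env`` might look like the sketch below. Every value is a placeholder invented for illustration, not taken from ``example.env``; replace each one with your own AWS and database settings:

```shell
# Hypothetical .env sketch; all values below are placeholders.
PROJECT_NAME=my-data-cubes
STAGE=dev
REGION=us-west-2
KEY_ID=AKIAXXXXXXXXXXXXXXXX
SECRET_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
SQLALCHEMY_DATABASE_URI=postgresql://postgres:secret@my-rds-host.us-west-2.rds.amazonaws.com/bdc
```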

Once ``cube-builder-aws/.env`` is set, you can run the following script to launch the Lambdas into AWS::

    cd ../../deploy/step_3/
    sh deploy.sh


Get service status
------------------

The deploy script will generate a URI for the Lambda location.
You can access this resource and check that everything is running::

    curl {your-lambda-endpoint}/
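Beyond eyeballing the response body, a small helper can classify the HTTP status code that ``curl`` reports. This is only a sketch: the endpoint URL stays a placeholder, and treating any ``2xx`` response as healthy is an assumption, not a rule defined by ``Cube-Builder-AWS``.

```shell
# Classify an HTTP status code; "any 2xx means healthy" is an assumption.
check_status() {
    case "$1" in
        2??) echo "running" ;;
        *)   echo "unavailable" ;;
    esac
}

# curl -s -o /dev/null -w '%{http_code}' prints only the numeric HTTP status:
# status=$(curl -s -o /dev/null -w '%{http_code}' "{your-lambda-endpoint}/")
# check_status "$status"
check_status 200   # prints: running
```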
Next steps
----------

After the ``Cube-Builder-AWS`` backend is up and running, we recommend you install the `Data Cube Manager GUI <https://github.com/brazil-data-cube/dc-manager>`_.
59 changes: 20 additions & 39 deletions INSTALL.rst
@@ -32,6 +32,23 @@ The ``Cube Builder AWS`` depends essentially on:
- `Rio-cogeo <https://pypi.org/project/rio-cogeo/>`_



Compatibility
+++++++++++++

+------------------+-------------+
| Cube-Builder-AWS | BDC-Catalog |
+==================+=============+
| 0.8.2 | 0.8.2 |
+------------------+-------------+
| 0.8.0 ~ 0.8.1 | 0.8.1 |
+------------------+-------------+
| 0.6.x | 0.8.1 |
+------------------+-------------+
| 0.4.x | 0.8.1 |
+------------------+-------------+
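The compatibility table above can also be encoded as a small lookup, for example in a deployment script that validates versions before installing. The helper below is hypothetical and not part of the repository:

```shell
# Hypothetical helper mirroring the compatibility table above: prints the
# BDC-Catalog version expected by a given Cube-Builder-AWS release.
bdc_catalog_version() {
    case "$1" in
        0.8.2)       echo "0.8.2" ;;
        0.8.0|0.8.1) echo "0.8.1" ;;
        0.6.*|0.4.*) echo "0.8.1" ;;
        *)           echo "unknown" ;;
    esac
}

bdc_catalog_version 0.8.2   # prints: 0.8.2
bdc_catalog_version 0.6.1   # prints: 0.8.1
```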


Clone the software repository
+++++++++++++++++++++++++++++

@@ -51,16 +68,16 @@ Go to the source code folder::

Install in development mode::

    $ pip3 install -e .[docs,tests]


.. note::

If you want to create a new *Python Virtual Environment*, please, follow this instruction:

*1.* Create a new virtual environment linked to Python 3.8::

    python3.8 -m venv venv


**2.** Activate the new environment::
@@ -94,39 +111,3 @@ You can open the above documentation in your favorite browser, as::
firefox docs/sphinx/_build/html/index.html


Prepare environment to deploy
+++++++++++++++++++++++++++++

Prepare your AWS account and HOST to deploy application.


1) in AWS Console
-----------------

- create AWS account

- Login with AWS account created

- create IAM user

- set full permissions (fullAccess) to IAM user created

- generate credentials for the IAM user


2) in your HOST (command line):
-------------------------------

- install *AWS CLI*

- configure credentials
  - e.g.: aws configure --profile *iam-user-name*

- install *Node.js* (global)
- `Download Nodejs <https://nodejs.org/en/download/>`_

- install *serverless*

.. code-block:: shell

    $ npm install -g serverless
6 changes: 3 additions & 3 deletions setup.py
@@ -52,7 +52,7 @@
'bdc-catalog @ git+https://github.com/brazil-data-cube/[email protected]#egg=bdc-catalog',
'Flask>=1.1.1,<2',
'Flask-SQLAlchemy==2.4.1',
'psycopg2-binary>=2.8,<3',
'boto3==1.14.49',
'botocore==1.17.49',
'marshmallow-sqlalchemy==0.25.0',
@@ -62,7 +62,7 @@
'rasterio==1.2.1',
'requests>=2.23.0',
'rio-cogeo==3.0.2',
'shapely>=1.7,<2',
'stac.py==0.9.0.post5',
'cloudpathlib[s3]==0.4.0',
]
@@ -108,4 +108,4 @@
'Topic :: Scientific/Engineering :: GIS',
'Topic :: Software Development :: Libraries :: Python Modules',
],
)
