Skip to content

Latest commit

 

History

History
189 lines (147 loc) · 5.9 KB

README.md

File metadata and controls

189 lines (147 loc) · 5.9 KB

nginx-fileserver

Docker for file servers to be used in conjuncation with CouchDB in the nEDM experiment.

Information

This docker is based off of nginx/latest and includes a reverse proxy in front of the nEDM CouchDB server.

Most requests are simply forwarded to the CouchDB backend and the nginx server listens on both the 80 and 5984 ports (serving up the same content on both).

The main functionality of this server is also to provide an additional file-server which is exposed on the _attachments URI path. This allows uploading 'attachments' associated with documents in the DB, but storing them externally. This is particularly useful when storing large files which do not perform well stored directly in a CouchDB database.

Authenticaton is handled by the file server backend by forwarding requests back to the CouchDB server and checking for the correct priveleges. The file server supports the following syntax:

/_attachments/db_name/doc_id/attachment

and the HTTP verbs: GET, PUT, DELETE

For GET, the user requires read access to the CouchDB document. For PUT and DELETE, the user requires write access to the CouchDB document.

Setup options

An example command (to run as a daemon, swap -it with -d):

docker run -it -p 5984:5984 -p 80:80\
  -v /volume1/Measurements:/database_attachments\
  -e "NGX_VIRTUAL_SERVER_2=db.nedm1@/nedm_head/_design/nedm_head/_rewrite"\
  -e "NGX_VIRTUAL_SERVER_1=raid.nedm1@"\
  --name nEDM-FileServer --link nEDM-CouchDB:db registry.hub.docker.com/mgmarino/fileserver-docker:latest

The particular steps are explained in the following:

  1. Mount attachments directory (R/W) to /database_attachments

  2. Pass environment variables to the container to define virtual servers (multiple may be passed, the must simply have a different _my_suffix):

    -e 'NGX_VIRTUAL_SERVER[_my_suffix]=db.name.org@/path/to/rewrite'

Note that the path should never end with a slash.

  1. (optional) Pass environment variables to set readonly status:

    -e 'READ_ONLY_FILESERVER=yes'
  2. (experimental) Pass environment variables to set CHMOD of written files:

    -e 'FILE_MODE_FILESERVER=0664'
  3. (optional) Mount supervisor log directory (R/W) to view the output of the file-server backend to /var/log/supervisor

  4. Link to the nEDM couchdb container, here the db name must be kept! (--line name_of_container:db)

  5. Export port(s) 80 and/or 5984.

Note, that it is important that no ports are forwarded in the CouchDB container to the external world. This is because the nginx container sits in front of the CouchDB instance and routes the commands to it.

API use examples

The following are some examples as to how one can use the API. pynedm provides functions to do all of this for you, so use it before rolling your own code, if possible.

Upload

Using pycurl and cloudant (unfortunately, requests doesn't handle 100 Expect: continue at all).

import pycurl
from StringIO import StringIO
import json
from clint.textui.progress import Bar as ProgressBar

def upload_file_with_curl(file_name, post_to_path, cookies=None):
    """
    file_name : full path to file
    post_to_path : of the form "server/{db}/{doc_id}/{attachment_name}
    cookies : Any cookies (as string) to send along with the request
    """
    total_size = os.path.getsize(file_name)
    bar = ProgressBar(expected_size=total_size, filled_char='=')

    class FileReader:
        def __init__(self, fp):
            self.fp = fp
            self.total_read = 0
        def read_callback(self, size):
            x = self.fp.read(size)
            if x is not None:
                self.total_read += len(x)
                bar.show(self.total_read)
            return x

    c = pycurl.Curl()
    storage = StringIO()
    c.setopt(pycurl.URL, post_to_path)
    c.setopt(pycurl.PUT, 1)
    c.setopt(pycurl.READFUNCTION, FileReader(open(file_name, 'rb')).read_callback)
    c.setopt(pycurl.INFILESIZE, total_size)
    c.setopt(c.WRITEFUNCTION, storage.write)
    if cookies is not None:
        print "setting, ", cookies
        c.setopt(c.COOKIE, cookies)
    c.perform()
    c.close()
    content = storage.getvalue()
    try:
        return json.loads(content)
    except:
        return content

# To login, we use cloudant
_server = "http://127.0.0.1"
acct = cloudant.Account(_server)
acct.login(un, password) # give credentials
cookies = '; '.join(['='.join(x) for x in acct._session.cookies.items()])
submit = {
  "db" : "db_name",
  "id" : "doc_id",
  "att_name" : os.path.basename(file_name)
}
o = upload_file_with_curl(file_name, _server + '/_attachments/{db}/{id}/{att_name}'.format(**submit), cookies)

# On success, it should return a json object with information of all
# attachments currently associated with this document
print json.dumps(o, indent=4)

Download

This is notably simpler, and this time we may use requests:

...
acct = cloudant.Account(_server)
...
def download_file(file_name, get_from_path):
    r = acct.get(get_from_path, stream=True)
    total_size = int(r.headers['content-length'])
    bar = ProgressBar(expected_size=total_size, filled_char='=')
    with open(file_name, 'wb') as f:
        total = 0
        for chunk in r.iter_content(chunk_size=100*1024):
            if chunk: # filter out keep-alive new chunks
                total += len(chunk)
                bar.show(total)
                f.write(chunk)
                f.flush()

download_file(file_name, _server + '/_attachments/{db}/{id}/{att_name}'.format(**submit))

Delete

...
acct = cloudant.Account(_server)
...
o = acct.delete(_server + '/_attachments/{db}/{id}/{att_name}'.format(**submit)).json()
print json.dumps(o, indent=4)

Head

...
acct = cloudant.Account(_server)
...
o = acct.head(_server + '/_attachments/{db}/{id}/{att_name}'.format(**submit))