Initial docs #79

Open
wants to merge 7 commits into
base: master
109 changes: 109 additions & 0 deletions docs/building-elwin.md
@@ -0,0 +1,109 @@
# Compiling and Running Elwin from Source

# 1. Overview

Elwin is an executable that performs A/B/n and multivariate
testing. The library that does the majority of the work is called
`choices`. It is written in Go. Elwin contains the server logic and
evaluates experiments based on the requests it receives. Two binaries
make up a basic Elwin deployment: the `elwin` server and a storage
server (currently `bolt-store`).

# 2. Compiling

## 2.1 Go compiler

To compile an Elwin executable you will need a Go compiler. I use
the most up-to-date compiler version, currently go1.8. You can
download Go at [https://golang.org/dl/](https://golang.org/dl/).

## 2.2 Getting source code

The first step is to clone the elwin repo locally. Go requires
files to be placed in certain directories based on your `GOPATH`
environment variable. If you are using the default `GOPATH` in go1.8,
it will be `$HOME/go`. There are two options for getting the elwin
code. One option is to use Go's built-in `get` command. The second
option is to manually create the necessary directory structure and
clone the repo into it.

### 2.2.1 Using `go get`

In your terminal run the following command.

```bash
go get github.com/Nordstrom/choices
```

> This requires `go` and `git` to be installed

You should find the elwin files at
`$GOPATH/src/github.com/Nordstrom/choices` if you have `GOPATH` set or
`$HOME/go/src/github.com/Nordstrom/choices` if you are using the
default `GOPATH` in go1.8.

### 2.2.2 Manual `git clone`

You should check if your `GOPATH` is set.

```bash
echo $GOPATH
# If nothing is displayed run the following commands.
mkdir -p $HOME/go
export GOPATH=$HOME/go
```

> If you set the `GOPATH` it will only be set for this terminal. If
> you close the terminal or start a new session it will need to be set
> again.

Next you need to create the directory structure that go expects to
hold the files. Then you can clone the repo.

```bash
mkdir -p $GOPATH/src/github.com/Nordstrom/
cd $GOPATH/src/github.com/Nordstrom/
git clone https://github.com/Nordstrom/choices
```

## 2.3 Compiling source components

### 2.3.1 Compiling Elwin binary

Assuming you have downloaded the code and set your `GOPATH`, you can
now compile the elwin executable. To compile an executable that can
run on your local machine, run the following.

```bash
cd $GOPATH/src/github.com/Nordstrom/choices/cmd/elwin
go build
```

### 2.3.2 Compiling Storage binary

The most up-to-date storage binary is the bolt-store implementation.
You compile it the same way as elwin.

```bash
cd $GOPATH/src/github.com/Nordstrom/choices/cmd/bolt-store
go build
```

# 3. Running Elwin

> TODO: use `spf13/viper` for configuration

## 3.1 Running Elwin Locally

To run elwin locally you will need to supply some configuration values so the ports don't collide. Start the storage server first.

```bash
cd $GOPATH/src/github.com/Nordstrom/choices/cmd/bolt-store
./bolt-store
```

In a separate terminal, run the following.

```bash
cd $GOPATH/src/github.com/Nordstrom/choices/cmd/elwin
JSON_ADDRESS=:8082 GRPC_ADDRESS=:8083 MONGO_ADDRESS=:8080 ./elwin
```
110 changes: 110 additions & 0 deletions docs/elwin-usage.md
@@ -0,0 +1,110 @@
# Elwin Usage

* [Elwin endpoints](#elwin-endpoints)
* [Elwin request](#elwin-request)
* [Elwin response](#elwin-response)

## Elwin endpoints

Elwin uses two endpoints for experiments: _dev_ and _prod_. The _prod_
endpoint serves tests that are considered live to customers. The
_dev_ endpoint can be used to preview experiments before enabling
them for your customers.

`http://elwin.ttoapps.aws.cloud.nordstrom.net` is our current _prod_
endpoint.

> We are in the process of migrating to a supported Kubernetes
> cluster. When that migration occurs we will update our _prod_
> endpoint to `http://prod.elwin.aws.cloud.nordstrom.net`.

`http://dev.elwin.aws.cloud.nordstrom.net` is our current _dev_
endpoint.

> There are other dev endpoints, but they should point to the same
> service as the one above. They are
> `http://elwin.k8s-a.ttoapps.aws.cloud.nordstrom.net` and
> `http://elwin-test.ttoapps.aws.cloud.nordstrom.net`.

## Elwin request

Elwin's request structure is very simple. In its current state,
clients make a GET request to one of the endpoints. The request
requires two params. One specifies the team making the request. The
second specifies the user's id.

The team name can currently be specified in the query params `team`,
`label`, `teamid`, or `group-id`. Any of those params that is not
blank will be used to filter the experiments.

The user id is specified in the param `userid`. In most cases this
should be the `ExperimentID` from the `experiments` cookie on web
requests.

You can also supply other query params to match labels on your
experiments. For example, if you are running a test that should only
be shown to desktop users, you could set the label `platform` with the
value `desktop` on your experiment. When you query for desktop
experiments you would then include the `platform=desktop` query param.
Similarly, if you wanted to run a test internally before deploying it
to customers, you could add the label `traffic-source=internal` and
query for it the same way. You can query for multiple values of the
same label key by repeating the key in the request, for example
`?env=prod&env=dev`. The query params create an *and* selection on the
labels of your experiments: an experiment must match every label key
in the query. The results returned will be the union of all
experiments whose labels match.

The full request for the ato team's experiments in the dev and prod
environments, for customers browsing on the desktop platform, with
the userid `andrew`, would look like the following.

```
http://dev.elwin.aws.cloud.nordstrom.net/?team=ato&env=prod&env=dev&platform=desktop&userid=andrew
```
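
A client could assemble that URL with Go's standard `net/url` package
instead of concatenating strings. This is a minimal sketch:
`buildRequestURL` is a hypothetical helper, not part of Elwin or
`choices`. Note that `url.Values.Encode` sorts params by key, so the
param order differs from the example above.

```go
package main

import (
	"fmt"
	"net/url"
)

// buildRequestURL is a hypothetical helper that builds an Elwin GET
// URL from a team, a user id, and extra label selectors. Each label
// key may carry several values, which are sent as repeated params.
func buildRequestURL(base, team, userID string, labels map[string][]string) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	q := url.Values{}
	q.Set("team", team)
	q.Set("userid", userID)
	for key, values := range labels {
		for _, v := range values {
			q.Add(key, v)
		}
	}
	u.RawQuery = q.Encode() // escapes values and sorts by key
	return u.String(), nil
}

func main() {
	labels := map[string][]string{
		"env":      {"prod", "dev"},
		"platform": {"desktop"},
	}
	reqURL, err := buildRequestURL("http://dev.elwin.aws.cloud.nordstrom.net/", "ato", "andrew", labels)
	if err != nil {
		panic(err)
	}
	fmt.Println(reqURL)
	// http://dev.elwin.aws.cloud.nordstrom.net/?env=prod&env=dev&platform=desktop&team=ato&userid=andrew
}
```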

## Elwin response

All the experiments a user qualifies for will be returned in the
experiment response. The response is JSON.

```javascript
{
  "experiments": {
    "personalized-header-experiment": {
      "namespace": "aaaaaa",
      "params": {
        "personalized": "default"
      }
    },
    "button-experiment": {
      "namespace": "bbbbbb",
      "params": {
        "button-color": "blue",
        "button-size": "large"
      }
    }
  }
}
```

> In this example there are two experiments:
> `personalized-header-experiment` and `button-experiment`. In
> `personalized-header-experiment` the user was hashed into the
> `default` experience. In `button-experiment` they were hashed into
> the `blue` + `large` MVT experience.

The top level of the object will contain only a single key,
`experiments`. The `experiments` key contains a map of experiment
names to experiment values, represented as an object.

Experiment names will be keys in the `experiments` object. The values
for the experiment names will be objects with two keys. `namespace`,
the first key, contains a string value. It is not essential to the
experiment but is required for the data collection. `params`, the
second key, contains an object of param names and param values.

The `params` object has keys that correspond to param names. The
values of these keys will be the experience the user has been hashed
into. If you are running an A/B/N test there will only be one key. If
you are running a multivariate test then there will be a key for each
arm. The values for params will always be returned as strings.
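
The response shape described above maps onto a pair of Go types. This
is a sketch for client code: the type names `Response` and
`Experiment` and the helper `parseResponse` are assumptions made for
illustration, not types exported by Elwin.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Experiment holds one entry of the "experiments" object.
type Experiment struct {
	Namespace string            `json:"namespace"`
	Params    map[string]string `json:"params"` // values are always strings
}

// Response is the top-level object, keyed by experiment name.
type Response struct {
	Experiments map[string]Experiment `json:"experiments"`
}

// parseResponse decodes a response body into a Response.
func parseResponse(body []byte) (Response, error) {
	var r Response
	err := json.Unmarshal(body, &r)
	return r, err
}

func main() {
	body := []byte(`{
	  "experiments": {
	    "button-experiment": {
	      "namespace": "bbbbbb",
	      "params": {"button-color": "blue", "button-size": "large"}
	    }
	  }
	}`)
	resp, err := parseResponse(body)
	if err != nil {
		panic(err)
	}
	exp, ok := resp.Experiments["button-experiment"]
	if !ok {
		fmt.Println("user is not in button-experiment")
		return
	}
	fmt.Println(exp.Params["button-color"], exp.Params["button-size"]) // blue large
}
```

Checking the `ok` value matters because a user only receives the
experiments they qualify for, so any given experiment name may be
absent from the map.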
74 changes: 74 additions & 0 deletions docs/something-broke.md
@@ -0,0 +1,74 @@
# Something Broke

Steps for checking what's broken.

## A test I created is not showing up when I query elwin.

1. Check if your labels match the query params

```bash
# Quote the URL so the shell does not interpret `?` and `&`.
curl 'prod.elwin.aws.cloud.nordstrom.net/?userid=andrew&team=blah&env=dev'
```

Your experiment should have labels that match all the query params
except `userid`. In this example, your experiment would need labels
`team` and `env` with the values `blah` and `dev`.

2. Check the test was actually created.

```bash
# dev environment
curl -X POST -d '{"environment": 0}' json-gateway.ato.platform.prod.aws.cloud.nordstrom.net/api/v1/all
# prod environment
curl -X POST -d '{"environment": 1}' json-gateway.ato.platform.prod.aws.cloud.nordstrom.net/api/v1/all
```

If you do not see the experiment in either output then it needs to be
recreated. Use the [form][neo-form] to create an experiment.

If your test is in the wrong environment, you should use
[houston][houston] to launch or delete tests from dev and prod.

3. Check if elwin is failing to update

If you verified the test is in storage and the query is correct then
elwin may not be able to contact the storage server. You can first
check the logs.

```bash
$ kubectl get po -l run=elwin
NAME                         READY     STATUS    RESTARTS   AGE
elwin-3750452436-4vshr       1/1       Running   0          10d
elwin-3750452436-6m78g       1/1       Running   0          11d
elwin-3750452436-7rnmx       1/1       Running   0          10d
elwin-3750452436-ffnq0       1/1       Running   0          11d
elwin-3750452436-gh607       1/1       Running   0          10d
elwin-3750452436-jlkcd       1/1       Running   0          11d
elwin-3750452436-p298j       1/1       Running   0          11d
elwin-3750452436-rqh9l       1/1       Running   0          11d
elwin-3750452436-sp9c0       1/1       Running   0          11d
elwin-3750452436-znccr       1/1       Running   0          11d
elwin-dev-3169672050-0sq6s   1/1       Running   0          10d
elwin-dev-3169672050-3r0w2   1/1       Running   0          14d
elwin-dev-3169672050-f3kcd   1/1       Running   0          11d
elwin-dev-3169672050-gxlqj   1/1       Running   0          10d
elwin-dev-3169672050-lzf50   1/1       Running   0          14d

Select a pod and check its logs.

```bash
kubectl logs --tail=50 elwin-3750452436-4vshr
```

Check for lines like the following.

```bash
2017/03/13 17:14:17 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup elwin-storage: no such host"; Reconnecting to {elwin-storage:80 <nil>}
2017/03/13 17:14:19 grpc: addrConn.resetTransport failed to create client transport: connection error: desc = "transport: dial tcp: lookup elwin-storage: no such host"; Reconnecting to {elwin-storage:80 <nil>}
```

This means gRPC has lost its connection to storage. You can try restarting the pod, but it should restart automatically after `UPDATE_FAIL_TIMEOUT` passes (default is 15m).

[neo-form]: http://127.0.0.1 "neo form"
[houston]: http://houston.ato.platform.prod.aws.cloud.nordstrom.net "Launch Control"