** STEP 1: INSTALL MOST OF THE STUFF **
To run the backend, you will need:
- a Redis 2.8+ server
- a CouchDB server
- a Ruby 1.9 installation (use of rvm is suggested)
- ZeroMQ 4.0.5 (earlier API-compatible versions may work, but they have not been tested)
- Bundler
- ExecJS supported runtime (for the dashboard)
(see https://github.com/sstephenson/execjs)
- Python 3.6+ and websockets 7.0 (for the dashboard WebSocket)
(Little known fact: ArchiveBot is made to be as hard as possible to set
up. If you have trouble with these instructions, drop by in IRC for
help, file an issue, or submit improvements through a pull request. You
can also take a look at the .travis.yml integration test config file.)
Quick install, for Debian and Debian-esque systems like Ubuntu:
sudo apt-get update
sudo apt-get install bundler couchdb git tmux python3
(if you might build ZeroMQ from source, add the next line:)
sudo apt-get install libtool pkg-config build-essential autoconf automake libzmq-dev
git clone https://github.com/ArchiveTeam/ArchiveBot.git
cd ArchiveBot
git submodule update --init
bundle install
pip install websockets==7.0 # Or apt install python3-websockets, or whichever method you prefer, but it must be version 7.0.
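After the quick install, it can help to confirm that the tools actually landed on
your PATH. The following is only a sketch: missing_tools is a made-up helper name
(not part of ArchiveBot), the list of command names is an assumption about a
Debian-like system, and it checks presence only, not versions.

```shell
# Print the name of every command in the argument list that is not on PATH.
# missing_tools is a hypothetical helper, not something ArchiveBot ships.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Example (should print nothing if the quick install worked):
# missing_tools ruby bundle git tmux python3
```

Anything it prints is a command you still need to install before moving on.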
** STEP 2: INSTALL REDIS **
Next, install Redis. You can build it from source or you can attempt to use a package.
If you want to try a package, do:
sudo apt-get install redis-server
If you want to build from source, here's how you can do that, using version 2.8.17
on Debian/Ubuntu as an example:
sudo apt-get install build-essential tcl8.5
wget http://download.redis.io/releases/redis-2.8.17.tar.gz
tar xzf redis-2.8.17.tar.gz
cd redis-2.8.17
make
make test
sudo make install
If you also want to set up Redis as a daemonized (always-running) service on your
Debian/Ubuntu machine on port 6379, follow up with this:
cd utils
sudo ./install_server.sh
(and then hit enter a bunch of times to accept the default values)
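Since the backend needs Redis 2.8 or later, you may want to check what a package
install actually gave you. This is a sketch under two assumptions: that
`redis-server --version` prints a line containing "v=X.Y.Z" (it does on the
versions we know of), and that your `sort` supports -V (GNU coreutils). The
function names are invented for illustration.

```shell
# parse_redis_version: read `redis-server --version` output on stdin and
# print just the version number, e.g. "2.8.17".
parse_redis_version() {
    sed -n 's/.*v=\([0-9][0-9.]*\).*/\1/p'
}

# version_at_least HAVE WANT: succeed if version HAVE is >= version WANT.
# Assumes a `sort` that understands -V (version sort).
version_at_least() {
    lowest=$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)
    [ "$lowest" = "$2" ]
}

# Example:
# have=$(redis-server --version | parse_redis_version)
# version_at_least "$have" 2.8 && echo "Redis $have is new enough"
```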
** STEP 3: CONFIGURE COUCHDB **
Next we need to configure CouchDB. But first, check to make sure it installed
correctly (which should have been done back in step 1) and that it is currently
running on your machine, by typing this:
curl http://127.0.0.1:5984/
If CouchDB is indeed running, you should get back a message that looks something
like this:
{"couchdb":"Welcome","uuid":"610e43c2778c3be750ad5fff8cadd108","version":"1.5.0",
"vendor":{"version":"14.04","name":"Ubuntu"}}
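If you would rather script this check than eyeball the output, something like the
following works. It is a minimal sketch: couchdb_is_up is a hypothetical helper
name, and it only looks for the "Welcome" marker shown in the response above.

```shell
# couchdb_is_up URL: succeed if URL answers with CouchDB's welcome document.
# Hypothetical helper; it only greps for the "couchdb":"Welcome" marker.
couchdb_is_up() {
    curl -s "$1" | grep -q '"couchdb":"Welcome"'
}

# Example:
# couchdb_is_up http://127.0.0.1:5984/ && echo "CouchDB is running"
```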
Now we need to load up CouchDB with the "archivebot" and "archivebot_logs" databases.
You can do this from the command line:
curl -X PUT http://127.0.0.1:5984/archivebot
curl -X PUT http://127.0.0.1:5984/archivebot_logs
If that works, you should get this back as a response each time:
{"ok":true}
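If you rerun the setup, the PUT fails on a database that already exists (CouchDB
answers HTTP 412 in that case). The sketch below tolerates that. create_db is a
hypothetical helper name, and treating 201/202 as "created" and 412 as "already
there" is our reading of CouchDB's HTTP API, so double-check it against your version.

```shell
# create_db BASE_URL NAME: create a CouchDB database, tolerating the case
# where it already exists. Hypothetical helper, not part of ArchiveBot.
create_db() {
    status=$(curl -s -o /dev/null -w '%{http_code}' -X PUT "$1/$2")
    case "$status" in
        201|202) echo "created $2" ;;
        412)     echo "$2 already exists" ;;
        *)       echo "failed to create $2 (HTTP $status)" >&2; return 1 ;;
    esac
}

# Example:
# for db in archivebot archivebot_logs; do
#     create_db http://127.0.0.1:5984 "$db"
# done
```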
Now, go to the db/design_docs folder in ArchiveBot:
cd db/design_docs
(You might have installed it somewhere like /home/archivebot/ArchiveBot/db/design_docs .)
The four design documents in there need to be uploaded to the new archivebot database
you just created. You can use curl, or you can use the Futon web interface at
http://localhost:5984/_utils/index.html, where you can copy and paste the content
of the JSON files into new documents manually. To use curl, first strip the _rev
lines (a brand-new document must not carry a revision), then upload each file:
grep -v _rev archive_urls.json > /tmp/archive_urls.json
grep -v _rev ignore_patterns.json > /tmp/ignore_patterns.json
grep -v _rev jobs.json > /tmp/jobs.json
grep -v _rev user_agents.json > /tmp/user_agents.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/archive_urls -d @/tmp/archive_urls.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/ignore_patterns -d @/tmp/ignore_patterns.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/jobs -d @/tmp/jobs.json
curl -X PUT http://127.0.0.1:5984/archivebot/_design/user_agents -d @/tmp/user_agents.json
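The eight commands above follow one pattern per design document, so they can be
folded into a loop. This is a sketch, not the project's own script:
upload_design_docs is a made-up name, it uses mktemp instead of fixed /tmp paths,
and the explicit Content-Type header is our addition (harmless, and some CouchDB
versions are pickier about it than others).

```shell
# upload_design_docs DIR DB_URL: for each of the four design documents in
# DIR, drop the _rev lines and PUT the result to DB_URL/_design/<name>.
# Hypothetical helper; the Content-Type header is an addition of ours.
upload_design_docs() {
    for name in archive_urls ignore_patterns jobs user_agents; do
        stripped=$(mktemp) || return 1
        grep -v _rev "$1/$name.json" > "$stripped" &&
            curl -X PUT "$2/_design/$name" \
                -H 'Content-Type: application/json' \
                -d "@$stripped"
        rc=$?
        rm -f "$stripped"
        [ "$rc" -eq 0 ] || return "$rc"
    done
}

# Example, run from db/design_docs:
# upload_design_docs . http://127.0.0.1:5984/archivebot
```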
** STEP 4: SET UP THE IRC SERVER **
Next, you're going to need to install an IRC server (until such time as the
ArchiveBot code is changed to allow for alternate ways of sending it instructions,
such as Twitter). On Debian/Ubuntu, do this:
sudo apt-get install ircd-hybrid
sudo /etc/init.d/ircd-hybrid restart
If you need to edit the config file, open it like this:
sudo pico /etc/ircd-hybrid/ircd.conf
If you don't have a command-line IRC client, and you want one for ease of use,
you can optionally install irssi:
sudo apt-get install irssi
Once that's all in place, run the following:
redis-server
(unless it's already running -- and make sure that it does not have a password)
cd /home/archivebot/ArchiveBot/bot
bundle exec ruby bot.rb \
-s 'irc://127.0.0.1:6667' \
-r 'redis://127.0.0.1:6379/0' \
-c '#archivebot' -n 'MyArchiveBot'
This tells the bot to connect to the IRC server that you just set up, join the
#archivebot channel under the nick 'MyArchiveBot', and use the local Redis server.
Congrats, you now have a bouncing baby bot!
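If the bot refuses to start, one thing worth ruling out is the Redis password
requirement mentioned above. The sketch below classifies the reply to a PING.
check_redis_reply is a hypothetical helper, and the "NOAUTH" text is what
password-protected Redis 2.8+ servers are assumed to answer; other versions may
word it differently.

```shell
# check_redis_reply REPLY: interpret the text redis-cli printed for PING.
# Hypothetical helper; the NOAUTH wording is an assumption about Redis 2.8+.
check_redis_reply() {
    case "$1" in
        PONG)
            return 0 ;;
        *NOAUTH*)
            echo "Redis wants a password; the bot expects none" >&2
            return 1 ;;
        *)
            echo "unexpected reply from Redis: $1" >&2
            return 1 ;;
    esac
}

# Example:
# check_redis_reply "$(redis-cli -h 127.0.0.1 -p 6379 ping 2>&1)"
```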
** STEP 5: SET UP THE WEB DASHBOARD **
You can run the dashboard webapp on the same machine, or a different machine, or skip it
altogether. It's up to you. If you want to run it, then from the root of ArchiveBot's
repository (which is usually /home/archivebot/ArchiveBot/), run:
cd /home/archivebot/ArchiveBot/
export REDIS_URL=redis://127.0.0.1:6379/0
export UPDATES_CHANNEL=updates
export FIREHOSE_SOCKET_URL=tcp://127.0.0.1:12345
plumbing/updates-listener | plumbing/log-firehose
In another terminal, run:
bundle exec ruby dashboard/app.rb -u http://127.0.0.1:8080
(replace 127.0.0.1 with your web dashboard host's IP address, if needed)
For the WebSocket, in another terminal, run:
export FIREHOSE_SOCKET_URL=tcp://127.0.0.1:12345
plumbing/firehose-client | python3 dashboard/websocket.py
websocket.py will print debugging info if there is an environment variable WSDEBUG=1.
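Because the dashboard pieces run in separate terminals, it is easy to forget an
export in one of them. One way around that is a small file you source in each
terminal. This is only a sketch: archivebot_env is a made-up name, and the
defaults are simply the example values used above.

```shell
# archivebot_env: export the settings the dashboard plumbing reads, keeping
# any value that is already set. Hypothetical helper; the defaults are the
# example values from this document, so adjust them for your host.
archivebot_env() {
    export REDIS_URL="${REDIS_URL:-redis://127.0.0.1:6379/0}"
    export UPDATES_CHANNEL="${UPDATES_CHANNEL:-updates}"
    export FIREHOSE_SOCKET_URL="${FIREHOSE_SOCKET_URL:-tcp://127.0.0.1:12345}"
}

# Example, in each terminal:
# archivebot_env
# plumbing/firehose-client | WSDEBUG=1 python3 dashboard/websocket.py
```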
** STEP 6: LOGS AND MAINTENANCE STUFF **
The last part of ArchiveBot is a set of maintenance tasks. They are currently
split between the cogs and plumbing directories; eventually, they will all
move to plumbing.
In cogs:
1. Configure twitter_conf.json if you want to post to Twitter.
2. Run the cogs with bundle exec ruby cogs/start.rb.
In plumbing:
1. bundle install (yes, again -- the plumbing currently has its own Gemfile)
2. export REDIS_URL=redis://127.0.0.1:6379/0
3. export UPDATES_CHANNEL=updates
4. In separate terminals, tmux panes, or screen sessions, run
a. ./analyzer
b. ./trimmer > /dev/null
c. COUCHDB_URL=http://127.0.0.1:5984/db-name ./recorder
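If you go the tmux route, the three plumbing tasks can be started in one detached
session. Treat this as a sketch under assumptions: start_plumbing and the default
session name are invented here, it must be run from the plumbing directory with
REDIS_URL and UPDATES_CHANNEL already exported, and it will misbehave if any of
those values contain spaces. You still have to supply the real CouchDB URL.

```shell
# start_plumbing COUCHDB_URL [SESSION]: run analyzer, trimmer, and recorder
# in one detached tmux session, one window each. Hypothetical helper.
# The environment is passed on each command line because an already-running
# tmux server does not necessarily inherit your current exports.
start_plumbing() {
    couchdb_url=$1
    session=${2:-archivebot-plumbing}   # made-up default session name
    env_prefix="REDIS_URL=$REDIS_URL UPDATES_CHANNEL=$UPDATES_CHANNEL"
    tmux new-session -d -s "$session" -n analyzer "$env_prefix ./analyzer" &&
    tmux new-window -t "$session:" -n trimmer "$env_prefix ./trimmer > /dev/null" &&
    tmux new-window -t "$session:" -n recorder \
        "$env_prefix COUCHDB_URL=$couchdb_url ./recorder"
}

# Example:
# start_plumbing http://127.0.0.1:5984/db-name
# tmux attach -t archivebot-plumbing
```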
The trimmer prints all the data it trims to standard output in the form
IDENT JSON
IDENT JSON
...
For the EFNet ArchiveBot, we redirect it to /dev/null because we currently
don't do anything with that data.
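If you do want to look at that data, each line can be split into its two parts
with a plain `read`. The sketch below assumes the ident contains no whitespace and
is separated from the JSON by a space, which is how we read the format above.
trimmer_summary is a made-up name.

```shell
# trimmer_summary: read "IDENT JSON" lines on stdin and print, for each one,
# the ident and the length of its JSON payload, separated by a tab.
# Hypothetical helper; assumes the ident has no whitespace in it.
trimmer_summary() {
    while read -r ident json; do
        printf '%s\t%s\n' "$ident" "${#json}"
    done
}

# Example:
# ./trimmer | trimmer_summary
```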
To upgrade, run `git pull` and restart all programs.
bot.rb, dashboard/app.rb, and cogs/start.rb accept a --help option. Run
them with --help to see accepted options.