The documentation for all bots can be found in the Bots documentation. You will need the documentation several times in this lesson, so it is best to keep it open!
All the previously used collector bots use rate limiting: they are active once to fetch data and then idle for the configured time (for example one day). However, it is also possible to use "scheduled" bots. They differ in their behavior:
- They are "disabled" and not started by default (on
intelmqctl start
) - After starting them explicitly with
intelmqctl start my-scheduled-bot
they run once and then stop.
In the runtime configuration there are two switches for all bots in order to achieve this behavior:

- `enabled`: A boolean, which behaves similarly to systemd's enabled and disabled states. This setting could also be called "autostart". The commands `intelmqctl enable my-bot` and `intelmqctl disable my-bot`, respectively, change this setting.
- `run_mode`: Either `continuous` or `scheduled`; the former is the default. Scheduled bots stop directly after running.
Bots with these settings can be started at regular intervals, e.g. by cron or systemd timers:
# start my-scheduled-bot every day at 6:20
20 6 * * * /usr/local/bin/intelmqctl start my-scheduled-bot
To make a bot "scheduled", apply these two settings:
enabled
:false
run_mode
:scheduled
These settings are not "parameters" as the other normal parameters you applied until now. In theruntime.conf
these settings are on the same level asmodule
and others.
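For illustration, a sketch of such a runtime configuration entry (the bot ID and the `parameters` block are examples; depending on your IntelMQ version the file is JSON or YAML, shown here as YAML):

```yaml
shadowserver-file-collector:
  description: Collects Shadowserver reports from the file system
  enabled: false         # on the same level as module, not a parameter
  run_mode: scheduled    # run once when started, then stop
  module: intelmq.bots.collectors.file.collector_file
  parameters:
    path: /opt/intelmq/var/lib/bots/file-collector/
```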
Configure the previously added file collector for shadowserver data as scheduled bot.
Start it manually to check if the bot correctly stops after the run.
Configure cron to run this bot every 5 minutes. Make sure the crontab contains the following line (the provided VM already has it):
PATH=/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games
Observe in the log file that it was running. In case of errors, cron will also send you an email.
Click to see the answer.
- Start the bot manually: `intelmqctl start shadowserver-file-collector` (depending on which ID you gave your bot).
- Run `crontab -e` and add at the end of the file:
  */5 * * * * /usr/bin/intelmqctl start shadowserver-file-collector
- Check the logs: `tail -f /var/log/intelmq/shadowserver-file-collector.log`
The installed PostgreSQL has a user `intelmq` with password `intelmq`; you can connect via IPv4 and IPv6 locally on port 5432 without SSL. Further, when connecting via socket (and with `psql` on the command line), every connection is trusted.
The database `intelmq` contains a table `events` with the same schema as the internal data format of IntelMQ ("IDF", previously "Data Harmonization Ontology", "DHO").
This means that every field available in IntelMQ events is represented as a column in the database.
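For instance, you can list the columns with a standard catalog query (a minimal sketch; it assumes the `events` table described above, and note that the column names contain dots, so they must be double-quoted in SQL):

```sql
-- list the first few columns of the events table;
-- every IntelMQ field is represented as one column
SELECT column_name
FROM information_schema.columns
WHERE table_name = 'events'
LIMIT 10;
```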
Add an output bot for PostgreSQL in parallel to the file output.
Re-run the botnet or any collector and observe that data was sent to PostgreSQL. Instructions on how the data can be fetched are below.
Click to see the answer.
Configuration parameters for the bot:

- `autocommit`: `true` (default)
- `database`: `intelmq`
- `engine`: `postgresql`
- `host`: `localhost`
- `jsondict_as_string`: `true` (default)
- `password`: `intelmq`
- `port`: `5432`
- `sslmode`: `allow` (no TLS available)
- `table`: `events`
- `user`: `intelmq`
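Put together, a sketch of the corresponding runtime configuration entry (the bot ID is an example; verify the module path against your IntelMQ version):

```yaml
postgresql-output:
  description: Writes all events into the events table
  module: intelmq.bots.outputs.sql.output
  parameters:
    engine: postgresql
    host: localhost
    port: 5432
    database: intelmq
    user: intelmq
    password: intelmq
    sslmode: allow
    table: events
```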
Connect to the database on the command line: `psql intelmq intelmq`
As the table has a lot of columns, here is an example SQL query which selects only a few columns:
SELECT "time.source", "feed.name", "classification.taxonomy", "classification.type", "classification.identifier", "source.asn", "source.network", "source.ip", "source.fqdn", "source.reverse_dns", "source.geolocation.cc" FROM events;
To select some statistics, for example event counts per day, feed and classification type:
SELECT extract(day from "time.source") AS day, "feed.name", "classification.type", count(*) AS count FROM events GROUP BY day, "feed.name", "classification.type";
Go to the installed fody interface at `/stats`. You see a query interface which allows you to easily add a lot of "WHERE" clauses without writing actual SQL. In the first two rows you can select on the `time.source` column; the default is the last month. If a value is left empty, it is ignored.
The "Resolution" field and the "View Stats" button can be ignored if you only want to fetch data.
The webinput interface available at `/webinput/` allows interactive insertion of any CSV file into your IntelMQ instance.
It consists of two views/steps: the upload and the preview.
In the upload view you provide the data, either as a file upload or as copy-pasted text. The "Parser Configuration" allows setting how the CSV should be parsed.
The preview shows the CSV data as a table and, to the left, some settings. The table header shows drop-down menus offering auto-completion to assign IntelMQ fields to the columns. The second row contains a simple check-mark for whether the column should be used or not.
On the left you can adjust some settings:
- the default timezone, applied to time columns if the fields do not already have timezone information
- dry run: if true, all classification fields are set to "test". Un-tick the box when submitting data that should be processed.
- classification type: you can choose a value from a fixed list. The resulting taxonomy is shown to you.
- classification identifier: a free text (this field is added by a configuration option).
Hint: "constant fields" can be added in the configuration of the webinput.
The button "Refresh Table" causes the backend to parse the data according to the selected fields. Any not valid fields are shown in red and a summary is shown in the top left corner. If you detect any wrong mappings, you can adjust your settings. All rows containing any non-valid data are considered as "failed" and cannot be processed by IntelMQ.
"Submit" will insert the data into the queue defined in the configuration. The Alert box shows if it works and how many rows have been inserted.
time,address,malware,additional info
# you need to skip this line
2020-01-22T23:12:24+02,10.0.0.1,zeus,very bad!!!
2020-01-23T04:34:46+02,10.0.0.2,smokeloader,no further information available
2020-01-24T15:52:05,10.0.0.3,spybot,"not, my, department!"
2020-01-25T82:12:24+02,10.0.0.4,android.nitmo,huh?
Observe that the valid data is in the PostgreSQL database.
Click to see the answer.
The necessary settings on the upload are:

- delimiter: `,`
- quotechar: `"` (default)
- escapechar: `\` (default)
- has header: yes
- skip initial space: no (default)
- skip initial N lines: 1
- show N lines maximum in preview: anything above 4 (default)
In the preview, the column assignments are:

- `time.source`
- `source.ip`
- `malware.name`
- for example `event_description.text`
Settings:

- timezone: `+02:00`, for the line without time zone information
- classification type: `infected-system`
- classification identifier: for example `malware`
The fourth line is invalid (bad timestamp).
On submission, the box should say "Successfully processed 3 lines.".
To have a look at what bots actually do, and for testing purposes, it is often useful to start bots in the foreground with detailed logging.
This is what `intelmqctl run` is for. Details can be found in the documentation of intelmqctl and with `intelmqctl run -h`. `-h` or `--help` are also available for the various subcommands.
Find out how you can check which country the IP address `131.130.254.77` is in, according to the previously configured MaxMind Geolocation lookup bot.
But do not actually insert this data into the processing pipeline of IntelMQ.
Hint: The above IP address is represented in IntelMQ as `{"source.ip": "131.130.254.77"}`
Click to see the answer.
intelmq@malaga:~$ intelmqctl run MaxMind-GeoIP-Expert process -m '{"source.ip": "131.130.254.77"}' -d -s
Starting MaxMind-GeoIP-Expert...
MaxMind-GeoIP-Expert: GeoIPExpertBot initialized with id MaxMind-GeoIP-Expert and intelmq 2.1.1 and python 3.7.3 (default, Apr 3 2019, 05:39:12) as process 22983.
MaxMind-GeoIP-Expert: Bot is starting.
MaxMind-GeoIP-Expert: Bot initialization completed.
MaxMind-GeoIP-Expert: * Message from cli will be used when processing.
MaxMind-GeoIP-Expert: * Dryrun only, no message will be really sent through.
MaxMind-GeoIP-Expert: Processing...
[
{
"source.geolocation.cc": "AT",
"source.geolocation.city": "Vienna",
"source.geolocation.latitude": 48.2006,
"source.geolocation.longitude": 16.3672,
"source.ip": "131.130.254.77"
}
]
MaxMind-GeoIP-Expert: DRYRUN: Message would be sent now to '_default'!
MaxMind-GeoIP-Expert: DRYRUN: Message would be acknowledged now!
(your output might vary slightly).
The country is Austria; this is actually the IP address of cert.at.
`/var/lib/intelmq/bots/sql/ti-teams.sqlite` contains a table `ti` with two columns: `cc` with two-letter country codes and `email` with a comma-separated list of email addresses.
The data is from TI and contains all national CERTs listed there.
Configure a bot so that all data (with country information) gets the national CERT's addresses as `source.abuse_contact`.
Click to see the answer.
The bot is the "Generic DB Lookup" Expert.
database
:/var/lib/intelmq/bots/sql/ti-teams.sqlite
engine
:sqlite
host
: not relevantmatch_fields
:{"source.geolocation.cc": "cc"}
overwrite
:true
password
: not relevantport
: not relevantreplace_fields
:{"email": "source.abuse_contact"}
sslmode
: not relevanttable
:ti
user
: not relevant
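A sketch of the corresponding runtime configuration entry (the bot ID is an example; verify the module path against your IntelMQ version):

```yaml
ti-abuse-contact-expert:
  description: Looks up national CERT contacts by country code
  module: intelmq.bots.experts.generic_db_lookup.expert
  parameters:
    engine: sqlite
    database: /var/lib/intelmq/bots/sql/ti-teams.sqlite
    table: ti
    match_fields:
      source.geolocation.cc: cc
    replace_fields:
      email: source.abuse_contact
    overwrite: true
```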
An SMTP server is running on localhost port 25 without authentication. A webmail client is running at http://localhost:8080/roundcube/; login is possible for example as `user`/`user` or `intelmq`/`intelmq`. `user` is a catch-all for any non-existing mail addresses, including all domains.
Hint: The default configuration for the SMTP Bot has STARTTLS set to true, which is not supported by the local mailserver configuration.
Configure the SMTP Output bot so that it sends events to the abuse contact fetched by the previously configured bot.
Click to see the answer.
- `fieldnames`: As you like, for example `time.source,feed.name,classification.taxonomy,classification.type,classification.identifier,source.asn,source.network,source.ip,source.fqdn,source.reverse_dns,source.geolocation.cc`
- `mail_from`: As you like, for example `intelmq@localhost`
- `mail_to`: `{ev[source.abuse_contact]}`
- `smtp_host`: `localhost`
- `smtp_password`: `null`
- `smtp_port`: `25`
- `smtp_username`: `null`
- `ssl`: `false`
- `starttls`: `false`
- `subject`: As you like
- `text`: As you like
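As a sketch, the same parameters in a runtime configuration entry (the bot ID, subject, text and the shortened fieldnames are examples; verify the module path against your IntelMQ version):

```yaml
smtp-output:
  module: intelmq.bots.outputs.smtp.output
  parameters:
    smtp_host: localhost
    smtp_port: 25
    smtp_username: null
    smtp_password: null
    ssl: false
    starttls: false
    mail_from: intelmq@localhost
    mail_to: "{ev[source.abuse_contact]}"   # templated per event
    subject: IntelMQ notification
    fieldnames: time.source,classification.type,source.ip
    text: See the attached data.
```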
RabbitMQ can be used as the message queue instead of Redis. How this switch can be made is described in the Configuration and Management chapter of the User Guide.
First start the RabbitMQ server:
sudo systemctl start rabbitmq-server.service
The management interface is available at port 15672; you can log in with the credentials `admin`/`admin`. You will see the queues, their sizes and statistics after IntelMQ has been started.
Stop the IntelMQ botnet: `intelmqctl stop`
In `/etc/intelmq/runtime.yaml` create a "global" section for system-wide configuration and set these parameters (see the sketch after this list):

- `source_pipeline_broker` to `amqp`
- `destination_pipeline_broker` to `amqp`
- `source_pipeline_port` to `5672`, or remove it (the default for AMQP kicks in then)
- `destination_pipeline_port` to `5672`, or remove it (the default for AMQP kicks in then)
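A minimal sketch of the resulting section:

```yaml
global:
  source_pipeline_broker: amqp
  destination_pipeline_broker: amqp
  source_pipeline_port: 5672        # optional, 5672 is the AMQP default
  destination_pipeline_port: 5672   # optional, see above
```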
Now you can start the IntelMQ botnet again: `intelmqctl start`
In the RabbitMQ web interface, watch the statistics of the queues.
There is currently no standard solution to group data and send it out to the corresponding recipients, as the workflows of IntelMQ users differ so much. However, you can have a look at, or use, the two approaches described here.
You can use output bots like SMTP or the Request Tracker bot to send data to recipients. But this data is not grouped, which means there is one event per e-mail/ticket.
IntelMQ Mailgen is a solution used by BSI/CERTBUND to send grouped notifications to network owners who do not directly interface with a ticketing system. The data is retrieved from a PostgreSQL database and sent using SMTP, but the subject contains a unique ticket identifier. Responses to the e-mails then land in the OTRS system and can be dealt with there. This approach also reduces the load on the ticketing system.
CERT.at directly interfaces with RTIR after collecting the data in a PostgreSQL database as well. The tool suppresses ("squelches") events during a specified time period to avoid too much noise at the recipient's end. The tools are not well documented, but can be found in the CERT.at fork of IntelMQ. There is ongoing work to generalize the code and make it more easily available.
See also https://intelmq.readthedocs.io/en/maintenance/user/ecosystem.html
Fody is an interface for intelmq-mailgen's contact database, its OTRS and the EventDB. The certbund-contact expert fetches the information from this contact database and provides scripts to import RIPE data into it.
The Malware Name Mapping is a project which evolved from IntelMQ and is maintained under the certtools umbrella organization. Its sole purpose is providing a mapping of various (feed-specific and accurate) malware names to well-known and more generic malware family names.
IntelMQ includes tools in its contrib sub-tree to download and convert the mapping for use in IntelMQ.
The link above describes how the integration into IntelMQ works and how you can use the Modify bot to apply the mapping to your data. In the VM, the download script can be found at `/usr/local/bin/download_mapping.py`. Call the script with `--help` to get an overview of the parameters and a short documentation.
The "stats portal" is a framework to generate statistics from a PostgreSQL EventDB using Grafana. More information can be found at github.com/certtools/stats-portal.
Now continue with lesson four.