Examples of how to use the command-line tools in Avro Tools to read and write Avro files.
See my original article Reading and Writing Avro Files From the Command Line from April 2013 for more information about using Avro Tools.
Table of Contents
- Getting Avro Tools
- JSON to binary Avro
- Binary Avro to JSON
- Retrieve Avro schema from binary Avro
- Related tools
Getting Avro Tools
You can get a copy of the latest stable Avro Tools jar file from the Avro Releases page. The actual file is in the java subdirectory of a given Avro release version.
Here is a direct link to avro-tools-1.7.7.jar (12 MB) on the US Apache mirror site.
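As a rough sketch of the directory layout described above, the helper below builds a download URL for a given release version. The function name `avro_tools_url` and the choice of the Apache archive as the base URL are assumptions made for illustration; a mirror listed on the Avro Releases page may use a different base.

```shell
# Assumption: the Apache archive keeps each release under this base URL.
AVRO_DIST_BASE="https://archive.apache.org/dist/avro"

# Print the download URL of the Avro Tools jar for a given release version.
# The jar sits in the "java" subdirectory of the release directory.
avro_tools_url() {
    version="$1"
    printf '%s/avro-%s/java/avro-tools-%s.jar\n' \
        "$AVRO_DIST_BASE" "$version" "$version"
}

# Example usage (needs network access, so it is left commented out):
#   curl -fL -o ~/avro-tools-1.7.7.jar "$(avro_tools_url 1.7.7)"
```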
The examples below use the following files:
- twitter.avro -- data records in uncompressed binary Avro format
- twitter.snappy.avro -- data records in Snappy-compressed binary Avro format
- twitter.avsc -- Avro schema of the example data
- twitter.json -- data records in plain-text JSON format
- twitter.pretty.json -- data records in pretty-printed JSON format
JSON to binary Avro
Without compression:
$ java -jar ~/avro-tools-1.7.7.jar fromjson --schema-file twitter.avsc twitter.json > twitter.avro
With Snappy compression:
$ java -jar ~/avro-tools-1.7.7.jar fromjson --codec snappy --schema-file twitter.avsc twitter.json > twitter.snappy.avro
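Because fromjson writes binary data to standard output, it can be worth a quick sanity check that the resulting file really is an Avro container file. The sketch below relies on the fact that Avro object container files begin with the four bytes 'O', 'b', 'j', 0x01. The helper name `is_avro_container` is hypothetical and not part of Avro Tools.

```shell
# Return success if the given file starts with the Avro container magic
# bytes 'O', 'b', 'j', 0x01 (hex 4f 62 6a 01).
is_avro_container() {
    # Dump the first four bytes as hex and strip spaces and newlines.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    [ "$magic" = "4f626a01" ]
}

# Example usage (assumes the files from the commands above exist):
#   is_avro_container twitter.avro && echo "looks like an Avro file"
```

This only inspects the file header. It does not validate the schema or the data blocks.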
Binary Avro to JSON
The same command works on both uncompressed and compressed data.
$ java -jar ~/avro-tools-1.7.7.jar tojson twitter.avro > twitter.json
$ java -jar ~/avro-tools-1.7.7.jar tojson twitter.snappy.avro > twitter.json
Output:
{"username":"miguno","tweet":"Rock: Nerf paper, scissors is fine.","timestamp":1366150681}
{"username":"BlizzardCS","tweet":"Works as intended. Terran is IMBA.","timestamp":1366154481}
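Since the default output has one JSON record per line, it can be piped straight into standard text tools. The following is a minimal sketch with two hypothetical helpers, `count_records` and `usernames`, that read tojson output from standard input. The field extraction is deliberately simple and assumes the username value contains no escaped quotes; a real JSON parser is the safer choice for anything beyond a quick look.

```shell
# Count records, relying on one JSON object per line.
count_records() {
    grep -c '^{'
}

# Print the "username" value of each record, one per line.
# Fragile: assumes the key and value are written as "username":"..."
# and that the value contains no escaped double quotes.
usernames() {
    grep -o '"username":"[^"]*"' | cut -d'"' -f4
}

# Example usage (assumes twitter.avro exists):
#   java -jar ~/avro-tools-1.7.7.jar tojson twitter.avro | count_records
#   java -jar ~/avro-tools-1.7.7.jar tojson twitter.avro | usernames
```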
You can also pretty-print the JSON output with the -pretty parameter:
$ java -jar ~/avro-tools-1.7.7.jar tojson -pretty twitter.avro > twitter.pretty.json
$ java -jar ~/avro-tools-1.7.7.jar tojson -pretty twitter.snappy.avro > twitter.pretty.json
Output:
{
  "username" : "miguno",
  "tweet" : "Rock: Nerf paper, scissors is fine.",
  "timestamp" : 1366150681
}
{
  "username" : "BlizzardCS",
  "tweet" : "Works as intended. Terran is IMBA.",
  "timestamp" : 1366154481
}
Retrieve Avro schema from binary Avro
The same command works on both uncompressed and compressed data.
$ java -jar ~/avro-tools-1.7.7.jar getschema twitter.avro > twitter.avsc
$ java -jar ~/avro-tools-1.7.7.jar getschema twitter.snappy.avro > twitter.avsc
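Both commands above write to the same twitter.avsc file. When extracting schemas from several Avro files at once, one option is to derive a separate output name per input file. The helper `avsc_name` below is a hypothetical sketch of that idea, not something shipped with Avro Tools.

```shell
# Derive the schema file name from an Avro file name by swapping the
# ".avro" suffix for ".avsc", e.g. twitter.snappy.avro -> twitter.snappy.avsc.
avsc_name() {
    printf '%s\n' "${1%.avro}.avsc"
}

# Example usage (assumes the jar and the .avro files exist):
#   for f in *.avro; do
#       java -jar ~/avro-tools-1.7.7.jar getschema "$f" > "$(avsc_name "$f")"
#   done
```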
Related tools
You can also take a look at the CLI tools avrocat, avromod, and avropipe, which are part of the Avro suite. You must build these tools yourself by following their respective INSTALL instructions.