AVRO-3479: [rust] Avro Schema Derive Proc Macro #1631

Merged
merged 34 commits into from
Apr 16, 2022
Changes from 11 commits
d55efdf
port crate
jklamer Mar 4, 2022
99bd6d7
namespace port
jklamer Mar 4, 2022
2d4e9e8
dev depends
jklamer Mar 4, 2022
5f7db8f
resolved against main
jklamer Mar 14, 2022
019ff71
Cons list tests
jklamer Mar 6, 2022
490195c
rebased onto master resolution
jklamer Mar 14, 2022
15edb59
namespace attribute in derive
jklamer Mar 30, 2022
dabd3f2
std pointers
jklamer Mar 30, 2022
1cf8d10
References, testing, and refactoring
jklamer Apr 2, 2022
b98512b
[AVRO-3479] Clean up for PR
jklamer Apr 2, 2022
a81b358
AVRO-3479: Add missing ASL2 headers
martin-g Apr 8, 2022
05b1286
AVRO-3479: Minor improvements
martin-g Apr 8, 2022
5a43cd2
Schema assertions and PR comments
jklamer Apr 9, 2022
52b1c42
test failure fixing
jklamer Apr 9, 2022
39ac767
add readme
jklamer Apr 10, 2022
113f31f
README + implementation guide + bug fix with enclosing namespaces
jklamer Apr 10, 2022
b82e00d
AVRO-3479: Minor improvements
martin-g Apr 11, 2022
892b249
AVRO-3479: Fix typos
martin-g Apr 11, 2022
ff75150
AVRO-3479: Use darling crate to parse derive attributes
martin-g Apr 11, 2022
47ee2d1
darling for NamedTypes and fields
jklamer Apr 12, 2022
9f7d9a6
AVRO-3479 pr review naming
jklamer Apr 12, 2022
ed4d649
AVRO-3479 doc comment doc and small tests
jklamer Apr 15, 2022
8da964a
AVRO-3479 featurize
jklamer Apr 15, 2022
78db295
AVRO-3479 cargo engineering
jklamer Apr 15, 2022
cc9fe3c
Fix a docu warning:
martin-g Apr 15, 2022
c553bbd
AVRO-3479: Rename avro_derive to apache-avro-derive
martin-g Apr 15, 2022
4be7583
AVRO-3479: Use fqn for Mutex
martin-g Apr 15, 2022
a10e19a
AVRO-3479: Update darling to 0.14.0
martin-g Apr 15, 2022
632aabf
AVRO-3479: Fix the version of apache-avro-derive
martin-g Apr 15, 2022
bfb43da
AVRO-3479: Minor cleanups
martin-g Apr 15, 2022
984f0fa
AVRO-3479: Inline a pub function that is used only in avro_derive
martin-g Apr 15, 2022
65d0a04
AVRO-3479: Derive Schema::Long for u32
martin-g Apr 16, 2022
7574a05
Merge branch 'master' into jklamer/avro-derive
martin-g Apr 16, 2022
c59e961
AVRO-3479: Bump dependencies to their latest versions
martin-g Apr 16, 2022
1 change: 1 addition & 0 deletions lang/rust/Cargo.toml
@@ -18,4 +18,5 @@
[workspace]
members = [
"avro",
"avro_derive"
]
132 changes: 132 additions & 0 deletions lang/rust/avro/src/schema.rs
@@ -32,6 +32,7 @@ use std::{
fmt,
hash::Hash,
str::FromStr,
sync::Mutex,
};
use strum_macros::{EnumDiscriminants, EnumString};

@@ -1498,6 +1499,137 @@ fn field_ordering_position(field: &str) -> Option<usize> {
.map(|pos| pos + 1)
}

pub fn record_schema_for_fields(
name: Name,
aliases: Aliases,
doc: Documentation,
fields: Vec<RecordField>,
) -> Schema {
let lookup: HashMap<String, usize> = fields
.iter()
.map(|field| (field.name.to_owned(), field.position))
.collect();
Schema::Record {
name,
aliases,
doc,
fields,
lookup,
}
}

pub trait AvroSchema {
fn get_schema() -> Schema;
}

/// TODO: Help me name this. The idea here is that any previously parsed or constructed schema with a name is registered in the resolved schemas and passed along recursively to avoid infinite recursion.
pub trait AvroSchemaWithResolved {
fn get_schema_with_resolved(resolved_schemas: &mut Names) -> Schema;
}

impl<T> AvroSchema for T
where
T: AvroSchemaWithResolved,
{
fn get_schema() -> Schema {
T::get_schema_with_resolved(&mut HashMap::default())
}
}

macro_rules! impl_schema(
($type:ty, $variant_constructor:expr) => (
impl AvroSchemaWithResolved for $type {
fn get_schema_with_resolved(_: &mut HashMap<Name, Schema>) -> Schema {
$variant_constructor
}
}
);
);

impl_schema!(i8, Schema::Int);
impl_schema!(i16, Schema::Int);
impl_schema!(i32, Schema::Int);
impl_schema!(i64, Schema::Long);
impl_schema!(u8, Schema::Int);
impl_schema!(u16, Schema::Int);
impl_schema!(f32, Schema::Float);
impl_schema!(f64, Schema::Double);
impl_schema!(char, Schema::String);
impl_schema!(String, Schema::String);
impl_schema!(uuid::Uuid, Schema::Uuid);
impl_schema!(core::time::Duration, Schema::Duration);

impl<T> AvroSchemaWithResolved for Vec<T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
Schema::Array(Box::new(T::get_schema_with_resolved(resolved_schemas)))
}
}

impl<T> AvroSchemaWithResolved for Option<T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
let inner_schema = T::get_schema_with_resolved(resolved_schemas);
Schema::Union(UnionSchema {
schemas: vec![Schema::Null, inner_schema.clone()],
variant_index: vec![Schema::Null, inner_schema]
.iter()
.enumerate()
.map(|(idx, s)| (SchemaKind::from(s), idx))
.collect(),
})
}
}

impl<T> AvroSchemaWithResolved for Map<String, T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
Schema::Map(Box::new(T::get_schema_with_resolved(resolved_schemas)))
}
}

impl<T> AvroSchemaWithResolved for HashMap<String, T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
Schema::Map(Box::new(T::get_schema_with_resolved(resolved_schemas)))
}
}

impl<T> AvroSchemaWithResolved for Box<T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
T::get_schema_with_resolved(resolved_schemas)
}
}

impl<T> AvroSchemaWithResolved for Mutex<T>
where
T: AvroSchemaWithResolved,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
T::get_schema_with_resolved(resolved_schemas)
}
}

impl<T> AvroSchemaWithResolved for Cow<'_, T>
where
T: AvroSchemaWithResolved + Clone,
{
fn get_schema_with_resolved(resolved_schemas: &mut HashMap<Name, Schema>) -> Schema {
T::get_schema_with_resolved(resolved_schemas)
}
}

#[cfg(test)]
mod tests {
use super::*;
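The `resolved_schemas` argument threaded through the `AvroSchemaWithResolved` impls above is what allows a self-referential type (such as the cons list mentioned in the commit history) to produce a finite schema. The following is a minimal, self-contained sketch of that idea, not the code from this PR: `Schema` and `Names` are simplified stand-ins for the crate's types, `Names` is keyed by a plain `String`, and the `Ref` variant and the hand-written `ConsList` impl are assumptions made for illustration.

```rust
use std::collections::HashMap;

// Simplified stand-in for apache_avro's Schema (assumption, not the real enum).
#[derive(Debug, Clone, PartialEq)]
pub enum Schema {
    Null,
    Int,
    // Hypothetical placeholder that points back at an already-registered named schema.
    Ref(String),
    Union(Vec<Schema>),
    Record { name: String, fields: Vec<(String, Schema)> },
}

// Stand-in for the crate's `Names` map, keyed by a plain string here.
pub type Names = HashMap<String, Schema>;

pub trait AvroSchemaWithResolved {
    fn get_schema_with_resolved(resolved: &mut Names) -> Schema;
}

impl AvroSchemaWithResolved for i32 {
    fn get_schema_with_resolved(_: &mut Names) -> Schema {
        Schema::Int
    }
}

impl<T: AvroSchemaWithResolved> AvroSchemaWithResolved for Option<T> {
    fn get_schema_with_resolved(resolved: &mut Names) -> Schema {
        Schema::Union(vec![Schema::Null, T::get_schema_with_resolved(resolved)])
    }
}

impl<T: AvroSchemaWithResolved> AvroSchemaWithResolved for Box<T> {
    fn get_schema_with_resolved(resolved: &mut Names) -> Schema {
        // A Box is transparent: it has the schema of its contents.
        T::get_schema_with_resolved(resolved)
    }
}

#[allow(dead_code)]
pub struct ConsList {
    pub value: i32,
    pub next: Option<Box<ConsList>>,
}

// Hand-written impl; a derive macro would be expected to generate something similar.
impl AvroSchemaWithResolved for ConsList {
    fn get_schema_with_resolved(resolved: &mut Names) -> Schema {
        let name = "ConsList".to_owned();
        // Already being built further up the call stack: emit a reference and stop.
        if resolved.contains_key(&name) {
            return Schema::Ref(name);
        }
        // Register the name before visiting the fields so the recursion terminates.
        resolved.insert(name.clone(), Schema::Ref(name.clone()));
        let fields = vec![
            ("value".to_owned(), i32::get_schema_with_resolved(resolved)),
            (
                "next".to_owned(),
                <Option<Box<ConsList>>>::get_schema_with_resolved(resolved),
            ),
        ];
        Schema::Record { name, fields }
    }
}
```

The key design point is that the name is registered before the fields are visited; without that early insert, the `next` field would call back into `ConsList` indefinitely.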
33 changes: 33 additions & 0 deletions lang/rust/avro_derive/Cargo.toml
@@ -0,0 +1,33 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

[package]
name = "avro_derive"
version = "0.1.0"
edition = "2021"

[lib]
proc-macro = true

[dependencies]
syn = {version= "1.0.60", features=["full", "fold"]}
quote = "1.0.8"
proc-macro2 = "1.0"

[dev-dependencies]
serde = { version = "1.0.130", features = ["derive"] }
apache-avro = { path = "../avro" }
31 changes: 31 additions & 0 deletions lang/rust/avro_derive/examples/bad_examples.rs
@@ -0,0 +1,31 @@
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.

// use apache_avro::schema::{AvroSchema, AvroSchemaWithResolved};
// use apache_avro::{from_value, Reader, Schema, Writer};
// use avro_derive::*;
// use serde::de::DeserializeOwned;
// use serde::ser::Serialize;
// use std::collections::HashMap;

// #[macro_use]
// extern crate serde;

/// This module should not compile. This is an examples page for many common errors when using the avro_derive functionality. The errors should be handled gracefully, and explained in detail here.
///
mod examples {}
fn main() {}
95 changes: 95 additions & 0 deletions lang/rust/avro_derive/proposal_supporting_docs.md
@@ -0,0 +1,95 @@
<!---
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

The trait defined in `schema.rs`:
```
pub trait AvroSchema {
fn get_schema() -> Schema;
}
```

##### Reasoning/Desires
The trait uses an associated function rather than an associated const so that schema construction can call non-const functions. Ideally the associated function would return `&'static Schema`, but I have yet to figure out how to do that without some global state, which is undesirable.
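To make the trade-off concrete, here is a small self-contained sketch. It assumes a simplified stand-in `Schema` enum rather than the crate's real type, and the `test_schema_static` helper is hypothetical. It shows why an associated function is convenient (building the schema allocates `String`s and a `Vec`, which a const cannot do) and what the `&'static Schema` variant would cost: a static cache, i.e. exactly the kind of global state the paragraph above wants to avoid.

```rust
use std::sync::OnceLock;

// Simplified stand-in for apache_avro's Schema, so the sketch compiles on its own.
#[derive(Debug, Clone, PartialEq)]
pub enum Schema {
    Long,
    String,
    Record { name: String, fields: Vec<(String, Schema)> },
}

pub trait AvroSchema {
    // An associated function may allocate and call non-const code.
    fn get_schema() -> Schema;
}

pub struct Test;

impl AvroSchema for Test {
    fn get_schema() -> Schema {
        // `to_owned` and `vec!` allocate, so this body could not be an associated const.
        Schema::Record {
            name: "Test".to_owned(),
            fields: vec![
                ("a".to_owned(), Schema::Long),
                ("b".to_owned(), Schema::String),
            ],
        }
    }
}

// Hypothetical helper: returns `&'static Schema` by caching the value in a
// function-local static. This is a form of global state.
pub fn test_schema_static() -> &'static Schema {
    static CACHE: OnceLock<Schema> = OnceLock::new();
    CACHE.get_or_init(Test::get_schema)
}
```

With the cache, every call returns the same reference and the schema is built once; with the plain associated function, each call builds a fresh owned `Schema`.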

##### Desired user workflow
Anything that can be serialized "the serde way" should be able to be serialized/deserialized without further configuration.

Current Flow
```
use apache_avro::Schema;

let raw_schema = r#"
{
"type": "record",
"name": "test",
"fields": [
{"name": "a", "type": "long", "default": 42},
{"name": "b", "type": "string"}
]
}
"#;

use apache_avro::Writer;
use serde::Serialize;

#[derive(Debug, Serialize)]
struct Test {
a: i64,
b: String,
}

// if the schema is not valid, this function will return an error
let schema = Schema::parse_str(raw_schema).unwrap();

let mut writer = Writer::new(&schema, Vec::new());
let test = Test {
a: 27,
b: "foo".to_owned(),
};
writer.append_ser(test).unwrap();
let encoded = writer.into_inner();
```

New Flow
```
use apache_avro::schema::AvroSchema;
use apache_avro::Writer;
use avro_derive::*;
use serde::Serialize;

#[derive(Debug, Serialize, AvroSchema)]
struct Test {
a: i64,
b: String,
}
// derived schema, always valid or code fails to compile with a descriptive message
let schema = Test::get_schema();

let mut writer = Writer::new(&schema, Vec::new());
let test = Test {
a: 27,
b: "foo".to_owned(),
};
writer.append_ser(test).unwrap();
let encoded = writer.into_inner();
```
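One way to picture the new flow is that the derived schema of a struct is composed from the schemas of its field types. The sketch below is a rough illustration under stated assumptions, not the macro's actual output: `Schema` is a simplified stand-in for the crate's type, the `impl AvroSchema for Test` is written by hand where the derive would generate it, and `schema_of` is a hypothetical helper.

```rust
// Simplified stand-in for apache_avro's Schema, reduced to what this sketch needs.
#[derive(Debug, Clone, PartialEq)]
pub enum Schema {
    Long,
    String,
    Array(Box<Schema>),
    Record { name: String, fields: Vec<(String, Schema)> },
}

pub trait AvroSchema {
    fn get_schema() -> Schema;
}

impl AvroSchema for i64 {
    fn get_schema() -> Schema {
        Schema::Long
    }
}

impl AvroSchema for String {
    fn get_schema() -> Schema {
        Schema::String
    }
}

impl<T: AvroSchema> AvroSchema for Vec<T> {
    fn get_schema() -> Schema {
        Schema::Array(Box::new(T::get_schema()))
    }
}

#[allow(dead_code)]
pub struct Test {
    pub a: i64,
    pub b: String,
}

// Hand-written stand-in for what a derive could generate (assumption):
// each field's schema comes from the field type's own impl.
impl AvroSchema for Test {
    fn get_schema() -> Schema {
        Schema::Record {
            name: "Test".to_owned(),
            fields: vec![
                ("a".to_owned(), <i64 as AvroSchema>::get_schema()),
                ("b".to_owned(), <String as AvroSchema>::get_schema()),
            ],
        }
    }
}

// Hypothetical generic helper: the schema is selected by the type parameter alone.
pub fn schema_of<T: AvroSchema>() -> Schema {
    T::get_schema()
}
```

Because the schema is tied to the type, generic code can ask for it without a schema string being passed around, for example `schema_of::<Vec<Test>>()`.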


##### crate import
This functionality is available as an optional feature (modeled on serde).

Cargo.toml
```
apache-avro = { version = "X.Y.Z", features = ["derive"] }
```