Add IPPREFIX type (#11122)
Summary:
This PR adds only the IPPrefix type classes; CAST logic is not implemented. The next PR for the IPPrefix type will extend the fuzzers to cover it. After that, we will add the CAST logic so that it can be tested with fuzzers from the start.

The full logic for IPPrefix is available in these PRs:
Original PR: #10538
Original Split PR: #10816

Pull Request resolved: #11122

Reviewed By: Yuhta

Differential Revision: D64917429

Pulled By: pedroerp

fbshipit-source-id: 2aef68d20de20673d9cc9239c98178ffc803860d
mohsaka authored and facebook-github-bot committed Oct 28, 2024
1 parent 624a21c commit c8ac4e3
Showing 12 changed files with 260 additions and 8 deletions.
22 changes: 20 additions & 2 deletions velox/docs/develop/types.rst
@@ -172,6 +172,7 @@ JSON VARCHAR
TIMESTAMP WITH TIME ZONE  BIGINT
UUID                      HUGEINT
IPADDRESS                 HUGEINT
IPPREFIX                  ROW(HUGEINT,TINYINT)
========================  =====================

TIMESTAMP WITH TIME ZONE represents a time point in milliseconds precision
@@ -182,14 +183,31 @@ Supported range of milliseconds is [0xFFF8000000000000L, 0x7FFFFFFFFFFFF]
store timezone ID. Supported range of timezone ID is [1, 1680].
The definition of timezone IDs can be found in ``TimeZoneDatabase.cpp``.

-IPADDRESS represents an IPV6 or IPV4 formatted IPV6 address. Its physical
-type is HUGEINT. The format that the address is stored in is defined as part of `(RFC 4291#section-2.5.5.2) <https://datatracker.ietf.org/doc/html/rfc4291.html#section-2.5.5.2>`_
+IPADDRESS represents an IPv6 or IPv4 formatted IPv6 address. Its physical
+type is HUGEINT. The format that the address is stored in is defined as part of `RFC 4291#section-2.5.5.2 <https://datatracker.ietf.org/doc/html/rfc4291.html#section-2.5.5.2>`_.
Because Velox runs on little-endian systems and the standard uses network byte
(big-endian) order, we reverse the bytes to allow for masking and other bit operations
used in IPADDRESS/IPPREFIX related functions. This type can be used to
create IPPREFIX networks as well as to check IPADDRESS validity within
IPPREFIX networks.
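
The byte reversal described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Velox implementation: `fromNetworkBytes` and the `uint128` alias are hypothetical names, the host is assumed to be little-endian, and the compiler is assumed to provide the `unsigned __int128` extension (GCC or Clang).

```cpp
#include <algorithm>
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical alias for illustration; relies on a GCC/Clang extension.
using uint128 = unsigned __int128;

// Converts 16 address bytes in network (big-endian) order into a 128-bit
// integer. Assumes a little-endian host: after reversing, the last network
// byte sits at the lowest memory address and becomes the least significant
// byte of the integer.
inline uint128 fromNetworkBytes(std::array<uint8_t, 16> bytes) {
  std::reverse(bytes.begin(), bytes.end());
  uint128 value = 0;
  std::memcpy(&value, bytes.data(), sizeof(value));
  return value;
}
```

Once the address is held this way, the first byte of the address is the most significant byte of the integer, so ordinary shifts and masks operate on the leading (network) bits.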

IPPREFIX represents an IPv6 or IPv4 formatted IPv6 address along with a one-byte
prefix length. Its physical type is ROW(HUGEINT, TINYINT). The IP address is stored in
the HUGEINT in the form defined in `RFC 4291#section-2.5.5.2 <https://datatracker.ietf.org/doc/html/rfc4291.html#section-2.5.5.2>`_.
The prefix length is stored in the TINYINT.
The stored IP address is the canonical (smallest) IP address in the
subnet range. This type can be used in IP subnet functions.

Example:

In this example, the first 32 bits (*FFFF:FFFF*) represent the network prefix.
As a result, both of these IPPREFIX objects store the address *FFFF:FFFF::* and the length 32.

::

  IPPREFIX 'FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF:FFFF/32' -- IPPREFIX 'FFFF:FFFF:0000:0000:0000:0000:0000:0000/32'
  IPPREFIX 'FFFF:FFFF:4455:6677:8899:AABB:CCDD:EEFF/32' -- IPPREFIX 'FFFF:FFFF:0000:0000:0000:0000:0000:0000/32'
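
The canonicalization shown in the example amounts to clearing the host bits of the address. The sketch below illustrates that masking step only; this commit does not implement CAST, so it should not be read as the Velox code. The names `makeAddress`, `canonicalAddress`, and `uint128` are hypothetical, and an unsigned 128-bit integer (a GCC/Clang extension) is used instead of Velox's signed `int128_t` to keep the shifts simple.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical alias for illustration; relies on a GCC/Clang extension.
using uint128 = unsigned __int128;

// Builds a 128-bit address from its upper and lower 64-bit halves.
inline uint128 makeAddress(uint64_t high, uint64_t low) {
  return (static_cast<uint128>(high) << 64) | low;
}

// Returns the smallest address in the subnet by clearing the low
// (128 - prefixLength) bits of the address.
inline uint128 canonicalAddress(uint128 address, uint8_t prefixLength) {
  assert(prefixLength <= 128);
  if (prefixLength == 0) {
    // Shifting a 128-bit value by 128 is undefined, so handle /0 directly.
    return 0;
  }
  const uint128 mask = ~static_cast<uint128>(0) << (128 - prefixLength);
  return address & mask;
}
```

Under these assumptions, both addresses from the example reduce to the same value when masked with a prefix length of 32, which is why the two IPPREFIX objects compare as the same network.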

Spark Types
~~~~~~~~~~~~
The `data types <https://spark.apache.org/docs/latest/sql-ref-datatypes.html>`_ in Spark have some semantic differences compared to those in
2 changes: 2 additions & 0 deletions velox/expression/tests/CustomTypeTest.cpp
@@ -217,6 +217,7 @@ TEST_F(CustomTypeTest, getCustomTypeNames) {
"TIMESTAMP WITH TIME ZONE",
"UUID",
"IPADDRESS",
"IPPREFIX",
}),
names);

@@ -231,6 +232,7 @@ TEST_F(CustomTypeTest, getCustomTypeNames) {
"TIMESTAMP WITH TIME ZONE",
"UUID",
"IPADDRESS",
"IPPREFIX",
"FANCY_INT",
}),
names);
2 changes: 2 additions & 0 deletions velox/functions/prestosql/IPAddressFunctions.h
@@ -16,11 +16,13 @@
#pragma once

#include "velox/functions/prestosql/types/IPAddressType.h"
#include "velox/functions/prestosql/types/IPPrefixType.h"

namespace facebook::velox::functions {

void registerIPAddressFunctions(const std::string& prefix) {
  registerIPAddressType();
  registerIPPrefixType();
}

} // namespace facebook::velox::functions
3 changes: 3 additions & 0 deletions velox/functions/prestosql/TypeOf.cpp
@@ -16,6 +16,7 @@
#include "velox/expression/VectorFunction.h"
#include "velox/functions/prestosql/types/HyperLogLogType.h"
#include "velox/functions/prestosql/types/IPAddressType.h"
#include "velox/functions/prestosql/types/IPPrefixType.h"
#include "velox/functions/prestosql/types/JsonType.h"
#include "velox/functions/prestosql/types/TimestampWithTimeZoneType.h"
#include "velox/functions/prestosql/types/UuidType.h"
@@ -78,6 +79,8 @@ std::string typeName(const TypePtr& type) {
    case TypeKind::VARBINARY:
      if (isHyperLogLogType(type)) {
        return "HyperLogLog";
      } else if (isIPPrefixType(type)) {
        return "ipprefix";
      }
      return "varbinary";
    case TypeKind::TIMESTAMP:
8 changes: 4 additions & 4 deletions velox/functions/prestosql/tests/IPAddressCastTest.cpp
@@ -24,28 +24,28 @@ namespace {
 class IPAddressCastTest : public functions::test::FunctionBaseTest {
  protected:
   std::optional<std::string> castToVarchar(
-      const std::optional<std::string> input) {
+      const std::optional<std::string>& input) {
     auto result = evaluateOnce<std::string>(
         "cast(cast(c0 as ipaddress) as varchar)", input);
     return result;
   }

   std::optional<int128_t> castFromVarbinary(
-      const std::optional<std::string> input) {
+      const std::optional<std::string>& input) {
     auto result =
         evaluateOnce<int128_t>("cast(from_hex(c0) as ipaddress)", input);
     return result;
   }

-  std::optional<std::string> allCasts(const std::optional<std::string> input) {
+  std::optional<std::string> allCasts(const std::optional<std::string>& input) {
     auto result = evaluateOnce<std::string>(
         "cast(cast(cast(cast(c0 as ipaddress) as varbinary) as ipaddress) as varchar)",
         input);
     return result;
   }
 };

-int128_t stringToInt128(std::string value) {
+int128_t stringToInt128(const std::string& value) {
   int128_t res = 0;
   for (char c : value) {
     res = res * 10 + c - '0';
3 changes: 2 additions & 1 deletion velox/functions/prestosql/types/CMakeLists.txt
@@ -17,7 +17,8 @@ velox_add_library(
   JsonType.cpp
   TimestampWithTimeZoneType.cpp
   UuidType.cpp
-  IPAddressType.cpp)
+  IPAddressType.cpp
+  IPPrefixType.cpp)

velox_link_libraries(
velox_presto_types
77 changes: 77 additions & 0 deletions velox/functions/prestosql/types/IPPrefixType.cpp
@@ -0,0 +1,77 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include <folly/small_vector.h>

#include "velox/expression/CastExpr.h"
#include "velox/functions/prestosql/types/IPPrefixType.h"

namespace facebook::velox {

namespace {

class IPPrefixCastOperator : public exec::CastOperator {
 public:
  bool isSupportedFromType(const TypePtr& other) const override {
    return false;
  }

  bool isSupportedToType(const TypePtr& other) const override {
    return false;
  }

  void castTo(
      const BaseVector& input,
      exec::EvalCtx& context,
      const SelectivityVector& rows,
      const TypePtr& resultType,
      VectorPtr& result) const override {
    context.ensureWritable(rows, resultType, result);
    VELOX_NYI(
        "Cast from {} to IPPrefix not yet supported", input.type()->toString());
  }

  void castFrom(
      const BaseVector& input,
      exec::EvalCtx& context,
      const SelectivityVector& rows,
      const TypePtr& resultType,
      VectorPtr& result) const override {
    context.ensureWritable(rows, resultType, result);
    VELOX_NYI(
        "Cast from IPPrefix to {} not yet supported", resultType->toString());
  }
};

class IPPrefixTypeFactories : public CustomTypeFactories {
 public:
  TypePtr getType() const override {
    return IPPrefixType::get();
  }

  exec::CastOperatorPtr getCastOperator() const override {
    return std::make_shared<IPPrefixCastOperator>();
  }
};

} // namespace

void registerIPPrefixType() {
  registerCustomType(
      "ipprefix", std::make_unique<const IPPrefixTypeFactories>());
}

} // namespace facebook::velox
78 changes: 78 additions & 0 deletions velox/functions/prestosql/types/IPPrefixType.h
@@ -0,0 +1,78 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#pragma once

#include "velox/type/SimpleFunctionApi.h"
#include "velox/type/Type.h"

namespace facebook::velox {

class IPPrefixType : public RowType {
  IPPrefixType() : RowType({"ip", "prefix"}, {HUGEINT(), TINYINT()}) {}

 public:
  static const std::shared_ptr<const IPPrefixType>& get() {
    static const std::shared_ptr<const IPPrefixType> instance{
        new IPPrefixType()};

    return instance;
  }

  bool equivalent(const Type& other) const override {
    // Pointer comparison works since this type is a singleton.
    return this == &other;
  }

  const char* name() const override {
    return "IPPREFIX";
  }

  std::string toString() const override {
    return name();
  }

  folly::dynamic serialize() const override {
    folly::dynamic obj = folly::dynamic::object;
    obj["name"] = "Type";
    obj["type"] = name();
    return obj;
  }

  const std::vector<TypeParameter>& parameters() const override {
    static const std::vector<TypeParameter> kEmpty = {};
    return kEmpty;
  }
};

FOLLY_ALWAYS_INLINE bool isIPPrefixType(const TypePtr& type) {
  // Pointer comparison works since this type is a singleton.
  return IPPrefixType::get() == type;
}

FOLLY_ALWAYS_INLINE std::shared_ptr<const IPPrefixType> IPPREFIX() {
  return IPPrefixType::get();
}

struct IPPrefixT {
  using type = Row<int128_t, int8_t>;
  static constexpr const char* typeName = "ipprefix";
};

using IPPrefix = CustomType<IPPrefixT>;

void registerIPPrefixType();

} // namespace facebook::velox
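
The singleton pattern that makes the pointer comparisons in this header valid can be shown in isolation. The following is a simplified sketch, not Velox code: `Type` here is a stand-in base class, and `SingletonType` and `isSingletonType` are hypothetical names chosen for illustration.

```cpp
#include <memory>

// Stand-in for a polymorphic type base class (not the Velox Type class).
class Type {
 public:
  virtual ~Type() = default;
  virtual bool equivalent(const Type& other) const = 0;
};

class SingletonType : public Type {
  // Private constructor: the only instance is the one created in get().
  SingletonType() = default;

 public:
  static const std::shared_ptr<const SingletonType>& get() {
    // Function-local static: initialized once, on first use.
    static const std::shared_ptr<const SingletonType> instance{
        new SingletonType()};
    return instance;
  }

  bool equivalent(const Type& other) const override {
    // Only one instance exists, so identity implies equivalence.
    return this == &other;
  }
};

// Compares the stored pointers; true only for the singleton instance.
inline bool isSingletonType(const std::shared_ptr<const Type>& type) {
  return SingletonType::get() == type;
}
```

Because the constructor is private and the instance is created exactly once, any two handles to the type share the same address, which is what lets `equivalent` and the type check avoid a structural comparison.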
3 changes: 2 additions & 1 deletion velox/functions/prestosql/types/tests/CMakeLists.txt
@@ -19,7 +19,8 @@ add_executable(
   TimestampWithTimeZoneTypeTest.cpp
   TypeTestBase.cpp
   UuidTypeTest.cpp
-  IPAddressTypeTest.cpp)
+  IPAddressTypeTest.cpp
+  IPPrefixTypeTest.cpp)

add_test(velox_presto_types_test velox_presto_types_test)

41 changes: 41 additions & 0 deletions velox/functions/prestosql/types/tests/IPPrefixTypeTest.cpp
@@ -0,0 +1,41 @@
/*
* Copyright (c) Facebook, Inc. and its affiliates.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/
#include "velox/functions/prestosql/types/IPPrefixType.h"
#include "velox/functions/prestosql/types/tests/TypeTestBase.h"

namespace facebook::velox::test {

class IPPrefixTypeTest : public testing::Test, public TypeTestBase {
 public:
  IPPrefixTypeTest() {
    registerIPPrefixType();
  }
};

TEST_F(IPPrefixTypeTest, basic) {
  ASSERT_STREQ(IPPREFIX()->name(), "IPPREFIX");
  ASSERT_STREQ(IPPREFIX()->kindName(), "ROW");
  // Compare by value via std::string; ASSERT_EQ on two C strings would
  // compare pointers rather than contents.
  ASSERT_EQ(IPPREFIX()->toString(), "IPPREFIX");
  ASSERT_TRUE(IPPREFIX()->parameters().empty());

  ASSERT_TRUE(hasType("IPPREFIX"));
  ASSERT_EQ(*getType("IPPREFIX", {}), *IPPREFIX());
}

TEST_F(IPPrefixTypeTest, serde) {
  testTypeSerde(IPPREFIX());
}
} // namespace facebook::velox::test
17 changes: 17 additions & 0 deletions velox/functions/tests/FunctionRegistryTest.cpp
@@ -26,6 +26,7 @@
#include "velox/functions/Registerer.h"
#include "velox/functions/prestosql/registration/RegistrationFunctions.h"
#include "velox/functions/prestosql/tests/utils/FunctionBaseTest.h"
#include "velox/functions/prestosql/types/IPPrefixType.h"
#include "velox/functions/tests/RegistryTestUtil.h"
#include "velox/type/Type.h"

@@ -501,4 +502,20 @@ TEST_F(FunctionRegistryOverwriteTest, overwrite) {
  ASSERT_EQ(signatures.size(), 1);
}

TEST_F(FunctionRegistryTest, ipPrefixRegistration) {
  registerIPPrefixType();
  registerFunction<IPPrefixFunc, IPPrefix, IPPrefix>({"ipprefix_func"});

  auto& simpleFunctions = exec::simpleFunctions();
  auto signatures = simpleFunctions.getFunctionSignatures("ipprefix_func");
  ASSERT_EQ(signatures.size(), 1);

  auto result = resolveFunctionWithMetadata("ipprefix_func", {IPPREFIX()});
  EXPECT_TRUE(result.has_value());
  EXPECT_EQ(*result->first, *IPPREFIX());
  EXPECT_TRUE(result->second.defaultNullBehavior);
  EXPECT_TRUE(result->second.deterministic);
  EXPECT_FALSE(result->second.supportsFlattening);
}

} // namespace facebook::velox
12 changes: 12 additions & 0 deletions velox/functions/tests/RegistryTestUtil.h
@@ -18,6 +18,7 @@
#include "velox/expression/FunctionSignature.h"
#include "velox/expression/VectorFunction.h"
#include "velox/functions/Macros.h"
#include "velox/functions/prestosql/types/IPPrefixType.h"

namespace facebook::velox {

@@ -97,6 +98,17 @@ struct VariadicFunc {
  }
};

template <typename T>
struct IPPrefixFunc {
  VELOX_DEFINE_FUNCTION_TYPES(T);

  FOLLY_ALWAYS_INLINE bool call(
      out_type<IPPrefix>& /* result */,
      const arg_type<IPPrefix>& /* arg1 */) {
    return true;
  }
};

class VectorFuncOne : public velox::exec::VectorFunction {
 public:
  void apply(