-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Is there any interest in supporting IPNetwork as well? #40
Comments
Seems like a reasonable feature. I don’t have any plans to work on it.
… On Jun 18, 2019, at 12:09, Dinesh Dutt ***@***.***> wrote:
Hi,
I'm interested in IPNetwork as well. Is there any work being done on this already? If not, is there any interest in supporting it?
Thanks,
Dinesh
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
Cool. Any recommendations on providing the pull request provided I get to doing it? Dinesh |
It seems best to create a new type called IPNetworkArray rather than fiddle with IPArray. Would you agree? |
Yes, a separate ExtensionDtype and ExtensionArray makes the most sense. Will the storage be essentially the same? An ndarray with a record dtype for the hi and lo bits? I think you'll want to inherit from NumPyBackedExtensionArrayMixin: https://github.com/ContinuumIO/cyberpandas/blob/master/cyberpandas/base.py. I wrote that as a prototype for pandas, so it's a bit convoluted, but may be useful. I suspect you find some duplicate functionality with IPArray. I wouldn't worry about deduplicating things too much at first. |
I was planning to store the thing as a CIDR string, /prefixlen. Would that work? I haven't written an extensions to Pandas dtypes before. |
Hmm I'm not sure. For performance on operations, IPArray stores elements as
a 128-bit integer, split across two columns of a NumPy array.
This gives us NumPy's fast performance for operations like `isna()`,
`is_ipv4()`, masking, etc. I'm not sure if you could do those kind of
operations on a string, (but again, I don't know much about networking so I
may be missing something).
…On Tue, Jun 18, 2019 at 1:15 PM Dinesh Dutt ***@***.***> wrote:
I was planning to store the thing as a CIDR string, /prefixlen. Would that
work? I haven't written an extensions to Pandas dtypes before.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#40?email_source=notifications&email_token=AAKAOIQIOMUPPOOYGPMRLODP3EQ5LA5CNFSM4HZCEBW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODX7QE4A#issuecomment-503251568>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIRQ747C2OIUWHIXKU3P3EQ5LANCNFSM4HZCEBWQ>
.
|
It would be convenient for a number of operations if the representation of the network could be fixed width. Packed variable length strings are likely to make things complicated. |
So I have a prelim implementation working, but not sure of its efficiency. I stored the network as an IPv4Network or IPv6Network object itself instead of trying to do a string or any other fancy implementation. For things like is_ipv4, the IPArray approach is much faster and for things like is_multicast, is_link_local, the approach I've taken seems faster from a simple %timeit on a few things perspective. I'd like some eyes on what I've done to see if there are changes to make the thing better. What do you suggest? I still need a lot of cleaning up if all looks good. Dinesh |
I've got df.a.astype('ipnetwork'), df.query('a.ipnet.is_link_local') etc. working Dinesh |
On Wed, Jun 19, 2019 at 2:39 PM Dinesh Dutt ***@***.***> wrote:
So I have a prelim implementation working, but not sure of its efficiency.
I stored the network as an IPv4Network or IPv6Network object itself instead
of trying to do a string or any other fancy implementation. For things like
is_ipv4, the IPArray approach is much faster and for things like
is_multicast, is_link_local, the approach I've taken seems faster from a
simple %timeit on a few things perspective.
On IPArray many methods, including is_multicast and is_link_local, convert
from the fast ndarray of integers to the slow ndarray of IPv4Address
objects. So I
would expect those to be roughly the same (a bit slower, since
IPArray.is_link_local needs to do the conversion first). Some of these
could likely be
implemented without converting to objects first.
That said, as a prototype this is probably fine for now. You may just want
to make the `.data` attribute private in case the internal representation
changes in the future.
I'd like some eyes on what I've done to see if there are changes to make
the thing better. What do you suggest? I still need a lot of cleaning up if
all looks good.
A PR is probably best, though I likely won't have time to review until
after the next distributed and pandas releases.
… Dinesh
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#40?email_source=notifications&email_token=AAKAOISPIP6IXGFB2ER3QY3P3KDPNA5CNFSM4HZCEBW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYDCAZI#issuecomment-503717989>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIUVYWTPCMHHYDFKC6TP3KDPNANCNFSM4HZCEBWQ>
.
|
Thanks Tom. Since you won't have time for a bit, let me clean it up and ensure the data is private for now. I'll look at what I can do (maybe add a third byte which is the prefixlen to your ip address model) to make this more efficient. Is Cython a possibility here? I'll send a PR after I'm comfortable its cleaned up. Does that make sense? Dinesh |
Sounds good.
Not sure about Cython, unless it's really necessary, since that complicates
the build / packaging so much.
…On Wed, Jun 19, 2019 at 3:00 PM Dinesh Dutt ***@***.***> wrote:
Thanks Tom. Since you won't have time for a bit, let me clean it up and
ensure the data is private for now. I'll look at what I can do (maybe add a
third byte which is the prefixlen to your ip address model) to make this
more efficient. Is Cython a possibility here? I'll send a PR after I'm
comfortable its cleaned up.
Does that make sense?
Dinesh
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#40?email_source=notifications&email_token=AAKAOIWIVKX4IQUQL7NFYDTP3KF6DA5CNFSM4HZCEBW2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGODYDDWJI#issuecomment-503724837>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAKAOIVY2G22YMA5YSVGHL3P3KF6DANCNFSM4HZCEBWQ>
.
|
I've created a pull request. I've tried to follow the model you've followed for IPArray in terms of where the different functions live. I've also added support for max/min besides support for supernet_of and subnet_of. |
Hello, is there any progress on this issue? |
@ddutt , @TomAugspurger did this go anywhere? I have a use-case for this and was curious if it's still "a thing" |
Not really, I'm don't have time to work on cyberpandas these days. |
I don't have time to work on cyberpandas either. I found other ways to work around it. But if you have some code, I'm happy to test it. |
Hi,
I'm interested in IPNetwork as well. Is there any work being done on this already? If not, is there any interest in supporting it?
Thanks,
Dinesh
The text was updated successfully, but these errors were encountered: