Social schema #310
Conversation
```diff
@@ -5,7 +5,7 @@
 import pandas as pd

 from aeon.io import api as io_api
-from aeon.schema import dataset as aeon_schema
+from aeon.io import schemas as aeon_schema
```
I don't think we should lump together experiment-specific schemas with the much more general `io` module. Currently somebody can use `io` as a PyPI package for their own experiments without having to know anything at all about the details of the foraging experiments, which I think is something we should keep. We can discuss whether `aeon.schema` is the best name for the module, but I do feel these should live in a separate module.
```diff
@@ -115,7 +115,8 @@ def load(root, reader, start=None, end=None, time=None, tolerance=None, epoch=No
     # to fill missing values
     previous = reader.read(files[i - 1])
     data = pd.concat([previous, frame])
-    data = data.reindex(values, method="pad", tolerance=tolerance)
+    data = data.reindex(values, tolerance=tolerance)
+    data.dropna(inplace=True)
```
Why are we dropping `NaN` values from the output data? This feels dangerous, and regardless of the reason it seems out of scope for this PR, which is about refactoring schemas. If we want to discuss this we should do so in a separate PR.
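To make the concern concrete, here is a toy sketch (not the real loader; all data values are made up) of how a reindex-then-dropna pattern can silently shrink the requested index:

```python
import pandas as pd

# Toy data: samples at t=0s, 1s, 5s
data = pd.DataFrame(
    {"value": [1.0, 2.0, 3.0]},
    index=pd.to_timedelta([0, 1, 5], unit="s"),
)
# Requested timestamps include t=3s, which has no sample within tolerance
values = pd.to_timedelta([0, 3, 5], unit="s")

reindexed = data.reindex(values, method="nearest",
                         tolerance=pd.Timedelta("0.5s"))
# t=3s becomes NaN because no source sample lies within 0.5s of it
dropped = reindexed.dropna()
# dropna() then silently discards the t=3s row: the caller asked for
# three timestamps but gets back only two, with no error or warning
```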
```diff
-def compositeStream(pattern, *args):
-    """Merges multiple data streams into a single composite stream."""
-    composite = {}
+def register(pattern, *args):
```
I don't think `register` is a good term to use here. In computer science this term refers either to a CPU register or to something like the Windows registry, which is more like a global-level dictionary of settings.

If the issue is the term composite, then my proposed alternative would be to simply use the plural `streams`, since conceptually a `Device` is simply meant to contain a collection of data streams.

I think we discussed this in a previous DA meeting, but there is a confusion here about the intended target audience. At a basic data analysis level I think the goal is to simply provide access to the collection of streams in a device. That collection is a dictionary simply because there is a unique key associated with each stream, but what we have at its heart is still simply a collection of streams.

I do agree with you that the "binder function" (provisional name) has a different role from a `Reader`, and likely merits having its own separate name when we are explaining how the API works. However, I was realizing just now that we don't actually seem to use the word "stream" for anything else in the API, so it still feels to me like a strong contender to capture this concept.
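The "collection of streams" reading could be sketched roughly as follows. All names here (`streams`, `Reader`, the stream functions) are illustrative stand-ins, not the actual aeon API:

```python
# Hypothetical sketch: a Device is, at heart, a dict-like collection of
# named streams, merged from the dictionaries each stream function returns.
class Reader:
    """Stand-in for an aeon Reader: knows how to read one file format."""
    def __init__(self, pattern):
        self.pattern = pattern

def streams(pattern, *stream_fns):
    """Merge the dictionaries returned by each stream function."""
    merged = {}
    for fn in stream_fns:
        merged.update(fn(pattern))
    return merged

# A "stream function" pairs a unique key with a Reader for that stream
def subject_state(pattern):
    return {"SubjectState": Reader(f"{pattern}_SubjectState_*")}

def message_log(pattern):
    return {"MessageLog": Reader(f"{pattern}_MessageLog_*")}

device_streams = streams("Patch1", subject_state, message_log)
# device_streams maps each unique key to the Reader for that stream
```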
```diff
-    """Merges multiple data streams into a single composite stream."""
-    composite = {}
+def register(pattern, *args):
+    """Merges multiple Readers into a single registry."""
```
We are not merging `Reader`s but actually `binder_fn` objects, at least given the code below.
```diff
 class Device:
-    """Groups multiple data streams into a logical device.
+    """Groups multiple Readers into a logical device.
```
Same as above: we are not merging `Reader`s but `binder_fn` objects.
```diff
     If a device contains a single stream reader with the same pattern as the device `name`, it will be
     considered a singleton, and the stream reader will be paired directly with the device without nesting.
```
What exactly changed here? If we are reviewing core terminology we should make sure documentation stays consistent.
looks like auto-formatting
```diff
     Attributes:
         name (str): Name of the device.
-        args (Any): Data streams collected from the device.
+        args (any): A binder function or class that returns a dictionary of Readers.
```
`Any` here meant the `Any` type, so it should either be left with its original casing or replaced with something else from the `typing` module.

A possible alternative to "binder function" could be "stream selector" or "stream accessor", since basically the only functionality this function brings in addition to the reader is the pattern to search and find stream data files.

Another option is simply to call this concept a "stream", since we don't use that name anywhere else in the API. I think this is actually closer to what I originally had in mind when I defined these "binder functions":

- a `Reader` is a reusable module for reading specific data file formats
- a `stream` is a combination of "data" + "reader", i.e. where the data is (the "pattern") and how to read it (the `Reader`).
We can discuss this further in a future meeting.
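The "stream = data + reader" definition could be sketched as below; the class and function names are hypothetical, not the actual aeon API:

```python
# Hypothetical sketch: a Reader is reusable across experiments; a
# "stream" binds one Reader to the concrete file pattern locating its data.
class SubjectReader:
    """Stand-in for a reusable aeon Reader for one file format."""
    def __init__(self, pattern):
        self.pattern = pattern

def stream(name, pattern):
    """Bind a Reader to the pattern that locates its data files."""
    return {name: SubjectReader(f"{pattern}_{name}_*")}

s = stream("SubjectState", "Environment")
# s["SubjectState"] now carries both the "where" (pattern) and the "how" (Reader)
```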
```diff
@@ -212,7 +212,7 @@ def read(self, file):
         specified unique identifier.
         """
         data = super().read(file)
-        data = data[data.event & self.value > 0]
+        data = data[(data.event & self.value) == self.value]
```
I don't remember anymore what this change was about, but I feel we should leave any functional changes outside of this PR, which will already be confusing enough with all the major breaking refactoring and reorganisation.
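For reference, the two expressions do differ in behavior: the old filter keeps rows where *any* bit of the mask is set, the new one only rows where *all* mask bits are set. A sketch with hypothetical flag values (not the actual aeon event codes):

```python
# Old test: (event & mask) > 0   -> any shared bit passes
# New test: (event & mask) == mask -> every mask bit must be set
event = 0b0110
mask = 0b0011

any_bit = (event & mask) > 0       # True: event shares bit 0b0010 with the mask
all_bits = (event & mask) == mask  # False: bit 0b0001 is not set in event
assert any_bit and not all_bits

# For a single-bit mask the two tests agree, which may be why the
# difference is easy to miss:
single = 0b0010
assert ((event & single) > 0) == ((event & single) == single)
```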
```diff
@@ -37,7 +37,7 @@ def subject_state(pattern):
     return {"SubjectState": _reader.Subject(f"{pattern}_SubjectState_*")}

-def messageLog(pattern):
+def message_log(pattern):
```
Can we focus this PR only on the major naming changes and terminology? I know we disagree on best practices for casing in python, but if we could leave this out of this PR so we can focus on the core discussions on naming, that would help a lot.
```diff
@@ -32,7 +32,7 @@ def read(
     ) -> pd.DataFrame:
         """Reads data from the Harp-binarized tracking file."""
         # Get config file from `file`, then bodyparts from config file.
-        model_dir = Path(file.stem.replace("_", "/")).parent
+        model_dir = Path(*Path(file.stem.replace("_", "/")).parent.parts[1:])
```
As above, I would remove all functional changes from this PR.
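For context, the behavioral difference is that the new expression drops the first component of the derived model directory. A sketch with a made-up file stem (not an actual aeon file name):

```python
from pathlib import Path

# A stem like "root_CameraTop_model1_config" maps to the path
# "root/CameraTop/model1/config"; .parent strips the last component.
stem = "root_CameraTop_model1_config"

full = Path(stem.replace("_", "/")).parent
# full == Path("root/CameraTop/model1")

# The new code additionally drops the first path component ("root")
trimmed = Path(*Path(stem.replace("_", "/")).parent.parts[1:])
# trimmed == Path("CameraTop/model1")
```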
Closing as the schema has been incorporated in #347
No description provided.