DM-24283: Add updateSchema tool #32

bsmartradio · 2023-06-30T22:06:41Z

Tool that takes a path to the apdb.yaml and a version string and generates new schemas. This tool excludes a number of fields we do not currently want in the schemas. sample_data, diaNondetectionLimit.avsc, and alert.avsc still need to be manually updated as they are not read from the apdb.

timj · 2023-06-30T22:08:28Z

@bsmartradio would you be able to also look at #31 for me? (DM-39756). As far as I can tell none of the tests work so I can't tell if my fixes break anything.

bsmartradio · 2023-07-03T18:36:08Z

I realized I should probably use fastavro instead of avro because it isn't in the stack. I'm swapping it over now.

ebellm

I'd suggest changing updateSchema.py to make it more useful in the future--not just for docstring synchronization but for generating Avro schemas that match an apdb.yaml source of truth.

python/lsst/alert/packet/updateSchema.py

ebellm · 2023-08-28T22:40:41Z

python/lsst/alert/packet/updateSchema.py

+__all__ = ['update_schema']
+
+
+def update_docs(schema, apdb):


Even though this is a small script, let's include docstrings.

python/lsst/alert/packet/updateSchema.py

ebellm

Hi @bsmartradio, a number of comments throughout for consistency.

README.rst

ebellm · 2023-12-19T19:44:56Z

python/lsst/alert/packet/updateSchema.py

+
+    """
+
+    registry = SchemaRegistry.from_filesystem()


I think we can skip instantiating a schema registry here and just have the function take the user-supplied version number directly.

I've added a new input, schema_path, which requires the user to include a path to where the schemas will be added.
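For illustration, a minimal sketch of deriving the schema namespace directly from a user-supplied version string, with no registry lookup. `version_to_namespace` is a hypothetical helper and not part of this PR; it assumes a `major.minor` version string and the `lsst.vX_Y` namespace form shown in the README.

```python
def version_to_namespace(version):
    """Hypothetical helper: convert a version like "6.0" to "lsst.v6_0".

    Assumes the version string has exactly two numeric parts.
    """
    parts = version.split(".")
    if len(parts) != 2 or not all(part.isdigit() for part in parts):
        raise ValueError(
            f"Expected a 'major.minor' version string, got {version!r}"
        )
    major, minor = parts
    return f"lsst.v{major}_{minor}"
```

Validating the string up front avoids an `IndexError` from `version.split(".")[1]` when a caller passes something like `"6"`.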

ebellm · 2023-12-19T19:45:02Z

python/lsst/alert/packet/updateSchema.py

+    return schema
+
+
+def update_schema(apdb_filepath, update_version=None):


Since we're not updating existing schemas but generating new ones based on an apdb I'd choose a different name for this function and update the docstrings accordingly throughout the function.

I'm swapping to 'generate_schema' since we are generating a schema from the apdb.

ebellm · 2023-12-19T19:56:01Z

python/lsst/alert/packet/updateSchema.py

+            version_name = version.split(".")[0] + "_" + version.split(".")[1]
+
+        # The first 4 columns in the apdb are the ones we use for alerts
+        for x in range(0, 4):


It will be less fragile if you iterate through all of the entries in apdb['tables'] and compare the name to a list of the tables you want.
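A rough sketch of that suggestion, assuming `apdb` is the dictionary loaded from apdb.yaml and that each entry in `apdb['tables']` has a `name` key. The table names in `ALERT_TABLE_NAMES` and the function name `select_alert_tables` are assumptions for illustration, not taken from this PR.

```python
# Assumed names of the APDB tables used in alerts; adjust to the real list.
ALERT_TABLE_NAMES = ("DiaObject", "SSObject", "DiaSource", "DiaForcedSource")


def select_alert_tables(apdb, wanted=ALERT_TABLE_NAMES):
    """Return the tables whose name is in ``wanted``, in file order.

    Selecting by name keeps working if tables are reordered or new
    tables are added to apdb.yaml, unlike ``range(0, 4)``.
    """
    wanted = set(wanted)
    return [table for table in apdb["tables"] if table["name"] in wanted]
```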

ebellm · 2023-12-19T19:56:17Z

python/lsst/alert/packet/updateSchema.py

+        else:
+            version_name = version.split(".")[0] + "_" + version.split(".")[1]
+
+        # The first 4 columns in the apdb are the ones we use for alerts


not columns, but tables

You're right. I believe avro treats them as columns of the parent schema, but they're really nested tables. Changed to tables.

ebellm · 2023-12-19T20:12:25Z

python/lsst/alert/packet/updateSchema.py

+    return schema
+
+
+def populate_fields(apdb):


Since this takes a single table schema I wouldn't call this variable apdb

Swapped to apdb_table

ebellm · 2023-12-19T20:13:50Z

python/lsst/alert/packet/updateSchema.py

+    Parameters
+    ----------
+    apdb: `dict`
+        The name of the schema as a string. E.G. diaSource.


this docstring description isn't right--you're passing in a dictionary from the sdm_schemas yaml, not a string name

Updated the docstring. Thanks for catching this. A lot of the docstrings were leftover from when I was just updating things and not generating new schemas.

ebellm · 2023-12-19T20:25:31Z

python/lsst/alert/packet/updateSchema.py

+    field_dictionary_array = []
+    for column in apdb['columns']:
+        # We are still finalizing the time series feature names.
+        if (column['name'] != 'validityStart') and (


I think this would be nicer looking if you made a list of strings to exclude and then used a loop to check rather than retyping column['name'] a bunch of times.

Changed to iterating over a list of excluded fields.
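The exclusion-list approach might look roughly like this. Only `validityStart` comes from the diff above; the rest of the real list is not shown here, and `keep_columns` is a hypothetical name rather than the function in this PR.

```python
# Only 'validityStart' is taken from the diff; the full list is longer.
EXCLUDED_FIELDS = ["validityStart"]


def keep_columns(apdb_table, excluded=EXCLUDED_FIELDS):
    """Return the columns of one APDB table that are not excluded.

    ``apdb_table`` is assumed to be a single table dictionary with a
    'columns' list whose entries each have a 'name' key.
    """
    excluded = set(excluded)
    return [
        column for column in apdb_table["columns"]
        if column["name"] not in excluded
    ]
```

Adding or removing an excluded field then means editing one list instead of a chained condition.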

ebellm · 2023-12-19T20:26:52Z

python/lsst/alert/packet/updateSchema.py

+
+if __name__ == '__main__':
+
+    parser = argparse.ArgumentParser(description='Update the schema docstrings so that they'


again, I'd update the docstring here: rather than updating docstrings we're generating a new schema set from apdb.yaml.

Changed the docstring to reflect that we are generating new schemas.
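As a sketch only, a parser whose description matches the tool's current purpose could be set up as below. It assumes three positional arguments; `apdb_filepath` and `schema_path` follow names used in this conversation, while `schema_version` and `build_parser` are hypothetical.

```python
import argparse


def build_parser():
    """Build a command-line parser for the schema generation script."""
    parser = argparse.ArgumentParser(
        description="Generate a new set of Avro alert schemas from an "
                    "apdb.yaml file."
    )
    parser.add_argument("apdb_filepath",
                        help="Path to the apdb.yaml file to read.")
    parser.add_argument("schema_path",
                        help="Directory where the new schemas are written.")
    parser.add_argument("schema_version",
                        help="Version of the new schemas, e.g. '6.0'.")
    return parser
```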

ebellm · 2023-12-22T17:55:26Z

README.rst


-      * Rename all of the ``lsst.vX*`` files to the new version.
-      * Update the ``"namespace": "lsst.v5_0",`` line at the top of each ``*.avsc`` file to the new version.
+      * run ``python updateSchema.py /path/to/LSST/code/sdm_schemas/yml/apdb.yaml Path/To/Yaml/sdm_schemas/yml/apdb.yaml "6.0"`` All Generated files do not need to be altered.


Suggested change

-      * run ``python updateSchema.py /path/to/LSST/code/sdm_schemas/yml/apdb.yaml Path/To/Yaml/sdm_schemas/yml/apdb.yaml "6.0"`` All Generated files do not need to be altered.
+      * run ``python updateSchema.py /path/to/LSST/code/sdm_schemas/yml/apdb.yaml Path/To/Yaml/sdm_schemas/yml/apdb.yaml /path/to/alert_packet/schema/ "6.0"`` All Generated files do not need to be altered.

ebellm

Just a couple of minor items of docs cleanup but I think it's ready otherwise!

README.rst

python/lsst/alert/packet/updateSchema.py

README.rst

Update

bsmartradio requested a review from ebellm June 30, 2023 22:06

bsmartradio force-pushed the tickets/DM-24283 branch 3 times, most recently from bee6cb1 to a9349b9 Compare July 8, 2023 03:16

timj changed the title ~~Add updateSchema tool~~ DM-24283: Add updateSchema tool Jul 8, 2023

bsmartradio force-pushed the tickets/DM-24283 branch 2 times, most recently from 820ee8f to 90804cb Compare July 20, 2023 18:34

bsmartradio requested a review from erinleighh August 1, 2023 23:44

ebellm requested changes Aug 28, 2023

bsmartradio force-pushed the tickets/DM-24283 branch from 90804cb to b602c68 Compare September 20, 2023 18:42

bsmartradio force-pushed the tickets/DM-24283 branch from 30cb226 to ec8b6c6 Compare November 22, 2023 22:33

bsmartradio added 2 commits December 14, 2023 09:58

Update avro schemas with docstrings and add update tool

71d9304

Update README.rst with new updateSchema.py instructions

332a458

bsmartradio force-pushed the tickets/DM-24283 branch from e0f8d41 to c23e696 Compare December 14, 2023 12:59

ebellm requested changes Dec 19, 2023

ebellm reviewed Dec 22, 2023

ebellm approved these changes Dec 22, 2023

bsmartradio force-pushed the tickets/DM-24283 branch from 87dbc80 to dfa5860 Compare December 22, 2023 19:20

Update schema with nullable trailNdata

946a730

Update

bsmartradio force-pushed the tickets/DM-24283 branch from 28e3842 to 946a730 Compare December 22, 2023 19:32

bsmartradio merged commit 2b9c5b9 into main Jan 3, 2024
5 checks passed

bsmartradio deleted the tickets/DM-24283 branch January 3, 2024 02:01
