Data Model: Goal

webbhm edited this page Jun 2, 2018 · 1 revision

"The situation is impossible, but not hopeless"

I started thinking through several use cases, and I find myself concluding that it is impossible to create a common normalized data model for capturing sensor data (environment observations). Different systems fragment the data in different ways. Three examples are the existing MVP, the latest MVP prototype, and Fairchild Garden’s current unit.

With the MVP there is a single ‘environment’ to which all environmental factors relate in a 1:1 relationship. The environment has one light, one exhaust fan, one reservoir (nutrients) and one air temperature. This is the use case I have been modeling.

With the MVP prototype the reservoir is split into two sections. There is still a 1:1 relationship of the environment to the lights, fan and temperature, but there is a need for two fill pumps and two measurements of nutrient, pH and other reservoir readings (1:2 relationships).

With Fairchild, since they use ‘pillows’ (sacks of medium and nutrients), there is a 1:1 relationship for the lights, fans and temperature, but nutrients are a 1:* relationship to the environment: there will be a separate reading for each plant/pillow.

One option is to denormalize the database and record everything at the individual plant level. This carries the risk (which normalization is designed to avoid) of inconsistent data, and it makes common factors harder to report or act on (“What is the temperature of the environment?”, “Turn on the reservoir pump.”).

While we can have common naming conventions and common database patterns across all projects, I am coming to the conclusion that each box design may require its own custom database design. These would have the same ‘parts’ (tables), but the relationships would be associated at different levels of granularity (environment, portion of the environment, group of plants, individual plant).

While I think common data is impossible at the collection level, I do not think that sharing of operational data has that much value.
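
To make the granularity point concrete, here is a minimal Python sketch of the same ‘parts’ attached at different levels for the three designs described above. Every name in it (`Scope`, `Observation`, `DESIGN_SCOPES`, `fits_design`, and the `env-1` / `pillow-7` identifiers) is hypothetical and chosen for illustration; it is not the actual MVP schema.

```python
from dataclasses import dataclass
from enum import Enum


class Scope(Enum):
    """Levels of granularity named in the text."""
    ENVIRONMENT = "environment"
    SECTION = "portion of the environment"
    PLANT_GROUP = "group of plants"
    PLANT = "individual plant"


@dataclass(frozen=True)
class Observation:
    attribute: str   # e.g. "temperature", "nutrient"
    value: float
    scope: Scope
    scope_id: str    # hypothetical identifier, e.g. "env-1" or "pillow-7"


# Hypothetical per-design map: which scope each attribute attaches to.
DESIGN_SCOPES = {
    "mvp": {"temperature": Scope.ENVIRONMENT, "nutrient": Scope.ENVIRONMENT},
    "mvp_prototype": {"temperature": Scope.ENVIRONMENT, "nutrient": Scope.SECTION},
    "fairchild": {"temperature": Scope.ENVIRONMENT, "nutrient": Scope.PLANT},
}


def fits_design(design: str, obs: Observation) -> bool:
    """Return True if the observation is recorded at the scope this design expects."""
    return DESIGN_SCOPES[design].get(obs.attribute) is obs.scope
```

The observation record is identical across designs; only the scope map changes, which is roughly what "same tables, different relationships" would look like in code.
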
The value of sharing data lies in analytics, and for that I think ‘project end’ summary data is sufficient. Summaries would be done at the ‘trial’ level, which in all cases, I believe, would be the individual plant level. Rather than sending every temperature reading, we standardize on a daily min/max/average, and this data is ‘distributed’ to each trial/plant data package.

While ideally it would be nice to share operational and administrative ‘dashboards’ all using common data, I think the best we can hope for is some common components and patterns that allow some relational flexibility for each hardware design. The sharing of analytics and recipes is our real goal, and while it is a lot of hard work, I think it is possible.

I am going to continue to model the input data (initially for the existing MVP), though I may have to create several ‘alternative’ models for the different designs. My real goal, and what I want to end up with, is a set of views that express a common analytic summary. I hope this makes sense, as this idea is key to moving forward.
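
The daily min/max/average summary, copied into each trial/plant package, could be sketched roughly as follows. This is an illustration under assumptions, not a proposed implementation: the function names, the `(ISO timestamp, value)` reading format, and the package layout are all hypothetical.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_summary(readings):
    """Collapse (ISO timestamp, value) readings into per-day min/max/avg."""
    by_day = defaultdict(list)
    for timestamp, value in readings:
        day = datetime.fromisoformat(timestamp).date().isoformat()
        by_day[day].append(value)
    return {
        day: {"min": min(values), "max": max(values), "avg": mean(values)}
        for day, values in sorted(by_day.items())
    }


def distribute(attribute, summary, plant_ids):
    """Copy one environment-level summary into every trial/plant package."""
    return {plant_id: {attribute: summary} for plant_id in plant_ids}


# Hypothetical air-temperature readings from a single environment.
readings = [
    ("2018-06-01T06:00:00", 20.0),
    ("2018-06-01T14:00:00", 26.0),
    ("2018-06-02T06:00:00", 21.0),
]
summary = daily_summary(readings)
packages = distribute("temperature", summary, ["plant-1", "plant-2"])
```

Each plant package ends up carrying the same temperature summary. That duplication is deliberate at the summary level: it lets the analytic view look the same regardless of how a given box design collected the data.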