Syme, G., Hatton MacDonald, D., Fulton, B. and Piantadosi, J. (eds) MODSIM2017, 22nd International Congress on Modelling and Simulation, Dec 3, 2017
Models of physical systems are the foundation of many scientific and decision support systems. These models rely heavily on observational data, typically collected from sensors. Increasingly, these data come from a wide range of sources. For example, agricultural models often require data from climate observations, soil conditions, on-farm equipment, and seasonal forecasts, among others. Integrating these data with models is very time-consuming, and the work is often repeated across different models. Furthermore, automating model runs is difficult because of the complexity of managing data dependencies.

We have developed a distributed system, Senaps, to support automated sensor data retrieval and its coupling with model execution in a scalable way. It has been developed over many years across scientific disciplines, including water management, agriculture, aquaculture, and related information and communication technology areas. It has been used, and remains in use, by a diverse range of projects, resulting in a flexible system that is not tied to a specific domain.

Senaps includes a publish-subscribe subsystem that handles ingestion of disparate time-series data. It supports stream processing, such as quality assurance and data checking, and automates data ingestion with monitoring and recovery. The storage and access subsystem is a scalable time-series backend with an Application Programming Interface (API) that third-party developers can build on. Its features include dynamic temporal aggregations; fine-grained access control to support data privacy and sharing (users can elect to share data between organisations); metadata for sensor data management; and controlled vocabularies. The focus of this paper is the model integration subsystem, which provides the model integration and automation features.
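To make the ingestion path described above more concrete, the following is a minimal in-memory sketch of publish-subscribe ingestion with a quality-check step and a temporal aggregation computed at query time. It is an illustration only and not the Senaps implementation or API: the `Broker` class, the topic names, the range limits, and the `aggregate` helper are all assumptions made for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


class Broker:
    """Tiny in-memory publish-subscribe hub (hypothetical, not the Senaps API)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, observation):
        # Deliver the observation to every handler registered on the topic.
        for handler in self._subscribers[topic]:
            handler(observation)


def make_range_check(broker, out_topic, low, high):
    """Stream-processing step: forward only observations within [low, high]."""
    def handler(observation):
        _timestamp, value = observation
        if low <= value <= high:
            broker.publish(out_topic, observation)
    return handler


def aggregate(observations, period):
    """Mean value per fixed-width time bucket, computed when queried."""
    epoch = datetime(1970, 1, 1)
    buckets = defaultdict(list)
    for timestamp, value in observations:
        index = (timestamp - epoch) // period  # integer bucket number
        buckets[epoch + index * period].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


broker = Broker()
store = []  # stands in for the time-series backend

# Raw readings pass through the range check before reaching storage.
broker.subscribe("raw/air_temp", make_range_check(broker, "clean/air_temp", -50.0, 60.0))
broker.subscribe("clean/air_temp", store.append)

start = datetime(2017, 12, 3, 9, 0)
readings = [20.0, 22.0, 999.0, 24.0, 26.0]  # 999.0 simulates a sensor fault
for i, value in enumerate(readings):
    broker.publish("raw/air_temp", (start + timedelta(minutes=30 * i), value))

hourly = aggregate(store, timedelta(hours=1))
```

In this sketch the faulty reading never reaches `store`, and the hourly means are derived from the stored half-hourly observations on demand rather than being precomputed, which is one plausible reading of "dynamic temporal aggregations".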
This system builds on developments in cloud and container-based computing to isolate the runtime environment of a user-submitted model and provide access to the data backend. APIs are provided to handle environment images (e.g. Linux with R), model definitions, workflows (instances of a model), and the running of model jobs. We have successfully used this system to automate model runs and provide continuous results from a number of parameterised models. The models hosted on the platform so far include a timber drying model and two agricultural prediction models. Being tied to a robust sensor-data backend ensures that models are run on the most recent data and removes the need for model developers to continuously manage model execution. Model results are automatically available and can be easily shared between users and organisations. In this paper, we detail the technical challenges in implementation, provide example results from a running model, and describe our next research steps.
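The relationship between images, models, workflows, and jobs described above can be sketched with a few plain data structures. This is a hedged illustration under stated assumptions, not the Senaps API: the class names, field names, the stream identifier, and the toy degree-day model are all hypothetical, and a dictionary stands in for the sensor-data backend.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Image:
    """Runtime environment a model executes in (e.g. Linux with R)."""
    name: str


@dataclass(frozen=True)
class Model:
    """Model definition: an environment image plus the code to run."""
    model_id: str
    image: Image
    entrypoint: Callable[[Dict[str, List[float]], Dict[str, float]], Dict[str, float]]


@dataclass
class Workflow:
    """An instance of a model: bound input streams plus parameter values."""
    model: Model
    inputs: Dict[str, str]  # input port name -> stream identifier
    parameters: Dict[str, float] = field(default_factory=dict)


def run_job(workflow, backend):
    """Fetch each input stream at run time, then execute the model on it."""
    data = {port: list(backend[stream]) for port, stream in workflow.inputs.items()}
    return workflow.model.entrypoint(data, workflow.parameters)


def degree_days(data, parameters):
    """Toy stand-in model: accumulated temperature above a base value."""
    base = parameters["base_temp"]
    return {"degree_days": sum(max(t - base, 0.0) for t in data["temperature"])}


backend = {"station1.air_temp": [12.0, 15.0, 9.0]}  # stands in for the data backend
model = Model("degree-days", Image("linux-python"), degree_days)
workflow = Workflow(model, {"temperature": "station1.air_temp"}, {"base_temp": 10.0})

first = run_job(workflow, backend)
backend["station1.air_temp"].append(14.0)  # a new observation arrives
second = run_job(workflow, backend)
```

Because the workflow stores stream identifiers rather than copies of the data, each job resolves its inputs when it runs, so the second job picks up the new observation without any change to the model or workflow.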
Papers by Mac Coombe