Question: Does the Manage/Share/Reuse of Workflows, Models and Data have a common pattern?

Just a question, but a quick look at the charrette post-its suggests many common functional aspects of managing, sharing, and reusing each of the three categories: Data (which we have experience with), Workflows (less experience), and Models (least experience?).

Worth a look. Or has this been done? Is GEO any different?

Comment by Naicong Li on February 15, 2013 at 1:32pm

Our experience in developing the Spatial Decision Support Ontology and Knowledge Portal agrees with what Omar said. Just as you need metadata to properly register and access datasets, you need metadata for models and workflows. We defined ontology classes for each of these three types of things to register their metadata, and we use semantic queries to search the registry for specific instances. In this sense we manage all three in the same way. Each of these classes (with its subclasses) has a different set of properties. Some properties are shared across the three types (for example, the knowledge domain property: models and datasets are often specific to a knowledge domain, e.g. biodiversity conservation), while many are class-specific. Some properties specify inter-relations among the classes. For example, a particular workflow step may be instantiated by a certain type of model (and a model, in turn, typically has an internal workflow), and a model input may require datasets with certain characteristics.

When you think about the usage of data, which includes analysis and ultimately supports decision making, you need to consider the management, sharing, and re-use of all three types of things, so I am glad that you brought this up. The ontology that we are using to define these things needs more work (more detail), and I hope the proposed RCN will provide a place for people interested in this topic to work together.
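
As a rough illustration of the registry idea in the comment above, here is a minimal sketch in plain Python. It is not the Spatial Decision Support ontology or its query interface, which are not shown in this thread. Every class, field, and method name below (`Resource`, `Registry`, `datasets_for`, and so on) is hypothetical, and simple attribute filtering stands in for a real semantic query. It only shows the pattern: one shared property (knowledge domain), class-specific properties, and properties that link the classes together.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Resource:
    """Metadata shared by datasets, models, and workflows (hypothetical)."""
    name: str
    knowledge_domains: set[str] = field(default_factory=set)


@dataclass
class Dataset(Resource):
    # Class-specific property: descriptive characteristics of the data.
    characteristics: set[str] = field(default_factory=set)


@dataclass
class Model(Resource):
    # Class-specific properties, including one that relates models to datasets.
    model_type: str = ""
    required_input_characteristics: set[str] = field(default_factory=set)


@dataclass
class Workflow(Resource):
    # Each step names the type of model that may instantiate it.
    step_model_types: list[str] = field(default_factory=list)


class Registry:
    """One registry for all three kinds of resource."""

    def __init__(self) -> None:
        self._items: list[Resource] = []

    def register(self, item: Resource) -> None:
        self._items.append(item)

    def find(self, kind: type = Resource, domain: str | None = None) -> list:
        """Filter by class and, optionally, by the shared knowledge domain."""
        return [
            item for item in self._items
            if isinstance(item, kind)
            and (domain is None or domain in item.knowledge_domains)
        ]

    def datasets_for(self, model: Model) -> list[Dataset]:
        """Datasets whose characteristics cover the model's input needs."""
        return [
            d for d in self.find(Dataset)
            if model.required_input_characteristics <= d.characteristics
        ]

    def models_for_step(self, step_model_type: str) -> list[Model]:
        """Models of the type that a workflow step calls for."""
        return [m for m in self.find(Model) if m.model_type == step_model_type]


# Made-up example entries, for illustration only.
registry = Registry()
habitat = Dataset(
    name="habitat_raster",
    knowledge_domains={"biodiversity conservation"},
    characteristics={"raster", "land cover"},
)
roads = Dataset(
    name="road_network",
    knowledge_domains={"transportation"},
    characteristics={"vector"},
)
suitability = Model(
    name="habitat_suitability",
    knowledge_domains={"biodiversity conservation"},
    model_type="suitability",
    required_input_characteristics={"raster"},
)
site_selection = Workflow(
    name="site_selection",
    knowledge_domains={"biodiversity conservation"},
    step_model_types=["suitability"],
)
for item in (habitat, roads, suitability, site_selection):
    registry.register(item)

conservation = registry.find(domain="biodiversity conservation")
usable_inputs = registry.datasets_for(suitability)
step_models = registry.models_for_step(site_selection.step_model_types[0])
```

The same `find` call serves all three classes because the knowledge domain lives on the shared base class, while `datasets_for` and `models_for_step` rely on the class-specific properties that relate models to datasets and workflow steps to models.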

Comment by Omar El-Gayar on November 2, 2011 at 12:16pm
Thanks for bringing this up. I would argue 'Yes', they do have a common pattern. To facilitate managing, sharing, and reusing models, one often needs to represent them at a higher level of abstraction, manage metadata (or meta-models), develop and leverage semantics, etc. With regard to models, there are additional issues, e.g., the diversity of models and solution environments, and the often tight coupling of models to their supporting data (which hampers reuse with other data sets). There is also synergy among all three, e.g., workflow composition can benefit from model composition. We have done work in model representation, including the semantic representation of models, as well as the integration of models into workflows and the dynamic composition of workflows.
Comment by Chaitan Baru on November 2, 2011 at 11:51am
It would be good to get a catalog of "big" data in GEO: what is "big"; what are the data types (simulation, remote sensing); what types of processing are needed; can the processing be done in a data-parallel fashion, e.g. using modern approaches like Hadoop? Is there a role or need for large relational databases and/or things like SciDB?
Comment by Tanu Malik on November 2, 2011 at 11:35am
We just finished discussing that there is less experience with "Big Data" that is large, distributed, and diverse.

© 2013   Created by Dennis Carey.


Any opinions, findings, conclusions or recommendations presented in this material are only those of the presenter grantee/researcher, author, or agency employee; and do not necessarily reflect the views of the National Science Foundation.