Data & Knowledge Engineering
School of ITEE

Water Management


Project Aim

Adaptive water management requires the use of many different collections of data in order to obtain a high-level complete view of the environment. Scientists and planners require access to data that describes climate (rainfall, temperature, evaporation), water flow, water quality, land use, vegetation coverage, and the distribution of species. These all come from different sources.

To facilitate effective planning and decision-making these need to be used in conjunction with data and models from other sources describing population growth, urban development, water consumption and water management rules.

Data can be derived from models or obtained from a variety of sources including sensors, manually captured monitoring data, and databases. Currently, data from different sources vary widely in completeness, quality, scale, scope, and reliability. It is therefore often difficult to compare data from different sources, even when they refer to the same thing.

One of the greatest challenges facing the adaptive whole-of-water-cycle approach to water resource management is quick and seamless integration of the many independently developed data sources and models. The current solution involves domain experts manually mapping and tuning data from different sources in order to put them together. This is extremely tedious and time-consuming. As more data is stored with time, and more complex models are needed, it becomes increasingly infeasible to do everything manually.

The aim of this project is to improve the speed, rigour and adaptability of the water management decisions being made within South East Queensland.

Key Focus Areas

  • Improve the quality of data and reduce gaps and missing entries in the data
  • Improve data accessibility, retrieval and integration to draw on and use information more easily
  • Make it easier to determine what data is relevant for constructing certain models
  • Improve data interpretability so that complex data can be presented as meaningful information to users

The Fundamentals - A Research Approach

This project employs a focussed applied research approach that is looking to solve critical issues associated with using Healthy Waterways’ (HW) data. It is significant because the data is used to support management strategies and planning decisions by Healthy Waterways’ partners as well as related interstate and national water management authorities such as the National Water Commission.

A HW workshop held in April 2007 determined that scientists in Queensland’s Environmental Protection Agency (EPA) want simple, scalable and efficient systems capable of retrieving and merging data from a range of distributed databases and models in order to rapidly answer questions such as:

“How will the mandatory adoption of rainwater tanks in the Logan Region affect domestic water requirements in 5 years' time, taking into account the effects of climate change and population growth in the region?”

To answer such questions, the right information must first be acquired. This leads us to ubiquitous and fundamental research problems regarding data quality and integration. The Data and Knowledge Engineering group at The University of Queensland and the group at Victoria University focus on developing techniques to tackle these problems and provide a foundation to support further improvement of HW modelling systems.

Data quality and completeness are being improved through a combination of approaches: developing new techniques for identifying and correcting data inaccuracies, inconsistencies and gaps in datasets, and optimising the effectiveness of data cleaning and capture programs for HW-related data. Statistical analyses of existing integrated data sets identify areas where data is poor, sparse or unreliable. Identifying gaps in existing data sets can directly inform HW data practices, for instance by guiding the placement of new sensors, their calibration and their sampling frequency.
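
As an illustration only, and not the project's own technique, the simplest kind of gap identification can be sketched in Python: flag any stretch where consecutive sensor readings are further apart than the expected sampling interval allows. The function name, the tolerance factor and the sample timestamps are all assumptions made for this sketch.

```python
from datetime import datetime, timedelta


def find_gaps(timestamps, expected_interval, tolerance=1.5):
    """Return (start, end) pairs where consecutive readings are further
    apart than tolerance * expected_interval.

    Hypothetical helper: the tolerance factor is an assumed parameter.
    """
    ordered = sorted(timestamps)
    limit = expected_interval * tolerance
    gaps = []
    for earlier, later in zip(ordered, ordered[1:]):
        if later - earlier > limit:
            gaps.append((earlier, later))
    return gaps


# Made-up hourly readings with nothing recorded between 02:00 and 06:00.
readings = [datetime(2011, 5, 1, hour) for hour in (0, 1, 2, 6, 7)]
gaps = find_gaps(readings, timedelta(hours=1))
for start, end in gaps:
    print(f"No readings between {start} and {end}")
```

A report of this kind, aggregated per site, is one way gap statistics could feed into decisions about sensor placement or sampling frequency.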

Data integration, accessibility, and retrieval are addressed by focussing on making distributed data sources accessible and easy to find, then making their data interoperable. The problem is that there is no single standard way to record data. A first step in approaching integration is identifying important data sources: those that feed into and out of each of the models being used by HW. Advanced mapping techniques are then employed to work out the semantics of how data is represented. We develop a common framework so that data from different sources can be understood, then integrated or used with other data.
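
The idea of a common framework can be sketched, under heavy simplification, as a set of per-source mappings that rename fields and convert units into one shared representation. The source names, field names and conversions below are hypothetical and are not taken from HW systems; real semantic mapping is considerably more involved.

```python
# Hypothetical per-source mappings: source field -> (common field, converter).
MAPPINGS = {
    "sensor_network": {
        "temp_f": ("water_temperature_c", lambda f: (f - 32) * 5 / 9),
        "site": ("site_id", str),
    },
    "manual_survey": {
        "TemperatureC": ("water_temperature_c", float),
        "SiteCode": ("site_id", str),
    },
}


def to_common(source, record):
    """Translate one source record into the shared representation.

    Fields without a mapping are dropped rather than guessed at.
    """
    mapping = MAPPINGS[source]
    common = {}
    for field, value in record.items():
        if field in mapping:
            target, convert = mapping[field]
            common[target] = convert(value)
    return common


# Two records describing the same observation in different conventions.
a = to_common("sensor_network", {"temp_f": 68.0, "site": 12})
b = to_common("manual_survey", {"TemperatureC": "20.0", "SiteCode": "12"})
print(a == b)
```

Once both records are in the shared form they can be compared or merged directly, which is the property the common framework is meant to provide.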

The goal is to enable a more holistic, adaptive approach to water management. For instance, by linking biophysical, environmental, economic and social models, we can better understand the trade-offs between natural resource management decisions and the impact on the economy and people's livelihoods.

Related Projects

The eResearch group, led by Professor Jane Hunter, has been working on a related project regarding information management. The Health-e-Waterways project is a collaboration between The University of Queensland, the SEQ Healthy Waterways Partnership and Microsoft Research. Its aim is to develop a new waterways information management system, which will provide a significant advance on current information technology and accessibility.

Full details of the Health-e-Waterways project can be found on the project website at: http://www.health-e-waterways.org/

People

Prof. Xiaofang Zhou is an expert in spatial databases, information systems and information infrastructures. He is a Professor of Computer Science at the University of Queensland, and heads the Data and Knowledge Engineering Research Division there. Before joining UQ he worked as a researcher at the Commonwealth Scientific and Industrial Research Organisation leading its Spatial Information Systems group.

Prof. Jane Hunter is internationally recognized for the development of metadata tools, ontologies and inferencing rules for scientific data integration and the recording and visualization of provenance information. She leads the eResearch Group at the University of Queensland. Her area of expertise is the application of semantic web technologies to the integration, organization and preservation of research data and collections.

Dr Eva Abal is an expert in hydraulic modelling, water quality and ecosystem health interactions. As Scientific Coordinator for the Healthy Waterways Partnership, she is ideally positioned to undertake responsibility for liaising between the CIs, data users, and the stakeholders who will be adopting the outputs of this project. Her background knowledge in waterways management brings real-world expertise to the work being conducted for the project.

Prof. Yanchun Zhang is an expert in web services, databases and data integration. He contributes his expertise to the development of the metadata models, data integration and web services. He is the Director of the Centre for Applied Informatics at Victoria University, and is active in areas of databases and information systems, Web and Internet technologies, Web data management and mining, Web services, and e-Research.

Dr Shazia Sadiq is an expert in workflows and messaging technologies and data management in sensor networks. She is currently working in the School of Information Technology and Electrical Engineering at The University of Queensland. She is part of the Data and Knowledge Engineering (DKE) research group and is involved in teaching and research in databases and information systems.

Dr Henning Koehler is a Post Doctoral Research Fellow who worked on the project with the University of Queensland Data and Knowledge Engineering group from Sep 2010 to Jan 2011. He received his Masters in Mathematics from Munich Technical University in 2003, and his PhD in Information Systems from Massey University, New Zealand, in 2008.

Abdulmonem Alabri is a senior software engineer currently doing his doctorate at the University of Queensland under the supervision of Professor Jane Hunter and Dr Eva Abal.

Dr Sham Prasher is a Post Doctoral Research Fellow who joined the project in Apr 2011, working with the Data and Knowledge Engineering group at the University of Queensland (UQ). After receiving his PhD from UQ in 2005 he worked in industry, creating business intelligence solutions from large data sets. His research interests include query processing and geo-spatial information systems.

Guangyan Huang is currently doing a doctorate under the supervision of Professor Yanchun Zhang at Victoria University, and has contributed to a number of works and publications in areas of interest covered by the project.

Newsletter

Project Newsletter - May 2011

Publications

Work undertaken for the project to date has yielded a number of publications presented at various international conferences, symposiums, and workshops.

Papers

  1. L. Wang, H. Koehler, K. Deng, X. Zhou and S. Sadiq, "Flexible Provenance Tracing", in International Journal of Systems and Service-Oriented Engineering (accepted in Feb 2011)
  2. Z. Ye, X. Zhou and A. Bouguettaya, "Genetic Algorithm Based QoS-Aware Service Compositions in Cloud Computing", DASFAA 2011, April 2011, Hong Kong, China
  3. G. Huang, Y. Zhang, J. He and J. Cao, "Fault Tolerance in Data Gathering Wireless Sensor Networks", The Computer Journal, 2011
  4. G. Huang, Y. Zhang, J. He and Z. Ding, "Efficiently Retrieving Longest Common Route Patterns of Moving Objects by Summarizing Turning Regions", The 15th Pacific-Asia Conference on Knowledge Discovery and Data Mining (PAKDD 2011)
  5. A. Alabri, J. Hunter, "Enhancing the Quality and Trust of Citizen Science Data", IEEE eScience 2010, Brisbane, Australia, December 8 - 10, 2010
  6. J. Hunter, A. Alabri, C. Brooking, P. Becker, E. Abal, "Ontology-based Correlation of Resource Management Actions with Water Quality Data in South East Queensland", 9th International Conference on Hydroinformatics, HIC 2010, Tianjin, China, Sept 7-10, 2010
  7. X. Zhou and H. Koehler, "Building the World from Views", Proceedings of 11th International Conference on Web-Age Information Management (WAIM 2010), Chengdu, China (Keynote Speech)
  8. H. Koehler, X. Zhou, S. Sadiq, Y. Shu and K. Taylor, "Sampling Dirty Data for Matching Attributes", to appear in Proceedings of 2010 ACM SIGMOD International Conference on Management of Data (SIGMOD 2010), Indianapolis, USA, June 2010
  9. K. Deng, L. Wang, X. Zhou, S. Sadiq and P.-C. Fung, "Active Duplicate Detection", to appear in Proceedings of 15th International Conference on Database Systems for Advanced Applications (DASFAA 2010), Tsukuba, Japan, April 2010 (Best Paper Award Runner-Up)
  10. J. Zhu, P.-C. Fung and X. Zhou, "Anddy: A System for Author Name Disambiguation in Digital Library", Proceedings of 15th International Conference on Database Systems for Advanced Applications (DASFAA 2010), Tsukuba, Japan, April 2010 [Demo paper]
  11. W. Lu, C. Rong, X. Du, G. Fung and X. Zhou, "Efficient Common Items Extraction from Multiple Sorted Lists", APWeb, Busan, Korea, 2010
  12. Henning Koehler, "Estimating Set Intersection using Small Samples", 33rd Australasian Computer Science Conference (ACSC), 2010
  13. A. Alabri, J. Hunter, C. Van Ingen and E. Abal, "The Health-e-Waterways Project - Data Integration for Whole-of-Water Cycle Management", 1st International Workshop on "Intelligent Systems for Environmental (Knowledge) Engineering and EcoInformatics" (i-SEEK), Fukuoka, Japan, 2009
  14. A. Alabri, J. Hunter, C. Van Ingen and E. Abal, "Data Integration Services for Smarter Collaborative Whole-of-Water-Cycle Management", 8th International Conference on Hydroinformatics, Concepción, Chile, 2009
  15. J. Zhu, X. Zhou and P.-C. Fung, "A Term-based Driven Clustering Approach for Name Disambiguation", in Proceedings of the Joint International Conferences on Advances in Data and Web Management (APWeb/WAIM 2009), pages 320-331, Springer Lecture Notes in Computer Science (LNCS 5446), Suzhou, China, April 2009
  16. N. Khodabandehloo, S. Sadiq, K. Deng and X. Zhou, "Data Quality Aware Query Processing in Collaborative Information Systems", in Proceedings of the Joint International Conferences on Advances in Data and Web Management (APWeb/WAIM 2009), pages 39-50, Springer Lecture Notes in Computer Science (LNCS 5446), Suzhou, China, April 2009

Other

  • P. Becker, A. Alabri, J. Hunter, "Dynamic Generation of Online, Interactive Environmental Report Cards", eResearch Australasia, Manly, Sydney, 2009
  • P. Becker, “The Health-e-Waterways Project - Advanced Information Management in generating the Healthy Waterways Report Card for South East Qld”, Spatial and Scientific Information Management for the Reef Workshop, Brisbane, Oct 13-14, 2009
  • J. Hunter, "Health-e-waterways: supporting decision making through access to knowledge, information and decision support tools", 12th International River Symposium, Brisbane, 21-24 Sept, 2009
  • J. Hunter, “The Health-e-Waterways Project – An Exemplary Model for Environmental Monitoring and Resource Management”, Microsoft Faculty Research Summit, Redmond, USA, July, 2009

Proceedings

  1. K. Tanaka, X. Zhou, M. Zhang and A. Jatowt, editors, Proceedings of the 4th ACM Workshop on Information Credibility on the Web (WICOW 2010). Proceedings ACM 2010
  2. X. Zhou, H. Yokota, K. Deng and Q. Liu, editors, "Database Systems for Advanced Applications", Proceedings of DASFAA 2009, LNCS 5463, Springer, 2009
  3. X. Zhou and X. Xie, editors, Proceedings of the 2009 International Workshop on Location Based Social Networks, 2009, Seattle, Washington, USA. Proceedings ACM 2009
  4. K. Tanaka, X. Zhou and A. Jatowt, editors, Proceedings of the 3rd ACM Workshop on Information Credibility on the Web (WICOW 2009). Proceedings ACM 2009
  5. C.-Y. Chan, S. Chawla, S. Sadiq, X. Zhou and V. Pudi, editors, Data Quality and High-Dimensional Data Analysis (Proceedings of DASFAA 2008 Workshops). World Scientific, 2009