Methods for data integration

The text on this page is taken from an equivalent page of the IEHIAS-project.

Scope

Purpose

The purpose of linking data from multiple monitoring sources is to see whether any added value can be identified. In INTARESE, the aim of linking environment and health data is to better understand the causes and severity of the impacts that different environmental hazards and exposures have on human health.

There is no single best method for doing this yet. The choice among the methods available for linking data from multiple sources depends on the research goal.

Even in a world of rapid information access, synthesizing vital scientific knowledge and evidence about environmental health (EH) problems and solutions into understandable concepts remains a formidable challenge. However, a range of scientific tools exists to facilitate such synthesis, including:

Quantifying the environmental health impacts
Environmental health mapping
Environmental health indicators

Boundaries

No perfect method exists for linking data from multiple sources; each method has its own boundaries. One example is GIS, which has come into broad use for linking health and environmental data from different sources. The limitations of using GIS are (Malkawi, [1]):

A map is primarily a means of display; it cannot predict the patterns of distribution or relationships between resources. The map does not infer a causal relationship, it merely points out that there are some spatial coincidences that are worth exploring, to see if a causal relationship exists. Likewise, to show how changes in one resource may impact distribution of another resource, the relationship must be known and put into the model creating the map.
Another limitation of spatially-referenced environmental information is that access is often limited. For example, data may be available for only a portion of the required area or for the whole area but taken from two or more different sampling exercises which may have used different sampling methodologies, scales, or accuracy levels.
Connected to this are costs involved in generating maps, and in printing, disseminating, and updating them. This requires specialized hardware and software, trained personnel, and often expensive and time-consuming means of acquiring, checking, interpreting, and inputting information.
Furthermore, the technology is rapidly advancing, and thus new applications and training courses are required on an almost annual basis.
Finally, not all people can readily relate to information in a two-dimensional spatial format, especially if the map is of an unfamiliar area or is presented in an unusual projection. Furthermore, different cultures place different importance or meaning on symbols and colors. For example, western cultures may use the color red to symbolize danger or an area where conditions are bad, but in China this color would symbolize luck or a favorable area.

Method description

Input

The most commonly mapped environmental information of relevance to the health sector includes:

pollution sources and affected areas (including sewage, solid waste, hazardous waste, industrial pollution, smoke and other emissions, and radiation);
land cover and use (including vegetation type, vegetation change and condition, agriculture, forestry, and soil type and condition);
water availability and quality;
energy sources and use (including fossil fuel use, electrical connectivity, biomass use, and renewable energy sources); and
biological resources (including protected areas and recreational sites, endangered species, and medicinal resources).

Output

In general, linking environmental and health data makes it possible to identify the health burden of environmental hazards and ecosystem degradation and, furthermore, the cost of damage to health and quality of life due to environmental degradation. Examples include the environmental burden of different diseases, deaths from different diseases, and deaths attributable to different pollutants.

Rationale

Here, we choose environmental health mapping as one example. The reasons are:

One inherent characteristic of both environmental and health data is that they have a location component. This characteristic makes Geographic Information Systems (GIS) an ideal and sometimes indispensable tool for analyzing environmental health data.

According to Malkawi, GIS mapping techniques can be used in two main ways to show the links between environment and health:

Simple overlays (comparisons) of environmental and socioeconomic (health) data can be used to identify patterns, which can then be investigated later for correlations.
Once the causal relationship is known, spatial models can also be developed to predict changes in health based on environmental changes.

In addition, environmental health is multidisciplinary, and GIS can handle multiple layers of data.

Methods for data integration

Methods for linking exposure and dose data

In general, two types of models can be used to link exposure and dose data. First, Physiologically Based Pharmacokinetic (PBPK) models are powerful computational tools that link exposure to the internal concentrations of parent compounds and/or active metabolites at the target site(s) of toxicity (http://cfpub.epa.gov).

Second, Biologically Based Pharmacokinetic (BBPK) models are increasingly used in the risk assessment of environmental chemicals. These models are based on biological, mathematical, statistical and engineering principles. Their potential uses in risk assessment include extrapolation between individuals, species, doses and routes of exposure (http://cfpub.epa.gov).
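To illustrate how a pharmacokinetic model links an external exposure to an internal concentration, the following is a minimal one-compartment sketch, not a full PBPK model (which would track several tissue compartments). It assumes a constant intake rate, a single well-mixed distribution volume and first-order elimination; the function names and all parameter values are hypothetical.

```python
import math

def simulate_concentration(intake_rate, volume, k_elim, t_end, dt=0.01):
    """Integrate dC/dt = intake_rate / volume - k_elim * C with explicit Euler steps.

    intake_rate: mass taken up per unit time (assumed constant)
    volume:      distribution volume of the single compartment
    k_elim:      first-order elimination rate constant (1 / time)
    Returns the internal concentration at time t_end, starting from C = 0.
    """
    concentration = 0.0
    steps = int(round(t_end / dt))
    for _ in range(steps):
        concentration += dt * (intake_rate / volume - k_elim * concentration)
    return concentration

def steady_state(intake_rate, volume, k_elim):
    """Concentration at which uptake and elimination balance."""
    return intake_rate / (volume * k_elim)

def analytic_concentration(intake_rate, volume, k_elim, t):
    """Closed-form solution of the same equation, used here as a cross-check."""
    return steady_state(intake_rate, volume, k_elim) * (1.0 - math.exp(-k_elim * t))

# Hypothetical values: 5 mass units per hour into 40 volume units, k = 0.1 per hour.
numeric = simulate_concentration(5.0, 40.0, 0.1, t_end=24.0)
exact = analytic_concentration(5.0, 40.0, 0.1, t=24.0)
```

The numerical and closed-form results should agree closely for a small time step. A real PBPK model replaces the single compartment with physiologically meaningful ones connected by blood flows.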

In addition, other tools for hazard identification and exposure assessment can also be used to link exposure and dose data.

Methods for linking dose and health effect data

Many tools are available to link dose and health effect, e.g. tools for spatial statistics, time-activity patterns, EPHT (Environmental Public Health Tracking, http://www.cdc.gov), dose-response assessment (DistGEN, GEN.T, http://www.foodrisk.org/resource_types/tools/dose_response.cfm) and risk characterization.

DCAL (Dose and Risk Calculation software) is a comprehensive software system for calculating tissue dose and the subsequent health risk from intake of, or exposure to, specific pollutants present in environmental media (http://www.wise-uranium.org/rdr.html).
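As a simple illustration of dose-response assessment followed by risk characterization, the sketch below applies an exponential dose-response function to several population groups and sums the expected cases. The exponential form is only one of many possible dose-response models and is chosen here as an assumption; it is not taken from any of the tools named above, and the parameter r and the group data are hypothetical.

```python
import math

def exponential_response(dose, r):
    """Probability of the health effect at a given dose: P = 1 - exp(-r * dose)."""
    return 1.0 - math.exp(-r * dose)

def expected_cases(groups, r):
    """Sum expected cases over groups given as (population, dose) pairs."""
    return sum(population * exponential_response(dose, r)
               for population, dose in groups)

# Hypothetical groups: (number of people, dose received)
groups = [(1000, 0.0), (500, 2.0), (200, 10.0)]
total = expected_cases(groups, r=0.05)
```

The unexposed group contributes no cases, and the contribution of each exposed group grows with its dose.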

Methods for linking exposure, dose and health effect data

Geographical information systems (GIS)

Geographical information systems (GIS) are “automated systems for the capture, storage, retrieval, analysis, and display of spatially referenced data” (Clarke et al., 1996; Higgs and Gould, 2001). GIS can relate otherwise disparate issues on the basis of common geography, revealing hidden patterns, relationships, and trends that are not readily apparent in spreadsheets or statistical packages, often creating new information from existing data resources. In E & H fields, this makes GIS a useful instrument for linking indicators from environmental monitoring, biomonitoring and health monitoring through visual presentation. The indicators can be represented as several different layers, where each layer holds data about a particular kind of feature. Each feature is linked to a position on the map and to a record in an attribute table. Apart from, for example, simply plotting environmental monitoring data or morbidity/mortality information on a map, GIS also offers important opportunities for inter- or extrapolation of data, for a geographical representation of monitoring or modeling data, and for the visualization of overlaps between different layers of information (Smolders et al., 2008).

GIS mapping techniques can be used in two main ways to show the links between environment and health: (i) simple overlays (comparisons) of environmental monitoring, biomonitoring and socioeconomic (health) data can be used to identify patterns, which can then be investigated later for correlations; and (ii) once the causal relationship between environment and health is known, spatial models can also be developed to predict changes in health based on environmental changes.
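The simple overlay in (i) can be sketched without a dedicated GIS package by assigning the features of two layers to the cells of a common grid and comparing the cells that both layers cover. The code below is a minimal illustration under that assumption; a real GIS would work with polygons, projections and proper spatial joins. The function names, the grid size and all data points are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def cell_of(x, y, cell_size):
    """Return the (column, row) index of the grid cell containing point (x, y)."""
    return (int(x // cell_size), int(y // cell_size))

def overlay(monitor_points, health_points, cell_size):
    """Overlay two point layers on a common grid.

    monitor_points: iterable of (x, y, concentration)
    health_points:  iterable of (x, y, cases)
    Returns {cell: (mean concentration, total cases)} for cells found in both layers.
    """
    concentrations = defaultdict(list)
    cases = defaultdict(int)
    for x, y, value in monitor_points:
        concentrations[cell_of(x, y, cell_size)].append(value)
    for x, y, count in health_points:
        cases[cell_of(x, y, cell_size)] += count
    shared = concentrations.keys() & cases.keys()
    return {cell: (mean(concentrations[cell]), cases[cell])
            for cell in sorted(shared)}

# Hypothetical layers: environmental monitoring sites and reported health cases.
monitors = [(0.5, 0.5, 10.0), (0.7, 0.2, 14.0), (1.5, 0.5, 30.0), (2.5, 2.5, 5.0)]
health = [(0.1, 0.9, 2), (1.2, 0.3, 7), (1.8, 0.8, 1), (3.5, 3.5, 4)]
result = overlay(monitors, health, cell_size=1.0)
```

Cells covered by only one layer are dropped, which mirrors the data-coverage limitation noted under Boundaries. Any pattern found this way indicates a spatial coincidence to investigate, not a causal relationship.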

GIS applications will be the cornerstone of an integrated monitoring system. Their spatial techniques are the best option for providing effective linkage and integration across exposure, dose and response (Smolders et al., 2008). The use of GIS techniques to integrate data from different monitoring programs will be considered and developed further in the next-step case studies.

Multiple Lines and Levels of Evidence (MLLE)

The Multiple Lines and Levels of Evidence (MLLE) approach was originally developed for epidemiological studies in which it was difficult to assign causality. It was first proposed by Hill (1965) in the medical field and has since been used in human and ecological risk assessments (Culp et al., 2000; Fairbrother, 2003). It is now being adapted for Natural Resource Management (NRM) (Adams, 2003; Young et al., 2006). At present, the MLLE method is widely used in research to explore cause-and-effect relationships (Norris et al., 2005).

Bayesian Belief Networks (BBN)

A Bayesian Belief Network (BBN) is a probabilistic graphical model that represents a set of random variables and their conditional independencies via a directed acyclic graph (DAG). For example, a Bayesian network could represent the probabilistic relationships between diseases and symptoms. Given symptoms, the network can be used to compute the probabilities of the presence of various diseases (Bayesian network in Wikipedia).

BBNs provide a rational method for integrating the best possible data from a variety of sources (Wooldridge and Done, 2003). A BBN can also incorporate prior knowledge in order to model a complex system more accurately, which may be difficult with other techniques (Pollino et al., 2005).
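The disease and symptom example can be made concrete with a very small network. The sketch below assumes a three-node chain, exposure -> disease -> symptom, and computes the probability of disease given the available evidence by enumerating the joint distribution. The structure, the function names and every probability are hypothetical values chosen for illustration only.

```python
from itertools import product

# Hypothetical probability tables for the chain: exposure -> disease -> symptom.
P_EXPOSED = {True: 0.3, False: 0.7}
P_DISEASE_GIVEN_EXPOSED = {True: 0.2, False: 0.05}   # P(disease | exposure state)
P_SYMPTOM_GIVEN_DISEASE = {True: 0.9, False: 0.1}    # P(symptom | disease state)

def joint(exposed, disease, symptom):
    """Joint probability of one full assignment, factorized along the DAG."""
    p_disease = P_DISEASE_GIVEN_EXPOSED[exposed]
    p_symptom = P_SYMPTOM_GIVEN_DISEASE[disease]
    return (P_EXPOSED[exposed]
            * (p_disease if disease else 1.0 - p_disease)
            * (p_symptom if symptom else 1.0 - p_symptom))

def posterior_disease(evidence):
    """P(disease = True | evidence); evidence may fix 'exposed' and/or 'symptom'."""
    numerator = 0.0
    denominator = 0.0
    for exposed, disease, symptom in product([True, False], repeat=3):
        if evidence.get("exposed", exposed) != exposed:
            continue
        if evidence.get("symptom", symptom) != symptom:
            continue
        p = joint(exposed, disease, symptom)
        denominator += p
        if disease:
            numerator += p
    return numerator / denominator

prior = posterior_disease({})
given_symptom = posterior_disease({"symptom": True})
given_both = posterior_disease({"symptom": True, "exposed": True})
```

Each additional piece of evidence raises the probability of disease in this example, which shows how a BBN combines information from different sources. Enumeration is only practical for tiny networks; larger ones need dedicated inference algorithms.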

Advanced statistical models

Many statistical models are used in different monitoring programs. Because integrated monitoring programs need to integrate monitoring indicators, the use of multivariate statistical models (e.g. models that connect information from different sources) needs to be considered and developed.

Techniques for assessing uncertainty

There are a large number of sources of uncertainty in integrated monitoring in E & H fields, e.g. inaccurate or insufficient observations, missing components or errors in the data, and random sampling error and bias (non-representativeness) in a sample. All types of uncertainty need to be handled with adequate analytical techniques. A more systematic and structured approach to uncertainty analysis is recommended.
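One widely used technique for handling such uncertainties is Monte Carlo simulation, in which uncertain inputs are sampled repeatedly and the spread of the result is summarized. The sketch below is an illustration under stated assumptions, not a recommended protocol: it assumes a linear exposure-response relationship, a normally distributed concentration and a triangular distribution for the response slope. All names and numbers are hypothetical.

```python
import random

def simulate_attributable_cases(n_draws, seed=1):
    """Draw attributable-case estimates under uncertain concentration and slope."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    population = 100_000               # hypothetical exposed population
    baseline_rate = 0.01               # hypothetical baseline disease rate
    draws = []
    for _ in range(n_draws):
        # Measurement uncertainty in concentration; negative values are cut at 0.
        concentration = max(0.0, rng.gauss(20.0, 4.0))
        # Uncertain fractional risk increase per unit concentration.
        slope = rng.triangular(0.004, 0.012, 0.008)
        draws.append(population * baseline_rate * slope * concentration)
    return draws

def summarize(draws, lower=0.025, upper=0.975):
    """Return the mean and an empirical interval between two quantiles."""
    ordered = sorted(draws)
    n = len(ordered)
    mean_value = sum(ordered) / n
    return (mean_value,
            ordered[int(lower * (n - 1))],
            ordered[int(upper * (n - 1))])

draws = simulate_attributable_cases(20_000)
mean_cases, low_cases, high_cases = summarize(draws)
```

Reporting the interval alongside the mean makes the effect of input uncertainty on the result explicit. Biases such as non-representative sampling are not captured by this kind of simulation and need separate treatment.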

Techniques for quality assurance and quality control

Quality assurance (QA), quality control (QC) and standard operating procedures (SOP) are separate components of an integrated monitoring program that work together to provide data of known quality. Together they minimize and quantify the errors introduced in sampling and allow errors that occur to be tracked. One of the most important aspects of quality assurance in a monitoring program is the development of a quality assurance plan, which should clearly identify the quality of data needed and describe in detail the actions planned to provide confidence that the program will meet its stated objectives (Shampine, 1993). The plan should be developed with all stakeholders and for each objective. Quality control data, which allow the quality and suitability of the environmental and health data to be evaluated and verified, should be collected and used as an integral part of the QA effort associated with a monitoring program (Shampine, 1993). QA/QC should address both data quality and data type: quality should be consistent and comparable, and the data should be available and accessible.

References

Adams, S.M. 2003. Establishing causality between environmental stressors and effects on aquatic ecosystems. Human and Ecological Risk Assessment. 9(1): 17–35.
Clarke, K.C., McLafferty, S.L., Tempalski, B.J. 1996. On epidemiology and geographic information systems: A review and discussion of future directions. Emerg. Infect. Dis. 3:85–92.
Culp, J.M., Lowell, R.B., Cash, K.J. 2000. Integrating mesocosm experiments with field and laboratory studies to generate weight-of-evidence risk assessments for large rivers. Environmental Toxicology and Chemistry. 19(4): 1167–1173.
Fairbrother, A. 2003. Lines of evidence in wildlife risk assessments. Human and Ecological Risk Assessment. 9(6): 1475–1491.
Higgs, G., Gould, M. 2001. Is there a role for GIS in the ‘new NHS’? Health Place. 7:247–259.
Hill, AB. 1965. The environment and disease: Association or causation. Proceedings of the Royal Society of Medicine, vol. 58, pp. 295–300.
Norris, R., Liston, P., Mugodo, J., Nichols, S., Quinn, G., Cottingham, P., Metzeling, L., Perriss, S., Robinson, D., Tiller, D., Wilson, G. 2005. Multiple Lines and Levels of Evidence for detecting ecological responses to management intervention. In I.D. Rutherfurd, I. Wiszniewski, M.J. Askey-Doran and R. Glazik (Eds), Proceedings of the 4th Australian Stream Management Conference: linking rivers to landscapes, (pp. 456-463). Department of Primary Industries, Water and Environment, Hobart, Tasmania.
Pollino, C.A., Woodberry, O., Nicholson, A.E., Korb, K.B. 2005. Parameterising Bayesian networks: a case study in ecological risk assessment. Proceedings of the 2005 International Conference on Simulation and Modeling, Bangkok, Thailand, January 2005.
Shampine, W. J., 1993. Quality assurance and quality control in monitoring programs: in Improving natural resource management through monitoring, workshop, Stafford, S., ed., Environmental Monitoring and Assessment. 26 (2-3):143-151.
Smolders, R., Casteleyn, L., Joas, R., and Schoeters, G. 2008. Human biomonitoring and the inspire directive: spatial data as links for environment and health research. Journal of Toxicology and Environmental Health, Part B. 11 (8): 646-659.
Wooldridge, S., Done, T. 2003. The use of Bayesian belief networks to aid in the understanding and management of large-scale coral bleaching. MODSIM 2003 International Conference on Modeling and Simulation, Townsville, July 2003.
Young, B., Nichols, S., Norris, R. 2006. Application of multiple lines and levels of evidence (MLLE) for addressing ecological questions of causality. Australian Society for Limnology 45th Annual Conference, 25–29 September, Albury, NSW.
