# NISER GeneralActivities 081107 ID


## Revision as of 05:19, 17 August 2007


## NISER General Activities - Mercury Contamination in Fish: A multi-disciplinary Activity

## Summary

Visualization, understanding and interpreting real data may be challenging because of noise in the data, data complexity, multiple variables, hidden relations between variables and large variation. This NISER activity demonstrates how to use free Internet-based IT-tools and resources to solve problems that arise in the areas of biological, chemical, medical and social research.

## Goals

This NISER activity has the following specific goals:

- to demonstrate the typical research investigation pipeline - from problem formulation, to data collection, visualization, analysis and interpretation;
- to illustrate the variety of portable freely available Internet-based Java tools, computational resources and learning materials for solving practical problems;
- to provide a hands-on example of interdisciplinary training, cross-over of research techniques, data, models and expertise to enhance contemporary science education;
- to promote interactions between different science education areas and stimulate the development of new and synergistic learning materials and course curricula across disciplines.

## Motivation/Problem

Mercury contamination of edible freshwater fish poses a direct threat to human health. Largemouth bass is a freshwater fish that was studied in 53 different Florida lakes to examine the factors that influence the level of mercury contamination. Water samples were collected from the surface of the middle of each lake in August 1990 and then again in March 1991. The **pH level**, the **amount of chlorophyll**, **calcium**, and **alkalinity** were measured in each sample. Also, samples of fish were taken from each lake, with sample sizes ranging from 4 to 44 fish. The age of each fish and the mercury concentration in its muscle tissue were measured. Since fish absorb mercury over time, older fish tend to have higher concentrations. Here is a detailed discussion of why mercury accumulates in muscle tissue and why it is toxic in larger amounts.

## Data

Largemouth bass were studied in 53 different Florida lakes to examine the factors that influence the level of mercury contamination. Water samples were collected from the surface of the middle of each lake in August 1990 and then again in March 1991. The pH level, the amount of chlorophyll, calcium, and alkalinity were measured in each sample. The average of the August and March values was used in the analysis. Next, a sample of fish was taken from each lake, with sample sizes ranging from 4 to 44 fish. The age of each fish and the mercury concentration in its muscle tissue were measured.

## Challenge

To make a fair comparison of the fish in different lakes, use a regression estimate of the expected mercury concentration in a three-year-old fish as the standardized value for each lake. Determine the age of the individual fish in some lakes and correlate this with the average mercury concentration of the sampled fish.
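The standardization step can be sketched as follows. This is a minimal illustration, assuming a separate ordinary least squares line is fit to each lake's (age, mercury) pairs and then evaluated at age 3; the source does not say whether the original study fit lakes separately or used a pooled model. The lake names, the fish data, and the function names `fit_line` and `standardized_mercury` are hypothetical.

```python
from statistics import mean


def fit_line(ages, mercury):
    """Ordinary least squares fit of mercury = intercept + slope * age."""
    age_bar, hg_bar = mean(ages), mean(mercury)
    sxx = sum((a - age_bar) ** 2 for a in ages)
    sxy = sum((a - age_bar) * (m - hg_bar) for a, m in zip(ages, mercury))
    slope = sxy / sxx
    intercept = hg_bar - slope * age_bar
    return intercept, slope


def standardized_mercury(ages, mercury, reference_age=3.0):
    """Expected mercury concentration of a fish of the reference age."""
    intercept, slope = fit_line(ages, mercury)
    return intercept + slope * reference_age


# Hypothetical fish from two made-up lakes: ages in years, mercury in ppm.
lakes = {
    "Lake A": ([1, 2, 4, 5], [0.10, 0.20, 0.40, 0.50]),
    "Lake B": ([2, 3, 6, 7], [0.50, 0.60, 0.90, 1.00]),
}

for name, (ages, hg) in lakes.items():
    print(name, round(standardized_mercury(ages, hg), 3))
```

Because both lakes are evaluated at the same reference age, the two numbers can be compared even though the sampled fish differ in age.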

Florida has set a standard of 1/2 part per million as the unsafe level of mercury concentration in edible foods. 45.3% of the lakes exceed this level. The smallest level of mercury concentration that the measuring instrument can detect is 40 parts per billion. Any level below that was set to 40 parts per billion. This, of course, "flattens out" the slope of the relationship at the low end as well as affecting the standardized values. These observations are usually on young fish.
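The effect of the detection limit can be illustrated with a short sketch. It assumes all values are expressed in parts per million (so 40 parts per billion is 0.04 ppm); the readings and the helper names are hypothetical. For reference, 45.3% of 53 lakes corresponds to 24 lakes.

```python
DETECTION_LIMIT_PPM = 0.04  # 40 parts per billion, expressed in ppm
UNSAFE_LEVEL_PPM = 0.5      # Florida standard quoted above


def apply_detection_limit(readings_ppm, limit=DETECTION_LIMIT_PPM):
    """Replace every reading below the detection limit with the limit itself."""
    return [max(r, limit) for r in readings_ppm]


def share_exceeding(values_ppm, threshold=UNSAFE_LEVEL_PPM):
    """Fraction of values strictly above the threshold."""
    return sum(v > threshold for v in values_ppm) / len(values_ppm)


# Hypothetical "true" concentrations; the first three fall below the limit.
true_values = [0.01, 0.02, 0.03, 0.20, 0.60]
recorded = apply_detection_limit(true_values)

print(recorded)                   # low readings are all recorded as 0.04
print(share_exceeding(recorded))  # share of readings above 0.5 ppm
print(round(24 / 53 * 100, 1))    # 24 of 53 lakes, as a percentage
```

The three lowest readings become indistinguishable after censoring, which is why the fitted slope flattens at the low end.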

Logarithmic transformations on some of the variables may provide insights into the relationships among the other variables in the study. For instance, **alkalinity level** may be associated with mercury concentration, and may help account for the higher levels of mercury.

## Methods & Approaches

We now discuss a variety of scientific methods, models, tools and strategies for modeling, understanding, visualizing and drawing inference from the data in this driving biological problem. We also discuss how to check and confirm the assumptions underlying each model or technique.

### Physics

- What is an atom and what is the atomic structure of mercury?
- TBD

- TBD

### Biology

- Why is mercury (and other heavy metals) accumulating in muscle tissue?
- What causes the toxicity of larger than normal amounts of mercury in the body?
- The Chlorophyll molecule.
- 3D structure of Chlorophyll using PDB/JMol viewer.

- TBD

### Ethotoxicology

Ethotoxicology is the study of the behavioral consequences of exposure to toxic chemicals. Toxic chemicals, like *heavy metals* and estrogen mimicking compounds, may influence behavior. For instance, there are documented changes in how some animals respond to novel situations, vocalize, and expose themselves to the risk of predation as a function of chemical exposure. Thus, by quantifying time allocation using an event recorder (like JWatcher), and using various statistical analyses to analyze these data, it is often possible to have a behavioral indicator of chemical exposure. And, by quantifying such behaviors in a variety of species, and then using phylogenetic software (like Mesquite), it is possible to study the evolution of such behavioral effects. We are in the process of developing new applets and learning materials demonstrating the effects of toxic exposure to such compounds on animal behavior.

### Chemistry

- Periodic Table
- Alkalinity
- pH
- See the position of **Mercury** (*Hg*) in the interactive Periodic Table and discuss its chemical properties.

- TBD

### Engineering

- TBD

- TBD

### Mathematics

- Fit **models** to some of the variables in this dataset.
  - We can easily compute the histogram of the average (per lake) mercury measurements and fit Normal, Exponential or other distribution models to it using SOCR Charts and SOCR Modeler (see examples here). The left image below shows the fit of a $\mathrm{Normal}(\mu = 10.54, \sigma^2 = 45.64)$ distribution model and the right image shows the (offset by 0.8) $\mathrm{Exponential}(\lambda = 10.54)$ model fit to the mercury histogram. The parameters in both cases are automatically calculated from the data using maximum likelihood estimation.
  - We can also demonstrate the effect of using a wavelet decomposition of the column data (*avg_mercury*). The image below demonstrates the **wavelet representation** of this column vector data using Daubechies' 2 wavelet basis, where only the largest 10% of the wavelet coefficients were used to reconstruct/synthesize the (compressed) wavelet representation of the data. More information about wavelets and wavelet-based activities may be found here. You can use the SOCR Modeler to fit any of a number of wavelet models to data and observe the effect of wavelet shrinkage on the corresponding wavelet model fit.
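The wavelet compression idea (keep only the largest coefficients, then reconstruct) can be sketched in plain Python. This is an illustration under several assumptions, not the SOCR Modeler implementation: "Daubechies' 2" is taken to mean the 4-tap Daubechies filter (often called db2), boundaries are handled periodically, and the signal length must be a power of two, so a 53-value column would first need padding or truncation. The signal below is hypothetical.

```python
import math

SQRT3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# 4-tap Daubechies low-pass filter H and its high-pass mirror G.
H = [(1 + SQRT3) / NORM, (3 + SQRT3) / NORM, (3 - SQRT3) / NORM, (1 - SQRT3) / NORM]
G = [H[3], -H[2], H[1], -H[0]]


def forward_step(x):
    """One level of the periodic transform: approximation and detail halves."""
    n = len(x)
    approx = [sum(H[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    detail = [sum(G[k] * x[(2 * i + k) % n] for k in range(4)) for i in range(n // 2)]
    return approx, detail


def inverse_step(approx, detail):
    """Invert forward_step (the transform is orthogonal, so use the transpose)."""
    n = 2 * len(approx)
    x = [0.0] * n
    for i in range(len(approx)):
        for k in range(4):
            x[(2 * i + k) % n] += H[k] * approx[i] + G[k] * detail[i]
    return x


def wavedec(x):
    """Multi-level decomposition; len(x) must be a power of two, at least 4."""
    approx, details = list(x), []
    while len(approx) >= 4:
        approx, detail = forward_step(approx)
        details.append(detail)
    return approx, details  # coarsest approximation, details fine to coarse


def waverec(approx, details):
    for detail in reversed(details):
        approx = inverse_step(approx, detail)
    return approx


def compress(x, keep_fraction=0.10):
    """Zero all but the largest coefficients (by magnitude) and reconstruct."""
    approx, details = wavedec(x)
    magnitudes = [abs(c) for c in approx] + [abs(c) for d in details for c in d]
    n_keep = max(1, math.ceil(keep_fraction * len(magnitudes)))
    cutoff = sorted(magnitudes, reverse=True)[n_keep - 1]

    def keep(c):
        return c if abs(c) >= cutoff else 0.0  # ties at the cutoff are all kept

    return waverec([keep(c) for c in approx], [[keep(c) for c in d] for d in details])


# Hypothetical 16-point signal standing in for a data column.
signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0, 5.0, 8.0, 9.0, 7.0, 9.0, 3.0]
print([round(v, 2) for v in compress(signal, keep_fraction=0.10)])
```

Raising `keep_fraction` toward 1.0 brings the reconstruction closer to the original signal; at 1.0 it is recovered up to rounding error.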

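The maximum likelihood estimates behind such Normal and Exponential fits have closed forms, sketched below. This is a minimal illustration rather than the SOCR Modeler code: the data values and function names are hypothetical, and the `offset` argument is an assumed way to model a shifted exponential. Tools differ in whether they report the exponential rate or its reciprocal (the mean), so fitted parameters should be compared with that in mind.

```python
from statistics import fmean


def normal_mle(data):
    """MLE for a Normal model: sample mean and variance with divisor n."""
    mu = fmean(data)
    variance = fmean([(x - mu) ** 2 for x in data])  # divides by n, not n - 1
    return mu, variance


def exponential_mle(data, offset=0.0):
    """MLE rate for an Exponential model of (data - offset)."""
    shifted_mean = fmean([x - offset for x in data])
    return 1.0 / shifted_mean  # rate; the fitted mean is its reciprocal


# Hypothetical per-lake average mercury values.
sample = [1.0, 2.0, 3.0, 4.0, 5.0]

mu, variance = normal_mle(sample)
rate = exponential_mle(sample)
print(mu, variance)  # Normal parameters
print(rate)          # Exponential rate
```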

### Statistics

- Are there associations in these data between *Alkalinity*, *pH*, *Calcium* and *Chlorophyll*? Using SOCR Analyses, construct regression plots illustrating the relations between *Calcium*, *Chlorophyll* and *Avg_Mercury*, as shown on the image below.
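One simple way to screen for pairwise associations before drawing regression plots is the Pearson correlation coefficient. The sketch below computes it for every pair of water-chemistry variables; the column values are hypothetical placeholders, not the Florida lake measurements, and `pearson_r` is an illustrative helper name.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical values for five lakes.
water = {
    "Alkalinity": [5.0, 20.0, 45.0, 80.0, 110.0],
    "pH": [5.1, 6.2, 6.9, 7.8, 8.4],
    "Calcium": [3.0, 10.0, 22.0, 41.0, 60.0],
    "Chlorophyll": [2.0, 15.0, 9.0, 40.0, 25.0],
}

for first, second in combinations(water, 2):
    r = pearson_r(water[first], water[second])
    print(f"{first} vs {second}: r = {r:.2f}")
```

A coefficient near +1 or -1 suggests a strong linear relation worth plotting; a value near 0 does not rule out a nonlinear one.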

- Generate some exploratory data analyses (EDA plots). For instance, the figure below demonstrates how you can use SOCR Charts to obtain the scatter plot of *Alkalinity* vs. *pH*. What are some inferences about the bivariate relations between these variables?

- What is the distribution of the average amount of *mercury* in the entire sample? We can use SOCR Charts to construct the histogram of the *avg_mercury* column data. To learn how to use SOCR Charts see these Charts Activities. Notice the effect the bin-size parameter has on the shape of the data histogram (you can smoothly vary this parameter). Although there are many choices for determining the bin-size parameter, in general, the optimal choice of the histogram bin-size parameter may be determined by $\frac{2 \times IQR}{N^{1/3}}$ (the Freedman-Diaconis rule), where $N$ is the sample size, the interquartile range $IQR = Q_3 - Q_1$, and $Q_3$ and $Q_1$ are the **third** and **first** quartiles for the sample data, respectively.
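A widely used IQR-based bin-size choice is the Freedman-Diaconis rule, which sets the bin width to $2 \times IQR / N^{1/3}$ for a sample of size $N$. The sketch below assumes that rule and uses the default quartile convention of Python's `statistics.quantiles`; other tools compute quartiles slightly differently, so the resulting widths may not match exactly. The data and function names are hypothetical.

```python
import math
from statistics import quantiles


def freedman_diaconis_width(data):
    """Bin width 2 * IQR / N^(1/3), with IQR = Q3 - Q1."""
    q1, _median, q3 = quantiles(data, n=4)
    return 2.0 * (q3 - q1) / len(data) ** (1.0 / 3.0)


def bin_count(data):
    """Number of equal-width bins needed to cover the data range."""
    width = freedman_diaconis_width(data)
    return max(1, math.ceil((max(data) - min(data)) / width))


# Hypothetical sample of eight values.
sample = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
print(freedman_diaconis_width(sample))
print(bin_count(sample))
```

Because the rule depends on the IQR rather than the standard deviation, a few extreme mercury values have little influence on the chosen width.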

- Is there evidence of statistical differences between the two groups (according to the *age_data* variable) in either *Alkalinity* or *pH*?
  - There does not seem to be a statistically significant difference in the *pH* levels between the two groups separated by the *age_data* variable. A non-parametric Wilcoxon Rank Sum test did not produce a significantly small p-value (see the image below showing the result of the calculations using SOCR Analyses).
  - Similarly, there does not seem to be a statistically significant difference in the *Alkalinity* levels between the two groups separated by the *age_data* variable. The same non-parametric Wilcoxon Rank Sum test did not produce a significantly small p-value (see the image below showing the result of the calculations using SOCR Analyses).

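The mechanics of a Wilcoxon Rank Sum test can be sketched as follows. This is an illustration under stated assumptions, not the SOCR Analyses implementation: it uses the large-sample normal approximation without tie or continuity corrections, so for small groups the p-value is only rough and exact tables or a statistics package would be preferable. The two groups below are hypothetical pH values.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=values.__getitem__)
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def rank_sum_test(group_a, group_b):
    """Return (W, z, two-sided p), where W is the rank sum of group_a."""
    n1, n2 = len(group_a), len(group_b)
    ranks = average_ranks(list(group_a) + list(group_b))
    w = sum(ranks[:n1])
    mean_w = n1 * (n1 + n2 + 1) / 2.0
    sd_w = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (w - mean_w) / sd_w
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))
    return w, z, p_value


# Hypothetical pH values for the two age_data groups.
ph_group_0 = [6.1, 7.4, 6.8, 5.9, 7.0]
ph_group_1 = [6.5, 7.1, 6.0, 7.3, 6.9]

w, z, p_value = rank_sum_test(ph_group_0, ph_group_1)
print(f"W = {w}, z = {z:.3f}, p = {p_value:.3f}")
```

A large p-value, as reported for both *pH* and *Alkalinity* above, means the ranks of the two groups are well mixed and the test finds no evidence of a shift between them.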

## Computational Resources

- Internet-based SOCR Tools (including offline resources, e.g., tables)

## Hands-on activities

- TBD (Step-by-step practice problems).

## Notes

- Follow the directions in the SOCR Wiki Editing Guide to expand, revise or improve these materials.

- NISER Home page: http://www.NSER.org
