SOCR Data Analysis Documentation

Revision as of 23:33, 27 July 2006


Framework and Implementation

Framework of the Analysis Component

There are three sets of classes: Data, Model, and Result. They are located at:

  1. Data: the edu.ucla.stat.SOCR.analyses.data.Data class.
  2. Model classes: in the package edu.ucla.stat.SOCR.analyses.model.
  3. Result classes: in the package edu.ucla.stat.SOCR.analyses.result.

The Data class has a public, non-static method called "getAnalysis". It is the main method, and the only one needed, to run a data analysis. It identifies the requested analysis model from a parameter, analysisType, supplied by the caller. The legal analysis types are constants of the primitive type short defined in edu.ucla.stat.SOCR.analyses.model.AnalysisType; you must use the constants defined in that class. The Data class then calls a model class, which performs all the mathematical operations. All computed results (e.g. parameter estimates and their standard errors) are stored in a Result object, in a HashMap member variable named texture. Call the "getTexture" method to fetch this HashMap and read the results.
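
The flow described above (one entry point, a type code that selects the model, and a result object that exposes a HashMap) can be sketched in self-contained Java. This is only an illustration of the pattern, not the SOCR source: every class name starting with Sketch, the constant value 1, and the use of ALPHA for the intercept and BETA for the slope are assumptions made for the example.

```java
import java.util.HashMap;

// Hypothetical stand-ins that mirror the pattern described above.
// None of these classes are the real SOCR classes.
final class SketchAnalysisType {
    static final short SIMPLE_LINEAR_REGRESSION = 1; // assumed value
    private SketchAnalysisType() {}
}

class SketchResult {
    static final String ALPHA = "ALPHA"; // assumed key: intercept
    static final String BETA = "BETA";   // assumed key: slope
    private final HashMap<String, Object> texture = new HashMap<>();

    void put(String key, Object value) { texture.put(key, value); }

    // Callers read every computed value through this map.
    HashMap<String, Object> getTexture() { return texture; }
}

final class SketchRegressionModel {
    private SketchRegressionModel() {}

    // Ordinary least squares for y = alpha + beta * x.
    static SketchResult analyze(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0, meanY = 0;
        for (int i = 0; i < n; i++) { meanX += x[i]; meanY += y[i]; }
        meanX /= n;
        meanY /= n;
        double sxx = 0, sxy = 0;
        for (int i = 0; i < n; i++) {
            sxx += (x[i] - meanX) * (x[i] - meanX);
            sxy += (x[i] - meanX) * (y[i] - meanY);
        }
        double beta = sxy / sxx;
        SketchResult result = new SketchResult();
        result.put(SketchResult.BETA, Double.valueOf(beta));
        result.put(SketchResult.ALPHA, Double.valueOf(meanY - beta * meanX));
        return result;
    }
}

class SketchData {
    private double[] x;
    private double[] y;

    void appendX(String name, double[] values) { x = values.clone(); }
    void appendY(String name, double[] values) { y = values.clone(); }

    // Single entry point: pick the model from the type code, return its result.
    SketchResult getAnalysis(short analysisType) {
        if (x == null || y == null || x.length == 0 || x.length != y.length) {
            throw new IllegalStateException("data is empty or mismatched");
        }
        switch (analysisType) {
            case SketchAnalysisType.SIMPLE_LINEAR_REGRESSION:
                return SketchRegressionModel.analyze(x, y);
            default:
                throw new IllegalArgumentException("unknown analysis type: " + analysisType);
        }
    }
}
```

Keeping the type codes in one class and the results in one map means callers only need to learn a single method and a set of keys, which is the design the paragraph above describes.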

Example

Here is an example snippet. Suppose we would like to run a simple linear regression on two variables, height and weight, stored in two arrays of the same length.


import java.util.HashMap;
// Data, DataType, AnalysisType, Result and SimpleLinearRegressionResult are SOCR classes.

double[] heightDoubleArray = new double[] {60, 55, 51, 54, 63};
double[] weightDoubleArray = new double[] {190, 160, 110, 120, 130};
	
// you need to instantiate a Data instance first.
Data testData = new Data();
// submit the independent variable with appendX.
testData.appendX("HEIGHT", heightDoubleArray, DataType.QUANTITATIVE);
// submit the dependent variable with appendY.
testData.appendY("WEIGHT", weightDoubleArray, DataType.QUANTITATIVE);
	
// then use the following lines to get the result.
try {
	Result result = testData.getAnalysis(AnalysisType.SIMPLE_LINEAR_REGRESSION);
	// result is null if something goes wrong with the data, e.g. an exception is thrown.
	if (result != null) {
		// Result.getTexture() returns a HashMap that holds the result data.
		HashMap texture = result.getTexture();
		double alpha = 0, beta = 0;
		try {
			// a missing entry makes get() return null, so doubleValue() throws.
			alpha = ((Double) texture.get(SimpleLinearRegressionResult.ALPHA)).doubleValue();
			System.out.println("alpha = " + alpha);
		} catch (NullPointerException e) {
			System.out.println("alpha could not be computed.");
		}
		try {
			beta = ((Double) texture.get(SimpleLinearRegressionResult.BETA)).doubleValue();
			System.out.println("beta = " + beta);
		} catch (NullPointerException e) {
			System.out.println("beta could not be computed.");
		}
	}
} catch (Exception e) {
	System.out.println("Something is wrong. No result generated.");
}
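
The snippet above detects a missing result by catching NullPointerException. As an alternative, a small helper can check the map entry before unboxing it. The following is a minimal sketch that uses only java.util classes; TextureReader and readDouble are hypothetical names, and the string keys and the value 1.5 are placeholders rather than real SOCR output.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.OptionalDouble;

// Hypothetical helper; it is not part of SOCR.
final class TextureReader {
    private TextureReader() {}

    // Returns the entry as a double, or empty if it is absent or not numeric.
    static OptionalDouble readDouble(Map<?, ?> texture, Object key) {
        if (texture == null) {
            return OptionalDouble.empty();
        }
        Object value = texture.get(key);
        if (value instanceof Number) {
            return OptionalDouble.of(((Number) value).doubleValue());
        }
        return OptionalDouble.empty();
    }

    public static void main(String[] args) {
        HashMap<String, Object> texture = new HashMap<>();
        texture.put("ALPHA", Double.valueOf(1.5)); // placeholder key and value

        OptionalDouble alpha = readDouble(texture, "ALPHA");
        OptionalDouble beta = readDouble(texture, "BETA");
        System.out.println(alpha.isPresent()
                ? "alpha = " + alpha.getAsDouble()
                : "alpha could not be computed.");
        System.out.println(beta.isPresent()
                ? "beta = " + beta.getAsDouble()
                : "beta could not be computed.");
    }
}
```

With the placeholder map, this prints "alpha = 1.5" followed by "beta could not be computed." The explicit check keeps a genuinely unexpected NullPointerException from being mistaken for a missing result.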

A Few Points to Note

Here are a few things to know when writing code that calls an analysis:

  • The outer try block above can catch: DataIsEmptyException (SOCR defined), WrongAnalysisException (SOCR defined), InstantiationException, IllegalAccessException, and ClassNotFoundException. If the analysis type is not specified correctly, a WrongAnalysisException is thrown. If the array that holds the data is null or has length zero, a DataIsEmptyException is thrown.
  • For the complete set of available results, see the examples below or check the class API under edu.ucla.stat.SOCR.analyses.result.
  • A few words about the appendX and appendY methods: all QUANTITATIVE variables must be submitted as int, long, float, or double. All FACTOR variables must be submitted as String, and any array of String data is treated as FACTOR. Therefore, if you intend an array to be QUANTITATIVE but submit it as String, the results cannot be computed.
  • Data types: there are two broad categories of data, QUANTITATIVE and FACTOR. QUANTITATIVE variables are, for example, height, weight, or SAT score. FACTOR variables are categorical, such as sex (Male/Female) or race (White/Black/Asian/Hispanic). QUANTITATIVE and FACTOR are constants declared in the DataType class.
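
Because String arrays are treated as FACTOR, numeric values that arrive as text (for example from a file or a form) need to be converted before they are submitted as QUANTITATIVE. A minimal sketch of that conversion, using only the standard library, follows. QuantitativeInput and toDoubles are hypothetical names, and the early check for null or empty input is an assumption that mirrors the DataIsEmptyException case described above.

```java
// Hypothetical helper; it is not part of SOCR.
final class QuantitativeInput {
    private QuantitativeInput() {}

    // Converts numeric text into a double[] suitable for a QUANTITATIVE variable.
    // Throws NumberFormatException if any entry is not a number (e.g. "Male").
    static double[] toDoubles(String[] raw) {
        if (raw == null || raw.length == 0) {
            throw new IllegalArgumentException("data is null or empty");
        }
        double[] out = new double[raw.length];
        for (int i = 0; i < raw.length; i++) {
            out[i] = Double.parseDouble(raw[i].trim());
        }
        return out;
    }
}
```

For example, QuantitativeInput.toDoubles(new String[] {"60", "55", "51"}) yields an array that could be passed to appendX with DataType.QUANTITATIVE, whereas a categorical array such as {"Male", "Female"} should be left as String and submitted as FACTOR.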

Implemented Analysis Models

As of August 1, 2006, we have implemented:

Under Linear Models:

  • One Way ANOVA
  • Two Way ANOVA
  • Simple Linear Regression
  • Multiple Linear Regression

Under Parametric Testing:

  • One Sample T-Test
  • Two Independent Sample T-Test
  • Two Paired Sample T-Test

Under Non-Parametric Testing:

  • Two Independent Sample Wilcoxon Test
  • Two Paired Sample Signed Rank Test


More Examples

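The linear model examples follow the logic described in the Framework and Implementation section.

  • Example: Simple Linear Regression
  • Example: Multiple Linear Regression
  • Example: One Way ANOVA
  • Example: Two Way ANOVA

The parametric and non-parametric tests are even easier to use, through ad hoc static methods.

  • Example: One Sample T-Test
  • Example: Two Independent Sample T-Test
  • Example: Two Paired Sample T-Test
  • Example: Two Independent Sample Wilcoxon Test
  • Example: Two Paired Sample Signed Rank Test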
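
The text does not name the static methods SOCR provides for these tests, so the following is only a sketch of what the static-method style looks like for a one-sample t-test. OneSampleTSketch and tStatistic are hypothetical names, and the method computes just the test statistic t = (mean - mu0) / (s / sqrt(n)), where s is the sample standard deviation with n - 1 in the denominator.

```java
// Hypothetical illustration of a static-method test; it is not the SOCR API.
final class OneSampleTSketch {
    private OneSampleTSketch() {}

    // One-sample t statistic for the null hypothesis that the mean equals mu0.
    static double tStatistic(double[] sample, double mu0) {
        int n = sample.length;
        if (n < 2) {
            throw new IllegalArgumentException("need at least two observations");
        }
        double mean = 0;
        for (double v : sample) {
            mean += v;
        }
        mean /= n;
        double sumSquares = 0;
        for (double v : sample) {
            sumSquares += (v - mean) * (v - mean);
        }
        double sd = Math.sqrt(sumSquares / (n - 1)); // sample standard deviation
        return (mean - mu0) / (sd / Math.sqrt(n));
    }
}
```

A caller needs no Data instance here: a single call such as OneSampleTSketch.tStatistic(sample, 3.0) takes the array and the hypothesized mean and returns the statistic.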
