Multivariate Regression
started 8/26/2000; 5:58:31 PM  last post 8/28/2000; 9:26:12 PM


Chris Rathman  Multivariate Regression
8/26/2000; 5:58:31 PM (reads: 391, responses: 5)


A while back, I got into an argument about least squares regression techniques. In the process, I implemented a sample technique in Python, Smalltalk, and Java. The artifacts of that discussion can be found at Multivariate Regression on my home page, which I'm trying to put together.
For a little background, I originally had to implement a regression package for a company in Norway that was doing analysis on data being collected on the offshore platforms in the North Sea. At the time, it was quite a rush job and I made a quick and dirty implementation in Pascal and 68K ASM. This was some 10 years ago, and I don't recall exactly where I "borrowed" the solution, though I believe that the general Gaussian matrix solution was lifted from a FORTRAN numerical recipes book and the statistics stuff was lifted from my college stats books.
As I said, the Python, ST, and Java implementations are of more recent vintage and were part of some code archeology I was doing. The original Pascal code was done prior to the advent of the internet, so the resources I had at my disposal were rather limited. However, when I started doing the recent stuff, I could not locate any relevant links that provided analysis or code for multivariate regression.
To make a long story short, I was wondering if anyone knows the best place to look for statistical and regression analysis from a software implementation perspective (preferably Python). I'm pretty sure this territory has been covered numerous times (I recall using it in SAS in my Economics classes), but most of the links I found go off into intense statistical analysis way beyond my needs (much less my level of comprehension).
Better yet, are there some generic Python stat libraries that I can tap into?
Thanks.


andrew cooke  Re: Multivariate Regression
8/27/2000; 1:21:16 AM (reads: 409, responses: 0)


Not what you asked for (best source for general info I ever found was Numerical Recipes, which it sounds like you already have; Python has matrix support in a package called NumPy, don't know of a stats package), but you can do the equivalent of lambda expressions in Java using inner/anonymous classes.
See here (especially the new Command part of the example code just before here). (I don't think I've ever used this, despite programming for a living in Java; implementing interfaces is usually sufficient for most problems.)


Ehud Lamm  Re: Multivariate Regression
8/27/2000; 5:16:48 AM (reads: 437, responses: 3)


Usually netlib and statlib are good places to check for such things, though you probably know this already. The Numerical Recipes code is available online. I found the statistics section of Yahoo useful at times.
But what is the language angle here?
Is the fact that you are considering a language that is not mainstream (or is it?) enough to make this a language issue? I tend to think not, unless, of course, the language has some interesting approach useful for this kind of problem. Now an APL solution might be nice. Or maybe this is just an APL issue then?
I can see two interesting issues: languages with tools that are useful in general, and in particular relevant to your problem (i.e., regression); or languages specifically tied to statistics, and how they differ from more general purpose languages. Examples of this type of language are R or S.
I once wrote a short paper about languages for statistical inference and query languages. It compared SAS and a home grown language we had where I worked. Some of the conclusions seemed pretty interesting to me. Alas, the paper is in Hebrew, and isn't online.


Ehud Lamm  Re: Multivariate Regression
8/27/2000; 5:28:13 AM (reads: 461, responses: 0)


I liked this quote from the "History of S" paper:
Did you notice a certain uncertainty about what to call the thing? We started out with System, then added Language, then switched to Environment; with the next version, we would switch back to Language and drop System. We were sure, however, that we wanted to avoid the P word: S was not to be considered a statistical package in the usual sense of the term.


Chris Rathman  Re: Multivariate Regression
8/28/2000; 6:54:22 PM (reads: 491, responses: 1)


But what is the language angle here?
The problem of making a generic multivariate regression library was kind of fresh on my mind. The thing I was kind of looking for, at least from a question of programming languages, was how to provide a set of generic routines to allow the library user to provide not only the raw data matrix (array) but also to define the form of the equation.
I settled on the use of lambda functions to provide the equation for each term that corresponds to a coefficient. Kind of a minor frustration, but I sure wish Python and Squeak would get around to implementing block closures.
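For readers who want to see the shape of that design, here is a minimal sketch in present-day Python. It is not the code from the home page: the function name fit, the sample data, and the choice to solve the normal equations by Gaussian elimination with partial pivoting are all assumptions made for illustration. The caller passes the raw data rows plus one lambda per coefficient, and the lambdas define the form of the equation.

```python
def fit(terms, rows, ys):
    """Least-squares coefficients for y ~ sum(c[j] * terms[j](row)).

    `fit` is a hypothetical name for this sketch, not an existing library call.
    """
    # Design matrix: one column per term function, one line per data row.
    design = [[term(row) for term in terms] for row in rows]
    n = len(terms)
    # Normal equations (A^T A) c = A^T y, stored as an augmented matrix.
    aug = [
        [sum(d[i] * d[j] for d in design) for j in range(n)]
        + [sum(d[i] * y for d, y in zip(design, ys))]
        for i in range(n)
    ]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("terms are linearly dependent on this data")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[r][k] -= factor * aug[col][k]
    # Back substitution on the upper-triangular system.
    coeffs = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(aug[i][k] * coeffs[k] for k in range(i + 1, n))
        coeffs[i] = (aug[i][n] - tail) / aug[i][i]
    return coeffs


# Model y = c0 + c1*x1 + c2*x2: one lambda per coefficient.
terms = [lambda x: 1.0, lambda x: x[0], lambda x: x[1]]
# Made-up sample data generated from y = 1 + 2*x1 - 3*x2.
rows = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ys = [1.0 + 2.0 * a - 3.0 * b for a, b in rows]
coeffs = fit(terms, rows, ys)  # close to [1.0, 2.0, -3.0], up to rounding
```

Changing the model, say to add a squared or cross term, only means changing the list of lambdas; the solver itself does not need to know the form of the equation. Forming the normal equations is the simplest route but can be poorly conditioned on real data, so treat this as a sketch of the interface rather than a numerically robust solver.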
Will have to try using inner classes for my Java code as suggested. Anyhow, thanks for the links and suggestions. :)


andrew cooke  Re: Multivariate Regression
8/28/2000; 9:26:12 PM (reads: 524, responses: 0)


The problem of making a generic multivariate regression library was kind of fresh on my mind. The thing I was kind of looking for, at least from a question of programming languages, was how to provide a set of generic routines to allow the library user to provide not only the raw data matrix (array) but also to define the form of the equation.
My first programming job was supporting something very similar written in C. There were (still are, I guess, but thankfully I no longer have to look after it) arrays and arrays of pointers to functions and tables of variables... Nothing like it to make you appreciate closures (and Purify).
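As a rough illustration of why closures help here, the sketch below builds term functions from small factories instead of a table of function pointers plus a parallel table of variables. It uses present-day Python, which supports nested scopes; the helper names power and product are hypothetical and chosen only for this example.

```python
def power(index, exponent):
    """Return a term function that raises one input column to a power."""
    def term(row):
        # index and exponent are captured from the enclosing call,
        # so no separate table of per-term variables is needed.
        return row[index] ** exponent
    return term


def product(*factors):
    """Return a term function that multiplies other term functions."""
    def term(row):
        result = 1.0
        for factor in factors:
            result *= factor(row)
        return result
    return term


# Terms for y = c0 + c1*x1 + c2*x2**2 + c3*x1*x2.
terms = [
    lambda row: 1.0,
    power(0, 1),
    power(1, 2),
    product(power(0, 1), power(1, 1)),
]

# One line of the design matrix for the data point x1 = 2, x2 = 3.
design_row = [term((2.0, 3.0)) for term in terms]  # [1.0, 2.0, 9.0, 6.0]
```

Each term carries its own parameters with it, which is the bookkeeping the C version had to do by hand.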



