Lambda the Ultimate

inactiveTopic Pre-processing (or composition of functions)
started 3/26/2002; 11:23:11 PM - last post 3/27/2002; 1:17:58 PM
Keith Devens - Pre-processing (or composition of functions)  blueArrow
3/26/2002; 11:23:11 PM (reads: 373, responses: 4)
How do we know if it is easier to perform an operation on some data by splitting it into two simpler operations that do the same thing as the original operation you wanted to perform? So when f(x) = (g o h)(x), how do we know when we can benefit by using (g o h)(x) instead of f(x)?

As a perfect example, it's less expensive computationally to sort and then search an unsorted list than it is to search the original unsorted list.

The reason I ask is, I'm looking again at the Pyxie XML processing library. I'm wondering if it's simpler to convert XML to PYX and then process it than it is to process the raw XML in the first place. It's probably simpler *to program* a parser in these two steps, but is it also simpler in a theoretical sense? Will it take less time computationally?

The reason I'm asking the question is a mundane one, but I'm mostly interested in the theoretical aspects of this. How can we study this phenomenon? How do we look for it?

Here's the "original":

Ehud Lamm - Re: Pre-processing (or composition of functions)  blueArrow
3/27/2002; 7:42:19 AM (reads: 381, responses: 0)
Why do you think there should be a general answer to this question?

Chris Rathman - Re: Pre-processing (or composition of functions)  blueArrow
3/27/2002; 8:19:52 AM (reads: 380, responses: 2)
In many instances, the breakdown is faster simply because you wind up calling the same function multiple times. If you do a pre-sort (i.e. break it down into two functions), then the sort only has to be done once. Otherwise each step requires a sort by itself, thus significantly adding overhead.

Keith Devens - Re: Pre-processing (or composition of functions)  blueArrow
3/27/2002; 12:09:40 PM (reads: 413, responses: 1)
Ehud: maybe there isn't any kind of general answer, but science usually progresses when people find commonality where they didn't know there was commonality before. Just looking for some :) I figured maybe there was theory about this I didn't know about.

Chris: while that's true, that's not exactly what I'm talking about. To bring up the same sorting and searching example: The reason sorting and then searching a list is faster isn't just because the sorting is only done once instead of many times. If you sorted once and did the same kind of search you'd have done on an unsorted list (a sequential search), the sorting would actually waste time :)

The reason that sorting and searching is faster is because it enables a completely different type of search algorithm (binary over sequential). I know everyone knows this, so I'm not trying to be pedantic :) Just trying to illustrate the type of improvement I'd be looking for.

Ehud Lamm - Re: Pre-processing (or composition of functions)  blueArrow
3/27/2002; 1:17:58 PM (reads: 446, responses: 0)
Well, commomn wisdom has it that it possible to reduce the cost of accessing a data structure, by spending more effort when building the structure. Classic example: maintaining a balanced binary tree. This claim as stated is, of course, not formally provable. It also doesn't tell you whether and how this technique can be applied in any particular case.

You may find related ideas by looking up references to memoization and amortized bounds (you may find interesting material by looking under online algorithms).

I am not sure any of these really answer your question, but if I understand you correctly, these ideas stem from the same source of inspiration.

(To connect this to programming languages: Okasaki's PFDS deals extensively with these topics).