Suppose I came to you and said, "I want a mathematical approach for working with with data in a continuous multidimensional space". Hopefully you'd point me at linear algebra.
Suppose I came to you and said, "I want a mathematical approach for working with data that's connected via arbitrary relationships". Hopefully you'd point me at predicate logic, maybe graph theory, or maybe even category theory (I only vaguely understand category theory, so I may be wrong there).
So now I come to you and say, "I want a mathematical approach for working with data that's connected by arbitrary continuous relationships". What theory would you point me at?
By continuous, I mean that it's possible to smoothly interpolate between any two relations. This necessarily requires that relations themselves aren't strictly binary. It should also be possible to build either hierarchical or meta relationships, ala relational calculus or higher order predicate logic. Ideally, relationships would themselves be objects (i.e. the theory should be impredicative).
Above all, the theory should be quantitative rather than symbolic (I believe that mostly rules out category theory, or any theory based on logic). It doesn't necessarily have to be computationally tractable, though.
Is there a standard mathematical framework for this kind of thing? I think I'd recognize it if I saw it through the right lens, but I have the feeling that my qualitative descriptions of what the theory should look like are bordering on meaningless.
I've been obsessed with this idea for the past few weeks. It would be nice if someone had already done the hard work of exploring it.
Posted on November 20, 2006 01:59 PM
More projects articles
Probability theory?
Posted by: dave glasser at November 20, 2006 03:40 PMHow can I represent hierarchy with probability distributions?
Posted by: Kim at November 20, 2006 07:41 PMCategory theory is actually one level above linear algebra, logic, graph theory and so on. You can think of it as abstracting away the differences between them, so you can talk about mathematical concepts independently of any concrete theory that implements them. (In that sense, I think of category theory as the mathematical equivalent of design patterns.)
Whatever this theory is, you'll be able to study it in the language of category theory.
Posted by: Pseudonym at November 20, 2006 09:21 PMCould you give a concrete example of smoothly interpolating between two relations? Sounds interesting...
Posted by: Chris at November 21, 2006 01:01 AMChris, that's the problem, I'm not sure what I'm talking about here.
I'm somewhat reluctant to describe my real motivation, because I'm afraid it will bias the answer, but here's a clue. I want to start with a defined universe of data points. Then I want to read in a sequence of relationships between those data points. I want to "summarize" those relationships in order to produce a final relationship whose representation is sublinear with respect to the size of the input (constant would be best, but logarithmic would probably be okay). An analogy to what I'm thinking would be if data points were multidimensional coordinates, and relationships were matrices. Then I could sum up relationships just by multiplying matrices -- the size of the final matrix would then be independent of the number of original matrices. The only problem is that matrices are too simple -- they can't represent structure at all. I also need the original relationships to be somewhat like probability distributions, in that they are capable of representing continuous shades of grey. Dave was on the right track by suggesting probability theory, but I don't know how to make probability distributions represent hierarchy/structure/meta.
So here's one idea that's somewhat more concrete. You're given a set of data points, each of which is defined as a probability distribution over all the other data points (and maybe even over itself as well). Now you can augment the domain by introducing even more probability distributions (new data points), which range over the original data points plus all the newly introduced distributions. Modify the original data points so that they too can range over these newly introduced points. Take that idea to the limit, so that you have an infinite number of probability distributions, all of which range over each other. Now you have some way of talking about continuous relations (they're just probability mass functions), and there's some structure (since each point is itself a pmf).
Here's a different idea. Maybe what I want is to represent data points as coordinates in a non-cartesian space. Maybe I'm given a set of quantitative relationships between points, and what I want to do is to discover the structure of the space that those relationships describe. This might be a variation on topology. Or it might be some version of principal component analysis for warped space.
What I want is a mathematical theory that does something vaguely like this in a principled way. What I especially want is to find out that there are a few decades of work behind the theory already, so that I can reuse its insights.
This makes me think of neural nets. You can represent relations as a layer of nodes with weighted links from the data points. The value of each data point is multiplied by the weight of a link; the links are summed for each node, and a "sigmoid function" is applied, mapping (-inf->inf) to [-1,+1], representing a classification, but in a smooth continuous way. A second layer of nodes can then take inputs from the first layer, or you can mix the first and second layers if you don't mind things getting complicated.
At this point your representation is at least N^2; definitely not sublinear. But maybe depending on your application, you can consider your "summary" representation to be made up of only links to higher-order nodes (i.e. only inputs from other nodes).
A neural network layer can be construed as a matrix multiplication; if your data points are scalars, and the collection of them is a vector, then each node influenced by them is a dot product with a set of link weights; and a group of those nodes is a matrix multiplication resulting in a new vector (representing those nodes' levels of activation). Two layers have more computational/classificational power than a simple matrix multiplication because of that sigmoid function applied in between. Without that, two layers could be reduced to one with no loss in expressive power.
Neural nets feel more like engineering than math to me, but your discussion of matrices kind of brought it to mind, so I thought I'd mention it in case it sparks some useful insight for you...
Posted by: Chris at November 21, 2006 09:46 AMDifferential geometry.
Posted by: Tobin Fricke at November 21, 2006 06:43 PMYou may want to check out network of network models, though I think you'd become quickly dissatisfied. Neural nets are just a special view of linear algebra and probability.
It sounds like you're looking for something that combines analysis (eg, where probability theory comes from) and algebra (the study of simple relationships).
One of my roomates from last year studies something called representation theory. Algebraic/combinatoric topology would be easier to handle, but sounds further removed.
Posted by: leo at November 22, 2006 11:23 AMWhen I described this to Micha Elsner at Brown, he pointed me at Markov Logic Networks. I think that may be the closest I've seen to what I'm looking for, but it still assumes that the set of predicates and their truth values are known a priori. I'm looking for something that would let me derive a maximum likelihood estimate of a relational model, where the relations are hidden parameters.
Posted by: Kim at November 28, 2006 09:59 PMHow about fuzzy logic and fuzzy systems? Basically, you need to hack set theory into something continuous. Another thing you should look at is temporal logic, multiple futures.
Posted by: John Carlson at April 4, 2008 05:45 PMDid you figure out a solution? I'd be interested in knowing more about it.
Posted by: S at April 27, 2008 07:14 PM