The company I work for, StreamBase, has released our product, also called StreamBase, for free download. If you're interested in seeing what I've been working on for the past 2.5 years, now's your chance to find out first hand. You're required to register to get to the download, but don't let that stop you -- you can use a bogus email address since there's no verification.
I'll be writing more about StreamBase over the coming months, especially once we've released the first public implementation of our textual query language, StreamSQL. The streamsql.org site is still pretty empty, but you can see a "preview" of StreamSQL if you download the product.
If you do try it out, please let me know what you think.
Posted on July 22, 2006 03:47 PM
So what exactly is it good for? I've kind of managed to figure out that it's something that can query data that varies over time. But what Cool Thing can I make if I download this tool?
Posted by: crzwdjk at July 24, 2006 02:17 AM
The product is targeted at processing continuous data feeds. It has three real strengths: it's relatively easy to write custom application logic, it has very low latency (sub-millisecond), and it handles very high data volumes. Any one of these might be a sufficient reason to use the product.
Here are some example application domains:
Maybe you want to monitor IP packets so that you can detect denial of service attacks, and trigger some automatic response. Because the product has high-level operations over continuous data feeds, it's comparatively easy to build your own detection logic. The ability to handle large data volumes is also important here.
Maybe you have access to a stock feed, and you want to watch for changes in stock prices, and automatically trigger the buying or selling of stocks based on various criteria. For this domain, low latency is very important.
Maybe you have access to the clickstream for a large website, and you want to watch for pages that are starting to spike in popularity. You could use this information for anything from ad selection to optimizing a load balancing strategy.
These are some of the examples that we usually spout. But as you've probably noticed, they're all pretty much uninteresting unless you're a large company.
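To make the clickstream and packet-monitoring examples a bit more concrete, here is a rough sketch of the underlying idea in plain Python: count events per key over a sliding time window and flag a key once its count passes a threshold. This is not StreamBase or StreamSQL code. The class name, the parameters, and the assumption that timestamps arrive in non-decreasing order are all made up for illustration.

```python
from collections import defaultdict, deque


class SlidingWindowCounter:
    """Counts events per key over the last `window_seconds` seconds.

    Hypothetical helper for illustration only; assumes timestamps
    (in seconds) arrive in non-decreasing order for each key.
    """

    def __init__(self, window_seconds, threshold):
        self.window_seconds = window_seconds
        self.threshold = threshold
        # key -> timestamps of events still inside the window
        self.events = defaultdict(deque)

    def observe(self, key, timestamp):
        """Record one event; return True if the key is now over the threshold."""
        times = self.events[key]
        times.append(timestamp)
        # Drop timestamps that have fallen out of the window.
        while times and times[0] <= timestamp - self.window_seconds:
            times.popleft()
        return len(times) > self.threshold
```

Here the key could be a page URL (for spotting a popularity spike) or a source IP address (for spotting a possible flood), and a True result is where an automatic response would be triggered.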
I keep thinking that I should get around to writing an RSS adapter, and point it at various blogs or blog aggregator sites. Then I could write my own logic for expressing what posts I'm interested in. The output could also be expressed as an RSS feed, so nobody would have to know that it's written in StreamBase.
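As a rough idea of what that filtering logic might look like outside of StreamBase, here is a minimal Python sketch that takes an RSS 2.0 document as a string, keeps only the items whose title or description mentions one of a set of keywords, and returns the result as RSS again. The function name and the keyword-matching rule are assumptions for illustration, and fetching or polling the feeds is left out.

```python
import xml.etree.ElementTree as ET


def filter_rss(rss_xml, keywords):
    """Return an RSS 2.0 document containing only items that mention a keyword.

    Hypothetical helper: assumes `rss_xml` is a well-formed RSS 2.0 string
    with a single <channel> element.
    """
    root = ET.fromstring(rss_xml)
    channel = root.find("channel")
    wanted = [k.lower() for k in keywords]
    # Iterate over a copy of the item list, since items are removed in the loop.
    for item in list(channel.findall("item")):
        text = " ".join(
            (item.findtext(tag) or "") for tag in ("title", "description")
        ).lower()
        if not any(k in text for k in wanted):
            channel.remove(item)
    return ET.tostring(root, encoding="unicode")
```

Because the output is itself an RSS document, any ordinary feed reader could subscribe to it without knowing how it was produced.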
This seems like one of those things that is useful for processing huge volumes of data, which is something that generally only large companies have, at least for now. But perhaps that's because only large companies have had the resources to deal with such large volumes of data. Now with things like the web, anyone can get billions and billions of pages, and with this StreamBase business maybe even make some sense of it all.
Posted by: crzwdjk at July 26, 2006 10:01 PM
crzwdjk, I think it's applicable to more than just very large data streams. For example, you could classify databases as only being useful for "storing huge volumes of data", but these days even small-time projects use relational databases just to avoid having to write a custom data persistence and querying layer. No matter how much data you need to store, it's nice to have features like count(*), group by, joins, etc. right out of the box.
I think that 10 years from now, it's not unreasonable to expect that developers will have a standard tool for dealing with events and with continually changing data, as opposed to just storing unchanging records the way an RDBMS does. I can imagine such a tool eventually turning into a common platform for building applications like instant messaging, RSS feeds, syslogs, online games, email, etc.
The advantage of having a common platform is that over time this kind of product will evolve to where it provides all the usual enterprise software goodness, like a high-level query language, seamless scalability, fault tolerance, administrative tools, etc. There will be simple open source implementations, as well as really expensive enterprise implementations -- just like any other category of systems software.
Posted by: Kim at July 26, 2006 10:38 PM