Thursday, September 25, 2008

a scary-deep idea...

I stumbled across this very strange idea the other day while talking to a friend about science and religion... it's related to the post I made earlier about quantum observers and straightening out what it actually means to be an "observer" creating reality.

Assuming that "observers" can create reality (this is what the SEED article suggests has been shown experimentally over and over again), I raised one of my biggest conceptual problems with the idea: given multiple observers, how do we all "agree" to create the same reality?  If we don't create the same (or at least similar) reality, then how can we communicate about it at all?  More to the point, how come the observers seem unable to shape reality by their will?

She took up this last point by referring to a modern theologian's view that we also can't walk on water because every cell in our body would have to believe it was true in order to succeed...

At first I thought how silly, but then maybe there's something to that... what if it wasn't just every cell in your body, but every cell, every rock, everything?  I thought specifically of something I read of Feynman's talking about another aspect of quantum physics: that there is really no separation, no distinction at that level between particles, rocks, plants, people... stars... it's all part of the same oneness.

Then I thought of something really strange... wouldn't it be funny if all the experiments and the maths weren't wrong, just our narrow interpretations of the meaning?  What if everything in quantum physics was pointing to the idea that there was and had only ever been One Observer infinitely observing Himself unfold?

...I paused and thought of the imagery of Buddhism, lotus petals and Mandelbrot sets infinitely shimmering in space and time...

Science can only take you so far, then you must dream...

Wednesday, September 24, 2008

history, math and misery...

Oh, so let me tell you about my history class... I had a weird revelation today.  As we were going through all these philosophers' and statisticians' views on the industrial revolution, I began to think, "wow, all these horrible living conditions, etc. were the result of really poor models of business."  The curious thing is that many of them were (and still are) mathematical models!

For example, with tons of surplus labor, the early industrial businessman seemed able to approximate labor as (people x hours) = value.  As each person cost money in wages, they desired to keep the headcount low while simply scaling the hours to 12 or 14 per day (children too).  As time went on, they slowly realized that worker efficiency diminished past a certain point as hours increased... so, "counter-intuitive" to the earlier model, they had to factor in shorter work days with more breaks.
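
To make that concrete, here's a minimal sketch in Python (all numbers invented) contrasting the naive linear model with a toy diminishing-returns version:

    def naive_value(people, hours):
        # the early model: every hour counts the same
        return people * hours

    def fatigued_value(people, hours):
        # hypothetical decay: each hour past the 8th is 20% less
        # productive than the hour before it
        output, rate = min(hours, 8), 1.0
        for _ in range(max(hours - 8, 0)):
            rate *= 0.8
            output += rate
        return people * output

    for h in (8, 10, 12, 14):
        print(h, naive_value(100, h), fatigued_value(100, h))
    # the naive model climbs linearly (800, 1000, 1200, 1400); the
    # fatigued one flattens out (800, 944, ~1036, ~1095)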

Another camp believed that you should only pay workers subsistence wages -- because if given more, they would spend it on "recreation" or "procreation" (ha!).  Population was viewed as a fundamental problem because food production was held to rise arithmetically, while unconstrained population grew geometrically (the simple models of the time, following Malthus) -- hence the ideal was to keep the population low by keeping the poor working so hard they had no time for anything else!
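
The panic is easy to reproduce with a toy version of that model (starting numbers invented): linear food growth simply cannot keep pace with exponential population growth.

    food, population = 10, 1
    for gen in range(8):
        print(f"gen {gen}: food {food:3d}  population {population:3d}")
        food += 1          # arithmetic growth: 10, 11, 12, ...
        population *= 2    # geometric growth: 1, 2, 4, 8, ...
    # population overtakes food by generation 4 here; an exponential
    # curve always overtakes a linear one eventually, whatever the start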

Unbelievable!  

But something about all this suddenly struck me as a set of problems that has been undergoing successive refinement over the generations.  For example, the Nash equilibrium is a drastic improvement on Adam Smith that more closely approximates investor behavior... and in so doing, it shaped a whole new generation of economists.

Business always looks so random at the low level, but from high up there is a definite pattern of improvement after horrible costs and destruction -- I hate to think it, but historically speaking, perhaps pain is the best teacher after all?  It's startling to think of this like some big genetic algorithm slowly crawling toward the optimal ways to run business and increase production.  Still, it's also very exciting, because it means the better the mathematical models, perhaps the less suffering we need before we learn...

Through better math, perhaps we can save the world.

Thursday, September 18, 2008

daydreams on a strange loop...

I came up with these fanciful meanderings after reading Gödel, Escher, Bach by Hofstadter a few years ago... I was thinking about it again today so I dug it out of some old linear algebra class notes.  It's fairly crazy in the details, but the "dance" of it is fun... don't take it too seriously, unless you take it as a warning of the dangers of sleep deprivation mixed with studies in linear algebra...

-lk



The fundamental problem that Gödel's Incompleteness Theorem describes is that any nontrivial formal system (one rich enough to encode basic arithmetic) is either incomplete or inconsistent.  However, a super-system may be built around the first system that does completely and consistently describe it!  The trick is, you have to go "outside" the system you want to describe in order to fully describe it (Hofstadter calls this a "strange loop").

The reason this isn't a terribly useful solution is that the super-system itself can be shown to be incomplete or inconsistent.  In fact, no matter how many systems you layer on top of each other to fix things, there is always one outermost layer that suffers from this problem.

So, combining two other things I read about (Gödel numbering and Hilbert spaces):
What if it were possible to somehow compose an infinite number of formal systems such that they converge on a finite system?  Could the outer "error" (required to describe the inner systems completely) diminish relative to the total in a manner analogous to a limit?  Could the limit then approach a system that is both complete and consistent in spite of Gödel's incompleteness theorem?
(ED: this suddenly reminded me of a recent article I read about Negative Databases for some reason...)

Gödel numbering allows any formula or theory in a formal system to be encoded as a unique natural number.  So, it follows that there must exist a set of Gödel numbers that correspond to the math of Hilbert spaces.
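
As a toy illustration of the encoding (the symbol table here is invented, and real Gödel numberings cover much richer alphabets), a formula becomes a single integer via prime powers:

    def primes():
        # yield 2, 3, 5, ... by trial division (fine for short formulas)
        n, found = 2, []
        while True:
            if all(n % p for p in found):
                found.append(n)
                yield n
            n += 1

    SYMBOLS = {"0": 1, "S": 2, "+": 3, "=": 4}   # a toy alphabet

    def godel_number(formula):
        # encode the i-th symbol as the i-th prime raised to its code;
        # unique factorization makes the encoding reversible
        g = 1
        for sym, p in zip(formula, primes()):
            g *= p ** SYMBOLS[sym]
        return g

    print(godel_number("S0=S0"))   # 2^2 * 3^1 * 5^4 * 7^2 * 11^1 = 4042500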

Hilbert spaces are very interesting, because even though points in such spaces can have an infinite number of coordinates, the angles and distances measured between any such points are finite!  Weird huh?  (Common applications are quantum physics, signal processing, Fourier transforms, etc.)
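
For instance, here's a minimal sketch of how an infinite list of coordinates can still have a finite length: take the point x = (1, 1/2, 1/3, ...) in the classic sequence space l2.

    import math

    def partial_norm(n_coords):
        # length of x = (1, 1/2, 1/3, ...) using its first n coordinates
        return math.sqrt(sum(1 / k**2 for k in range(1, n_coords + 1)))

    for n in (10, 1000, 100000):
        print(n, partial_norm(n))            # creeps upward...
    print("limit:", math.pi / math.sqrt(6))  # ...toward pi/sqrt(6), ~1.2825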

(ED: heh, imagining extra dimensions reminds me of that great short story "The Dreams in the Witch House" by H.P. Lovecraft... creepy!)

Since points in these spaces are vectors with an infinite number of coordinates, it seems reasonable that each point vector in this space could be interpreted as a Gödel numbering representing a potential formal system.

Since there are an infinite number of points, there must be an infinite number of formal systems being represented... yet it's possible to define a line segment of finite length from the origin of this space to any coordinate point, given an angle and a distance (using polar coordinates).

A line represents an infinite set of points... does this imply that a finite Gödel numbering can somehow represent an infinite set of formal systems?

(ED: woah woah woah Tex, this is quite a leap...)

This raises even more questions:
What's the representation of the differential term used in the limit?
Does the differential actually converge towards a finite value?

(ED: it just kind of ends here... I probably wised up by this point.  Sorry, no breakthroughs!)


Saturday, September 13, 2008

No REST for Web Apps...

[update 9/20/08: actually, the O'Reilly book "RESTful Web Services" does a great job of distinguishing three categories of services: REST-ful resource-oriented, RPC, and REST-RPC hybrids.  The last category is very carefully considered.  The other thing that I really like about this book is the author's "whale-fish" simile for telling superficial details (like the technologies used) apart from the architectural underpinnings.

My initial frame of mind when I started this post was based on several sites I've seen promoting REST-ful apps that, to my mind, are clearly making the "whale-fish" mistakes he talks about.  They simply haven't grasped the underlying architecture of REST well enough to tell the difference.
 
At the time I was worried that this was a general trend.  However, the O'Reilly book shows that the leaders of this movement know the limitations and are applying REST only to the places where it makes sense.

I highly recommend getting this book.]




Yesterday, a simple assertion I made to my friend (a Ruby programmer) turned into a vigorous debate:

It is impossible to write a web application without using a stateful connection of some sort.

It's an interesting debate. I think there's probably a difference between what he is calling "applications" and what I'm calling applications, but let's tackle it by looking at a quick history of the web and why REST was so successful for web 1.0.



Pre-REST

Before REST, users had to connect to remote systems through a terminal session or a special client.  This was OK, except that remote systems could only support a (relatively) small number of simultaneous connections before they reached capacity.  Even worse, the connections sat idle most of the time, since users are generally slower than computers.

So an interesting idea was to make the systems "stateless" (i.e. sessionless) by having servers handle only one request per connection instead of many.  This solved the problem of idle connections and allowed a single server to handle thousands of requests from separate users.  For static[1] web documents this was a brilliant optimization, and thus the World Wide Web was born.
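
A minimal sketch in Python of that one-request-per-connection exchange, assuming plain HTTP/1.0 (example.com stands in for any host):

    import socket

    def fetch(host, path):
        with socket.create_connection((host, 80)) as s:
            # HTTP/1.0: the server answers this one request, then hangs up
            s.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
            chunks = []
            while data := s.recv(4096):   # read until the server closes
                chunks.append(data)
        return b"".join(chunks)

    # every fetch is a brand-new connection; the server remembers nothing
    print(fetch("example.com", "/")[:80])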


Cookies

Of course, almost as quickly as the Web became popular, people at Netscape wanted to build dynamic web documents -- documents that could change their content depending on who was viewing them. Their first web application was an online "shopping cart" that could support:
  1. browsing a catalog
  2. adding an item to a virtual shopping cart
  3. checking out and paying for the item
But they had one small problem: REST.

With REST, each request is completely independent of any other request the user has made.  It's like going to a market where they have magical shopping carts: every time you put something in the cart or try to pay for it, it instantly disappears!  Clearly this was unacceptable, so RFC 2109 was born: a method of creating a stateful session using cookies.
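
Here's a minimal sketch of the cookie trick, assuming a naive in-memory store; the names SESSION_STORE and sid are invented for illustration:

    from http.cookies import SimpleCookie
    import uuid

    SESSION_STORE = {}   # sid -> this user's cart, kept on the server

    def handle_request(cookie_header):
        # return (Set-Cookie header, this user's cart)
        cookie = SimpleCookie(cookie_header)
        sid = cookie["sid"].value if "sid" in cookie else None
        if sid not in SESSION_STORE:          # first visit: mint a session
            sid = uuid.uuid4().hex
            SESSION_STORE[sid] = []
        out = SimpleCookie()
        out["sid"] = sid                      # browser echoes this back later
        return out.output(), SESSION_STORE[sid]

    # two "requests" from the same browser now share one cart:
    set_cookie, cart = handle_request("")
    cart.append("Big_Trouble_in_Little_China")
    _, cart_again = handle_request(set_cookie.split(": ", 1)[1])
    print(cart_again)   # the item survived across stateless requests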

However, with cookie-based sessions came performance and scalability issues.  Web applications and infrastructure unavoidably became slower and more complicated than the static web had been.


Secure Sessions

Almost as soon as the first shopping-cart web applications were deployed, people started becoming concerned about the security of their transactions.  The next logical step was proposed in RFC 2818.  This time the security concerns required a new protocol (SSL/TLS) that was layered underneath HTTP (i.e. HTTP over SSL, or HTTPS for short).

However, this didn't save web developers from the complexity of managing session state; it only exacerbated it.  Now we had a stateful application (using cookies) on top of a stateless protocol (HTTP) on top of a completely separate stateful session protocol (SSL).  This added so much complexity to web applications that only the most demanding and wealthy clients (i.e. banking and commerce) could afford to develop and maintain them.

Most web architects avoid SSL intentionally, citing poor performance, limited scalability, and the complexity of management and deployment as key problems.


The Search for Simplicity

In the 1990's (the golden age of the early web) we didn't care about such problems, because the technology was new and exciting, the code was small, the problems were mere annoyances... and basically everything worked. However, by the end of the 90's, systems had become complex enough that developers started to rebel.


Web-fundamentalism?

One of the directions we rebelled in is a kind of web-fundamentalism (a return to basics): all this "state-management" was a bad idea in the first place, so we should return to the REST-ful principles that worked so well the first time.  But what principles are we talking about?  URL-spaces?  Do people really understand which aspects of REST made web 1.0 successful?

As I asked these questions, I realized that most of the alternative[2] proposals were simply managing session state with different technologies -- instead of using cookies they'd use GET params, GET url-spaces or POST fields; instead of webserver memory, they'd use databases.  They hadn't really changed anything.

Session management by any other name is still session management, and it is fundamentally incompatible with the claims and assumptions of REST -- chief among them the idea that such applications are still scalable and can support caching.

Let's explore this idea a little more since there is such resistance to it. Say I have a web application that allows me to rent videos... you might expect such an application to have a REST-ful design with the following types of urls:

http://videostore/signup
http://videostore/customers/larry/rented
http://videostore/customers/larry/cart/pay
http://videostore/customers/larry/cancel_membership
http://videostore/customers/larry/bill/09-17-2008
http://videostore/customers/larry/overdue/Big_Trouble_in_Little_China
http://videostore/videos/Harry_Potter/add_to_cart

On the surface, this looks beautiful... it looks REST-ful. But let's dig into some of those assumptions:

1) Below the pretty urls is state-management.  Suppose the urls are all being served dynamically from a database.  That means the web site has to read my information from the database and display it to me (but no one else) when I access a url below "/customers/larry".  One article I read said the REST-ful answer to this is basic web authentication.  But that just reintroduces per-user state under another name -- surprise! your session's back!

2) None of this is even remotely cacheable.  When customers cancel or sign up, the url space changes structure.  But how can I cache a url-space that is constantly changing?  The simple answer is I can't/shouldn't.  The "horribly-complicated-billions-of-wasted-dollars" answer is: sure, you can invest in configurable (or worse yet, "adaptive") caching at multiple levels and spend the rest of your earthly days debugging it.  I can hear the sound of a thousand web developers slitting their wrists even thinking about this... gee thanks.

3) One thing I read is that REST-ful web services are easier to deploy than XML services.  It sure looks that way from the service author's point of view initially, but then some critical questions came to mind: how do you know the difference between the verbs and the nouns?  I don't want the user to accidentally cancel their membership just by browsing the site.

Some people say that we should use the other two common HTTP verbs instead (PUT and DELETE).  But how do you know that the client has PUT the correct format when you don't have any structural form that can be validated?  Does the client just have to guess until they get it right?
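
To make the guessing problem concrete, here's a minimal sketch of a PUT handler whose expected shape lives only in server code (the field names are invented); the client has no machine-readable schema to consult:

    import json

    EXPECTED_FIELDS = {"title": str, "days": int}   # known only to the server

    def handle_put(body):
        try:
            doc = json.loads(body)
        except ValueError:
            return 400, "not JSON"
        for field, typ in EXPECTED_FIELDS.items():
            if not isinstance(doc.get(field), typ):
                return 400, "bad or missing field: " + field
        return 200, "rental updated"

    # the client can only learn the format by trial and error:
    print(handle_put(b'{"title": "Harry_Potter"}'))             # (400, ...)
    print(handle_put(b'{"title": "Harry_Potter", "days": 3}'))  # (200, ...)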

I suppose the way out of that problem is to use some fancy AJAX and maybe some JSON object definitions, but now we're headed back into custom serialization territory. Haven't we been here before?  Isn't this why people stopped using SOAP in favor of simple XML and why people stopped using XML in favor of simple JSON?  Sooner or later features get added and simple isn't so simple.  

Einstein had an opinion about simplicity: "make things as simple as possible, but no simpler."  Do we really need to learn this lesson the hard way, repeatedly wasting billions of dollars on new technology cycles that keep resolving essentially the same problems?


Anyway, because #1 and #2 basically break all the claims that REST makes about scalability and performance, and #3 points out that it's not easier, I think the perceived gains from simply applying a REST-like architecture to web applications are mostly fallacious.  Certain kinds of web services, maybe, but never applications.

Fortunately, a handful of other people were already realizing this back in 2004. 



State-management

The other direction we can go -- and should go -- is to finally accept "stateful" web applications as a given.  Google Chrome, Adobe Flex and Microsoft Silverlight are all moving in this direction.  REST has its place, but so do stateful applications.  It's time to recognize the architectural pros and cons of each and use the right tool for the right job.

In some sense, this is all obvious -- it's just hard to see it because the layers of technology have always clouded the argument considerably.



[1] It's interesting to note that Fielding, the author of REST, originally thought in the context of information and media access -- he did not think of contrasting REST with RPC, a stateful technique. (from http://en.wikipedia.org/wiki/Representational_State_Transfer#Claimed_benefits)

[2] There is one notable approach that covers a certain subset of web services in a purely REST way.  However, this is not an "application" in the traditional sense; it's limited to "lookup"-style services.

There is a very simple litmus test to see if your service is in this subset: could you also implement the service (less efficiently) using ONLY static HTML pages?  If the answer is yes, then it qualifies.


Thursday, September 11, 2008

lrn2patch: an open letter to Apple and updaters everywhere...

Look guys, I know you're probably busy creating the "next big thing" over there in Cupertino, but can you have one or two guys research the art of deploying software via patch files instead of making us download an entirely new 80MB image every time you deploy?
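
The core idea isn't exotic.  Here's a minimal sketch of rsync-style block deltas (block size and names invented) -- ship only the blocks that changed instead of the whole image:

    import hashlib

    BLOCK = 4096

    def block_hashes(data):
        return [hashlib.sha256(data[i:i + BLOCK]).digest()
                for i in range(0, len(data), BLOCK)]

    def make_patch(old, new):
        # (index, bytes) for each block of the new image that differs
        old_h = block_hashes(old)
        patch = []
        for idx, i in enumerate(range(0, len(new), BLOCK)):
            chunk = new[i:i + BLOCK]
            if idx >= len(old_h) or hashlib.sha256(chunk).digest() != old_h[idx]:
                patch.append((idx, chunk))
        return patch

    def apply_patch(old, patch, new_len):
        # rebuild the new image from unchanged old blocks + patched blocks
        changed = dict(patch)
        n_blocks = (new_len + BLOCK - 1) // BLOCK
        out = b"".join(changed.get(i, old[i * BLOCK:(i + 1) * BLOCK])
                       for i in range(n_blocks))
        return out[:new_len]

    old = b"A" * 100_000
    new = old[:50_000] + b"B" * 4096 + old[54_096:]
    patch = make_patch(old, new)
    assert apply_patch(old, patch, len(new)) == new
    print(len(patch), "of", len(new) // BLOCK + 1, "blocks shipped")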

In fact, let's make this a global rant against the "Check Updates..." installers out there: 
  1. please realize that you are using OUR computers when you design and architect these systems. (i.e. the rude neighbor: "oh I'm sorry, were you using that?")
  2. please realize that other pieces of software may be trying to update at or near the same time. (i.e. the flood: "Me me me..." "Me too.")
  3. please realize that liberal use of "restart" semantics to complete installations may be magnified by the above. (i.e. the tantrum: "stop everything and do what I want... NOW!")
  4. [added 9/25] Please. Please make your updaters run at IDLE priority... there's no reason that a system update should interrupt the user or lock up the system while the update is occurring.
I realize that Windows forces #3 more than most devs can avoid (Mac less so, and Linux is almost impervious to restarts from installs).  But aside from those exceptions, you should be treating our computers with respect.  You don't own them or the networks that connect them.  You barely own the licenses of the software on them.

Please take some time to respect your users.  Thanks.

Wednesday, September 3, 2008

google's chrome is hot

This is my first blog posting from google's chrome.  There's been a lot of chatter about whether this is a good idea, so here is my two cents:

Q: Firefox is already a great browser AND it's open source.  Why didn't Google just contribute its changes to that browser?

A: Chrome isn't simply another Mozilla fork; it has new low-level system features and a new architecture.  Firefox is a great browser, and it has pushed Web 2.0 far enough to expose some serious problems in the underlying design of Web 2.0 and the browsers that support it.  Changing Firefox just isn't practical when there are this many changes... it's better to start over.

Q: "Separate Processes" per tab?  Why is that necessary?

A: The browser and web-based applications are hostile territory.  You can't trust that application developers are going to write friendly, clean, compatible code, so don't trust them.  This sounds counter-intuitive until you realize that all the rock-solid architectures do this: IP works because it fundamentally assumes that communication failure is NORMAL.  Likewise, modern operating systems believe in process isolation.  It's a "good thing"(tm) that Google is now doing the same for the browser.
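
A minimal sketch of the payoff, using OS processes as stand-ins for tabs (the site names are invented):

    import multiprocessing as mp
    import os

    def tab(page):
        if page == "buggy-site.example":
            os._exit(1)   # simulate a renderer crashing hard
        print(page, "rendered fine in pid", os.getpid())

    if __name__ == "__main__":
        pages = ["news.example", "buggy-site.example", "mail.example"]
        tabs = [mp.Process(target=tab, args=(p,)) for p in pages]
        for t in tabs:
            t.start()
        for t in tabs:
            t.join()
        # the crash is contained: one tab dies, the "browser" survives
        for p, t in zip(pages, tabs):
            print(p, "exit code:", t.exitcode)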

Q: Do we really need another browser?

A: Yes.  Application developers demand more of the web.  We should have a fully integrated debugging stack.  We should have protection against rogue or errant processes.  We should give the user visibility into what happens on their machine.  The current browser architectures address none of this.  The closest extension out there is Firebug (and Firecookie)... but this is only a start.  There are limits to what you can do within a poor javascript environment -- Google knows this because they are also web developers: our pain has been their pain as well.  So yes, it's absolutely necessary that someone with the resources step up and help solve this mess.

By the way, people have said the same thing about Opera for a long time, but the truth is that Opera pushed both IE and Firefox towards better CSS 2.0 compliance when it beat them badly on the Acid2 test.  Likewise, there are serious problems with current browser architectures that have not been addressed, and Google is now addressing them.

More competition is never a bad thing because it motivates exactly these kinds of changes.

Q: So great, now I have to design for another browser as well?

A: Unfortunately, that particular problem isn't going away anytime soon -- we already have at least three major browsers (IE, Firefox and Safari).  The truth is that a certain segment will always push the latest browsers, and gradually the rest will either move towards compatibility or lose market share as those sites become popular.

Q: So why use it before it becomes popular?

A: Stability.  With blogs, online banking, web mail, intranet portals -- it becomes very important that the browser not crash in the middle of edits.  If browsers were stable this wouldn't be nearly as compelling.  We rely on a nightmare of javascript to run our sites now -- shouldn't that be based on a stable VM and a multiprocess model in line with modern operating systems, not 1980's technology?