Sunday, September 27, 2009

Independent Study: Ripley: Automatically Securing Web 2.0 Applications Through Replicated Execution

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format is similar to that of a review of a paper submitted to a computer science conference. These are already-published papers, so I'll be writing with the benefit of hindsight, especially when the paper was published at least several years ago.)

Submission: Ripley: Automatically Securing Web 2.0 Applications Through Replicated Execution

Please give a brief, 2-3 sentence summary of the main ideas of this paper:

AJAX or Web 2.0 applications provide a richer end-user experience by moving computation into the browser, but the results of those computations are vulnerable to manipulation by well-known tools. Since client-side computations often trigger server-side effects, such as modifying or removing data in a database, it's important for the server to be able to validate the integrity of the client's computations. The authors present RIPLEY, a system for replaying client-side computations on the server to re-establish the integrity guarantee that was lost by moving code to the client.
What is the strength of this paper (1-3 sentences):
The architecture of RIPLEY is sound for preventing the problem the authors identify, and the empirical results of deploying the system are sufficiently promising for internet application providers to consider investigating this approach to the problem.
What is the weakness of this paper (1-3 sentences):
The authors don't provide a compelling case that this problem needs to be solved. In their paper on Spectator, they refer to specific JavaScript worms that have caused damage to Web site operators. Here they don't refer to specific large-scale exploits that their approach solves. This makes the problem they're solving seem less pressing.
It's an interesting solution, but the problem, while real, doesn't seem to have been exploited in any significant way. That said, the fact that the server-side code can anticipate client actions, pre-compute their results, and send them to the client before the client acts, thus increasing client-side responsiveness, is a lovely side benefit, one that should pique the interest of Web site operators making heavy use of client-side JavaScript.
RIPLEY is novel. The authors note that an alternative approach (explored here and here) is to require the client to send to the server a computed proof of the correctness of its current state (e.g. the client could intermittently send a stack trace to the server). Unlike RIPLEY, these approaches only provide a degree of assurance, not a guarantee, of the correctness of the client-side computations.
The architecture, implementation details, and runtime performance results are described in enough detail to be convincing.
Worth Solving
The problem is worth solving, but not as pressing as the problem of JavaScript worms that the authors solve in their work on Spectator.
How confident are you in your evaluation of this paper?
95% ± 2
Detailed Comments

See citations 27, 29 for vulnerabilities to code modification in the browser.

See citation 20 for, more generally, AJAX vulnerabilities and exploits.

Volta is a distributing compiler, in some respects similar to the Google Web Toolkit (GWT), which generates JavaScript code from Java. The idea of GWT is to let the user write browser-side code, normally dynamically-typed JavaScript, in a statically-typed language -- the GWT compiler can reason about the statically-typed code and emit JavaScript that's known to be safe. While Volta appears to be designed to work more generally than GWT -- generating code for multiple target languages (not just JavaScript) via the Microsoft Common Language Runtime intermediate representation, for example -- the basic idea is similar. One feature of Volta is that because it can reason about the statically-typed program, it can divide the computations in that program between the server and the client.

A developer who wishes to use RIPLEY applies class-level annotations to identify which code runs on the client and which on the server. The code is then compiled to .NET bytecode, at which point Volta reads the bytecode (which contains the annotations) and generates the appropriate client- or server-side code, introducing RPCs between client and server where necessary. RIPLEY is implemented as a tweak to this phase of Volta's execution: the authors added a feature to Volta that generates additional server-side code for validating the client-side computations.

RIPLEY is intended to enforce the integrity of the original application in the face of the vulnerability of the client-side code. It accomplishes this by reproducing each client-side computation on the server; if the results of a client-side computation cannot be reproduced on the server, this is reason to believe that something faulty or malicious occurred in the client.
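RIPLEY's actual mechanism compares the RPCs emitted by the server-side replica against those arriving from the client; the following Python sketch illustrates just the underlying replay-and-compare idea, using a hypothetical state digest for the comparison (the function names and the digest mechanism are mine, not the paper's):

```python
import hashlib
import json

def state_digest(state: dict) -> str:
    # Canonical, order-independent digest of application state.
    return hashlib.sha256(json.dumps(state, sort_keys=True).encode()).hexdigest()

def validate_replay(events, apply_event, initial_state, reported_digest):
    """Replay the client's event stream on a server-side replica.

    apply_event(state, event) -> state is the deterministic client logic,
    which is also available to the server. A mismatch between the replica's
    final state and what the client reported suggests that something faulty
    or malicious occurred in the client.
    """
    state = initial_state
    for event in events:
        state = apply_event(state, event)
    return state_digest(state) == reported_digest
```

An honest client's reported result validates under replay; a tampered result does not, because the server's replica diverges from it.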

In the RIPLEY architecture, a browser emulator runs on the server, validating the client-side computations by replaying them. Because the emulator runs as JIT-compiled code in a .NET virtual machine, the server-side replica can sometimes run ahead of the slower client, anticipating client-side events so that the server can push pre-computed results to the client.

Sunday, September 20, 2009

Independent Study: Spectator: Detection and Containment of JavaScript Worms

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format is similar to that of a review of a paper submitted to a computer science conference. These are already-published papers, so I'll be writing with the benefit of hindsight, especially when the paper was published at least several years ago.)

Submission: Spectator: Detection and Containment of JavaScript Worms [PDF]

Please give a brief, 2-3 sentence summary of the main ideas of this paper:

Detecting JavaScript worms can be accomplished by adding a tag to content uploaded to a Web server, associating the tag with the IP address of the client that originated the upload, and using tags to identify propagation chains. A long propagation chain is a signature of a JavaScript worm, so identifying a sufficiently long chain (where the threshold length is configurable) should be sufficient for detecting JavaScript worms. Once a worm has been identified, containment is a matter of disallowing further uploads along the chain of known-infected clients until an administrator tells the Spectator proxy that the chain is safe.
What is the strength of this paper (1-3 sentences):
The strength of this paper is that it proposes a solution which can be implemented simply as a proxy server in the domain of a Web site operator, and which doesn't require any modifications or plug-ins in the web browser. With a reasonable amount of time and capital, a large Web site operator can implement this solution today.
What is the weakness of this paper (1-3 sentences):
Not to be too generous, but I don't find any significant problems with it.
This is on the whole an excellent paper. I would like to see it in SIGFOO this year.
I'm not sure how novel this is, as XSS attacks and JavaScript worms aren't my specialty, but naively, it seems novel. In the spirit of many useful papers in computer science, it combines a number of sound techniques to solve an important problem.
I'm convinced not only that the detection and containment algorithms work correctly and efficiently, but also that the author of a JavaScript worm would have a hard time subverting the system. The authors' approach is essentially invisible to the browser, with the exception of some JavaScript injected by the Spectator proxy (Figure 5) (which appears to be designed so as to prevent subversion by malicious code).

While the empirical data in section 5.2 (Overhead and Scalability) is encouraging, a Web site operator would obviously want to subject any implementation of this system to rigorous testing before deploying it.
Worth Solving
Very much so!
Detailed Comments
I'm curious whether it's possible for each page downloaded from a site to contain JavaScript code to validate that only Web site operator-provided JavaScript is executing in that page. Something like an unmodifiable onload event handler similar to the unload event handler in Figure 5.
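As a thought experiment, the tagging-and-chain scheme the paper proposes can be sketched in a few lines of Python. This is a toy model under my own naming, not Spectator's implementation (which tracks approximate propagation through a tag graph behind a proxy):

```python
import itertools

class PropagationTracker:
    """Toy model of Spectator-style worm detection.

    Each upload gets a fresh tag linked to the tag of the page it came
    from; a propagation chain longer than `threshold` is treated as a
    worm signature.
    """
    def __init__(self, threshold: int):
        self.threshold = threshold
        self._ids = itertools.count(1)
        self.parent = {}     # tag -> tag of the causing page (None if organic)
        self.depth = {}      # tag -> length of the propagation chain
        self.uploader = {}   # tag -> IP address of the uploading client

    def record_upload(self, client_ip: str, causing_tag=None) -> int:
        # Tag the upload and extend the chain it belongs to, if any.
        tag = next(self._ids)
        self.parent[tag] = causing_tag
        self.depth[tag] = 1 if causing_tag is None else self.depth[causing_tag] + 1
        self.uploader[tag] = client_ip
        return tag

    def is_worm(self, tag: int) -> bool:
        # A chain longer than the configured threshold looks like a worm.
        return self.depth[tag] > self.threshold
```

Containment would then amount to refusing uploads whose causing tag belongs to a flagged chain until an administrator clears it.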

Some papers to follow up on:

S. Meschkat. JSON RPC: Cross site scripting and client side Web services. In 23rd Chaos Communication Congress, Dec. 2006.

T. Pietraszek and C. V. Berghe. Defending against injection attacks through context-sensitive string evaluation [PDF]. In Proceedings of the Recent Advances in Intrusion Detection, Sept. 2005.

Y. Xie and A. Aiken. Static detection of security vulnerabilities in scripting languages [PDF]. In Proceedings of the Usenix Security Symposium, pages 271–286, Aug. 2006.

Y.-W. Huang, F. Yu, C. Hang, C.-H. Tsai, D.-T. Lee, and S.-Y. Kuo. Securing Web application code by static analysis and runtime protection. In Proceedings of the Conference on World Wide Web, pages 40–52, May 2004.

Wednesday, September 16, 2009

Fibonacci Strings and Performance

Fibonacci strings are related to Fibonacci numbers in that the length of the nth Fibonacci string is the nth Fibonacci number. Here's pseudocode for a function that generates them:

fibstr: int -> string
fibstr(n):
    if n is 0
        return the empty string
    else if n is 1
        return the string "b"
    else if n is 2
        return the string "a"
    else
        return fibstr(n-1) concatenated with fibstr(n-2)

The strings generated by this function for n > 2 have the interesting property that if you delete the last two letters, the resulting string is a palindrome. Anyway, for kicks I decided to use Fibonacci strings to test the string performance of popular programming languages. The test was simple: run fibstr(31) 31 times. I was lazy in writing the tests, but the results were consistent across many cups of Bhakti chai and many executions of the test yesterday morning, so as far as I'm concerned they're valid representations of the relative performance of the languages with respect to string concatenation and, to some extent, memory management.
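Since the original test scripts aren't shown, here's a minimal Python reconstruction of the function and the timing harness (the helper name time_fibstr is mine):

```python
import time

def fibstr(n: int) -> str:
    # Direct transcription of the pseudocode above.
    if n == 0:
        return ""
    if n == 1:
        return "b"
    if n == 2:
        return "a"
    return fibstr(n - 1) + fibstr(n - 2)

def time_fibstr(n: int, runs: int) -> float:
    # Wall-clock seconds for `runs` invocations of fibstr(n).
    start = time.perf_counter()
    for _ in range(runs):
        fibstr(n)
    return time.perf_counter() - start
```

time_fibstr(31, 31) reproduces the benchmark; the naive recursion makes a number of calls proportional to the length of the result, so it exercises concatenation and allocation heavily.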

Language                         real      user      sys
Java (StringBuilder w/JIT)       2.807s    2.652s    0.084s
Java (String w/JIT)              2.819s    2.692s    0.076s
JavaScript (Rhino 1.6.r5-3)      4.240s    4.100s    0.080s
Python (2.4.4)                   4.708s    3.992s    0.012s
Perl (5.8.8)                     7.528s    7.484s    0.008s
Java (StringBuilder w/o JIT)     8.171s    8.053s    0.040s
Java (String w/o JIT)           11.707s   11.553s    0.068s
Ruby (1.8.6)                    21.348s   18.565s    2.488s

Off-the-cuff ... Ruby's performance speaks for itself. I expected Perl to do better, since it's been around quite a while. I can see why Sun responded to early complaints about Java's performance with JIT. It's clearly effective, but I'd like to know more about how the JRE decides to compile to native code. With JIT, StringBuilder doesn't buy you anything if it's compiled to native code; without it, StringBuilder can make a noticeable (on paper) difference over plain String, but I wonder whether the difference shows up much in real workloads. If a string-manipulating function is a hot spot, it'll probably be compiled to native code. I suppose one would have to examine the memory impact of String v. StringBuilder as well.

I wouldn't be surprised if most of the time a language spends in system mode comes from calls to brk(2).

Sunday, September 13, 2009

Independent Study: A Universal Modular ACTOR Formalism for Artificial Intelligence

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format of this and subsequent papers will be similar to that of a review of a paper submitted to a conference. These are already-published papers, so I'll be writing with the obvious benefit of hindsight.)

Submission: A Universal Modular ACTOR Formalism for Artificial Intelligence [PDF]

Reviewer: Nicholas Dronen

Please give a brief, 2-3 sentence summary of the main ideas of this paper:
The authors present their notion of an actor, a formalism which unifies control flow and data flow. Actors allow programs to achieve a high degree of parallelism without the explicit use of semaphores. Actors accomplish this by being free of side effects and communicating with other actors only by passing messages.
What is the strength of this paper (1-3 sentences):
The paper's greatest strengths are the scalability (i.e., degree of parallelism) and reliability (discussed below) that actors afford.
What is the weakness of this paper (1-3 sentences):
While the ambition of the paper is admirable, it nonetheless casts too wide a net, covering topics from hardware design to formal representations of knowledge and belief. Historically, the value of actors as a way to reason about the properties of programs isn't clear, considering that other approaches, such as the pi calculus, have seen wider adoption.
In retrospect, a great deal of this paper could be removed without harm. Specifically, the idea of actors as an appropriate abstraction for hardware (e.g. removing interrupts) and the discussion of using actors to represent knowledge could be completely removed. The remaining material could then be expanded, ideally covering in more depth the concurrency-related aspects of actors.
Some aspects of actors -- such as message passing and perhaps pattern matching -- are not original. What is novel in this paper is the combination of these techniques into a scheme that provides a high degree of concurrency.
Worth Solving
How confident are you in your evaluation of this paper?
I am very confident in my evaluation of this paper. I've been thinking about actors for a year or so.
Detailed Comments
To understand actors, consider a thread. Its stack is finite, but how much stack space a function running in a thread requires is undecidable, so system designers allocate a constant amount of stack space to each thread: enough for well-behaved programs, but still too much to allow even tens of thousands of threads on a reasonably-equipped machine. Conventionally, 2 MB is allocated to each thread on a 32-bit machine running Linux. If a process is limited to 2 GB of memory, and assuming for the sake of discussion that no other implementation issues (such as the placement of the heap) interfere, a process can have at most ~1000 threads. Actors solve this problem. It's admittedly not an enormous problem in itself, but solving it also addresses a huge one: the high degree of concurrency that programs will require in order to exploit the large number of cores of future processors.
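The arithmetic behind that thread limit is worth making explicit (assuming the 2 MB stack and 2 GB address-space figures above):

```python
# Back-of-the-envelope limit on threads imposed by fixed-size stacks.
STACK_PER_THREAD = 2 * 1024**2   # 2 MB per thread, as cited above
ADDRESS_SPACE = 2 * 1024**3      # 2 GB process address space

max_threads = ADDRESS_SPACE // STACK_PER_THREAD
print(max_threads)  # -> 1024, i.e. on the order of a thousand threads
```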

An actor is a user space thread, the stack of which doesn't grow -- essentially a tail-recursive function. So the system designer only has to allocate an extremely small amount of stack space -- basically enough for an activation record and local variables -- which means that a single process can contain many thousands of actors. For example, a benchmark has run with 20 million Erlang processes (the Erlang term for an actor).

An actor contains only local state, so it has no side effects. Since there's no shared state, the programmer doesn't have to concern herself with error-prone synchronization primitives, such as semaphores. Actors communicate with each other by sending messages. The incoming messages for an actor are placed in an actor's mailbox (this term doesn't appear in the paper but is in common use today). When an actor receives a message, it uses pattern matching (in the style of Haskell and ML) to determine the type of the message.

One would be wrong to infer that synchronization primitives are unnecessary in a runtime system that supports actor-based programs in a single operating system process on an SMP or multicore machine. Imagine a program running on a multicore machine with 10,000 actors, 9,999 of which are sending messages to the remaining one (which we'll call A). A's mailbox will be filling up with messages, and two actors executing concurrently may send a message to A at the same time, which implies that the runtime managing A's mailbox has to be capable of handling two deliveries at once without corrupting the mailbox.
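A minimal sketch in Python makes both points concrete: local state plus a mailbox, with the mailbox's thread-safety supplied by the runtime (here, queue.Queue) rather than by the programmer. All names are mine, and a real actor runtime such as Erlang's uses cheap user-space processes, not one OS thread per actor as here:

```python
import queue
import threading

class Actor:
    """Minimal actor sketch: private state, a mailbox, a dispatch loop."""
    def __init__(self):
        # queue.Queue is thread-safe, so many actors can deliver to this
        # mailbox concurrently without corrupting it -- the runtime-level
        # synchronization discussed above, hidden from the programmer.
        self.mailbox = queue.Queue()
        threading.Thread(target=self._loop, daemon=True).start()

    def send(self, message):
        self.mailbox.put(message)

    def _loop(self):
        while True:
            message = self.mailbox.get()
            if message == "stop":
                break
            self.receive(message)

    def receive(self, message):
        raise NotImplementedError

class Counter(Actor):
    """An actor whose state is purely local; no shared mutable data."""
    def __init__(self):
        self.count = 0
        self.replies = queue.Queue()   # channel for answering "get"
        super().__init__()

    def receive(self, message):
        # Crude stand-in for ML-style pattern matching on messages.
        if message[0] == "add":
            self.count += message[1]
        elif message[0] == "get":
            self.replies.put(self.count)
```

Because only the Counter's own dispatch loop touches self.count, no semaphore is needed around it no matter how many senders there are.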

The paper's authors found inspiration for actors in packet-switched networks, which in part explains why a program written in an actor-based language seems so much like a distributed system. Where the program has actors sending messages to each other, the distributed system has nodes exchanging packets. In the domain of software architecture, there is an interesting analogy between REpresentational State Transfer (REST) and the actor model, which is mentioned during the question and answer period of a recent presentation by REST advocate Steve Vinoski.

Additional benefits of actors: since they contain only local state, they can simply be terminated if an error occurs. This effectively isolates errors; a small amount of data may be lost, but the program continues to run. Since actors consume so little memory, each actor can be monitored by another actor. The monitoring actor can be notified when its "child" actor dies and can restart it, which increases the reliability of programs. Another consequence of the lack of global state in actor-based programs is the ease with which code can be reloaded into a running program without stopping it. This increases availability, which is especially important in systems that are supposed to be fault tolerant.

Monday, September 07, 2009

Independent Study: Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part I

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format of this and subsequent papers will be similar to that of a review of a paper submitted to a conference. These are already-published papers, so I'll be writing with the obvious benefit of hindsight.)

Submission: Recursive Functions of Symbolic Expressions and Their Computation by Machine, Part 1

Reviewer: Nicholas Dronen

Please give a brief, 2-3 sentence summary of the main ideas of this paper:
The author presents the basic symbols, syntax, and semantics of a Turing-complete programming language, LISP, providing compelling examples of the language's expressiveness. He then discusses how it is implemented on the IBM 704 computer.
What is the strength of this paper (1-3 sentences):
It describes a new programming language based in part on Church's lambda calculus which can be applied in both practical and theoretical domains.
What is the weakness of this paper (1-3 sentences):
Being such a seminal paper in the PL literature, it's hard to find fault with it. It is interesting that the author doesn't himself discuss deficiencies in the language or its implementation.
This paper is excellent. The author explains his ideas clearly and comprehensively.

The contemporary reader does have to adjust his vocabulary at times, because McCarthy's terminology is outdated — for example, where a contemporary paper would use the term memory cells, McCarthy says registers.
To my knowledge, there was only one other functional programming language at the time of publication, IPL, but LISP was a big improvement. LISP's garbage collection is extremely useful to programmers, freeing them from having to manage memory explicitly.
Worth Solving
This paper is seminal in part because it solves several problems at once.
How confident are you in your evaluation of this paper?
Detailed Comments

This paper is exemplary in the literature of programming languages. It sets the standard for the formal definition of a programming language: present the basic symbols and syntax, define the semantics, discuss implementation issues.

I'm fascinated by the fact that the LISP function apply "plays the theoretical role of a universal Turing machine and the practical role of an interpreter." Since LISP is based at least in part on the lambda calculus, and the untyped lambda calculus is Turing-complete, it's fair to assume that the universality of apply follows as a consequence, but I'm not capable of providing a proof, nor, if I were capable of such, would I have time to do so. However, it's clear that the built-in substitution function subst has the same role as substitution in the lambda calculus. Another connection between LISP and formal representations of programming languages is that the function eval seems to be a handmaiden of the operational semantics of the language. The signature of the function is eval[e; a], where e is an expression to be evaluated, and a is a list of pairs, the items of which are, first, an atomic symbol and, second, the expression for which the symbol stands.
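A toy version of eval in Python makes the association-list mechanics concrete. This is my sketch, nothing like McCarthy's full definition: cons cells are Python pairs, expressions are symbols (strings) or operator tuples, and only a few operators are handled:

```python
def lisp_eval(e, a):
    """Evaluate expression e against association list a.

    a is a list of (symbol, value) pairs, as in McCarthy's eval[e; a].
    Lists are represented as nested Python pairs terminated by None.
    """
    if isinstance(e, str):              # an atomic symbol: look it up in a
        for symbol, value in a:
            if symbol == e:
                return value
        raise KeyError(e)
    op, *args = e
    if op == "QUOTE":                   # (QUOTE x) -> x, unevaluated
        return args[0]
    if op == "CONS":
        return (lisp_eval(args[0], a), lisp_eval(args[1], a))
    if op == "CAR":
        return lisp_eval(args[0], a)[0]
    if op == "CDR":
        return lisp_eval(args[0], a)[1]
    raise ValueError(f"unknown operator: {op}")
```

For example, lisp_eval("X", [("X", 42)]) looks X up in the association list, and lisp_eval(("CAR", ("QUOTE", ("A", ("B", None)))), []) takes the head of the quoted list (A B).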

And, practically speaking, apply is a function implemented in LISP that interprets LISP — highlighting the bootstrap problem that early computer science students grapple with, if only briefly: how do you compile a compiler? But further, one can view it as a virtual machine for LISP. The language's use of Polish notation reminds me of the stack-oriented nature of the Java virtual machine.

One of the questions I had from last week's paper is whether LISP was, as the paper described, actually capable of performing symbolic differentiation of functions. This paper provides a concrete answer. LISP is indeed capable of evaluating the derivative of a symbolic expression with respect to a variable. It's surprisingly simple: under addition, constant terms differentiate to ZERO and occurrences of the variable to ONE, while in a product with a constant factor the variable is replaced by ONE. So, for example, the derivative of (PLUS 9 y) with respect to y is (PLUS 0 1) and the derivative of (TIMES 5 z) with respect to z is (TIMES 5 1).
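That differentiation scheme can be sketched in Python, using tuples for LISP prefix expressions. This is my reconstruction of the idea, not McCarthy's actual diff, which applies the full product rule and leaves simplification for later; the constant-factor shortcut here matches the examples above:

```python
def contains(e, var):
    # Does expression e mention the variable var anywhere?
    if e == var:
        return True
    return isinstance(e, tuple) and any(contains(x, var) for x in e[1:])

def deriv(e, var):
    """Differentiate a prefix expression like ("PLUS", 9, "y") w.r.t. var."""
    if e == var:
        return 1                         # d(var)/d(var) = 1
    if not contains(e, var):
        return 0                         # constants and other variables
    op, a, b = e
    if op == "PLUS":
        # Sum rule: differentiate each addend.
        return ("PLUS", deriv(a, var), deriv(b, var))
    if op == "TIMES":
        if not contains(a, var):         # constant factor: d(c*b) = c * b'
            return ("TIMES", a, deriv(b, var))
        if not contains(b, var):
            return ("TIMES", b, deriv(a, var))
        # General product rule: a'b + ab'.
        return ("PLUS", ("TIMES", deriv(a, var), b),
                        ("TIMES", a, deriv(b, var)))
    raise ValueError(f"unknown operator: {op}")
```

With this, deriv(("PLUS", 9, "y"), "y") yields ("PLUS", 0, 1), i.e. 1, and deriv(("TIMES", 5, "z"), "z") yields ("TIMES", 5, 1), i.e. 5.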

Some of the built-in functions are named poorly: car is a mnemonic for "contents of the address part of register" and cdr stands for "contents of the decrement part of register." Today a language designer would give those functions names in harmony with their semantics, not their implementation. I believe this deficiency in early LISP was corrected at some point (with car replaced by first and cdr by rest), but I don't know what Scheme or Common LISP use.

Garbage collection does not occur intermittently. Rather, it occurs when the program needs memory but free memory has been exhausted. It does a simple reachability analysis. In the IBM 704, characters are 6 bits, and a machine word is 36 bits. A string of characters like "DIFFERENTIATE" is represented by a linked list of structures. The first field of each structure is a pointer to a 6-character memory cell. The second is a pointer to the cell of the next structure. The string "DIFFERENTIATE" decomposes to cells containing "DIFFER", "ENTIAT", and "E".

Friday, September 04, 2009

Internet of Things

Some quick thoughts on machine-generated and -consumed data (as opposed to documents) being the next phase of the development of the web. First thing that comes to mind is, this will require our infrastructure to be more intelligent, such as using feedback control systems to regulate the machine-to-machine flow of data. See, for example, "Black-Box Performance Control for High-Volume Non-Interactive Systems". Also, a human user navigating pages on the web (or using a search engine to go directly to the desired content) is a request-response style of interaction. HTTP does this quite well. However, data being generated by machines in the real world and consumed by machines suggests a publish-subscribe or at least a hybrid request-response, publish-subscribe style of interaction, a task for which protocols like XMPP are suited.