Saturday, May 22, 2010

You would think that an editorial that's subtitled "If the Fed won't tighten up, we'll pay the price" would present a case for why the Fed should abandon the current policy of quantitative easing in favor of something more contractionary. That's what I expected when I turned to the last page of this week's print edition of Barron's, but that's not what I found. Instead, Tom Donlan fills most of the page with a story about bonds issued three decades ago, the point of which appears to be that things don't always turn out as you expect. But saying, in short, "shit happens" isn't a compelling argument in favor of any particular policy, much less the one that the editorial piece ostensibly advocates. Does journalism get any lazier?

The article isn't completely free of argument. There is an inkling of one towards the end, where he writes "Wall Street doesn't have a money market; its interest rates reflect Fed policy, as do Main Street consumer prices." Now this isn't really an argument, but it's the closest Donlan gets to the kind of writing one expects in a piece advocating a Fed policy change. The dependent clause -- "as do Main Street consumer prices" -- is an assertion. And it's not false; Fed rates and the state of the economy as a whole are correlated. When the economy overheats, the Fed raises interest rates, and there's a recession of some magnitude. The Fed eventually responds by lowering rates, which lets economic growth resume. That's the standard picture.

From there, though, it just gets childish:

It's easy to imagine that inflation is irrelevant to a U.S. economy that's posting the lowest increases in consumer prices in 44 years. But "imagination is funny; it makes a cloudy day sunny," as Johnny Burke wrote for Frank Sinatra in 1940.

Sunny days of low inflation won't last.

And it would be easier for us to imagine that what he says actually matters if Donlan had bothered to make an argument. Low inflation won't last. So what? When inflation becomes palpable to Fed policy makers, they'll raise interest rates. But if there's no sign of inflation now, what reason is there for Fed policy to become contractionary now? Hello?

UPDATE: The New York Fed chairman disagrees with Donlan too.




Thursday, May 06, 2010

I saw Bong Joon-Ho's "Mother" last night at the Lyric Cinema Cafe in Fort Collins. The style, theme, and humor are similar to his "The Host" and "Memories of Murder".

They're all awesome. "Memories of Murder" should have been on at least one major critic's best-of-decade list, but perhaps it was elided because Bong plays with genre too much to be considered alongside the directors of consistently sublime movies like "Syndromes and a Century" or "Yi Yi". In any case, if you like yourself some Asian cinema, watch "The Host", "Memories of Murder", and "Mother" (in order of accessibility).

Saturday, May 01, 2010

iPhone, Safari, and tel URIs

I think it's very cool that tapping a tel URI in Safari on an iPhone allows you to actually call the number. (A "tel URI" is basically a phone number link, similar to an "http:" or "mailto:" link, except it starts with "tel:".) Because not all phone numbers start with "tel:" on the Web, Safari tries to detect them so you can just tap and call, but the detection needs to be improved. For example, document citations often contain text that Safari wrongly believes is a phone number, such as the text after the comma in this fragment of a review: "No. 4, 541-578 (2009)". My phone thinks this is the number (541) 578-2009.

Further brokenness is evident when I tap and hold the link: a pop-up menu appears with the option to "Call 541-578 (2009" (with no parenthesis at the end).
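
I don't know how Safari's detector actually works, but the false positive is easy to reproduce with a regular expression. In this sketch both patterns are my own inventions: NAIVE_PHONE is a guess at the kind of loose pattern that would misfire on a citation, and STRICT_PHONE is one possible tightening that only accepts the common "(541) 578-2009" and "541-578-2009" shapes.

```python
import re

# Hypothetical loose pattern: three digits, three digits, four digits,
# with optional parentheses and any run of spaces, dots, or hyphens between.
NAIVE_PHONE = re.compile(r"\(?\d{3}\)?[\s.-]*\d{3}[\s.-]*\(?\d{4}\)?")

# Hypothetical stricter pattern: parentheses are allowed only around the
# area code, and the last two groups must be joined by a hyphen or a dot.
STRICT_PHONE = re.compile(r"(?<!\d)(?:\(\d{3}\)\s?|\d{3}[-.])\d{3}[-.]\d{4}(?!\d)")

CITATION = "No. 4, 541-578 (2009)"


def digits_only(text):
    """Strip everything except digits, as a dialer would."""
    return re.sub(r"\D", "", text)


false_positive = NAIVE_PHONE.search(CITATION)
print(false_positive.group())               # the page range plus the year
print(digits_only(false_positive.group()))  # what would get dialed
print(STRICT_PHONE.search(CITATION))        # None: the citation is rejected
```

The stricter pattern trades recall for precision; it would miss numbers written with spaces only, so it's an illustration of the trade-off rather than a fix.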

Wednesday, January 27, 2010

Rewatching Mad Men (cont.)

I believe this scene is key to understanding why Campbell shows up at Peggy's apartment that night. Here's the pattern. First, Draper thwarts Campbell's desire to engage in a sort of corporate-ladder-climbing intercourse. Clearly, given the way the scene is shot, and given the bile of Campbell's "f*ck you" at the end, sex is a weapon for Campbell. Later, when the other Sterling Cooper lads take him out for his stag party, Campbell brazenly puts the moves on one of the gals who joins the lads at the strip club. Here again he confuses force for flattery, and again he's rejected and humiliated, only this time in front of his peers.

Campbell mixes desire and power to the point that they're indistinguishable. Being humiliated by Draper in the afternoon, and in front of his peers at the strip club, forces his hand. He goes to Peggy's apartment not -- or not strictly -- because he wants some wanton nookie before he gets hitched, but to get something that Draper has (or so he thinks). Sleeping with Draper's secretary is Campbell's way of getting revenge. Again, sex to him is only a weapon. This is, in part, why Peggy's speech at the end of the second season -- the one that left me and probably you breathless -- is so powerful: what she tells Campbell shocks him; he's surprised; but it's not about the baby. It's about what she says and how she says it and the fact that she is a woman. And it's about Campbell's idea of who he is.

Sunday, January 17, 2010

Shots from "Syndromes and a Century"

I watched "Syndromes and a Century" again last night after seeing it about a year ago. Something about it is so fresh and vibrant, even after multiple viewings.

Here are two powerful shots -- the first is from the rural vignette, the second from the urban one -- posted just because I think they're beautiful.



Tuesday, January 12, 2010

Rewatching Mad Men

It may be cheap sport, playing spot the symbols of sex and power in an episode of "Mad Men," but rewatching the first episode, it's too obvious to pass up. It's about at the 24-minute mark, right after the disastrous meeting with Rachel Menken.



Campbell: I'm not going to pretend I don't want your job, but you were right, I'm not great with people, and you are, I mean, not counting that meeting we were just in, so I'm kinda counting on you to help me out.... There's plenty of room at the top.


Draper: Look, I'm sorry I was so hard on you before. It's just this damn tobacco thing.


Campbell: You'll think of something. [Emboldened.] A man like you I'd follow into combat blindfolded, and I wouldn't be the first. Am I right buddy? [Presents hand to shake.]


Draper: Let's take it a little slower, I don't want to wake up pregnant. [Walks away.]


Campbell: [Under his breath.] F*** you.

Sometimes a cigar is just a cigar and sometimes the outstretched hand of your weasel of an underling is just a phallic symbol appearing on your TV screen.

Oh, and would it be too much to note that the previous scene features Peggy Olson being humiliated by the gynecologist when she goes to him to get oral contraceptives? I think not.

Independent Study Reading List, Spring 2010

Friday, December 25, 2009

A Memory of Vic Chesnutt

After I graduated from high school, I got on a bus and went to Athens for two weeks, and had the best possible experience an 18-year-old music junkie could have had at the time. It's unbelievable but true. I met Michael Stipe (how exactly is a story in itself) and ended up spending almost a half day with him. We hung out in the R.E.M. office in downtown Athens while listening to Pylon rehearse in the basement. We went to his house and had a drink. We went to a cafe and later, before he dropped me off at the city park where I was camping, a bar. The next morning I got kicked out of the park and ended up camping uninvited on the roof of the R.E.M. office. The woman who ran the fan club noticed me through a window sleeping in the sun. I woke up with a bag next to me; it had a tube of sunscreen and some fruit. Turns out her boyfriend, Armistead Wellford, was the bass player in Love Tractor. She arranged for me to stay in a big old house where a lot of the band members lived. All of that was more than I imagined would happen when I got on the Greyhound bus in Fargo, North Dakota.

But in retrospect the most significant thing I saw there was when Stipe and I were at the cafe. He introduced me to some people at a table -- two women whom I don't remember, and a man in a wheelchair, Vic Chesnutt. A cassette player played a recording of one of Vic's live performances. While we sat at the table and talked, Stipe told Chesnutt that he'd like to help him record a record. It was the summer of 1988. Two years later New West Records released Chesnutt's Little, which Stipe produced.

Tuesday, December 01, 2009

Wiki terminology, like Wikipedia, not optimized for search.

In MediaWiki terms, including a page in another page is called transclusion, an accurate word, but not one that automatically comes to mind. I searched for "macros" and "include" for a while and, after a few minutes, found a page that mentions transclusion.

More papers to read

From a comment on the Natural Language Processing blog, classic papers in NLP:

  • [Bahl et al., 1983] L.R. Bahl, F. Jelinek and R.L. Mercer. "A Maximum Likelihood Approach to Continuous Speech Recognition." IEEE Journal of Pattern Analysis and Machine Intelligence.
  • [Charniak, 1983] Eugene Charniak. Passing Markers: A Theory of Contextual Influence in Language Comprehension, Cognitive Science, 7, pp. 171-190.
  • [Charniak, 1973] Jack and Janet in Search of a Theory of Knowledge. In Proceedings of the International Joint Conference on Artificial Intelligence (1973)
  • [Charniak, 1977] Eugene Charniak. Ms. Malaprop, A Language Comprehension Program. In Proceedings of the International Joint Conference on Artificial Intelligence (1977).
  • [Cohen et al. 1982] Philip R. Cohen, C. Raymond Perrault, and James F. Allen. Beyond Question Answering. Strategies for Natural Language Processing, pp. 245- 274.
  • [Grosz, Joshi, and Weinstein, 1995]. Centering: A Framework for Modeling the Local Coherence of Discourse. Computational Linguistics, 21 (2), pp. 203-226.
  • [Grosz and Sidner, 1986]. Attention, Intention, and the Structure of Discourse. Computational Linguistics, 12 (3), pp. 175-204, 1986.
  • [Hobbs et al., 1993]. Interpretation as Abduction. Artificial Intelligence, vol 63. pp. 69-142.
  • [Hobbs, 1979] Jerry Hobbs. Coherence and Coreference, Cognitive Science 3(1), pp. 67-90.
  • [Hovy, 1988] Hovy, E.H. 1988. Planning Coherent Multisentential Text. Proceedings of 26th ACL Conference. Buffalo, NY.
  • [Karttunen, 1969] Lauri Karttunen. 1969. Pronouns and variables. In CLS 5: Proceedings of the Fifth Regional Meeting, pages 108-116, Chicago, Illinois. Chicago Linguistic Society.
  • [Kay, 1986] Martin Kay. Parsing in functional unification grammar.
  • [Lakoff & Johnson, 1980] George Lakoff and Mark Johnson. Metaphors We Live By, Chapters 1-4. (short - a total of 21 pages).
  • [Lehnert, 1981] Wendy G. Lehnert. Plot units and narrative summarization. Cognitive Science, Volume 5, Issue 4, October-December 1981, Pages 293-331
  • [Lehnert, 1977] Wendy Lehnert. Human and Computational Question Answering. Cognitive Science, Vol. 1, No. 1, pp. 47-73.
  • [Mann and Thompson, 1988]. Rhetorical Structure Theory: Toward a functional theory of text organization. Text 8 (3), pp. 243-281, 1988.
  • [Martin et al., 1986] P. Martin, D. Appelt and F. Pereira. Transportability and generality in a natural-language interface system.
  • [McKeown 1986] Kathleen McKeown. Discourse strategies for generating natural-language text.
  • [Rosch and Mervis, 1975] Eleanor Rosch and Carolyn B. Mervis. Family Resemblances: Studies in the Internal Structure of Categories, Cognitive Psychology, 7, 573-605.
  • [Schank, 1986] Roger Schank. Language and memory.
  • [Schubert and Pelletier, 1986] L Schubert and F J Pelletier. From English to logic: context-free computation of "conventional" logical translations.
  • [Wilks, 1975] Yorick Wilks. An Intelligent Analyzer and Understander of English, CACM 1975.
  • [Woods, 1986] W.A. Woods. Semantics and quantification in natural language question answering.

Sunday, November 29, 2009

Independent Study: Concolic testing for web applications

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format is similar to that of a review of a paper submitted to a computer science conference. These are already-published papers, so I'll be writing with the benefit of hindsight, especially when the paper was published at least several years ago.)

This week, two papers about web applications from the International Symposium on Software Testing and Analysis '08:

Dynamic Test Input Generation for Web Applications

Here the authors use the concolic testing method pioneered in the seminal paper on Directed Automated Randomized Testing (PDF) to generate tests automatically for web applications written in PHP.

The authors use a PHP runtime environment that tracks tainted strings as the test oracle (which determines whether a failure occurs). The purpose of automatically generating tests is to identify bugs in the program automatically; narrowly, the bugs the authors are trying to identify are SQL injection vulnerabilities. To that end, they iteratively construct (what they call) an approximate backward slice of the PHP program by (loosely speaking):
  1. identifying statements where such vulnerabilities may cause undesired behavior (viz. database library calls where the injected SQL can ultimately do harm),
  2. adding the functions where such statements occur to a set of functions to be analyzed,
  3. executing the program by loading it in a browser,
  4. resolving control dependencies by recording a stack trace at the beginning,
  5. analyzing data dependencies, and
  6. repeating (with some variations) until all data dependencies are resolved.
The purpose of the preceding steps is to exclude from the analysis aspects of the program that won't help identify SQL vulnerabilities in the code.

Section 3 of the paper discusses the authors' algorithm for generating constraints for PHP.

Section 4 evaluates the system. Constraint generation is accomplished by a plugin to phc, a PHP compiler front-end. The plugin "wraps each statement in a function call"; the function call logs a trace of the program's execution to a file. They deal with eval by passing the string to be eval-ed through the plugin, so each statement in the eval-ed string is wrapped in a function call which logs a trace to the same file. Constraints are resolved by reading and symbolically executing the trace file. The result is "a list of Boolean control expressions where each subexpression is annotated with a concrete value from the execution."
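
To make the trace-then-negate idea concrete for myself, here's a toy version in Python. It is not the authors' phc plugin or their resolution algorithm: the program under test, the branch() logging helper, and the brute-force solve() (standing in for a real constraint solver) are all hypothetical. What it keeps from the paper is the shape of the loop: run on a concrete input, log each control expression with its concrete outcome, then flip one outcome to look for an input that takes a different path.

```python
from typing import Callable, List, Optional, Set, Tuple

Predicate = Callable[[int], bool]
# One trace entry: (predicate source text, predicate, concrete outcome).
Trace = List[Tuple[str, Predicate, bool]]


def program(x: int, branch) -> str:
    """Hypothetical program under test; every condition goes through branch()."""
    if branch("x > 10", lambda v: v > 10):
        if branch("x % 7 == 0", lambda v: v % 7 == 0):
            return "target"
    return "other"


def run_traced(x: int) -> Trace:
    """Run the program on a concrete input, logging each branch and its outcome."""
    trace: Trace = []

    def branch(source: str, pred: Predicate) -> bool:
        outcome = pred(x)
        trace.append((source, pred, outcome))
        return outcome

    program(x, branch)
    return trace


def solve(constraints: List[Tuple[Predicate, bool]]) -> Optional[int]:
    """Brute-force stand-in for a constraint solver over a small integer domain."""
    for candidate in range(-50, 51):
        if all(pred(candidate) == wanted for pred, wanted in constraints):
            return candidate
    return None


def explore(seed: int) -> Set[Tuple[bool, ...]]:
    """Negate one recorded branch at a time to reach unexplored paths."""
    seen: Set[Tuple[bool, ...]] = set()
    worklist = [seed]
    while worklist:
        trace = run_traced(worklist.pop())
        path = tuple(outcome for _, _, outcome in trace)
        if path in seen:
            continue
        seen.add(path)
        for i in range(len(trace)):
            # Keep the outcomes of the first i branches, flip branch i.
            constraints = [(pred, outcome) for _, pred, outcome in trace[:i]]
            constraints.append((trace[i][1], not trace[i][2]))
            new_input = solve(constraints)
            if new_input is not None:
                worklist.append(new_input)
    return seen


paths = explore(seed=0)
print(sorted(paths))
```

Starting from the seed 0, the loop finds inputs for all three paths through the toy program, including the nested "target" branch that a random input would rarely hit.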

An interesting aspect of the authors' evaluation is the overhead of the tracing process. When they evaluated an entire PHP program, the trace file for loading a single web page was almost 3 GB and the page load timed out. So the iterative process of limiting the scope of their analysis mentioned above was necessary for obvious practical reasons.

Previous work on leveraging symbolic and runtime values for input test generation falls back on concrete values when the symbolic value being addressed lies outside the theory of the resolution algorithm’s decision procedure. Our constraint resolution algorithm generates constraints only based on one variable instance per value. Therefore it may underapproximate the symbolic values of variables when program predicates depend on multiple variables, and it may miss paths that other resolution algorithms would find. In principle our constraint resolution algorithm could be enhanced to include multivariate constraints in some cases, but we leave that to future work.

An object-oriented web test model for testing web applications, Kung, et al, may be interesting reading:

This paper describes an Object-Oriented test model that captures both structural and behavioral test artifacts of Web applications. The model represents the entities of Web applications as objects and describes their structures, relationships, and dynamic behaviors. Based on the test model, test methods are presented to derive test cases automatically for ensuring the structures and behaviors of Web applications

Finding Bugs in Dynamic Web Applications

The previous paper focused on web application security. This paper focuses on web application reliability. Where the previous paper's goal was to identify vulnerabilities to SQL-injection attacks, this paper's goal is to identify bugs that cause web applications to crash or generate invalid HTML. (Web application crashes that can be triggered by user input become denial-of-service attack vulnerabilities once they become known to bad actors.) Similarly, where the test oracle of the previous paper is a PHP runtime environment that supports checking strings for taintedness (failure being defined as the use of a tainted string in an SQL statement), the test oracle of this paper is an HTML validator (failure being defined as the web application generating invalid HTML).

Testing whether a web application generates valid HTML is hard for dynamic web pages. Systems exist for validating dynamically-generated web pages, but they require the tester to create tests manually. Here the authors present a system, Apollo, for automatically generating tests for dynamic pages.

Something I don't understand. Here's a passage from the paper:

The HTML pages generated by a PHP applications may contain buttons that—when pressed by the user—result in the loading and execution of additional PHP source files. We simulate such user input by transforming the source code. Specifically, for each page h that contains N buttons, we add an additional input parameter p to the PHP program, whose values may range from 1 through N. Then, at the place where page p is generated, a switch statement is inserted that includes the appropriate PHP source file, depending on the value supplied for p. The steps of the user input simulator are fully mechanic, and the required modifications are minimal, but for the evaluation we performed the program transformation by hand (due to time constraints).

Normally, submit buttons result in an HTML form being POST'ed to the web application. From the context, it's not clear why the system wouldn't simply POST the form. An additional passage:

The stand-alone component of the User Input Simulator performs a transformation of the program that models interactive user input by way of additional parameters.

Still a little confused. :-)

Ah, I get it:

<?php echo "<h2>WebChess ".$Version." Login</h2>"; ?>
<form method="post" action="mainmenu.php">
<p>
Nick: <input name="txtNick" type="text" size="15"/><br/>
Password: <input name="pwdPassword" type="password" size="15"/>
</p>
<p>
<input name="login" value="login" type="submit"/>
<input name="newAccount" value="New Account"
type="button" onClick="window.open('newuser.php', '_self')"/>
</p>
</form>
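
The "New Account" button is a plain button, not a submit button: its onClick handler opens newuser.php directly, so there's no form POST to simulate, which is presumably why the authors model the button press as an extra input parameter.

Nothing else very interesting here. They evaluate their system, present the results, related work, etc.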

Other reading:

Improving test case generation for web applications using automated interface discovery, Halfond, et al, focuses on JavaScript.

Thursday, November 26, 2009

Notes for Information Retrieval Quiz #3

Some things to review before the Information Retrieval quiz:

Lecture 18, Collaborative Filtering and Recommender Systems


Pearson correlation
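
For review, a minimal sketch of Pearson correlation as it's used to compare two users' ratings in collaborative filtering. The function name and the sample ratings are mine, and I'm assuming the two lists hold ratings for the same items in the same order (in practice you'd first restrict to the items both users rated).

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of ratings."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of the same length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Deviations of each rating from that user's own mean.
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx)) * math.sqrt(sum(b * b for b in dy))
    if denominator == 0:
        return 0.0  # a user who rates everything the same carries no signal
    return numerator / denominator


# Hypothetical ratings for the same three items.
same_taste = pearson([1, 2, 3], [2, 4, 6])
opposite_taste = pearson([1, 2, 3], [3, 2, 1])
print(same_taste, opposite_taste)
```

Because each user's mean is subtracted first, a harsh rater and a generous rater who rank items the same way still come out highly correlated.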

Lecture 19, Information Extraction

Named entity recognition: find and classify (i.e. determine the category of) all the named entities in a text. Two approaches to named entity recognition:
Rule-based (regular expressions)

  • Lists of names
  • Patterns to match things that look like names
  • Patterns to match the environments that classes of names tend to occur in
ML-based
  • Get annotated training data
  • Extract features
  • Train systems to replicate the annotation
Relation analysis consists of two tasks:
  1. determine if two entities are related
  2. if they are, classify the relation

Features in relation analysis (for each of the above tasks) are:

  1. Features of the named entities involved (their types, the concatenation of the types, headwords of the entities)
  2. Features derived from the words between and around the named entities (+- 1, 2, 3; bag of words between)
  3. Features derived from the syntactic environment that governs the two entities (constituent path through the tree from one entity to the other; base syntactic chunk sequence from one to the other; dependency path)
Template filling
  1. Rules and cascades of rules
  2. Supervised ML as sequence labeling
    1. One sequence classifier per slot
    2. One big sequence classifier
Lecture 20, Sentiment Analysis

Classification in sentiment analysis
  • Coarse classification of sentiment
    • Document-level classification according to some simple (usually binary) scheme
      • Political bias
      • Likes/hates
    • Fine-grained classification of sentiment-bearing mentions in a text
      • Positive/negative classifications of opinions about entities mentioned in a text
      • Perhaps with intensity
Choosing a vocabulary
  • Essentially feature selection
  • Previous examples used all words
  • Can we do better by focusing on subset of words?
  • How to find words, phrases, patterns that express sentiment or polarity?
  • Adjectives
    • positive: honest important mature large patient
    • negative: harmful hypocritical inefficient insecure
  • Verbs
    • positive: praise, love
    • negative: blame, criticize
  • Nouns
    • positive: pleasure, enjoyment
    • negative: pain, criticism
Lecture 21, Sentiment Analysis (cont.)

Identifying polarity words
  • Assume that generating exhaustive lists of polarity words is too hard
  • Assume contexts are coherent with respect to polarity
  • Fair and legitimate, corrupt and brutal
  • But not: fair and brutal, corrupt and legitimate
  • Example:
    • Extract all adjectives with > 20 frequency from WSJ corpus
    • Label them for polarity
    • Extract all conjoined adjectives
    • A supervised learning algorithm builds a graph of adjectives linked by the same or different semantic orientation
    • A clustering algorithm partitions the adjectives into two subsets
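
A much-simplified sketch of the conjunction idea, in Python. It is not the supervised-learning-plus-clustering pipeline from the lecture: I've swapped in plain label propagation from a seed word, treated "and" as a same-orientation link and "but" as a different-orientation link, and invented the "fair but brutal" pair to have one contrastive link to work with.

```python
from collections import defaultdict, deque


def propagate_polarity(conjunctions, seeds):
    """Spread +1/-1 labels from seed adjectives across conjoined pairs."""
    graph = defaultdict(list)
    for left, conjunction, right in conjunctions:
        # "and" suggests the same orientation, anything else the opposite.
        sign = 1 if conjunction == "and" else -1
        graph[left].append((right, sign))
        graph[right].append((left, sign))

    labels = dict(seeds)
    queue = deque(seeds)
    while queue:
        word = queue.popleft()
        for neighbor, sign in graph[word]:
            if neighbor not in labels:
                labels[neighbor] = labels[word] * sign
                queue.append(neighbor)
    return labels


# The first two pairs are from the notes; the third is hypothetical.
pairs = [
    ("fair", "and", "legitimate"),
    ("corrupt", "and", "brutal"),
    ("fair", "but", "brutal"),
]
labels = propagate_polarity(pairs, {"fair": 1})
print(labels)
```

With a single positive seed, the two clusters fall out of the link signs, which is the intuition behind partitioning the adjectives into two subsets.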
Challenges
  • Mixed sentiment: The steering is accurate but feels somewhat anesthetized.
  • Sentiment inverters: ... never seen any RWD cars can handle well on snow even
    just few inches.
  • Anaphora and meronymy:
    • It's a great car for just about anything. The mkVI is pretty
      much a mkv but ironing out all the small problems.
    • Hey is the back seat comfortable? In my MkV it feels like
      you're sitting on a vat of acid.

Sunday, November 22, 2009

Independent Study: Concepts and Experiments in Computational Reflection

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format is similar to that of a review of a paper submitted to a computer science conference. These are already-published papers, so I'll be writing with the benefit of hindsight, especially when the paper was published at least several years ago.) (I've written most of these posts in the form of a review of a conference paper, but I'm going to use a free-form style this time.)

Pattie Maes's wonderful paper on reflection is a perfect follow-on to my reading about traits. My motivation for reading the traits paper was that Perl's Moose (Perl 6, too!) has traits (it calls them roles). I've been using Moose a lot at work and I wanted to catch up with the research behind traits to prepare for a short talk I gave about it to my colleagues. After the talk, a colleague who was at Bell Labs in the 80's mentioned a paper about reflection by Pattie Maes at OOPSLA in '87 which led to the MetaObject Protocol in the Common Lisp Object System (CLOS), which in part inspired Moose, so Maes's paper closes the loop, ices the cake, etc.

Many programmers know of reflection from the java.lang.reflect package in the Java API. Suffice it to say for the moment that Maes' reflection is expansive and Java's reflection is by comparison quite limited. However, reflection in Java does serve as a good starting point for understanding Maes. What the Java API does provide is programmatic access to information about objects at runtime by exposing an interface to what Maes calls the "self-representation of the system ... which makes it possible for the system to answer questions about itself and support actions on itself." To wit, java.lang.reflect allows the Java programmer to find out which methods or constructors, etc., exist for a given object, and to invoke them.
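
The "find out which methods exist, then invoke them" kind of reflection looks roughly like this. I'm using Python's inspect module as a stand-in for java.lang.reflect, so this is an analogy rather than the Java API; the Greeter class and public_methods helper are made up for the example.

```python
import inspect


class Greeter:
    """Hypothetical class to reflect on."""

    def __init__(self, name):
        self.name = name

    def hello(self):
        return f"hello, {self.name}"

    def goodbye(self):
        return f"goodbye, {self.name}"


def public_methods(obj):
    """Ask the object which public methods it has, by name."""
    members = inspect.getmembers(obj, inspect.ismethod)
    return sorted(name for name, _ in members if not name.startswith("_"))


greeter = Greeter("Pattie")
names = public_methods(greeter)
# Invoke a method chosen by name at runtime rather than in the source.
result = getattr(greeter, "hello")()
print(names, result)
```

This only reads the system's self-representation; nothing about the object or its class is changed.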

However, Maes goes a step further (or, more appropriately, Java didn't go as far as Maes imagined, for better or worse) by asserting that "a computational system can actually bring modifications to itself by virtue of its own computation." The modifications Maes envisions are pervasive. In her experimental object-oriented language, 3-KRS, reflection on an object occurs by way of a meta-object associated with the object. (The motivation for performing reflection on an object through a separate entity, its meta-object, may not have been obvious before Maes' paper, but in retrospect it's a clear case of separation of concerns.) Since everything is an object in a pure object-oriented language, meta-objects are everywhere:

[T]he self-representation of an object-oriented system is uniform. Every entity in a 3-KRS system is an object: instances, classes, slots, methods, meta-objects, messages, etc. Consequently every aspect of a 3-KRS system can be reflected upon. All these objects have meta-objects which represent the self-representation corresponding to that object.
For the uninitiated (including me), a slot is, loosely, an instance variable. And I suspect that the difference between a method and a message is that a method defines a method (pardon the circularity) and a message "calls" a method. (The notion of a message likely goes back to Smalltalk.)

Further, meta-objects are manipulable at runtime, so — to borrow an example from Maes — a language that supports reflection in all its glory allows the programmer to modify meta-objects to provide support for multiple inheritance. Intuitively, a complete reflective system is an API for the semantics of a programming language. (See also: "A metaobject protocol (MOP) is an interpreter of the semantics of a program that is open and extensible.")
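
A small taste of "manipulable at runtime", again in Python and again only an analogy: a Python class object loosely plays the role of a meta-object for its instances, which is far short of what 3-KRS offers. The Account class and describe function are hypothetical.

```python
class Account:
    """Hypothetical class; note that it defines no describe() method."""

    def __init__(self, balance):
        self.balance = balance


account = Account(100)
had_describe_before = hasattr(account, "describe")


def describe(self):
    return f"Account({self.balance})"


# Modify the class object while the program is running. The instance
# created earlier picks up the new behavior without being touched.
Account.describe = describe
description = account.describe()

# The class is itself an object with its own class.
class_of_class = type(Account)
print(had_describe_before, description, class_of_class)
```

The point of the sketch is that the change is made by the program to its own representation, not by editing the source of Account.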

Two aspects of Maes's reflection which are not true for popular statically-typed languages, then, are:
  • meta-objects are pervasive
  • meta-objects are mutable
Obviously for, say, Java or C#, this is true by design. Enterprises of all kinds are often hard-pressed to find programmers who can understand the source code of their larger applications. Some managers are left in the lurch when a key employee departs. Self-modifying code, one might say, is job security. So clarity, explicitness, consistency, readability — these language properties are desirable for production software. A complete reflective system violates them. The flip side of this is expressiveness. Domain-specific languages (DSLs) come to mind. Being able to define little languages to solve a particular problem, and to compose larger applications from modules written in little languages, is an attractive idea. I'd like to say more about DSLs and how they relate to reflective systems, but I don't know much more than that they've been seeping into programming culture for quite some time.

Incidentally, Aspect-Oriented Programming (AOP) exists to address the fact that cross-cutting concerns — any application requirements that cause code to be scattered throughout a code base (e.g. logging, security) — violate modularity. However, it also bridges the gap between the limited form of reflection that exists in Java and some of the more interesting uses of reflection Maes mentions — such as being able to trace the execution of a program (with e.g. print statements) without modifying the program itself. This is no coincidence. Gregor Kiczales, the author of a book about the MetaObject Protocol, which was inspired by Maes' paper, is a coauthor of the earliest paper on AOP. In a sense, implementations of AOP attempt to augment a language with a meta-object protocol without changing the language itself.
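
The tracing example can be approximated in a few lines of Python with a decorator. This isn't AspectJ or any real AOP framework; traced, calls, and add are names I made up, and the decorator is a simplification of what an AOP "advice" does around a join point.

```python
import functools

calls = []  # collected trace messages, instead of print statements


def traced(func):
    """Wrap a function so that entry and exit are recorded."""

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        calls.append(f"enter {func.__name__}{args}")
        result = func(*args, **kwargs)
        calls.append(f"exit {func.__name__} -> {result}")
        return result

    return wrapper


def add(a, b):
    # The body knows nothing about tracing.
    return a + b


# The tracing concern is attached from outside the function body.
add = traced(add)
total = add(2, 3)
print(total, calls)
```

The logging code lives in one place and can be applied to any function, which is the modularity argument for treating logging as a cross-cutting concern.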


Thursday, November 12, 2009

A Little History of Electronic SEC Filings

In 1986, the Securities and Exchange Commission started to accept SEC filings — 10-K forms and the like — electronically. Between '86 and '92, only a handful of companies filed their 10-K electronically. The companies?

1986
  • Medical Monitors, Inc.
1987
  • Medical Monitors, Inc.
1988
  • Medical Monitors, Inc.
  • Fast Eddie Racing Stables, Inc.
1989
  • Medical Monitors, Inc.
  • Fast Eddie Racing Stables, Inc.
  • Jilco Industries, Inc.
  • Whitney American Corp.
  • Filmagic Entertainment Corp.
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1989-2
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1989-3
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1989-5
1990
  • Medical Monitors, Inc.
  • Fast Eddie Racing Stables, Inc.
  • Jilco Industries, Inc.
  • Filmagic Entertainment Corp.
  • Xanthic Enterprises, Inc.
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1988-1
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1988-2
1991
  • Medical Monitors, Inc.
  • Fast Eddie Racing Stables, Inc.
  • Jilco Industries, Inc.
  • Filmagic Entertainment Corp.
  • Xanthic Enterprises, Inc.
  • Admiral Financial Corp.
  • Quad Metals Corp.
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1988-4
1992
  • Medical Monitors, Inc.
  • Fast Eddie Racing Stables, Inc.
  • Jilco Industries, Inc.
  • Filmagic Entertainment Corp.
  • Xanthic Enterprises, Inc.
  • Admiral Financial Corp.
  • Quad Metals Corp.
  • American Housing Partners
  • First Boston Mortgage Sec. Corp. Con Mor Pas Thr Cer CR 1992-3
While I get the obvious kick out of Fast Eddie Racing Stables, Inc. -- that it's publicly traded and was an early-comer to electronic SEC filing -- my hunch is that these early companies are there not necessarily because of who runs them (although it's certainly possible that Medical Monitors, Inc. has a techno-savvy CEO) but more likely because 86-92 was a period of controlled introduction. In the following years, the number of electronically-filed 10-Ks was:
Year   # of filings
1993   1305
1994   1249
1995   3460
1996   6482
I suspect that the SEC opened the flood gates only partially in 93-94, a bit more in 95, and completely in 96. On the other hand, this was the era during which the Internet was really taking off, so it's hard to distinguish between what the SEC mandated and what it allowed.

If you're wondering what those First Boston Mortgage things are, they're probably asset-backed securities, which are by far the largest class of securities regulated by the SEC. Commercial state banks are up there, too. But it's interesting that the largest group of securities aren't even operating companies — they're just a publicly-traded piece of paper that represents a share of ownership in some asset.

Sunday, November 08, 2009

Modeling the World Wide Web

Just so I don't forget, some papers by Filippo Menczer, who appears to be doing work related to an idea I've been mulling over for a while.

Informally, consider the World Wide Web as a graph with web pages as nodes and hyperlinks as edges, label each node with some value derived from the contents of the web page (e.g. the length of the page, the set of terms in the page, or the term vector for the page), then define the value of an edge as the difference between the nodes it connects. This basically yields a geometric representation of the web graph by mapping it into some metric space. (If the idea is still fuzzy, imagine a hyperlink as a function from the document that contains it to the document to which it points.)
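
Here's a minimal sketch of the model so I can play with it later. Everything in it is an assumption on my part: the three tiny "pages", the whitespace tokenizer, and the choice of Euclidean distance between term vectors as the "difference" between two nodes (picked because it's a proper metric).

```python
import math
from collections import Counter


def term_vector(text):
    """Label a page with its bag-of-words term counts."""
    return Counter(text.lower().split())


def euclidean_distance(u, v):
    """Distance between two term vectors; missing terms count as zero."""
    terms = set(u) | set(v)
    return math.sqrt(sum((u[t] - v[t]) ** 2 for t in terms))


def edge_values(pages, links):
    """Give each hyperlink the distance between the pages it connects."""
    vectors = {url: term_vector(text) for url, text in pages.items()}
    return {
        (source, target): euclidean_distance(vectors[source], vectors[target])
        for source, target in links
    }


# Hypothetical three-page web.
pages = {
    "a": "cats and dogs",
    "b": "cats and birds",
    "c": "cats and dogs",
}
links = [("a", "b"), ("a", "c")]
values = edge_values(pages, links)
print(values)
```

A link between two pages with identical term counts gets the value 0, and the value grows as the linked pages' vocabularies diverge.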

Now that I've found some already-published work on this model, I'll have to spend next semester learning what people have already done so I can do something new.

Friday, November 06, 2009

Independent Study: Traits: Composable Units of Behaviour

(This is one of a series of posts about papers I'm reading for an independent study with Prof. Evan Chang at the University of Colorado, Boulder. The format is similar to that of a review of a paper submitted to a computer science conference. These are already-published papers, so I'll be writing with the benefit of hindsight, especially when the paper was published at least several years ago.)

Submission: Traits: Composable Units of Behaviour [PDF]

Please give a brief, 2-3 sentence summary of the main ideas of this paper:

If you start with the reasonable assumption that code reuse improves programmer productivity, an important question is how to increase code reuse. Historically, inheritance (both single and multiple) has been a mechanism for code reuse, inasmuch as it has allowed classes to be composed at least in part from other classes. Mixins solve some of the problems of single and multiple inheritance, but they don't work well when a class uses two or more mixins containing identically-named methods. Traits solve the same problems as mixins, but don't suffer from their limitations.
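The name-clash problem is easy to reproduce. Here's a rough sketch in Python, which has mixins (via multiple inheritance) but not the paper's traits. The class and method names are hypothetical, and the explicit aliases at the end are only loosely analogous to how a trait composition makes the composing class resolve a conflict itself.

```python
class JsonMixin:
    def dump(self):
        return "json:" + repr(self.data)


class XmlMixin:
    def dump(self):
        return "xml:" + repr(self.data)


class Report(JsonMixin, XmlMixin):
    """Both mixins define dump(); the one listed first silently wins."""

    def __init__(self, data):
        self.data = data


class ExplicitReport(JsonMixin, XmlMixin):
    """The composing class names each conflicting method explicitly."""

    def __init__(self, data):
        self.data = data

    # Aliases chosen by the composing class, so neither method is lost.
    dump_json = JsonMixin.dump
    dump_xml = XmlMixin.dump


shadowed = Report(1).dump()        # XmlMixin.dump is unreachable by name
explicit = ExplicitReport(1)
```

In Report, which dump you get depends on nothing more than the order of the base classes, and no error or warning flags the clash.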
What is the strength of this paper (1-3 sentences):
Traits solve the problems of previous attempts to facilitate code reuse. Further, they suggest a style of designing software — as collections of traits (which ideally define only a single method) instead of collections of classes — that may be useful in its own right.
What is the weakness of this paper (1-3 sentences):
None noted.
Evaluation:
Wonderful!
Novelty:
The authors note that traits are inspired by mixins, and in one (arguably incorrect) sense, traits are merely an incremental improvement on mixins; however, the deficiencies the authors identify in mixins are real (and especially important for large, complex applications) and require solving, so the fact that traits aren't light years ahead of mixins is irrelevant, as the improvements traits provide are necessary.
Convincing:
Yes. The refactoring example helps the authors make a strong case for traits.
Worth solving:
See my response about novelty above.
Confidence:
I'm confident in the material in this paper.
Detailed comments:

The notion that code reuse improves programmer productivity is non-controversial. A library of well-tested and widely-used classes in a language's ecosystem provides not only implementations of common functionality (thereby relieving programmers of having to implement and test that functionality themselves) but also (because classes and methods are named) a common language for communication among programmers — both of which reduce "friction" during software development.

A problem for API designers is that code reuse in most common object-oriented languages is done at the level of the class definition: a client class T of some API class U reuses U either by extending U or by otherwise referring to U (e.g. statically, as the type of a local variable, or as the type of a member variable). This can lead to an undesirable decoupling of generic functionality from the classes it operates on, making it hard for programmers to discover reusable code. For example, in Java, to sort an instance of a class that implements java.util.List, one needs to know to use the sort method of the Collections class. Acquiring this knowledge isn't necessarily onerous for a programmer, but it likely leads novice programmers to implement their own sorting methods. A language with mixins or traits does not have this problem. In such a language, a class C that can be sorted uses the mixin or trait that provides a generic sorting routine (as long as C provides whatever methods the mixin or trait requires [e.g. something akin to the compareTo of Java's Comparable]), and discovering that C can be sorted is a matter of invoking an IDE's method autocomplete feature or, absent IDE support, a cursory examination of the definition of C. This is similar to the problem of jungloid navigation, where a programmer knows what kind of object she wants, but doesn't know how to get it, except in this case a programmer knows what she wants to do with an object (i.e. sort a list), but doesn't know how to do it (i.e. call Collections.sort).
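Here's a minimal sketch of the discoverability point, in Python rather than Java. SortableMixin, Playlist, and their methods are hypothetical names invented for this example, and a Python mixin is only an approximation of a trait as the paper defines it.

```python
class SortableMixin:
    """Provides sorting to any class that supplies items() and key(item)."""

    def sorted_items(self):
        # Relies on two methods the host class is required to provide,
        # much as a trait declares the methods it requires.
        return sorted(self.items(), key=self.key)


class Playlist(SortableMixin):
    def __init__(self, tracks):
        self._tracks = list(tracks)

    def items(self):
        return self._tracks

    def key(self, track):
        # Each track is a (title, seconds) pair; order by duration.
        title, seconds = track
        return seconds


playlist = Playlist([("b-side", 200), ("single", 100)])
ordered = playlist.sorted_items()
```

Because sorted_items is a method of Playlist itself, it shows up in dir(playlist) and in an IDE's autocomplete list. There's no separate utility class to know about.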

A little history on multiple inheritance, interfaces, and mixins. So far as I've been able to find, interfaces first appeared in Modula-2 (1978), where they're called definition modules. From the Modula-3 report (I wasn't able to find one for Modula-2): "An interface is a group of declarations. Declarations in interfaces are the same as in blocks, except that any variable initializations must be constant and procedure declarations must specify only the signature, not the body." Modula-3 (1980s) also had multiple inheritance. Mixins first appeared in an OO extension to LISP (called Flavors [circa 1980]). Flavors fed into the Common Lisp Object System, where the concept of a Meta-Object Protocol was first implemented. Perl's Moose is built atop Class::MOP, which was inspired by the CLOS MOP.

Sunday, November 01, 2009

It's Expensive Being Rich

When you open a web page in a browser, the browser loads the page and all other resources to which the page refers. Some of those resources are files containing JavaScript code, and those files keep getting larger. Some of them can be large enough to noticeably delay the complete loading of the web page, which is a real problem for web site operators. Their dilemma is that users demand features, and JavaScript is a way to provide lots of features, but users also demand that pages load quickly, and adding more and more JavaScript increases the time it takes a page to load. Without changing the underlying technology, it's akin to a zero-sum game, or a game of Whack-a-Mole, or whatever notion you prefer for identifying a situation like this. The two requirements — provide a rich end-user experience, provide it quickly — are to some degree at odds with one another.

Recently James Hamilton pointed out a cool research project which transformed JavaScript source files into a form that allows the source code to be loaded only when needed. Another approach to dealing with this problem — an approach that is complementary to the approach mentioned by Hamilton — is to modify the HTML script element to indicate whether the JavaScript source file needs to be loaded immediately or can be loaded "lazily." This approach exists in the script element section of the HTML 5 draft specification.

Meditations on Meat

Exhibit #1: "My childhood was typical, summers in Rangoon, luge lessons. In the spring we'd make meat helmets."

Exhibit #2: Turducken

Exhibit #3: Report: Meat Now America's No. 2 Condiment

Discuss.