The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

OpenResty::Spec::Overview - Overview of the OpenResty service platform

INTRODUCTION

OpenResty is a general-purpose RESTful web service platform for web applications. It provides the following important functionalities for a common nontrivial web app:

  • (scalable) relational data storage,

  • truely RESTy interface and JSON/YAML data transfer format,

  • SQL-based reusable views,

  • a REST-oriented role system for access control,

  • view-based RSS feeds,

  • user-defined actions in the RestyScript language,

  • captchas,

  • and cross-site AJAX support.

What OpenResty is

  • A REST wrapper for relational databases

  • A web runtime for 100% JavaScript web sites and other RIAs.

  • A "meta web site" supporting other sites via web services.

  • A handy personal or company database which can be accessed from anywhere on the web.

  • A (sort of) competitor for the Facebook Data Store API.

What OpenResty is NOT

  • A server-side web application framework.

  • A relacement for highly scalable semi-structured data storage solutions like Amazon SimpleDB or CouchDB.

REST API

This section just gives a conceptual overview for the REST API probably with some samples. For detailed spec for the various REST request syntax, see OpenResty::CheatSheet and OpenResty::Spec::REST.

Accounts

An openresty server typically distributes its data in terms of accounts, especially when the backend is a database cluster. An account is an atomic namespace for other OpenResty first-class objects like models and views. (In the current Pg and PgFarm backends, accounts are actually implemented by Pg schemas.) These objects are shared in the same account and different accounts can have different models, views, actions, and etc. with the same names.

Operations like creating and removing accounts are not part of the OpenResty web service API. Basically the sysadmin uses the following command to create an account on his server terminal:

  $ bin/openresty adduser marry

and a similar command to remove one:

  $ bin/openresty deluser marry

Roles

Multiple users can share the same set of objects in an account by logging in as different roles. And fine-grained access control can be achieved by specifying different sets of ACL rules for each role.

Every OpenResty account has two builtin roles throughout its lifetime: Admin and Public.

The Admin role always owns the most privileges and its properties and ACL rule set are always read only. Public role is always anonymous but its ACL rule set can be modified by a role with enough privileges.

An OpenResty role with access to the Role API (such as Admin) can create new roles, remove existing roles (except the two builtin roles explained above, of course), and modify the properties and ACL rules of other roles or even itself. For instance, to allow the Public role to perform the request GET /=/model/Post/id/<some number> under the same account, the Admin role could insert a corresponding access rule to the Public role's ACL rule set, like this:

  POST /=/role/Public/~/~ HTTP/1.0
  Content-Type: text/json
  Content-Length: 45

  {"method":"GET", "url":"/=/model/Post/id/~"}

The JSON structure in the POST content specifies an ACL rule. The tild (~) character in the url value serves as a wildcard which matches "anything". So both GET /=/model/Post/id/1 and GET /=/model/Post/id/231 are allowed to be performed by the Public role.

Interestingly it's also possible to grant the Public role privileges to augment its own ACL rule set in a similar way:

  POST /=/role/Public/~/~ HTTP/1.0
  Content-Type: text/json
  Content-Length: 46

  {"method":"POST", "url":"/=/role/Public/~/~"}

Login

Every user accessing an OpenResty server must specify both its account name and its role name unless he or she has already logged in and got a session ID. For example, a typical HTTP request may look like this:

  GET /=/model/Post/id/3?_user=agentzh.Public HTTP/1.0

In the above example, the _user parameter has the value agentzh.Public where agentzh is the account name and Public the role name. In addition, the Public role is an anonymous role, or a _password or a _captcha parameter would be required here as well. This authentication method is called "per-request login".

Alternatively, the user can login with his user name and MD5'd password first so as to obtain a session ID which can be used for subsequence requests. For example:

  GET /=/login/agentzh.Admin/5f4dcc3b5aa765d61d8327deb882cf99 HTTP/1.0

will yield an HTTP response from the OpenResty server like this:

  HTTP/1.0 200 OK
  Connection: close
  Content-Type: text/json; charset=UTF-8
  Content-Length: 133
  Date: Mon, 21 Apr 2008 11:51:49 GMT

  {
      "success": 1,
      "session": "535F265E-0F99-11DD-B185-1A3EB9E8D9B0",
      "account": "agentzh","role":"Admin"
  }

And subsequent requests can be made by using the resulting session ID:

  GET /=/model/Post/id/3?_session=535F265E-0F99-11DD-B185-1A3EB9E8D9B0

For convenience, the sample HTTP requests given throughout this document will not specify the _user nor the _session parameter explicitly.

It's worth mentioning that the simple MD5 treatment of passwords in the current implementation is merely a hack and will be changed in the near future. It's highly recommended to use SSL for the password login method for any serious uses.

Models

An OpenResty model is just an abstract concept of tables found in common relational databases. An instance of an OpenResty model could be a blog post:

  {
    "description":"Blog post",
    "columns": [
      { "name":"title", "label":"Post title", "type":"text" },
      { "name":"content", "label":"Post content", "type":"text" },
      { "name":"author", "label":"Post author", "type":"text" },
      { "name":"created", "default":["now()"],
          "type":"timestamp (0) with time zone",
          "label":"Post creation time" },
      { "name":"comments", "label":"Number of comments",
          "type":"integer", "default":0 }
    ],
  }

This is approximately the Post model used in my personal blog site http://blog.agentzh.org. The rough SQL equivalence could be as follows:

  create table "Post" (
      title text,
      content text,
      author text,
      created timestamp (0) with time zone default now(),
      comments integer default 0
  )

Although the data storage backend may be truly implmented this way, the column types and names that can be used here are well defined and reasonably limited.

After creating a model, one can insert data via an HTTP POST request:

  POST /=/model/Post/~/~ HTTP/1.0
  Content-Type: text/json
  Content-Length: 111

  {
    "title":"My first post",
    "content":"Blah blah blah...",
    "author":"Agent Zhang"
  }

Multiple rows can be inserted at a time as well, but there's a limit.

The model API not only offers interfaces to perform CRUD operations on models, columns, and rows, but also gives some simple but still powerful query functionalities. Here's an example:

  GET /=/model/Post/author/agentzh?_order_by=created:desc&_count=10 HTTP/1.0

which is roughly equivalent to the following standard SQL query:

  select *
  from "Post"
  where "author" = 'agentzh'
  order by created desc
  count 10

Views

To address the problem of extending the limited data query interface provided by the model API, OpenResty integrates a view system in which the user can define reusable SQL-like queries by means of the RestyScript language. Here is an example:

  POST /=/view/~ HTTP/1.0
  Content-Type: text/json
  Content-Length: 312

  {
    "name": "CommentsToAuthor",
    "description": "Recent comments for the blog",
    "definition": "
         select Comment.sender as guest,
                Comment.body as comment_body
         from Comment, Post
         where Comment.id = Post.id and Post.author = $author"
  }

In this sample, the string literal for the definition slot has been splitted into multiple lines for readability. The RestyScript language for views is just a (non-strict) subset of the standard SQL language, thus giving powerful strucutred query capability to the user, which is often a missing feature in those highly-distributed and semi-structured data storage solutions like CouchDB and SimpleDB.

Unlike SQL, however, the view definition can take one or more parameters (or named place-holders) which are required to feed values while invoking the view (unless they have a default value):

  GET /=/view/CommentsToAuthor/author/agentzh

Or equivalently

  GET /=/view/CommentsToAuthor/~/~?author=agentzh

The HTTP response from the OpenResty server might be

  HTTP/1.0 200 OK
  Connection: close
  Content-Type: text/json; charset=UTF-8
  Content-Length: 187
  Date: Mon, 21 Apr 2008 12:42:15 GMT

  [
    {"guest":"laser","comment_body":"super cool!"},
    {"guest":"carriezh","comment_body":"hello?hello?"},
    {"guest":"agentzh":"comment_body":"Thanks for commenting!"}
  ]

Feeds

OpenResty offers the feed objects which can be used to map OpenResty views to RSS 2.0 feeds. For instance, the OpenResty feed object for my blog posts looks like this:

    {
        "description": "Feed for blog posts",
        "author": "agentzh",
        "copyright": "Copyright 2008 by Yahoo! China EEEE Works",
        "language": "zh-cn",
        "title": "Posts for Human & Machine",
        "link": "http://blog.agentzh.org",
        "logo": "http://blog.agentzh.org/me.jpg",
        "view": "PostFeed"
    }

and the PostFeed view used to generate this feed has the following definition:

    {
      "description":"View for post feed",
      "definition":
      "
 select author, title, 'http://blog.agentzh.org/#post-' || id as link,
        content, created as published,
       'http://blog.agentzh.org/#post-' || id || ':comments' as comments
 from Post
 order by created desc
 limit $count | 20
      "
    }

Here the PostFeed view takes one optional parameter $count (with the default value 20), which controls the number of resulting rows returned.

Not every view can be used to drive feed generation. The resulting rows of the view must have the columns that make sense to the feed, like author, title, link, content, published, and comments.

After creating the Post feed in my agentzh account, one can subscribe to the feed by the following GET request:

  GET /=/feed/Post/~/~ HTTP/1.0

Check http://api.eeeeworks.org/=/feed/Post/~/~ to see what the actual response looks like.

One nice thing about the feed object is that it can forward arguments to the view that drives it:

  GET /=/feed/Post/count/100 HTTP/1.0

This will produce the RSS 2.0 feed for the last 100 post entries rather than the default 20, giving more options to my blog readers.

Actions

An openresty action is a bunch of RestyScript commands with a name attached to it. Such a command can be either a SQL-like statement or an HTTP-like command.

An example for SQL-like commands could be

    update Post
    set comments = comments + 1
    where id = $post_id

In this update command, Post is the name of an OpenResty model (assuming it's already there), comments is one of its columns, and $post_id is a parameter for the whole action (similar to parameters for views).

An instance of HTTP-like commands could be

    POST '/=/model/Comment/~/~'
    { "sender": $sender, "body": $body, "post": $post_id }

Here the http://blah.blah.blah part is omitted in the POST URL, so it's default to the current OpenResty server being requested. If a full URL is specified here, one can do some really cool things by invoking the resources of some other OpenResty server.

Similarly, for the SQL-like command such as:

    delete from Comment
    where post = $post_id and sender = $spammer

[TODO: a optional run on clause might be specified to run "SQL" on remote OpenResty servers if permitted.]

Furthermore, we can put multiple RestyScript commands together using the ; separator to define a full action object:

    {
        "name": "PostComment",
        "description": "Action for posting a comment",
        "parameters":[
            {"name":"post_id","label":"Post ID","type":"literal"},
            {"name":"sender","label":"Sender","type":"literal","default":"agentzh"},
            {"name":"body","label":"Body","type":"literal"}
        ],
        "definition":
        "
            update Post
            set comments = comments + 1
            where id = $post_id;
            POST '/=/model/Comment/~/~'
            { "sender": $sender, "body": $body, "post": $post_id }
        "
    }

We still split the definition string into multiple lines for readability. The PostComment action defined here takes 3 parameters, i.e. $post_id, $sender, and $body.

One can invoke the PostComment action like this:

    GET /=/action/PostComment/~/~?post_id=3&sender=marry&body=Haha
    HTTP/1.0

The server response would be an array of results for each command. If any of the commands fails, the whole action would fail. Even preious successfully executed commands would get rolled back. That is, actions always run in a transaction.

With actions the user can encapsulate multiple OpenResty REST requests as well as SQL-like update and delete statements as a whole and reuse as many times as he wishes. Such atomicity would be very useful in the context of captcha authentication. (See the "Captchas" section for more information.)

More interestingly it would be possilbe to call other actions or views in an action, or even call the action itself (i.e. recursive calling). Special constraints would be imposed on the length of the action call chain though. There would also be some limit regarding the total number of commands grouped in an action.

Captchas

Captchas are just another way to do "per-request login" in addition to the anonymous and password login methods.

Captcha support must be associated with some user-defined role whose "login" attribute is set to the value "captcha", like this:

    POST /=/role/CommentPoster HTTP/1.0
    Content-Type: text/json
    Content-Length: 64

    {"description":"Role for posting comments","login":"captcha"}

Therefore, it's not hard to see that it's not possible to do captchas with roles like Public or Admin.

Then we should grant priviledges to the operations that need to performed by solving a capthca challenge for the CommentPoster role:

    POST /=/role/CommentPoster/~/~ HTTP/1.0
    Content-Type: text/json
    Content-Length: 48

    {"method":"POST","url":"/=/model/Comment/~/~"}

Then the clients (like the JavaScript code in a web page) could do the following:

  1. Use GET /=/captcha/id to obtain a captcha ID from the OpenResty server.

  2. Use the captcha ID, say, B44572D0-1038-11DD-B185-1A3EB9E8D9B0, to fetch the catpcha image:

        GET /=/captcha/id/B44572D0-1038-11DD-B185-1A3EB9E8D9B0 HTTP/1.0
  3. With the captcha ID along with the captcha solution given by the end user the following operation can be tried:

        POST /=/model/Comment/~/~?_user=agentzh.CommentPoster\
            &_captcha=B44572D0-1038-11DD-B185-1A3EB9E8D9B0:pretty%20cat \
            HTTP/1.0
        Content-Type: text/json
        Content-Length: 52
    
        {"sender":"agentzh","body":"Good post!","post":25}

If the user solution pretty cat (i.e. "pretty%20cat") provided in the captcha URL parameter is incorrect, the server would reject the whole POST operation.

Cross-site AJAX

The OpenResty server opens a special door to the JavaScript code in a web page coming from other domains, so as to allow REST requests get directly initiated from the end user's web browser.

For GET requests, it's the common practice to do cross-domain AJAX requests via dynamically created <script> tags. To help the page owner do this trick with an OpenResty server, the special _callback URL parameter is supported to make the server returning JSON data wrapped by some_callback_func( and );. For example, the request

  GET /=/view/RecentPosts/~/~?_user=agentzh.Public&_callback=foo HTTP/1.0

yields something like this

  HTTP/1.0 200 OK
  Connection: close
  Content-Type: application/x-javascript; charset=UTF-8
  Content-Length: 74
  Date: Mon, 21 Apr 2008 12:42:15 GMT

  foo(
    [
        {"title":"My first post","id":1},
        {"title":"My second one","id":2}
    ]
  );

Note the extra stuff foo(...) around the JSON data.

POST, PUT, and DELETE requests all have their GET variations:

    GET /=/post/...
    GET /=/put/...
    GET /=/delete/...

where ... is the normal stuff in POST /=/..., PUT /=/..., and DELETE /=/..., respectively. Some people might be nervous about GET requests doing data modification but I can't think of a better way.

Cookies for authentication should always be excluded due to the risk of XSS attacks.

To overcome the length limit of URLs, cross-site POST interface is also supported. Basically the user could use an HTML form in his web page like below:

  <form action="/=/model/Comment/~/~?_last_response=69bc45ec71ca7dc83cc"
        method="POST"
        onclick="onPostComment()"
        target="myHiddenFrame">
    <input type="hidden" name="data" value="{some JSON goes here...}">
  </form>
  <iframe id="myHiddenFrame" style="display: none"></iframe>

and the browser may initiate a POST request when submitting this form:

  POST /=/model/Comment/~/~?_last_response=69bc45ec71ca7dc83cc HTTP/1.0
  Content-Type: application/x-www-form-urlencoded
  Content-Length: 23

  data={some JSON goes here...}

which is functionally equivalent to

  POST /=/model/Comment/~/~
  Content-Type: text/json
  Content-Length: 23

  {some JSON goes here...}

with the exception that the OpenResty serser would save the response of the current POST request and allow the user to retrieve it later (using the same _last_response key):

  GET /=/last/response/69bc45ec71ca7dc83cc

Note that the _last_response key 69bc45ec71ca7dc83cc used here should be randomly selected and globally unique enough.

The "last response" stuff is essential for the web page client to obtain the response of its POST request because there's no (known) way for the JavaScript code to directly "look" into the target frame (i.e. the myHiddenFrame iframe in the above sample) belonging to the OpenResty server's domain.

As you might have already noticed, two HTTP round-trips are required to do a true POST, which is a bit expensive. We'll use the cross-site cookies (as well as p3p headers for IE) to deliver the response of POSTs to the JavaScript initiator.

CLIENTS

In theory, any programming languages or tools with basic HTTP 1.0/1.1 support would have access to 100% of the OpenResty services.

But to make things even easier, there are currently two ad-hoc OpenResty client library for JavaScript and Perl:

openresty.js

See http://svn.openfoundry.org/openapi/trunk/clients/js/.

WWW::OpenResty

See the WWW::OpenResty module on CPAN. Most of the time one would just use its subclass WWW::OpenResty::Simple which is much more handy IMHO ;)

SAMPLE APPLICATIONS

OpenResty's admin site

http://openresty.org/admin/

agentzh's blog and EEEE Works' blog

http://blog.agentzh.org

http://eeeeworks.org

Yisou BBS

http://www.yisou.com/opi/post.html

An IRC bot, springbot

http://svn.openfoundry.org/openapi/trunk/demo/Springbot/

Most of the sample apps' source code can be found at http://svn.openfoundry.org/openapi/trunk/demo/.

AUTHOR

Agent Zhang <agentzh@yahoo.cn>

COPYRIGHT AND LICENSE

  Copyright (c)  2008  Yahoo! China EEEE Works, Alibaba Inc.
  Permission is granted to copy, distribute and/or modify this document
  under the terms of the GNU Free Documentation License, Version 1.2
  or any later version published by the Free Software Foundation;
  with no Invariant Sections, no Front-Cover Texts, and no Back-Cover
  Texts. A copy of the license can be found at

    http://www.gnu.org/licenses/fdl.html

SEE ALSO

OpenResty::Spec::REST_cn, OpenResty, OpenResty::CheatSheet.