
CouchDB and CouchApp

There are many languages, frameworks and libraries for writing web applications today. One of the main languages in this set is JavaScript, because it runs in most browsers without any additional third-party libraries and is quite flexible.

Today it is possible to write entire applications, impressive in their simplicity and efficiency, using only JavaScript. In this article I want to talk about two interesting technologies: CouchDB and CouchApp. These two tools let you write web applications using only JavaScript (of course, you also need to know HTML/CSS).

Hello CouchDB

So what is CouchDB?
CouchDB is an open source document-oriented database written mostly in the Erlang programming language. Yep, it's NoSQL. Main features:

  • Document Storage – All documents are stored in JSON format, so the schema of records is flexible, like in MongoDB.
  • Document Revisions – Every document has a list of revisions (old versions of the document), which can help you roll a document back.
  • REST API – All items are stored as resources, each with a unique URI, and you can perform CRUD (Create, Read, Update, Delete) operations on every resource.
  • Distributed Architecture with Replication – CouchDB was designed with bi-directional replication (or synchronization) and off-line operation in mind. That means multiple replicas can have their own copies of the same data, modify it, and then sync those changes at a later time. The biggest gotcha typically associated with this level of flexibility is conflicts.
  • Eventual Consistency – CouchDB guarantees eventual consistency to be able to provide both availability and partition tolerance.
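
As a rough illustration of the REST API and revision features listed above, the sketch below maps each CRUD operation to the HTTP request CouchDB expects for a single document. It only builds request descriptions and does not talk to a server. The helper name `crudRequest`, the database `mydb` and the document id `note-1` are made up for this example.

```javascript
// Hypothetical helper: describes the HTTP request for a CRUD operation
// on a single CouchDB document. It does not send anything.
function crudRequest(operation, dbUrl, docId, options = {}) {
  const url = dbUrl + "/" + encodeURIComponent(docId);
  switch (operation) {
    case "create":
      // PUT with a JSON body creates the document under the given id.
      return { method: "PUT", url: url, body: JSON.stringify(options.doc) };
    case "read":
      return { method: "GET", url: url };
    case "update":
      // An update must carry the current revision, otherwise CouchDB
      // answers with 409 Conflict.
      return {
        method: "PUT",
        url: url,
        body: JSON.stringify(Object.assign({}, options.doc, { _rev: options.rev })),
      };
    case "delete":
      // Deleting also needs the current revision, passed as a query parameter.
      return { method: "DELETE", url: url + "?rev=" + encodeURIComponent(options.rev) };
    default:
      throw new Error("Unknown operation: " + operation);
  }
}

const request = crudRequest("update", "http://localhost:5984/mydb", "note-1", {
  doc: { text: "Hello" },
  rev: "1-abc",
});
console.log(request.method, request.url, request.body);
```

The revision that travels with every update and delete is what lets CouchDB detect conflicting writes.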

Here is a short glossary of CouchDB terms that you need to know when reading this article:

  • Design documents – a special type of CouchDB document that contains application code.
  • Views – the combination of a map and a reduce function (MapReduce!). A map function is called once with each document as the argument. The function can choose to skip the document altogether or emit one or more view rows as key/value pairs. Map functions may not depend on any information outside of the document. This independence is what allows CouchDB views to be generated incrementally and in parallel. Views are used for filtering and finding documents in CouchDB (no SQL, only views).
  • Show functions – build HTML or another output format from a single document and have a constrained API designed to ensure cacheability and side effect–free operation.
  • List functions – just as show functions convert documents to arbitrary output formats, list functions allow you to render the output of view queries in any format. The powerful iterator API allows for flexibility to filter and aggregate rows on the fly, as well as output raw transformations for an easy way to make Atom feeds, HTML, CSV files, or even just modified JSON.
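
To make the last two terms more concrete, here is a minimal sketch of a show function and a list function. Inside CouchDB the `getRow()` and `send()` functions are provided by the server. The stand-ins below, the `title` field and the function names are assumptions added only so the sketch runs locally.

```javascript
// Show function: receives one document and the request, returns the output.
function showTitle(doc, req) {
  return "<h1>" + doc.title + "</h1>";
}

// List function: receives view metadata and the request. Rows are pulled
// with getRow() and output is written with send(), both provided by CouchDB.
function listAsCsv(head, req) {
  send("key,value\n");
  let row;
  while ((row = getRow())) {
    send(row.key + "," + row.value + "\n");
  }
}

// Stand-ins for CouchDB's getRow() and send(), only so the sketch runs locally.
const pendingRows = [{ key: "a", value: 1 }, { key: "b", value: 2 }];
let output = "";
function getRow() {
  return pendingRows.shift();
}
function send(chunk) {
  output += chunk;
}

listAsCsv({}, {});
console.log(showTitle({ title: "Hello CouchDB!" }, {}));
console.log(output);
```

In a design document these functions would live under the `shows` and `lists` keys.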

First of all, you need to install CouchDB. Installation is not covered in this article, because that information is easy to find on the Internet. After installation, check that everything works with a curl request (replace localhost with the IP of the server where you installed CouchDB):

$ curl http://localhost:5984

{"couchdb":"Welcome","version":"1.1.1"}

If everything works, you should see a response similar to this one.

Once CouchDB is built and running, Futon, the web-based interface for managing databases, is available at http://localhost:5984/_utils/.

More screenshots are available at http://couchdb.apache.org/screenshots.html.

Through this web interface you can manage the databases and configuration of CouchDB. As you can see, there are two ways to manage the data in your database: by HTTP requests (REST) or from Futon.

Now let's talk about documents in CouchDB. There are two main kinds of document. The first kind stores information, while design documents store application code: views, show and list functions, attachments, etc. So you can build and save a design document that contains your application. The application is then served from CouchDB (CouchDB itself acts as the HTTP server) and uses CouchDB as its data storage. To create this kind of document we use CouchApp.

CouchApp – feel the power of JavaScript in CouchDB

CouchApps are JavaScript and HTML5 applications served directly from CouchDB. The CouchApp tool builds your JS/HTML/CSS application and pushes it into CouchDB. Because the application runs inside a database, its API is highly structured. There are two implementations of CouchApp: one written in Python (http://couchapp.org/) and one written in Node.js (https://github.com/mikeal/node.couchapp.js). I prefer the Node.js version, because the structure of its applications is easier for beginners than that of the Python version.

Let's write a simple application that just tells us how many documents are in the database. I use jQuery and jquery.couch (https://github.com/daleharvey/jquery.couch.js) for the JavaScript part. First of all, let's generate the application skeleton (Node.js and CouchApp should be installed):

$ couchapp boiler simple_on_couch
$ cd simple_on_couch/
$ ls
app.js          attachments

The main file here is “app.js”, which describes how the design document is built (https://gist.github.com/2002829).

“ddoc” is a hash that is pushed to CouchDB as the design document. It contains views, show and list functions, rewrites (the routes of the application), a validation function for CouchDB documents, and attachments (the js/html/css files of the application). As you can see, “ddoc” has the key “_id” with the value “_design/app”. This is the unique key under which the document is saved in CouchDB. If a document does not set this key, CouchDB generates it automatically. The keys of design documents must begin with the string “_design/”; this is how CouchDB recognizes a design document.

Now I add the following text to the body of “attachments/index.html”:
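
For readers who cannot open the gist, the following is a rough sketch of what such a “ddoc” hash may look like. It is not the generated file: the real “app.js” also loads the attachments folder and exports the object, which is left out here so the snippet runs without the couchapp package. The single rewrite rule and the validation rule are assumptions chosen for illustration.

```javascript
// A sketch of a design document as a plain object (not the generated app.js).
const ddoc = {
  _id: "_design/app", // the "_design/" prefix marks a design document
  // Assumed rewrite rule: serve index.html at the root of the app.
  rewrites: [{ from: "/", to: "index.html" }],
  views: {},
  shows: {},
  lists: {},
};

// Validation function: CouchDB calls it before every document write
// and rejects the write if the function throws.
ddoc.validate_doc_update = function (newDoc, oldDoc, userCtx) {
  if (newDoc._deleted === true && userCtx.roles.indexOf("_admin") === -1) {
    throw { forbidden: "Only admins can delete documents." };
  }
};

console.log(Object.keys(ddoc));
```

Views, show functions and list functions are added later as properties of `ddoc.views`, `ddoc.shows` and `ddoc.lists`.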

<h1>Hello CouchDB!</h1> 

and push this document to the CouchDB server. But first you need to create the database in CouchDB that will serve this design document. You can do this from Futon or with a curl command:

$ curl -X PUT http://localhost:5984/simple_on_couch
{"ok":true}

Now push the document to this database:

$ couchapp push app.js http://localhost:5984/simple_on_couch
Preparing.
Serializing.
PUT http://localhost:5984/simple_on_couch/_design/app
Finished push. 1-b5e7352550b9a49891c4a4b5419d87e6

Now we can look at the result at http://localhost:5984/simple_on_couch/_design/app/index.html

Now let's add a view that counts all documents in this database. Open “app.js” and add the following after the “ddoc.views = {};” line:

ddoc.views.count_of_documents = {
  map: function(doc) {
    emit(doc._id, 1);
  },
  reduce: function(keys, values, rereduce){
    if (rereduce) {
      return sum(values);
    } else {
      return values.length;
    }
  }
};

The map function is called for every document and emits key/value pairs, which CouchDB stores in a B-tree; the reduce function then counts the values in this B-tree. We can simplify the reduce function like this:

ddoc.views.count_of_documents = {
  map: function(doc) {
    emit(doc._id, 1);
  },
  reduce: '_count'
};

Recent versions of CouchDB have built-in reduce functions written in Erlang, which run faster than JavaScript functions. In this code we use the “_count” function.

Now we need to get the data from this view and show it in the HTML. Add this code to “attachments/index.html”:
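
To see what the map, reduce and rereduce steps of this view do, here is a toy simulation in plain JavaScript. It is a simplified model and not how CouchDB is implemented: `runView` and `chunkSize` are invented for the example, `emit` is passed in as an argument instead of being a global, and the keys handed to reduce are plain keys rather than key/id pairs.

```javascript
// Toy model of a CouchDB view; not how CouchDB is implemented.
function runView(docs, mapFn, reduceFn, chunkSize) {
  // Map step: call the map function once per document and collect the rows.
  const rows = [];
  for (const doc of docs) {
    mapFn(doc, (key, value) => rows.push({ key: key, value: value }));
  }
  // View rows are kept sorted by key.
  rows.sort((a, b) => (a.key < b.key ? -1 : a.key > b.key ? 1 : 0));

  // Reduce step: reduce each chunk of rows, then rereduce the partial results.
  const partials = [];
  for (let i = 0; i < rows.length; i += chunkSize) {
    const chunk = rows.slice(i, i + chunkSize);
    partials.push(reduceFn(chunk.map((r) => r.key), chunk.map((r) => r.value), false));
  }
  return reduceFn(null, partials, true);
}

// Local stand-in for CouchDB's built-in sum() helper.
const sum = (values) => values.reduce((a, b) => a + b, 0);

const map = (doc, emit) => emit(doc._id, 1);
const reduce = (keys, values, rereduce) => (rereduce ? sum(values) : values.length);

const docs = [{ _id: "a" }, { _id: "b" }, { _id: "c" }, { _id: "d" }, { _id: "e" }];
const total = runView(docs, map, reduce, 2);
console.log(total); // 5
```

With a chunk size of 2, the five rows are reduced to the partial counts 2, 2 and 1, and the rereduce call sums them to 5. This is why the JavaScript version needs the `rereduce` branch, while “_count” handles both cases internally.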

<p>Count of documents: <span id="countOfDocuments"></span></p>

and this code to a JavaScript file in the attachments folder:

$(function(){
  var db = $.couch.db('simple_on_couch');
  db.view("app/count_of_documents", {
    success: function(data){
      if (data.rows.length > 0) $('#countOfDocuments').text(data.rows[0].value);
    }
  });
});

The line “var db = $.couch.db('simple_on_couch');” selects the database for jquery.couch, and “db.view” sends a request to the CouchDB view. On success, we put the document count into the HTML.

At this point we don’t have any documents in the database except the design document. Let's create some documents from Futon:
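
The check on “data.rows.length” in the success callback matters because a reduced view returns no rows at all when nothing has been emitted. The sketch below shows the two response shapes. `extractCount` is a hypothetical helper, and returning 0 for the empty case is a choice made here, whereas the code above simply leaves the element empty.

```javascript
// Hypothetical helper: pulls the count out of a reduced view response.
function extractCount(data) {
  // A reduced view queried without grouping returns at most one row,
  // with a null key. When nothing was emitted, "rows" is an empty array.
  return data.rows.length > 0 ? data.rows[0].value : 0;
}

const withDocs = { rows: [{ key: null, value: 3 }] };
const empty = { rows: [] };
console.log(extractCount(withDocs), extractCount(empty));
```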

And reload our application:

As you can see, it works. Next time we will manage documents from our application.

You can find the source code of this application at https://github.com/railsware/simple_on_couch/tree/v1.

CouchDB Tips

Above, we built a simple CouchApp application. Now I will cover CouchDB itself. Below are a number of tips for using CouchDB.

Filtering Views by Parts of a Complex Key

In CouchDB, view results are sorted by key. In some cases you need to filter by only the first part of a complex key, for example when the last part of the key is used only for ordering (in my practical work with CouchDB this is often required). Say you need to select data ordered by the key [user_id, group_id, timestamp], but you only have user_id and group_id.

Thanks to Ryan Kirkman for his article, in which he shows how to solve this problem. You have to use “startkey” and “endkey” if you want to filter by part of a complex key. If you filter using just “key”, all parts of the complex key must be specified or you will get an empty result, because “key” looks for an exact match.

Note that when filtering by part of the complex key, you can only filter by in-order combinations. For example, if you had [field1, field2, field3] as a key, you could only filter by [field1], [field1, field2] or [field1, field2, field3]. You could not, for example, filter by [field1, field3], as CouchDB would interpret the key you specified for field3 as the value to filter field2 by.

The syntax required to use startkey=…&endkey=… when you want to filter on only part of a complex key is as follows:

Say we had a key like [user_id, group_id, timestamp], and we wanted to filter on only user_id and group_id where user_id = 123 and group_id = 456. Our url would look like:

http://localhost:5984/database/_design/app/_view/messages?startkey=[123,456]&endkey=[123,456,{}]

Notice the “{}” in the “endkey”. This is so that we get all values returned by the view between “null” and “{}”, which for just about every case should be everything.
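
When building such a URL in code, the keys have to be JSON-encoded and then URL-encoded. A minimal sketch, assuming a hypothetical helper named `viewUrlForKeyPrefix`:

```javascript
// Hypothetical helper: builds a view URL that selects every row whose
// complex key starts with the given prefix.
function viewUrlForKeyPrefix(viewUrl, prefix) {
  const startkey = JSON.stringify(prefix);
  // An empty object sorts after null, booleans, numbers, strings and arrays
  // in CouchDB's collation order, so it serves as the upper bound here.
  const endkey = JSON.stringify(prefix.concat([{}]));
  return (
    viewUrl +
    "?startkey=" + encodeURIComponent(startkey) +
    "&endkey=" + encodeURIComponent(endkey)
  );
}

const url = viewUrlForKeyPrefix(
  "http://localhost:5984/database/_design/app/_view/messages",
  [123, 456]
);
console.log(url);
```

The printed URL is the encoded form of the one shown above: the brackets, commas and braces appear as %5B, %2C, %7B, %7D and %5D.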

Rebuilding of views

Before CouchDB version 1.1.0 there was a small problem: a view was automatically rebuilt on every request. If you have a huge number of documents, this operation takes a long time. To solve this problem, the “stale=ok” parameter was introduced. It returns the last built view results without rebuilding (of course, on the first request it will still build the view results). In this case, you need to refresh the cached view results yourself, by crontab or in some other way. Starting from version 1.1.0 there is a new parameter, “stale=update_after”. It has the same effect as “stale=ok”, but the view is rebuilt automatically after the response is sent.
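
The difference between the three modes can be pictured with a toy model. This is only an illustration of the behaviour described above, not CouchDB's implementation: the class `ToyView` is invented, the "index" is just a document count, and the model assumes the view has already been built once before the stale queries are made.

```javascript
// Toy model of the "stale" view parameter; not CouchDB's implementation.
class ToyView {
  constructor() {
    this.docs = [];        // documents written to the database
    this.indexedCount = 0; // what the view index currently knows about
  }
  addDoc(doc) {
    this.docs.push(doc); // writing a document does not touch the index
  }
  rebuild() {
    this.indexedCount = this.docs.length;
  }
  query(stale) {
    if (stale === "ok") {
      return this.indexedCount; // answer from the existing index
    }
    if (stale === "update_after") {
      const result = this.indexedCount; // answer first...
      this.rebuild();                   // ...then bring the index up to date
      return result;
    }
    this.rebuild(); // default: update the index, then answer
    return this.indexedCount;
  }
}

const view = new ToyView();
view.addDoc({ _id: "a" });
view.query(); // builds the index once
view.addDoc({ _id: "b" });
const staleOk = view.query("ok");
const updateAfter = view.query("update_after");
const afterUpdate = view.query("ok");
console.log(staleOk, updateAfter, afterUpdate); // 1 1 2
```

Both stale modes answer with the old count of 1, but only “update_after” leaves a fresh index behind for the next request.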

Use the native reduce functions written in Erlang

Do not reinvent the wheel. In the documentation you can find code like this as an example:

function (key, values, rereduce) {
  return sum(values);
}

Try to avoid such functions and use the native reduce functions written in Erlang instead: “_count” and “_sum”, which are also faster than their JavaScript equivalents.

Use more databases

In many books for beginners (including CouchDB: The Definitive Guide) the examples look very nice, but they do not match real life. As soon as the number of your documents grows, developing with temporary views becomes almost impossible, because the server has to run every document through the map function. The logic of CouchDB is the following: when you update a document in a database, it affects all views in that database. Therefore, all views change their ETag when just a single document is updated. This is a disadvantage of keeping many documents of different kinds in one database. At the same time, updating a document does not affect the ETag of other documents, because the ETag of a document is its latest revision. Splitting documents across several databases (by type or another logical structure) helps to solve these problems.

Cache data using the ETag

Fetching data from CouchDB with the “ETag” and “If-None-Match” headers is a really fast way to retrieve data. Do not forget that when you send the “If-None-Match” header and get response status 304, the response body is always empty, because the server assumes that you already have the data stored.
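
A client therefore has to keep its own copy of the body next to the ETag. Below is a minimal sketch of that bookkeeping, with the HTTP call itself left out: the function names, the `{ status, etag, body }` response shape and the sample URL are assumptions made for the example.

```javascript
// Hypothetical client-side cache keyed by URL.
const cache = new Map();

// Headers for a request: send the cached ETag, if there is one.
function conditionalHeaders(url) {
  const entry = cache.get(url);
  return entry ? { "If-None-Match": entry.etag } : {};
}

// Handle a response object of the assumed shape { status, etag, body }.
function handleResponse(url, response) {
  if (response.status === 304) {
    // Empty body: the server expects us to reuse our stored copy.
    return cache.get(url).body;
  }
  cache.set(url, { etag: response.etag, body: response.body });
  return response.body;
}

const docUrl = "http://localhost:5984/mydb/note-1";
const first = handleResponse(docUrl, { status: 200, etag: '"1-abc"', body: '{"text":"Hello"}' });
const headers = conditionalHeaders(docUrl);
const second = handleResponse(docUrl, { status: 304, etag: '"1-abc"', body: "" });
console.log(headers, second === first);
```

For a document, the ETag is simply its current revision in double quotes, so the cached entry stays valid until the document is updated.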

Each update of a document creates a new revision of it. It also causes the views that use this document to be rebuilt on their next call (adding and removing documents rebuilds views too). All old revisions are kept, and you do not always need every revision of a document. The database keeps growing, so don’t forget to run compaction. This frees a lot of disk space.
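
Compaction is triggered over HTTP. The sketch below only describes the requests involved and sends nothing. The helper name `compactionRequests` and the database and design document names are placeholders.

```javascript
// Hypothetical helper: describes the compaction requests for a database.
function compactionRequests(dbUrl, designDocNames) {
  const headers = { "Content-Type": "application/json" };
  // Database compaction drops old document revisions.
  const requests = [{ method: "POST", url: dbUrl + "/_compact", headers: headers }];
  // View indexes are compacted separately, one request per design document
  // (the name is given without the "_design/" prefix).
  for (const name of designDocNames) {
    requests.push({ method: "POST", url: dbUrl + "/_compact/" + name, headers: headers });
  }
  return requests;
}

const compaction = compactionRequests("http://localhost:5984/mydb", ["app"]);
console.log(compaction.map((r) => r.method + " " + r.url));
```

These requests need admin rights on the server and could be scheduled, for example, from the same crontab used for other maintenance tasks.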

Do not use CouchDB for frequently inserted or updated data

Every NoSQL database has its best use case. Reading data is the ideal use case for CouchDB. For frequently inserted or updated data CouchDB isn’t a good solution. Why? The first reason is described in “Rebuilding of views”. The second reason is that CouchDB creates a new revision on each document update, so the database size on the server grows very quickly. CouchDB is an ideal solution for building CRM and CMS systems.

Full-text search

CouchDB is suitable for many tasks, but not for all, and full-text search is one of the exceptions. Since we cannot pass a parameter directly into a view, we cannot run a LIKE-style query against the database. So you will not be able to build site search using CouchDB alone.
To solve this problem, use a separate search engine. There are ready-made solutions such as Lucene (connected via couchdb-lucene) and Elasticsearch (with a plugin for CouchDB).

Geo search

Another problem in CouchDB is geo search, for example finding all objects within N miles. In an SQL database, this task is implemented with a small function that computes the distance between two points from their latitude and longitude. In CouchDB, keys are sorted along a single dimension only, so finding all the points that fall into a square is almost impossible. “Almost” means that the existing solution isn’t perfect. Geo search can be implemented with Geohash. The idea is that any position on a map can be represented as an alphanumeric hash, and the more precisely the coordinates are specified, the longer the hash. Thus, you can use the geohash as the view key and vary the length of the prefix in the startkey/endkey parameters to refine the search radius (of course, it is not really a radius).
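
Here is a compact sketch of the idea: a standard geohash encoder and a helper that turns a hash prefix into a startkey/endkey pair. The function names are made up, and using “\ufff0” as the upper bound is a common CouchDB idiom for prefix matching on string keys rather than something specific to geohashes.

```javascript
const BASE32 = "0123456789bcdefghjkmnpqrstuvwxyz";

// Encodes a position as a geohash of the requested length.
function geohash(latitude, longitude, length) {
  const lat = [-90, 90];
  const lon = [-180, 180];
  let hash = "";
  let bits = 0;      // bits collected for the current character
  let value = 0;     // value of the current character (0..31)
  let useLon = true; // bits are interleaved, starting with longitude

  while (hash.length < length) {
    const range = useLon ? lon : lat;
    const coord = useLon ? longitude : latitude;
    const mid = (range[0] + range[1]) / 2;
    if (coord >= mid) {
      value = value * 2 + 1; // upper half: emit a 1 bit
      range[0] = mid;
    } else {
      value = value * 2;     // lower half: emit a 0 bit
      range[1] = mid;
    }
    useLon = !useLon;
    bits += 1;
    if (bits === 5) {
      hash += BASE32[value];
      bits = 0;
      value = 0;
    }
  }
  return hash;
}

// All keys that start with the prefix fall between these two bounds.
function prefixRange(prefix) {
  return { startkey: prefix, endkey: prefix + "\ufff0" };
}

const hash = geohash(42.6, -5.6, 5);
console.log(hash, prefixRange(hash.slice(0, 4)));
```

A view would emit the full geohash of each object as its key; querying with a shorter prefix selects a larger cell, and a longer prefix selects a smaller one. Points that are close together but sit on opposite sides of a cell border get different prefixes, which is one reason the solution isn't perfect.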

Deploy CouchApp application to the production environment

We have successfully built a simple CouchApp application and discussed several tips for using CouchDB in production. We can now deploy our CouchApp application to the production environment.

CouchApp has a built-in “push” command, which builds the design document together with your application and pushes it to CouchDB. In the Python version of CouchApp the command looks like this:

$ couchapp push http://someserver:port/mydb

where “http://someserver:port/mydb” is the URL of your database in CouchDB. For the Node.js version the command looks almost the same:

$ couchapp push app.js http://someserver:port/mydb

Then you can check your application at a URL like this:

http://someserver:port/mydb/_design/myapp/index.html

where “myapp” is the name of your app's design document and “index.html” is your HTML file.

Now let's add a domain for this application so it can be served without this long URI.

To add a virtual host, add a CNAME record to the DNS for your domain name. Then add an entry to the [vhosts] section of your configuration file (default Linux location: “/etc/couchdb/local.ini”):

[vhosts]
simple_app.couchdb = /mydb/_design/myapp/_rewrite

“simple_app.couchdb” is the virtual host of the application.

You can also add the same vhosts entry from the configuration page in Futon by adding a new section:

section = vhosts
option = simple_app.couchdb 
value = /mydb/_design/myapp/_rewrite

Finally, configure Nginx (or another proxy server) to proxy requests to the running CouchDB:

server {
  listen 80;
  server_name simple_app.couchdb;

  location / {
    proxy_pass http://127.0.0.1:5984;
    proxy_redirect off;
    proxy_set_header Host $host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
  }

}

Now your application should be available at this URL:

http://simple_app.couchdb

That’s all folks!
