At its most basic, CherryPy is designed to allow the production of simple websites without having to think about any of the details of HTTP. Notice we're saying HTTP (the transport), not HTML (the markup language)! In particular, developers should not have to concern themselves with:
Responding to unpublished requests
Logging and notifying users appropriately when unhandled exceptions occur
The difference between query strings and POSTed params
The decoding and unpacking of request headers and bodies, including file uploads
Response status or headers
For the most part, simple "page handlers" (functions attached to cherrypy.root) should never have to refer to cherrypy at all! They receive params via function arguments, and return content directly. Advanced functionality is most often enabled via the built-in filters, which encapsulate the particulars of HTTP, and can be completely controlled via the config file.
Simple apps are produced simply, but when a developer needs to step out of the mundane and provide real value, they should be able to leverage the complete power and flexibility of the HTTP specification. In general, the HTTP request and response messages are completely represented in the cherrypy.request and .response objects. At the lowest level, a developer should be able to generate any valid HTTP response message by modifying cherrypy.response.status, .headers, and/or .body.
The design of HTTP itself is guided by REST, a set of principles which constrain its expressivity and therefore its implementation. HTTP is a transfer protocol which enables the exchange of representations of resources. In a RESTful design, clients never expect to access a resource directly; instead, they request a representation of that resource. For example, if a resource has both an XML and an HTML representation, then an HTTP/1.1 server might be expected to inspect the Accept request header in order to decide which representation to serve in response.
It's important to clarify some terminology, here. In REST terms, a "resource" is "any concept that might be the target of an author’s hypertext reference...a conceptual mapping to a set of entities, not the entity that corresponds to the mapping at any particular point in time". A resource is not the request, nor the response, in an HTTP conversation. "The resource is not the storage object. The resource is not a mechanism that the server uses to handle the storage object. The resource is a conceptual mapping — the server receives the identifier (which identifies the mapping) and applies it to its current mapping implementation (usually a combination of collection-specific deep tree traversal and/or hash tables) to find the currently responsible handler implementation and the handler implementation then selects the appropriate action+response based on the request content."
CherryPy, therefore, does not provide REST resources, nor model them, nor serve them. Instead, it provides mappings between identifiers (URIs) and handlers (functions). It allows application developers to model resources, perhaps, but it only serves representations of resources.
By default, these identifier-to-handler mappings (which we will call "handler dispatch" from now on) follow a simple pattern: since the path portion of a URI is hierarchical, CherryPy arranges handlers in a similar hierarchy, starting at cherrypy.root, and branching on each attribute; every leaf node in this tree must be "exposed" (but the branches need not be, see section 2.2). Note in particular that, although the query portion of a Request-URI is part of the resource identifier, CherryPy does not use it to map identifiers to handlers. Application developers may use the query string to further identify the requested resource, of course, but CherryPy, not having any domain-specific knowledge about the format or semantics of a query string, doesn't try to guess.
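The default dispatch described above can be pictured with a small, self-contained sketch. It is not CherryPy's actual dispatcher: the find_handler function and the Root and Blog classes are hypothetical, and the fallback to an index method for a branch addressed directly is an assumption made for illustration. The path passed in is assumed to have had any query string removed already.

```python
def find_handler(root, path):
    """Walk the object tree along the URI path; return the exposed leaf or None."""
    node = root
    for name in [part for part in path.split("/") if part]:
        node = getattr(node, name, None)
        if node is None:
            return None
    # Assumption: a branch addressed directly falls back to its index method.
    if not callable(node):
        node = getattr(node, "index", None)
    # Only leaves explicitly marked as exposed may be served.
    if node is not None and getattr(node, "exposed", False):
        return node
    return None


class Blog:
    def index(self):
        return "blog home"
    index.exposed = True

    def secret(self):
        # Not marked as exposed, so it is never served.
        return "not published"


class Root:
    blog = Blog()  # a branch; it does not need to be exposed itself

    def index(self):
        return "site home"
    index.exposed = True


root = Root()
print(find_handler(root, "/blog/")())      # blog home
print(find_handler(root, "/blog/secret"))  # None
```

The unexposed secret method is reachable as an attribute but is still refused, which is the point of requiring leaves to be "exposed".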
Filters, then, are CherryPy's way to wrap or circumvent the default handler dispatch. EncodingFilter, for example, wraps the response from a handler, encoding the response body as it is produced. StaticFilter, on the other hand, intercepts some requests (based on the path portion of the Request-URI) and implements its own identifier-to-handler mapping. Developers who wish to provide their own handler dispatch mechanisms are encouraged to do so via a filter.
The cherrypy.request object contains request-related objects. Pretty lame description, but that's all it does; it's a big data dump. At the beginning of each HTTP request, the existing request object is destroyed, and a new one is created (one request object for each thread). Therefore, CherryPy (and you yourself) can stick data into cherrypy.request and not worry about it conflicting with other requests.
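The per-thread isolation described above can be illustrated with the standard library's threading.local. This is only a sketch of the idea in modern Python 3, not CherryPy's own code; the request, seen, and handle names are hypothetical.

```python
import threading

request = threading.local()     # stand-in for a per-thread request object
barrier = threading.Barrier(2)  # forces both threads to write before either reads
seen = {}


def handle(user):
    request.user = user        # stash per-request data
    barrier.wait()             # by now the other thread has written its value too
    seen[user] = request.user  # each thread still reads back only its own value


threads = [threading.Thread(target=handle, args=(name,)) for name in ("alice", "bob")]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

print(seen)
```

Even though both threads assign to the same attribute name, neither overwrites the other's value.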
This attribute is a string containing the IP address of the client. It will be an empty string if it is not available.
This attribute is an int containing the TCP port number of the client. It will be -1 if it is not available.
This attribute is a string containing the remote hostname of the client.
This attribute is a dictionary containing the received HTTP headers, with automatically titled keys (e.g., "Content-Type"). As it's a dictionary, no duplicates are allowed.
This attribute is a list of (header, value) tuples containing the received HTTP headers. In general, you probably want to use headers instead; this is only here in case you need to inspect duplicates in the request headers.
This attribute is a string containing the first line of the raw HTTP request; for example, "GET /path/page HTTP/1.1".
This attribute is a SimpleCookie instance from the standard library's Cookie module which contains the incoming cookie values from the client.
This attribute is the input stream from the client, if applicable. See cherrypy.request.processRequestBody for more information.
This attribute is the request entity body, if applicable. See cherrypy.request.processRequestBody for more information.
This attribute specifies whether or not the request's body (request.rfile, which is POST or PUT data) will be handled by CherryPy. If True (the default for POST and PUT requests), then request.rfile will be consumed by CherryPy (and unreadable after that). If the request Content-Type is "application/x-www-form-urlencoded", then the rfile will be parsed and placed into request.params; otherwise, it will be available in request.body. If cherrypy.request.processRequestBody is False, then the rfile is not consumed, but will be readable by the exposed method.
This attribute is a string containing the HTTP request method, such as GET or POST.
This attribute is a string containing the HTTP protocol of the request in the form of HTTP/x.x.
This attribute is a Version object which represents the HTTP protocol. It's the same as request.protocol, but allows easy comparisons like if cherrypy.request.version >= "1.1": do_http_1_1_thing.
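To see why a dedicated version object is useful, here is a minimal sketch of a class that compares versions numerically and accepts plain strings on the other side of the comparison. It is a hypothetical stand-in written for illustration, not CherryPy's Version class.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical version object comparing dotted versions numerically."""

    def __init__(self, text):
        self.parts = tuple(int(piece) for piece in str(text).split("."))

    @staticmethod
    def _parts(other):
        # Allow comparison against plain strings such as "1.1".
        return other.parts if isinstance(other, Version) else Version(other).parts

    def __eq__(self, other):
        return self.parts == self._parts(other)

    def __lt__(self, other):
        return self.parts < self._parts(other)


print(Version("1.1") >= "1.1")  # True
print(Version("1.0") >= "1.1")  # False
print(Version("1.10") > "1.9")  # True, whereas the strings compare the other way
```

Comparing the raw strings would rank "1.10" below "1.9", which is the kind of mistake a version object avoids.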
This attribute is a dictionary containing the WSGI environment for the request. In non-WSGI settings (i.e., custom HTTP servers), it is absent.
This attribute is a string containing the query string of the request (the part of the URL following '?').
This attribute is a string containing the path of the resource the client requested.
This attribute is a dictionary containing the query string and POST arguments of this request.
This attribute is a string containing the root URL of the server. By default, it is equal to request.scheme://request.headers['Host'].
This attribute is a string containing the URL the client requested. By default, it is equal to request.base + request.path, plus the querystring, if provided.
This attribute is a string containing the path of the exposed method that will be called to handle this request. This is usually the same as cherrypy.request.path, but can be changed in a filter to change which method is actually called.
This attribute is a string containing the original value of cherrypy.request.path, in case it is modified by a filter during the request.
This attribute is a dictionary containing the original value of cherrypy.request.params, in case it is modified by a filter during the request.
The cherrypy.response object contains response-related objects. Pretty lame description, but that's all it does; it's a big data dump. At the beginning of each HTTP request, the existing response object is destroyed, and a new one is created (one response object for each thread). Therefore, CherryPy (and you yourself) can stick data into cherrypy.response and not worry about it conflicting with other requests.
This attribute is a dictionary with automatically titled keys (e.g., "Content-Length"). It holds all outgoing HTTP headers to the client.
This attribute is a list of (header, value) tuples. It's not available until the response has been finalized; it's really only there for the extremely rare cases when you need duplicate response headers. In general, you should use response.headers instead.
This attribute is a SimpleCookie instance from the standard library's Cookie module. It contains the outgoing cookie values.
This attribute is originally just the return value of the exposed method, but by the end of the request it must be an iterable (usually a list or generator of strings) which will be the content of the HTTP response.
This attribute is a string containing the HTTP response code in the form "### Reason-Phrase", e.g. "200 OK". You may also set it to an int, in which case the response finalization process will supply a Reason-Phrase for you.
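The int-to-status-line step can be sketched with the reason phrases that ship in Python 3's standard library. The finalize_status helper is hypothetical, and the "Unknown" fallback is an assumption; how CherryPy itself treats unrecognized codes is not covered here.

```python
from http.client import responses  # maps status codes to standard reason phrases


def finalize_status(status):
    """Return a '### Reason-Phrase' string for an int or pass a string through."""
    if isinstance(status, int):
        # Assumption: codes without a standard phrase get a generic one.
        return "%d %s" % (status, responses.get(status, "Unknown"))
    return status


print(finalize_status(404))       # 404 Not Found
print(finalize_status("200 OK"))  # 200 OK
```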
This attribute is a Version object, representing the HTTP protocol version of the response. This is not necessarily the value that will be written in the response! Instead, it should be used to determine which features are available for the response. For example, an HTTP server may send an HTTP/1.1 response even though the client is known to only understand HTTP/1.0—the response.version will be set to Version("1.0") to inform you of this, so that you (and CherryPy) can restrict the response to HTTP/1.0 features only.
Start the CherryPy Server. Simple websites may call this without any arguments, to run the default server. If init_only is False (the default), this function will block until KeyboardInterrupt or SystemExit is raised, so that the process will persist. When using one of the built-in HTTP servers, you should leave this set to False. You should only set it to True if you're running CherryPy as an extension to another HTTP server (for example, when using Apache and mod_python with CherryPy), in which case the foreign HTTP server should do its own process-management.
Use the server_class argument to specify that you wish to use an HTTP server other than the default, built-in WSGIServer. If missing, config.get("server.class") will be checked for an alternate value; otherwise, the default is used. Possible alternate values (you may pass the class names as a string if you wish):
None: this will not load any HTTP server. Note that this is not the default; the default (if server_class is not given) is to load the WSGIServer.
Any other class (or dotted-name string): load a custom HTTP server.
You must call this function from Python's main thread, and set init_only to False, if you want CherryPy to shut down when KeyboardInterrupt or SystemExit are raised (including Ctrl-C). The only time you might want to do otherwise is if you run CherryPy as a Windows service, or as an extension to, say, mod_python, and even then, you might want to anyway.
If the "init_only" argument to server.start is True, this will be False, and vice-versa.
Whatever HTTP server class is set in server.start will be stuck in here.
Whatever HTTP server class is set in server.start will be instantiated and stuck in here.
One of three values, indicating the state of the server:
STOPPED = 0: The server hasn't been started, and will not accept requests.
STARTING = None: The server is in the process of starting, or an error occurred while trying to start the server.
STARTED = 1: The server has started (including an HTTP server if requested), and is ready to receive requests.
True if the server is ready to receive requests, false otherwise. Read-only.
Since server.start usually blocks, other threads need to be started before calling server.start; however, they often must wait for server.start to complete its setup of the HTTP server. Use this function from other threads to make them wait for the HTTP server to be ready to receive requests.
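The start-first, wait-until-ready pattern described above looks roughly like this when written with nothing but the standard library. It is a sketch of the coordination only: fake_server_start stands in for the blocking start call, and the Event plays the role of the wait function; none of these names come from CherryPy.

```python
import threading

ready = threading.Event()
results = []


def background_task():
    ready.wait()  # block until the "server" reports that it is ready
    results.append("task ran after startup")


def fake_server_start():
    # Hypothetical stand-in for the real (blocking) server start call.
    results.append("server set up")
    ready.set()


worker = threading.Thread(target=background_task)
worker.start()       # start the helper thread first...
fake_server_start()  # ...because the real start call would block here
worker.join()
print(results)
```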
Since server.start usually blocks, use this to easily run another function in a new thread. It starts the new thread and then runs server.start. The new thread automatically waits for the server to finish its startup procedure.
Stop the CherryPy Server. Well, "suspend" might be a better term—this doesn't terminate the process.
Usually None; set this to KeyboardInterrupt() or SystemExit() to shut down the entire process. That is, the new exception will be raised in the main thread.
A list of functions that will be called when the server starts.
A list of functions that will be called when the server stops.
A list of functions that will be called when each request thread is started. Note that such threads do not need to be started or controlled by CherryPy; for example, when using CherryPy with mod_python, Apache will start and stop the request threads. Nevertheless, CherryPy will run the on_start_thread_list functions upon the first request using each distinct thread.
A list of functions that will be called when each request thread is stopped.
This function returns the configuration value for the given key. The function checks if the setting is defined for the current request path; it walks up the request path until the key is found, or it returns the default value. If returnSection is True, the function returns the configuration path where the key is defined instead.
The getAll function returns a list containing a (path, value) tuple for all occurrences of the key within the request path. This function allows applications to inherit configuration data defined for parent paths.
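The two lookups described above (walking up the request path until a key is found, and collecting every occurrence along the way) can be sketched as follows. The helper names, the shape of the config map, and the parents-first ordering of the collected list are assumptions made for illustration; this is not CherryPy's config module.

```python
def parent_paths(path):
    """Yield path, then each parent path, ending at "/"."""
    path = path.rstrip("/") or "/"
    while True:
        yield path
        if path == "/":
            return
        path = path[:path.rfind("/")] or "/"


def config_get(config_map, key, path, default=None):
    """Return the value from the nearest section that defines key."""
    for section in parent_paths(path):
        if key in config_map.get(section, {}):
            return config_map[section][key]
    return default


def config_get_all(config_map, key, path):
    """Return (path, value) tuples for every section on the way that defines key."""
    found = [(section, config_map[section][key])
             for section in parent_paths(path)
             if key in config_map.get(section, {})]
    found.reverse()  # assumption: report parents before children
    return found


config = {
    "/": {"theme": "plain", "lang": "en"},
    "/skins/deepblue": {"theme": "deepblue"},
}
print(config_get(config, "theme", "/skins/deepblue/main"))      # deepblue
print(config_get(config, "lang", "/skins/deepblue/main"))       # en
print(config_get_all(config, "theme", "/skins/deepblue/main"))
```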
Function to update the configuration map. The "updateMap" argument is a dictionary of the form {'sectionPath' : { } }. The "file" argument is the path to the configuration file.
The Tree class is used to keep track of where applications are mounted. To "mount" an application means to have its root respond to a URL other than "/". By using cherrypy.tree, you can easily mount applications and remember where you mounted them!
Function to mount a tree of objects at the given baseurl, using the given configuration dict or filename. If baseurl is None or missing, it is assumed to be "/" unless the config specifies [global] mount_point = "/path/to/approot". If conf is not None, then each of its sections (which should be a relative URL, like "/skins/deepblue/main") will be prefixed with the baseurl, so that config lookups are also "mounted" at the base URL.
Note that, by using tree.mount, your approot may not be found at cherrypy.root; there may be several "dummy" objects placed in-between cherrypy.root and your application's root instance.
A method which finds the appropriate baseurl for a given path. If path is None or missing, cherrypy.request.object_path is used. If multiple applications "contain" the given path, the longer baseurl is returned. That is, if App1 is mounted at "/" and App2 is mounted at "/path/to/app", then cherrypy.tree.mount_point("/path/to/app/main") will return "/path/to/app".
Once you have obtained the baseurl using mount_point, you can obtain a reference to the application root object by looking up cherrypy.tree.mount_points[baseurl].
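The "longer baseurl wins" rule can be sketched as a longest-prefix match over a plain dict. This is an illustration only: the standalone mount_point function and the string placeholders for the root objects are hypothetical, not the Tree class itself.

```python
def mount_point(mount_points, path):
    """Return the longest mounted baseurl that contains path, or None."""
    matches = [base for base in mount_points
               if base == "/" or path == base or path.startswith(base + "/")]
    return max(matches, key=len) if matches else None


# Hypothetical mounts: the values stand in for application root objects.
mount_points = {"/": "App1 root", "/path/to/app": "App2 root"}

base = mount_point(mount_points, "/path/to/app/main")
print(base)                # /path/to/app
print(mount_points[base])  # App2 root
print(mount_point(mount_points, "/path/to/application"))  # /
```

Matching on base + "/" rather than on the bare prefix keeps "/path/to/application" from being mistaken for something inside "/path/to/app".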
This exception can be used to automatically send a response using an HTTP status code, with an appropriate error page.
This exception will redirect processing to another path within the site (without informing the client). Provide the new path as an argument when raising the exception. You may also provide a second "params" argument which will replace the current request params (usually a dict, but you may also supply a GET-param-style string). This exception is only handled from within page handlers and before_main filter methods.
Utility class that exposes a getitem-aware object. It does not provide index() or default() methods, and it does not expose the individual item objects - just the list or dict that contains them. User-specific index() and default() methods can be implemented by inheriting from this class.
Utility class that restores positional parameters functionality that was found in 2.0.0-beta.
Returns a list of AcceptValue objects from the specified Accept-* header (or None if the header is not present). The list is sorted so that the most-preferred values are first in the list.
Each AcceptValue object has a value attribute, a string which is the value itself. For example, if headername is "Accept-Encoding", the value attribute might be "gzip". It also has a (read-only) qvalue attribute, a float between 0 and 1 which specifies the client's preference for the value; higher numbers are preferred. Finally, each AcceptValue also has a params attribute, a dict; for most headers, this dict will only possess the original "q" value as a string.
If headername is "Accept" (the default), then the params attribute may contain extra parameters which further differentiate the value. In addition, params["q"] may itself be an AcceptValue object, with its own params dict. Don't ask us why; ask the authors of the HTTP spec.
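The preference ordering can be sketched with a small parser that returns plain (value, qvalue) tuples, most preferred first. It is a simplified, hypothetical stand-in: it does not build AcceptValue objects, it ignores parameters other than q, and it assumes a missing q means 1.0 as in the HTTP specification.

```python
def parse_accept(header_value):
    """Return (value, qvalue) tuples from an Accept-* header, best first."""
    values = []
    for element in header_value.split(","):
        parts = [piece.strip() for piece in element.split(";")]
        value, qvalue = parts[0], 1.0  # a missing q parameter means 1.0
        for param in parts[1:]:
            name, _, raw = param.partition("=")
            if name.strip() == "q":
                qvalue = float(raw)
        if value:
            values.append((value, qvalue))
    # sorted() is stable, so equally preferred values keep their header order.
    return sorted(values, key=lambda pair: pair[1], reverse=True)


print(parse_accept("gzip;q=0.5, identity;q=0.1, deflate"))
```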
Returns a list of (start, stop) indices from a Range request header. Returns None if no such header is provided in the request. Each (start, stop) tuple will be composed of two ints, which are suitable for use in a slicing operation. That is, the header "Range: bytes=3-6", if applied against a Python string, is requesting resource[3:7]. This function will return the list [(3, 7)].
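The conversion from inclusive byte positions to slice-friendly (start, stop) pairs can be sketched like this. The parse_ranges function and its explicit content_length argument are hypothetical; the function described above reads the header from the current request, and this sketch leaves out validation of malformed or unsatisfiable ranges.

```python
def parse_ranges(header_value, content_length):
    """Turn 'bytes=3-6' style values into [(start, stop)] slice indices."""
    unit, _, spec = header_value.partition("=")
    if unit.strip() != "bytes":
        return None
    result = []
    for byte_range in spec.split(","):
        start, _, stop = byte_range.strip().partition("-")
        if start:
            first = int(start)
            # An open-ended range ("90-") runs to the last byte.
            last = int(stop) if stop else content_length - 1
            # HTTP positions are inclusive; slice stops are exclusive, hence + 1.
            result.append((first, min(last, content_length - 1) + 1))
        else:
            # Suffix range: "-N" asks for the final N bytes.
            length = int(stop)
            result.append((max(content_length - length, 0), content_length))
    return result


resource = "abcdefghij"
(start, stop), = parse_ranges("bytes=3-6", len(resource))
print((start, stop))         # (3, 7)
print(resource[start:stop])  # defg
```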
A subclass of Python's builtin dict class; CherryPy's default request.headers and response.headers objects are instances of this class. The keys are automatically titled (str(key).title()) in order to provide case-insensitive comparisons and avoid duplicates.
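A minimal sketch of the title-casing idea follows. The TitledDict class is hypothetical and covers only a few dict methods, so it should be read as an illustration of the technique rather than as the class described above.

```python
class TitledDict(dict):
    """Hypothetical dict whose keys are normalized with str(key).title()."""

    def __setitem__(self, key, value):
        dict.__setitem__(self, str(key).title(), value)

    def __getitem__(self, key):
        return dict.__getitem__(self, str(key).title())

    def __contains__(self, key):
        return dict.__contains__(self, str(key).title())

    def get(self, key, default=None):
        return dict.get(self, str(key).title(), default)


headers = TitledDict()
headers["content-type"] = "text/html"
headers["CONTENT-TYPE"] = "text/plain"  # same key after titling, so it overwrites

print(list(headers))            # ['Content-Type']
print(headers["Content-type"])  # text/plain
```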
Returns (method, path, querystring, protocol) from an HTTP requestLine. The default Request processor calls this function.
Returns a dict of {'key': 'value'} pairs from an HTTP "key=value" query string. Also handles server-side image map query strings. The default Request processor calls this function.
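A rough equivalent of the key=value case can be written with Python 3's urllib.parse. The parse_query_string helper is hypothetical; collecting repeated keys into a list is an assumption, and the image-map form is left out.

```python
from urllib.parse import parse_qsl


def parse_query_string(query_string):
    """Return a dict of parameters from a 'key=value&key=value' string."""
    params = {}
    for key, value in parse_qsl(query_string, keep_blank_values=True):
        if key in params:
            # Assumption: repeated keys are gathered into a list.
            if not isinstance(params[key], list):
                params[key] = [params[key]]
            params[key].append(value)
        else:
            params[key] = value
    return params


print(parse_query_string("name=Ann&tag=a&tag=b&empty="))
print(parse_query_string("q=hello+world"))
```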
Returns a dict of {'key': 'value'} pairs from a cgi.FieldStorage object. The default Request processor calls this function.
Set status, headers, and body in order to serve the file at the given path. The Content-Type header will be set to the contentType arg, if provided. If not provided, the Content-Type will be guessed by the extension of the file. If disposition is not None, the Content-Disposition header will be set to "<disposition>; filename=<name>". If name is None, it will be set to the basename of path. If disposition is None, no Content-Disposition header will be written.
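The header rules above can be sketched with the standard mimetypes module. The file_headers helper, its argument names, and the "application/octet-stream" fallback for unknown extensions are assumptions for illustration; the sketch builds headers only and does not read the file or set a status.

```python
import mimetypes
import os.path


def file_headers(path, content_type=None, disposition=None, name=None):
    """Build the Content-Type and Content-Disposition headers for a file."""
    headers = {}
    if content_type is None:
        # Guess from the file extension when no type is given.
        content_type, _ = mimetypes.guess_type(path)
    # Assumption: fall back to a generic type if the extension is unknown.
    headers["Content-Type"] = content_type or "application/octet-stream"
    if disposition is not None:
        if name is None:
            name = os.path.basename(path)
        headers["Content-Disposition"] = "%s; filename=%s" % (disposition, name)
    return headers


print(file_headers("/srv/files/report.pdf", disposition="attachment"))
print(file_headers("/srv/files/page.html"))
```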
This module provides code-coverage tools, and may also be run as a script. To use this module, or the coverage tools in the test suite, you need to download 'coverage.py', either Gareth Rees' original implementation or Ned Batchelder's enhanced version.
Set cherrypy.codecoverage to True to turn on coverage tracing. Then, use the covercp.serve() function to browse the results in a web browser. If you run this module as a script (i.e., from the command line), it will call serve() for you.
You can profile any of your page handlers (exposed methods) as follows:
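Example 3.14. Profiling example
import cherrypy
from cherrypy.lib import profile

class Root:
    # One Profiler for this class; profile data is written to this directory.
    p = profile.Profiler("/path/to/profile/dir")

    def index(self):
        # Return run()'s result; without this the handler would return None.
        return self.p.run(self._index)
    index.exposed = True

    def _index(self):
        return "Hello, world!"

cherrypy.root = Root()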
Set the config entry: "profiling.on = True" if you'd rather turn on profiling for all requests. Then, use the serve() function to browse the results in a web browser. If you run this module as a script (i.e., from the command line), it will call serve() for you.
Developers: this module should be used whenever you make significant changes to CherryPy, to get a quick sanity-check on the performance of the request process. Basic requests should complete in about 5 milliseconds on a reasonably-fast machine running Python 2.4 (Python 2.3 will be much slower due to threadlocal being implemented in Python, not C). You can profile the test suite by supplying the --profile option to test.py.
This module provides a brute-force method of reloading application files on the fly. When the config entry "autoreload.on" is True (or when "server.environment" is "development"), CherryPy uses the autoreload module to restart the current process whenever one of the files in use is changed. The mechanism by which it does so is pretty complicated:
_cp_on_error is a function for handling unanticipated exceptions, whether raised by CherryPy itself, or in user applications. The default simply responds as if HTTPError(500) had been raised.
_cp_on_http_error handles HTTPError responses, setting cherrypy.response.status, headers, and body.
User defined filters are enabled using the class attribute _cp_filters. Any filter instances placed in _cp_filters will be applied to all methods of the class.
CherryPy provides a set of hooks which are called at specific places during the request process. A filter should inherit from the BaseFilter class and implement the hooks it requires to add extra code during the process. CherryPy will go through all the filters which are on (built-in and user-defined) for the requested path and call all hooks that are implemented by each filter.
This hook is called right at the beginning of the request process. The only work CherryPy has done when this hook is called is to parse the first line of the HTTP request. This is needed so that filters have access to the object path translated from the path specified in the HTTP request.
This hook is always called.
This hook is called right after CherryPy has parsed the HTTP request headers but before it tries to parse the request body. If a filter which implements this hook sets cherrypy.request.processRequestBody to False, CherryPy will not parse the request body at all. This can be handy when you know your user agent sends the data in a form that the default CherryPy request body parsing function cannot understand.
For example, suppose your user agent sends a request body which is an unquoted XML string. You may want a filter to parse that XML string and generate an XML DOM instance. The filter could then add that instance to cherrypy.request.params, which in turn would be passed to your page handler as if it had been sent that way in the HTTP request. The filter has thus turned the XML string into an XML DOM instance transparently, which makes your life easier. In that case you do not want CherryPy to parse the request body. The hook could also be used to scan the request body before it is processed any further and to reject it if needed.
This hook is not called if an error occurs earlier in the request process.
This hook is called right before CherryPy calls your page handler (exposed callable). It is handy when, based on the HTTP request headers or body, you decide not to call the page handler at all; in that case, set cherrypy.request.executeMain to False.
This hook is not called if an error occurs earlier in the request process.
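The idea of a filter switching off the page handler can be simulated without CherryPy at all. In this sketch every name is hypothetical: the Request class stands in for cherrypy.request, its execute_main flag plays the role of executeMain, and the hook receives the request and response as arguments purely to keep the example self-contained, which is not how real filter hooks are called.

```python
class Request:
    """Stand-in for cherrypy.request; only the pieces this sketch needs."""

    def __init__(self, headers):
        self.headers = headers
        self.execute_main = True  # plays the role of cherrypy.request.executeMain


class RequireTokenFilter:
    """Hypothetical filter: refuse requests that lack the expected header."""

    def before_main(self, request, response):
        if request.headers.get("X-Token") != "secret":
            request.execute_main = False
            response["status"] = "403 Forbidden"
            response["body"] = ["token required"]


def handle(request, filters, page_handler):
    response = {"status": "200 OK", "body": []}
    for flt in filters:
        hook = getattr(flt, "before_main", None)
        if hook is not None:
            hook(request, response)
    # The page handler only runs if no filter switched it off.
    if request.execute_main:
        response["body"] = [page_handler()]
    return response


def page():
    return "Hello, world!"


print(handle(Request({"X-Token": "secret"}), [RequireTokenFilter()], page))
print(handle(Request({}), [RequireTokenFilter()], page))
```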
This hook is called right after the page handler has been processed (depending on the before_main hook behavior) and before CherryPy formats the final response object. It lets you, for example, check what your page handler returned and change some headers if needed.
This hook is not called if an error occurs earlier in the request process.
This hook is called at the end of the process so that you can finely tweak your HTTP response if needed (e.g., adding headers to cherrypy.response.header_list). Note that cherrypy.response.headers will not be processed any longer at that stage.
This hook is always called.
This hook is called when an error has occurred during the request processing. It allows you to call code before the _cp_on_error handler and the response finalizing stage run.
Filters provide a powerful mechanism for extending CherryPy. The aim is to provide code called at the HTTP request level itself. More specifically it means that you can write code that will be called:
The baseurlfilter changes the base url of a request. It is useful for running CherryPy behind Apache with mod_rewrite.
The baseurlfilter has the following configuration options:
base_url_filter.base_url
base_url_filter.use_x_forwarded_host
The cachefilter stores responses in memory. If an identical request is subsequently made, then the cached response is output without calling the page handler.
The decoding filter can be configured to automatically decode incoming requests.
The decodingfilter has the following configuration options:
decoding_filter.encoding
The encodingfilter can be configured to automatically encode outgoing responses.
The encodingfilter has the following configuration options:
encoding_filter.encoding: Force all text responses to be encoded with this encoding.
encoding_filter.default_encoding: Default all text responses to this encoding (if the user-agent does not request otherwise).
The gzipfilter will automatically gzip outgoing responses, if it is supported by the client.
The gzipfilter does not have any configuration options.
The logdebuginfofilter adds debug information to each page. The filter is automatically turned on when "server.environment" is set to "development".
The logdebuginfofilter has the following configuration options:
log_debug_info_filter.mime_types, ['text/html']
log_debug_info_filter.log_as_comment, False
log_debug_info_filter.log_build_time, True
log_debug_info_filter.log_page_size, True
The sessionauthenticatefilter provides simple form-based authentication and access control.
The static filter allows CherryPy to serve static files.
The staticfilter has the following configuration options:
static_filter.file
static_filter.dir
static_filter.root
The tidyfilter cleans up returned html by running the response through Tidy.
Note that we use the standalone Tidy tool rather than the python mxTidy module. This is because this module doesn't seem to be stable and it crashes on some HTML pages (which means that the server would also crash.)
The tidyfilter has the following configuration options:
tidy_filter.tmp_dir
tidy_filter.strict_xml, False
tidy_filter.tidy_path
The virtualhostfilter changes the ObjectPath based on the Host. Use this filter when running multiple sites within one CP server.
The virtualhostfilter has the following configuration options:
virtual_host_filter.prefix, '/'
The wsgiappfilter allows the application developer or deployer to mount WSGI-compatible applications and middleware to locations on the CherryPy object tree.
Applications can be added to the tree by using the cherrypy.lib.cptools.WSGIApp convenience class to directly mount applications to the CherryPy tree. You can also add an instance of the filter to a class's _cp_filters list.
The cherrypy.lib.cptools.WSGIApp and WSGIAppFilter class constructors take the following parameters:
wsgi_app (required) - the WSGI application callable.
env_update - an optional dictionary of parameters used to update the WSGI environment.
The xmlrpcfilter converts XMLRPC to the CherryPy2 object system and vice-versa.
PLEASE NOTE:
before_request_body: Unmarshalls the posted data to a methodname and parameters.
- These are stored in cherrypy.request.rpcMethod and .rpcParams
- The method is also stored in cherrypy.request.path, so CP2 will find the right method to call for you, based on the root's position.
before_finalize: Marshalls cherrypy.response.body to xmlrpc.
- Until resolved: cherrypy.response.body must be a python source string; this string is 'eval'ed to return the results. This will be resolved in the future.
- Content-Type and Content-Length are set according to the new (marshalled) data
The xmlrpcfilter does not have any configuration options.
CherryPy 2.1 supports arbitrary WSGI servers, and includes its own WSGI server (the default). This means that you should be able to deploy your CherryPy application using Apache or IIS (among others) without any changes to your application--only the deployment scripts will change.