A High Level Look at How Drupal Works
Can it be done? How long will it take? How much will it cost?
Those are the three questions every marketing manager asks when it comes to website issues and enhancements. If you’re a marketing manager who wants a productive discussion with a Drupal developer, it’s good to have a picture of how all the pieces work together. Every technology has its own way of viewing the world and Drupal is no exception. Learn about Drupal’s architecture, concepts, and terminology before you start talking. (Related article: Optimize Drupal with Regular Pruning)
The easiest way to understand how Drupal works is to go through an overview of its major architectural components, then walk through how Drupal renders a page when a user clicks on a URL.
When Drupal gets compared to WordPress and other content management systems (CMS), the consensus is that it provides the most flexibility for customization. This means it’s well-suited for enterprises that anticipate growth, both in terms of traffic volume and functionality. Drupal’s architecture is more of an interlinked set of components than a structured hierarchy. There’s general agreement that Drupal consists of core APIs, modules, and themes, but five different developers would probably create five different architectural diagrams. Some would say it’s more important to understand that Drupal is content-centric and event-driven than to try to draw boxes around its components.
[Diagrams (L to R) from: PHP Everyday; Lullabot, Inc.; Doodlepress]
In the enterprise space, one of Drupal’s architectural strengths is its support for multi-site and virtual site environments – where different web properties appear to be separate sites but actually share common Drupal components. Organizations with different business units and product lines might deploy as a multi-site installation, so that each business unit can tailor its own theme, content, and functionality. Organizations such as car franchises might deploy as a virtual site installation so that each dealership can localize information such as inventory, events, and promotions while adhering to franchise-managed branding and content.
[Diagram: An Introduction to Drupal Architecture by John VanDyk]
At the heart of Drupal is a set of API services (shown in blue in the diagram above) that modules use to interact with content and with other modules. Drupal’s modules sit on top of these API services. Some of the core APIs worth highlighting:
- Caching API. This improves response time by storing the output of a page so that Drupal doesn't need to render it again each time there is a page request.
- Database Abstraction API. This shields the developer from having to work directly with a specific database system. As long as someone has written a driver that allows Drupal to interface with a particular database, this API lets you run queries and update or delete tables without writing database-specific code.
- Menu API. Menu API does the heavy work of routing. Whenever a web page is required, Menu API determines what’s needed to build that page. It checks that the client has permission to access the path, invokes the first module needed to render the page, and if all goes well, Menu API eventually delivers a rendered page to the client. A module might contain hooks (“events”) that bring other modules into play but in the end, it’s the Menu API that delivers the result.
- Modules API. Handles loading of Drupal modules, including creating events so that other modules required for building the page know to spring into action.
- Session Handling API. This authenticates users and then keeps track of those who are logged in by means of unique session IDs, so that other processes can work with session information.
- Theming API (Rendering). Just what it sounds like – controls the presentation of content from Drupal. This handles requests for themed output, taking raw data and applying the correct layout for the page, and sends the result back to the Menu API.
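To make the routing role described for the Menu API more concrete, here is a minimal sketch in Python. It is not Drupal's actual API (Drupal is written in PHP); the names `Route`, `ROUTES`, and `deliver`, and the sample paths and permissions, are all made up for illustration. It only models the idea: look up the path, check permission, then run the callback that builds the page.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Set


# Hypothetical names throughout: this models path-based routing with an
# access check, not Drupal's real Menu API.
@dataclass
class Route:
    permission: str               # permission required to view the path
    callback: Callable[[], str]   # builds the page content for the path


ROUTES: Dict[str, Route] = {
    "about": Route("access content", lambda: "About us"),
    "admin/reports": Route("view reports", lambda: "Site reports"),
}


def deliver(path: str, user_permissions: Set[str]) -> str:
    """Look up the route, check access, then run the page callback."""
    route = ROUTES.get(path)
    if route is None:
        return "404 Not Found"
    if route.permission not in user_permissions:
        return "403 Access Denied"
    return route.callback()


anonymous = {"access content"}
print(deliver("about", anonymous))          # About us
print(deliver("admin/reports", anonymous))  # 403 Access Denied
print(deliver("missing", anonymous))        # 404 Not Found
```

The point for a non-developer is that access control and page building hang off the same lookup, which is why a change to a URL or a permission can ripple into what gets rendered.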
A module consists of code and files that extend Drupal's functionality. Modules make use of Drupal's APIs. Modules are event-driven: they ‘listen’ for hooks, or events, generated by APIs and other modules which will trigger them into action. This notion of events and listening for them is fundamental to Drupal. It’s how modules communicate.
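The listen-and-respond idea can be sketched in a few lines of Python. This is an illustration of the pattern only, under the assumption that a hook is simply a named event with a list of listeners; the registry, the function names, and the two sample modules are hypothetical and do not reproduce Drupal's PHP implementation.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical registry: hook name -> list of (module name, listener function).
_listeners: Dict[str, List[Tuple[str, Callable[[dict], None]]]] = {}


def implement_hook(hook: str, module: str, func: Callable[[dict], None]) -> None:
    """Register a module's listener for a named hook."""
    _listeners.setdefault(hook, []).append((module, func))


def invoke_all(hook: str, context: dict) -> List[str]:
    """Fire a hook: call every listener in registration order.

    Returns the names of the modules that responded.
    """
    responded = []
    for module, func in _listeners.get(hook, []):
        func(context)
        responded.append(module)
    return responded


# Two made-up modules react to the same event without knowing about each other.
def comment_node_view(context: dict) -> None:
    context["comment_count"] = 3  # placeholder value for illustration


def tags_node_view(context: dict) -> None:
    context["tags"] = ["drupal", "cms"]


implement_hook("node_view", "comment", comment_node_view)
implement_hook("node_view", "tags", tags_node_view)

page = {"title": "Hello"}
print(invoke_all("node_view", page))  # ['comment', 'tags']
print(page)
```

Neither sample module calls the other directly. Each one only responds to the event, which is what lets a site add or remove modules without rewriting the ones already in place.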
Some modules are essential to Drupal operations. These are part of the standard Drupal release which is maintained by the Drupal development team, and known collectively as Core Modules. Other modules are contributed modules which developers can add and enable for their installation. Architecturally there’s no difference between contributed modules and core modules; they all make use of standard Drupal APIs.
When it comes to writing modules, there are well-documented best practices. However, thanks to a large Drupal community of module contributors, it’s very likely that a developer can assemble most of the functionality you need from ‘contrib’ modules rather than writing them from scratch. Whether a module is free or premium, a responsible developer first determines the risk factor by doing some research on a module: reading reviews, checking its support status, whether it’s widely used, and testing functionality and configurability.
Other Drupal Concepts
Some other terminology you’ll come across in Drupal:
- Nodes. A node is a set of related information that makes up a single piece of content of a given content type: a page, an article, a blog entry, or a forum topic. Each of these types can be composed of multiple elements such as title, text, comments, creation date, author, and tags. All of these are stored together as a node.
- Hooks. Locations in modules where code can be executed. A hook is an event listener. Events trigger actions.
- Render arrays. This is how Drupal prefers to receive data and information about how the data should be rendered. Avoid direct HTML markup and put all that information in a render array.
- Alter hooks. How modules can alter content (data) before a page is fully rendered. It’s a way to customize modules without touching their code. Starting in Drupal 7, even themes can use alter hooks.
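The relationship between render arrays and alter hooks is easier to see in a small example. The Python sketch below is a simplified stand-in: the dictionary keys (`tag`, `text`, `children`) and the function names are invented for illustration and are not the keys Drupal uses. What it shows is the principle: content stays as structured data for as long as possible, so another module can change it before it is turned into HTML.

```python
from html import escape


# A made-up, simplified "render array": structured data plus rendering hints.
# The keys ("tag", "text", "children") are illustrative, not Drupal's real keys.
def build_page() -> dict:
    return {
        "tag": "article",
        "children": [
            {"tag": "h1", "text": "Welcome"},
            {"tag": "p", "text": "Fish & chips on the menu today."},
        ],
    }


def promo_alter(element: dict) -> None:
    """An alter-style hook: changes the structure before it becomes HTML."""
    element["children"].append({"tag": "p", "text": "Sale ends Friday."})


def render(element: dict) -> str:
    """Turn the structure into HTML only at the last moment."""
    inner = escape(element.get("text", ""))
    inner += "".join(render(child) for child in element.get("children", []))
    return f"<{element['tag']}>{inner}</{element['tag']}>"


page = build_page()
for alter in (promo_alter,):  # every registered alter hook gets a turn
    alter(page)
print(render(page))
```

If `build_page` had returned a finished HTML string instead, `promo_alter` would have had to parse and patch markup. That is the practical reason behind the advice to avoid direct HTML.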
How Drupal Renders a Page
(Assume an Apache webserver)
1. A user makes a webserver request by clicking on a URL.
2. Apache reads its per-directory configuration file, .htaccess (which developers can edit to override web server settings); the rewrite rules in .htaccess route the request to index.php, Drupal’s main script file.
3. index.php bootstraps the process: it loads the APIs in the includes folder, initializes the database connection and session handling, loads the libraries, and prepares to handle the request.
4. The Menu API, one of the APIs loaded up, now checks for the module responsible for getting content from the database for that URL. Modules are event-driven and listen to the Menu API for requests.
At this point, Drupal gets to skip steps 5 – 7 if all of the following are true:
- page caching is turned on
- the page has already been rendered once during the caching interval
- the user is not logged in (‘anonymous’)
Caching stores pages in their rendered form so that Drupal can just serve up the page without having to assemble it again. This means that ‘static’ pages that don’t interact with users are good candidates for being cached. On a website with lots of static pages, this can represent a significant performance boost.
5. When the module gets the request, it goes to the database and loads up content from the appropriate node, but it also fires up hooks to allow other modules to interact. These other modules might add business logic, extra functionality, make changes to the content, bring in more content, and/or hand off to other modules by firing up other hooks. This handing-off and delegation is what makes Drupal event-driven. Modules build the content and respond to events.
6. Assuming there are no errors, when all the content is available the module(s) hand back control to the Menu API, which determines which theme should be used and hands the raw, unformatted data to the Theme layer.
7. The Theme styles the content and may also call more hooks, or change the content. Then it hands a fully-formed HTML page back to Menu API, which hands the page to the user’s browser for rendering.
8. The browser renders the page for the user, the user clicks on another link, and the entire process starts all over again.
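The order of events in the walkthrough above can be traced with a toy pipeline. The Python below is a deliberately small sketch with hypothetical names (`handle_request`, `article_page`, `add_byline`, `theme`); each function stands in for one stage and records its name, so the sequence is visible. It leaves out caching, access checks, and error handling.

```python
from typing import Callable, Dict, List, Tuple


def add_byline(content: dict) -> None:
    """A second, made-up module reacting to a hook fired during the build."""
    content["body"] += " (posted by admin)"


HOOKS: List[Callable[[dict], None]] = [add_byline]


def article_page(trace: List[str]) -> dict:
    """The responsible module loads content and fires hooks."""
    trace.append("build")
    content = {"title": "Hello", "body": "First post"}  # stands in for a node
    for hook in HOOKS:
        trace.append("hook")
        hook(content)
    return content


PAGES: Dict[str, Callable[[List[str]], dict]] = {"node/1": article_page}


def theme(content: dict, trace: List[str]) -> str:
    """The theme layer turns raw data into HTML."""
    trace.append("theme")
    return f"<h1>{content['title']}</h1><p>{content['body']}</p>"


def handle_request(path: str) -> Tuple[str, List[str]]:
    trace = ["bootstrap"]          # load APIs, database, session
    trace.append("route")          # find the module responsible for the path
    callback = PAGES[path]
    content = callback(trace)      # build content, let other modules react
    html = theme(content, trace)   # hand raw data to the theme
    trace.append("deliver")        # page goes back to the browser
    return html, trace


html, trace = handle_request("node/1")
print(trace)  # ['bootstrap', 'route', 'build', 'hook', 'theme', 'deliver']
print(html)   # <h1>Hello</h1><p>First post (posted by admin)</p>
```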
What all this means is that Drupal modules are not standalone pieces of code. They’re part of a request lifecycle infrastructure where multiple events take place and interact. Modules ‘pass the baton’ so to speak. So when developers debug code, they must figure out what else Drupal and other modules might be doing that impacts a specific module.
Similarly, when it comes to optimizing performance, it’s worth remembering that there’s no way a developer can bypass this overhead of passing the baton from module to module. That’s why advice about Drupal always cautions against using too many modules. There’s more to be gained from creating efficient data fields right from the start and by making precise data queries (i.e. requesting specific data fields instead of all fields “*”); by doing so, when Drupal gets to Step 5 (above) it has less work to do if it’s dealing with smaller tables and indexes.
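The difference between a broad query and a precise one is easy to demonstrate. The Python sketch below uses SQLite as a stand-in database with a made-up `article` table; it is not Drupal's database API or schema, only an illustration of requesting specific fields instead of “*”.

```python
import sqlite3

# In-memory SQLite database with a hypothetical table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE article ("
    "id INTEGER PRIMARY KEY, title TEXT, body TEXT, author TEXT)"
)
conn.executemany(
    "INSERT INTO article (title, body, author) VALUES (?, ?, ?)",
    [
        ("First post", "long body text " * 100, "ana"),
        ("Second post", "long body text " * 100, "ben"),
    ],
)

# Broad query: every column of every row comes back, including the large body.
all_rows = conn.execute("SELECT * FROM article").fetchall()

# Precise query: only the column needed for a list of titles by one author.
titles = conn.execute(
    "SELECT title FROM article WHERE author = ?", ("ana",)
).fetchall()

print(len(all_rows[0]), len(titles[0]))  # 4 1
print(titles)                            # [('First post',)]
```

On two rows the saving is invisible, but the same pattern applied to thousands of nodes with large text fields determines how much data the database has to read and send back on each page build.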
Talking to Drupal Developers
Can it be done? How long will it take? How much will it cost?
Now that you have a picture of how Drupal works, you’ll understand what Drupal developers need to research before they give you an answer. Some contributing factors include:
- Is the website current: if your site has been well-maintained and has all the latest security updates, it makes life easier, especially if there’s custom code involved.
- Can it be done by modifying the theme: if the only change is at the presentation layer, it’s an easier task than changes that involve tweaking the database or integrating a module.
- Can it be done with Views or Panels: these are modules familiar to every Drupal developer. Chances are they’re already installed and have already been used to create queries and display data. Again, this adds new features to the website without changes at a deeper level.
- Is there a module for that: sometimes it sounds as though a new module will do the job. Even so, the developer needs to do some research. Will it work with your version of Drupal? Will it work with your existing modules and themes? Will it work with your custom/3rd party code?
- Does it mean changes to the database: when this happens, it also means reviewing all associated queries, tables, and data views.
- Does it mean integration changes: in many enterprises, the Drupal site is integrated with a 3rd party app. It could share or synchronize data, share authentication, or run external scripts. Making changes to this scenario means understanding the sharing mechanism (usually an API) and sometimes the 3rd party app itself.
As a final word, remember that when it comes to performance, Drupal runs as part of a total webserver infrastructure. This includes the web server, database server, and network connections. When performance lags, a good Drupal site admin will check for performance bottlenecks on those components before asking for assistance with Drupal specifics from a developer.