This Website: The Generator
All of the code I'm going to discuss here is available on GitHub. Links to specific bits of code will point at what is currently the HEAD commit - things will naturally move and change as time passes.
This post is a high-level overview of the project's implementation. More detailed discussions of things like Razor customizations or the blurb extractor will follow.
Why though?
So, why did I write my own site generator instead of just using an off-the-shelf solution like Hugo? There are a few reasons:
The first is control. If you read about this site's history, you'll see that I've been burned a few times by taking dependencies on things that get abandoned, broken, monetized, and so on. I don't ever want to have to start from scratch.
The second is familiarity. I already know Razor and I like it, so I wanted to use it to define my theme and layout logic.
Also, I have simple needs. I don't need some big thing with loads and loads of bells and whistles that I have to wade through when trying to find answers to problems. I just want a few thousand lines of code that I can confidently edit on my own.
And finally, it was kinda fun.
Features
Here's the feature set I settled on:
- Markdown content.
- YAML front matter. (Actually not my favorite thing in the world, but everybody else is doing it so it's just what's easy.)
- Plain old C# to define site structure.
- Razor templating logic for structure and theming. (_layout.cshtml let's gooo!)
- Razor (and even Markdown) partials.
- Automatic page generation which isn't tied to arcane dependency management.
- It needs to run fast.
Implementation
So, how is it built?
Well, let's trace the execution stages:
Finding all the content
Enumerating the source directory
The first thing I do is scan the source directory. This just builds a list of input files, represented as ContentItem objects.
Some of the input files aren't actual content, and all of those will ultimately be removed from the input item set.
Compiling the site assembly
The first of these are the .cs files which make up the site assembly, which is automatically referenced in all of the Razor templates. They're pulled out of the input set and compiled first.
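A minimal sketch of the scan and this first pull-out step, not the generator's actual code. The tuple stands in for ContentItem, and ScanSource, Classify, the kind strings, and the sample file names are all made up for illustration:

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Build a throwaway source tree so the sketch can run anywhere.
string sourceRoot = Path.Combine(Path.GetTempPath(), "site-scan-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(Path.Combine(sourceRoot, "posts"));
File.WriteAllText(Path.Combine(sourceRoot, "Site.cs"), "// site assembly source");
File.WriteAllText(Path.Combine(sourceRoot, "_layout.cshtml"), "layout goes here");
File.WriteAllText(Path.Combine(sourceRoot, "posts", "hello.md"), "# Hello");

// Stage 1: every file under the source root becomes an input item.
List<(string RelativePath, string Kind)> items = ScanSource(sourceRoot);

// Stage 2: the .cs files are not content, so they leave the input set first.
var siteSources = items.Where(i => i.Kind == "site-source").ToList();
items.RemoveAll(i => i.Kind == "site-source");

foreach (var item in items)
    Console.WriteLine($"{item.Kind,-12} {item.RelativePath}");

Directory.Delete(sourceRoot, recursive: true);

// Hypothetical scanner: returns project-relative paths with forward slashes.
List<(string RelativePath, string Kind)> ScanSource(string root)
{
    return Directory
        .EnumerateFiles(root, "*", SearchOption.AllDirectories)
        .Select(path => Path.GetRelativePath(root, path).Replace('\\', '/'))
        .OrderBy(path => path, StringComparer.Ordinal)
        .Select(path => (RelativePath: path, Kind: Classify(path)))
        .ToList();
}

// Hypothetical classification by file extension.
string Classify(string relativePath)
{
    switch (Path.GetExtension(relativePath).ToLowerInvariant())
    {
        case ".md": return "markdown";
        case ".cshtml": return "razor";
        case ".cs": return "site-source";
        default: return "static-file";
    }
}
```

The real tool would hand the pulled-out sources to the compiler at this point; the sketch stops at the partitioning.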
Compiling the Razor pages
The next thing to do is compile the Razor pages. The initial scan tracks these as dummy objects since the full RazorPage class can't initialize itself properly without the site assembly.
Handling IEnumeratedContent
The last part of this process involves enumerated content. These are Razor templates that produce multiple output pages based on some sort of LINQ query, like the one in my tags.cshtml file, which generates the per-tag post listing pages:
@enumerate IGrouping<string, (MarkdownPage Page, PostFrontMatter FrontMatter)> Tag
{
Items = Project.Content.GetPosts().ByTag();
OutputPath = t => Posts.TagPath(t.Key);
}
The @enumerate directive drives some custom Razor logic that generates the necessary code to tell the generator what we're generating, how to enumerate it, and how to construct an output path for each of the generated items. The rest of the Razor template doesn't run until after items are enumerated, and then it runs once per item, with the Tag property (as defined in the @enumerate directive itself - this could be named differently in different templates) set to the item.
There's a bit of an awkward dance that has to be done here in order to make everything make sense from the perspective of inter-page dependencies.
- First, all of the enumerated templates are pulled from the input set and set aside.
- Then, everything which remains is initialized, which loads in its basic metadata (such as front matter). This is important because that metadata is likely to be criteria for the enumeration queries.
- The initialized pages are added to the input set.
- The enumerated templates are initialized.
- The enumerated templates are executed, and new content items are generated for each of their outputs.
And then for some reason I thought that maybe enumerated pages might spawn other enumerated pages, so this is all in a weird loop and ...I don't really know what I was thinking here. This is a thing that's complicated in all the other static site generators I looked at, and that loop is mostly a placeholder for "future multiphase/dependency logic here".
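The steps above can be sketched in a few lines of plain LINQ. This is only an illustration under assumptions: the tuples, the hard-coded tags standing in for parsed front matter, and the tags/.../index.html output path are all hypothetical, and the grouping is just a rough equivalent of what GetPosts().ByTag() presumably does.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical, simplified input set: (path, is it an enumerated template?).
var inputs = new List<(string Path, bool IsEnumerated)>
{
    ("posts/a.md", false),
    ("posts/b.md", false),
    ("tags.cshtml", true),
};

// Stand-in for front matter that initialization would normally parse.
var tagsBySource = new Dictionary<string, string[]>
{
    ["posts/a.md"] = new[] { "csharp", "razor" },
    ["posts/b.md"] = new[] { "csharp" },
};

// 1. Pull the enumerated templates out of the input set and set them aside.
var enumerated = inputs.Where(i => i.IsEnumerated).ToList();
inputs.RemoveAll(i => i.IsEnumerated);

// 2-3. Initialize what remains (load its metadata) and keep it as the content set.
var pages = inputs
    .Select(i => (Path: i.Path, Tags: tagsBySource[i.Path]))
    .ToList();

// 4-5. Run each enumerated template: one generated output per item of its query.
var outputs = new List<string>();
foreach (var template in enumerated)
{
    var byTag = pages
        .SelectMany(p => p.Tags.Select(tag => (Tag: tag, Page: p.Path)))
        .GroupBy(x => x.Tag, x => x.Page)
        .OrderBy(g => g.Key, StringComparer.Ordinal);

    foreach (var group in byTag)
    {
        string members = string.Join(", ", group);
        outputs.Add("tags/" + group.Key + "/index.html <- " + members);
    }
}

foreach (var line in outputs)
    Console.WriteLine(line);
// tags/csharp/index.html <- posts/a.md, posts/b.md
// tags/razor/index.html <- posts/a.md
```

The ordering matters here: the query in step 4-5 can only see tags because the ordinary pages were initialized in step 2-3 first.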
Generating output
Once all the content has been identified, the tool just executes its custom logic and dumps the output in the .out folder.
This happens in two stages: PrepareContent and WriteContent.
The first, PrepareContent, is where Markdown is turned into HTML, blurbs are extracted, Razor page templates are run, and so on. It's designed to do as much processing as might be required by other items. Naturally, this requires items to form an acyclic dependency graph (and the tool doesn't do anything special if they don't - it just deadlocks).
PrepareContent is basically the scaffolding that the blurb-extraction logic (which drives the little extracts you can see in the post listing) rests on.
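One common way to get this "prepare whatever others need, exactly once" behavior is a lazily started task per item. The sketch below uses that technique purely as an illustration; it is an assumption, not a description of how the tool does it, and the dependency map, PrepareAsync, and the output format are all hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical dependency map: the listing page needs each post's prepared output.
var dependsOn = new Dictionary<string, string[]>
{
    ["posts/a.md"] = Array.Empty<string>(),
    ["posts/b.md"] = Array.Empty<string>(),
    ["index.cshtml"] = new[] { "posts/a.md", "posts/b.md" },
};

// One lazily started task per item, so each item is prepared at most once.
var prepared = new Dictionary<string, Lazy<Task<string>>>();

async Task<string> PrepareAsync(string name)
{
    await Task.Yield(); // stand-in for real work (Markdown rendering and so on)

    // Touching a dependency's Value starts its preparation if nobody has yet.
    string[] parts = await Task.WhenAll(dependsOn[name].Select(d => prepared[d].Value));
    return name + "(" + string.Join(",", parts) + ")";
}

foreach (var name in dependsOn.Keys)
    prepared[name] = new Lazy<Task<string>>(() => PrepareAsync(name));

await Task.WhenAll(dependsOn.Keys.Select(k => prepared[k].Value));

Console.WriteLine(prepared["index.cshtml"].Value.Result);
// index.cshtml(posts/a.md(),posts/b.md())
```

In a scheme like this, two items that depend on each other would each await a task that can never finish, which is the same kind of hang the acyclic requirement is there to avoid.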
WriteContent actually writes the prepared output to disk. I'm a bit inconsistent on where, exactly, I apply layouts and run minification and so on. That'll probably get cleaned up eventually.
Finally, leftover files from previous runs which no longer belong in .out are deleted.
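A simple way to do that kind of cleanup is to remember every path written during the run and delete whatever else is on disk afterwards. The sketch below assumes that approach; WriteOutput and the file names are hypothetical, and a temp directory stands in for .out.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// A temp directory stands in for .out, seeded with a file from an "earlier run".
string outRoot = Path.Combine(Path.GetTempPath(), "site-out-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(outRoot);
File.WriteAllText(Path.Combine(outRoot, "stale.html"), "left over from an earlier run");

// Full paths written during this run.
var written = new HashSet<string>(StringComparer.OrdinalIgnoreCase);

void WriteOutput(string relativePath, string html)
{
    string fullPath = Path.GetFullPath(Path.Combine(outRoot, relativePath));
    Directory.CreateDirectory(Path.GetDirectoryName(fullPath) ?? outRoot);
    File.WriteAllText(fullPath, html);
    written.Add(fullPath);
}

WriteOutput("index.html", "<h1>Home</h1>");
WriteOutput("posts/hello/index.html", "<h1>Hello</h1>");

// Anything on disk that this run did not write is a leftover.
var stale = Directory
    .EnumerateFiles(outRoot, "*", SearchOption.AllDirectories)
    .Select(p => Path.GetFullPath(p))
    .Where(p => !written.Contains(p))
    .ToList();
foreach (var leftover in stale)
    File.Delete(leftover);

var remaining = Directory
    .EnumerateFiles(outRoot, "*", SearchOption.AllDirectories)
    .Select(p => Path.GetRelativePath(outRoot, p).Replace('\\', '/'))
    .OrderBy(p => p, StringComparer.Ordinal)
    .ToList();
Console.WriteLine(string.Join(", ", remaining));
// index.html, posts/hello/index.html

Directory.Delete(outRoot, recursive: true);
```

This sketch leaves empty directories behind; pruning those would be a separate pass.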
Layout and theming
In addition to the Markdown and Razor pages, the tool supports certain auxiliary Razor templates. The most important are the _layout.cshtml files.
A page, in essence, consists of body text. But we need more than that to make a web page. The rest of the scaffolding comes from the layout files. These are processed in order, starting in the page's source directory and moving up to the project root. (So, for instance, the layout file in the posts/ directory adds the publication date and tags structure, as well as editor's notes, and then the root layout glues in all the CSS, scripts, and so on.)
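To make the ordering concrete, here is a small sketch of walking from a page's directory up to the root and wrapping the body one layout at a time. LayoutChain, ApplyLayouts, the bracket notation, and the sample path are hypothetical; the sketch also assumes a layout exists at every level, which a real site need not have.

```csharp
using System;
using System.Collections.Generic;

var layoutsForPost = LayoutChain("posts/2023/hello.md");
Console.WriteLine(string.Join(" -> ", layoutsForPost));
// posts/2023/_layout.cshtml -> posts/_layout.cshtml -> _layout.cshtml

string wrapped = ApplyLayouts("body", layoutsForPost);
Console.WriteLine(wrapped);
// [_layout.cshtml [posts/_layout.cshtml [posts/2023/_layout.cshtml body]]]

// Candidate layout files for a page, nearest directory first, root last.
// Paths are project-relative and use forward slashes.
List<string> LayoutChain(string pagePath)
{
    var chain = new List<string>();
    int slash = pagePath.LastIndexOf('/');
    string dir = slash < 0 ? "" : pagePath.Substring(0, slash);
    while (true)
    {
        chain.Add(dir.Length == 0 ? "_layout.cshtml" : dir + "/_layout.cshtml");
        if (dir.Length == 0)
            return chain;
        slash = dir.LastIndexOf('/');
        dir = slash < 0 ? "" : dir.Substring(0, slash);
    }
}

// Applying layouts innermost-first means the root layout ends up outermost.
string ApplyLayouts(string body, IEnumerable<string> layouts)
{
    foreach (var layout in layouts)
        body = "[" + layout + " " + body + "]";
    return body;
}
```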
Layouts can also restrict themselves to only a subset of the content in their folder. For instance, /posts/_layout{%2A.md}.cshtml has a weird thing going on in the {%2A.md}. If you unencode it, it reads {*.md}, which means it applies only to the Markdown files and not also to the output of the post listings.
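Decoding and applying such a filter could look roughly like the sketch below. LayoutFilter and AppliesTo are hypothetical helpers, and the glob handling (only * and ?, matched case-insensitively against the file name) is an assumption about what the filter supports.

```csharp
using System;
using System.Text.RegularExpressions;

Console.WriteLine(LayoutFilter("_layout{%2A.md}.cshtml"));              // *.md
Console.WriteLine(AppliesTo("_layout{%2A.md}.cshtml", "hello.md"));     // True
Console.WriteLine(AppliesTo("_layout{%2A.md}.cshtml", "index.cshtml")); // False
Console.WriteLine(AppliesTo("_layout.cshtml", "index.cshtml"));         // True

// Extracts and percent-decodes the {...} part of a layout file name.
// A layout without braces is treated as applying to everything.
string LayoutFilter(string layoutFileName)
{
    var match = Regex.Match(layoutFileName, @"^_layout\{(?<filter>[^}]*)\}\.cshtml$");
    return match.Success ? Uri.UnescapeDataString(match.Groups["filter"].Value) : "*";
}

// Translates the decoded glob into a regex and tests a content file name.
bool AppliesTo(string layoutFileName, string contentFileName)
{
    string glob = LayoutFilter(layoutFileName);
    string pattern = "^" + Regex.Escape(glob).Replace(@"\*", ".*").Replace(@"\?", ".") + "$";
    return Regex.IsMatch(contentFileName, pattern, RegexOptions.IgnoreCase);
}
```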