A new website for 2022

Hello reader, happy new year 2022 (yes, I'm more than a month and a half late)! It has been a while since I last published anything.

2021 was complicated, to say the least, and I did not take the time to stop, reflect, and write things down the way I wanted. My attention kept switching between way too many things throughout the year, and now, looking back at the past 12 months or so, I feel frustrated for not having a track record. That's something I want to improve in the future: more articles, more often, as a way to hold myself accountable and document what I do.

I made a new website

And you're browsing it right now! The previous website was based on Grav, a cool little "flat-file CMS" written in modern PHP: you describe your pages and types of content using yaml and markdown files, and a simple web UI lets you configure the site and create and manage content. Flat files can be saved in git, and pages and assets are aggressively cached, so you get the benefits of a static website while keeping the user experience of a dynamic CMS. It's a nice alternative to Wordpress for simple websites. But after ~2 years I felt it wasn't a good fit for me: I have very little interest in PHP, so maintaining my custom theme was more annoying than fun, and given how simple and static my content is, maintaining an Nginx config and running a PHP process felt like too much. Also, the plugins I used broke too often for my taste.

So I ditched all of it and rebuilt the website from scratch. Was that an excuse for a side-project? Well, yes! And this article covers my journey :)

Start the new year by learning a new language

Before discussing the website itself, let's take a detour on a completely different topic. I spent the past 5 years writing mostly Go and TypeScript, and for some reason I ended 2021 with a boost of motivation to learn a new programming language. Discovering and experimenting with new languages is something I enjoy and have done often in the past, and because of this I know it can also be frustrating and time consuming. I need to be in the right mindset to benefit from the experience, and the transition to 2022 felt like a good time for it.

As a side note, in 2021 I went back to C# after a bit less than a decade. The language changed so much during that time that it felt like learning a wildly different but familiar creature than the one I remembered.

My mental list of new languages to investigate was made of names I'd seen mentioned on HackerNews over the past few years: V, Zig, and Rust. On the surface they all seem focused on type and memory safety, low memory footprint, and performance.

I started looking into V, and was really surprised by how much the documentation promised: it seemed way too good to be true, basically describing all the popular features a language could have without any of the cost. Having spent a lot of time writing Go over the past 5 years, the idea of a faster, leaner language inspired by Go felt like a cool thing to check out. Well, that went quickly. I played around a bit and checked out the stdlib implementation: the compiler is broken, and documented features are either missing or broken too. Your V program is guaranteed to leak memory with the current version of the compiler, and that's unlikely to change in the future. I cannot say if it is a work-in-progress project that overpromised or just vaporware, but to be honest I do not really care, I lost all interest in this project. The GitHub repository has 26.9k stars, which is the best proof I've seen that star count is a terrible metric to judge the quality of a project.

Next I looked into Zig. The language looks interesting and seems to be a serious attempt: it positions itself as a safe C competitor, but its main novel feature, comptime, which enables running arbitrary (I think?) code at compile-time, can be leveraged to implement high-level abstractions. That makes the language a potential competitor for C++ and Rust. For example, generics are implemented using compile-time parameters:

// from the documentation: https://ziglang.org/documentation/master/#Compile-Time-Parameters

fn max(comptime T: type, a: T, b: T) T {
    return if (a > b) a else b;
}
fn gimmeTheBiggerFloat(a: f32, b: f32) f32 {
    return max(f32, a, b);
}
fn gimmeTheBiggerInteger(a: u64, b: u64) u64 {
    return max(u64, a, b);
}

The same principle is used for generic data structures:

// from the documentation: https://ziglang.org/documentation/master/#Generic-Data-Structures

fn List(comptime T: type) type {
    return struct {
        items: []T,
        len: usize,
    };
}

// The generic List data structure can be instantiated by passing in a type:
var buffer: [10]i32 = undefined;
var list = List(i32){
    .items = &buffer,
    .len = 0,
};

But the language goes even further, for example with compile-time reflection!

After spending a bit of time reading the documentation and trying (and failing a lot) to get simple programs to compile, I'm convinced that Zig is on the right path and will have an interesting future. The community seems serious and competent. But it still feels a bit early: the documentation is functional but difficult to approach if you don't know what you're doing, and the tooling is not ready yet. For example ZLS, the language server, was crashing a lot with very simple programs, and I lost interest after trying to debug it for some time. String handling also seems barely supported by the stdlib, which makes small things more complicated than I expected. Nevertheless, I got a really good impression of Zig from that short time writing toy programs, and I plan to look at it again once it reaches a stable version (it is currently at 0.9.0).

Switching to Rust as my main language

The title gives it away. I started by experimenting with windows-rs, an official "crate" (Rust jargon for "package" or "library") to use Windows APIs. Everything went so smoothly, and I had so much fun with the language and the Windows APIs, that I spent the next 2 weeks experimenting with simple Win32 GUI applications in Rust.

I still remember the first time I tried out Rust, in 2015. Things went really badly. At the time I could not understand what the borrow-checker was doing, and I felt the compiler was constantly reminding me how stupid I was. Everything felt over-complicated and I gave up with a sour taste of frustration. And I guess that makes sense: back then my experience was almost exclusively with garbage-collected languages, where the concept of ownership does exist as part of the runtime but is hidden from the developer: I create a string, pass it around, who cares who owns it, it will eventually be collected. But that was 7 years ago, and a lot has happened since then. Since ~2019 I have tried to learn C++ on the side (I mean, as much as you can "learn" this monster of a language) and consumed a lot of content from the C++ community.

By the way, if you're interested in the C++ world I highly recommend the CppCon and Jason Turner YouTube channels. Lots of really high quality content there.

Jason Turner is also a co-host of the awesome CppCast podcast.

And for an introduction to modern C++ I appreciated A Tour of C++ 2nd edition.

The one thing I got from this exposure to C++: memory management revolves around ownership. Developing this intuition of "who owns what" and how and when to pass data ownership around is fundamental to understanding move semantics, a very important part of modern C++. In fact it has become so essential to the language that there is now an entire book on the topic! (and it is a great read if you have the time. For real!) Every time you pass something around you have these questions in mind: Pass a copy? Pass a reference? Pass a pointer? Who owns the pointer now? Does the destructor take ownership into account? It's always there in your mind. And that's exactly what Rust's type system and borrow checker are for: instead of keeping that line of questioning in your head, you encode the information and statically check its correctness. So in some ways exploring C++ was my training for Rust, I just didn't know it at the time. And it sure wasn't the most efficient way to develop that intuition...
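
To make this concrete, here is a toy example of my own (not from any particular codebase) showing how Rust function signatures encode those same questions:

// passing by value moves ownership: the caller gives the String away
fn consume(s: String) -> usize {
    s.len()
} // `s` is dropped here: the callee owned it

// passing a reference only borrows: the caller keeps ownership
fn inspect(s: &str) -> usize {
    s.len()
}

fn main() {
    let name = String::from("hello");
    inspect(&name); // fine: only borrowed
    consume(name); // ownership moves into consume
    // inspect(&name); // would not compile: name was moved above
}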

Ok, no more fighting with the borrow checker, that's great. What about the rest? Rust already has enough evangelism online, so I won't go into the details of why it's such a great language, at least not in this article (if you want to know, read what Amos has to say). Instead I will just cover a few things I had to learn or adapt to. For example, it took me some time to understand how errors are managed in Rust. A common answer I first got was "just use enums". Unfortunately that's glossing over a lot. People complain a lot about Go error management; I personally find it quite effective for the general use case:

func f() error {
    res, err := fetchStuff()
    if err != nil {
        return fmt.Errorf("failed fetching stuff: %w", err)
    }

    valid, err := res.validate()
    if err != nil {
        return fmt.Errorf("cannot validate stuff: %w", err)
    }
    if !valid {
        return fmt.Errorf("invalid stuff: %#v", res)
    }
    return nil
}

Here %w is used to wrap err. The error can then be unwrapped using errors.Unwrap. And %#v just converts the parameter res to a string representation with some extra metadata, such as the type name.

Straightforward, simple to read and use when you just want to report an error to the caller. It becomes a bit of a pain once you want to check the details of a specific error, as everything is just an error interface, but for a few years now we have had errors.As and errors.Is as a standard way to deal with this. The whole thing still feels a bit hacky but works fine.

In Rust "just use enums" feels like the extreme opposite. That's a great, explicit way to describe error conditions when you have clear cases that are expected to be handled by the caller. For example std::io::ErrorKind is a great use of enum, you can have a clear view of every possible error type and easily pattern match on it:

pub enum ErrorKind {
    NotFound,
    PermissionDenied,
    ConnectionRefused,
    ConnectionReset,
    // more values...
}
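
As an illustration (my own sketch, not taken from the stdlib documentation), handling an I/O error could look like this:

use std::fs;
use std::io::ErrorKind;

fn main() {
    match fs::read_to_string("config.toml") {
        Ok(contents) => println!("read {} bytes", contents.len()),
        Err(e) => match e.kind() {
            ErrorKind::NotFound => eprintln!("no config file, using defaults"),
            ErrorKind::PermissionDenied => eprintln!("cannot read the config file"),
            _ => eprintln!("unexpected I/O error: {}", e),
        },
    }
}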

But what about your application code, where you just want to handle a few error cases and return a general error that will, for example, be logged or reported accordingly, similar to what is done in Go with the error interface? A straightforward solution is to use the Error trait this way:

pub type Error = Box<dyn std::error::Error>;
pub type Result<T> = std::result::Result<T, Error>;

fn f() -> Result<()> {
    let res = fetch_stuff().map_err(|e| format!("failed to fetch stuff: {}", e))?;
    let valid = res.validate().map_err(|e| format!("cannot validate stuff: {}", e))?;
    if !valid {
        return Err("invalid stuff".into());
    }
    Ok(())
}

fn main() {
    match f() {
        Ok(_) => println!("done!"),
        Err(e) => {
            eprintln!("woops: {}", e);
            std::process::exit(1);
        }
    }
}

dyn std::error::Error here is called a trait object, which results in dynamic dispatch via a vtable, and Box<T> is used to allocate on the heap. The result is fairly similar to what is done in Go with the error interface. Using enums for all errors requires a lot of boilerplate but avoids both the dynamic dispatch and the heap allocation.

So far, using enums for errors in libraries and a trait object error for application code has been a good balance for my programs. I've also seen that multiple crates exist to make your life easier, like anyhow, but I have not investigated them yet.

Other than error management, I also needed some time to get used to the compiler messages. They felt really cryptic at the beginning, until I understood how Box, traits, and enums work. I just needed some exposure and time to adapt: once you get used to them, Rust error messages become really, really helpful, with nice little ASCII art arrows and coloring representing ownership/borrowing issues. It's such a radical improvement compared to C++ compilers.

As a conclusion, I'm in love with Rust enums and traits. They are simple to learn and apply, and can be used to build nice abstractions. The explicit ownership is daunting at first but makes perfect sense and is such a fantastic idea. The stdlib support for common monads such as Option and Result is just great and results in very readable yet terse code, which is always appreciated. Macros are hardcore, the most complicated part of the language I've seen so far (I haven't tried async yet), but also ridiculously powerful (they also slow down compilation). Initially I thought I would just play around a bit, but after a month I ended up completely convinced by both the language and the ecosystem, and I cannot see myself using Go or C++ again unless I'm paid for it (Want to pay me? I'm available for contracting work, reach out at contact at elborai.me).
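
To illustrate what I mean by terse, here is a made-up example (not code from this website) chaining Option combinators:

// chain Option combinators to parse an optional port with a default
fn parse_port(input: Option<&str>) -> u16 {
    input
        .map(str::trim)                      // normalize the value if present
        .filter(|s| !s.is_empty())           // treat empty strings as absent
        .and_then(|s| s.parse::<u16>().ok()) // keep only valid numbers
        .unwrap_or(8080)                     // fall back to a default
}

fn main() {
    assert_eq!(parse_port(Some("9000")), 9000);
    assert_eq!(parse_port(Some("not a port")), 8080);
    assert_eq!(parse_port(None), 8080);
}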

To be honest this experience is puzzling: I've seen so many people online complaining about Rust becoming as complicated as C++, and I cannot see how someone arrives at that conclusion. The language feels straightforward to me, with no gotchas so far (I'm sure some exist, I've heard about Pin and async). Even when using unsafe and transmute (Win32 requires a lot of pointer manipulation) I do not feel the complexity people seem to complain about. Maybe that will change once I become more experienced.

So, a new website you said?

Ok, that was a long walk off the beaten track. But now you can see where my mind was: I am learning a new language, feel unsatisfied with my previous website setup, and have thoughts and experiments I want to record over time. That sounds like a good justification to implement my own static site generator! So that's what I've done. Of course I'm familiar with existing tools like Hugo, Zola, and others, but I don't really like such general-purpose site generators: I always end up maintaining a bunch of interconnected yaml files (something I personally dislike) and fighting with their choice of libraries and template system, which is exactly what I do not want to deal with. And I want to write Rust :)

My list of requirements for the first version was simple:

  • markdown for content
  • as static as possible (i.e. only use Javascript when necessary)
  • extensible but simple templating system (i.e. should not be a limitation for future features I want to add)
  • iterating over styles and layout should be fast and simple

And then I have the infinite list of things that would be cool to have, that may or may not exist in the future:

  • multi-part pieces (series, long tutorials)
  • hot-reload (for local dev)
  • pagination
  • syntax highlighting for code snippets
  • lightbox for images
  • "cool bear hot tips" and "cool bear says/I say", Amos-style
  • a publishing queue
  • image captions
  • local GUI to create content
  • incremental builds
  • dark/light theme
  • custom markdown syntax for banners and more
  • RSS/Atom/JSON feed
  • and way more small things...

Always start with the content

I like to think of an article as a directory. For simple articles that feels a bit overkill, one really just needs a text file, but once we start to include images, videos, or other assets it's practical to have everything in its own directory: that makes it easy to keep track of an article's content and shuffle things around. This already gives us a bit of guidance: 1 directory + 1 content file per article. And the same can be done for other pages.

$ tree /F .\content\
C:\USERS\SAM\DEVELOPMENT\WEBSITE\CONTENT
│   index.md
│
├───articles
│   │   index.md
│   │
│   ├───initial-post
│   │       index.md
│   │
│   ├───habitcat-release
│   │       index.md
│   │       playstore_feature_graphic.png
│   │       playstore_screenshot_1_en.png
│   │       playstore_screenshot_2_en.png
│   │       playstore_screenshot_3_en.png
│   │
│   ├───arctic-code-vault-contributor
│   │       Annotation 2020-07-17 162632.jpg
│   │       index.md
│   │
│   └───2022-new-website
│           index.md
│
└───series
        index.md

That's simple enough, and it maps directly to the hierarchy I want for my URLs: content/articles/2022-new-website/index.md becomes /articles/2022-new-website, and so on.

Content files need a way to specify which template to use when rendering the HTML, what data to load, and other metadata. I'm using a Yaml front-matter. For example content/index.md has the following content:

---
title: 'sam.elborai.me'
description: 'Personal website: programming, projects.'
type: index
count_articles: 3
count_series: 3
---

## Hi, I'm Sam 👋

Welcome to my personal space.

I write about my main passion: programming!

Currently exploring [Rust](https://www.rust-lang.org/) 🦀 and digging into low-level Windows APIs.

I already mentioned I do not like to maintain too much Yaml, and that's true: Yaml quickly becomes a mess when you have to maintain multiple files. But it is a great fit for limited key-value entries.

When reading a content file I first look for the front matter delimiter --- and a potential summary marker === (used for articles), then split the file into 3 parts: front_matter (yaml), summary (markdown), and content (markdown).

fn read_content_file(path: &Path) -> Result<ContentFile> {
    let content = fs::read_to_string(path)?;

    let mut front_matter = String::default();
    let mut front_matter_markers = 0;
    let mut front_matter_lines = 0;
    let mut summary_lines = 0;

    let lines = content.lines();

    for (i, line) in lines.clone().enumerate() {
        // look for front matter delimiters
        if front_matter_markers < 2 {
            front_matter_lines += 1;
            if line.trim() == "---" {
                front_matter_markers += 1;
                continue;
            }
            if front_matter_markers > 0 {
                front_matter.push_str(line);
                front_matter.push('\n');
            }
            continue;
        }

        // look for summary marker
        if line.trim() == "===" {
            summary_lines = (i + 1) - front_matter_lines;
            break;
        }
    }

    let content_lines = lines.skip(front_matter_lines);
    let content = content_lines.clone().collect::<Vec<&str>>().join("\n");
    let content = Markdown(content);

    let mut more_content = false;
    if summary_lines == 0 {
        summary_lines = 3;
        more_content = true;
    }
    let mut summary = content_lines.take(summary_lines).collect::<Vec<&str>>();
    if more_content {
        summary.push("...");
    }
    let summary = summary.join("\n");
    let summary = Markdown(summary);

    Ok(ContentFile {
        front_matter: FrontMatter(front_matter),
        content: Content(content),
        summary: Summary(summary),
        path,
    })
}

The front matter is then parsed a first time using serde_yaml to extract a struct I call BaseMetadata: common metadata shared by all content files. We will parse it a second time later to deal with page-specific metadata, such as count_articles in the case of the homepage.

/// Common metadata that can be specified in the content front-matter
#[derive(Deserialize, Debug)]
pub struct BaseMetadata {
    #[serde(rename = "type")]
    kind: String,
    /// page title as displayed in the browser tab
    title: Option<String>,
    /// page description used for SEO
    description: Option<String>,
    /// the article or page date publicly displayed
    date: Option<String>,
    /// if in the future no output file is generated
    publication_date: Option<String>,
    /// if false no output file is generated
    published: Option<bool>,
}

Only one value is mandatory: the page type, renamed to kind due to a conflict with the Rust keyword type. The rest is optional, as indicated by the use of the Option<T> enum.

I defined 5 page types, though in practice I only use 4 of them at the moment. They are defined as a combination of an enum and structs.

pub enum Page {
    Index(IndexPage),
    Articles(ArticlesPage),
    Article(ArticlePage),
    Series(SeriesPage),
    _Serie(SeriePage), // currently not used
}

pub struct CommonPageData {
    pub meta: BaseMetadata,
    pub content: Content,
    pub front_matter: FrontMatter,
    pub summary: Summary,
    pub source_path: PathBuf,
}

pub struct IndexPage {
    pub common: CommonPageData,
    meta: IndexMetadata,
}

pub struct ArticlePage {
    /// part of the url that identifies the article, such as 'my-cool-post-1' for the url
    /// 'http://domain.tld/articles/my-cool-post-1'.
    pub slug: String,
    pub common: CommonPageData,
    pub title: String,
    pub date: NaiveDateTime,
    pub reading_duration: Duration,
}

pub struct ArticlesPage {
    pub common: CommonPageData,
}

pub struct SeriePage {
    pub common: CommonPageData,
}

pub struct SeriesPage {
    pub common: CommonPageData,
}

By pattern matching over the page type we can build the relevant enum variant.

impl Page {
    pub fn make_from(common: CommonPageData) -> Result<Self> {
        match common.meta.kind.as_str() {
            "index" => {
                let meta = serde_yaml::from_str::<IndexMetadata>(&common.front_matter.0)?;
                let page = IndexPage { common, meta };
                Ok(Page::Index(page))
            }
            "article" => {
                let date = /* ... */;
                let date = /* ... */;
                let title = /* ... */;
                let markdown = /* ... */;
                let reading_duration = /* ... */;

                // the slug is the directory name, with everything before the '.' removed:
                // - 01.my-cool-post => my-cool-post
                // - 13022021.my-nice-post => my-nice-post
                // - my-interesting-post => my-interesting-post
                let parent_dir = /* ... */;
                let slug = /* ... */;

                let page = ArticlePage {
                    common,
                    title,
                    date,
                    reading_duration,
                    slug,
                };
                Ok(Page::Article(page))
            }

            "articles" => {
                let page = ArticlesPage { common };
                Ok(Page::Articles(page))
            }

            "series" => {
                let page = SeriesPage { common };
                Ok(Page::Series(page))
            }

            unknown => Err(format!("unknown page type: {}", unknown).into()),
        }
    }
    // ...
}

Putting everything together, the function read_markdown looks like this:

fn read_markdown(path: &Path) -> ReadMarkdownResult {
    // 1. read file, split front matter and markdown content
    let content_file = match read_content_file(path) {
        Ok(ok) => ok,
        Err(e) => { /* ...*/ }
    };

    // 2. parse front matter
    let meta = match serde_yaml::from_str::<BaseMetadata>(&content_file.front_matter.0) {
        Ok(ok) => ok,
        Err(e) => {/* ... */},
    };

    match meta.validate() {
        BaseValidateResult::Skip(reason) => {/* ... */}
        BaseValidateResult::Valid => {}
        BaseValidateResult::ParseErr(e) => {/* ... */}
    };

    // 3. build page struct based on parse info
    let common = CommonPageData {
        meta,
        content: content_file.content,
        front_matter: content_file.front_matter,
        summary: content_file.summary,
        source_path: path.to_owned(),
    };
    let page = match Page::make_from(common) {
        Ok(ok) => ok,
        Err(e) => {/* ... */}
    };

    if let Err(e) = page.validate() {
        /* ... */
    }

    ReadMarkdownResult::Ok(page)
}

For images, videos, and other assets we just need another pair of enum and structs. The asset type is identified by the file extension, and there is no need for metadata.

pub enum Asset {
    Image(ImageAsset),
    Video(VideoAsset),
}

pub struct ImageAsset {
    pub source_path: PathBuf,
    pub hash: String,
}

pub struct VideoAsset {
    pub source_path: PathBuf,
    pub hash: String,
}

Markdown parsing

After looking around a bit for a markdown library I went with pulldown-cmark, due to its focus on CommonMark compliance and the clarity of its tests and examples. I love standard markdown, but it is also too limited for my needs, so I want to be able to extend the syntax with my own logic. pulldown-cmark is perfect for this and surprisingly simple to deal with: the parser is defined as an iterator of Event (an enum), on which we can easily pattern match and do some processing.

I currently have 3 extensions.

ignore_summary_marker replaces the summary marker === with a line break, due to its potential conflict with what CommonMark calls a setext heading underline (I personally never use setext headings when writing markdown).

fn ignore_summary_marker(&self, event: &Event) -> Option<Event> {
    if let Event::Text(ref text) = event {
        if text.trim() == "===" {
            return Some(Event::SoftBreak);
        }
    }
    None
}

replace_bear_says extends the markdown syntax with markers @bear: ... and @sam: ..., similar to the "cool bear says" Amos popularized. I do not use them yet, but I want to in the future, and it was a good proof of concept to understand how to work with Event.

fn replace_bear_says(&self, event: &Event) -> Option<Event> {
    if let Event::Text(ref text) = event {
        let text = text.trim();
        if text.starts_with("@bear: ") {
            let text = text.strip_prefix("@bear: ").unwrap_or_default();
            return Some(Event::Html(
                format!(r#"<div class="bear-say">{}</div>"#, text).into(),
            ));
        } else if text.starts_with("@sam: ") {
            let text = text.strip_prefix("@sam: ").unwrap_or_default();
            return Some(Event::Html(
                format!(r#"<div class="sam-say">{}</div>"#, text).into(),
            ));
        }
    }
    None
}

And the last one is also the most complicated. hash_assets_url replaces asset links such as ![an image](path/to/my/image.png) with a simplified path that also contains the asset hash: ![an image](/assets/<hash>_image.png). Browsers often cache images aggressively, and I want to be sure the correct asset is displayed; adding a hash to the file name is a simple trick to bust the cache. This function is not really elegant or robust, but it works well enough for now.

fn hash_assets_url(&self, event: &Event, entry: &Entry) -> Option<Event> {
    match event {
        Event::Start(tag) => {
            match tag {
                Tag::Link(ref link_type, ref url, ref title) | Tag::Image(ref link_type, ref url, ref title) => {
                    match link_type {
                        LinkType::Inline | LinkType::Shortcut | LinkType::Autolink => {}
                        _ => return None,
                    };
                    if url.contains("://") { return None; }

                    let asset_entry = /* ... */;
                    let asset = match self.assets.get(asset_entry) {
                        Some(asset) => asset,
                        None => return None,
                    };

                    let (hash, filename) = match asset {
                        Asset::Image(asset) => (&asset.hash, asset.source_path.file_name().unwrap()),
                        Asset::Video(asset) => (&asset.hash, asset.source_path.file_name().unwrap()),
                    };

                    let filename = format!("{}_{}", hash, filename.to_string_lossy());

                    let replaced_url = /* ... */;
                    let title = title.to_owned();

                    match tag {
                        Tag::Image(..) => {
                            Some(Event::Html(
                                format!(r#"<img src="{}" title="{}">"#, replaced_url, title)
                                    .into(),
                            ))
                        }
                        Tag::Link(..) => {
                            Some(Event::Html(
                                format!(
                                    r#"<a href="{}" target="_blank">{}</a>"#,
                                    replaced_url, title
                                )
                                .into(),
                            ))
                        }
                        _ => None,
                    }
                }
                _ => None,
            }
        }
        _ => None,
    }
}

One cool detail is that each handler returns an Option<Event>, with the value Some(Event) when the event has been handled and None otherwise. That way we can easily chain them using the method or_else. If no handler processed the event, we use unwrap_or to default to the event itself. The result is cute and readable!

pub fn render_to_html(&self, entry: &Entry, text: &Markdown) -> HTML {
    let options = Options::ENABLE_STRIKETHROUGH | Options::ENABLE_TABLES;

    let parser = pulldown_cmark::Parser::new_ext(&text.0, options).map(|event| {
        self.ignore_summary_marker(&event)
            .or_else(|| self.replace_bear_says(&event))
            .or_else(|| self.hash_assets_url(&event, entry))
            .unwrap_or(event)
    });
    let mut output = String::with_capacity(text.0.len() * 2);
    pulldown_cmark::html::push_html(&mut output, parser);
    HTML(output)
}

Find and read content

My general approach is to first list and read all files from the content directory ./content, and for each entry:

  1. if markdown: read + parse content, build a page struct, insert result into a hashmap pages: HashMap<Entry, Page>
  2. if image or video: hash file content, build asset struct, insert result into a hashmap assets: HashMap<Entry, Asset>
  3. otherwise: do nothing and just skip the file

To list files I'm using the crate walkdir, which provides a simple API to recursively traverse a given path. We only care about actual files, so we skip symlinks and directories using the file_type(self) -> fs::FileType method implemented by walkdir::DirEntry. Then we can pattern match over the file extension.

pub fn read_content(&mut self) -> Result<()> {
    for entry in WalkDir::new(&self.content_dir) {
        let entry = entry?;
        let path = entry.path();

        if entry.file_type().is_dir() || entry.file_type().is_symlink() {
            continue;
        }

        let entry = Entry(path.strip_prefix(&self.content_dir)?.to_path_buf());
        match path.extension().and_then(OsStr::to_str) {
            Some("md") => {
                debug!("content file\t{}", path.display());
                let page = match read_markdown(path) {/*...*/};
                if self.pages.contains_key(&entry) {
                    return Err(format!("duplicate content file {}", entry.display()).into());
                }
                self.pages.insert(entry, page);
            }
            Some("png") | Some("jpg") | Some("jpeg") | Some("gif") => {
                debug!("image asset\t{}", path.display());
                let buffer = fs::read(path)?;
                let hash = format!("{:016x}", seahash::hash(&buffer));
                let asset = Asset::Image(ImageAsset {
                    source_path: path.to_owned(),
                    hash,
                });
                if self.assets.contains_key(&entry) {
                    return Err(format!("duplicate asset {}", entry.display()).into());
                }
                self.assets.insert(entry, asset);
            }
            Some("mp4") => {
                debug!("video asset\t{}", path.display());
                let buffer = fs::read(path)?;
                let hash = format!("{:016x}", seahash::hash(&buffer));
                let asset = Asset::Video(VideoAsset {
                    source_path: path.to_owned(),
                    hash,
                });
                if self.assets.contains_key(&entry) {
                    return Err(format!("duplicate asset {}", entry.display()).into());
                }
                self.assets.insert(entry, asset);
            }
            None => {/* skip... */}
            Some(other) => {/* skip... */}
        };
    }
    //...
}

Then, in a second step, we iterate over all discovered pages and render both the whole content and the summary to HTML. This step is done only after all files and assets have been discovered, because the markdown handlers may need references to assets or other pages. That's for example the case for the handler hash_assets_url we described earlier.

pub fn read_content(&mut self) -> Result<()> {
    //...
    let parser = MarkdownParser::new(&self.assets);
    self.pages.iter().try_for_each(|(entry, page)| {
        let (content, summary) = match page {
            Page::Index(page) => (&page.common.content, &page.common.summary),
            Page::Articles(page) => (&page.common.content, &page.common.summary),
            Page::Article(page) => (&page.common.content, &page.common.summary),
            Page::Series(page) => (&page.common.content, &page.common.summary),
            Page::_Serie(page) => (&page.common.content, &page.common.summary),
        };

        let content = parser.render_to_html(entry, &content.0);
        let summary = parser.render_to_html(entry, &summary.0);

        if self.rendered_contents.contains_key(entry) {
            return Err(format!("duplicate rendered content {}", entry.display()));
        }
        self.rendered_contents.insert(entry.to_owned(), content);

        if self.rendered_summaries.contains_key(entry) {
            return Err(format!("duplicate rendered summary {}", entry.display()));
        }
        self.rendered_summaries.insert(entry.to_owned(), summary);

        Ok(())
    })?;

    Ok(())
}

At this point we have:

  • a hashmap of all pages
  • a hashmap of all assets
  • a hashmap of all markdown content rendered as HTML
  • a hashmap of all markdown summaries rendered as HTML
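
Put together, this state corresponds to a generator struct along these lines (the field names come from the snippets in this article; this is a sketch, not the exact definition):

use std::collections::HashMap;
use std::path::PathBuf;

pub struct Website {
    content_dir: PathBuf,                     // ./content
    out_dir: PathBuf,                         // where the HTML files are written
    pages: HashMap<Entry, Page>,              // parsed markdown pages
    assets: HashMap<Entry, Asset>,            // hashed images and videos
    rendered_contents: HashMap<Entry, HTML>,  // full page bodies as HTML
    rendered_summaries: HashMap<Entry, HTML>, // summaries as HTML
}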

Templating

I use the liquid templating engine for rendering the website's HTML. Liquid is a safe, easy-to-use, and expressive template language commonly used in web applications. I like that it only provides a limited number of templating constructs, which makes it fairly simple to learn and maintain.

A simple template file such as templates/base.liquid looks something like this:

<!DOCTYPE html>
<html lang="en">

<head>
    <meta charset="utf-8" />
    <title>{{ title }}</title>
    <!-- ... -->
    <link href="/{{ styles }}" rel="stylesheet" />
</head>

<body class="...">
    {% include "navbar" %}
    <div class="...">
        {{ content }}
    </div>
    {% include "footer" %}
</body>

</html>

For each page we want to render we need to provide context data to be used for templating.

Creating page contexts

The context is a data structure that holds variables and values that will be used in the template rendering process. Here's an example of a basic context structure:

struct BaseContext {
    title: String,
    description: String,
    content: String,
    styles: String,
}

impl BaseContext {
    fn to_liquid_object(&self) -> liquid::Object {
        // clone the fields: liquid::object! needs owned values
        liquid::object!({
            "title": self.title.clone(),
            "description": self.description.clone(),
            "content": self.content.clone(),
            "styles": self.styles.clone(),
        })
    }
}

This BaseContext structure is used for the base template, base.liquid, which is responsible for the overall layout of the website. The context contains common attributes, such as the title, styles, content, and other metadata. The to_liquid_object method is responsible for converting the context into a Liquid-compatible data structure (liquid::Object).

Similarly, other contexts can be designed for specific page templates and are populated with relevant data before being passed to the corresponding templates.

Accessing Context Data in Templates

The context data can be accessed in the Liquid templates using double curly braces {{ }}. For example, in the base.liquid template, we use {{ title }}, {{ description }}, and {{ content }} to access the title, description, and content from the context.

In the article.liquid template, we use {{ title }} and {{ markdown }} to access the title and the rendered markdown content from the context.
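
A hypothetical article.liquid using those two variables could look like this (a sketch, not the actual template):

<article>
    <h1>{{ title }}</h1>
    {{ markdown }}
</article>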

Rendering Pages with Context

Once the context is set up, it is passed to the Liquid template engine to render the final HTML output. Here's a simplified example of a render function:

fn render(template: &str, context: &liquid::Object) -> Result<String, liquid::Error> {
    let parser = liquid::ParserBuilder::with_liquid().build()?;
    let template = parser.parse(template)?;
    let output = template.render(context)?;
    Ok(output)
}

This render function combines the context with the template and generates the final HTML output that is written to the output directory.

For each page type, you would create the appropriate context, call the render function with the corresponding template, and then write the output to the desired location.
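
Putting the pieces together, usage could look like this (base_template, rendered_html, and out_dir are hypothetical values loaded elsewhere):

// build the context for a page and render it through the base template
let context = BaseContext {
    title: "sam.elborai.me".to_string(),
    description: "Personal website: programming, projects.".to_string(),
    content: rendered_html, // HTML produced by the markdown rendering step
    styles: "assets/styles.css".to_string(), // hypothetical stylesheet path
};
let html = render(&base_template, &context.to_liquid_object())?;
std::fs::write(out_dir.join("index.html"), html)?;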

Generate pages

The next step in the process is to generate the HTML pages for the website. The write_pages method in the Website struct is responsible for this task. It iterates through the pages and renders them using their respective templates. The rendered content is then written to the output directory.

pub fn write_pages(&self) -> Result<()> {
    self.pages.iter().try_for_each(|(entry, page)| {
        let html = self.render_page(entry)?;
        let (slug, section) = match page {
            Page::Article(page) => (page.slug.clone(), "articles"),
            Page::_Serie(page) => (page.slug.clone(), "series"),
            // index and listing pages elided for brevity
            _ => return Ok(()),
        };
        let out_path = self.out_dir.join(section).join(format!("{}.html", slug));
        if let Some(prefix) = out_path.parent() {
            std::fs::create_dir_all(prefix)?;
        }
        fs::write(out_path, html.as_bytes())?;
        Ok(())
    })
}

First, the method iterates over the self.pages HashMap, which contains the content and metadata for each page. For each page, it calls the render_page function, passing in the entry.

Next, the method determines the appropriate output path based on the page's type (Article or Serie) and slug, creating the necessary directory structure for the output files if needed.

Finally, the write_pages method writes the rendered HTML to the corresponding output file in the out_dir. The file's name is derived from the page's slug, and the extension is set to .html.

By the end of this process, the write_pages method has created HTML files for all the pages in the website, ready to be served or deployed.

Local server

After generating the pages and assets, it is useful to have a local server to preview the site during development. As a quick'n'dirty trick I rely on a simple Python HTTP server: Python is available everywhere and is reliable enough for quick iterations.

The serve method starts a Python-based HTTP server, serving the website's output directory. Here's the relevant code snippet:

pub fn serve(&self) -> Result<()> {
    info!("🧿 start server");
    let output = Command::new("python")
        .args(["-m", "http.server"])
        .args(["--directory", &self.out_dir.to_string_lossy()])
        .output()?;
    if !output.status.success() {
        return Err(format!(
            "python http.server command failed. stderr: {:?}",
            String::from_utf8(output.stderr)?,
        )
        .into());
    }
    Ok(())
}

Conclusion

Pffiu, that was a lot. I could also mention that I am using TailwindCSS for styling, and a few other cool little features and details, but I already wrote way too much for this article!

Anyway, see you next time :)