Deserializing YAML - deny unknown fields

YAML serde deny_unknown_fields

Default values and optional fields are very useful features when deserializing YAML, but they have a downside: if there is a typo in a field name in the YAML file, the default is silently used and we might never notice. This can be a source of a lot of frustration. Luckily there is a solution: we can tell serde to deny_unknown_fields. That way, if the YAML file contains a field name the struct does not know, for example because of a typo, the parser returns an error.

This is basically what we need to do:

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Person {
    name: String,

    #[serde(default = "get_default_married")]
    married: bool,
}

fn get_default_married() -> bool {
    false
}

In this struct we expect two fields: name is required, but if there is no married field then we set it to false.

This works well when the YAML file has all the fields:

examples/yaml-deny-unknown-fields/all.yaml

name: Foo Bar
married: true

Output:

name: Foo Bar
married: true

or when the married field is missing:

examples/yaml-deny-unknown-fields/missing.yaml

name: Foo Bar

Output:

name: Foo Bar
married: false

However, if there is a typo and we have maried instead of married:

examples/yaml-deny-unknown-fields/typo.yaml

name: Foo Bar
maried: true

Then, without the deny_unknown_fields attribute, the misspelled field is silently ignored and the default value is used:

name: Foo Bar
married: false

Adding the deny_unknown_fields attribute would yield the following error:

Could not parse YAML file: unknown field `maried`, expected `name` or `married` at line 2 column 1

Full example

examples/yaml-deny-unknown-fields/src/main.rs

use serde::Deserialize;
use std::fs;

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct Person {
    name: String,

    #[serde(default = "get_default_married")]
    married: bool,
}

fn get_default_married() -> bool {
    false
}

fn main() {
    let filename = get_filename();
    let text = fs::read_to_string(filename).unwrap();

    let data: Person = serde_yaml::from_str(&text).unwrap_or_else(|err| {
        eprintln!("Could not parse YAML file: {err}");
        std::process::exit(1);
    });

    println!("name: {}", data.name);
    println!("married: {}", data.married);
}

fn get_filename() -> String {
    let args: Vec<String> = std::env::args().collect();
    if args.len() != 2 {
        eprintln!("Usage: {} FILENAME", args[0]);
        std::process::exit(1);
    }
    args[1].to_string()
}

Dependencies in Cargo.toml

examples/yaml-deny-unknown-fields/Cargo.toml

[package]
name = "yaml-deny-unknown-fields"
version = "0.1.0"
edition = "2021"

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
serde = { version = "1.0", features = ["derive"] }
serde_yaml = "0.9"


A potential problem

What if we get the files from some external source and the provider decides to add a new field? Our code will stop functioning. On the one hand, it is good that we immediately notice the extra field. On the other hand, we would not want our service to stop working at 2am just because the data supplier decided to roll out their changes at that time.

I am not sure what the right solution is. How do we balance the two needs: catching typos instead of silently falling back to default values, and allowing the seamless addition of new fields?

Related Pages

YAML and Rust

Author

Gabor Szabo (szabgab)

Gabor Szabo, the author of the Rust Maven web site, maintains several open source projects in Rust. While he still feels he has tons of new things to learn about Rust, he already offers training courses in Rust and still teaches Python, Perl, git, GitHub, GitLab, CI, and testing.
