Recently I’ve been using YAML files along with some Ruby scripts as a simple and convenient mechanism for importing content into our database. The YAML files will be maintained by content authors, and one of the questions that came up is that of validation.

Luckily, there’s a very nice solution in form of Kwalify, a small schema validator for YAML. Conceptionally, it is similar to the corresponding XML technologies such as XML Schema or RelaxNG. Of course, in line with YAML itself, Kwalify is much simpler and still perfectly adequate for most purposes. Kwalify schemas are expressed in YAML. Supported datatypes are maps and sequences, ranges, enumerations, as well as primitive types such as strings, ints, and bools. Fields can be marked as required or unique. In addition, regular expressions can be specified to perform pattern matching on values.

Kwalify can be run standalone by passing the name of the schema file and the file to validate in the command line. It can also be called programmatically. I chose to go with the second approach in order to first catch and gracefully deal with parse errors that result from non-well-formed YAML files, which otherwise result in an ugly stack trace on the command line. Using the wrapper script, an invalid YAML file results in a friendly message containing the line number for the parse error, and a YAML file that does not conform to the schema results in one or more friendly messages about the specific schema violations.

I still think that most situations that suggest the use of YAML won’t require any form of schema validation, but if you do need this feature, Kwalify fits the bill.