We already saw how to embed a simple string in our Rust application and also how to use preprocessing to embed a list of values.
Now we need to embed a simple CSV file that looks like this:
examples/embedded-simple-csv-file/data/languages.csv
rs,rust
sh,bash
toml,toml
lock,toml
It is actually part of the code base that runs the Rust Maven web site, where it maps file extensions to format types.
We need to read in this file and store it as a HashMap so that we can easily get the format type for a file extension.
In the case of the list of values I wrote that keeping the original text file in memory would be a waste
of memory, and so I opted to preprocess it. Then I reconsidered: these data files are really small relative to the
size of the compiled code. For example, in our sample crate the result of cargo build --release
is a file of 4,681,888 bytes, while the data file is only 36 bytes.
Embedding the file
In this case we take a different approach and embed the file as it is. For this we use the include_str! macro.
examples/embedded-simple-csv-file/src/main.rs
use std::collections::HashMap;

fn main() {
    let ext_to_languages = get_languages();
    println!("{:?}", ext_to_languages);
    println!("{:?}", ext_to_languages["rs"]);
    assert_eq!(ext_to_languages["rs"], "rust");
}

fn get_languages() -> HashMap<String, String> {
    let text = include_str!("../data/languages.csv");
    let mut data = HashMap::new();
    for line in text.split('\n') {
        if line.is_empty() {
            continue;
        }
        let parts = line.split(',');
        let parts: Vec<&str> = parts.collect();
        // let parts = parts.collect::<Vec<&str>>();
        data.insert(parts[0].to_string(), parts[1].to_string());
    }
    data
}
We can now run cargo build --release and move the resulting executable anywhere. The CSV file is already baked
into the binary, so we won't need to distribute it separately.
Compiled size change
I expected the compiled size to change by roughly the size of the file we embed, but I ran a little experiment and the change was far larger. I commented out the code that uses the "rs" extension and compiled the code. The resulting file size was 4,677,496 bytes. Then I emptied the CSV file and compiled the code again. This time I got a file of 4,670,952 bytes. So the difference is 6,544 bytes. That is still only about 0.14% of the total file size, but far more than the 36 bytes I expected. I'll have to investigate this.
This is especially strange as there was no size difference in the embedding simple string case.
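The arithmetic behind these numbers can be double-checked with a small sketch. The constant below is a stand-in for the embedded file, assuming each of the four lines ends with a newline, and size_overhead is a hypothetical helper, not part of the example crate.

```rust
// Stand-in for the embedded data file: the four lines shown earlier,
// assuming each one ends with '\n'.
const LANGUAGES_CSV: &str = "rs,rust\nsh,bash\ntoml,toml\nlock,toml\n";

// Hypothetical helper: returns the size difference in bytes and that
// difference as a percentage of the larger build.
fn size_overhead(with_data: u64, without_data: u64) -> (u64, f64) {
    let diff = with_data - without_data;
    let percent = diff as f64 / with_data as f64 * 100.0;
    (diff, percent)
}

fn main() {
    // The two release build sizes measured in the experiment above.
    let (diff, percent) = size_overhead(4_677_496, 4_670_952);
    println!("data file: {} bytes", LANGUAGES_CSV.len());
    println!("binary grew by {} bytes ({:.2}%)", diff, percent);
}
```

So the binary grew by roughly 180 times the size of the data it carries, which is why the difference is worth investigating.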
Improved version
After publishing this I got some suggestions. Based on those I created an improved version with more functional programming elements, which is probably much better than this solution. Check out Embedding simple CSV file and processing in a functional way.
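To give a rough idea of what a more functional style could look like, here is a minimal sketch. It is not necessarily the code from that article. The constant LANGUAGES_CSV is a stand-in for include_str!("../data/languages.csv") so the sketch compiles on its own, and parse_languages is a hypothetical name.

```rust
use std::collections::HashMap;

// Stand-in for include_str!("../data/languages.csv") so that this
// sketch does not depend on the data file being present.
const LANGUAGES_CSV: &str = "rs,rust\nsh,bash\ntoml,toml\nlock,toml\n";

// Hypothetical helper: turn "extension,language" lines into a map.
// Lines without a comma (including empty lines) are silently skipped.
fn parse_languages(text: &str) -> HashMap<String, String> {
    text.lines()
        .filter_map(|line| line.split_once(','))
        .map(|(ext, lang)| (ext.to_string(), lang.to_string()))
        .collect()
}

fn main() {
    let ext_to_languages = parse_languages(LANGUAGES_CSV);
    println!("{:?}", ext_to_languages);
    assert_eq!(ext_to_languages["rs"], "rust");
}
```

Using lines() instead of split('\n') also copes with Windows line endings, and split_once avoids indexing into a Vec that might have fewer than two elements. Whether skipping malformed lines is the right choice depends on the application; reporting them might be preferable.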