I’m currently dealing with a programming problem where I need access to several 64MB, file-backed data structures concurrently on an Embedded Linux system that only has 64MB of RAM. The data structures are fairly sparse (mostly zero data), and I typically only need to access small portions of them at any particular time. There is always the brute-force approach where you write code to manually load sections of the file as you need them. But with a little thought, the realisation hits “this is what operating systems do.” This article explores using memory mapping, and the sparse file support in file systems to solve this problem in a very efficient manner.
Sparse File Support
Most Unix file systems support sparse files. This means that sections of data that is zeros is not stored on the disk. Consider the following example where we create a 200MB file that is all zeros:
root@cm-x270:/media/mmcblk0p1# df Filesystem 1k-blocks Used Available Use% Mounted on /dev/mmcblk0p1 3917212 202448 3515776 5% /media/mmcblk0p1 root@cm-x270:/media/mmcblk0p1# dd if=/dev/zero of=sparse-file bs=1 \ count=1 seek=200M root@cm-x270:/media/mmcblk0p1# df Filesystem 1k-blocks Used Available Use% Mounted on /dev/mmcblk0p1 3917212 202456 3515768 5% /media/mmcblk0p1 root@cm-x270:/media/mmcblk0p1# ls -l -rw-r--r-- 1 root root 209715201 May 22 08:54 sparse-file root@cm-x270:/media/mmcblk0p1# time od sparse-file 0000000 000000 000000 000000 000000 000000 000000 000000 000000 * 1440000000 real 0m 17.23s user 0m 14.21s sys 0m 2.66s root@cm-x270:/media/mmcblk0p1# du sparse-file 68 sparse-file
It this case we created a 200MiB file on a SD card formatted as ext3. Even though the file is 200M in size, it only uses a few KiB of disk space. This same test also worked fine with a JFFS2 filesystem. With sparse file support, we get a cheap form of run length compression (at least for zeros) with no effort.
mmap()
The mmap() system call is used to map a file, or portions of a file into memory. The data in the file can then be accessed directly in memory. Because Linux is a demand paged system, portions of the file are paged in as needed, so the entire file does not need to be present in RAM at one time. There are several advantages to using mmap():
- mmap() avoids the extraneous copy that occurs when using read() or write() as the data does not need to be copied to a user space buffer.
- there is very little overhead
- you can directly access any part of the file without doing a lseek() and keeping track of where you are.
- the operating system takes care of paging in sections of the file you are using, and discarding sections that are not in use when memory is low. This includes flushing dirty portions to disk.
What this means, is mmap gives you an easy way to work on large, file backed data structures, and the OS takes care of loading the portions you need, and saving the modified portions back to disk. As most embedded systems are 32-bit, there is a limit to the file size you can use as virtual memory space is limited.
Test Application
Next I wrote a small application that is used to create, and modify files using mmap(). I wanted to experiment with creating files of various sizes, writing non-zero data at various intervals, and test the performance of this. The test application source is located here.
Usage: sparse_file_mmap_test [OPTION] -s, --filesize File size to allocate -o, --offset Will write a few bytes every offset bytes -m, --modify Modify existing file -d, --data Data to write to file at offset location (0-255)
The application creates a file of size filesize, and writes the value of data to the file every offset bytes. There is also an option to modify existing files, so we can test the performance of opening a large file, making a small change, and then closing it.
Test Results
The results are pretty amazing, and the performance is beyond my expectations. This type of problem is where you learn to appreciate the performance of an advanced operating system, and file system. There is a reason for the complexity! There are 3 things I wanted to test: 1) does the sparse file support work as expected? 2) can mmap be used to easily modify large files? 3) can mmap be used to work on data structures that are larger than physical RAM?
1. Sparse File Support
The basic tests above confirm that sparse file support works for an empty file, but what about a file that has some data every X bytes? Below are the test results for creating a 10MB file, and writing data at various offset intervals.
Offset (bytes) | File Size (KB) |
1024 | 9784 |
2048 | 9784 |
4096 | 9784 |
8192 | 4904 |
16384 | 2464 |
32768 | 1244 |
65536 | 632 |
1048576 | 60 |
2097152 | 40 |
4194304 | 32 |
As soon as the offset was greater than the MMU page size (4KB), then the sparse file effect started to kick in and worked as expected.
2. Can mmap() be used to efficiently modify large files?
The test here was to open a large file, make a small change in the middle, and then close it. In this case a 100MB file was created with a data value of 1 written every 1MB, and then re-opened the same file and wrote a data value of 2 every 2MB. The od utility was used to examine the file to verify the contents were correct.
root@cm-x270:/media/mmcblk0p1# time sparse_file_mmap_test_arm -s104857600 \ -o1048576 -d1 Sparse File mmap() test Filesize = 104857600, offset = 1048576, data = 1 size = 102400 KB size on disk (KB): 508 sparse-file real 0m 1.10s user 0m 0.00s sys 0m 0.43s root@cm-x270:/media/mmcblk0p1# time sparse_file_mmap_test_arm -s104857600 \ -o2097152 -d2 -m Sparse File mmap() test Filesize = 104857600, offset = 2097152, data = 2 size = 102400 KB size on disk (KB): 508 sparse-file real 0m 0.37s user 0m 0.01s sys 0m 0.05s root@cm-x270:/media/mmcblk0p1# time od -x sparse-file 0000000 0002 0000 0000 0000 0000 0000 0000 0000 0000020 0000 0000 0000 0000 0000 0000 0000 0000 * 4000000 0001 0000 0000 0000 0000 0000 0000 0000 4000020 0000 0000 0000 0000 0000 0000 0000 0000 .... * 614000000 0001 0000 0000 0000 0000 0000 0000 0000 614000020 0000 0000 0000 0000 0000 0000 0000 0000 * 620000000 real 0m 12.93s user 0m 7.33s sys 0m 1.70s
Creating and modifying large file-backed data structures is very fast and efficient, and takes on the order of 1 second for a 100MB file that contains data every 1MB. Dumping the data with od took considerably longer (13 seconds) as 100MB of data needed to be processed.
Possible Issues
There are several things to watch out for when using sparse files and mmap()
- With sparse files, there is the potential to run out of disk space as the files are using less space on disk than the file size. It is a good idea to monitor disk space when working with sparse files, so you don’t use up all of the disk space.
- mmap() requires virtual memory space for the size file it maps. With embedded systems, this is less of an issue, because the physical RAM size tends to be much less than the 4GB virtual address space. With a system that only has 64MB of RAM, mmap()’ing files in the 10’s of MB in size makes a lot of sense because it insures that you will not run the system out of memory, and yet there should be plenty of virtual address space to map in files of this size.
Conclusion
mmap() and sparse file support provide a very convenient solution for dealing with large, file-backed data structures. One example of this type of data structure is any type of large matrix such as maps used in mapping applications. Writing a “from scratch” solution to solve this problem would be a fairly large and difficult task. Processing large amounts of data efficiently is becoming more and more important in many embedded systems. This example provides another compelling reason why implementing modern, data-centric embedded systems using Linux makes a lot of sense.