I am trying to read from a file, perform an operation on its contents, and write the result to another file as fast as possible (for a competition). To do this, I mmap both the input and the output file and read from and write to the mappings directly. However, I noticed that mmapping an existing output file is significantly slower than mmapping a file that does not exist yet (2x to 3x slower total runtime). Specifically, munmap is much slower for the output file. So now I check whether the output file already exists and delete it if it does, which gives significantly faster run times. However, the deletion itself takes about 10 ms, which is quite significant given that my algorithm takes 100 ms for a 100 MB input file.
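    /* Delete a stale output file so the later mmap starts from a fresh file. */
    if (access(argv[2], F_OK) == 0) {
        if (remove(argv[2]) != 0) {
            fprintf(stderr, "%s\n", "Error removing output file");
            exit(EXIT_FAILURE);  /* nonzero status: this is an error path */
        }
    }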
The access call is fast; it is the remove call that takes the time.
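For anyone who wants to reproduce the measurement, here is a minimal sketch of how the two calls could be timed separately. The helper names `elapsed_ms` and `timed_remove` are made up for illustration and are not from my actual program; it assumes a POSIX system with `clock_gettime`.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <stdio.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: milliseconds between two CLOCK_MONOTONIC samples. */
static double elapsed_ms(struct timespec start, struct timespec end) {
    return (end.tv_sec - start.tv_sec) * 1e3 +
           (end.tv_nsec - start.tv_nsec) / 1e6;
}

/* Hypothetical helper: removes `path` if it exists and reports how long
 * access() and remove() each took. Returns 0 on success (including the
 * case where the file did not exist) and -1 if remove() failed. */
int timed_remove(const char *path, double *access_ms, double *remove_ms) {
    struct timespec t0, t1, t2;
    *remove_ms = 0.0;  /* stays 0 when there is nothing to remove */

    clock_gettime(CLOCK_MONOTONIC, &t0);
    int exists = (access(path, F_OK) == 0);
    clock_gettime(CLOCK_MONOTONIC, &t1);
    *access_ms = elapsed_ms(t0, t1);

    if (!exists)
        return 0;

    if (remove(path) != 0)
        return -1;
    clock_gettime(CLOCK_MONOTONIC, &t2);
    *remove_ms = elapsed_ms(t1, t2);
    return 0;
}
```

Calling `timed_remove(argv[2], &a, &r)` and printing `a` and `r` to stderr shows where the time goes. The numbers will depend on the filesystem and on how large the existing output file is.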
Things that didn't work:
- Opening the output file with O_TRUNC before mmapping (slow mmap)
- Writing to a temporary empty file, then renaming it to the output file (as slow as remove)
- Calling unlink instead of remove (as slow as remove)
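To make the first item in the list concrete, this is roughly what the O_TRUNC variant looks like. It is a simplified sketch, not my exact code: the function name `map_output_truncated` is made up for illustration, and the output size is assumed to be known up front.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical helper: opens `path` with O_TRUNC, sizes it to `size` bytes,
 * and maps it for writing. Returns the mapping, or NULL on failure. */
void *map_output_truncated(const char *path, size_t size) {
    /* O_RDWR is needed: a writable MAP_SHARED mapping requires a descriptor
     * opened for both reading and writing. */
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return NULL;

    /* After O_TRUNC the file is empty; grow it before mapping, otherwise
     * touching the mapping raises SIGBUS. */
    if (ftruncate(fd, (off_t)size) != 0) {
        close(fd);
        return NULL;
    }

    void *map = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd);  /* the mapping stays valid after the descriptor is closed */
    return map == MAP_FAILED ? NULL : map;
}
```

The caller writes into the returned buffer and then calls `munmap(map, size)`. This produces the right output, but it does not avoid the slowdown when the output file already exists.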
Is there a faster way to achieve the same effect as deleting?
A copy_file_range() call will beat memory-mapping anyway.
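For reference, a copy_file_range() call could look like the sketch below. It assumes Linux with glibc 2.27 or newer, and it only fits the case where the output is a byte-for-byte copy of the input, since the kernel moves the data without the program seeing it. The function name `copy_whole_file` is made up for illustration.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE  /* exposes copy_file_range() in <unistd.h> */
#endif
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: copies src_path to dst_path inside the kernel.
 * Returns 0 on success and -1 on failure. */
int copy_whole_file(const char *src_path, const char *dst_path) {
    int in = open(src_path, O_RDONLY);
    if (in < 0)
        return -1;

    struct stat st;
    if (fstat(in, &st) != 0) {
        close(in);
        return -1;
    }

    int out = open(dst_path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }

    off_t remaining = st.st_size;
    while (remaining > 0) {
        /* NULL offsets: use and advance each descriptor's own file position.
         * The call may copy fewer bytes than requested, hence the loop. */
        ssize_t n = copy_file_range(in, NULL, out, NULL, (size_t)remaining, 0);
        if (n <= 0)
            break;  /* error, or unexpected end of input */
        remaining -= n;
    }

    close(in);
    int close_ok = (close(out) == 0);
    return (close_ok && remaining == 0) ? 0 : -1;
}
```

Whether this helps depends on the task: if the contents have to be transformed on the way through, the data still has to pass through user space, so this call alone does not replace the mmap approach.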