Splitting a text file into multiple separate files by section with csplit
Here’s a bash command that splits a text file into separate files, one per section, where each section begins with a title line matched by a regex pattern:
csplit \
-z \
-f section_ \
-b "%02d.txt" \
document.txt \
'/^Section [0-9]\+$/' \
'{*}'
The key part is the /^Section [0-9]\+$/ regex pattern, which matches section titles such as “Section 18” alone on a line. Because csplit tests the pattern against each line separately, the ^ and $ anchors are what restrict the match to a whole line.
Here’s an explanation of the rest of the command:
-z → Suppress empty output files.
-f section_ → Prefix for output files.
-b "%02d.txt" → Suffix format for output files: a two-digit number followed by .txt.
'/^Section [0-9]\+$/' → Regex pattern to split on, described above.
'{*}' → Split at every occurrence of the pattern until the end of the file.
This produces a set of files like section_00.txt, section_01.txt, etc.
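To try the command on a small input, here is a minimal sketch that builds a made-up document.txt in a scratch directory and splits it. It assumes GNU csplit from coreutils, since options such as -z and -b are not part of POSIX csplit. The sample text is invented for illustration, and the title pattern is written with ^ and $ anchors because csplit tests the pattern against one line at a time.

```shell
# Work in a scratch directory so no existing files are overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: a preamble followed by two titled sections.
printf '%s\n' \
  'Preamble text' \
  'Section 1' \
  'First body line' \
  'Section 2' \
  'Second body line' > document.txt

# -s silences the byte counts csplit normally prints for each output file.
csplit -s -z -f section_ -b "%02d.txt" document.txt '/^Section [0-9]\+$/' '{*}'

# List the pieces that were written.
ls section_*.txt
```

With this input, section_00.txt should hold the preamble, and each later file should start with its own title line, because csplit cuts just before every matching line.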