Splitting a text file into multiple separate files by section with csplit

Here’s a shell command that splits a text file into separate files, one per section, using a regex pattern that matches the section titles in the source file:

csplit \
  -z \
  -f section_ \
  -b "%02d.txt" \
  document.txt \
  '/^Section [0-9]\+$/' \
  '{*}'

The key part is the /^Section [0-9]\+$/ regex pattern, which matches section titles such as “Section 18” alone on a line. csplit tests the pattern against one line at a time, so the title is anchored with ^ and $ rather than with \n, which never matches a line break here.

Here’s an explanation of the rest of the command:

  • -z → Suppress empty output files.
  • -f section_ → Prefix for output files.
  • -b "%02d.txt" → Suffix format for output files: a two-digit number followed by .txt.
  • '/^Section [0-9]\+$/' → Regex pattern to split on, described above.
  • '{*}' → Repeat the previous pattern as many times as possible, so the file is split at every occurrence.

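This produces a set of files like section_00.txt, section_01.txt, etc. Each split happens just before a matching title line, so any text ahead of the first title lands in the first file.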
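
To try the command on something small, here is a minimal sketch that builds a sample file and splits it. The file name sample.txt, its contents, and the temporary directory are made up for illustration, and it assumes GNU csplit (for -z, -b and \+). The pattern is written with ^ and $ because csplit checks it against one line at a time.

```shell
# Work in a throwaway directory so the output files are easy to inspect.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample input: a preamble followed by two titled sections.
printf '%s\n' \
  'Preamble' \
  'Section 1' \
  'First body' \
  'Section 2' \
  'Second body' > sample.txt

# Same options as the command above; -s only silences the byte counts
# that csplit normally prints for each output file.
csplit -s -z -f section_ -b "%02d.txt" sample.txt '/^Section [0-9]\+$/' '{*}'

# List the pieces and show the first line of the first titled section.
ls section_*.txt
head -n 1 section_01.txt
```

With this input, the preamble should end up in section_00.txt and each later file should begin with its own title line.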

