How do you split a txt file into chapters?



  • There is a plain, unformatted txt document.
    It contains a number of chapters (about 130), each starting with a heading:
    Глава 1, Глава 2, Глава 3, etc. ("Глава" means "Chapter").

    Is there a way to split such a document into many small files so that each file contains one chapter?
    A code example of an implementation would be appreciated. Any language is fine.

    Here is what I came up with in Java:

    import java.io.*;

    public class ReadLoadFile {
        private static final String OUT_DIR = "/home/user/PartOfText/";

        public static void main(String[] args) {
            File fileForRead = new File("/home/user/Text.txt");
            StringBuilder stringBuilder = new StringBuilder();
            // Everything before the first heading goes to the introduction file.
            File fileForLoad = new File(OUT_DIR + "Введение.txt");
            try (BufferedReader bufferedReader = new BufferedReader(
                    new InputStreamReader(new FileInputStream(fileForRead), "cp1251"))) {
                String s;
                while ((s = bufferedReader.readLine()) != null) {
                    if (s.contains("Глава")) {
                        // A heading starts a new chapter: save the text collected so far.
                        writeFile(fileForLoad, stringBuilder.toString());
                        stringBuilder = new StringBuilder();
                        // The heading line itself is used as the name of the next file.
                        fileForLoad = new File(OUT_DIR + s.trim());
                    }
                    stringBuilder.append(s).append("\n");
                }
                // Save the last chapter, which has no heading after it.
                writeFile(fileForLoad, stringBuilder.toString());
            } catch (IOException e) {
                e.printStackTrace();
            }
        }

        // Writes the text to the file as UTF-8 and closes the writer afterwards.
        private static void writeFile(File file, String text) throws IOException {
            try (BufferedWriter bufferedWriter = new BufferedWriter(
                    new OutputStreamWriter(new FileOutputStream(file), "UTF-8"))) {
                bufferedWriter.write(text);
            }
        }
    }
    



  • The algorithm is roughly as follows:

    1. Find in the source file the fragment between two consecutive headings (the last chapter is bounded by a single heading, at its start);
    2. Cut it out or copy it into a new text file.

    The fragments can be found in different ways: either by searching for the known heading texts one by one (Chapter 1, Chapter 2, Chapter 3, etc.), or by using regular expressions.
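
    The regular-expression variant could look roughly like the sketch below. It assumes that every chapter heading starts a line and consists of a heading word followed by a number. The class name ChapterSplitter, the method splitIntoChapters, the command-line arguments, and the part_001.txt naming scheme are illustrative choices, not something taken from the question.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ChapterSplitter {

    // Splits the text into fragments: the text before the first heading (if any),
    // then one fragment per chapter, each starting with its own heading line.
    // Assumption: a heading is a line that begins with headingWord and a number.
    static List<String> splitIntoChapters(String text, String headingWord) {
        Pattern heading = Pattern.compile(
                "^[ \\t]*" + Pattern.quote(headingWord) + "[ \\t]+\\d+",
                Pattern.MULTILINE);
        List<String> parts = new ArrayList<>();
        Matcher m = heading.matcher(text);
        int start = 0;
        while (m.find()) {
            // The fragment ends where the next heading begins.
            if (m.start() > start) {
                parts.add(text.substring(start, m.start()));
            }
            start = m.start();
        }
        // The last chapter runs to the end of the text.
        if (start < text.length()) {
            parts.add(text.substring(start));
        }
        return parts;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println(
                    "usage: ChapterSplitter <input.txt> <outputDir> <headingWord> [charset]");
            return;
        }
        Path input = Paths.get(args[0]);
        Path outDir = Paths.get(args[1]);
        Charset charset = args.length > 3
                ? Charset.forName(args[3])
                : StandardCharsets.UTF_8;

        String text = new String(Files.readAllBytes(input), charset);
        List<String> parts = splitIntoChapters(text, args[2]);

        Files.createDirectories(outDir);
        for (int i = 0; i < parts.size(); i++) {
            // Numbered names sidestep characters that are not allowed in file names.
            Path out = outDir.resolve(String.format("part_%03d.txt", i + 1));
            Files.write(out, parts.get(i).getBytes(StandardCharsets.UTF_8));
        }
    }
}
```

    For the file from the question, the heading word would be Глава and the charset cp1251, passed as the third and fourth arguments. Anchoring the pattern to the start of a line means that a mention of a chapter in the middle of a sentence does not start a new file, which a plain contains() check would do.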

