How do I read a UTF-8 file character by character and store the characters?



  • There are many different approaches on the Internet.
    How do you read a Unicode code point?
    And what do you store a Unicode code point in, in Java?
    This matters because the file can be quite large.



  • You need to pick classes that let you specify the encoding of the input stream.
    Unfortunately, it's a bit verbose.

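    import java.io.BufferedReader;
    import java.io.FileInputStream;
    import java.io.InputStreamReader;
    import java.nio.charset.StandardCharsets;

    // Decode the file's bytes as UTF-8 while reading.
    BufferedReader in = new BufferedReader(new InputStreamReader(new FileInputStream(pathToFile), StandardCharsets.UTF_8));

    // Read the whole file into memory in chunks of up to 1024 chars.
    StringBuilder sb = new StringBuilder();
    char[] cbuf = new char[1024];
    int r;
    while ((r = in.read(cbuf, 0, 1024)) != -1) {
        sb.append(cbuf, 0, r);
    }
    in.close();
    String s = sb.toString();

    // Walk the string by code point, not by char, so surrogate pairs stay together.
    for (int i = 0; i < s.length();) {
        int cp = s.codePointAt(i); // Unicode code point, stored as an int
        ...
        i += Character.charCount(cp); // advance by 1 or 2 chars
    }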
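Since the file may be large, it may be better to process code points as they are read instead of building one big String first. Below is a minimal sketch of that idea, not a definitive implementation. The class name CodePoints and the helpers forEachCodePoint and forEachCodePointInFile are hypothetical names introduced for illustration. It assumes the file really is UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.function.IntConsumer;

final class CodePoints {
    // Hypothetical helper: calls sink once per Unicode code point read from in.
    // Uses one char of lookahead to join a high surrogate with the low surrogate after it.
    static void forEachCodePoint(Reader in, IntConsumer sink) throws IOException {
        int c = in.read();
        while (c != -1) {
            int next = in.read();
            if (Character.isHighSurrogate((char) c)
                    && next != -1
                    && Character.isLowSurrogate((char) next)) {
                // Supplementary character: combine the pair into a single code point.
                sink.accept(Character.toCodePoint((char) c, (char) next));
                c = in.read();
            } else {
                // BMP character (or an unpaired surrogate, passed through unchanged).
                sink.accept(c);
                c = next;
            }
        }
    }

    // Hypothetical convenience wrapper: opens the file as UTF-8 and closes it when done.
    static void forEachCodePointInFile(Path path, IntConsumer sink) throws IOException {
        try (BufferedReader in = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            forEachCodePoint(in, sink);
        }
    }
}
```

Each code point arrives as an int, which is the usual way to hold one in Java, since a char is only 16 bits and cannot represent characters above U+FFFF. Note that Files.newBufferedReader reports malformed UTF-8 with an IOException, whereas InputStreamReader silently substitutes a replacement character.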



