Where can we find a copy of the "sakila" toy database with uncorrupted city names?



  • The "sakila" toy database, available as https://downloads.mysql.com/docs/sakila-db.zip from https://dev.mysql.com/doc/index-other.html, contains a city table whose city names have been corrupted, presumably by encoding missteps at some point in the file's history.

    For example, the world database (which is the ancestor of the sakila database) available from that same download page has a city table with correctly spelled cities such as A Coruña (La Coruña) (in Spain), Acuña (in Mexico), and Balašiha (in Russia), but sakila's city table has A Corua (La Corua), Acua, and Balaiha respectively. Based on my cursory comparison of sakila's city table with world's city table, these are some of the extended characters that are absent from city names in the sakila table:

    á, â, ç, é, İ, í, ñ, ó, ö, š
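
    A quick way to see the pattern: the corrupted names look exactly like the correct ones with every non-ASCII character dropped (rather than replaced by mojibake). The sketch below only illustrates that observation; the helper name drop_non_ascii is made up for this example, and it does not prove how the file was actually damaged.

```python
def drop_non_ascii(name: str) -> str:
    """Return `name` with every non-ASCII character removed."""
    # errors="ignore" silently discards characters that ASCII cannot encode.
    return name.encode("ascii", errors="ignore").decode("ascii")


# Correct spellings as quoted from the world database in this post.
world_names = ["A Coruña (La Coruña)", "Acuña", "Balašiha"]

for correct in world_names:
    print(f"{correct!r} -> {drop_non_ascii(correct)!r}")
```

    For the three examples above this reproduces the spellings found in sakila's city table.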

    Where can a copy of the sakila database with correctly spelled cities be found, if not from mysql.com?



  • It sounds like the LOAD DATA command did not correctly specify the incoming encoding (CHARACTER SET).

    More

    Looking at the data file, I conclude that the accented characters are not there. That is, the problem is not with your actions, but with the source data.

    # hexdump -C sakila-data.sql | egrep -1 'A Coru|a c|una' | head;
    00003850  73 74 20 41 76 65 6e 75  65 27 2c 27 27 2c 27 4b  |st Avenue','','K|
    00003860  61 64 75 6e 61 27 2c 32  35 32 2c 27 34 33 33 33  |aduna',252,'4333|
    00003870  31 27 2c 27 37 34 37 37  39 31 35 39 34 30 36 39  |1','747791594069|
    --
    00003a10  35 20 32 32 3a 33 32 3a  33 30 27 29 2c 0a 28 32  |5 22:32:30'),.(2|
    00003a20  35 2c 27 32 36 32 20 41  20 43 6f 72 75 61 20 28  |5,'262 A Corua (|
    00003a30  4c 61 20 43 6f 72 75 61  29 20 50 61 72 6b 77 61  |La Corua) Parkwa|
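
    The same check can be done over the whole file instead of eyeballing a hexdump. This is a minimal sketch, assuming the dump has been unzipped locally; the function name non_ascii_lines is hypothetical, and the file path is whatever you pass on the command line. It reports every line that contains a byte outside the ASCII range, so an empty result means no accented characters survive in the file under any encoding.

```python
import sys


def non_ascii_lines(data: bytes):
    """Return (line_number, raw_line) for each line holding a byte >= 0x80."""
    hits = []
    for lineno, line in enumerate(data.splitlines(), start=1):
        # Any byte >= 0x80 is non-ASCII in UTF-8 and Latin-1 alike.
        if any(byte >= 0x80 for byte in line):
            hits.append((lineno, line))
    return hits


if __name__ == "__main__" and len(sys.argv) > 1:
    # Read as raw bytes so no decoding step can hide or alter the data.
    with open(sys.argv[1], "rb") as fh:
        found = non_ascii_lines(fh.read())
    print(f"{len(found)} line(s) contain non-ASCII bytes")
    for lineno, line in found[:10]:
        print(lineno, line[:80])
```

    Run it as, for example, python check.py sakila-data.sql (the script name is arbitrary).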
    

    And that goes back to your original question of "where is a good copy?"

    I see that a bug report has been created:

    https://bugs.mysql.com/bug.php?id=106951
    


