The "b" prefix indicates that the object you have is not a text string but a sequence of bytes.
In Python 3 the two are fundamentally different, which is why you always need to know how the text is encoded in the bytes before you can turn them into characters. Nowadays text is increasingly stored as "utf-8", but some legacy systems, and Windows in Western European locales, use "latin-1" (or its close relative cp1252), which fits every character of the Portuguese language in a single byte.

Objects of type "bytes" in Python have a "decode" method: call it with the right encoding and the result is a text string (which Python displays without the 'b' prefix).

Besides the "decode" method, the call str(xml, 'utf-8') performs the same transformation, and once you do that the error message changes. Since the error is no longer Python complaining about an invalid utf-8 sequence, chances are your XML really is in utf-8, and it is the ODBC side that rejects a character. utf-8 can represent every Unicode character; other encodings, such as latin-1, cannot. That becomes a problem if the text has characters from languages such as Greek, Russian or Hebrew, or even punctuation marks that fall outside latin-1.

The remedy would be to force an encoding with escaping when passing the data to the driver, but there is another problem: the function does not accept bytes (that is, already-encoded text). The upshot is that you will have to mutilate the text in Python: replace every character outside latin-1 with "?", turn the result back into text, and only then make your call. If there is no other error in the XML, it should work. I would also recommend asking whoever designed the database you are feeding to make it accept a universal encoding.

To understand more about these processes, stop everything you are doing right now and read http://local.joelonsoftware.com/wiki/O_M%C3%ADnimo_Absoluto_Que_Todos_os_Programadores_de_Software_Precisam,_Absolutamente,_Positivamente_de_Saber_Sobre_Unicode_e_Conjuntos_de_Caracteres_(Sem_Desculpas!)
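To make the bytes/text distinction concrete, here is a minimal sketch. The variable `xml` is only a stand-in built for this example (your real value comes from wherever you read the XML), and the sample text is made up.

```python
# Stand-in for the bytes object you received; the content is illustrative.
xml = "texto inválido".encode("utf-8")

print(type(xml))   # <class 'bytes'>
print(xml)         # b'texto inv\xc3\xa1lido'  (note the b prefix)

# Two equivalent ways of turning utf-8 bytes into text:
text_a = xml.decode("utf-8")
text_b = str(xml, "utf-8")
print(text_a)              # texto inválido
print(text_a == text_b)    # True

# Decoding with the wrong encoding may not raise at all: latin-1 maps every
# byte to some character, so you silently get garbage instead of an error.
wrong = xml.decode("latin-1")
print(wrong)               # texto invÃ¡lido
```

This is also why knowing the real encoding matters: a wrong guess can pass without any exception and only show up later as corrupted text.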
To solve your problem by removing the problematic characters from the text:

An error equivalent to the following is what is now happening inside the ODBC code when you send text containing, for example, Cyrillic characters:

In [119]: a = "texto inválido: Ут пауло интерессет темпорибус пер"
In [120]: a.encode("latin-1")
UnicodeEncodeError Traceback (most recent call last)
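If you want to know which characters would trigger this error before the data ever reaches the driver, a small check along these lines may help. The function name `chars_outside` is invented for this sketch; it is not part of Python or of any ODBC library.

```python
def chars_outside(text, encoding="latin-1"):
    """Return the distinct characters of `text` that `encoding` cannot represent."""
    bad = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            # Keep first-seen order and skip repeats.
            if ch not in bad:
                bad.append(ch)
    return bad


a = "texto inválido: Ут"
print(chars_outside(a))            # ['У', 'т']
print(chars_outside(a, "ascii"))   # ['á', 'У', 'т']
```

An empty list means the text can be encoded as-is; anything else is what the replacement step described next will turn into "?".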
So, you should: decode your data as utf-8, encode it to latin-1 with the unknown characters replaced by "?", and decode the result back to text. You then have data that can be sent to your database:

In [122]: dados
Out[122]: b'texto inv\xc3\xa1lido: \xd0\xa3\xd1\x82'
In [123]: dados_str = dados.decode("utf-8").encode("latin1", errors="replace").decode("latin1")
In [124]: dados_str
Out[124]: 'texto inválido: ??'
(The "dados" variable in this example is equivalent to what you have at the start: a bytes object holding utf-8 encoded text that contains characters invalid in latin-1.) If you still get the same error, "não é possível alternar a codificação" (roughly, "unable to switch the encoding"), try filtering out all non-ASCII characters: use "ascii" instead of "latin-1" in the code above. Note that accented letters such as "á" will then be replaced by "?" as well.
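The same steps can be wrapped in a small helper so the target encoding is easy to switch. This is only a sketch: the name `to_driver_safe` and its parameters are made up here, and it assumes your input really is utf-8 encoded bytes.

```python
def to_driver_safe(data, source="utf-8", target="latin-1"):
    """Decode `data`, then replace every character `target` cannot hold with '?'."""
    text = data.decode(source)
    # errors="replace" substitutes '?' for each unencodable character.
    return text.encode(target, errors="replace").decode(target)


dados = b'texto inv\xc3\xa1lido: \xd0\xa3\xd1\x82'
print(to_driver_safe(dados))                   # texto inválido: ??
print(to_driver_safe(dados, target="ascii"))   # texto inv?lido: ??
```

The return value is an ordinary str, which is what the call that rejects bytes needs. Python's encode also accepts errors="xmlcharrefreplace", which writes numeric references such as &#1059; instead of "?"; whether your database accepts those is something I have not tested.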