
Questions tagged [encoding]

Encoding is a set of rules used to represent data in a form that can be stored and transmitted to another process or system. Character encoding (e.g. Windows-1252, ISO-8859-1, UTF-8, UTF-16) refers to the way character data is represented as a series of bytes. Binary encoding (e.g. Base64) refers to the way binary data is transformed into a series of characters.
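As a rough illustration of the two senses described above, the following Python sketch encodes one sample string under several character encodings and round-trips a few arbitrary bytes through Base64. The sample string and byte values are chosen purely for illustration; only the standard library is used.

```python
import base64

# Character encoding: the same text becomes different bytes per encoding.
text = "naïve café"  # sample string, chosen for illustration only
utf8_bytes = text.encode("utf-8")          # ï and é take 2 bytes each
latin1_bytes = text.encode("iso-8859-1")   # every character takes 1 byte
utf16_bytes = text.encode("utf-16-le")     # every character here takes 2 bytes

# Decoding bytes with the wrong character set gives mojibake, not an error.
mojibake = "é".encode("utf-8").decode("windows-1252")

# Binary encoding: arbitrary bytes become ASCII-safe characters and back.
raw = bytes([0x00, 0xFF, 0x10, 0x80])  # illustrative binary payload
b64_text = base64.b64encode(raw).decode("ascii")
round_trip = base64.b64decode(b64_text)

print(len(utf8_bytes), len(latin1_bytes), len(utf16_bytes))
print(mojibake)
print(b64_text, round_trip == raw)
```

The mojibake line reproduces, in miniature, the kind of double-encoding symptom that several of the questions below describe.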

3 votes
2 answers
1k views

What is the collation used while comparing Unicode string literals in SQL Server 2019?

My understanding is that the collation for comparing Unicode string literals is determined by the database collation. My database is using SQL_Latin1_General_CP1_CI_AS collation. When I compare N'ß' ...
QFirstLast
1 vote
1 answer
756 views

Upgrading MySQL utf8mb3 to utf8mb4 - how to replicate documentation behaviour?

I'm looking to upgrade MySQL fields from mb3 to mb4 in a large database and from reading the documentation I understand that the biggest problem is mb3 stores stuff in 3 bytes, but mb4 stores stuff in ...
KillerKode
2 votes
1 answer
160 views

Insert gives error code 1366 even though charset is utf8mb4

I am trying to manually copy a table from one server to another. I used mysqldump to export the table. The import fails with the following error: ERROR 1366 (22007) at line 6: Incorrect string value: '...
Vongo
  • 181
1 vote
0 answers
60 views

Issue with hash storage resulting from password_verify() output

Consider the following code: $random_token = random_bytes(32); $token_hash = password_hash( $random_token, PASSWORD_DEFAULT ); $token_base64 = sodium_bin2base64( $random_token, ...
DevelJoe
  • 163
2 votes
1 answer
1k views

Postgres encoding issue - find values that have no equivalent in (client) encoding

I have a PostgreSQL database that internally uses UTF-8 encoding. Some clients that connect to the database use LATIN 2 (ISO 8859-2) client encoding. I can't change the ...
CyberMuz
  • 123
1 vote
1 answer
1k views

How to display non-English characters correctly in db2cmd?

On Db2 v11.5.7 on Linux/x86_64 I have a UTF-8 database. Executing db2 get db cfg for test1 returns: Database code page = 1208 Database code set = UTF-8 On my Windows 10 computer in ...
folow
  • 401
2 votes
1 answer
466 views

How can I detect failure in CONVERT( tbl AS utf8)?

MySQL provides the CONVERT(`tbl` AS utf8) expression to transcode text from one charset to another. Sometimes that conversion cannot succeed, because the destination charset does not include ...
Jim DeLaHunt
2 votes
1 answer
672 views

How can I detect double-encoded MySQL columns and rows, and validate the repair?

My database provider, bless their hearts, migrated our MySQL databases to another server recently, and introduced double-encoding of UTF-8 data via Latin1 into our text data. Strings like 'emdash—here'...
Jim DeLaHunt
0 votes
1 answer
40 views

Synapse column store datetime encoding

I have a table in an Azure Synapse database. This table has a clustered columnstore index. It has a datetime column. The minimum value in this column is 2001-03-27 00:00:00 and the maximum is 2022-12-...
Michael Green
0 votes
0 answers
66 views

Create database with specific encoding from shell script

I want to create a database from a shell script. I added this line to my script: psql -c 'CREATE DATABASE template1 WITH OWNER = postgres ENCODING = 'LATIN1' TABLESPACE = pg_default LC_COLLATE = "...
user3093583
0 votes
1 answer
965 views

Import postgresql database with Latin encoding

I'm going to upgrade from PostgreSQL 9 to PostgreSQL 11; the encoding used is Latin. I've exported the database using this command: pg_dumpall | gzip > /tmp/db.sql.gz I extracted the exported ...
Naruto Uzumaki
1 vote
1 answer
1k views

Unknown charset: utf8mb4

Context: I send queries to my MariaDB server (Ver 15.1 Distrib 10.3.34-MariaDB) through a Python script and the mysql.connector module. (I don't know which one it is among those 3 in my list: mysql-...
lyeaf
  • 197
1 vote
0 answers
149 views

Data in one or more tables corrupt after recovering tables with alter table discard/import tablespace

I accidentally dropped my schema and had to recover all the tables per the steps in this post; Recover schema and the data in it from accidentally deleted schema However, I notice that all of the ...
Citizen1138x
0 votes
2 answers
128 views

Where can we find a copy of the "sakila" toy database with uncorrupted city names?

The "sakila" toy database, which is available for download as https://downloads.mysql.com/docs/sakila-db.zip from https://dev.mysql.com/doc/index-other.html, contains a city table whose city ...
Tim Stewart
0 votes
2 answers
416 views

How does MySQL store data and its encoding?

I have a few databases in latin1 and I will migrate them to utf8. I have some characters like 'œ' that are utf8 characters. I want to know how MySQL stores the encoding, because if it stores utf8 characters ...
Thomas Corbisier
4 votes
2 answers
9k views

How to handle short UUIDs with Postgres?

I see that many web services (Stripe comes to mind) use a special encoding for their UUIDs. Instead of the usual encoding a44521d0-0fb8-4ade-8002-3385545c3318 they are going to be encoded using a ...
laurent
  • 191
1 vote
1 answer
713 views

Why can't binary data be inserted/displayed as ones and zeroes?

If I have a column of type binary or varbinary, I imagine the data as a sequence of bits. For example, it makes sense to me that 01001 (as a base 2 number) could be a valid value in a binary(5) column....
Jacob Stamm
9 votes
3 answers
4k views

Is it possible to use OPENROWSET to import fixed width UTF8 encoded files?

I have an example data file with following contents and saved with UTF8 encoding. oab~opqr öab~öpqr öab~öpqr The format of this file is fixed width with columns 1 to 3 each being allocated 1 ...
Martin Smith
  • 86.5k
1 vote
0 answers
543 views

Convert any string to URL-valid percent encoding in BigQuery

I am trying to convert any string with any set of special characters into a valid url of the format below. In Bigquery Example: /artwork-v2/-̴̕ι-̶͔͛n̴e̷p̸u̴̒n̵uś̵̥o̵̙̾rt̷͗um̶̹͐-20380 encodes to: /...
Keegan Ead
0 votes
1 answer
736 views

How can I know if there is data loss when converting MySQL character set(s)?

I'm converting a large (70G) legacy MySQL database that is mostly utf8 (but with a sprinkling of other encodings in fields) to one that is (as uniformly as possible) utf8mb4 (utf8mb4_unicode_ci). The ...
pixelearth
2 votes
1 answer
2k views

What is the impact of converting latin1/latin1_swedish_ci to utf8mb4/utf8mb4_unicode_ci?

I was facing some issues with character encoding. Those were resolved by updating the CHARACTER and COLLATE for some columns in the table. So my concern is: is this conversion safe? Or can this ...
Mehar
  • 121
2 votes
1 answer
526 views

Unable to enter characters in MariaDB

I have just switched from MySQL to MariaDB and am running into a very silly problem: I cannot enter any extended characters in the database (for example, ö ä or å). The system is utf8, and I tried ...
jorgon
  • 21
5 votes
1 answer
3k views

Msg 6355 "Conversion of one or more characters from XML to target collation impossible" when querying sys.dm_exec_query_plan

I like to find missing indexes on the go by looking at the execution plans. It can potentially give me an indication of where to look further if I want to improve something that is currently running. ...
Marcello Miorelli
2 votes
2 answers
2k views

How to create a new column in SELECT CASE WHEN if the string contains Arabic words?

I have a problem with a SELECT CASE WHEN statement. I would like to add a new column in the statement. I got a result, but with ????? because I have set the column to Arabic words, I ...
abdou31
  • 61
3 votes
1 answer
1k views

Does a huge key length value for a multibyte column affect the index performance?

When I look at the EXPLAIN results, the key len value is always calculated based on the actual column length multiplied by the maximum number of bytes for the chosen encoding. Say, for a varchar(64) ...
Your Common Sense
1 vote
1 answer
2k views

mysqldump dumps different data with and without --no-create-info

mysqldump dumps different representations of data when called with/without --no-create-info. Test case First, create the test table and populate it with interesting data. CREATE TABLE `test` ( `...
matt
  • 203
0 votes
1 answer
2k views

How to fix double-encoded UTF8 characters in postgres

I have a dataset (shapefile) with the same problem as the post below: https://stackoverflow.com/questions/11436594/how-to-fix-double-encoded-utf8-characters-in-an-utf-8-table "A previous LOAD ...
hugonbg
  • 101
1 vote
1 answer
3k views

PostgreSQL pg_dump -E encoding option not working

I have a UTF8 database qdb and I want to back it up to a plain file using the same UTF8 encoding. I am using pg_dump as I don't have pgAdmin working now. I however can't get pg_dump to output a UTF8-...
Rafs
  • 125
1 vote
1 answer
2k views

How to escape special characters in MySQL

When I do select * .. | mysql ... > /tmp/file from a table with text, there are some problematic characters that prevent me from loading it to a different db using copy (postgres) or load into (...
Nir
  • 533
1 vote
0 answers
2k views

PSQL console encoding

How can I have a working PSQL console using UTF8 encoding under Windows? I have a Windows server and client. The Postgres 12 database contains tables with content in multiple languages (ex: English, ...
JGH
  • 111
4 votes
2 answers
1k views

Does MySQL 8 ASCII vs utf8mb4_0900_ai_ci size differ when only using ASCII characters?

If I use only ASCII characters, will VARCHAR (255) with utf8mb4_0900_ai_ci be larger on disk than VARCHAR (255) using ASCII?
jsHate
  • 177
2 votes
1 answer
13k views

Saving images as base64 encoded strings, why is it bad?

I've seen this on one of the production databases I've come across and these images apparently cover a large portion of their DB. After researching a lot I couldn't really find a lot of good answers ...
Chessbrain
  • 1,223
1 vote
1 answer
6k views

Compress JSON String Stored in PostgreSQL, such as MessagePack?

JSON strings are currently being stored in a PostgreSQL 11 table in a text field. For example, a row can have the text field asks containing the string: {"0.000295":1544.2,"0.000324":1050,"0.000325":...
Nyxynyx
  • 1,141
1 vote
0 answers
358 views

MariaDB REGEXP_REPLACE Invalid utf8 byte sequence

I'm working with a fairly old (15 years) database. It started as a MySQL3, was upgraded several times, the frontends had problems with proper encoding, the whole database was converted to utf8mb4 (...
mindhaq
  • 111
1 vote
2 answers
2k views

Oracle to T-SQL OPENQUERY special character conversion issues

I'm struggling to figure out where the character encoding issue on my Linked server may be coming from here. The ZPDT_PAT_ALPHA column should have a degrees symbol at the end, as shown by the DUMP. ...
blakel
  • 31
2 votes
2 answers
560 views

charindex thrown off by extended characters

I've got a column with filenames stored in an nvarchar(255). I'm trying to parse out the file extension using reverse( left( reverse(filename), charindex('.', reverse(filename) ) -1 ) ) This works ...
jfrobishow
2 votes
0 answers
160 views

Create/Edit SSIS Derived Column Task From Text/Code for Large ETL

I have to insert a large UTF-8 encoded flat file into a 1252-encoded SQL Server. In order to do so, I am using SSIS to type cast each column. Is there a way, other than copy pasting through all 400 ...
Tyler Raber
7 votes
1 answer
17k views

Query to find rows containing ASCII characters in a given range

I am using some scripts from another topic, but the accepted answer isn't working for all my data scenarios. I would have asked my question on the original How to check for Non-Ascii Characters post, ...
Fred
  • 73
4 votes
2 answers
843 views

Different characters, same ASCII code?

I have this query that throws two results: SELECT id FROM table1 WHERE id like 'nm041033%' nm0410331 nm0410331 And this slightly different query that throws only one result: SELECT id FROM table1 ...
Leopoldo Sanczyk
3 votes
1 answer
4k views

Is there a MySQL character set and encoding that will allow for both emojis and accents?

I've got a database of terms that get added to by one group of users, and queried against by another. I was running into problems when people would query for an emoji in the database and my React app ...
McB
  • 133
2 votes
2 answers
2k views

Handling data encoding issues while loading data to SQL from Script (Notepad++)

I'm pretty sure this is not a SQL Server problem. I already asked a question HERE with an awesome explanation, BUT I still can't explain to the guys where I work that it has nothing to do with SQL ...
Racer SQL
  • 7,484
20 votes
3 answers
32k views

PostgreSQL: difference between collations 'C' and 'C.UTF-8'

In PostgreSQL, what is the difference between collations C and C.UTF-8? Both show up in rows of pg_collation. Is it perhaps the case that C.UTF-8 is the same as C with encoding UTF-8 regardless or ...
rookie099
  • 368
0 votes
1 answer
10k views

How can I search for a hex string in oracle?

There's a database record that is incorrect. The name field displays like this in a browser ( this is incorrect ): Thcˇodore And when I look at the record via SQL results, I see: Thc\xCB\x87odore I'...
cwd
  • 105
6 votes
2 answers
2k views

Byte ordering for multibyte characters in SQL Server versus Oracle

I am currently in the process of migrating data from Oracle to SQL Server and I'm encountering an issue trying to validate the data post-migration. Environment Details: Oracle 12 - AL32UTF8 ...
HandyD
  • 10.4k
1 vote
2 answers
10k views

Bulk insert not retaining special chars of UTF-8 encoded txt file in SQL Server 2008

I have a stored procedure that bulk imports a text file and inserts it into my database. CREATE TABLE DBO.TEMP_STORE ( ID nvarchar(max), [MONTH] nvarchar(max), [...
ninjasense
4 votes
3 answers
2k views

Unicode storage of \u202b RLE and \u202c PDF in a Unicode-aware database?

I'm building a new product for toponyms and in it the Arabic shows kinda like this: ^IArabic^I<202b>ﺰﻤﺑﺎﺑﻮﻳ<202c>^I<202b>ﺞﻫﻭﺮﻳﺓ ﺰﻤﺑﺎﺑﻮﻳ<202c>$ Actually not quite. This is a ...
Evan Carroll
  • 64.7k
3 votes
1 answer
4k views

How to insert a Unicode character verbatim into a varchar DB?

I need to insert this character '●' into a VARCHAR column of an MSSQL database with collation set as SQL_Latin1_General_CP1_CI_AS (or at least mock what my Python + Windows MSSQL Driver might have done)...
Nishant
  • 899
3 votes
2 answers
4k views

SQL Server 2019 UTF-8 Support Benefits

I'm already quite comfortable with using COMPRESS() and DECOMPRESS() in an internal forum software for our company (Currently in SQL Server 2017), but trying to make the database as efficient as ...
John Titor
0 votes
1 answer
70 views

Users running ANSI scripts are having problems with special characters [closed]

Long story short. How can I fix users' environments, to make them run our scripts using ANSI encoding? The problem is, we send them scripts to run on their databases using ANSI encoding. But some of ...
Racer SQL
  • 7,484
17 votes
1 answer
12k views

Error starting SQL Server 2017 service. Error Code 3417

I have SQL Server 2017 installed on my computer. This is what SELECT @@VERSION returns: Microsoft SQL Server 2017 (RTM-GDR) (KB4293803) - 14.0.2002.14 (X64) Jul 21 2018 07:47:45 Copyright (C) ...
Beginner
  • 273