[pol@fedora data]$ lsb_release -a
LSB Version: :core-4.1-amd64:core-4.1-noarch
Distributor ID: Fedora
Description: Fedora release 34 (Thirty Four)
Release: 34
Codename: ThirtyFour
I'm trying to convert a sample database file from MS SQL Server to PostgreSQL.
So, I'm having two small niggles that I can't resolve.
shipname NVARCHAR(40) NOT NULL,
That's
(always) two spaces
identifier (i.e. field name) - always [a-z] - lower case alphabetical
followed by an unknown number of spaces
followed by NVARCHAR(xy) NOT NULL or it may be followed by NVARCHAR(xy) NULL
and I want to transform this into
shipname TEXT NOT NULL CHECK (LENGTH(shipname) <= xy),
or
shipname TEXT NULL,
What I have so far:
sed 's/^ [a-z]+[ ]+NVARCHAR([0-9]+) NOT NULL/TEXT NOT NULL CHECK \(LENGTH\((\1) <= (\2)\)/g'
So,
^
is the beginning of the stringfollowed by two spaces
followed by my field name [a-z]+
followed by an arbitrary no. of spaces [ ]+
NVARCHAR([0-9]+)
and substitute in
TEXT
followed by NOT NULL then CHECK(LENGTH(xy) - back reference 1 - <= back reference 2...
I've tried various permutations and combinations of the above, but nothing appears to work for me.
[pol@fedora data]$ sed 's/^ [a-z]+[ ]+NVARCHAR([0-9]+) NOT NULL/TEXT NOT NULL CHECK \(LENGTH\((\1) <= (\2)\)/g'
sed: -e expression #1, char 87: invalid reference \2 on `s' command's RHS
Get invalid back reference...
Ideally, and I stress ideally, if the string following NVARCHAR(xy) is NULL
and not NOT NULL
, I don't want any length check - because it doesn't make sense to take the LENGTH of a NULL... this is conditional behaviour - not sure if it's possible in regexps....
p.s. thought this would be trivial.
Have data like this:
N'Strada Provinciale 1234', N'Reggio Emilia', NULL, N'10289', N'Italy');
I want to change the N'
into just plain apostrophe '
(the N'
is a SQL Server thing) but I don't want to change the NULL
into the empty string, or worse ULL
- so I tried:
[pol@fedora data]$ sed 's/N\'\'/g TSQLV5.sql
but get
sed: -e expression #1, char 7: unterminated `s' command
I know that I've used sed
a lot, but would be open to any awk
commands that could perform the tasks required.
xy
match the40
from the corresponding original?region
andpostalcode
are VARCHAR(n) NULL, yet you want them to have a length check in the desired output. Which is correct? 2. Where does thetest_field
come from? and the extra blank lines (before birthdate, address, phone)?