Detailed Notes
Focus for this episode: We start into a deeper look at synchronizing columns
This tutorial walks through the creation of a database synchronization tool in Python. It continues our Python and SQL Tutorials with a useful application that provides a deeper look at SQL structures and commands.
Repository For Code: git clone [email protected]:rbroadhead/dbsync.git
Transcript Text
[Music] well hello and welcome back we are continuing our season our series where we're looking at uh if i can get back here we're looking at python and at sql server and sql in general i guess and we are building a synchronization and we've started with tables we've added columns we've added data so in each case this is adding things that don't exist from a source to a destination as first tables are concerned um really there's not much there it's either it's there or it's not and since this is a essentially a one-way synchronization our goal is just to find tables that exist in the source and go ahead and add them into the destination we could go vice versa or i guess say anything that doesn't exist in the source we want to drop that table in the destination but i think we'll keep it now that we probably will swing back around to that later as a maybe a setting or something like that but for now i want to take that next step which is with the table what do we do if a because we've we've got columns uh first cases column exists in source doesn't exist in destination another situation is it exists in both but it's different and that's what we're going to start into uh today so before we had this syncrose where we're just going to synchronize the rows from one to the other and what we want to do this time is we're going to actually do the same thing essentially but we're going to do and actually that should have been adros but we're going to do it with the columns and so the way we need to come through here where is this at so with this one what we're going to do is we're going to start with our we've got our source our destination databases and then a list of tables and that tables list which is right here source tables is the list of tables from the source table uh source database so we're just going to go through that source results at all our tables uh missing columns sync columns here we go and we're going to need a couple cursors so we come in we're going to send those in and we're going to have a cursor for the source database and then we're going to have two different cursors for the destination database because we're going to be spinning through that a couple different ways and we have this missing columns which we really don't need i don't know why it's because i copy and paste error so we're going to do is for each table in the list we're going to walk through and the first thing we do is going to get and this is the source we're going to walk through and we're going to get a description we're going to get the details of the table in the source database and actually what we're going to do is we're going to get it in both so if you can see here get the list of columns in the first in the source list of columns in the destination and then we come through and we load those into source columns and destination columns and then what we're going to do is for each column in source columns so for source column in there we're going to then for each one of those we're going to walk through the destination columns and we're basically going to say that uh if destination column name uh let's see is that right yeah i'm sorry that's your destination column uh for each one of those if the destination column is equal to the source column which is the name then we're going to do something otherwise we don't do anything at all so basically first we're going to do is we're going to say hey if the name is the same so we're not going to allow for name changes because we don't really have a way to track that that's the same column so what we're going to do is go ahead and just assume that we're good uh other than if the chain if we have the same name so if the name is a chain same they're going to walk through we're going to assume that we don't need to update it uh and you'll sort of see as we go into it but otherwise we're going to do is we're going to walk through let's see so we're going through we're going to go through the length of source column uh remember these are arrays essentially so we're going to get the length of source column and then we're going to basically walk through each of them and actually i guess we could start it at 1. and right now we're going to print each of those we're going to print that should be source column and then destination column so we're going to print what is the details of each column and it may be things like you know we skip the name but it could be type we've got functions we've got whether it's primary key or key or not some things like that now first thing we need to do is we have some instances where in either of those we're going to have a um and let's actually just do it like this we're going to do a current destination equals something let's do that and current source equals because what we need to do is we need to figure out if the type of the row uh let's see source column because we get these a little bit differently from time to time sometimes they come back in strings sometimes sometimes they come back as a byte array uh so if the source then we're going to say that the current source equals whoops a d of the which just basically says let's take this and we've seen this before you can go look in the python stuff but basically we're going to make cursors equal to a decode of that otherwise source is just the string and we'll just make it a string so let's do string of source column index and this is just so we can do some string compares and things like that and we're going to do the same thing with the destination so this needs to be our destination column and destination column and current destination if not i'm going to do here and it's going to destination come so what we've got is we've figured out regardless what we have is we're going to is that right yeah then we've set these two up and what we'll see is we'll see here that these things are going to sometimes be different so if we come in here and we do uh current source i do it the same order so current source current destination now we've got uh some specialty type things so if index is one then we're going to have to do something here um let's see we're going to go ahead and we don't need to mess with that so if it's one then if the uh what do we call the current source curve source dot upper starts with that or cursed source that upper starts with that then we have to split it up a little bit and this is what we've seen before because of the whoops src um we've seen this because of situations where we have those like ants that have a a size and or in some kinds they don't sometimes just straight up sometimes it's an int and it has a default size which can be a problem so we're going to fix that in both of these just to make sure that they are we're doing an apple to apple comparison and so we have to do a little bit of conversion for those and then we don't worry about that and then um [Music] we're just going to go ahead and now compare the two and uh let's see do we want this uh yeah so we're gonna say here oh and this is just oh actually we can do this so we're going to say let's get let's get rid of this let's take this uh so we're gonna do that otherwise we're gonna say if these things are true um if those two are equal if source then uh oh if they're not equal then we're gonna have to update it and they're gonna walk through our indexes so then what we're gonna do is say okay so now what do we do if we have to update it so now what we wanna do is we're gonna come down here and we're going to say uh this is not going to call row uh so the name is going to be uh source oh call source i'm sorry current source id uh zero so that's our name uh the type is again we can just do um let's see did we we decoded those each time right so we have we're doing the decodes here oh so this is for each one so really what we want to do oh this is if update okay i'm sorry if update so uh name equals that um so this is going to be uh let's go ahead and do it this way source column apologies i'm sort of thinking on the fly here a little bit as i'm going through this so name is going to be here the column type that's going to be source column [Music] 1. and we probably want to do actually let's do it this way student name equals this type equals this what else do we need we need default uh so he's gonna be that okay so we can do default up here we need to enable name type column type whoops where'd you go name type default and so table is good so i just need column let's call them column type instead and then what i'm going to do is here i'm going to do this real quick so i'm going to say if well if it's equal to 1 then i know that the name is equal to oops i'm sorry that's not going to be it let's not be that that's going to be here and so when i walk out of this if the index is 1 then the column type whoops is equal to the current source if [Music] i guess i can do an else here else out let's do lift idx equals zero if a0 then the name is equal to the current source whoops and so we're going to do is we're going to build out our little string if we need to which we don't yes maybe you could say it's just it's a bit of a pain but it also saves us walking through some crap later so uh zero one uh next is the default so four then default equals current source and let's see column type name and default those are the three we need okay so we really need that so those are our three um do we need yeah because right now we're not going to worry about uh some of the other values like a primary care anything like that so we should be good with those three so then that means uh if we do an update so did we already get that we don't have the default okay so then we need to do the default thing then we need to do if those are in there then we're going to change up our default and then we do an offer table we're going to print and we're going to update so let's see what happens so if we run that um let's do yeah we'll just sync rows we're gonna leave that at 10 right now and let's go see what this runs out to be okay so we've got an issue somewhere in here which is in the none okay so uh alter t oh so we need to deal with that none before we had that where did we run into that here we go uh oh so let's update so if let's just do this because i think this is just for the default maybe so if default equals none then default oh let's do this this way else default equals that we've got to blank it back out so if he's in there or he's in there that's good oh let's do this see so if default is not equal to none and fault not equal to that okay there we go let's see if that works okay so now it doesn't uh blah blah blah oh we're still getting default none is that right oh it's probably here let's try that let's just say if default because otherwise it's not let's see what happens here nope that's still not going to give it to us okay if he's not equal to null then default should be equal to that so let's try this again let's see what happens so we're still getting this i'm getting all these nuns in here okay alter table modify oh we're not getting the name so we didn't get the name right uh name name name name name name ah we didn't get name so we're gonna say um which we know right right up here name is never going to be that problem so name is just going to be equal to destination column so you're actually going to be source columns just to be consistent whoops and now we're going to get that there we go and now we're getting missing columns is not defined oh we've got that somewhere in that's in sync columns which we don't need oh i probably moved that from the wrong place where is where is it looking for missing columns no here we go nope that's right all right missing columns there we go i've got that there [Music] oh here we go sync columns does not mean uh that right yep so if we do it there we go so we get all our inserts we fly through and we're gonna see some stuff here for example where there's a varchar2 but here it was a uh a line it shouldn't be and we're gonna go through here and we're gonna see we're gonna see some interesting changes here so we're gonna go through next time around and track these down uh try to see what's going on we're gonna try to get a little bit better of a debug because i don't think it's getting us exactly like it's supposed to uh for example like lookup type default undefined uh we probably need to like we probably need to deal with that so we've got a couple special things to deal with we're gonna track that down next time though so go out there and have yourself a great day a great week and we will talk to you next time [Music] you
Transcript Segments
[Music]
well hello and welcome back we are
continuing our season our series where
we're looking at
uh if i can get back here we're looking
at python and at sql server and sql in
general i guess and we are building a
synchronization
and we've started with tables
we've added columns we've added data so
in each case this is
adding things that don't exist from a
source to a destination as first tables
are concerned um really there's not much
there it's either it's there or it's not
and since this is a
essentially a one-way synchronization
our goal is
just to find tables that exist in the
source and
go ahead and add them into the
destination
we could go vice versa or i guess say
anything that doesn't exist in the
source we want to drop that table
in the destination but i think we'll
keep it now that we probably will swing
back around to that later as a
maybe a setting or something like that
but for now i want to take that next
step
which is with the table
what do we do if a because we've we've
got columns uh first cases column exists
in source
doesn't exist in destination
another situation is
it exists in both but it's different
and that's what we're going to start
into uh today
so before we had this syncrose where
we're just going to synchronize the rows
from one to the other
and what we want to do this time is
we're going to actually do the same
thing essentially but we're going to do
and actually that should have been adros
but we're going to do it with the
columns
and so the way we need to come through
here
where is this at
so with this one what we're going to do
is we're going to start with our we've
got our source our destination databases
and then a list of tables and that
tables list
which is right here source tables
is the list of tables from the source
table uh source database so we're just
going to go through that source results
at all our tables uh missing columns
sync columns here we go
and we're going to need a couple cursors
so we come in we're going to send those
in
and we're going to have a cursor for the
source database
and then we're going to have two
different cursors for the
destination database
because we're going to be
spinning through that a couple different
ways
and we have this missing columns which
we really don't need i don't know why
it's because i copy and paste error so
we're going to do is for each table in
the list
we're going to walk through
and the first thing we do is going to
get and this is the source
we're going to walk through and we're
going to get a description we're going
to get the details of the table in the
source database
and actually what we're going to do is
we're going to get it in both so if you
can see here
get the list of columns in the first in
the source list of columns in the
destination
and then we come through and we
load those into source columns and
destination columns
and then what we're going to do is for
each column in source columns so for
source column in there
we're going to then for each one of
those we're going to walk through the
destination columns
and we're basically going to say that
uh if destination column name
uh let's see is that right yeah i'm
sorry
that's your destination
column
uh for each one of those if
the destination column is equal to
the source column
which is the name then we're going to do
something otherwise
we don't do anything at all so basically
first we're going to do is we're going
to say hey if the name is the same so
we're not going to allow for name
changes because we don't really have a
way to track that that's the same column
so what we're going to do is go ahead
and just assume that we're good
uh other than if the chain if we have
the same name so if the name is a chain
same
they're going to walk through we're
going to assume that we don't need to
update it
uh and you'll sort of see as we go into
it
but otherwise we're going to do is we're
going to walk through
let's see so we're going through
we're going to go through the length of
source column
uh remember these are
arrays essentially so
we're going to get the length of source
column and then we're going to basically
walk through each of them and actually i
guess we could start it at 1.
and right now we're going to print each
of those we're going to print that
should be
source column
and then destination column so we're
going to print what is the
details of each column and it may be
things like you know we skip the name
but it could be type
we've got functions we've got whether
it's primary key or
key or not some things like that
now first thing we need to do is
we have some instances where in either
of those
we're going to have a um and let's
actually just do it like this we're
going to do
a current destination
equals something let's do that and
current source
equals
because what we need to do is we need to
figure out if the type of the row uh
let's see source column because we get
these a little bit differently from time
to time
sometimes they come back in strings
sometimes sometimes they come back as a
byte array
uh so if the source then we're going to
say that the
current source equals whoops
a d
of the
which just basically says let's take
this and we've seen this before you can
go look in the
python stuff but
basically we're going to make cursors
equal to a decode of that
otherwise
source is just the string
and we'll just make it a string so let's
do string of
source column index
and this is just so we can do some
string compares and things like that and
we're going to do the same thing with
the destination
so this needs to be our destination
column
and destination column
and current destination
if not
i'm going to do here
and it's going to destination come so
what we've got is we've figured out
regardless what we have is we're going
to
is that right yeah then we've set these
two up
and what we'll see is we'll see here
that these things are going to sometimes
be different so if we come in here
and we do uh current source
i do it the same order so current source
current destination
now we've got uh some specialty type
things so if index is one
then we're going to have to do something
here um
let's see we're going to go ahead and
we don't need to mess with that so
if it's one then if the
uh
what do we call the current
source curve source dot upper starts
with that or
cursed source
that upper starts with that
then
we have to split it up a little bit
and this is what we've seen before
because of the whoops src
um we've seen this because of situations
where we have those like ants that have
a
a size and
or in some kinds they don't sometimes
just straight up sometimes it's an int
and it has a default size which can be a
problem so we're going to fix that in
both of these
just to make sure that they are
we're doing an apple to apple comparison
and so we have to do a little bit of
conversion for those
and then we don't worry about that
and then um
[Music]
we're just going to go ahead and now
compare the two
and uh let's see
do we want this uh
yeah so we're gonna say
here
oh and this is just
oh actually we can do this so we're
going to say
let's get
let's get rid of this
let's take this uh so we're gonna do
that otherwise we're gonna say if these
things are true
um if those two are equal if source then
uh oh if they're not equal then we're
gonna have to update it and they're
gonna walk through our indexes
so then what we're gonna do is say okay
so now
what do we do if we have to update it
so now what we wanna do is we're gonna
come down here
and we're going to say
uh this is not going to call row uh so
the name
is going to be
uh source
oh
call source
i'm sorry current source id uh zero
so that's our name
uh the type
is again we can just do um
let's see did we we decoded those each
time right so we have
we're doing the decodes here
oh so this is for each one
so
really what we want to do
oh this is if update okay i'm sorry if
update so uh name equals that
um
so this is going to be
uh
let's go ahead and do it this way
source column
apologies i'm sort of thinking on the
fly here a little bit as i'm going
through this so name is going to be here
the
column type
that's going to be source
column
[Music]
1. and we probably want to do
actually let's do it this way
student name equals this
type equals this
what else do we need
we need default
uh
so he's gonna be that
okay so we can do default up here
we need to enable name type column type
whoops
where'd you go
name type
default and
so table is good so i just need column
let's call them column type instead
and then what i'm going to do
is
here
i'm going to do this real quick so i'm
going to say
if
well if it's equal to 1
then i know that the name is equal to
oops i'm sorry that's not going to be it
let's not be that that's going to be
here
and so when i walk out of this if the
index is 1 then the column type
whoops is equal to the current source
if
[Music]
i guess i can do an else here
else out let's do lift
idx equals zero
if a0
then the name
is equal to the current source
whoops
and so we're going to do is we're going
to build out our little string if we
need to
which we don't yes maybe you could say
it's just it's a bit of a pain but it
also saves us walking through some crap
later so uh zero one
uh next is the default so four
then default equals current source
and
let's see column type name and default
those are the three we need okay
so we really need that so those are our
three
um
do we need
yeah because right now we're not going
to worry about uh some of the other
values like a primary care anything like
that so we should be good with those
three
so then that means uh if we do an update
so did we already get that we don't have
the default okay so then we need to do
the default thing
then we need to do if those are in there
then we're going to change up our
default
and
then we do an offer table we're going to
print and we're going to update so let's
see what happens
so if we run that um
let's do
yeah we'll just sync rows we're gonna
leave that at 10 right now and let's go
see what this runs out to be
okay so we've got an issue somewhere in
here which is in the none okay so
uh alter t oh so we need to deal with
that none before we had that
where did we run into that here we go
uh
oh
so
let's update
so if
let's just do this because i think this
is just for the default maybe so if
default
equals none
then default
oh let's do this
this way
else
default equals that we've got to blank
it back out so if he's in there or he's
in there that's good oh
let's do this
see so if default is not equal to none
and
fault
not equal to that
okay there we go
let's see if that works okay so now it
doesn't uh blah blah blah oh we're still
getting default none
is that right
oh
it's probably here let's try that
let's just say if default
because otherwise it's not let's see
what happens here nope that's still not
going to give it to us
okay if he's not equal to null
then
default should be equal to that so let's
try this again let's see what happens
so we're still getting
this i'm getting all these nuns in here
okay
alter table modify oh we're not getting
the name so we didn't get the name right
uh
name name name name name name
ah we didn't get name
so
we're gonna say
um which we know right right up here
name is never going to be that problem
so name is just going to be equal to
destination column so you're actually
going to be source columns just to be
consistent
whoops
and now we're going to get that there we
go and now we're getting missing columns
is not defined
oh we've got that somewhere in that's in
sync columns which we don't need
oh i probably moved that from the wrong
place
where is where is it looking for missing
columns
no here we go nope that's right
all right missing columns
there we go
i've got that there
[Music]
oh here we go
sync columns does not mean
uh
that
right
yep
so if we do it there we go so we get all
our inserts we fly through and we're
gonna see some stuff here for example
where there's a varchar2
but here it was a uh a line it shouldn't
be
and we're gonna go through here
and we're gonna see we're gonna see some
interesting changes here so we're gonna
go through next time around and track
these down uh try to see what's going on
we're gonna try to get a little bit
better of a debug because i don't think
it's getting us
exactly like it's supposed to uh for
example like lookup type default
undefined
uh we probably need to like we probably
need to deal with that so we've got a
couple special things to deal with we're
gonna track that down next time though
so go out there and have yourself a
great day a great week and we will talk
to you
next time
[Music]
you