Detailed Notes
Focus for this episode: This is a step into building out our hierarchy and determining which tables make sense if rows copied are limited and which ones need another approach.
This tutorial walks through the creation of a database synchronization tool in Python. It continues our Python and SQL Tutorials with a useful application that provides a deeper look at SQL structures and commands.
Repository For Code: git clone [email protected]:rbroadhead/dbsync.git
Transcript Text
thank you [Music] well hello and welcome back we're continuing our series of tutorials SQL python sort of Blended together in our database synchronization application and we're getting into walking through our relational hierarchy in order to grab data as we need it particularly when we limit cert you know have a certain number of rows we want to pull across from our source to our destination we need to make sure that we look up the uh the related rows and pull those across as well so it's going to get a little tricky um we started out first time last time we just sort of got some basic stuff together so what you may see is uh let me flip back over here we use this guy quite a bit when we were grabbing our constraint name and table name and let's uh switch us over to giganator if that's a database we're going to be using today and whoop helps to get the name right there we go and so we've got a fair amount of I think that's the one I'm using uh uh yeah that's what I'm using we've got a fair amount of tables we've got a fair amount of relationships we've got a lot of these lkp lkp lookup type tables among others got a lot of these like performer performance which ties a performer to a performance and of course has its keys now what we're doing last time is we had this little smart empty sink so basically we're going to do is we're going to say hey we are going to come through and we're going to figure out if a table is something we can go ahead and load the data for it right away that it's in our clean section or is it not and what we're going to do is we're going to do let's do uh I want to see how do we want to do this so the the first thing we need to do which we did last time was I sort of did this load data and load data for children so when we run it and this only runs for one but we can see here and I tweaked it a little since our last call if I can jump up to the top of this somewhere here we go connect to the database so load data for children so for performer and I did let's see the table name is the last row so for the contract table the performer is tied to it so there is a foreign key from contract back to performer which means that performer has to be loaded before contract is it also means that we may have some issues deleting data now one thing we did in the past is we just went through if you go think this is the sink Rose is we just said we will just clear everything out we're going to drop the foreign keys we're going to truncate all the data if we want to we can do that and what we'll do is let's just say let's create a new one and we're going to call it drop data and it's also going to be self it doesn't need Max rows it does need those two and it's basically that piece whoops but now I got two this so that it all lines up right so now I've got this drop data and what I'm going to do here is instead of doing this I'm going to take it out of the sink Rose it is not in the Sim empty sink and then I'm going to put it here and this is basically because I don't want to accidentally do stuff sometimes and let's add a feature flag of d for drop and uh we'll do that and if D is in there then we'll do it okay so we'll have to drop we would have to sync it first so we should be good did we have an all here uh we did have it all but I think we can just do like this there we go and that'll get us some stuff going there okay I think we are good so now back to this so we've got our you know get our comparison of tables that we need to do and then we're going to come into that's not main it's over here now what we want to do for each of these is we can either load the table data right away or we're going to have to get the children first so for the data that we can load right away which means that it does not have a parent to worry about then all we need to do here is let's see we're going to do the data sync and actually what we probably will do um so here we do it by the limit okay so we're going to do we're going to create like a little one here and so this is going to be uh load table data and it's probably going to need some information so rotate the table the load table data we're going to give it a table and oh we're gonna need our source and our destination which is sort of par for the course so let's do this let's get our cleanup here let's get our startup up here and then we're going to go in we're going to load table data which is uh down here uh oh that's not the one we want we don't want create missing there we go smart empty sink so here what we're going to do um I don't know that it matters but we're going to go ahead and put this in ahead of it just so it's there okay so now what we can do is we can come in and we can say if oh that's missing tape I want to go back up uh where is okay what did I just do to create missing tables sync rows load table data smart empty sync rows okay let's just do it this way so we want to take that and we want to push that after load table data okay so now we're going to come in here and here we're going to just do the load so what we can do is we can do uh self dot load table data we're going to give it the table name and we're off and we're oh we do want to give it and we have to give it Max Rose whoops comma Max rows so in this case here we go so this is a straightforward one and in this case this is basically what we had before uh let me go to a sink Rose it's going to do the same thing and it's going to do whoops load table data for table comma Max rows okay so now we're pulling that I see something weird up there nope okay drop rows sync so now we're going to do a load table data we come in here now this is gonna be a little different for our children so the last thing we're going to do let's see is so in this case it is a child which means so for what for these what we need to do is we need to uh make sure parent is loaded first and what we want to do is this is going to be imported tables so when we get done here we're going to do uh once we're done we're going to import tables dot append table so we're going to go through each of the parents in this case which I think is the first row uh let's see he's right yeah okay so we're going to go for each of those we're going to say parent equals just to do it this way and then what we want to do is say and how do we want to do this so we want to make sure that the parent does a smart oh do we want to do it like that we want to see if the parent has parents so we probably oh this is a good one because we could just keep calling this but we need to call it for each table instead what we want to do is we want to see so then what we need to do let's see we're going to do attempt to load parent and that's going to be our load table data except we're going to have one now it's a little different that's going to be unlimited let's go here because it's going to be a whole different this will be load parent table data and all I'm going to get here uh I will have to pass up Max rows and so this is going to get a little tricky because what I'm going to do is we're gonna have to run this check SQL again for that one and let's see so the first thing I want to do is let's see check if parent is loaded so if parent and imported tables whoop then and I want to say cool tells trying to load the parent so the parent and now we can just say so let's see so that's for each of those and then once we've gone through each of the parents and loaded our way down then we can do uh load this table data so we probably this becomes interesting because what we really don't want to do what we'd like to do is avoid making this call over and over and over again because that's technically what we're going to have to do each time so if we look at that what we could do is instead of doing it based off that we could actually store each of these as a key as a pair so you'd have a table and you'd have what its reference table is and then figure out what its parents are it seems like that would be and we're going to call it here anyway so let's just do that and so this is now going to be what we should probably do is for each of those try to load it this is where we're sort of thinking on the fly a little bit here because it's like what do we really want to do in order to get our data go through all of our tables and then make sure that the parents are hit first now what we're doing here is we're just saying okay I'm going to get all of the top parent rows uh first well it's really not doing that so actually let's do it this way so let's see so okay I'm going to change this up a little bit so what we're going to do first is we're going to import the data for parent rows and then and in this case it's going to be Max rows we know we're good so every parent is cool we're going to get it's um X number of rows and that's the load table that we've done so far and all is good now what we need to do here is essentially the same thing um let's see so for each table we're going to do that okay so now we're going to do the same thing this time we know it is going to be um this is going to be okay well we'll leave it I'll say import data so these are parents or these are root root tables and then we're going to come here and now we're going to do it for uh non-root tables and so this is going to be a little more tricky because what we're going to do here is we're going to say for each of these we're going to come in and we're going to load table but we're going to need to do it for uh we don't really care what Max rows are anymore so we're going to do a table and parent and this is going to be that load parent table data whoop I hope that's right did we have it right like that um well we're gonna go like this so let's do it so load parent table data should be table and parent did I do it in that direction did I do it in that order yep table and parent and then I get to append it and say I'm good actually I probably should send imported tables just to be sure because I don't I can skip it if it's already imported yep yep yep and so now if it is greater than zero so if it shows up once or more now I have to do something a little different okay so it's that load data for channel children I can get rid of all of that and I need to make the parent which is parent equals uh what is that f k whoop oh I have to do that for each of the oh for so I kind of did it before let's see let's just call it for Row in FK rows and then parent equals row uh it's going to be the so oh did I do I did the wrong thing okay so now we're going to do is we're going to come through and for each of these we're now going to call this load parent table data and we're going to give it the table and the parent and then in here the first thing we're going to do is we're going to see oh and the imported rows um uh can we check it yeah we probably want to check it here because what we're going to do here is we're going to say if parent in imported rows then we're just going to return otherwise we're going to come in here and we're going to first so we're going to have a couple things we have to look at here because now what we're going to need to do is see if they have a parent which is basically the same thing here it's going to look fairly similar so I may just take the whole thing uh but I probably won't so okay so first let's well I know I'm um wish washy this but let's go here let's do this check SQL because we're going to need this so we have to see if oh we're gonna have to actually do this for a couple times so this would be good so what we're going to do is we're going to come in here and uh uh we don't need to do four table because we're gonna do it just for this one table and for that table we're going to figure out what it has and for each of those we're going to say if it's uh one then we're going to have to do a special uh do special data load is that right let's see this can get tricky because what we're going to need to do is we're going to say okay if our parent has been loaded that's all we need right now so we have two cases let's see let's go back up here so if parent is loaded actually if all parents because you can have more than one all parents loaded load based on data in parent else we're going to have to go in and uh load the parent or yeah load each parent not loaded so it's going to get a little bit tricky as we start running through this and I think this is a good point to step place to stop we will come back around to this and start digging into it it becomes like I said we're slowing down a little bit here because we're getting into some tricky stuff but I wanted to sort of show the thought process as we're going through this otherwise you're going to end up with a whole bunch of code we're going to fly through and I think it'll be harder to follow but uh as always you can check out the show notes if you want some if you have questions shoot us an email if you have any questions uh info developpinor.com and other than that we're going to let you get back to your day we will be back next time and we're going to continue trudging our way through this one till then go out there and have yourself a great day a great week and we will talk to you next time thank you foreign
Transcript Segments
thank you
[Music]
well hello and welcome back we're
continuing our series of tutorials SQL
python sort of Blended together in our
database synchronization application and
we're getting into walking through our
relational hierarchy in order to grab
data as we need it particularly when we
limit cert you know have a certain
number of rows we want to pull across
from our source to our destination we
need to make sure that we look up the
uh the related rows and pull those
across as well so it's going to get a
little tricky
um we started out first time last time
we just sort of got some basic stuff
together so what you may see is
uh let me flip back over here we use
this guy quite a bit when we were
grabbing our constraint name and table
name
and let's uh switch us over to giganator
if that's a database we're going to be
using today
and whoop
helps to
get the name right there we go
and so we've got a fair amount of I
think that's the one I'm using
uh
uh yeah that's what I'm using we've got
a fair amount of tables we've got a fair
amount of relationships we've got a lot
of these lkp lkp lookup type tables
among others got a lot of these like
performer performance which ties a
performer to a performance and of course
has its keys
now what we're doing last time is we had
this little smart empty sink so
basically we're going to do is we're
going to say hey
we are going to come through and we're
going to figure out if
a table is something we can go ahead and
load the data for it right away that
it's in our
clean section
or is it not
and what we're going to do is we're
going to do let's do uh
I want to see how do we want to do this
so the the first thing we need to do
which we did last time was I sort of did
this
load data and load data for children so
when we run it
and this only runs for one but we can
see here
and I tweaked it a little since our last
call if I can jump up to the top of this
somewhere here we go connect to the
database so load data for children so
for performer
and I did let's see the table name is
the last row so for the contract table
the performer
is tied to it so there is a foreign key
from contract back to performer which
means that performer has to be loaded
before contract is
it also means that we may have some
issues deleting data now one thing we
did in the past is we just went through
if you go think this is the sink Rose
is we just said we will just clear
everything out we're going to drop the
foreign keys we're going to truncate all
the data
if we want to we can do that and
what we'll do is let's just say
let's create a new one and we're going
to call it
drop data
and it's also going to be self it
doesn't need Max rows
it does need those two
and it's basically that piece whoops but
now I got two
this so that it all lines up right so
now I've got this drop data
and what I'm going to do here is instead
of doing this
I'm going to take it out of the sink
Rose
it is not in the Sim empty sink and then
I'm going to put it
here and this is basically because I
don't want to accidentally do stuff
sometimes and let's add a feature flag
of d
for drop
and
uh
we'll do that and if D is in there then
we'll do it okay
so
we'll have to drop we would have to sync
it first so we should be good did we
have an all here
uh we did have it all but I think we can
just do like this
there we go and that'll get us some
stuff going there
okay I think we are good so now back to
this so we've got our you know
get our comparison of tables that we
need to do and then we're going to come
into that's not main it's over here
now what we want to do for each of these
is
we can either load the table data right
away
or we're going to have to get the
children first so for the data that we
can load right away which means that it
does not have
a parent to worry about
then all we need to do here is let's see
we're going to do the data sync and
actually what we probably will do
um
so here we do it by the limit okay
so we're going to do
we're going to create like a little one
here and so this is going to be uh
load table data
and it's probably going to need
some information so rotate the table the
load table data
we're going to give it a table
and
oh we're gonna need our source and our
destination
which is sort of par for the course so
let's do this let's get our cleanup here
let's get our startup up here
and then we're going to go in we're
going to load table data which is uh
down here
uh oh that's not the one we want we
don't want create missing there we go
smart empty sink
so here what we're going to do
um
I don't know that it matters but we're
going to go ahead and put this in ahead
of it just so it's there
okay
so now what we can do is we can come in
and we can say if
oh that's missing tape I want to go back
up uh where is
okay what did I just do
to create missing tables sync rows
load table data smart empty sync rows
okay
let's just do it this way so we want to
take that
and we want to push that after load
table data
okay
so now we're going to come in here and
here we're going to just do the load so
what we can do is we can do
uh self dot load table data we're going
to give it the table name
and we're off and we're oh we do want to
give it
and we have to give it Max Rose whoops
comma Max rows
so in this case
here we go
so this is a straightforward one
and in this case this is basically what
we had before
uh let me go to
a sink Rose
it's going to do the same thing
and it's going to do whoops
load table data
for table
comma Max rows okay
so now we're pulling that
I see something weird up there
nope okay
drop rows sync so now we're going to do
a load table data we come in here now
this
is gonna be a little different for our
children
so the last thing we're going to do
let's see is
so in this case it is a child which
means so for what for these what we need
to do is we need to
uh
make sure parent is loaded
first
and what we want to do is this is going
to be imported tables so when we get
done here we're going to do uh once
we're done we're going to import tables
dot append table
so we're going to go through each of the
parents in this case which I think is
the first row
uh let's see he's right yeah okay
so we're going to go for each of those
we're going to say parent equals just to
do it this way
and then what we want to do is say
and how do we want to do this so we want
to make sure
that the parent does a smart
oh do we want to do it like that we want
to see if the parent has parents
so we probably
oh this is a good one because we could
just keep calling this but we need to
call it for each table
instead what we want to do
is we want to see
so then what we need to do
let's see
we're going to do
attempt to load
parent
and that's going to be our load table
data
except we're going to have one now
it's a little different
that's going to be unlimited let's go
here
because it's going to be a whole
different
this will be load
parent table data
and all I'm going to get here
uh I will have to pass up Max rows
and so this is going to get a little
tricky because what I'm going to do is
we're gonna have to run this check SQL
again
for that
one
and let's see
so the first thing I want to do is let's
see
check if parent is loaded
so if
parent and imported tables
whoop
then
and I want to say cool
tells
trying to load the parent
so the parent
and now
we can just say so let's see so that's
for each of those and then once we've
gone through each of the parents and
loaded our way down then we can do
uh
load this table data
so we probably
this becomes interesting because what we
really don't want to do what we'd like
to do is avoid making this call
over and over and over again
because that's technically what we're
going to have to do each time
so
if we look at that what we could do is
instead of doing it based off that we
could actually store
each of these as a key as a pair so
you'd have a table and you'd have what
its reference table is
and then figure out what its parents are
it seems like that would be
and we're going to call it here anyway
so let's just do that
and so this is now going to be
what we should probably do is for each
of those try to load it
this is where we're sort of thinking on
the fly a little bit here because it's
like what do we really want to do
in order to get our data go through all
of our tables and then make sure that
the parents are hit first
now what we're doing here is we're just
saying okay I'm going to get all of the
top parent rows
uh first well
it's really not doing that
so actually let's do it this way
so let's see so okay
I'm going to change this up a little bit
so what we're going to do first is we're
going to import the data for parent rows
and then and in this case it's going to
be Max rows we know we're good so every
parent is cool we're going to get it's
um
X number of rows
and that's the load table that we've
done so far and all is good
now what we need to do here is
essentially the same thing
um
let's see so for each table we're going
to do that okay so now we're going to do
the same thing
this time
we know it is going to be
um
this is going to be okay well we'll
leave it I'll say import data so these
are parents or these are root root
tables
and then
we're going to come here
and now we're going to do it for
uh
non-root tables
and so this is going to be a little more
tricky because what we're going to do
here is we're going to say
for each of these we're going to come in
and we're going to load table but we're
going to need to do it for uh we don't
really care what Max rows are anymore
so we're going to do a table
and parent
and this is going to be that load
parent table data
whoop
I hope that's right did we have it right
like that
um
well we're gonna go like this so let's
do it so load parent table data
should be table
and parent did I do it in that direction
did I do it in that order
yep table and parent
and then I get to append it and say I'm
good actually I probably should send
imported tables just to be sure because
I don't I can skip it if it's already
imported
yep yep yep
and so now
if it is greater than zero
so if it shows up once or more now I
have to do something a little different
okay so it's that load data for channel
children I can get rid of all of that
and I need to make the parent
which is parent
equals uh what is that
f k
whoop
oh I have to do that for each of the oh
for
so I kind of did it before let's see
let's just call it for
Row in
FK rows
and then parent equals
row
uh it's going to be the so oh
did I do I did the wrong thing okay so
now we're going to do is we're going to
come through and for each of these we're
now going to call this load parent table
data
and we're going to give it the table and
the parent
and then in here the first thing we're
going to do is we're going to see oh and
the
imported rows
um
uh can we check it yeah we probably want
to check it here because what we're
going to do here is we're going to say
if
parent in
imported rows
then we're just going to return
otherwise we're going to come in here
and we're going to first so we're going
to have a couple things we have to look
at here
because now what we're going to need to
do is
see if
they have a parent
which is basically the same thing here
it's going to look fairly similar so I
may just take the whole thing
uh but I probably won't so okay so first
let's well
I know I'm um wish washy this but let's
go here let's do this check SQL because
we're going to need this
so we have to see if
oh we're gonna have to actually do this
for a couple times so this would be good
so what we're going to do is we're going
to come in here
and uh
uh we don't need to do four table
because we're gonna do it just for this
one table
and for that table we're going to figure
out what it has and for each of those
we're going to say if it's
uh one then we're going to have to do a
special uh
do special
data load
is that right
let's see
this can get tricky because what we're
going to need to do is we're going to
say okay if our parent has been loaded
that's all we need right now
so we have two cases let's see let's go
back up here so
if parent is loaded
actually if all parents because you can
have more than one all parents
loaded
load
based on
data in parent
else
we're going to have to go in and uh load
the parent
or yeah load
each parent not loaded so it's going to
get a little bit tricky as we start
running through this and I think this is
a good point to step place to stop we
will come back around to this and start
digging into it it becomes like I said
we're slowing down a little bit here
because we're getting into some tricky
stuff but I wanted to sort of show the
thought process as we're going through
this otherwise you're going to end up
with a whole bunch of code we're going
to fly through and I think it'll be
harder to follow
but uh as always you can check out the
show notes if you want some if you have
questions shoot us an email if you have
any questions uh info developpinor.com
and other than that we're going to let
you get back to your day we will be back
next time and we're going to continue
trudging our way through this one till
then go out there and have yourself a
great day a great week and we will talk
to you
next time
thank you
foreign