📺 Develpreneur YouTube Episode

Video + transcript

SQL Sync Part 24

2022-11-03 •Youtube

Detailed Notes

Focus for this episode: This starts a multi-part look into creating a hierarchy tree of related tables and how to pull related data into as part of a limited data sync.

This tutorial walks through the creation of a database synchronization tool in Python. It continues our Python and SQL Tutorials with a useful application that provides a deeper look at SQL structures and commands.

Repository For Code: git clone [email protected]:rbroadhead/dbsync.git

Transcript Text
thank you
[Music]
well hello and welcome back we're
continuing our series of my Sequel and
python tutorials in this episode we are
going to start digging into one of our
problem child kinds of things that we're
running into which is basically what we
want to do is we're going to be able to
we need to walk the foreign key
hierarchy because what we're going to
need to do is
find places where there is related data
that we're going to have to deal with
and this particularly deals with part
two here how to import related data for
a table because Max Rose is not always
going to work
if you set it to bring everything you
may get it to work but even then what
you can run into is
if you have constraints that exist
so unless you like blow away all your
constraints and then rebuild them
afterwards which is a possibility
but a little bit of a pain and can be
very time consuming instead what we're
going to do is as we are pulling data
and this is going to be both as we
delete data but then also as we pull
data we're going to need to find stuff
that is related for example if you have
a lookup
and uh you know it's let's just say it's
got three values yes no one maybe and
each of those values are you know
relates to one for yes no for two two
for no and three for maybe
and you've got another table that
utilizes that data
you need to be able to bring in the
related data so let's say that you only
bring in one row
and there's another table that uses that
lookup and it brings in its first row if
it uses no as a value two is the value
when you bring in the first row of your
lookup table and it only brings in one
then you're going to end up in an issue
because you're going to have this
data that does not have its proper
lookup or its related value
and so we're going to need to do is
figure out how to determine
you know in that case
is to go through that uh you know call
it like a parent or maybe a core type
table
figure out all the values that it uses
and then make sure that those values get
pulled in from the lookup table
now there are some ways we could get
around this kind of stuff we can do
things like identify tables that are
that are lookups for example
you may have something where you've got
a customer or contact and you turn
around and you say oh this contact hits
the address table so I have to make sure
that for all my contacts I pull in the
right address information well one thing
you could do is you could just pull in
all of the addresses and then only
utilize the ones that you know that
exist in the contact you could actually
even if you wanted sort of brute force
it
figure out you know bring everything in
from the address
fill it all out in the contact and then
delete everything in address that's not
being used in contact but then it gets
sort of it it really isn't that much
simpler because at the end of the day
what we really have to do is be able to
sort of walk that foreign key hierarchy
we need to figure out what our core
tables are
so sort of look at that this way is
we're going to be like so a
determine base or core tables
and then we're going to have to do grab
related data whoops
for the we'll call them look up tables
so in order to do this let's let's talk
in more work
so what we want to do is we're going to
go back and look at if you remember we
used the schema information schema
referential constraints to figure out
all of our
indexes and foreign key relationships so
if we come back here
and we're going to use this query and
take a look at that in this case I'm
going to go with this project over here
because if I do the tutorial there's
only like there's only like two that
exist let's see if I use tutorial
yeah we only have two uh if we use this
other one then now I've got a bunch and
this gives us just a little more data to
work with
now what we want to look in order by
here
so what we've got is our constraints
we've got these names which are really
in this case not as important
because what we want to do
is we're going to go through each of
these
and actually I wonder if this would show
how did I set this up in tutorial oh
okay I did
um so we'll keep it we'll really
simplify it now so what you're going to
do is you're going to see here in your
constraints that table
app user
references address
and then address references lookup type
so there is this in this case it's very
simply there's this like app user needs
address
needs
lookup type
and so what we need to do is figure out
how do we
build that out essentially
now what we need to do is we need to
find the
the leaf data first
if we if we want to build out our
hierarchy
um I guess well and I guess we can go
either way you can either start from
find this guy which would be a leaf your
lookup type
or start here and find your your base
tables because what's going to end up
happening is you're basically going to
say for each table you're going to say
hey if
this table
which would be here
if I have a table name that references
something so if I have an entry where my
that table
exists then that means it references
something
and so I can't pull that data in without
first pulling in the leaf data and so
what we could do and this is sort of to
get us an approach is what we're going
to do is we're going to do we're going
to go down to the leaf tables we're
going to pull in all the records
up to the row number that we give it for
those Leafs
and then as we go up to the next level
we're going to pull it in
and this is where I'm thinking through a
little bit we can pull in data but only
where the data exists on the leaf
and that may be what we want to do I got
to really think about that now actually
you know what let's flip that so here's
what we're going to do
uh I'll go back here just so we have
these little bit of notes so when we do
the max rows thing
what we're going to do is we're going to
apply Max
rows only to
core base tables
so if nothing refers to anything else
everything's a core table we only grab
the you know if it's 100 records we only
have to grab 100 records per row or per
table and we're done
however so we're gonna we're gonna get
those but instead we're going to need to
do
is we're going to need to say okay if
we're going to pull 100 first
we're going to need to take everywhere
that that refers to a child and make
sure that the child records are there
for that table which makes it a little
interesting because that means what we
have to do
is we're going to go here and let's say
we do 100 rows so we're gonna get 100
rows
for uh app user
but within those we're gonna that is
sort of like our start
and then we're going to use say so
grab the
address rows
needed
for the
app user 100
but for that we need to do grab
the look up type
rows
needed for address
and so we need to do is the first thing
we're going to do is we're going to go
through select let's look at this
uh in this specific case so what we
would do is we're going to come through
here
and we're going to say
so for the first thing what we need to
do is just to figure out this hierarchy
is we need to say we're going to find
this and this would be app user
where that
and
table name equals
app user
oh
lowercase
so we get in this case is we get our one
row and what we're going to do is we're
going to have this starting thing so
this is going to be the uh let's do it
this way we'll go in here and we're
going to have
where we do sync rows
let me do this a little differently
because what we're going to do is we're
going to say
okay this is going to be we're going to
call this we're going to call this smart
sink Rose
and this is going to get really nasty
really fast
because first thing we're going to do
okay so we have our Target
tables
so what we're going to need is here
we're going to be let's say
um imported tables because what we're
going to need to do
let's see if we got that is that right
uh
yeah we can do it that way
because what we're going to have to do
is we're going to go through each of
these
tables let's see oh we do a drop first
so we're going to do
since we're doing it this way so this is
empty table smart
we're going to call it smart empty sink
the rows
because what we're going to have to do
is first with each of these
uh which we can probably get away with
this
here
because we can truncate oh because we're
dropping our keys so we'll just we don't
have to worry about our Empty Tables so
we're going to go through and we're
going to empty everything
that's pretty easy to do that's a simple
thing but now
we're going to go and import data
and here we're not going to drop the
foreign Keys anymore
instead we're going to do this check to
see if this is referenced in a foreign
key so
uh let's see
so this thing says it is you know if I
come in through I'm going to say for
each row
that I get here
it's going to have to do something now
here I was dropping foreign keys I don't
want to do that what I want to do
instead is I'm going to say this so if a
table so if
uh length
of FK rows
we'll just say it's less than one
then
we can
go right ahead
and
load data
else
whoops
and let's just say uh
we'll just do print FK rows just so we
can do it
um well let's sit this way
print
load data
for
and this will be uh
the table name so table
and then if not then we need to
uh we'll do pause and
load for child tables
data for children
and then in this case
what I need to do is let me go back and
grab that
uh when I do the SMART stink here we go
so we're doing each of these
for each row
uh smart empty sink then we're gonna do
it for the children but here what we're
going to do
whoops
wrong key uh let's see so for each of
those I'm gonna so I can do the but what
I really want to do is
um
I want the which table do I want I want
the referenced table name
so this is going to be
referenced table name
because now what I need to do is this is
just going to be FK row 0.
and we're going to do it like that for
now
so what we're going to do is we're going
to come through here
and we're going to basically say hey
either
we load data for our table and let's go
back to mate so let's do
where was that smart sync
okay come right in here
smart oh
smart empty sink Rose
and quit
because that'll give us a little bit to
work with
and if we look whoop
before we're just doing print feature
Flags
I don't think I need that
so if we come in here
it doesn't matter what I give it
and it's going to come in here and it's
going to say none type so when I come
into smart empty sync 423
it's going to be because it's empty
where'd we go here
oh uh wait
in Maine
oh dbsync 423
oh because I don't have the target table
it's my mistake that's right I need to
do
uh whatever I do I need to do
I need to do compare tables no matter
what
so that we need to do
something I missed earlier Okay so
so I'm going to go into each of these
and now I'm seeing this uh
oh because I'm doing this for a couple
of databases
so I need to go back to my settings
and do oh because I did kickinator H
goals so I had my three okay
so what I'm doing here is we're going to
see where data clean
it's going to go through each of those
it's going to do our constraints
um
yes I already cleaned all of that stuff
out so I'm not gonna have to
re-synchronize my data but that's okay
and so now we can see here where we're
getting some of these we're going to get
a load data
but then we're going to come back here
and we'll get some of these we're going
to have to do our data sync first and
what we're going to end up doing is turn
this into a little bit of a
self-referential function so that we'll
be able to walk down low the
basically walk the tree and get to the
bottom load it and then work our way
back up and I think that's going to stop
this time I just want to get started
we're definitely getting into some uh
it's going to get pretty complicated
fairly quickly
but I want to step through it still in
some bite-sized chunks and start with
the idea that hey we're going to come in
here
and we're going to look at
so if we look for this one
we are going to look at what's
referenced we're going to find first
tables that do not have any references
those are free we're ready to go with
those but if they aren't then we're
going to have to walk that walk that
path walk down that tree and see where
that goes and that's we're going to pick
up next time so go out there and have
yourself a great day a great week and we
will talk to you next time
foreign
Transcript Segments
10.7

thank you

18.89

[Music]

27.359

well hello and welcome back we're

29.34

continuing our series of my Sequel and

32.46

python tutorials in this episode we are

35.579

going to start digging into one of our

38.579

problem child kinds of things that we're

41.1

running into which is basically what we

43.379

want to do is we're going to be able to

45.059

we need to walk the foreign key

46.86

hierarchy because what we're going to

49.14

need to do is

51.18

find places where there is related data

55.079

that we're going to have to deal with

56.94

and this particularly deals with part

59.64

two here how to import related data for

62.16

a table because Max Rose is not always

64.799

going to work

66

if you set it to bring everything you

69.84

may get it to work but even then what

72.299

you can run into is

74.4

if you have constraints that exist

77.58

so unless you like blow away all your

79.56

constraints and then rebuild them

81

afterwards which is a possibility

84.479

but a little bit of a pain and can be

87.36

very time consuming instead what we're

89.939

going to do is as we are pulling data

93.72

and this is going to be both as we

95.82

delete data but then also as we pull

97.439

data we're going to need to find stuff

99.479

that is related for example if you have

103.079

a lookup

104.64

and uh you know it's let's just say it's

107.1

got three values yes no one maybe and

110.64

each of those values are you know

112.38

relates to one for yes no for two two

115.079

for no and three for maybe

116.939

and you've got another table that

118.86

utilizes that data

121.079

you need to be able to bring in the

124.079

related data so let's say that you only

126.54

bring in one row

128.759

and there's another table that uses that

130.86

lookup and it brings in its first row if

133.56

it uses no as a value two is the value

136.98

when you bring in the first row of your

139.319

lookup table and it only brings in one

141.84

then you're going to end up in an issue

143.58

because you're going to have this

145.319

data that does not have its proper

147.9

lookup or its related value

150.54

and so we're going to need to do is

152.76

figure out how to determine

155.459

you know in that case

157.379

is to go through that uh you know call

161.28

it like a parent or maybe a core type

163.98

table

165.9

figure out all the values that it uses

168.66

and then make sure that those values get

170.76

pulled in from the lookup table

174

now there are some ways we could get

175.26

around this kind of stuff we can do

176.64

things like identify tables that are

179.42

that are lookups for example

183.66

you may have something where you've got

185.16

a customer or contact and you turn

187.98

around and you say oh this contact hits

189.959

the address table so I have to make sure

192.42

that for all my contacts I pull in the

194.519

right address information well one thing

197.04

you could do is you could just pull in

198.3

all of the addresses and then only

200.819

utilize the ones that you know that

203.4

exist in the contact you could actually

205.92

even if you wanted sort of brute force

208.26

it

209.64

figure out you know bring everything in

211.8

from the address

213.12

fill it all out in the contact and then

215.34

delete everything in address that's not

216.9

being used in contact but then it gets

219.599

sort of it it really isn't that much

222.239

simpler because at the end of the day

224.22

what we really have to do is be able to

226.26

sort of walk that foreign key hierarchy

228

we need to figure out what our core

230.4

tables are

232.379

so sort of look at that this way is

234.239

we're going to be like so a

238.5

determine base or core tables

244.319

and then we're going to have to do grab

248

related data whoops

253.68

for the we'll call them look up tables

258.419

so in order to do this let's let's talk

261.6

in more work

264.06

so what we want to do is we're going to

265.44

go back and look at if you remember we

268.44

used the schema information schema

271.88

referential constraints to figure out

273.9

all of our

275.72

indexes and foreign key relationships so

279.18

if we come back here

281.58

and we're going to use this query and

284.16

take a look at that in this case I'm

285.66

going to go with this project over here

287.52

because if I do the tutorial there's

289.919

only like there's only like two that

292.68

exist let's see if I use tutorial

298.56

yeah we only have two uh if we use this

302.58

other one then now I've got a bunch and

305.16

this gives us just a little more data to

306.72

work with

308.4

now what we want to look in order by

311.34

here

312.419

so what we've got is our constraints

315.9

we've got these names which are really

317.52

in this case not as important

319.86

because what we want to do

322.02

is we're going to go through each of

323.34

these

324.479

and actually I wonder if this would show

327.36

how did I set this up in tutorial oh

330.06

okay I did

331.8

um so we'll keep it we'll really

333.12

simplify it now so what you're going to

335.28

do is you're going to see here in your

337.08

constraints that table

339

app user

340.62

references address

343.68

and then address references lookup type

347.22

so there is this in this case it's very

350.82

simply there's this like app user needs

354.3

address

356.96

needs

358.86

lookup type

361.44

and so what we need to do is figure out

363

how do we

364.44

build that out essentially

367.8

now what we need to do is we need to

369.78

find the

371.28

the leaf data first

374.94

if we if we want to build out our

377.58

hierarchy

379.259

um I guess well and I guess we can go

380.58

either way you can either start from

382.139

find this guy which would be a leaf your

384.18

lookup type

386.639

or start here and find your your base

390

tables because what's going to end up

391.86

happening is you're basically going to

393.72

say for each table you're going to say

395.16

hey if

396.72

this table

398.46

which would be here

401.22

if I have a table name that references

403.8

something so if I have an entry where my

406.08

that table

407.22

exists then that means it references

410.34

something

411.3

and so I can't pull that data in without

413.819

first pulling in the leaf data and so

417.84

what we could do and this is sort of to

420.18

get us an approach is what we're going

423.3

to do is we're going to do we're going

425.759

to go down to the leaf tables we're

427.86

going to pull in all the records

430.199

up to the row number that we give it for

433.68

those Leafs

435.36

and then as we go up to the next level

437.22

we're going to pull it in

440.46

and this is where I'm thinking through a

441.78

little bit we can pull in data but only

445.8

where the data exists on the leaf

449.18

and that may be what we want to do I got

452.28

to really think about that now actually

453.539

you know what let's flip that so here's

455.52

what we're going to do

457.099

uh I'll go back here just so we have

459.539

these little bit of notes so when we do

461.88

the max rows thing

464.4

what we're going to do is we're going to

465.66

apply Max

467.699

rows only to

471.38

core base tables

475.38

so if nothing refers to anything else

477.72

everything's a core table we only grab

480.12

the you know if it's 100 records we only

482.28

have to grab 100 records per row or per

484.68

table and we're done

487.44

however so we're gonna we're gonna get

490.38

those but instead we're going to need to

492.24

do

492.9

is we're going to need to say okay if

494.4

we're going to pull 100 first

497.22

we're going to need to take everywhere

500.699

that that refers to a child and make

503.16

sure that the child records are there

505.62

for that table which makes it a little

508.319

interesting because that means what we

509.759

have to do

511.44

is we're going to go here and let's say

513.479

we do 100 rows so we're gonna get 100

515.039

rows

516.779

for uh app user

520.44

but within those we're gonna that is

523.08

sort of like our start

526.08

and then we're going to use say so

529.88

grab the

532.019

address rows

534.72

needed

536.16

for the

538.44

app user 100

540.899

but for that we need to do grab

543.72

the look up type

547.26

rows

548.88

needed for address

554.04

and so we need to do is the first thing

556.44

we're going to do is we're going to go

557.519

through select let's look at this

559.86

uh in this specific case so what we

563.04

would do is we're going to come through

564.24

here

565.86

and we're going to say

569.339

so for the first thing what we need to

570.779

do is just to figure out this hierarchy

574.2

is we need to say we're going to find

576.6

this and this would be app user

580.32

where that

583.14

and

585.42

table name equals

588.72

app user

590.88

oh

593.88

lowercase

595.56

so we get in this case is we get our one

598.68

row and what we're going to do is we're

600.6

going to have this starting thing so

602.94

this is going to be the uh let's do it

605.58

this way we'll go in here and we're

607.26

going to have

608.58

where we do sync rows

612.6

let me do this a little differently

613.62

because what we're going to do is we're

615.42

going to say

622.38

okay this is going to be we're going to

624.3

call this we're going to call this smart

626.399

sink Rose

628.2

and this is going to get really nasty

630.6

really fast

634.32

because first thing we're going to do

637.08

okay so we have our Target

639.72

tables

641.88

so what we're going to need is here

644.82

we're going to be let's say

648.12

um imported tables because what we're

650.64

going to need to do

654.54

let's see if we got that is that right

656.339

uh

659.579

yeah we can do it that way

662.279

because what we're going to have to do

663.66

is we're going to go through each of

665.88

these

667.56

tables let's see oh we do a drop first

672.959

so we're going to do

677.1

since we're doing it this way so this is

679.44

empty table smart

683.339

we're going to call it smart empty sink

685.14

the rows

688.2

because what we're going to have to do

689.64

is first with each of these

694.44

uh which we can probably get away with

697.019

this

698.64

here

701.64

because we can truncate oh because we're

703.2

dropping our keys so we'll just we don't

704.88

have to worry about our Empty Tables so

706.26

we're going to go through and we're

706.98

going to empty everything

710.04

that's pretty easy to do that's a simple

711.899

thing but now

713.88

we're going to go and import data

717.36

and here we're not going to drop the

719.82

foreign Keys anymore

722.12

instead we're going to do this check to

724.62

see if this is referenced in a foreign

726.06

key so

729.32

uh let's see

732.54

so this thing says it is you know if I

736.079

come in through I'm going to say for

737.88

each row

739.32

that I get here

742.2

it's going to have to do something now

744.66

here I was dropping foreign keys I don't

746.519

want to do that what I want to do

750.5

instead is I'm going to say this so if a

753.48

table so if

756.12

uh length

759

of FK rows

763.86

we'll just say it's less than one

766.2

then

767.82

we can

769.5

go right ahead

771.6

and

773.82

load data

778.32

else

782.279

whoops

784.019

and let's just say uh

787.86

we'll just do print FK rows just so we

790.92

can do it

793.38

um well let's sit this way

797.279

print

800.459

load data

802.68

for

804.72

and this will be uh

808.38

the table name so table

812.82

and then if not then we need to

819.839

uh we'll do pause and

823.32

load for child tables

833.24

data for children

837.06

and then in this case

841.079

what I need to do is let me go back and

844.079

grab that

846.36

uh when I do the SMART stink here we go

850.5

so we're doing each of these

855.24

for each row

860.279

uh smart empty sink then we're gonna do

863.88

it for the children but here what we're

865.62

going to do

869.519

whoops

871.56

wrong key uh let's see so for each of

874.62

those I'm gonna so I can do the but what

876.839

I really want to do is

881.459

um

883.8

I want the which table do I want I want

887.399

the referenced table name

890.82

so this is going to be

894.079

referenced table name

899.1

because now what I need to do is this is

901.86

just going to be FK row 0.

910.5

and we're going to do it like that for

911.94

now

913.74

so what we're going to do is we're going

914.82

to come through here

916.5

and we're going to basically say hey

919.98

either

922.26

we load data for our table and let's go

924.899

back to mate so let's do

928.079

where was that smart sync

932.76

okay come right in here

938.959

smart oh

943.079

smart empty sink Rose

948

and quit

952.019

because that'll give us a little bit to

953.76

work with

954.899

and if we look whoop

957.24

before we're just doing print feature

959.519

Flags

961.38

I don't think I need that

963.899

so if we come in here

966.48

it doesn't matter what I give it

968.94

and it's going to come in here and it's

970.26

going to say none type so when I come

972.06

into smart empty sync 423

975.66

it's going to be because it's empty

977.94

where'd we go here

981.899

oh uh wait

985.32

in Maine

987.24

oh dbsync 423

991.98

oh because I don't have the target table

993.48

it's my mistake that's right I need to

995.579

do

1000.98

uh whatever I do I need to do

1007.82

I need to do compare tables no matter

1010.1

what

1012.74

so that we need to do

1016.579

something I missed earlier Okay so

1025.28

so I'm going to go into each of these

1027.02

and now I'm seeing this uh

1033.38

oh because I'm doing this for a couple

1035.299

of databases

1037.1

so I need to go back to my settings

1041.199

and do oh because I did kickinator H

1044.839

goals so I had my three okay

1047.24

so what I'm doing here is we're going to

1049.16

see where data clean

1052.76

it's going to go through each of those

1055.46

it's going to do our constraints

1058.34

um

1062.24

yes I already cleaned all of that stuff

1064.94

out so I'm not gonna have to

1068.24

re-synchronize my data but that's okay

1069.86

and so now we can see here where we're

1072.2

getting some of these we're going to get

1073.28

a load data

1075.2

but then we're going to come back here

1076.88

and we'll get some of these we're going

1078.62

to have to do our data sync first and

1081.86

what we're going to end up doing is turn

1083.66

this into a little bit of a

1086.72

self-referential function so that we'll

1089

be able to walk down low the

1091.66

basically walk the tree and get to the

1095.12

bottom load it and then work our way

1097.039

back up and I think that's going to stop

1099.799

this time I just want to get started

1101.66

we're definitely getting into some uh

1103.88

it's going to get pretty complicated

1105.38

fairly quickly

1107.299

but I want to step through it still in

1109.22

some bite-sized chunks and start with

1111.74

the idea that hey we're going to come in

1113.6

here

1115.52

and we're going to look at

1118.64

so if we look for this one

1123.44

we are going to look at what's

1124.88

referenced we're going to find first

1126.26

tables that do not have any references

1128.36

those are free we're ready to go with

1130.28

those but if they aren't then we're

1132.08

going to have to walk that walk that

1134.12

path walk down that tree and see where

1136.82

that goes and that's we're going to pick

1138.26

up next time so go out there and have

1140.12

yourself a great day a great week and we

1142.76

will talk to you next time

1145.88

foreign