📺 Develpreneur YouTube Episode

Video + transcript

SQL Sync Part 13

2022-09-27 •Youtube

Detailed Notes

Focus for this episode: We wrap up our first pass at index creation and synchronization.

This tutorial walks through the creation of a database synchronization tool in Python. It continues our Python and SQL Tutorials with a useful application that provides a deeper look at SQL structures and commands.

Repository For Code: git clone [email protected]:rbroadhead/dbsync.git

Transcript Text
[Music]
well hello and welcome back we're
continuing our series looking at sql and
python so combined into this database
synchronization thing
and last episode we wrapped up
we're getting into
multi-column indexes and looking at how
to sync those so for example
in this case we've got one it's called
horse underscore b underscore o
which is on the horses table and it ties
breeder and owner together
and so what we need to be able to do is
what we're going to want to be able to
create
is effectively
this line is we want to be able to do
create index give it the name on the
table and then we're going to give it
the two
what do i call it owner breeder i'm
sorry owner yeah
and we're going to give it the two
columns
now
in order to do so
what we're going to do and this is where
we
paused last time because we went a
little bit long
so we're going to do is we're coming
through each of the source indexes
we're going to grab the name which would
be horse underscore b underscore o
and then we've got this destinations
dest names
array
and we're going to say if it's not in
there
if the name is if we haven't already
found this
column
or this index
then we're keeping track of index names
so we're going to create an array called
index names which already exists up
there
so let's do it this way i should be able
to do it this way let's see
let's try it like that
and it doesn't like in a pin somewhere
there
um
list indices must be indexed oh
well let's do that
and if i do that then i should still be
okay
nope oh because it's a list indices so
if i do that
oh
which is fine because what we do is i'm
going to create this
and i'm going to add it okay i'm good
uh sorry it's a little bit sidetracked
there so
i've got this dictionary which is going
to be keyed off of the index name and
then for each of those i'm going to have
an array that is going to be
each of the
columns so i'm grabbing this is what
that idx row 4 is
or at index number 4 that is the column
name so what i'm going to do is then so
i can look here and i can see
when i'm done print out idx names and so
i can see here
that i have one
and it's got two different columns so
basically for each one of those i'm
going to do four name in idx names
and in this case
let's leave that one there for now what
i want to do is i don't want it to be
here what i want to do is for each of
the names in index names i am going to
do for
column name
in
name
so now this is that array
i am going to do uh let's do let's see
nate let's do
column
string let's do column list
oh i can probably do
uh i may be able to do this a little
easier because i can probably say take
uh so for each of the names i'm going to
do
i want to do
idx names
name
and let's see what that does
and let's convert that to a string i
don't know if it's gonna even like that
but let's see what happens because i
think what i want to do is actually do
that join
because i bet i can do that so if i do
this
uh here we go okay
so he's going to come in and he said
it's not quite what i want
so i want
i think i can just do
join
can i do that
let's see what that looks like
uh see join takes exactly one argument
let's try this
oops nope
let's go look up our join real quick
because i think that's what we want to
do so we're going to do a python join i
just forget how to do it right which
sometimes happens
okay so join is going to be oh
uh i do it something and then join it
okay
so i should be able to take um
comma
dot join this
let's see what that looks like
oh no it doesn't like that
um
interesting
so that's not quite going to get us what
we need so i'm going to have to do this
a little more interesting okay and we
have to go with my original approach
which is basically going to be okay so
then
uh for each of the names then i'm going
to do 4
call name in
whoops
idx names
well let's do this
call name let's just see what it does
because i think
it's not going to like that or it's not
going to care because think name is
yeah so it takes that whole
so it can't be a name it can be in
idx names
name
so the dictionary is not callable so it
is a dictionary object
so let's do
so
what i'm missing is something so let's
use python
iterate dictionary
i need to write through dictionary here
we go
oh so here's what we're going to do for
here's what we want to do
we want to do
for that dot keys
and it is a function yep okay so we're
gonna take for each of the keys
then we're gonna do uh for column name
in
idx names
and that's going to be
a name
so now
we should see this work
uh it is not because we're saying that
this is not callable so this is on
102.
so let's do a string
oops wait it's not that way come on
mistake
darn it that's not it
so what happens when you put the wrong
stuff in there so now let's try it
okay there we go
let's see so it came in and it gave us
two columns so
that's gonna be column name so we're
gonna do
um
let's do column let's do it this way
we'll build it out like we normally have
so column name
equals nothing
and then we're going to say if
column name equals nothing
then
column name equals
um
i call it column names
equals
column name
so if he's empty and we're good
there we go
else
then column name equals column names
plus
comma
plus column name oh and for each of
these
um
we want to go ahead and do our little
brackety thing so we're going to do this
and do a little
that just to protect our name and so
then
uh in that case he's gonna do that
plus
that
and then when we're done this is gonna
be this can be column names
and let's see how that looks
did i save it hopefully i did
there we go
so create index on there
like that let's see if that works
and so if we come through here and run
it
now we're going to do through each one
we're going to create it
and we're going to quit so let's see if
we can do that
we created the index so now if we go to
our local
and
tutorial
and we show indexes boom
so if we look here
this is our two row well let's just take
the whole thing
so if we look at the whole thing now
if we look at that
and then we go back and look at that
from our source
uh
here
we should see effectively the same thing
uh oops that's great i don't want to
create index
i want show indexes
and we're gonna see here
let's do it this way
so what we're going to see is
the exceptions or the cardinality
but we can see here oh but we can see
here there's a difference between a
primary versus an id
so we're probably going to want to look
into that but they're all b trees
uh there we've got yeses here for the
nulls
see the collation is a for everybody
column names are the same
index sequence index is the same
so we're good and then non-unique is the
same
so the one thing we may want to look at
is the
uh
primary uh is that right
yeah so key name being primary versus id
so we would just want to be able to
select our
what our primary keys are that we wanted
to set for indexes but now what we have
is we do have the ability to go in
and take any index that exists in our
source
we're now going to run through that and
that will take care of it so we can
actually run our whole database
i'm not going to worry about syncing
rows because
well let's do that
i don't think it matters that much so if
i do like up to ten thousand rows
and i sync it and then sync the columns
which
i probably want to send columns first
to be fair
and i probably want to do
indexes
last
because there may be situations where
there's something a little funky and
we've got to have the data correct
so let's see if that works
if i run that through
then it looks like
we are getting some inserts here
and hopefully it's not going to blow up
and we will see
that we went in
oh i've got this extra little print
stuff that i don't need
um so let's see those men index is
missing
uh where did i just start start miss so
if i go into missing indexes
let's clean that up i've got an extra
print somewhere in here
i don't need prior column because i'm
not using it
here we go index names
there we go
so now if i use it
it should run through
blow out all my data
and boom i'm ready to go so
i'm going to kick this off
i'm going to set it back to zero which
will now get all of my data
should all be synced including
everything um
all of my uh
you know indexes and stuff like that we
do want to look at
um i think i've got a couple that i want
to double check like i've got some
unique and a couple other indexes that
are worth double checking to make sure
that they get created properly
and we will run into those next time
around because i think this is a good
time to wrap this one up as our database
is getting loaded so i'll get to as
always you know you can check this out
there's links in the show notes so you
can hop on to github and take a look at
some of this code
um i don't think all the data's there
but it's you can poke it at whatever
pointed at whatever database you want to
use
and synchronize it to
ideally like a local
copy of it and see how that goes for you
there are other issues that we still
have to deal with we are not done with
the
complexities of doing such a thing
particularly
and i think we'll probably tackle this
next time
uh well it depends on how fast we can
get through
the indexes we'll probably do one more
take a look at indexes unique and such
and then um we're going to try to dig
into some of this some of the issues
that we can run into
when we limit our max rows and what can
cause problems there and some of our
cleanup
that being said we'll wrap this one up
so go out there and have yourself a
great day a great week
and we will talk to you
next time
you
Transcript Segments
1.35

[Music]

27.119

well hello and welcome back we're

28.72

continuing our series looking at sql and

31.76

python so combined into this database

34.16

synchronization thing

36.16

and last episode we wrapped up

38.64

we're getting into

40.76

multi-column indexes and looking at how

43.2

to sync those so for example

45.68

in this case we've got one it's called

48.32

horse underscore b underscore o

50.559

which is on the horses table and it ties

53.28

breeder and owner together

56

and so what we need to be able to do is

58.399

what we're going to want to be able to

59.68

create

61.28

is effectively

65.04

this line is we want to be able to do

66.72

create index give it the name on the

69.119

table and then we're going to give it

70.32

the two

71.6

what do i call it owner breeder i'm

73.2

sorry owner yeah

75.04

and we're going to give it the two

76.24

columns

78.88

now

79.92

in order to do so

81.84

what we're going to do and this is where

83.2

we

84.4

paused last time because we went a

86.159

little bit long

88.08

so we're going to do is we're coming

89.28

through each of the source indexes

92.64

we're going to grab the name which would

94.159

be horse underscore b underscore o

97.439

and then we've got this destinations

99.84

dest names

101.759

array

103.6

and we're going to say if it's not in

105.68

there

106.88

if the name is if we haven't already

108.32

found this

110

column

111.759

or this index

113.36

then we're keeping track of index names

115.119

so we're going to create an array called

116.56

index names which already exists up

118.96

there

121.68

so let's do it this way i should be able

123.119

to do it this way let's see

125.759

let's try it like that

131.599

and it doesn't like in a pin somewhere

133.84

there

135.599

um

150.56

list indices must be indexed oh

156.72

well let's do that

158.64

and if i do that then i should still be

161.04

okay

164.319

nope oh because it's a list indices so

166.879

if i do that

172.319

oh

173.28

which is fine because what we do is i'm

175.2

going to create this

176.72

and i'm going to add it okay i'm good

180.4

uh sorry it's a little bit sidetracked

182.319

there so

184.879

i've got this dictionary which is going

186.48

to be keyed off of the index name and

189.44

then for each of those i'm going to have

190.4

an array that is going to be

192.48

each of the

193.92

columns so i'm grabbing this is what

195.92

that idx row 4 is

198.56

or at index number 4 that is the column

201.12

name so what i'm going to do is then so

203.36

i can look here and i can see

205.44

when i'm done print out idx names and so

207.76

i can see here

209.36

that i have one

211.28

and it's got two different columns so

213.92

basically for each one of those i'm

215.44

going to do four name in idx names

220

and in this case

222.239

let's leave that one there for now what

224.08

i want to do is i don't want it to be

226.239

here what i want to do is for each of

228.959

the names in index names i am going to

231.68

do for

233.36

column name

235.28

in

236.959

name

238.799

so now this is that array

242.72

i am going to do uh let's do let's see

246

nate let's do

247.84

column

250.08

string let's do column list

254.959

oh i can probably do

261.68

uh i may be able to do this a little

263.36

easier because i can probably say take

265.68

uh so for each of the names i'm going to

268.08

do

273.84

i want to do

275.68

idx names

281.44

name

287.04

and let's see what that does

288.8

and let's convert that to a string i

290.32

don't know if it's gonna even like that

291.36

but let's see what happens because i

292.479

think what i want to do is actually do

294.08

that join

296.639

because i bet i can do that so if i do

298.16

this

299.44

uh here we go okay

302.32

so he's going to come in and he said

303.919

it's not quite what i want

306.479

so i want

309.199

i think i can just do

310.8

join

314.32

can i do that

315.6

let's see what that looks like

318.24

uh see join takes exactly one argument

325.12

let's try this

329.919

oops nope

331.44

let's go look up our join real quick

332.96

because i think that's what we want to

334.16

do so we're going to do a python join i

335.84

just forget how to do it right which

337.52

sometimes happens

341.6

okay so join is going to be oh

345.68

uh i do it something and then join it

350.16

okay

351.36

so i should be able to take um

354.4

comma

355.919

dot join this

363.039

let's see what that looks like

366.319

oh no it doesn't like that

368.639

um

373.84

interesting

381.199

so that's not quite going to get us what

382.96

we need so i'm going to have to do this

384.319

a little more interesting okay and we

385.84

have to go with my original approach

387.6

which is basically going to be okay so

389.12

then

390.24

uh for each of the names then i'm going

391.6

to do 4

393.6

call name in

396.08

whoops

403.28

idx names

405.84

well let's do this

410.88

call name let's just see what it does

418.08

because i think

419.599

it's not going to like that or it's not

421.36

going to care because think name is

432.24

yeah so it takes that whole

437.039

so it can't be a name it can be in

442

idx names

445.84

name

451.44

so the dictionary is not callable so it

453.68

is a dictionary object

465.12

so let's do

475.12

so

475.84

what i'm missing is something so let's

477.36

use python

478.479

iterate dictionary

487.36

i need to write through dictionary here

488.56

we go

494.8

oh so here's what we're going to do for

499.12

here's what we want to do

501.12

we want to do

502.879

for that dot keys

506.96

and it is a function yep okay so we're

509.28

gonna take for each of the keys

511.199

then we're gonna do uh for column name

514.159

in

515.919

idx names

520.959

and that's going to be

522.56

a name

525.36

so now

526.56

we should see this work

529.36

uh it is not because we're saying that

531.44

this is not callable so this is on

534.88

102.

546

so let's do a string

554

oops wait it's not that way come on

555.6

mistake

556.8

darn it that's not it

559.36

so what happens when you put the wrong

561.519

stuff in there so now let's try it

563.519

okay there we go

565.12

let's see so it came in and it gave us

566.88

two columns so

569.68

that's gonna be column name so we're

571.04

gonna do

572.88

um

576.32

let's do column let's do it this way

578.399

we'll build it out like we normally have

579.68

so column name

581.36

equals nothing

583.6

and then we're going to say if

587.839

column name equals nothing

591.44

then

592.8

column name equals

598.56

um

600.88

i call it column names

604.32

equals

606.079

column name

609.76

so if he's empty and we're good

612.64

there we go

614.399

else

621.04

then column name equals column names

626

plus

627.04

comma

628.56

plus column name oh and for each of

630.079

these

632.48

um

633.519

we want to go ahead and do our little

636.399

brackety thing so we're going to do this

642

and do a little

643.279

that just to protect our name and so

645.839

then

649.04

uh in that case he's gonna do that

651.68

plus

652.959

that

655.36

and then when we're done this is gonna

656.72

be this can be column names

661.76

and let's see how that looks

663.839

did i save it hopefully i did

665.68

there we go

667.519

so create index on there

671.36

like that let's see if that works

674.56

and so if we come through here and run

676.8

it

679.839

now we're going to do through each one

681.2

we're going to create it

682.56

and we're going to quit so let's see if

683.839

we can do that

686.64

we created the index so now if we go to

689.68

our local

694.079

and

695.76

tutorial

698.48

and we show indexes boom

701.04

so if we look here

702.64

this is our two row well let's just take

704.56

the whole thing

706

so if we look at the whole thing now

708.959

if we look at that

711.2

and then we go back and look at that

712.72

from our source

716

uh

717.68

here

718.88

we should see effectively the same thing

721.76

uh oops that's great i don't want to

723.519

create index

726.24

i want show indexes

728.16

and we're gonna see here

730.399

let's do it this way

735.04

so what we're going to see is

737.36

the exceptions or the cardinality

741.279

but we can see here oh but we can see

743.279

here there's a difference between a

744.32

primary versus an id

746.88

so we're probably going to want to look

747.92

into that but they're all b trees

750.399

uh there we've got yeses here for the

752.56

nulls

754.88

see the collation is a for everybody

757.92

column names are the same

761.519

index sequence index is the same

765.519

so we're good and then non-unique is the

768.399

same

770

so the one thing we may want to look at

771.44

is the

773.2

uh

774.959

primary uh is that right

777.92

yeah so key name being primary versus id

781.68

so we would just want to be able to

783.12

select our

784.959

what our primary keys are that we wanted

787.12

to set for indexes but now what we have

789.6

is we do have the ability to go in

792.399

and take any index that exists in our

794.48

source

795.76

we're now going to run through that and

798.16

that will take care of it so we can

799.519

actually run our whole database

802

i'm not going to worry about syncing

803.2

rows because

805.36

well let's do that

807.839

i don't think it matters that much so if

809.44

i do like up to ten thousand rows

811.6

and i sync it and then sync the columns

815.519

which

816.639

i probably want to send columns first

819.68

to be fair

822.32

and i probably want to do

825.279

indexes

826.839

last

828.639

because there may be situations where

830.8

there's something a little funky and

831.92

we've got to have the data correct

834.32

so let's see if that works

836.32

if i run that through

838

then it looks like

840.56

we are getting some inserts here

843.76

and hopefully it's not going to blow up

850.079

and we will see

851.519

that we went in

852.88

oh i've got this extra little print

854.72

stuff that i don't need

856.8

um so let's see those men index is

859.6

missing

862.56

uh where did i just start start miss so

865.199

if i go into missing indexes

867.839

let's clean that up i've got an extra

869.6

print somewhere in here

871.6

i don't need prior column because i'm

872.959

not using it

875.199

here we go index names

878.079

there we go

879.839

so now if i use it

881.44

it should run through

884.32

blow out all my data

887.519

and boom i'm ready to go so

890.72

i'm going to kick this off

892.72

i'm going to set it back to zero which

894.32

will now get all of my data

896.48

should all be synced including

898.8

everything um

900.48

all of my uh

902.24

you know indexes and stuff like that we

904.48

do want to look at

906.8

um i think i've got a couple that i want

908.959

to double check like i've got some

910.48

unique and a couple other indexes that

912.72

are worth double checking to make sure

914.56

that they get created properly

916.72

and we will run into those next time

919.279

around because i think this is a good

920.959

time to wrap this one up as our database

923.12

is getting loaded so i'll get to as

925.279

always you know you can check this out

926.959

there's links in the show notes so you

929.6

can hop on to github and take a look at

931.519

some of this code

932.88

um i don't think all the data's there

934.56

but it's you can poke it at whatever

936.56

pointed at whatever database you want to

938.24

use

939.12

and synchronize it to

940.959

ideally like a local

942.56

copy of it and see how that goes for you

945.04

there are other issues that we still

946.88

have to deal with we are not done with

949.04

the

950

complexities of doing such a thing

952.16

particularly

953.44

and i think we'll probably tackle this

954.639

next time

956.32

uh well it depends on how fast we can

957.759

get through

958.88

the indexes we'll probably do one more

960.8

take a look at indexes unique and such

963.279

and then um we're going to try to dig

965.68

into some of this some of the issues

967.839

that we can run into

969.44

when we limit our max rows and what can

972.88

cause problems there and some of our

974.399

cleanup

975.36

that being said we'll wrap this one up

977.199

so go out there and have yourself a

978.48

great day a great week

980.639

and we will talk to you

982.8

next time

1000.72

you